Integration Engineer

Full-Time

India, Remote

1 Opening

About the Role

LakeFusion is seeking an Integration Engineer to build and maintain third-party data integrations for our Master Data Management platform, built natively on the Databricks Data Intelligence Platform. In this role, you will own the end-to-end development of integrations with enterprise data providers such as Dun & Bradstreet, ZoomInfo, and Smarty, delivering reliable connectors that enrich customer data at scale.

This is a hybrid role spanning full-stack development and data engineering. You will build Spark Structured Streaming pipelines to ingest and process third-party data, develop the backend services and APIs that orchestrate these integrations, and create the frontend configuration experiences that let customers set them up. Most integrations follow a standard architectural pattern — your job is to execute that pattern consistently, reliably, and at a steady cadence while adapting to the quirks of each provider.

Success in this role depends heavily on communication and partnership. You will work directly with third-party vendors to understand their APIs, authentication models, data schemas, rate limits, and delivery mechanisms, then translate those requirements into working integrations. You'll also collaborate closely with Product, Platform, and Delivery teams to ensure each integration meets customer needs.

This role suits someone who enjoys shipping well-scoped, repeatable work end-to-end, takes ownership of external relationships, and values consistency and reliability over novelty.

What you'll do

  • Own integrations end-to-end: Design, build, test, and maintain third-party data provider integrations from API contract through streaming pipeline to UI configuration, following LakeFusion's standard integration pattern.
  • Build streaming data pipelines: Develop PySpark Structured Streaming pipelines on Databricks to ingest, transform, and land enriched data from third-party providers.
  • Develop backend services: Build Python/FastAPI services that orchestrate API calls to third-party providers, handle authentication, manage rate limiting and retries with exponential backoff, and expose integration configuration endpoints.
  • Build frontend configuration UIs: Develop Node.js/React interfaces that let customers configure, monitor, and manage their third-party integrations within the LakeFusion product.
  • Partner with third-party vendors: Act as the primary technical point of contact with data providers. Read API documentation carefully, ask the right questions, run test transactions, and validate that integrations meet both vendor and LakeFusion requirements.
  • Translate requirements into specs: Take vendor API documentation and customer requirements and produce clear, implementable integration specs that follow LakeFusion's standard patterns.
  • Ensure reliability: Build integrations that handle provider-side failures gracefully — rate limits (429s), auth expiration, schema drift, partial outages — with observable logging, alerting, and idempotent retry behavior.
  • Maintain and extend existing integrations: Monitor production integrations, respond to provider API changes, and iterate on the shared integration framework as patterns evolve.
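To make the reliability expectations above concrete, here is a minimal, illustrative sketch (not LakeFusion code) of the retry pattern the role calls for: exponential backoff with jitter around a rate-limited provider call. The names `RateLimitError` and `call_with_backoff` are hypothetical.

```python
import random
import time


class RateLimitError(Exception):
    """Raised when a provider responds with HTTP 429 (rate limited)."""


def call_with_backoff(call, max_attempts=5, base_delay=1.0, max_delay=30.0):
    """Invoke `call`, retrying rate-limited failures with exponential backoff.

    `call` is any zero-argument callable that raises RateLimitError on a 429.
    Delay doubles each attempt (capped at `max_delay`), plus a small random
    jitter so many clients don't retry in lockstep.
    """
    for attempt in range(max_attempts):
        try:
            return call()
        except RateLimitError:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the failure to the caller
            delay = min(max_delay, base_delay * 2 ** attempt)
            time.sleep(delay + random.uniform(0, delay * 0.1))
```

In production this pattern is typically paired with idempotent request design, so a retried call that actually succeeded on the provider side cannot create duplicates.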

What we're looking for

  • 5+ years of hands-on experience across both application development and data engineering, with demonstrated ability to ship features end-to-end.
  • Strong Python skills, with experience building production services using FastAPI or similar frameworks.
  • Practical PySpark experience, including Structured Streaming, Delta Lake, and common patterns such as CDC/CDF and checkpointing. Databricks experience strongly preferred.
  • Node.js and modern JavaScript/TypeScript experience, with working knowledge of a frontend framework (React preferred).
  • Proven track record integrating with third-party REST APIs, including handling authentication (OAuth, API keys, token refresh), pagination, rate limiting, webhooks, and bulk/batch endpoints.
  • Excellent written and verbal communication skills in English. This role requires directly emailing, meeting with, and problem-solving alongside third-party vendor technical teams — the ability to ask precise questions, document answers clearly, and follow through is essential.
  • Strong reading comprehension for technical documentation. You can work through a 200-page API spec, identify the relevant sections, and pull out what matters for the integration.
  • Solid SQL skills and comfort with relational data modeling.
  • Ownership mindset: you treat each integration as yours, from vendor kickoff through production monitoring, and you don't drop things between handoffs.

Nice-to-have

  • Experience integrating with enterprise data enrichment providers (Dun & Bradstreet, ZoomInfo, Smarty, Experian, LexisNexis, or similar).
  • Experience with Master Data Management, entity resolution, or data quality platforms.
  • Familiarity with Databricks Unity Catalog, Workflows, and Jobs.
  • Experience with Azure or AWS.
  • Exposure to CI/CD pipelines, Docker, and infrastructure-as-code (Terraform).
  • Experience building integrations as a reusable framework rather than one-off implementations.

About LakeFusion

LakeFusion is the modern Master Data Management (MDM) company. Global enterprises in industries ranging from retail and manufacturing to financial services rely on the LakeFusion platform to unify, govern, and deliver trusted data entities such as customers, products, suppliers, and employees. Built natively on the Databricks Lakehouse, LakeFusion creates a single source of truth that powers analytics and AI, enabling organizations worldwide to accelerate innovation with trusted and governed data.

Join us

Help build the future of master data

Join a Databricks-native team building the trusted data foundation powering AI-ready enterprises.
