AI/ML Engineer

AI/ML Engineer – Develop LLM-powered solutions to enhance RAG, entity resolution, and MDM workflows.

India, Remote
Full-Time
1 Opening
About the Role

LakeFusion is seeking an AI/ML Engineer to lead the development of advanced machine learning solutions that power our AI-driven Master Data Management (MDM) platform. In this role, you will design and optimize prompt engineering strategies, refine Retrieval-Augmented Generation (RAG) architectures, and advance our entity resolution process to deliver higher accuracy, speed, and cost efficiency at scale.

You will build production-grade AI workflows and data science tools that give business users transparency and control over AI-driven matching. This includes implementing multi-stage evaluation strategies with LLMs, monitoring performance through telemetry, and continuously improving models to mitigate drift.

Collaboration will be central to your role. You will partner with product managers, data engineers, and data stewards to translate complex business requirements into robust, scalable AI/ML solutions. This is a hands-on position for an experienced machine learning professional who thrives at the intersection of Generative AI, entity resolution, and large-scale data platforms, and who is motivated to deliver impactful, production-ready AI systems.

What You’ll Do
  • Lead the design, development, and optimization of prompt engineering strategies for LakeFusion's LLM-based entity matching to improve accuracy, reduce bias, and enhance interpretability.
  • Drive the continuous improvement of our Retrieval-Augmented Generation (RAG) architecture, refining the interplay between Vector Search candidate generation and LLM evaluation for superior match results.
  • Iterate on LakeFusion's entity resolution process, exploring novel approaches to enhance match performance (precision, recall, F1-score) and operational efficiency (speed, flexibility, cost).
  • Investigate and implement advanced LLM evaluation strategies, including multi-stage processing with potentially less powerful models to balance performance, cost, and output quality.
  • Contribute to the design and development of production-grade, business-user-facing data science tools and workflows that provide transparency and control over AI matching.
  • Collaborate closely with product managers, data engineers, and data stewards to translate complex business requirements into robust, scalable AI/ML solutions.
  • Monitor and analyze AI model performance using telemetry from AI Gateway Inference Tables and custom logs, identifying opportunities for continuous improvement and drift mitigation.
What We're Looking For
  • 5+ years of hands-on experience as an ML Engineer, Data Scientist, or in a similar role, specifically building and deploying machine learning solutions in production environments.
  • Deep expertise in Entity Resolution and Master Data Management (MDM), including the nuances of data matching, deduplication, and survivorship.
  • Extensive practical experience with Generative AI (GenAI) concepts, Large Language Models (LLMs), Vector Search, and Retrieval-Augmented Generation (RAG) architectures.
  • Strong proficiency in Python and its ecosystem for data science and machine learning (e.g., PyTorch, TensorFlow, scikit-learn).
  • Demonstrated ability to deploy, manage, and optimize modern AI/ML models in production, with a focus on latency, throughput, and cost.
  • Proven track record of building production-grade data science tools or applications that directly enable business users to interact with and leverage AI/ML insights (not solely internal analytics or proof-of-concept chatbots).
  • Solid foundation in machine learning fundamentals, including experience with diverse model types (e.g., linear models, neural networks, support vector machines, clustering) and strong statistical analysis skills. You understand the "why" behind different algorithms.
  • Experience working with the Databricks platform (e.g., Delta Lake, MLflow, Databricks SQL) is highly desirable.
  • Excellent problem-solving skills and the ability to debug complex AI systems, with an understanding of the interplay between data, models, and prompts.
  • Strong communication skills, capable of articulating complex technical concepts to both engineering and non-technical stakeholders.
Nice-to-Have
  • Experience with MLOps practices, including CI/CD for ML pipelines.
  • Knowledge of distributed computing frameworks beyond Databricks.
  • Experience with other MDM platforms or enterprise data quality tools.
  • Familiarity with cloud platforms (AWS, Azure) for AI/ML deployments.
About LakeFusion

LakeFusion is the modern Master Data Management (MDM) company. Global enterprises across industries, from retail to manufacturing and financial services, rely on the LakeFusion platform to unify, govern, and deliver trusted data entities such as customers, products, suppliers, and employees. Built natively on the Databricks Lakehouse, LakeFusion creates a single source of truth that powers analytics and AI. LakeFusion enables organizations worldwide to accelerate innovation with trusted and governed data.
