Principal Backend Engineer

Full-Time

USA, Remote

1 Opening

About the role

LakeFusion is seeking a Principal Backend Engineer to scale our real-time API to enterprise production workloads. This is a deeply technical role focused on building resilient, fault-tolerant, high-throughput distributed systems that run reliably in diverse customer environments.

A critical dimension of this role: we ship into customer environments. You won't just write code that works on our infrastructure — you'll build systems robust enough to run in Azure, AWS, and customer-managed Kubernetes clusters where you don't control the network, the identity provider, the observability stack, or the operator. That requires a particular kind of engineering discipline: defensive design, strong observability, minimal environmental assumptions, and thorough testing under failure conditions.

What you'll do

  • Scale the real-time API: Own the architecture and implementation of our high-throughput, low-latency API built in Python/FastAPI and deployed on Kubernetes.
  • Build for resilience and fault tolerance: Design for graceful degradation under upstream failures, rate limits, partial outages, and retry storms. Eliminate silent failure modes.
  • Engineer for autoscaling: Tune Kubernetes autoscaling behavior (HPA, KEDA, cluster autoscaler) to handle bursty production workloads across diverse customer environments.
  • Build streaming integrations: Develop streaming pipelines that keep data in sync across the service, transactional, and analytical layers.
  • Design for portability: Architect services that ship into customer environments and work reliably without serverless dependencies, with customer-controlled identity, restricted network egress, and varied observability stacks.
  • Own performance: Profile, benchmark, and optimize for latency and throughput. Understand and tune the full stack — from FastAPI request handling through connection pooling, caching, and downstream service calls.
  • Lead technical architecture: As a principal-level engineer, set direction on distributed systems patterns, deployment architecture, and operational tooling.

What we're looking for

  • 10+ years of backend engineering experience, with a strong track record at the principal or staff level building high-performance distributed systems in production.
  • Deep Python and FastAPI expertise, including async patterns, performance profiling, and production-grade service development.
  • Proven experience building high-throughput, low-latency distributed systems, with concrete examples of systems that handled demanding production workloads under strict latency requirements.
  • Strong Kubernetes experience, particularly around autoscaling (HPA, KEDA, cluster autoscaler), resource tuning, and production operations.
  • Resilience and fault tolerance engineering: circuit breakers, retries with backoff and jitter, bulkheads, timeouts, idempotency, and graceful degradation patterns.
  • Streaming systems experience: Kafka, Pulsar, Kinesis, or similar. Spark Structured Streaming is a strong bonus.
  • Deployment portability experience: you've shipped software that runs in environments you don't control, and you understand what that demands of the code.
  • Strong observability instincts: structured logging, metrics, tracing, and the discipline to make systems debuggable from the outside.
  • Excellent technical communication, including the ability to work effectively across US and India-based engineering teams.

Nice-to-have

  • Experience with Databricks or lakehouse architectures.
  • Experience deploying into enterprise customer environments with strict security/compliance constraints (customer-managed identity, private networking, air-gapped deployments).
  • Healthcare, financial services, or other regulated industry experience.
  • Contributions to open-source distributed systems or streaming projects.

About LakeFusion

LakeFusion is the modern Master Data Management (MDM) company. Global enterprises across industries ranging from retail to manufacturing and financial services rely on the LakeFusion platform to unify, govern, and deliver trusted data entities such as customers, products, suppliers, and employees. Built natively on the Databricks Lakehouse, LakeFusion creates a single source of truth that powers analytics and AI. LakeFusion enables organizations worldwide to accelerate innovation with trusted and governed data.

Join us

Help build the future of master data

Join a Databricks-native team building the trusted data foundation powering AI-ready enterprises.
