Staff ML Infrastructure Engineer
Location: United States
About the Company
We are a fast-growing AI company building next-generation large language models. Our mission is to bring powerful, reliable AI systems into production environments used by thousands of customers. We value technical excellence, deep collaboration, and engineers who thrive on solving real-world problems at scale.
Role Overview
We are seeking a Staff / Principal ML Infrastructure Engineer to lead the design, deployment, and scaling of our large language model infrastructure. This role sits at the intersection of machine learning, systems engineering, and platform design, enabling teams to train, serve, and monitor models efficiently and reliably.
This is not a prompt engineering role – it is focused on building robust, production-grade ML infrastructure and operational pipelines.
Responsibilities
- Design, implement, and maintain high-performance infrastructure for training and serving LLMs
- Optimize model pipelines for efficiency, latency, and cost at scale
- Collaborate with ML researchers, platform engineers, and product teams to deploy models safely into production
- Build monitoring, alerting, and tooling to ensure reliability and observability of large-scale ML systems
- Evaluate and integrate new frameworks, tools, and architectures to improve ML workflows
- Provide technical leadership and mentorship to other engineers on the team
Qualifications
- 7+ years of software engineering experience, including 3+ years building production ML systems
- Deep experience with distributed training and inference frameworks (e.g., PyTorch, JAX, TensorFlow)
- Familiarity with model serving technologies and orchestration (e.g., Triton, Ray, Kubernetes)
- Strong understanding of GPU/TPU infrastructure, performance optimization, and scalability challenges
- Proven experience solving reliability, latency, and cost trade-offs in production ML systems
- Excellent collaboration, communication, and problem-solving skills
- Experience mentoring or leading engineering teams is a plus
Why You’ll Enjoy This Role
- Work on cutting-edge LLM infrastructure at scale
- Influence the design of systems that power real-world AI applications
- Collaborate with some of the most talented engineers in AI
- Flexible work arrangements and competitive compensation
Darwin Recruitment is acting as an Employment Agency in relation to this vacancy.
Reece Waldon