Posted on 2025/12/01

AI Squared is hiring: Machine Learning Engineer (Washington) in Washington DC

AI Squared

Washington, DC, United States

Full-time

Full Description

OVERVIEW

We are seeking a highly skilled Machine Learning Engineer to join our core AI team.

In this role, you will focus on deploying, maintaining, and monitoring the AI/ML systems that power our platform.

You will work closely with data scientists, data engineers, and product teams to ensure scalable, reliable, and production-grade AI solutions.

You'll play a critical role in operationalizing large language models (LLMs) and other ML systems, ensuring they run efficiently, securely, and with robust monitoring in place.

KEY RESPONSIBILITIES

• Design, implement, and maintain ML deployment pipelines for scalable production systems.

• Operationalize large language models (LLMs) and other AI/ML models, ensuring high availability and reliability.

• Build robust model monitoring, logging, and alerting systems to track performance and detect drift.

• Partner with data scientists to transition models from research/prototype into production-ready deployments.

• Develop CI/CD pipelines for ML workflows, integrating testing, validation, and automated deployment.

• Optimize runtime performance of ML models across cloud platforms (AWS, GCP, Azure) and distributed systems.

• Apply containerization and orchestration (Docker, Kubernetes) to enable reproducible, scalable systems.

• Collaborate with cross-functional teams to ensure ML systems align with platform goals and business requirements.

QUALIFICATIONS

• 5+ years of experience as a Machine Learning Engineer, MLOps Engineer, or similar role.

• Proven experience deploying and maintaining machine learning models in production at scale.

• Hands-on experience with ML lifecycle tooling (MLflow, Kubeflow, SageMaker, Vertex AI, or similar).

• Strong proficiency in Python; familiarity with ML frameworks such as PyTorch or TensorFlow.

• Deep knowledge of containerization (Docker) and orchestration (Kubernetes) for production ML systems.

• Expertise with cloud platforms (AWS, GCP, Azure) for ML deployment and scaling.

• Strong understanding of MLOps best practices, monitoring, and automation.

• Excellent problem-solving skills, with an emphasis on building reliable, scalable systems.

• Strong communication and collaboration skills across technical and non-technical teams.
