Posted on 2025/12/01

AI Squared is hiring: Machine Learning Engineer (Washington) in Washington DC

AI Squared

Washington, DC, United States

Full-time

Qualifications

  • 5+ years of experience as a Machine Learning Engineer, MLOps Engineer, or similar role
  • Proven experience deploying and maintaining machine learning models in production at scale
  • Hands-on experience with ML lifecycle tooling (MLflow, Kubeflow, SageMaker, Vertex AI, or similar)
  • Strong proficiency in Python; familiarity with ML frameworks such as PyTorch or TensorFlow
  • Deep knowledge of containerization (Docker) and orchestration (Kubernetes) for production ML systems
  • Expertise with cloud platforms (AWS, GCP, Azure) for ML deployment and scaling
  • Strong understanding of MLOps best practices, monitoring, and automation
  • Excellent problem-solving skills, with an emphasis on building reliable, scalable systems
  • Strong communication and collaboration skills across technical and non-technical teams

Responsibilities

  • Design, implement, and maintain ML deployment pipelines for scalable production systems
  • Operationalize large language models (LLMs) and other AI/ML models, ensuring high availability and reliability
  • Build robust model monitoring, logging, and alerting systems to track performance and detect drift
  • Partner with data scientists to transition models from research/prototype into production-ready deployments
  • Develop CI/CD pipelines for ML workflows, integrating testing, validation, and automated deployment
  • Optimize runtime performance of ML models across cloud platforms (AWS, GCP, Azure) and distributed systems
  • Apply containerization and orchestration (Docker, Kubernetes) to enable reproducible, scalable systems
  • Collaborate with cross-functional teams to ensure ML systems align with platform goals and business requirements

Full Description

OVERVIEW

We are seeking a highly skilled Machine Learning Engineer to join our core AI team.

In this role, you will focus on deploying, maintaining, and monitoring the AI/ML systems that power our platform.

You will work closely with data scientists, data engineers, and product teams to ensure scalable, reliable, and production-grade AI solutions.

You'll play a critical role in operationalizing large language models (LLMs) and other ML systems, ensuring they run efficiently, securely, and with robust monitoring in place.
