Posted on 2025/02/28
AI Training Systems Developer
Amazon
Seattle, WA, United States
Qualifications
- 3+ years of professional software development experience
- 2+ years of design or architecture experience, including design patterns, reliability, and scaling of new and existing systems
- Experience programming with at least one software programming language
Benefits
- We offer a range of benefits, including competitive compensation, comprehensive health insurance, and opportunities for growth and development
Responsibilities
- The successful candidate will collaborate with applied scientists on machine learning tasks ranging from ML code and data management to training and deployment of ML models
- Key responsibilities include researching and developing stability improvements and optimizations to continuously improve training KPIs, such as uptime, throughput, and goodput
- Collaborate with applied scientists on machine learning tasks, including ML code and data management, training, and deployment of ML models
- Develop and implement stability and optimization techniques to improve training KPIs
- Work with software engineering teams to operationalize and scale training improvements across experimentation and production workloads
Full Description
Amazon's Artificial General Intelligence (AGI) Team is at the forefront of advancing generative AI technologies, including Amazon's expansive multimodal Large Language Models.
We are seeking a highly motivated and skilled Machine Learning Engineer to join our mission by building scalable, resilient, and performant training code, systems, and infrastructure.
This individual will work closely with AGI engineers, applied scientists, and AWS to accelerate the delivery of state-of-the-art AGI models for Amazon businesses and customers.
About the Role
The successful candidate will collaborate with applied scientists on machine learning tasks ranging from ML code and data management to training and deployment of ML models.
Key responsibilities include researching and developing stability improvements and optimizations to continuously improve training KPIs, such as uptime, throughput, and goodput.
Key Responsibilities
• Collaborate with applied scientists on machine learning tasks, including ML code and data management, training, and deployment of ML models.
• Develop and implement stability and optimization techniques to improve training KPIs.
• Work with software engineering teams to operationalize and scale training improvements across experimentation and production workloads.
About the Team
As part of the Amazon General Intelligence team, AGI Modeling Services provides training capabilities and services to accelerate the invention of SoTA models across all modalities and their derivatives.
These services include high-performance ML infrastructure, modeling toolkits, and optimized MLOps workflows for AGI scientists to build, train, and release their models.
Requirements
• 3+ years of professional software development experience.
• 2+ years of design or architecture experience, including design patterns, reliability, and scaling of new and existing systems.
• Experience programming with at least one software programming language.
Benefits
At Amazon, we are committed to providing a diverse and inclusive workplace. We offer a range of benefits, including competitive compensation, comprehensive health insurance, and opportunities for growth and development.
