Posted on 2026/02/14
Staff Voice AI Engineer - Applied AI
Uber
San Francisco, CA, United States
Job description
About the Role:
Applied AI at Uber builds intelligent systems that power next-generation product experiences for riders, drivers, merchants, and couriers.
As a Staff Voice AI Engineer, you will lead the design and deployment of large-scale, real-time Voice AI systems that enable natural, reliable, and intelligent voice interactions across Uber's ecosystem.
You will operate as a full-stack technical leader across speech modeling, LLM-powered conversational intelligence, and low-latency backend infrastructure - owning Voice AI systems end-to-end, from model development and evaluation to highly available, distributed production services.
This includes advancing capabilities in automatic speech recognition (ASR), text-to-speech (TTS), spoken language understanding, and LLM-driven dialogue systems.
You will partner closely with product, design, and infrastructure teams to translate customer pain points into seamless voice-first experiences - setting the foundation for how Voice AI is built, deployed, and operated across Uber's global platform.
What You Will Do:
• Design and build end-to-end Voice AI solutions, from understanding customer pain points and defining product requirements to deploying LLM-powered, real-time voice interfaces in production.
• Benchmark and evaluate voice AI systems, including speech recognition, speech synthesis, and spoken language understanding, by designing evaluations, analyzing results, and identifying systematic weaknesses.
• Improve voice model performance through system prompt tuning, fine-tuning voice- and speech-specific models, and optimizing architectures for low-latency, real-time voice interactions.
• Analyze voice request logs, prompt traces, and audio inputs to diagnose failure modes and improve transcription accuracy, conversational quality, and overall user experience.
• Build and maintain internal tools and platforms to automate Voice AI workflows, such as large-scale transcription pipelines, real-time audio processing services, and evaluation harnesses for voice quality.
• Own Voice AI systems in production end-to-end, including rollout strategies, monitoring, alerting, quality regression detection, and on-call readiness.
• Collaborate closely with product, design, and research teams to translate user needs into Voice AI capabilities with measurable business and customer impact.
Basic Qualifications:
• 10+ years of experience in software engineering, data science, or machine learning, including a track record of shipping production AI systems.
• Deep understanding of large language models, including fine-tuning, prompt engineering, embeddings, and retrieval-augmented generation (RAG).
• Strong backend and distributed systems expertise, with experience designing and operating highly available, scalable services in production.
• Deep experience with ML infrastructure, including model training pipelines, online serving systems, feature stores, experiment platforms, and evaluation frameworks.
• Hands-on experience with distributed data processing systems (e.g., Spark, Flink, Ray) and workflow orchestration (e.g., Airflow or equivalent).
• Ability to analyze data, run experiments, and derive insights for model and product improvement.
• Excellent communication and collaboration skills across technical and non-technical teams.
Preferred Qualifications:
• Experience building evaluation frameworks for Voice AI, including metrics and human/LLM-assisted evaluations for speech recognition accuracy, latency, robustness, and naturalness of synthesized speech.
• Demonstrated expertise in machine learning fundamentals applied to voice, including model evaluation, training, and fine-tuning of ASR, TTS, or speech-language models.
• Proven experience deploying Voice AI systems to production, with an emphasis on low-latency, high-reliability, real-time environments.
• Experience writing developer documentation, creating voice-specific SDKs, or enabling internal teams to build on shared Voice AI platforms.
• Hands-on work with large-scale audio datasets, including data curation, labeling strategies, and optimization of voice processing pipelines at scale.
For San Francisco, CA-based roles: The base salary range for this role is USD$232,000 per year - USD$258,000 per year.
For Sunnyvale, CA-based roles: The base salary range for this role is USD$232,000 per year - USD$258,000 per year.
For all US locations, you will be eligible to participate in Uber's bonus program, and may be offered an equity award and other types of compensation.
All full-time employees are eligible to participate in a 401(k) plan.
You will also be eligible for various benefits. More details can be found at the following link: https://jobs.uber.com/en/benefits.
