Posted on 2026/02/14

Staff Voice AI Engineer - Applied AI

Uber

San Francisco, CA, United States

Full-time

Job description

About the Role:

Applied AI at Uber builds intelligent systems that power next-generation product experiences for riders, drivers, merchants, and couriers.

As a Staff Voice AI Engineer, you will lead the design and deployment of large-scale, real-time Voice AI systems that enable natural, reliable, and intelligent voice interactions across Uber's ecosystem.

You will operate as a full-stack technical leader across speech modeling, LLM-powered conversational intelligence, and low-latency backend infrastructure - owning Voice AI systems end-to-end, from model development and evaluation to highly available, distributed production services.

This includes advancing capabilities in automatic speech recognition (ASR), text-to-speech (TTS), spoken language understanding, and LLM-driven dialogue systems.

You will partner closely with product, design, and infrastructure teams to translate customer pain points into seamless voice-first experiences - setting the foundation for how Voice AI is built, deployed, and operated across Uber's global platform.

What You Will Do:

• Design and build end-to-end Voice AI solutions, from understanding customer pain points and defining product requirements to deploying LLM-powered, real-time voice interfaces in production.

• Benchmark and evaluate voice AI systems, including speech recognition, speech synthesis, and spoken language understanding, by designing evaluations, analyzing results, and identifying systematic weaknesses.

• Improve voice model performance through system prompt tuning, fine-tuning voice- and speech-specific models, and optimizing architectures for low-latency, real-time voice interactions.

• Analyze voice request logs, prompt traces, and audio inputs to diagnose failure modes and improve transcription accuracy, conversational quality, and overall user experience.

• Build and maintain internal tools and platforms to automate Voice AI workflows, such as large-scale transcription pipelines, real-time audio processing services, and evaluation harnesses for voice quality.

• Own Voice AI systems in production end-to-end, including rollout strategies, monitoring, alerting, quality regression detection, and on-call readiness.

• Collaborate closely with product, design, and research teams to translate user needs into Voice AI capabilities with measurable business and customer impact.

Basic Qualifications:

• 10+ years of experience in software engineering, data science, or machine learning, including a track record of shipping production AI systems.

• Deep understanding of large language models, including fine-tuning, prompt engineering, embeddings, and retrieval-augmented generation (RAG).

• Strong backend and distributed systems expertise, with experience designing and operating highly available, scalable services in production.

• Deep experience with ML infrastructure, including model training pipelines, online serving systems, feature stores, experiment platforms, and evaluation frameworks.

• Hands-on experience with distributed data processing systems (e.g., Spark, Flink, Ray) and workflow orchestration (e.g., Airflow or equivalent).

• Ability to analyze data, run experiments, and derive insights for model and product improvement.

• Excellent communication and collaboration skills across technical and non-technical teams.

Preferred Qualifications:

• Experience building evaluation frameworks for Voice AI, including metrics and human/LLM-assisted evaluations for speech recognition accuracy, latency, robustness, and naturalness of synthesized speech.

• Demonstrated expertise in machine learning fundamentals applied to voice, including model evaluation, training, and fine-tuning of ASR, TTS, or speech-language models.

• Proven experience deploying Voice AI systems to production, with an emphasis on low-latency, high-reliability, real-time environments.

• Experience writing developer documentation, creating voice-specific SDKs, or enabling internal teams to build on shared Voice AI platforms.

• Hands-on work with large-scale audio datasets, including data curation, labeling strategies, and optimization of voice processing pipelines at scale.

For San Francisco, CA-based roles: The base salary range for this role is USD$232,000 per year - USD$258,000 per year.

For Sunnyvale, CA-based roles: The base salary range for this role is USD$232,000 per year - USD$258,000 per year.

For all US locations, you will be eligible to participate in Uber's bonus program, and may be offered an equity award & other types of comp.

All full-time employees are eligible to participate in a 401(k) plan.

You will also be eligible for various benefits. More details can be found at the following link https://jobs.uber.com/en/benefits.
