Posted on 2025/12/05
AI Engineer - Voice Agent (Onsite 1pm-10pm)
Transcure
Lahore, Pakistan
Full Description
Experience: 2–5 years (hands-on)
Employment Type: Full-time
Location: Jail Road, Lahore
About Us
We're on a mission to build intelligent voice agents that can engage in natural, real-time conversations.
These agents are designed to make and receive calls, understand spoken language, and respond with human-like voices using LLMs, advanced STT, and TTS systems.
You'll help shape this future by building scalable, low-latency voice AI systems that hold seamless spoken conversations.
What You'll Do
• Voice Agent Development
• Design, implement, and maintain end-to-end voice-agent pipelines: STT → NLU / LLM → TTS.
• Build multi-turn conversational flows, including context management, turn-taking, interruptions, and barge-in.
• Develop prompt-engineering strategies for LLMs to drive dialogues for various call types.
• Integrate voice agents with telephony systems (SIP, Twilio, WebRTC) to support real voice calling.
• AI / ML Engineering
• Train, fine-tune, and serve models for intent detection, NLU, or dialog management.
• Work with LLMs (OpenAI, Llama, etc.) for generating responses in voice conversations.
• Optimize models for inference cost and latency (quantization, batching, efficient serving).
• Implement MLOps practices: version control, model deployment, monitoring, and retraining pipelines.
• Speech & Voice Tech
• Use high-quality STT (e.g., Whisper, Deepgram, Google) and TTS (e.g., ElevenLabs, Azure, Google) systems.
• Implement voice activity detection (VAD) and manage audio streaming / chunking for real-time voice interaction.
• Ensure voice output is natural, expressive, and appropriate for conversation scenarios.
• System Design & Infrastructure
• Design scalable, low-latency architecture for real-time voice interactions.
• Deploy voice agents in cloud environments (e.g., AWS) and set up serving infrastructure.
• Build monitoring, logging, and analytics around conversation quality, latency, and errors.
• Ensure data privacy and voice data security.
What We're Looking For
Must-Have
• 2+ years of hands-on experience in AI / ML, especially conversational or voice AI systems.
• Demonstrated experience building voice agents (STT + TTS + dialog management).
• Strong Python experience and deep learning / ML framework knowledge.
• Experience with LLMs (prompting, fine-tuning) or NLU models.
• Knowledge of telephony integrations (SIP, Twilio, or WebRTC).
• Familiarity with MLOps practices and deploying ML models to production.
• Ability to build low-latency, scalable systems.
Nice-to-Have
• Previous projects or deployed systems involving calling or voice agents.
• Experience with large-scale voice / conversational AI engines (Rasa, LangChain, etc.).
• Knowledge of emotion detection, voice biometrics, or multi-speaker voice modeling.
• Familiarity with deployment on cloud voice / telecom infrastructure.
• Understanding of AI ethics, data privacy (especially voice), and compliance for voice data.
Why Join Us?
• You'll own the voice agent architecture, building from the ground up.
• Work on state-of-the-art AI+voice systems combining STT, LLMs, and TTS.
• High impact: your work will define how human-like, useful, and scalable our voice agents are.
• Great learning opportunity in MLOps, low-latency inference, and real-time dialogue systems.
• Close-knit team where your technical decisions will directly shape the product.
How to Apply
Please submit:
• Your resume / CV
• A summary of your past voice-AI or conversational agent projects, especially deployments or real-time voice systems
• (Optional but recommended) Links or demo video of any voice agent / conversational AI work you've done