Posted on 2025/11/24

Generative AI/LLM Engineer

ALETHEA AI

Lahore, Pakistan

Full-time

Full Description

Generative AI / LLM Engineer

Are you a Machine Learning Engineer focused on building systems that power avatars and agents?

ALETHEA AI is looking for a Generative AI / LLM Engineer to train, tune, and deploy models across vision, audio, and LLMs, integrating them directly into products.

ALETHEA AI is leading the Agentic AI movement across industries.

Through partnerships with Solana Mobile, AWS, OpenSea, and many more, we’re building the foundation for decentralized AI.

These collaborations have enabled us to launch the world’s first intelligent NFTs (iNFTs), expand our AI capabilities, and bring on-chain AI agents to mobile.

As our Generative AI / LLM Engineer, you will ship pipelines, optimize latency, and support prompt tuning for LLM behaviors.

We need a builder who writes Python, understands tradeoffs, and delivers reliable models that improve experience, engagement, and growth.

Key Responsibilities

• Develop, fine-tune, and adapt LLMs/VLMs for conversational, multi-turn, and real-time avatar interactions, with a strong grasp of how LLMs learn and behave, including in-context learning, prompt sensitivity, function calling, fine-tuning, and multi-turn reasoning patterns

• Read research papers on LLM/VLM techniques (personas, multi-turn reasoning, contextual memory, RL agent fine-tuning, perception models), comprehend the approach, and implement prototypes that translate to production systems

• Build context engineering systems: short and long-term conversational memory, RAG pipelines with vector stores, and real-time data integration for grounded multi-turn conversations

• Develop methods for verbal and non-verbal communication in avatars: persona consistency, facial expressions, speech patterns, and real-time behavioral adaptation

• Build perception pipelines integrating vision, audio, and language modalities for real-time avatar systems - coordinating with Voice and CV Engineers on multimodal interaction flows

• Deploy LLMs and multimodal systems at scale - building APIs, inference endpoints, and serving infrastructure optimized for latency, throughput, GPU/CPU costs, and reliability for real-time avatar applications

• Build and maintain pipelines for privacy-preserving insight systems over structured and unstructured datasets, including conversational corpora, avatar data, audio, and multimodal datasets

• Build evaluation frameworks and monitoring systems to track reasoning quality, consistency, hallucination rates, persona alignment, and memory fidelity, and to detect drift - troubleshooting inference issues and iterating rapidly

• Collaborate with product, design, and creative teams to translate conversational and avatar requirements into prompt pipelines, memory systems, and behavior controls - providing technical guidance on feasibility, capabilities, and tradeoffs

• Apply prompting techniques for voice synthesis and image-to-video models to achieve natural prosody, pronunciation accuracy, and avatar generation

Requirements

Technical Skills:

• Bachelor's or Master's in CS, ML, AI, or related field - or equivalent hands-on experience

• Backend Python skills (FastAPI/Flask/Django) for writing clean, scalable APIs and microservices that orchestrate LLM workflows

• Hands-on with LLMs/VLMs and understanding of LLM agents: multi-turn reasoning, function calling, tool use, conversational memory, state management

• Strong grasp of context engineering: short/long-term memory management, RAG pipelines, information retrieval, and vector stores

• Experience building chat and dialogue agents - conversation management, contextual memory, multi-agent coordination

• Ability to architect end-to-end data and prompt engineering pipelines - ensuring a rich user experience aligned with product requirements

• Experience with LLM serving frameworks (vLLM, TensorRT-LLM, etc.) and deployment on serverless GPU platforms (Modal, RunPod, etc.)

• Strong debugging mindset - profiling inference bottlenecks, optimizing latency, troubleshooting agent behavior

Soft Skills:

• Strong problem-solving with attention to detail

• Clear communication to cross-functional teams

• Collaborative mindset in fast-paced environments

Remuneration: Competitive, with role-aligned incentives and growth opportunities.

If you’re excited to build a community at the forefront of AI and Web3 innovation, we’d love to hear from you. Apply by sending your CV to careers@alethea.ai or feel free to reach out directly to discuss this opportunity.

Know someone who might be a fit?

Please share this post with them!
