Posted on 2026/03/05

English Transcription Specialist (Remote Job - Malaysia)

Chemin AI

Kuching, Sarawak, Malaysia

Full-time

Job description

About the Client

Our client is a technology company specializing in the development of an end-to-end platform for creating, managing, and deploying volumetric avatars and videos.

They are revolutionizing digital identity with proprietary AI-powered 3D avatar technology that transforms 2D images into hyper-realistic, animatable 3D avatars for gaming, social media, enterprise, and generative AI.

Their applications are available on major platforms, including iOS, Android, Steam, and Viveport.

They are expanding globally and are seeking a Cloud & AI Backend expert to architect the backbone of their scalable, GPU-accelerated infrastructure.

Key Responsibilities:

We are looking for an AI Infrastructure & Deployment Engineer to take ownership of our client’s neural network lifecycle, from architecture development to production-grade deployment. The successful candidate will be the bridge between AI research and a stable, scalable product, ensuring that the client’s 3D reconstruction LRM, voice generation, and LLM services are high-performing and cost-efficient.

• AI Operations: Support and scale product servers dedicated to neural network operations, custom LRM and LLM infrastructure.

• Model Deployment: Develop new architectures, manage versioning, and lead the release of neural network updates.

• Infrastructure & Migration: Configure server environments for heavy AI loads; lead the strategic migration from dedicated GPU servers to scalable EC2 cloud architectures.

• Voice & Generative Tech: Maintain and optimize AI voice generation services and generative music tools.

• Documentation: Create internal deployment guides and external-facing documentation/sandboxes for the Avatai Public API.

Job Requirements:

• 5-10 years of deep experience in MLOps and deploying models (LLMs, TTS/STT) at scale.

• Proficiency in GPU resource management and cloud optimization (AWS/EC2).

• Experience building and documenting Public APIs for external developers.

• Strong Python skills and familiarity with frameworks like PyTorch or TensorFlow.

• Strong proficiency in Python (expert level required); C++ is an added advantage, especially for developing or optimizing custom kernels.

• Hands-on experience with PyTorch for deep learning model development and training.

• Practical experience with Hugging Face Transformers for building and fine-tuning LLM-based solutions.

• Experience with LangChain and/or LangGraph for LLM orchestration and agent-based workflows.

• Familiarity with vLLM or similar frameworks for efficient large language model (LLM) serving and optimization.

• Experience working with text-to-speech (TTS) models such as XTTS-v2, Bark, or Coqui TTS.

• Solid understanding of real-time audio processing and WebSocket streaming for low-latency voice applications.

• Strong cloud experience with AWS, including EC2 (P4/P5 instances), S3, and SageMaker.

• Experience with containerization and orchestration tools such as Docker, Kubernetes (EKS), and Ray for distributed AI workloads.

• Experience in model deployment using Triton Inference Server or BentoML.

• Experience building and managing Retrieval-Augmented Generation (RAG) pipelines.

• Familiarity with vector databases such as Pinecone, Weaviate, or pgvector.

• Experience with ML experiment tracking and monitoring tools such as Weights & Biases (W&B) or MLflow.

• THIS ROLE IS OPEN TO MALAYSIANS ONLY.

Kindly contact Dharsh via WhatsApp at 012-9191780 or via email at dharshini.nelamagan@peoplelake.asia if you are keen to explore this opportunity.

Thank you.
