Posted on 5/14/2025

Mid-Level Machine Learning Engineer

TetraMem - Accelerate The World

Fremont, CA, United States

Full-time

Responsibilities

• Develop, optimize, and deploy lightweight machine learning models for edge AI applications, particularly for audio processing.

• Implement and optimize ML models on embedded platforms, including FPGA and custom ASIC solutions.

• Work closely with hardware and software teams to integrate ML models into production systems.

• Research and implement state-of-the-art ML techniques to improve model efficiency and reduce latency and power consumption for embedded AI applications.

• Improve inference efficiency and model compression techniques, including quantization, pruning, and knowledge distillation.

• Collaborate with cross-functional teams to drive innovation and contribute to the overall system architecture.

• Provide technical leadership and mentorship to junior engineers.

• Publish research findings, present at conferences, and contribute to open-source projects when applicable.

Requirements

• 5+ years of experience or PhD in Computer Science, Electrical Engineering, or related fields.

• Strong experience in machine learning, with a focus on edge AI and lightweight model deployment.

• Expertise in ML frameworks such as PyTorch, TensorFlow, or JAX.

• Proficiency in programming languages such as C/C++ and Python, and experience with ML model optimization.

• Ability to work independently and collaboratively in a fast-paced startup environment.

Experience in one or more of the following areas is considered a strong plus:

• Understanding of ML compiler and runtime design.

• Experience working with tools such as Optimum, ONNX, TensorRT, TFLite/LiteRT, ncnn, or CoreML.

• Familiarity with hardware acceleration techniques.

• Experience in embedded system development.

Salary Range: $110,000 - $300,000 / year