
Posted on 6/21/2025

Member of Technical Staff, Research Engineer

Inflection AI

Stanford, CA, United States

Full-time

Qualifications

  • Extensive experience deploying and optimizing large-scale language models for real-time inference
  • Skilled with performance-enhancing tools and frameworks such as ONNX, TensorRT, or TVM
  • Thrive in fast-paced environments where real-world application performance is paramount
  • Understand the intricate trade-offs between model accuracy, latency, and scalability
  • Passionate about delivering robust, efficient, and scalable inference solutions

Benefits

  • We aim to attract and retain top talent and compensate them fairly based on individual contributions
  • For this role, Inflection AI estimates a starting annual base salary range of approximately $175,000 - $350,000 depending on experience

Responsibilities

  • As a Member of Technical Staff, Research Engineer on our Inference team, you'll play a crucial role in ensuring the real-time performance and reliability of our AI systems
  • Your responsibilities will include optimizing inference pipelines, reducing latency, and translating cutting-edge research into practical applications
  • Optimizing inference pipelines to maximize model performance and minimize latency in production environments
  • Collaborating with ML researchers and engineers to deploy inference solutions that meet rigorous enterprise standards
  • Integrating and refining tools to streamline the transition from research prototypes to production-ready systems
  • Continuously monitoring and tuning system performance with real-world data to drive improvements
  • Pioneering innovations in model inference critical to the success of our AI platform

Full Description

Inflection AI: Revolutionizing Enterprise AI Solutions

At Inflection AI, we're a public benefit corporation dedicated to harnessing the power of our world-class large language model to build cutting-edge AI platforms tailored to the needs of the enterprise.

Our Team:

We're a dynamic organization passionate about what we do, driven by innovative ideas, and committed to collaboration. We strive to assemble diverse teams with a wide range of backgrounds and experiences.

First Product: Pi

Pi is an empathetic and conversational chatbot built on our 350B+-parameter frontier model and a sophisticated fine-tuning (10M+ examples), inference, and orchestration platform. We're now applying the same approach to develop new systems that directly address the requirements of enterprise customers.

About the Role:

As a Member of Technical Staff, Research Engineer on our Inference team, you'll play a crucial role in ensuring the real-time performance and reliability of our AI systems. Your responsibilities will include optimizing inference pipelines, reducing latency, and translating cutting-edge research into practical applications.

Ideal Candidate:

You may be a great fit if you have extensive experience deploying and optimizing large-scale language models for real-time inference, and if you are skilled with performance-enhancing tools and frameworks such as ONNX, TensorRT, or TVM. You thrive in fast-paced environments where real-world application performance is paramount, you understand the intricate trade-offs between model accuracy, latency, and scalability, and you are passionate about delivering robust, efficient, and scalable inference solutions. If so, this could be the right opportunity for you.
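To give a flavor of the kind of work the role describes, here is a minimal, purely illustrative sketch of exporting a model to ONNX and timing it with ONNX Runtime. The toy model, file name, and input shapes are placeholders chosen for this example; they are not Inflection AI's actual models or pipeline.

```python
# Illustrative only: export a small PyTorch module to ONNX and measure
# per-request latency with ONNX Runtime. The model and shapes are toy
# placeholders, not a production LLM serving stack.
import time

import torch
import onnxruntime as ort


class TinyMLP(torch.nn.Module):
    """Stand-in model; a production language model would be far larger."""

    def __init__(self, dim: int = 1024):
        super().__init__()
        self.net = torch.nn.Sequential(
            torch.nn.Linear(dim, dim),
            torch.nn.ReLU(),
            torch.nn.Linear(dim, dim),
        )

    def forward(self, x):
        return self.net(x)


model = TinyMLP().eval()
example = torch.randn(1, 1024)

# Export the graph so it can be served by an optimized runtime.
torch.onnx.export(model, example, "tiny_mlp.onnx",
                  input_names=["x"], output_names=["y"])

# Run with ONNX Runtime and measure mean latency over repeated calls.
session = ort.InferenceSession("tiny_mlp.onnx")
inputs = {"x": example.numpy()}

session.run(None, inputs)  # warm-up call
start = time.perf_counter()
for _ in range(100):
    session.run(None, inputs)
print(f"mean latency: {(time.perf_counter() - start) / 100 * 1e3:.2f} ms")
```

In practice this kind of measurement would feed decisions about which runtime, precision, and batching strategy best balance latency against accuracy and cost.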

Key Responsibilities:

  • Optimizing inference pipelines to maximize model performance and minimize latency in production environments
  • Collaborating with ML researchers and engineers to deploy inference solutions that meet rigorous enterprise standards
  • Integrating and refining tools to streamline the transition from research prototypes to production-ready systems
  • Continuously monitoring and tuning system performance with real-world data to drive improvements
  • Pioneering innovations in model inference critical to the success of our AI platform

Employee Compensation:

We aim to attract and retain top talent and compensate them fairly based on individual contributions. For this role, Inflection AI estimates a starting annual base salary range of approximately $175,000 - $350,000 depending on experience.
