Posted on 9/18/2025

Head of Evaluation and Oversight Research

Scale AI

Washington, DC, United States

Full-time

$260K–$350K

Qualifications

A track record of impactful research in machine learning, especially in generative AI, evaluation, or oversight
Significant experience leading ML research in academia or industry
Strong written and verbal communication skills for cross-functional collaboration
Experience building and mentoring teams of research scientists and engineers

Responsibilities

Lead a team of research scientists and engineers on foundational work in evaluation and oversight
Drive research initiatives on frameworks and benchmarks for frontier AI models, spanning reasoning, coding, multi-modal, and agentic behaviors
Design and advance scalable oversight methods, leveraging model-assisted evaluation, rubric-guided judgments, and recursive oversight
Collaborate with leading research labs across industry and academia
Publish research at top-tier venues and contribute to open-source benchmarking initiatives
Remain deeply engaged with the research community, both understanding trends and setting them

Full Description

Job Description:

We are seeking a Head of Evaluation and Oversight Research to lead our research team in shaping the next generation of evaluation science for frontier AI models.

Main Responsibilities:

• Lead a team of research scientists and engineers on foundational work in evaluation and oversight.

• Drive research initiatives on frameworks and benchmarks for frontier AI models, spanning reasoning, coding, multi-modal, and agentic behaviors.

• Design and advance scalable oversight methods, leveraging model-assisted evaluation, rubric-guided judgments, and recursive oversight.

• Collaborate with leading research labs across industry and academia.

• Publish research at top-tier venues and contribute to open-source benchmarking initiatives.

• Remain deeply engaged with the research community, both understanding trends and setting them.

Ideal Background:

• A track record of impactful research in machine learning, especially in generative AI, evaluation, or oversight.

• Significant experience leading ML research in academia or industry.

• Strong written and verbal communication skills for cross-functional collaboration.

• Experience building and mentoring teams of research scientists and engineers.

About Scale:

We believe the transition from traditional software to AI is a major shift. Our mission is to accelerate this transition across industries by powering the development and deployment of AI applications.

EEO Statement:

We are an inclusive and equal opportunity workplace. We do not discriminate on the basis of race, color, ancestry, religion, sex, national origin, sexual orientation, age, citizenship, marital status, disability status, gender identity, or veteran status.
