
Posted on 2025/12/12

Applied Scientist - AI Testing

Teleion Consulting

Seattle, WA, United States

Full-time

Qualifications

  • Experience with AI model evaluation, red teaming, or safety testing
  • Familiarity with prompt injection attacks, AI security threats, and LLM behavior analysis
  • Ability to analyze datasets, identify gaps, and produce actionable insights for Responsible AI evaluations
  • Strong communication skills for cross-functional collaboration with engineering, research, and product teams
  • Required: Eligibility to work in the United States without sponsorship, now or in the future

Benefits

  • This is a full-time role, and Teleion offers full benefits, PTO, holidays, and 401(k)

Responsibilities

  • The AI Evaluation & Security Specialist will support evaluation workflows for our client’s product group, with a focus on identifying vulnerabilities, strengthening AI system safety, and improving model performance across emerging digital worker and copilot experiences
  • Investigate and analyze known XPIA (Cross Prompt Injection Attack) incidents and similar attack vectors across OPG canvas surfaces
  • Identify and characterize harm cases and vulnerabilities affecting AI systems, including prompt manipulation, model exploitation, and safety bypass risks
  • Expand existing EVAL frameworks to include digital worker security testing, covering XPIA, UPIA (Unauthorized Prompt Injection Attacks), and memory poisoning risks
  • Support accuracy and quality evaluations for agent-based systems developed to accelerate Copilot and related productivity experiences
  • Develop evaluation insights that directly inform model iteration, safety hardening, and product readiness
  • Conduct deep dives into Responsible AI datasets, including synthetic datasets, to assess their suitability and impact on safety evaluation workflows
  • Validate and report on whether synthetic RAI datasets improve or fail to improve evaluation coverage, in alignment with internal standards
  • Produce clear, evidence-based recommendations to strengthen Responsible AI dataset quality, coverage, and reliability

Full Description

• Must live in one of the following states to be considered: Florida (FL), Georgia (GA), Illinois (IL), Iowa (IA), Nevada (NV), North Carolina (NC), Pennsylvania (PA), Texas (TX), Washington (WA), Virginia (VA), Wisconsin (WI)

The AI Evaluation & Security Specialist will support evaluation workflows for our client’s product group, with a focus on identifying vulnerabilities, strengthening AI system safety, and improving model performance across emerging digital worker and copilot experiences.

Key Responsibilities

AI Security & Vulnerability Analysis

• Investigate and analyze known XPIA (Cross Prompt Injection Attack) incidents and similar attack vectors across OPG canvas surfaces.

• Identify and characterize harm cases and vulnerabilities affecting AI systems, including prompt manipulation, model exploitation, and safety bypass risks.

• Expand existing EVAL frameworks to include digital worker security testing, covering XPIA, UPIA (Unauthorized Prompt Injection Attacks), and memory poisoning risks.

Evaluation & Model Quality

• Support accuracy and quality evaluations for agent-based systems developed to accelerate Copilot and related productivity experiences.

• Develop evaluation insights that directly inform model iteration, safety hardening, and product readiness.

Responsible AI Research

• Conduct deep dives into Responsible AI datasets, including synthetic datasets, to assess their suitability and impact on safety evaluation workflows.

• Validate and report on whether synthetic RAI datasets improve or fail to improve evaluation coverage, in alignment with internal standards.

• Produce clear, evidence-based recommendations to strengthen Responsible AI dataset quality, coverage, and reliability.

Qualifications

• Experience with AI model evaluation, red teaming, or safety testing.

• Familiarity with prompt injection attacks, AI security threats, and LLM behavior analysis.

• Ability to analyze datasets, identify gaps, and produce actionable insights for Responsible AI evaluations.

• Strong communication skills for cross-functional collaboration with engineering, research, and product teams.

This is a full-time role, and Teleion offers full benefits, PTO, holidays, and 401(k).

See how other employees have reviewed us on Glassdoor.

Required: Eligibility to work in the United States without sponsorship, now or in the future.

Teleion has made Seattle Business Magazine’s Washington’s 100 Best Places to Work list!

(https://seattlebusinessmag.com/100-best-companies-work/100-best-companies-work-midsize)

Teleion is minority-owned and an Equal Opportunity Employer. We welcome all races, sexual orientations, gender identities, veterans, religions, and disabilities.
