Posted on 2026/04/15

HPC Lead

Mohamed bin Zayed University of Artificial Intelligence

United Arab Emirates

Full-time

Job description

Application Open:

Full-Time

MBZUAI is seeking a High-Performance Computing (HPC) Lead for the

Institute for Agriculture & AI (IAAI).

The HPC Lead will be

responsible for the effective management, optimization, and

governance of the Institute’s HPC resources that support research,

model development, and deployment activities.

This role ensures

secure, efficient, and equitable allocation of computational

resources for internal teams and approved external partners, while

maintaining high standards of operational excellence, system

reliability, and compliance with institutional policies and

governance requirements.

Key Responsibilities

HPC Infrastructure Management

• Oversee the design, operation, and continuous enhancement of

the Institute’s high-performance computing (HPC) infrastructure,

including compute clusters, storage systems, networking, and

associated software environments supporting AI research.

• Ensure high availability, performance optimization,

scalability, and resilience of HPC resources to meet the evolving

demands of large-scale AI model training, simulation, and

experimentation.

Resource Allocation and Infrastructure Governance

• Design and oversee transparent, fair, and efficient allocation

frameworks for HPC and computational resources, including

scheduling, quota management, and usage monitoring.

Ensure that

resource allocation aligns with institutional priorities, approved

research programs, and governance decisions, while enabling

responsible and equitable access.

Partner Support and Enablement

• Ensure that internal research teams and approved external

partners are effectively supported in accessing and using HPC

resources.

• Oversee onboarding, training, and enablement mechanisms to

maximize the effective use of computational infrastructure for AI

research, development, and innovation.

Operational Excellence and Reliability

• Maintain operational excellence across the Institute’s

technical and computational platforms by ensuring robust system

monitoring, incident management, performance tuning, and capacity

planning.

• Drive continuous improvement of operational processes to ensure

reliability, efficiency, and responsiveness to user needs.

Security, Compliance, and Data Governance

• Ensure that all HPC operations and associated research

activities comply with MBZUAI’s information security standards,

data governance policies, and applicable regulatory

requirements.

• Oversee the implementation of appropriate access controls,

auditing mechanisms, and safeguards to protect sensitive data,

models, and intellectual property.

Research Enablement and Cross-Functional Coordination

• Work closely with research leadership, data engineers,

technical teams, and operations functions to align infrastructure

capabilities with scientific objectives and emerging research

needs.

• Provide strategic input into infrastructure roadmaps,

technology investments, and capacity planning to support the

Institute’s long-term growth and impact.

Academic Qualifications Required

• Master’s degree in Computer Science, Computational Science,

Data Science, Applied Mathematics, or a related field.

• A PhD is desirable but not mandatory for candidates with

equivalent senior experience.

Professional Experience Required

Essential:

• Minimum of eight (8) to ten (10) years of progressive

experience managing high-performance computing (HPC) environments,

large-scale computing infrastructure, or advanced research

computing systems.

• Demonstrated experience supporting AI-driven and data-intensive

workloads, implementing transparent resource allocation and

governance frameworks, and leading multidisciplinary technical

teams.

• Experience in implementing security protocols and compliance

measures for HPC environments to safeguard sensitive research

data.

Preferred:

• Experience within academic, research, or research-intensive

institutional environments is highly desirable, particularly where

HPC infrastructure underpins large-scale AI research and

innovation.

• Familiarity with cloud-based HPC platforms and experience

managing hybrid or multi-cloud environments.
