Posted on 2026/04/15
HPC Lead
Mohamed bin Zayed University of Artificial Intelligence
United Arab Emirates
Job description
Application Open:
Full-Time
MBZUAI is seeking a High-Performance Computing (HPC) Lead for the
Institute for Agriculture & AI (IAAI). The HPC Lead will be
responsible for the effective management, optimization, and
governance of the Institute’s HPC resources that support research,
model development, and deployment activities. This role ensures
secure, efficient, and equitable allocation of computational
resources for internal teams and approved external partners, while
maintaining high standards of operational excellence, system
reliability, and compliance with institutional policies and
governance requirements.
Key Responsibilities
HPC Infrastructure Management
• Oversee the design, operation, and continuous enhancement of
the Institute’s high-performance computing (HPC) infrastructure,
including compute clusters, storage systems, networking, and
associated software environments supporting AI research.
• Ensure high availability, performance optimization,
scalability, and resilience of HPC resources to meet the evolving
demands of large-scale AI model training, simulation, and
experimentation.
Resource Allocation and Infrastructure Governance
• Design and oversee transparent, fair, and efficient allocation
frameworks for HPC and computational resources, including
scheduling, quota management, and usage monitoring.
• Ensure that resource allocation aligns with institutional
priorities, approved research programs, and governance decisions,
while enabling responsible and equitable access.
Partner Support and Enablement
• Ensure that internal research teams and approved external
partners are effectively supported in accessing and using HPC
resources.
• Oversee onboarding, training, and enablement mechanisms to
maximize the effective use of computational infrastructure for AI
research, development, and innovation.
Operational Excellence and Reliability
• Maintain operational excellence across the Institute’s
technical and computational platforms by ensuring robust system
monitoring, incident management, performance tuning, and capacity
planning.
• Drive continuous improvement of operational processes to ensure
reliability, efficiency, and responsiveness to user needs.
Security, Compliance, and Data Governance
• Ensure that all HPC operations and associated research
activities comply with MBZUAI’s information security standards,
data governance policies, and applicable regulatory
requirements.
• Oversee the implementation of appropriate access controls,
auditing mechanisms, and safeguards to protect sensitive data,
models, and intellectual property.
Research Enablement and Cross-Functional Coordination
• Work closely with research leadership, data engineers,
technical teams, and operations functions to align infrastructure
capabilities with scientific objectives and emerging research
needs.
• Provide strategic input into infrastructure roadmaps,
technology investments, and capacity planning to support the
Institute’s long-term growth and impact.
Academic Qualifications Required
• Master’s degree in Computer Science, Computational Science,
Data Science, Applied Mathematics, or a related field.
• A PhD is desirable but not mandatory for candidates with
equivalent senior experience.
Professional Experience Required
Essential:
• Minimum of eight (8) to ten (10) years of progressive
experience managing high-performance computing (HPC) environments,
large-scale computing infrastructure, or advanced research
computing systems.
• Demonstrated experience supporting AI-driven and data-intensive
workloads, implementing transparent resource allocation and
governance frameworks, and leading multidisciplinary technical
teams.
• Experience in implementing security protocols and compliance
measures for HPC environments to safeguard sensitive research
data.
Preferred:
• Experience within academic, research, or research-intensive
institutional environments is highly desirable, particularly where
HPC infrastructure underpins large-scale AI research and
innovation.
• Familiarity with cloud-based HPC platforms and experience
managing hybrid or multi-cloud environments.