Full Description
About xAI
Our Mission
We are committed to creating AI systems that can accurately understand the universe and aid humanity in its pursuit of knowledge. Our team is small, highly motivated, and focused on engineering excellence.
We operate with a flat organizational structure. All employees are expected to be hands-on and to contribute directly to the company's mission. Leadership is given to those who show initiative and consistently deliver excellence. Work ethic and strong prioritization skills are important.
All engineers and researchers are expected to have strong communication skills. They should be able to concisely and accurately share knowledge with their teammates.
About the Role
As the Associate Site Operations Manager, you'll oversee the data center technicians who keep xAI's AI infrastructure running smoothly. This role is pivotal in ensuring our systems operate at peak efficiency, supporting the compute power behind our mission. You'll co-lead a skilled team, manage critical operations, and implement smart, sustainable solutions.
We're looking for someone with technical expertise and a proactive approach to maintain and scale our facilities effectively.
Responsibilities
• Oversee Site Operations: Manage power, cooling, networking, and hardware deployments to ensure 99.999% uptime for xAI's AI compute systems, keeping our infrastructure reliable and ready for innovation.
• Guide Your Team: Lead and develop a team of Data Center Operations Technicians through training and performance evaluations, fostering a collaborative, high-performing environment tied to xAI's objectives.
• Streamline Processes: Take charge of hardware lifecycles, incident resolution, and inventory management, refining procedures to ensure your team operates with precision and consistency.
• Connect Key Players: Coordinate between technicians, xAI's AI specialists, and external vendors to integrate new technology and expand capacity seamlessly.
• Drive Sustainable Solutions: Champion energy-efficient practices and sustainability efforts, optimizing resources while supporting the demands of cutting-edge AI workloads.
• Measure Success: Track and report key metrics like uptime, power efficiency, and issue resolution times, using data to enhance site performance and inform decisions.
• Handle Emergencies: Lead the team through urgent situations with clear direction, resolving issues quickly to protect our AI systems from disruption.
• Optimize Operations: Build and refine processes, such as preventative maintenance schedules with vendors and ticket workflows in Jira, to keep operations efficient and scalable.
• Support Expansion: Work with leadership to standardize best practices across sites (if applicable), ensuring operations align with xAI's ambitious growth plans.
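For a sense of scale, the 99.999% uptime target listed under Oversee Site Operations can be converted into an allowed-downtime budget. The sketch below is illustrative arithmetic only, not an xAI tool or method: the function names are hypothetical, and it assumes availability is measured over a 365-day year. Under that assumption, 99.999% leaves roughly 5.26 minutes of downtime per year.

```python
# Illustrative sketch: function names and the 365-day measurement window
# are assumptions for this example, not details taken from the posting.

MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes in a non-leap year


def downtime_budget_minutes(availability_pct: float,
                            period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the maximum downtime (in minutes) an availability target allows."""
    if not 0 <= availability_pct <= 100:
        raise ValueError("availability_pct must be between 0 and 100")
    return period_minutes * (1 - availability_pct / 100)


def measured_availability_pct(downtime_minutes: float,
                              period_minutes: float = MINUTES_PER_YEAR) -> float:
    """Return the availability percentage implied by observed downtime."""
    if not 0 <= downtime_minutes <= period_minutes:
        raise ValueError("downtime_minutes must be between 0 and period_minutes")
    return 100 * (1 - downtime_minutes / period_minutes)


if __name__ == "__main__":
    # Compare common availability targets side by side.
    for target in (99.9, 99.99, 99.999):
        budget = downtime_budget_minutes(target)
        print(f"{target}% uptime allows {budget:.2f} minutes of downtime per year")
```

The same helper can be applied to a shorter reporting window by passing a different `period_minutes`, for example `30 * 24 * 60` for a 30-day month.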