Posted on 6/13/2025
Staff Machine Learning Engineer
TieTalent
Springfield, IL
Qualifications
- Master's degree in Computer Science, AI, or related field
- 8+ years building and operating large-scale data pipelines in production
- Experience with heterogeneous datasets and frameworks like Apache Spark, Ray, Dask
- Hands-on experience with large-scale ML data tools like Hugging Face Datasets, MosaicML Streaming, WebDataset, Petastorm
- Proficiency in Python and distributed data processing tools
- Leadership skills: influence on technical and data strategy, mentoring, cross-team collaboration, excellent communication, ability to work in fast-paced environments
- Preferred: open-source contributions
- Preferred: experience with clinical or biological data (EHR, genomics, imaging)
- We value curiosity, passion, and a drive to build and learn continuously
Benefits
- Pay range: NY: $190,000 - $230,000 USD; California: $190,000 - $230,000 USD; Illinois: $170,000 - $210,000 USD; Remote (USA): $170,000 - $210,000 USD. Salary varies by location and experience
- Tempus offers comprehensive benefits including incentive compensation, stock units, medical, and other benefits
Responsibilities
- You will design, build, and optimize data infrastructure powering Tempus's generative AI models, enabling analysis of complex data types like genomics, pathology images, radiology scans, and clinical notes
- This role is crucial for applying AI to improve healthcare outcomes
- Architect, build, and maintain the data infrastructure supporting large multimodal generative models, managing datasets from ingestion to knowledge integration, supporting AI learning from real-world evidence
- Design data processing workflows for multimodal data integration with ML training frameworks (GPU clusters)
- Develop strategies for data ingestion from various sources, ensuring compliance and efficiency
- Utilize frameworks like MosaicML Streaming, Ray Data, HF Datasets for data handling
- Collaborate with infrastructure teams to leverage cloud services (primarily GCP)
- Build connectors for knowledge sources such as knowledge graphs, biomedical literature, and ontologies
- Optimize data storage and access for large-scale training and knowledge retrieval
- Manage data workflows using tools like Airflow, Kubeflow Pipelines
- Implement monitoring and alerting for data pipelines, ensuring quality and performance
- Analyze and address data I/O bottlenecks, manage cloud storage costs
- Apply an understanding of how large models (Foundation Models, LLMs, Multimodal Models) are trained
Full Description
About
Locations: Chicago, New York City, Remote - Illinois, Redwood City
Time type: Full time
Posted on: 30+ Days Ago
Job requisition id: JR202500372

Passionate about precision medicine and advancing the healthcare industry? Recent advancements in underlying technology have made it possible for AI to impact clinical care significantly. Tempus' proprietary platform connects real-world evidence to deliver real-time, actionable insights to physicians, providing critical information about treatments for patients at the right time.

What You’ll Do:
We seek an experienced Staff Machine Learning Engineer with expertise in large-scale multimodal model systems engineering to join our AI team. You will design, build, and optimize data infrastructure powering Tempus's generative AI models, enabling analysis of complex data types like genomics, pathology images, radiology scans, and clinical notes. This role is crucial for applying AI to improve healthcare outcomes.

Focus:
Your main focus will be to architect, build, and maintain the data infrastructure supporting large multimodal generative models, managing datasets from ingestion to knowledge integration, supporting AI learning from real-world evidence.

Key Responsibilities:
- Design data processing workflows for multimodal data integration with ML training frameworks (GPU clusters).
- Develop strategies for data ingestion from various sources, ensuring compliance and efficiency.
- Utilize frameworks like MosaicML Streaming, Ray Data, HF Datasets for data handling.
- Collaborate with infrastructure teams to leverage cloud services (primarily GCP).
- Build connectors for knowledge sources such as knowledge graphs, biomedical literature, and ontologies.
- Optimize data storage and access for large-scale training and knowledge retrieval.
- Manage data workflows using tools like Airflow, Kubeflow Pipelines.
- Implement monitoring and alerting for data pipelines, ensuring quality and performance.
- Analyze and address data I/O bottlenecks, manage cloud storage costs.

Required Skills and Experience:
- Master's degree in Computer Science, AI, or related field.
- 8+ years in building and operating large-scale data pipelines in production.
- Experience with heterogeneous datasets and frameworks like Apache Spark, Ray, Dask.
- Hands-on with large-scale ML data tools like Hugging Face Datasets, MosaicML Streaming, WebDataset, Petastorm.
- Understanding of training large models (Foundation Models, LLMs, Multimodal Models).
- Proficiency in Python and distributed data processing tools.
- Leadership skills: influence on technical and data strategy, mentoring, cross-team collaboration, excellent communication, ability to work in fast-paced environments.

Preferred Qualifications:
- PhD in Computer Science, Engineering, Bioinformatics, or related field.
- Open-source contributions.
- Experience with clinical or biological data (EHR, genomics, imaging).

Pay Range:
- NY: $190,000 - $230,000 USD
- California: $190,000 - $230,000 USD
- Illinois: $170,000 - $210,000 USD
- Remote (USA): $170,000 - $210,000 USD
Salary varies by location and experience.

Tempus offers comprehensive benefits including incentive compensation, stock units, medical, and other benefits.

For remote roles in Los Angeles, criminal history may be considered due to job responsibilities involving sensitive information. Qualified applicants will be considered per applicable laws. We are an equal opportunity employer, committed to diversity and inclusion.

About Us
Why Work Here?
We’re looking for people who can change the world, question the status quo, and tackle tough problems. We value curiosity, passion, and a drive to build and learn continuously. Join us to address one of humanity’s most significant challenges.
Nice-to-have skills
• Genomics
• Cloud Services
• GCP
• Python
• Illinois, United States
Work experience
• Machine Learning
• Data Engineer
• Data Infrastructure
Languages
• English