Posted on 2026/01/28

Data Scientist NLP & AI

Jobs via Dice

Houston, TX, United States

Full-time

Full Description

Our client, ADDSOURCE, is seeking the following.

Experience Required - 12+ years of relevant experience

Location - Houston, TX (minimum two days per week onsite)

Hiring Mode - C2C

What's in it for you?

As a Data Scientist NLP & AI, you will join an agile team dedicated to building intelligent healthcare solutions.

You will design advanced NLP capabilities, integrate large language models (LLMs) and agent-based AI workflows, and leverage AWS big data technologies to improve clinical data processing, accessibility, and usability.

Key Responsibilities

• Analyze and process clinical text data using AI-driven NLP techniques and advanced machine learning models.

• Enhance and optimize existing workflows by integrating modern machine learning and deep learning approaches, including LLMs and agentic workflow frameworks such as LangGraph in healthcare environments.

• Design and develop NLP modules using Python and other scripting languages as part of the NLP engineering team.

• Perform data preprocessing, quality assessment, and validation of NLP model outputs.

• Develop structured testing methodologies, error-detection mechanisms, and user documentation for NLP solutions.

• Build and maintain data infrastructure for efficient extraction, transformation, and loading (ETL) from diverse data sources, including MCP servers, using SQL and AWS big data tools such as EMR and Spark/PySpark.

• Partner with engineering teams to ensure scalable, high-performance data workflows leveraging SQL and AWS technologies.

• Apply hands-on knowledge of AWS services, particularly AWS Bedrock, to build generative AI solutions.

• Utilize relational databases such as PostgreSQL and MySQL to support NLP and AI pipelines.

Education

• Engineering degree: BE / ME / BTech / MTech / BSc / MSc

• Technical certifications across multiple technologies are a plus

Required Skills

• Strong proficiency in Python and scripting for NLP and machine learning development

• Solid experience with clinical NLP techniques and ML/DL models

• Hands-on expertise with LLMs and agentic workflow tools such as LangGraph

• Advanced knowledge of SQL and big data technologies including AWS EMR and Spark/PySpark

• Practical experience with AWS services, especially AWS Bedrock

• Experience working with relational databases such as PostgreSQL or MySQL

Nice-to-Have Skills

• Exposure to generative AI solutions in healthcare use cases

• Knowledge of healthcare data standards and terminologies (HL7, FHIR, C-CDA)

• Experience producing technical documentation, user guides, and specifications

• Background in automated testing and validation frameworks for NLP systems

• Strong collaboration skills across engineering and product teams

• Familiarity with LangChain or similar agent-based AI frameworks
