Data Scientist-Data Science-Gen AI Engineer

EXL
Full Time | Mid
IN | Posted April 24, 2026

Job Description

As a Machine Learning Systems Architect, you will lead the architecture, development, and deployment of scalable machine learning systems, with a focus on real-time inference for Large Language Models (LLMs) serving multiple concurrent users. You will optimize inference pipelines using high-performance frameworks such as vLLM, Groq, ONNX Runtime, Triton Inference Server, and TensorRT to minimize latency and cost.

Your key responsibilities will include:

  • Designing and implementing agentic AI systems by utilizing frameworks like LangChain, AutoGPT, and ReAct for autonomous task orchestration.
  • Fine-tuning, integrating, and deploying foundation models like GPT, LLaMA, Claude, Mistral, Falcon, and others into intelligent applications.
  • Developing and maintaining robust MLOps workflows to manage the full model lifecycle, including training, deployment, monitoring, and versioning.
  • Collaborating with DevOps teams to implement scalable serving infrastructure leveraging containerization (Docker), orchestration (Kubernetes), and cloud platforms (AWS, GCP, Azure).
  • Implementing retrieval-augmented generation (RAG) pipelines by integrating vector databases such as FAISS, Pinecone, or Weaviate.
  • Building observability systems for LLMs to track prompt performance, latency, and user feedback.
  • Working cross-functionally with research, product, and operations teams to deliver production-grade AI systems that can handle real-world traffic patterns.
  • Staying current with emerging AI trends and hardware acceleration techniques, and contributing to open-source or research initiatives whenever possible.

Your qualifications for this role should include:

  • Strong experience in leading the development and deployment of scalable machine learning systems.
  • Proficiency in optimizing inference pipelines using high-performance frameworks.
  • Hands-on experience designing and implementing agentic AI systems with frameworks such as LangChain, AutoGPT, and ReAct.
  • Expertise in fine-tuning, integrating, and deploying foundation models into intelligent applications.
  • Solid understanding of MLOps workflows and managing the full model lifecycle.
  • Previous experience in collaborating with DevOps teams and implementing scalable serving infrastructure.
  • Familiarity with retrieval-augmented generation (RAG) pipelines and integrating vector databases.
  • Excellent skills in building observability systems and tracking performance metrics.
  • Ability to work effectively in a cross-functional team environment and deliver production-grade AI systems.
  • A commitment to continuous learning and to staying current with emerging AI trends and technologies.
