Senior Data Scientist

Firstsource
Full Time, Senior
Kanchipuram, Tamil Nadu, IN
Posted April 18, 2026

Job Description

Senior Data Scientist

Speech, Voice & Conversational AI

Department: Data Science & AI

Experience: 12–15 Years

Role Overview

We are seeking a highly experienced Senior Data Scientist – Speech, Voice & Conversational AI to lead the architecture, design, and delivery of next-generation voice and speech AI solutions. This role sits at the intersection of deep machine learning expertise and practical product engineering, driving end-to-end voice AI capabilities across Firstsource’s service lines.

The ideal candidate brings 12–15 years of progressive experience in data science with a strong specialization in speech and voice technologies, along with hands-on expertise in Generative AI, Agentic AI frameworks, and modern voice pipeline tooling. You will act as a technical thought leader, shaping our voice AI strategy, mentoring teams, and collaborating with cross-functional stakeholders to deliver production-grade solutions at scale.

Key Responsibilities

Voice & Speech AI Architecture

  • Design and own the end-to-end architecture for voice AI solutions including real-time speech-to-text (STT), text-to-speech (TTS), voice-to-voice, speaker diarization, emotion detection, and voice biometrics.
  • Evaluate, benchmark, and integrate leading speech platforms and APIs such as Google Cloud Speech, Amazon Transcribe, Azure Speech Services, Whisper (OpenAI), Deepgram, AssemblyAI, ElevenLabs, and PlayHT.
  • Build robust voice pipelines that handle noise cancellation, language identification, accent adaptation, and real-time streaming at production scale.

Generative AI & Agentic AI

  • Architect and deploy GenAI-powered conversational agents leveraging Large Language Models (LLMs) such as GPT-4, Claude, Gemini, and open-source alternatives (LLaMA, Mistral).
  • Design Agentic AI workflows using frameworks such as LangChain, LangGraph, CrewAI, AutoGen, and Semantic Kernel to build multi-step, tool-using voice agents.
  • Implement Retrieval-Augmented Generation (RAG) pipelines with vector databases (Pinecone, Weaviate, Qdrant, Chroma) for context-aware voice assistants.
  • Drive prompt engineering strategies and fine-tuning approaches (LoRA, QLoRA, RLHF) to optimize LLM performance for speech-centric use cases.

Solution Design & Delivery

  • Lead solution design workshops with clients and internal stakeholders to translate business requirements into scalable voice AI architectures.
  • Define technical roadmaps, establish best practices, and create reusable solution accelerators for voice and conversational AI.
  • Own proof-of-concept (POC) development through to production deployment, working closely with MLOps and engineering teams.

Leadership & Mentoring

  • Mentor and upskill a team of data scientists and ML engineers on speech AI and GenAI best practices.
  • Represent Firstsource as a subject-matter expert in voice AI at internal reviews, client presentations, and industry forums.
  • Stay current on rapidly evolving GenAI, speech, and agentic AI research and translate insights into actionable opportunities.

Technical Skills & Tooling

Domain: Required Proficiency

  • Speech-to-Text (STT): Whisper, Google Cloud Speech, Azure Speech, Amazon Transcribe, Deepgram, AssemblyAI, Kaldi
  • Text-to-Speech (TTS): ElevenLabs, PlayHT, Azure Neural TTS, Amazon Polly, Google WaveNet, Tortoise TTS, Bark
  • Voice-to-Voice: Real-time duplex pipelines, WebRTC integration, voice cloning, prosody transfer, streaming architectures
  • LLM & GenAI: GPT-4/4o, Claude, Gemini, LLaMA, Mistral, fine-tuning (LoRA/QLoRA), RLHF, prompt engineering
  • Agentic AI Frameworks: LangChain, LangGraph, CrewAI, AutoGen, Semantic Kernel, function calling, tool-use patterns
  • RAG & Vector DBs: Pinecone, Weaviate, Qdrant, Chroma, FAISS, embedding models, hybrid search
  • ML / Deep Learning: PyTorch, TensorFlow, Transformers (HuggingFace), audio feature engineering (MFCCs, spectrograms)
  • Cloud & MLOps: AWS / Azure / GCP, Docker, Kubernetes, MLflow, model serving (Triton, TorchServe, vLLM)
  • Programming: Python (advanced), SQL, familiarity with Rust/C++ for performance-critical audio processing
  • Telephony & Contact Center: Twilio, Genesys, Amazon Connect, SIP/VoIP protocols, CCAI (Google Contact Center AI)

Qualifications & Experience

  • 12–15 years of progressive experience in Data Science, Machine Learning, or AI Engineering, with at least 5 years focused on speech, voice, or audio ML
