Senior Data Scientist

Firstsource

Full Time, Senior

Kanchipuram, Tamil Nadu, IN

Posted April 18, 2026

Job Description

Senior Data Scientist

Speech, Voice & Conversational AI

Department: Data Science & AI

Experience: 12–15 Years

Role Overview

We are seeking a highly experienced Senior Data Scientist – Speech, Voice & Conversational AI to lead the architecture, design, and delivery of next-generation voice and speech AI solutions. This role sits at the intersection of deep machine learning expertise and practical product engineering, driving end-to-end voice AI capabilities across Firstsource’s service lines.

The ideal candidate brings 12–15 years of progressive experience in data science with a strong specialization in speech and voice technologies, along with hands-on expertise in Generative AI, Agentic AI frameworks, and modern voice pipeline tooling. You will act as a technical thought leader, shaping our voice AI strategy, mentoring teams, and collaborating with cross-functional stakeholders to deliver production-grade solutions at scale.

Key Responsibilities

Voice & Speech AI Architecture

Design and own the end-to-end architecture for voice AI solutions including real-time speech-to-text (STT), text-to-speech (TTS), voice-to-voice, speaker diarization, emotion detection, and voice biometrics.
Evaluate, benchmark, and integrate leading speech platforms and APIs such as Google Cloud Speech, Amazon Transcribe, Azure Speech Services, Whisper (OpenAI), Deepgram, AssemblyAI, ElevenLabs, and PlayHT.
Build robust voice pipelines that handle noise cancellation, language identification, accent adaptation, and real-time streaming at production scale.

Generative AI & Agentic AI

Architect and deploy GenAI-powered conversational agents leveraging Large Language Models (LLMs) such as GPT-4, Claude, Gemini, and open-source alternatives (LLaMA, Mistral).
Design Agentic AI workflows using frameworks such as LangChain, LangGraph, CrewAI, AutoGen, and Semantic Kernel to build multi-step, tool-using voice agents.
Implement Retrieval-Augmented Generation (RAG) pipelines with vector databases (Pinecone, Weaviate, Qdrant, Chroma) for context-aware voice assistants.
Drive prompt engineering strategies and fine-tuning approaches (LoRA, QLoRA, RLHF) to optimize LLM performance for speech-centric use cases.

Solution Design & Delivery

Lead solution design workshops with clients and internal stakeholders to translate business requirements into scalable voice AI architectures.
Define technical roadmaps, establish best practices, and create reusable solution accelerators for voice and conversational AI.
Own proof-of-concept (POC) development through to production deployment, working closely with MLOps and engineering teams.

Leadership & Mentoring

Mentor and upskill a team of data scientists and ML engineers on speech AI and GenAI best practices.
Represent Firstsource as a subject-matter expert in voice AI at internal reviews, client presentations, and industry forums.
Stay current on rapidly evolving GenAI, speech, and agentic AI research and translate insights into actionable opportunities.

Technical Skills & Tooling

Domain: Required Proficiency

Speech-to-Text (STT): Whisper, Google Cloud Speech, Azure Speech, Amazon Transcribe, Deepgram, AssemblyAI, Kaldi
Text-to-Speech (TTS): ElevenLabs, PlayHT, Azure Neural TTS, Amazon Polly, Google WaveNet, Tortoise TTS, Bark
Voice-to-Voice: Real-time duplex pipelines, WebRTC integration, voice cloning, prosody transfer, streaming architectures
LLM & GenAI: GPT-4/4o, Claude, Gemini, LLaMA, Mistral, fine-tuning (LoRA/QLoRA), RLHF, prompt engineering
Agentic AI Frameworks: LangChain, LangGraph, CrewAI, AutoGen, Semantic Kernel, function calling, tool-use patterns
RAG & Vector DBs: Pinecone, Weaviate, Qdrant, Chroma, FAISS, embedding models, hybrid search
ML / Deep Learning: PyTorch, TensorFlow, Transformers (HuggingFace), audio feature engineering (MFCCs, spectrograms)
Cloud & MLOps: AWS / Azure / GCP, Docker, Kubernetes, MLflow, model serving (Triton, TorchServe, vLLM)
Programming: Python (advanced), SQL, familiarity with Rust/C++ for performance-critical audio processing
Telephony & Contact Center: Twilio, Genesys, Amazon Connect, SIP/VoIP protocols, CCAI (Google Contact Center AI)

Qualifications & Experience

12–15 years of progressive experience in Data Science, Machine Learning, or AI Engineering, with at least 5 years focused on speech, voice, or audio ML.
