LLM Engineer - Machine Learning

Zorba Consulting India Pvt. Ltd.
Full Time | Mid-level
Telangana, IN | Posted March 20, 2026

Resume Keywords to Include

Make sure these keywords appear in your resume to improve ATS scoring

Python, GCP, Azure, Docker, Kubernetes, PyTorch, CI/CD

Job Description

Description: A leading consulting firm operating in the Enterprise Generative AI and Large Language Model (LLM) services sector, delivering production-grade LLM solutions, retrieval-augmented systems, and custom generative AI products for enterprise clients across domains. The team focuses on building secure, scalable, low-latency inference services and automating model lifecycle workflows for on-prem and cloud deployments.

Position: LLM Engineer - On-site (India). We are hiring an experienced LLM engineer to design, fine-tune, and deploy LLM-based solutions that power search, summarization, agents, and domain-specific assistants.

Role & Responsibilities:
- Design, fine-tune, and validate LLMs for production use cases: instruction tuning, supervised fine-tuning, and parameter-efficient tuning (LoRA/adapters).
- Implement retrieval-augmented generation (RAG) pipelines: embeddings, vector search, chunking, and context assembly for high-recall responses.
- Optimize inference for latency and cost: quantization, model pruning, batching, and deployment with optimized runtimes (CUDA, Triton, bitsandbytes where applicable).
- Build backend services and APIs to serve LLM inference and orchestration using containerized deployments (Docker/Kubernetes) and CI/CD pipelines.
- Collaborate with product, data engineering, and ML teams to integrate LLMs into production flows, monitor model performance, and set up automated retraining/rollbacks.
- Create reproducible training pipelines, implement evaluation suites, and produce documentation and runbooks for model governance and observability.

Skills & Qualifications:

Must-Have:
- 4+ years of hands-on experience working with LLMs or advanced NLP models in production contexts.
- Proficiency in Python for ML engineering and model development.
- Experience with PyTorch and Hugging Face Transformers for training and fine-tuning.
- Practical experience implementing RAG and vector search using tools like FAISS or similar vector databases.
- Familiarity with LangChain (or equivalent orchestration) and integration with LLM APIs (OpenAI, Anthropic, etc.).
- Experience containerizing and deploying ML services using Docker; familiarity with Kubernetes is a plus.

Preferred:
- Experience with inference optimizations: quantization (bitsandbytes), Triton, or GPU-accelerated serving.
- Exposure to distributed training frameworks (DeepSpeed) and cloud MLOps platforms (SageMaker, Azure ML, GCP AI Platform).
- Knowledge of monitoring, logging, and model-evaluation frameworks for production LLMs (MLflow, Prometheus, Grafana).

Benefits & Culture Highlights:
- Collaborative, engineering-driven culture with strong focus on ownership and rapid iteration.
- Opportunity to build end-to-end LLM products for enterprise clients and influence architecture decisions.
- On-site role with hands-on access to GPU infrastructure and cross-functional product teams.

Skills: pytorch, cuda, docker, python, agentic, llm (ref: hirist.tech)
