Site Reliability Engineer (SRE) – AI & Incident Management

Praxis HR Solution
Full Time | Mid | Hybrid
Gurugram, Haryana, IN | Posted March 12, 2026

Job Description

Job Title

Site Reliability Engineer (SRE) – AI & Incident Management

Location

Pune | Gurugram | Noida (Hybrid / On-site)

Employment Type

Full-Time

Notice Period

Immediate Joiners to 30 Days

Job Summary

We are looking for a highly motivated Site Reliability Engineer (SRE) with strong expertise in AI-driven systems and Incident Management. The ideal candidate will be responsible for ensuring the reliability, scalability, and performance of critical production systems. This role requires hands-on experience in automation, monitoring, and incident response to maintain high system availability.

Key Responsibilities

  • Ensure high availability, reliability, and performance of production systems.
  • Monitor infrastructure and applications to detect and resolve issues proactively.
  • Manage incident response, troubleshooting, and root cause analysis (RCA).
  • Implement automation to improve operational efficiency and reduce manual effort.
  • Work closely with development teams to improve system reliability and deployment processes.
  • Utilize AI/ML tools or AI-enabled platforms to enhance monitoring and incident prediction.
  • Maintain SLA, SLO, and SLI metrics for system reliability.
  • Build and maintain observability solutions (logging, metrics, tracing).
  • Participate in on-call rotations and handle production incidents.

Required Skills

  • Strong experience in Site Reliability Engineering (SRE)
  • Hands-on experience with Incident Management and Production Support
  • Knowledge of AI tools / AI-driven automation / AI-based monitoring
  • Experience with Cloud Platforms (AWS / Azure / GCP)
  • Familiarity with Monitoring Tools (Prometheus, Grafana, Datadog, Splunk, etc.)
  • Experience with Linux / scripting (Python, Bash)
  • Knowledge of CI/CD pipelines and DevOps practices
  • Understanding of containerization (Docker, Kubernetes)

Preferred Qualifications

  • Experience with AIOps platforms
  • Knowledge of Infrastructure as Code (Terraform / Ansible)
  • Strong debugging and problem-solving skills
  • Experience working in high-availability distributed systems

Why Join Us

  • Opportunity to work on modern AI-driven infrastructure
  • Exposure to large-scale production environments
  • Collaborative and growth-focused work culture

How to Apply

Interested candidates who can join immediately or within 30 days can apply via Indeed or share their updated resume.

Job Types: Full-time, Permanent

Pay: ₹1,200,000.00 per year

Benefits

  • Cell phone reimbursement
  • Food provided
  • Health insurance
  • Paid sick time
  • Paid time off
  • Provident Fund
  • Work from home

Work Location: In person
