Site Reliability Engineer at Alter Domus

Alter Domus
Full Time, Mid
Telangana, IN
Posted March 19, 2026

Job Description

We are looking for an experienced and motivated DevOps Engineer to join our Site Reliability Engineering (SRE) team. This role involves spearheading the Grafana Cloud and Backstage implementations as part of our Observability project. The ideal candidate will combine technical expertise in observability tools with strong problem-solving skills and a passion for building efficient, reliable systems.

Key Responsibilities

  • Configure and manage data sources, including Prometheus and Azure Monitor, to build dashboards in Grafana.
  • Collaborate with DevOps engineers, system administrators, and software developers to understand monitoring requirements and design robust observability solutions.
  • Customize and extend Grafana functionalities by developing and implementing plugins and scripts.
  • Enhance visualizations for observability solutions to meet organizational needs.
  • Optimize dashboard performance and usability by fine-tuning data queries.
  • Troubleshoot and resolve issues related to Grafana configuration, data ingestion, and visualizations.
  • Participate in the administration, maintenance, and development of observability tools, including Grafana and ELK stack.
  • Troubleshoot network communication problems and ensure smooth operations.
  • Support Backstage implementation to enhance developer experience within the organization.

Required Skills

  • Familiarity with Event Management and Application Monitoring concepts.
  • Experience in building and enhancing visualizations for observability solutions.
  • Proficiency with observability tools such as Grafana, Prometheus, Dynatrace, Splunk, Azure Monitor, or AWS CloudWatch.
  • Expertise in scripting with one or more of the following languages: Unix Shell, Windows PowerShell, JavaScript, Python, or Go.
  • Strong problem-solving and analytical skills, with the ability to troubleshoot complex network communication issues.
  • Hands-on experience with the administration, maintenance, and development of Grafana or ELK stack.
  • 5-7 years of domain experience in monitoring or related fields.
  • Comfortable working with both Windows and Linux command lines.
  • Excellent communication and collaboration skills, with the ability to work effectively within a team and interact with stakeholders.

Core/Must-Have Skills

  • Observability Subject Matter Expertise (SME)
  • Prometheus
  • Azure Monitor
  • Grafana
  • OpenTelemetry

Good-to-Have Skills

  • Proficiency in Unix Shell, Windows PowerShell, JavaScript, Python, or Go.
  • Familiarity with Backstage implementation.
  • Experience troubleshooting network communication problems.
