Site Reliability Engineer

Alter Domus
Full Time, Mid
Telangana, IN
Posted 6 weeks ago

Job Description

We are looking for an experienced and motivated DevOps Engineer to join our Site Reliability Engineering (SRE) team. This role involves spearheading the Grafana Cloud and Backstage implementations as part of our Observability project. The ideal candidate will bring technical expertise in observability tools, strong problem-solving skills, and a passion for building efficient, reliable systems.

Key Responsibilities:

  • Configure and manage data sources, including Prometheus and Azure Monitor, to build dashboards in Grafana.
  • Collaborate with DevOps engineers, system administrators, and software developers to understand monitoring requirements and design robust observability solutions.
  • Customize and extend Grafana functionality by developing and implementing plugins and scripts.
  • Enhance visualizations for observability solutions to meet organizational needs.
  • Optimize dashboard performance and usability by fine-tuning data queries.
  • Troubleshoot and resolve issues related to Grafana configuration, data ingestion, and visualizations.
  • Participate in the administration, maintenance, and development of observability tools, including Grafana and the ELK stack.
  • Troubleshoot network communication problems and ensure smooth operations.
  • Support Backstage implementation to enhance developer experience within the organization.

Required Skills

  • Familiarity with Event Management and Application Monitoring concepts.
  • Experience in building and enhancing visualizations for observability solutions.
  • Proficiency with observability tools such as Grafana, Prometheus, Dynatrace, Splunk, Azure Monitor, or AWS CloudWatch.
  • Expertise in scripting with one or more of the following languages: Unix Shell, Windows PowerShell, JavaScript, Python, or Go.
  • Strong problem-solving and analytical skills, with the ability to troubleshoot complex network communication issues.
  • Hands-on experience with the administration, maintenance, and development of Grafana or the ELK stack.
  • 5 to 7 years of domain experience in monitoring or related fields.
  • Comfortable working with both Windows and Linux command lines.
  • Excellent communication and collaboration skills, with the ability to work effectively within a team and interact with stakeholders.

Core/Must-Have Skills

  • Observability Subject Matter Expertise (SME)
  • Prometheus
  • Azure Monitor
  • Grafana
  • OpenTelemetry

Good-to-Have Skills

  • Proficiency in Unix Shell, Windows PowerShell, JavaScript, Python, or Go.
  • Familiarity with Backstage implementation.
  • Experience troubleshooting network communication problems.

About Alter Domus

Alter Domus

alterdomus.com

DevOps, On-site
