Site Reliability Engineer (Remote)

Jobs Ai

Full Time | Mid

CA | Posted April 27, 2026

Resume Keywords to Include

Make sure these keywords appear in your resume to improve ATS scoring

Kubernetes, Ansible, Jenkins, Linux, Agile

Job Description

Role: Site Reliability Engineer (Remote)

Location: Remote (Work from Anywhere)

Payout: Competitive

Industry: Technology, Artificial Intelligence, Data & Analytics

Job Function: Engineering, Information Technology, Research

Role Overview:

One of our clients, a global leader in the technology industry, is seeking a skilled Site Reliability Engineer to play a pivotal role in ensuring the performance, reliability, and scalability of mission-critical infrastructure. This is a contractor position that offers the opportunity to work remotely and leverage expertise in Linux, Kubernetes, and Prometheus to architect, monitor, and enhance robust systems. As a Site Reliability Engineer, you will be responsible for designing, implementing, and maintaining scalable infrastructure to support innovative applications.

Key Responsibilities:

Design, implement, and maintain scalable infrastructure using Linux, Kubernetes, and Prometheus to support mission-critical applications
Monitor system health, analyze performance metrics, and proactively address bottlenecks or potential failures to ensure high system reliability
Automate operational processes with scripts and automation frameworks to minimize manual intervention and increase system reliability
Respond swiftly to incidents, conduct root cause analysis, and drive continuous improvements in incident response procedures to ensure high system availability
Collaborate closely with development and operations teams to deliver seamless deployments and high system reliability, using agile methodologies and collaboration tools

Required Skills & Qualifications:

Deep expertise in Linux, Kubernetes, and Prometheus, with experience in designing and implementing scalable infrastructure
Strong understanding of system performance metrics, monitoring, and analysis, with experience in using tools such as Grafana and Prometheus
Experience in automating operational processes using scripting and automation frameworks, such as Ansible and Jenkins
Strong problem-solving skills, with experience in root cause analysis and incident response
Excellent collaboration and communication skills, with experience in working with development and operations teams

More About the Opportunity:

This role offers the opportunity to work with a global leader in the technology industry, contributing to the development of innovative applications and systems. The successful candidate will have the chance to leverage their expertise in Linux, Kubernetes, and Prometheus to make a significant impact on the performance, reliability, and scalability of mission-critical infrastructure.

Equal Opportunity Employer:

We hire based on skills and expertise. All qualified candidates are welcome regardless of background, experience, or prior employment history. Applications are reviewed solely on demonstrated technical ability and qualifications.
