Senior Site Reliability Engineer
Amazech Solutions
Job Description
Amazech Solutions is one of the fastest-growing IT Solutions and Staffing companies, established in 2007, serving clients nationwide. We are proud to be a trusted partner to a range of clients and an employee-centric organization.
🚀 Hiring: Senior Site Reliability Engineer (SRE III)
📍 Dallas, TX (Hybrid – 3 Days Onsite)
💼 Type: Long-Term, Multi-Year Contract
🔧 About the Role
We are looking for a highly skilled Senior Site Reliability Engineer (SRE III) to join our team supporting mission-critical platforms. This role focuses on ensuring system reliability, scalability, and performance across cloud-native environments.
🎯 Key Responsibilities
- Own and maintain highly available, scalable production systems
- Design and implement monitoring, alerting, and observability solutions
- Drive incident management, root cause analysis (RCA), and postmortems
- Build and manage Kubernetes-based infrastructure (EKS preferred)
- Automate infrastructure provisioning with Terraform, following Infrastructure as Code (IaC) practices
- Improve system reliability through performance tuning and capacity planning
- Collaborate with DevOps and Engineering teams to enhance CI/CD pipelines
- Define and track SLIs, SLOs, and SLAs
✅ Must-Have Skills
- Strong experience with AWS (preferred) or Azure cloud environments
- Hands-on expertise in Kubernetes (EKS), Docker, and container orchestration
- Deep knowledge of Terraform (IaC)
- Experience with monitoring tools (Datadog, Prometheus, Grafana, New Relic)
- Proficiency in scripting (Python, Bash)
- Experience with incident response, on-call rotations, and production support
- Strong understanding of CI/CD tools (Jenkins, GitHub Actions, ArgoCD)
🌟 Nice to Have
- Experience in financial services / fintech environments
- Exposure to security and compliance standards
- Knowledge of chaos engineering and resiliency testing
👤 Experience Required
- 8+ years in SRE / DevOps / Cloud Engineering roles
- Proven experience managing large-scale production systems
💡 What We’re Looking For
- Engineers who thrive in high-availability environments
- Strong problem-solvers with real-world incident handling experience
- Candidates who can balance automation, reliability, and performance