Job Description
Summary
Site Reliability Engineer focused on network operations within a 24/7 NOC. The role monitors application services and critical network infrastructure, leads incident response for P1/P2 outages, performs deep technical troubleshooting across the stack, and drives automation and observability improvements. The position is onsite at corporate headquarters and covers rotating shifts, including weekends.
Responsibilities
- Maintain real-time monitoring of application services, network infrastructure, and business KPIs
- Participate in 24/7 on-call rotations and respond to PagerDuty alerts
- Lead P1 and P2 incident troubleshooting and coordinate cross-team remediation
- Perform network diagnostics, packet analysis, and performance testing during incidents
- Monitor and troubleshoot routers, switches, firewalls, load balancers, wireless systems, and SD-WAN
- Analyze network performance, identify bottlenecks, and recommend optimizations
- Document root cause analyses and produce actionable remediation reports
- Develop automation and scripts to remediate common incidents and reduce manual toil
- Build and refine monitoring dashboards and support SLO/SLI definition
Requirements
- At least 1 year of experience in site reliability engineering, NOC operations, or similar roles
- Strong networking knowledge including TCP/IP, the OSI model, BGP, OSPF, switching, VLANs, DNS, DHCP, and VPNs
- Hands-on experience with network monitoring and packet analysis tools such as Wireshark and tcpdump
- Experience with Juniper routing and switching and enterprise wireless platforms such as MIST or Aruba
- Proficiency with monitoring platforms like New Relic and tooling such as PagerDuty, ServiceNow, and Jira
- Cloud networking experience with AWS or Azure and familiarity with containers and orchestration
- Programming and automation skills in Python, Go, Bash, or PowerShell
- Proven experience managing P1/P2 incidents and performing root cause analysis