About the Role

We are looking for a hands-on Site Reliability Engineer with deep expertise in VoIP platforms and cloud infrastructure automation.

You will be a key member of our SRE/NOC team, responsible for the reliability, scalability, and operational excellence of the NetSapiens communications platform running on Oracle Cloud Infrastructure (OCI).

This role sits at the intersection of software engineering and systems operations: you will build and maintain the infrastructure that powers voice and collaboration services for thousands of businesses.

What You'll Do

Own the reliability, performance, and availability of the NetSapiens UCaaS platform across production environments

Design, build, and maintain Infrastructure-as-Code (IaC) using Terraform and configuration management workflows using Ansible

Perform Tier 1 and Tier 2 troubleshooting of VoIP platform issues including call quality, signaling failures, media path problems, and platform outages

Diagnose and resolve issues across the VoIP stack, including the NetSapiens application and media stack, SIP trunking, and RTP/SRTP media

Participate in a 24x7 on-call rotation, responding to and resolving critical production incidents with a sense of urgency and clear communication

Collaborate with engineering and NOC teams on incident response, root cause analysis, and post-mortem documentation

Manage and optimize OCI resources including compute, networking, storage, and security services

Develop and maintain runbooks, monitoring dashboards, and alerting policies to reduce mean time to detection (MTTD) and mean time to resolution (MTTR)

Support patching, upgrades, and lifecycle management of platform components using automation frameworks

Contribute to SRE initiatives including SLO/SLA definition, error budget tracking, and capacity planning

Must-Have Qualifications

3+ years of experience in a Site Reliability Engineering, DevOps, or Systems Engineering role

Hands-on experience with the NetSapiens platform or Asterisk PBX (configuration, administration, and troubleshooting)

Proficiency with Terraform for cloud infrastructure provisioning and state management

Strong working knowledge of Ansible for configuration management and automation

Practical experience with Oracle Cloud Infrastructure (OCI), including compute, VCN, load balancers, IAM, and security services

Proven Tier 1 / Tier 2 VoIP troubleshooting skills, including experience with Asterisk PBX, SIP, RTP, and call flow diagnostics

Comfortable with VoIP packet analysis using tools such as sngrep, tcpdump, and Wireshark

Willingness and ability to participate in a 24x7 on-call rotation, including nights and weekends

Strong Linux systems administration skills (RHEL/CentOS/Ubuntu)

Solid scripting ability in Bash and/or Python

Good-to-Have Qualifications

Experience with Ansible AWX for centralized automation and job scheduling

Experience with env0 for environment management and Terraform workflow orchestration

Familiarity with SOC 2 Type 2 and/or HIPAA compliance requirements in a SaaS environment

Exposure to AIOps, observability platforms, or SIEM tooling (e.g., Stellar Cyber)

Experience with CI/CD pipelines and GitOps practices

Knowledge of Border0 or similar zero-trust privileged access management tools

Site Reliability Engineer – VoIP & Cloud Infrastructure
