REMOTE AI Support Operations Engineer

Remote | $150k – $200k | Posted 9 weeks ago

Role Overview

CyberCoders is hiring a mid-level REMOTE AI Support Operations Engineer. This is a full-time, fully remote role. The posted range is $150k to $200k. Full responsibilities, required qualifications, and the apply link are listed in the description below.

Resume Keywords to Include

Make sure these keywords appear in your resume to improve ATS scoring:

Python, Kubernetes, Terraform, Ansible, Linux

Job Description

Title: AI Support Operations Engineer

Location: Fully REMOTE!

Salary: $150-200k/year + BONUS + RSUs

We're not following someone else's cloud blueprint - we're creating the next one. While legacy providers hand you a finished process, we're engineering the next generation of AI-optimized data center infrastructure from the ground up.

As our first internal Staff AI Support Operations Engineer, you'll be a foundational technical leader on a brand-new Ops team. This is a role for an architect-practitioner: the kind of engineer who can untangle a complex InfiniBand issue one hour and automate away the root cause the next. You won't just maintain systems - you'll build the operational standards and technical foundations that every future engineer will rely on.

Key Responsibilities

Cluster Engineering & Operations: Collaborate with engineering teams to architect, deploy, and bring new AI compute clusters online while delivering expert-level support for existing high-density GPU environments
Infrastructure Source of Truth: Own NetBox and related internal systems, ensuring all infrastructure data is accurate, consistent, and reliably maintained
Automation & Tooling: Build and refine internal automation using Python, Ansible, and Terraform to eliminate manual workflows and modernize fragile legacy processes
Tier 3 Escalation Lead: Serve as the highest technical escalation point for customer and internal issues prior to involvement from Platform or Network/Undercloud teams
Documentation Excellence: Transform tribal knowledge into clear, durable SOPs and technical documentation that establish the operational "gold standard"
Technical Leadership & Mentorship: Raise the technical bar for the team through code reviews, architectural guidance, and mentorship as the organization scales

Qualifications

Enterprise-Grade Server Proficiency: Advanced operational knowledge of HPE, Dell, and SuperMicro platforms, including IPMI, BMC, iDRAC workflows, and familiarity with Redfish-based management.
Core Engineering Toolkit: Mastery of Python, Ansible, and Terraform as primary tools for automation, orchestration, and infrastructure lifecycle management.
Linux Performance Engineering: Strong capability in diagnosing and tuning Linux systems, resolving performance bottlenecks, and optimizing workloads at the OS level.
Advanced Incident Resolution: Demonstrated experience serving as the final technical escalation point for complex, high-impact infrastructure failures.
Cloud-Native Operations: Proven production experience operating and troubleshooting Kubernetes environments.

Nice to have

Next-Generation GPU Hardware: Familiarity with NVIDIA Blackwell (B200/B300) or Hopper (H100/H200) architectures.
High-Performance Fabrics: Experience with InfiniBand or RoCE networking, and modern high-throughput storage platforms such as Weka or VAST Data.
Bare-Metal Provisioning: Exposure to OpenStack or Canonical MAAS for automated provisioning of physical infrastructure.

Legacy is predictable. Safe. Slow. We're none of those things. We're building the Neo-Cloud at AI speed, and the rules aren't handed to you - you define them. If you're ready to trade routine for impact and build systems that actually move the company forward, let's talk.

Frequently Asked Questions

How do I apply for the REMOTE AI Support Operations Engineer position at CyberCoders?

Use the Apply button above to submit your application directly to CyberCoders. Most applications take less than 5 minutes if your resume and contact details are ready, and you'll be routed to the employer's official application system to finish.

Is the REMOTE AI Support Operations Engineer role at CyberCoders remote?

Yes. This is a fully remote role, so the position does not require relocating.

How much does the REMOTE AI Support Operations Engineer role at CyberCoders pay?

CyberCoders has posted a compensation range of $150k to $200k for this position. Final offers typically vary based on candidate experience, location, and internal salary bands.

When was the REMOTE AI Support Operations Engineer role at CyberCoders posted?

This role was posted on April 4, 2026 (65 days ago). It's still listed as actively hiring; we re-confirm openings against the source system multiple times per day and remove closed roles.
