HPC Compute Platform Engineer

Dallas, Texas, US

Posted 8 weeks ago

Role Overview

GTN Technical Staffing is hiring a mid-level HPC Compute Platform Engineer for a full-time hybrid role based in Dallas, as part of its Lifecycle hiring. Full responsibilities, required qualifications, and the apply link are listed in the description below.

Resume Keywords to Include

Make sure these keywords appear in your resume to improve ATS scoring

Kubernetes, Terraform, Ansible, Linux, CI/CD, Apollo, OR, Benefits

Job Description

HPC Compute Platform Engineer

Location: Dallas, TX (Hybrid)

Type: Direct Hire

Competitive base salary + performance bonus
100% company-paid benefits

Overview

We are seeking a Compute Platform Engineer to support the reliability, performance, and operational health of large-scale, high-performance compute infrastructure that runs critical research and production workloads.

This role is responsible for maintaining and troubleshooting CPU- and GPU-based compute platforms, ensuring consistent performance at scale, and driving operational excellence across the environment. The position works closely with platform engineering, infrastructure, and operations teams, as well as hardware vendors, to support a stable and highly available compute ecosystem.

The ideal candidate brings strong hands-on experience with HPC or AI infrastructure, deep knowledge of server hardware, and a proactive approach to troubleshooting, automation, and continuous improvement.

Key Responsibilities

Compute Infrastructure Engineering

Design, configure, and manage high-performance compute infrastructure composed of CPU and GPU nodes
Support large-scale HPC and AI platforms, ensuring systems are stable, performant, and production-ready
Perform diagnostics, tuning, and capacity planning to support efficient scale-out of compute environments

Hardware Reliability & Lifecycle Management

Manage the full firmware and BIOS lifecycle across compute infrastructure, including baselines, validation, rollout, and compliance
Troubleshoot complex hardware issues across CPU, GPU, DPU, NVSwitch, NICs, memory, PSU, and BMC components
Drive root cause analysis and implement solutions to improve system reliability and reduce recovery time
Analyze hardware lifecycle processes and recommend improvements for optimization and efficiency

Automation & Platform Operations

Automate health checks, onboarding workflows, and operational processes to improve deployment efficiency
Leverage Infrastructure-as-Code (IaC) methodologies to enable scalable and repeatable infrastructure management
Recommend and implement tooling and process improvements to enhance platform operations

Vendor & Cross-Functional Collaboration

Collaborate with hardware vendors to resolve firmware and system issues, providing detailed diagnostics, logs, and impact analysis
Work closely with infrastructure, platform, and operations teams to align on system performance and reliability goals
Support integration of hardware improvements across the broader environment

Monitoring, Performance & Security

Monitor hardware performance and identify opportunities for optimization
Implement best practices for platform security and system hardening
Ensure adherence to operational standards and data center processes

Technical Leadership

Act as a subject matter expert for compute infrastructure and hardware-related issues
Mentor junior engineers and contribute to a culture of continuous improvement and technical excellence

Required Experience

3+ years of hands-on experience supporting large-scale compute platforms, HPC, or AI infrastructure
Strong experience with HPE server platforms such as ProLiant and Apollo
Experience working with NVIDIA GPUs, including A100, H100/H200, or similar
Solid understanding of server architecture including UEFI/BIOS, PCIe devices, and out-of-band management systems (iLO, BMC)
Proven ability to troubleshoot complex hardware issues and coordinate with vendors for resolution
Experience with Linux in high-performance or latency-sensitive environments
Familiarity with core networking concepts including DNS, DHCP, VLANs, switching, and routing
Experience working within data center environments and operational processes

Technical Skills

Experience with automation tools such as Ansible, Terraform, and CI/CD pipelines
Exposure to Infrastructure-as-Code (IaC) practices
Working knowledge of Kubernetes and/or OpenStack (preferred)
Strong problem-solving and analytical skills with the ability to operate in complex environments

Preferred Experience

Experience supporting AI platforms or next-generation GPU architectures
Exposure to large-scale distributed compute environments
Experience working in mission-critical or high-availability infrastructure environments

Frequently Asked Questions

How do I apply for the HPC Compute Platform Engineer position at GTN Technical Staffing?

Use the Apply button above to submit your application directly to GTN Technical Staffing. Most applications take less than 5 minutes if your resume and contact details are ready, and you'll be routed to the employer's official application system to finish.

Is the HPC Compute Platform Engineer role at GTN Technical Staffing remote or in-office?

This is a hybrid role based in Dallas. Expect a mix of in-office and remote days, with the specific cadence set by the hiring manager.

What does an HPC Compute Platform Engineer at GTN Technical Staffing earn?

GTN Technical Staffing has not disclosed a salary range in this posting. Many employers share specifics later in the interview process; you can also ask during a recruiter screen if compensation transparency is important to you.

When was the HPC Compute Platform Engineer role at GTN Technical Staffing posted?

This role was posted on April 8, 2026 (61 days ago). It's still listed as actively hiring; we re-confirm openings against the source system multiple times per day and remove closed roles.
