DevOps with Python

Tekgence Inc
Posted April 6, 2026

Job Description

DevOps with Python experience: Data Center Automation Engineer

Toronto, Canada (3 days onsite)

12+ months contract

Summary

This role sits at the intersection of hardware validation, infrastructure automation, and large-scale system testing. You will work closely with hardware, networking, infrastructure, and software engineering teams to develop tools and processes that validate system reliability, accelerate deployments, and ensure operational excellence across our compute clusters.

What We're Looking For

● BS/MS in Computer Science, Electrical Engineering, Computer Engineering, or a related technical field.

● 2+ years of experience in infrastructure testing, systems engineering, data center operations, or hardware validation.

● Strong programming skills in Python, Go, or a similar language.

● Experience building automation frameworks for testing large-scale systems.

● Familiarity with Linux systems and troubleshooting infrastructure environments.

● Experience with networking fundamentals (TCP/IP, routing, switching).

● Strong debugging and problem-solving skills.

Nice to Have

● Experience testing AI/ML infrastructure or accelerator hardware.

● Experience with cluster orchestration systems such as Kubernetes or Slurm.

● Familiarity with CI/CD pipelines for infrastructure testing.

● Experience with monitoring tools such as Prometheus and Grafana.

● Experience with data center networking technologies.

What You’ll Do

● Develop and maintain automation frameworks for testing and validating data center infrastructure.

● Design automated tests for AI accelerator systems, compute clusters, networking, and storage infrastructure.

● Build tools that validate system performance, reliability, and scalability before production deployment.

● Automate deployment validation and infrastructure health checks across clusters.

● Work with hardware and infrastructure teams to troubleshoot system issues and identify root causes.

● Develop scripts and automation for data center bring-up, system validation, and diagnostics.

● Collaborate with engineering teams to improve test coverage and operational workflows.

● Create documentation, test procedures, and operational playbooks for infrastructure validation.

LinkedIn: linkedin.com/in/nitesh-ch-a378b5222

Direct: 469-421-5604, Ext. 218

  • nitesh.j@tekgence.com

6655 Deseo Dr, Suite 104, Irving, TX 75039

  • www.tekgence.com
