Senior ML Infrastructure Engineer
Ellison Institute of Technology
Role Overview
Ellison Institute of Technology is hiring a Senior ML Infrastructure Engineer. This is a full-time role in Halifax Regional Municipality, Nova Scotia, and is part of Ellison Institute of Technology's Lifecycle hiring. Full responsibilities, required qualifications, and the apply link are listed in the description below.
Job Description
At the Ellison Institute of Technology (EIT), we’re on a mission to translate scientific discovery into real-world impact. We bring together visionary scientists, technologists, policy makers, and entrepreneurs to tackle humanity’s greatest challenges in four transformative areas:
- Health, Medical Science & Generative Biology
- Food Security & Sustainable Agriculture
- Climate Change & Managing CO₂
- Artificial Intelligence & Robotics
This is ambitious work - work that demands curiosity, courage, and a relentless drive to make a difference. At EIT, you’ll join a community built on excellence, innovation, tenacity, trust, and collaboration, where bold ideas become real-world breakthroughs. Together, we push boundaries, embrace complexity, and create solutions to scale ideas from lab to society. Explore more at www.eit.org
Our MLOps team
Join our MLOps team to build the cloud and compute foundation that enables scientific breakthroughs. Deliver reliable, secure platforms and self-service guardrails that accelerate experimentation and turn ideas into results faster, at scale, and with confidence.
Day-to-day, you might:
- Build, operate, and continuously optimise our high-performance GPU training and inference clusters, focusing on robust, high-availability scheduling, isolation, and automated lifecycle management.
- Drive systems design and implementation for high-throughput data paths, optimising I/O, caching, and data locality across compute and storage (including our current Lustre implementation).
- Proactively benchmark, profile, and resolve performance bottlenecks across the compute, network, and orchestration layers to maximise efficiency for distributed training and inference.
- Establish comprehensive observability, resilience, and automated security controls to ensure compliance and robust operation of sensitive research environments.
- Partner with Research, Data, and Applied teams to forecast capacity and cost for GPU and storage needs, setting quotas and streamlining ML experimentation pipelines.
What makes you a great fit:
- Proven experience leading the design, build, and operation of high-performance ML compute clusters at scale
- A proactive, autonomous approach to systems design and the proven ability and desire to ideate, co-create, and implement optimal solutions
- Exposure to migrating or transforming ML infrastructure from traditional schedulers to modern, containerised systems
- Expertise with high-throughput storage systems for ML/HPC workloads
- Expert-level understanding of GPU architecture, high-speed networking for distributed training, and performance profiling to resolve bottlenecks
- A solid grasp of IaC and CI/CD practices (e.g., Terraform, Argo CD)
We offer the following salary and benefits:
- Enhanced holiday pay
- Pension
- Life Assurance
- Income Protection
- Private Medical Insurance
- Hospital Cash Plan
- Therapy Services
- Perk Box
- Electric Car Scheme
Why work for EIT:
At the Ellison Institute, we believe a collaborative, inclusive team is key to our success. We are building a supportive environment where creative risks are encouraged and everyone feels heard. Valuing emotional intelligence, empathy, respect, and resilience, we encourage people to be curious and to have a shared commitment to excellence. Join us and make an impact.
Frequently Asked Questions
How do I apply for the Senior ML Infrastructure Engineer position at Ellison Institute of Technology?
Use the Apply button above to submit your application directly to Ellison Institute of Technology. Most applications take less than 5 minutes if your resume and contact details are ready, and you'll be routed to the employer's official application system to finish.
Where is the Senior ML Infrastructure Engineer position at Ellison Institute of Technology located?
This position is based in Halifax Regional Municipality, Nova Scotia. Ellison Institute of Technology has not indicated remote or hybrid options for this role, so candidates should plan for on-site work.
What does a Senior ML Infrastructure Engineer at Ellison Institute of Technology earn?
Ellison Institute of Technology has not disclosed a salary range in this posting. Many employers share specifics later in the interview process; you can also ask during a recruiter screen if compensation transparency is important to you.
When was the Senior ML Infrastructure Engineer role at Ellison Institute of Technology posted?
This role was posted on March 22, 2026 (79 days ago). It's still listed as actively hiring; we re-confirm openings against the source system multiple times per day and remove closed roles.
How much experience does the Senior ML Infrastructure Engineer role at Ellison Institute of Technology require?
This is a senior-level position. Most senior roles call for 5+ years of directly relevant experience. Ellison Institute of Technology lists its specific requirements in the description above, so review the must-have qualifications closely before applying.