
Lead Systems Operations Engineer
Wells Fargo
Job Description
About This Role
Wells Fargo is seeking a Lead Systems Operations Engineer for OpenShift, responsible for cluster lifecycle, upgrades, incident leadership, and policy-driven governance across multi-cluster estates.
In This Role, You Will
- Lead complex, broad-impact initiatives, including providing high-level systems consultation to technology teams
- Serve as a key participant in large-scale planning of computer systems and network infrastructure for the Systems Operations functional area
- Review and analyze complex technical challenges, as well as escalated support issues related to core business solutions that require in-depth evaluation of multiple factors, such as alternatives, enhancements, periodic systems reviews, or improvements to existing systems
- Make decisions on technical changes and enhancements
- Consult with the engineering team on change design that requires a solid understanding of the technical process controls or standards that influence and drive new initiatives
- Collaborate and consult with technical peers, colleagues, and mid-level to more experienced managers to resolve systems support issues and achieve goals
- Operate Red Hat OpenShift clusters (install/upgrade, day-2 ops, cluster operators, MachineSets).
- Manage workloads, routes/ingress, registries, quotas, limits, and image lifecycles.
- Implement networking (OVN-K, ingress controllers, load balancers) and storage (PV/PVC via CSI).
- Harden clusters with RBAC, SCC, security patches, and certificate rotation.
- Build logging/monitoring with Prometheus/Grafana; alerting and runbooks for SRE operations.
- Automate with GitOps/Argo CD, Helm, and pipelines; scripting in Bash/Python/Ansible.
- Lead production change windows and cluster upgrades with minimal downtime.
- Define platform SLOs and capacity/lifecycle roadmaps; coach engineers on advanced troubleshooting.
- Partner with security/networking teams on policy, segmentation, and audit readiness.
Required Qualifications
- 5+ years of Systems Engineering or Technology Architecture experience, or equivalent demonstrated through one or a combination of the following: work experience, training, military experience, education
Desired Qualifications
- Experience running multi-cluster/hybrid OCP and GitOps at scale.
- RHCE/EX280/CKA certification (a plus).
- 10+ years infra/container operations; 3+ years leading OpenShift/GKE/AKS production environments.
- Kubernetes fundamentals with oc/kubectl; CRI-O/container internals.
- OpenShift 4.x (operators, routes, ImageStreams, OAuth/IdP, ClusterRoles/Bindings).
- Networking (CNI, NetworkPolicy), storage (RWX/RWO, snapshots), registry (Quay).
- Troubleshooting etcd/worker/ingress issues; upgrade planning/execution.
- Security and compliance: image security, signing, scanning, secrets mgmt.
- Incident command; capacity/DR; stakeholder communications.
Job Expectations
- Participate in 24x7 on-call rotation and support change windows (weekends/holidays as needed).
- Maintain runbooks/SOPs and create dashboards/alerts with actionable SLOs.
- Automate repetitive tasks to reduce toil; contribute reusable modules/pipelines.
- Communicate status/risks clearly to stakeholders; follow incident/problem/change processes.
- Mentor and coach engineers; lead upgrade/change programs.
Reference Number
R-529880