Title: Staff Embedded ML Engineer, Edge AI
Location: Boston, MA, United States
Job Description
About SimpliSafe
SimpliSafe is a leading innovator in the home security industry, dedicated to making every home a safe home. With a mission to provide accessible and comprehensive security solutions, we design and build user-centric products that empower individuals and families to protect what matters most.
We believe in a collaborative and agile environment where learning and growth are continuous. Our teams are composed of talented individuals who are passionate about technology, security, and delivering exceptional customer experiences.
We're embracing a hybrid work model that enables our teams to split their time between office and home. Hybrid for us means we expect our teams to come together in our state-of-the-art office on two core days each week, typically chosen from Tuesday, Wednesday, and Thursday - working together in person and choosing where they work for the remainder of the week. We all benefit from flexibility and get to use the best of both worlds to get our work done.
Why are we hiring?
Well, we're growing and thriving. So, we need smart, talented, and humble people who share our values to join us as we disrupt the home security space and relentlessly pursue our mission of keeping Every Home Secure.
About the Role
We are seeking a highly motivated and experienced Embedded Machine Learning Engineer to join our growing Edge AI team. As a key contributor, you will lead the on-device inference and performance optimization of ML models powering outdoor monitoring in the home security space. This role is less about inventing new CV architectures and more about making models fast, power-efficient, stable, and shippable on real embedded hardware (outdoor cameras and doorbells). You will operate across the stack (from model runtime integration down to kernel/operator optimization, memory movement, scheduling, and accelerator utilization) to deliver reliable real-time behavior under tight compute, memory, bandwidth, and thermal constraints across device tiers.
Responsibilities
Own the embedded deployment and performance of on-device ML inference for outdoor monitoring workloads (real-time video/event pipelines).
Optimize end-to-end inference performance across CPU/DSP/NPU/GPU (as applicable): latency, throughput (FPS), memory footprint, power, thermals, startup time, and stability.
Perform kernel/operator-level optimization:
vectorization (e.g., SIMD/NEON), tiling, cache-friendly memory layouts
reducing bandwidth and memory copies, optimizing post-processing
fusing ops, minimizing synchronization/overhead, thread scheduling
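To give a flavor of the tiling and cache-friendly-layout work these bullets describe, here is a minimal sketch: a cache-blocked row-major matmul whose inner loops walk memory contiguously so the prefetcher and autovectorizer can do their job. Sizes and the tile dimension are illustrative, not tuned for any particular SoC.

```cpp
#include <algorithm>
#include <cstddef>
#include <vector>

// Cache-blocked (tiled) float32 matmul: C += A * B, all row-major, n x n.
// Tiling keeps a TILE x TILE working set hot in cache, cutting memory
// bandwidth; the i-k-j inner order makes B and C accesses contiguous.
// TILE = 32 is a placeholder, not a tuned value for any target.
constexpr std::size_t TILE = 32;

void matmul_tiled(const std::vector<float>& A,
                  const std::vector<float>& B,
                  std::vector<float>& C,
                  std::size_t n) {
    for (std::size_t ii = 0; ii < n; ii += TILE)
        for (std::size_t kk = 0; kk < n; kk += TILE)
            for (std::size_t jj = 0; jj < n; jj += TILE)
                for (std::size_t i = ii; i < std::min(ii + TILE, n); ++i)
                    for (std::size_t k = kk; k < std::min(kk + TILE, n); ++k) {
                        const float a = A[i * n + k];  // scalar reused across j
                        for (std::size_t j = jj; j < std::min(jj + TILE, n); ++j)
                            C[i * n + j] += a * B[k * n + j];
                    }
}
```

On real targets this is where NEON/SIMD intrinsics and layout choices (NHWC vs. NCHW, padding to vector width) come in; the blocking structure above is the common starting point.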
Integrate and maintain ML models within embedded pipelines:
model import/export validation, operator compatibility, graph transforms
runtime integration in C/C++ (including pre/post-processing)
robust error handling, watchdogs, and safe fallback behavior
Drive quantization and deployment readiness from an embedded perspective:
validate INT8/FP16 paths, calibration flows, numerical accuracy checks
debug quantization edge cases and operator mismatches on target runtimes
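The numerical-accuracy checks mentioned above can be sketched with symmetric per-tensor INT8 quantization and a round-trip error bound. This is a simplified illustration: real calibration flows derive the scale from a representative dataset, whereas here it comes from the tensor's own max-abs value.

```cpp
#include <algorithm>
#include <cmath>
#include <cstdint>
#include <vector>

// Symmetric per-tensor INT8 quantization: real = int8 * scale.
struct QParams { float scale; };

// Toy "calibration": scale from max-abs (real flows use a calibration set).
QParams calibrate(const std::vector<float>& x) {
    float max_abs = 0.f;
    for (float v : x) max_abs = std::max(max_abs, std::fabs(v));
    return { max_abs > 0.f ? max_abs / 127.f : 1.f };
}

std::int8_t quantize(float v, QParams q) {
    float r = std::round(v / q.scale);
    return static_cast<std::int8_t>(std::clamp(r, -127.f, 127.f));
}

float dequantize(std::int8_t v, QParams q) { return v * q.scale; }

// Accuracy check: max round-trip error should stay within half a
// quantization step; larger errors point at a broken calibration or
// operator mismatch on the target runtime.
float max_roundtrip_error(const std::vector<float>& x) {
    const QParams q = calibrate(x);
    float err = 0.f;
    for (float v : x)
        err = std::max(err, std::fabs(v - dequantize(quantize(v, q), q)));
    return err;
}
```

Per-channel scales, zero-points for asymmetric schemes, and FP16 paths follow the same pattern; the invariant being validated (bounded round-trip error) is the same.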
Build tooling for profiling, benchmarking, and regression tracking on devices:
per-layer timing, memory tracking, thermal/perf tests, CI gating
automated performance regression gating across device tiers and firmware versions
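A minimal sketch of the on-device benchmarking piece: warm up, sample many iterations, and gate on median and tail latency rather than the mean, since tail behavior is what real-time pipelines care about. The workload here is a placeholder for an inference call.

```cpp
#include <algorithm>
#include <chrono>
#include <functional>
#include <vector>

struct LatencyStats { double p50_us; double p99_us; };

// Measure per-call latency in microseconds. Warmup iterations stabilize
// caches and (on real hardware) clocks before sampling; a CI perf gate
// would compare p50/p99 against a stored per-device baseline.
LatencyStats measure(const std::function<void()>& workload,
                     int warmup = 10, int iters = 200) {
    for (int i = 0; i < warmup; ++i) workload();
    std::vector<double> samples;
    samples.reserve(iters);
    for (int i = 0; i < iters; ++i) {
        const auto t0 = std::chrono::steady_clock::now();
        workload();
        const auto t1 = std::chrono::steady_clock::now();
        samples.push_back(
            std::chrono::duration<double, std::micro>(t1 - t0).count());
    }
    std::sort(samples.begin(), samples.end());
    return { samples[iters / 2], samples[(iters * 99) / 100] };
}
```

In practice this harness would also record memory high-water marks and thermal state, and emit results in a format the regression-gating CI can diff across device tiers and firmware versions.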
Partner closely with ML engineers to translate model changes into deployment impact; provide constraints and design guidance that improve deployability and performance.
Provide Staff-level leadership: set performance standards, lead technical reviews, mentor engineers, and influence platform roadmap for on-device ML.
Qualifications
8+ years of experience in embedded systems and/or performance engineering, with experience shipping production software on constrained devices.
Strong C/C++ expertise with deep knowledge of low-level performance topics: CPU architecture, memory hierarchy, concurrency, and real-time considerations.
Demonstrated experience optimizing ML inference on embedded targets, including operator/kernel tuning and end-to-end pipeline optimization.
Familiarity with modern vision model families (transformer-based detectors such as DEIM/DFINE/RT-DETR series and CNN-based detectors such as YOLO family or similar) sufficient to optimize their execution characteristics (tensor shapes, attention/conv patterns, post-processing).
Experience with on-device inference runtimes and deployment workflows (e.g., TFLite, ONNX Runtime, TensorRT or vendor runtimes), including operator support constraints and graph-level transformations.
Strong debugging and profiling skills (perf, flame graphs, hardware counters, tracing) and ability to drive performance investigations to closure.
Ability to lead cross-functionally across ML, firmware, and hardware teams; comfortable defining benchmarks/KPIs and making tradeoffs.
Bonus Points:
Experience with embedded accelerators and vendor toolchains.