Job Description
Reddit is continuing to grow our teams with the best talent. This role is fully remote-friendly within the United States. If you live close to one of our physical office locations (San Francisco, Los Angeles, New York City, and Chicago), our doors are open for you to come into the office as often as you'd like.
The AI Engineering team at Reddit is embarking on a strategic initiative to build our own Reddit-native foundational Large Language Models (LLMs). This team sits at the intersection of applied research and massive-scale infrastructure, tasked with training models that truly understand the unique culture, language, and structure of Reddit communities. You will be joining a team of distinguished engineers and safety experts to build the "engine room" of Reddit's AI future—creating the foundational models that will power Safety & Moderation, Search, Ads, and the next generation of user products.
As a Staff Research Engineer for Pre-training Science, you will serve as the technical lead for defining the Continual Pre-Training (CPT) strategies that transform generic foundation models into Reddit-native experts. You will bridge the gap between "General Intelligence" and "Community Context," designing scientific frameworks that inject Reddit’s unique knowledge (conversational trees, slang, multimodal memes) into base models without causing catastrophic forgetting. You will define the "learning recipe"—the precise mix of data, hyperparameters, and architectural adaptations needed to build a model that speaks the language of the internet.
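The "learning recipe" above centers on a data-mixture ratio between domain and general corpora. As a minimal illustration (not Reddit's actual pipeline; the function and ratio are hypothetical), a batch sampler that mixes the two sources at a fixed ratio might look like:

```python
import random

def sample_batch(reddit_docs, general_docs, reddit_ratio, batch_size, rng=None):
    """Draw a training batch mixing domain data and general data at a fixed ratio.

    reddit_ratio is the fraction of the batch drawn from the domain corpus;
    the remainder comes from the general corpus to guard against
    catastrophic forgetting of general capabilities.
    """
    rng = rng or random.Random(0)
    n_reddit = round(batch_size * reddit_ratio)
    batch = [rng.choice(reddit_docs) for _ in range(n_reddit)]
    batch += [rng.choice(general_docs) for _ in range(batch_size - n_reddit)]
    rng.shuffle(batch)  # interleave so no ordering bias within the batch
    return batch
```

In practice the ratio itself is a tuned hyperparameter of the curriculum, often swept or scheduled over the course of continual pre-training.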
Responsibilities:
- Architect and validate rigorous Continual Pre-Training (CPT) frameworks, focusing on domain adaptation techniques that effectively transfer Reddit’s knowledge into licensed frontier models.
- Design the "Science of Multimodality": Lead research into fusing vision and language encoders to process Reddit’s rich media (images, video) alongside conversational text threads.
- Formulate data curriculum strategies: scientifically determining the optimal ratio of "Reddit data" vs. "General data" to maximize community understanding while maintaining safety and reasoning capabilities.
- Conduct deep-dive research into Scaling Laws for Graph-based data: investigating how Reddit’s tree-structured conversations impact model convergence compared to flat text.
- Design and scale continuous evaluation pipelines (the "Reddit Gym") that monitor model reasoning and safety capabilities in real-time, enabling dynamic adjustments to training recipes.
- Drive high-stakes architectural decisions regarding compute allocation, distributed training strategies (3D parallelism), and checkpointing mechanisms on AWS Trainium/Nova clusters.
- Serve as a force multiplier for the engineering team by setting coding standards, conducting high-level design reviews, and mentoring senior engineers on distributed systems and ML fundamentals.
Required Qualifications:
- 7+ years of experience in Machine Learning engineering or research, with a specific focus on LLM Pre-training, Domain Adaptation, or Transfer Learning.
- Expert-level proficiency in Py