Job Description
Clinical Real World Data Scientist
Join the team protecting half a billion lives every year with next-gen science, mRNA innovation, and AI-driven breakthroughs. In Vaccines, you'll help advance prevention on a global scale – and shape the future of immunization.
The Data Assessment Center of Excellence (CoE) is a specialized team within Sanofi's Digital RWD & HI function, operating at the intersection of epidemiology, RWD, data products and insights/evidence generation. The vision of the CoE is to ensure all Sanofians have the right data, used the right way, for real patient impact.
The CoE is responsible for:
- Establishing and maintaining enterprise-wide standards for RWD assessment and fitness-for-purpose data evaluation through the Data Insights & Viability Engine (DIVE).
- Supporting cross-functional teams, including R&D, Business Units (Vaccines, General Medicine and Specialty Care) and Digital, with expert guidance on data source selection and suitable data usage to minimize bias and confounding.
- Building and disseminating best practices, methodological frameworks, and training resources across Sanofi's RWD ecosystem
- Driving innovation in RWD methodologies, including the integration of AI tools such as Cursor, LLMs, and CortexAI (as available) to accelerate delivery of data selection guidance, internal study design, and/or digital product development
The Clinical RWD Scientist is a critical role within the Data Assessment Center of Excellence (CoE), embedded in Sanofi's Digital RWD & HI function. This role bridges the gap between theoretical concepts and practical, reliable RWD solutions. You are an agile professional interested in challenging the status quo, with deep subject matter expertise in US RWD and pharmaco-epidemiological methods, and a quick learner of new data, methodology, and technology. You are a proactive team member who values cross-learning, sees challenges as opportunities, and can work with assumptions.
We're an R&D-driven, AI-powered biopharma company committed to improving people's lives and delivering compelling growth. Our deep understanding of the immune system – and innovative pipeline – enables us to invent medicines and vaccines that treat and protect millions of people around the world. Together, we chase the miracles of science to improve people's lives.
Main Responsibilities
- Design and execute rigorous data assessment frameworks that evaluate the fitness-for-purpose of real-world data (RWD) sources for insights or evidence generation across the enterprise, and support the development of reliable RWD Foundation and Products
- Lead and execute feasibility assessments for RWD sources (electronic health records, administrative claims, patient registries, wearable/digital health data) to determine suitability for specific research/business objectives
- Develop and apply structured data assessment frameworks to evaluate data quality dimensions, including accuracy, completeness, validity, timeliness, longitudinal consistency, and integrity
- Assess the availability and representativeness of patient populations within RWD sources available in Sanofi for both internal decision-making and regulatory-grade evidence generation
- Evaluate the feasibility of extracting structured and unstructured data elements (e.g., clinical scores, patient-reported outcomes) from EHR systems, including NLP-based extraction from clinical notes
- Document assessment outcomes in standardized feasibility reports and communicate findings clearly to cross-functional stakeholders
- Identify and articulate limitations of RWD sources, such as proxy endpoint constraints and population coverage gaps
- Design methodologically sound recommendations and minimize misuse of RWD that would lead to unreliable insights or evidence generation
- Ensure appropriate use of ICD codes, procedure codes, and other medical coding standards (sourced from peer-reviewed references such as PubMed, Embase, and Orphanet) for patient identification, healthcare provider segmentation, clinical site identification, and phenotyping
- Apply advanced epidemiological and biostatistical methods including propensity score methods, time-to-event analyses, sensitivity analyses, and bias assessment
- Provide methodological input on the use of clinical score proxies and surrogate endpoints in RWD contexts, clearly delineating their applicability for internal versus regulatory/publication use
- Provide methodological advice to ensure deliverables from RWD Foundation, RWD Science, and RWD Products are based on medical evidence/guidelines and are clinically and contextually relevant
- Work closely with analysts & data scientists to ensure methodological recommendations are realistic and implementable
- Partner with R&D, Business units (Vaccines, General Medicine and Specialty Care) & Digital teams on data identification and appropriate usage of RWD for insights / evidence generation across drug lifecycle
- Serve as the methodological point of contact for fit-for-purpose data assessment inquiries from internal stakeholders
- Collaborate with RWD Foundation, RWD Product Owners, RWD Data Sciences to ensure RWD are used appropriately to inform reliable decision making & to provide knowledge transfer on data domain expertise
- Manage external data vendors and technology partners (e.g., EHR, claims, registries) to understand data limitations and to verify methodological recommendations when required
NOTE: This role does not conduct real-world evidence studies.
About YOU
- Advanced degree (Master's or PhD) in Epidemiology, Biostatistics, Health Informatics, Health Economics, Pharmacoepidemiology, or a closely related quantitative discipline
- Relevant experience (minimum 4-5 years for Master's degree holders or 2-4 years for Doctoral degree holders) in real-world data, commercial analytics, real-world evidence, health outcomes research, fit-for-purpose feasibility assessment, data quality assessment or a related field within the pharmaceutical, biotech, or health technology industry
- Experience in predictive modeling using RWD to identify at-risk patient populations, with a publication record in peer-reviewed journals
- Experience in patient & healthcare provider segmentation to inform Medical and Commercial strategy
- Demonstrated expertise in epidemiological study design and statistical methods such as propensity score matching, descriptive statistics, regression analysis, and predictive modeling
- Strong proficiency in statistical programming languages: SQL, Python, R, and/or SAS
- Solid working knowledge of Snowflake for database querying and data extraction
- Familiarity with medical coding systems (ICD-10, CPT, SNOMED CT, LOINC, RxNorm) and experience with or knowledge of the OHDSI OMOP CDM standardized data model for healthcare data
- Understanding of US EHR, claims, disease registry, and public health surveillance data, as well as the US healthcare billing system
- Experience with AI coding tools such as Cursor, GitHub Copilot, Claude, or other LLMs
- Knowledge of automation tools such as Power Automate and Power Apps (an asset, not required)
- Requires a high level of interactive communication with diverse stakeholders
- Can work with assumptions & in a fast-paced environment
- Proven teamwork and collaboration skills
Why Choose Us?
- Bring the miracles of science to life alongside a supportive, future-focused team.
- Discover endless opportunities to grow your talent and drive your career, whether it's through a promotion or lateral move, at home or internationally.
- Enjoy a thoughtful, well-crafted rewards package that recognizes your contribution and amplifies your impact.
- Take good care of yourself and your family, with a wide range of health and wellbeing benefits including high-quality healthcare, prevention and wellness programs and at least 14 weeks' gender-neutral parental leave.
Tagged as: Life Sciences