Trends & Insights
Key themes and future directions from R/Pharma 2025
Executive Summary
R/Pharma 2025 marked a pivotal moment in pharmaceutical R adoption, with three dominant themes emerging: AI/LLM integration, open-source transformation, and regulatory acceptance. The conference showcased 19 workshops and 30+ presentations revealing an industry in rapid transition from proprietary tools to collaborative, standards-based approaches.
Key Statistics:
- 8+ sessions on AI/LLM (50% increase from previous years)
- GSK achieved 50%+ R code across Biostatistics
- FDA neutrality on statistical software increasingly normalized
- CDISC ARS/ARM driving automation and standardization
1. The AI/LLM Revolution
Emergence as Dominant Theme
AI and Large Language Models were the most discussed topic, appearing in:
- 3 dedicated workshops (ellmer basics, enterprise tooling, clinical data privacy)
- 8+ presentations across all sessions
- Multiple vendor tools ({ellmer}, {llumen}, {mcpr}, Databot, {meRlin})
From Prototype to Production
The conversation has matured from "Can we use AI?" to "How do we deploy it safely?"
Key Developments:
1. Enterprise-Ready Frameworks
Organizations are building production AI systems:
- Merck's {llumen} - Comprehensive agentic framework used internally
  - Supports multiple LLM sources (Azure, OpenAI, local)
  - RAG with vector databases for documents
  - Database querying (SQL, Cypher, GraphQL)
  - Foundation model integration (TxGemma)
- Roche's Multi-Agent Copilot - Package-specific AI agents
  - Each internal package gets its own expert agent
  - LangGraph orchestration
  - Integration with Cursor via MCP
- A2-AI's GxP Solutions - Validated AI workflows
  - AWS Bedrock integration
  - MCP server implementations
  - Compliance documentation
2. Privacy-Preserving Approaches
Critical for pharma with sensitive clinical data:
- On-premise LLM deployments (llama.cpp)
- Query sanitization removing PII
- Audit logging for compliance
- Role-based access controls
- {DataChat} example - Chat interface with data never leaving secure environment
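To make the query sanitization and audit logging items concrete, here is a minimal sketch in Python using only the standard library. It is not the approach of any tool presented at the conference: the regular expressions, the placeholder labels, and the function names `sanitize_query` and `audit_record` are all assumptions chosen for illustration, and a real deployment would need a reviewed, study-specific pattern list.

```python
import hashlib
import json
import re
from datetime import datetime, timezone

# Hypothetical patterns for illustration only; not an exhaustive PII list.
PII_PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    "DATE": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
    "SUBJECT_ID": re.compile(r"\b[A-Z0-9]+-\d{3}-\d{4}\b"),
}


def sanitize_query(text):
    """Replace each pattern match with a placeholder and count the redactions."""
    counts = {}
    for label, pattern in PII_PATTERNS.items():
        text, n = pattern.subn(f"[{label}]", text)
        if n:
            counts[label] = n
    return text, counts


def audit_record(user, raw_query, sanitized, counts):
    """Build a JSON audit entry that stores a hash of the raw query, not the query."""
    return json.dumps(
        {
            "timestamp": datetime.now(timezone.utc).isoformat(),
            "user": user,
            "raw_sha256": hashlib.sha256(raw_query.encode("utf-8")).hexdigest(),
            "sanitized_query": sanitized,
            "redactions": counts,
        },
        sort_keys=True,
    )


# Made-up query text; the subject ID and e-mail address are fictional.
raw = "Summarize AEs for STUDY1-001-0042 reported on 2025-03-14, cc jane.doe@example.com"
clean, redactions = sanitize_query(raw)
entry = audit_record("analyst01", raw, clean, redactions)
print(clean)
```

Only the sanitized text would be forwarded to the model; the audit entry keeps a hash so the original query can be matched later without being stored in the log.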
3. Practical Applications
Moving beyond novelty to genuine productivity:
Data Exploration:
- Natural language queries on CDISC data
- Conversational interfaces for non-programmers
- RAG-powered document search
Code Assistance:
- Context-aware programming help
- Debugging complex statistical code
- Documentation generation
Report Automation:
- LLM-powered table summarization
- Narrative generation from results
- QC log generation
Standards Emerging
Model Context Protocol (MCP):
- Standardized interface for AI tools
- R implementation via {mcpr}
- Cross-language interoperability
- Growing ecosystem (Claude Code, Cursor, VS Code)
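As a rough illustration of what a "standardized interface" means here: MCP messages are JSON-RPC 2.0, and a client invokes a server-side tool with a `tools/call` request. The sketch below builds such a request and answers it with a toy in-process dispatcher. The tool name `count_subjects`, its argument, and the counts it returns are hypothetical, and this is not how {mcpr} or any real MCP server is implemented.

```python
import json


def make_tool_call(request_id, tool_name, arguments):
    """Build a JSON-RPC 2.0 request in the shape MCP uses for tool calls."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }


def count_subjects(arm):
    """Hypothetical tool; the numbers are illustrative, not real study data."""
    demo = {"Placebo": 86, "Active": 84}
    return demo.get(arm, 0)


TOOLS = {"count_subjects": count_subjects}


def error(message, code, text):
    return {"jsonrpc": "2.0", "id": message.get("id"), "error": {"code": code, "message": text}}


def handle(message):
    """Toy dispatcher standing in for an MCP server."""
    if message.get("method") != "tools/call":
        return error(message, -32601, "Method not found")
    params = message["params"]
    tool = TOOLS.get(params["name"])
    if tool is None:
        return error(message, -32602, "Unknown tool: " + params["name"])
    value = tool(**params.get("arguments", {}))
    return {
        "jsonrpc": "2.0",
        "id": message["id"],
        "result": {"content": [{"type": "text", "text": str(value)}], "isError": False},
    }


request = make_tool_call(1, "count_subjects", {"arm": "Active"})
# Round-trip through JSON to mimic the message crossing a process boundary.
response = handle(json.loads(json.dumps(request)))
print(response["result"]["content"][0]["text"])
```

Because the message format is plain JSON, the client and the server can be written in different languages, which is the cross-language interoperability point above.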
Challenges Identified
Despite enthusiasm, several barriers remain:
- Validation complexity - How to validate non-deterministic outputs
- Cost management - API costs at scale
- Regulatory uncertainty - Limited guidance on AI in submissions
- Skills gap - Prompt engineering and AI orchestration expertise
2. Open-Source Transformation
GSK's Landmark Achievement
Sam Warden's presentation on GSK's journey was a conference highlight:
Timeline:
- 2018-2020: Pilot programs and early adopters
- 2020-2022: COVID acceleration and platform adoption
- 2023-2024: 50%+ R code target achieved
- 2025: Rburst initiative for full integration
Success Factors:
- Executive commitment - Clear target setting (50%+ R)
- Multi-wave approach - Gradual transformation
- Training investment - Bookdown courses, Resource Hub, AccelerateR
- Platform deployment - Posit Workbench for infrastructure
- Cultural change - Growth mindset and adaptability
Industry-Wide Movement
GSK is not alone - the tipping point has been reached:
Novartis:
- Mosaic platform for ARS-driven TFL automation
- Standards-based, language-agnostic approach
- React UI with R backend
Roche/Genentech:
- Leading {admiral}, {teal}, {gtsummary} development
- autoslideR for slide automation
- {crane} for pharma-specific reporting
Pfizer:
- Custom R packages for RWD programming
- Dual R/SAS syntax support
- Quarto documentation sites
Moderna:
- AI-enhanced Shiny apps for trial data
- ellmer/GPT integration
- Accelerating exploratory analysis
Pharmaverse Ecosystem
The collaborative open-source movement is thriving:
Key Packages:
- {admiral} - Transitioning to stable maintenance (feature complete)
- {teal} - Version 1.0 released (interactive apps)
- {gtsummary}/{crane} - Clinical table generation
- {cardinal} - Standardized TLG templates
- {sdtm.oak} - SDTM programming framework
Benefits Realized:
- Reduced duplication across companies
- Shared validation burden
- Faster innovation cycles
- Transparent, auditable code
3. Regulatory Evolution
FDA Software Neutrality
The 2015 FDA clarification on software neutrality is now fully operationalized:
- No preference for SAS vs R vs Python
- Focus on methodology and validation, not tools
- R-based submissions increasingly routine
New FDA Guidance Impact
2023 Covariate Adjustment Guidance driving change:
- Novartis developed {beeca} in response
- ASA-BIOP collaboration on {RobinCar2}
- Academic methods reaching practice faster
- R enabling rapid response to guidance
Validation Maturity
Key Developments:
1. Risk-Based Approaches
- {riskmetric} for package assessment
- Shared metric repositories (R Validation Hub)
- Automated quality scoring
- Focus on critical packages
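One way to picture automated quality scoring is a weighted average of per-package metrics that is turned into a risk tier. The sketch below is a generic illustration in Python and does not reproduce {riskmetric}'s actual metrics or scoring; the metric names, weights, and tier thresholds are assumptions.

```python
# Hypothetical metrics, each scaled to 0-1 where higher is better, with assumed weights.
WEIGHTS = {
    "test_coverage": 0.4,
    "has_vignettes": 0.1,
    "maintainer_activity": 0.2,
    "downloads": 0.3,
}


def risk_score(metrics, weights=WEIGHTS):
    """Return 1 minus the weighted mean quality; a missing metric counts as 0."""
    total = sum(weights.values())
    quality = sum(w * metrics.get(name, 0.0) for name, w in weights.items()) / total
    return 1.0 - quality


def risk_tier(score, low=0.25, high=0.5):
    """Map a score to a tier; the thresholds are illustrative, not an industry standard."""
    if score < low:
        return "low"
    if score < high:
        return "medium"
    return "high"


well_tested = {
    "test_coverage": 0.9,
    "has_vignettes": 1.0,
    "maintainer_activity": 0.8,
    "downloads": 0.7,
}
sparse = {"test_coverage": 0.2}

for name, metrics in [("well_tested", well_tested), ("sparse", sparse)]:
    score = risk_score(metrics)
    print(name, round(score, 2), risk_tier(score))
```

Treating missing metrics as zero is a deliberately conservative choice: a package with little evidence lands in the high-risk tier and gets the closer review that "focus on critical packages" implies.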
2. Acceptance-Test Driven Development (ATDD)
- Plain-language tests in Quarto
- Shareable with regulators
- "Given-when-then" format
- Brian Repko bringing JBehave concepts to R
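The given-when-then idea can be sketched as a plain-language scenario whose lines are bound to small step functions, in the spirit of JBehave. This is a minimal Python illustration, not the tooling presented at the conference; the scenario wording, the `step` decorator, and the tiny dataset are all made up.

```python
STEPS = {}


def step(text):
    """Register a function as the implementation of one plain-language step."""
    def register(func):
        STEPS[text] = func
        return func
    return register


@step("Given a subject-level dataset with a safety flag")
def given_dataset(ctx):
    # Fictional records; variable names follow common ADaM conventions.
    ctx["adsl"] = [
        {"USUBJID": "01", "ARM": "Active", "SAFFL": "Y"},
        {"USUBJID": "02", "ARM": "Active", "SAFFL": "Y"},
        {"USUBJID": "03", "ARM": "Placebo", "SAFFL": "Y"},
        {"USUBJID": "04", "ARM": "Placebo", "SAFFL": "N"},
    ]


@step("When subjects in the safety population are counted by arm")
def when_counted(ctx):
    counts = {}
    for row in ctx["adsl"]:
        if row["SAFFL"] == "Y":
            counts[row["ARM"]] = counts.get(row["ARM"], 0) + 1
    ctx["counts"] = counts


@step("Then each arm has the expected number of subjects")
def then_checked(ctx):
    assert ctx["counts"] == {"Active": 2, "Placebo": 1}


SCENARIO = """Given a subject-level dataset with a safety flag
When subjects in the safety population are counted by arm
Then each arm has the expected number of subjects"""


def run_scenario(text):
    """Execute each scenario line by looking up its registered step."""
    ctx = {}
    for line in text.splitlines():
        STEPS[line.strip()](ctx)
    return ctx


result = run_scenario(SCENARIO)
print("scenario passed:", result["counts"])
```

The scenario text is the part a non-programmer or a reviewer would read; the step functions stay with the programmers.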
3. Shiny App Validation
- Litmusverse suite (Jumping Rivers)
- Risk-based validation frameworks
- Code quality assessment
- Traceability and documentation tools
GxP-Ready AI
Pioneering work on validating AI applications:
- Devin Pastoor (A2-AI): Production AI in GxP contexts
- Testing strategies for non-deterministic outputs
- Audit trails and logging
- Version control and rollback procedures
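One commonly discussed strategy for testing non-deterministic outputs is to check invariants rather than exact text: every number in a generated narrative must appear in the source results, and required terms must be present. The sketch below illustrates that idea in Python; it is an assumption about how such a check could look, not the method presented by A2-AI, and the narratives and values are invented.

```python
import re


def numbers_in(text):
    """Extract integer and decimal literals from free text."""
    return set(re.findall(r"\d+(?:\.\d+)?", text))


def check_narrative(narrative, source_values, required_terms):
    """Flag numbers with no counterpart in the source and any missing terms."""
    allowed = {str(v) for v in source_values}
    unsupported = sorted(numbers_in(narrative) - allowed)
    missing = [t for t in required_terms if t.lower() not in narrative.lower()]
    return {
        "unsupported_numbers": unsupported,
        "missing_terms": missing,
        "passed": not unsupported and not missing,
    }


# Made-up source results: 12 of 84 subjects (14.3%) with headache.
source_values = [84, 12, 14.3]
required_terms = ["headache"]

# Two differently worded outputs, as a model might produce on separate runs.
narrative_a = "Headache occurred in 12 of 84 subjects (14.3%) in the active arm."
narrative_b = "Headache was reported by 12 subjects (15%) receiving active treatment."

check_a = check_narrative(narrative_a, source_values, required_terms)
check_b = check_narrative(narrative_b, source_values, required_terms)
print(check_a["passed"], check_b["unsupported_numbers"])
```

Both narratives are acceptable wording, but only the first passes, because the second introduces a rounded figure that is not in the source results.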
4. Automation & Efficiency
ARS/ARM as Game Changers
CDISC Analysis Results Standard driving automation:
Mosaic (Novartis):
- YAML captures ARD requirements
- Language-agnostic rules (metadata → R or any language)
- React UI for customization
- Push-button TFL generation
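The metadata-driven idea can be sketched as a specification that names the dataset, grouping variable, and statistics, with generic code turning it into result rows. The example below is a Python illustration only: the specification keys are hypothetical and do not reflect Mosaic's actual schema, and a Python dict stands in for the YAML document.

```python
import statistics

# Stands in for a YAML specification; every key name here is an assumption.
SPEC = {
    "dataset": "ADSL",
    "group_by": "ARM",
    "analyses": [{"variable": "AGE", "statistics": ["n", "mean"]}],
}

# Statistic names mapped to implementations; another language could supply its own.
STAT_FUNCS = {"n": len, "mean": statistics.mean, "median": statistics.median}


def build_ard(rows, spec):
    """Produce one result row per group, variable, and statistic in the spec."""
    ard = []
    groups = sorted({row[spec["group_by"]] for row in rows})
    for group in groups:
        subset = [row for row in rows if row[spec["group_by"]] == group]
        for analysis in spec["analyses"]:
            values = [row[analysis["variable"]] for row in subset]
            for stat in analysis["statistics"]:
                ard.append(
                    {
                        "group": group,
                        "variable": analysis["variable"],
                        "stat": stat,
                        "value": STAT_FUNCS[stat](values),
                    }
                )
    return ard


# Fictional subject-level records.
adsl = [
    {"USUBJID": "01", "ARM": "Active", "AGE": 50},
    {"USUBJID": "02", "ARM": "Active", "AGE": 60},
    {"USUBJID": "03", "ARM": "Placebo", "AGE": 40},
]

ard = build_ard(adsl, SPEC)
for row in ard:
    print(row)
```

Adding a statistic or a variable then means editing the specification rather than the code, which is what makes push-button generation plausible.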
{cards} Ecosystem:
- Analysis Results Data objects
- QC becomes straightforward (compare ARDs)
- LLMs can summarize language-agnostic results
- {crane} and {gtsummary} building on this foundation
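Why QC becomes a comparison of ARDs can be shown with a small sketch: if production and QC programming each emit results as rows keyed by group, variable, and statistic, double programming reduces to a keyed diff with a numeric tolerance. The row layout and function names below are assumptions for illustration and are not the {cards} data structure.

```python
import math


def ard_key(row):
    return (row["group"], row["variable"], row["stat"])


def compare_ards(primary, qc, rel_tol=1e-8):
    """Return a list of (key, problem) pairs; an empty list means the ARDs agree."""
    a = {ard_key(r): r["value"] for r in primary}
    b = {ard_key(r): r["value"] for r in qc}
    issues = []
    for key in sorted(a.keys() | b.keys()):
        if key not in a:
            issues.append((key, "missing in primary"))
        elif key not in b:
            issues.append((key, "missing in QC"))
        elif not math.isclose(a[key], b[key], rel_tol=rel_tol):
            issues.append((key, f"{a[key]} != {b[key]}"))
    return issues


# Made-up results from two independent programs.
primary = [
    {"group": "Active", "variable": "AGE", "stat": "n", "value": 2},
    {"group": "Active", "variable": "AGE", "stat": "mean", "value": 55.0},
]
qc = [
    {"group": "Active", "variable": "AGE", "stat": "n", "value": 2},
    {"group": "Active", "variable": "AGE", "stat": "mean", "value": 55.5},
]

issues = compare_ards(primary, qc)
print(issues)
```

Comparing values rather than rendered tables means formatting differences between the two programs never show up as QC findings.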
Report Automation
Time Savings Achieved:
- autoslideR (Roche): 0.5-4 days saved per slide deck
- DMC materials automation: exposure reports cut from about a week to about a day (AstraZeneca)
- LLM-powered {gtsummary}: Submission-ready tables in minutes vs hours
Technologies Enabling This:
- officer/flextable - Programmatic Word reports
- Quarto - Multi-format publishing (PDF, HTML, presentations)
- Template systems - Reusable structures (Cardinal)
- CI/CD pipelines - Automated reanalysis on code changes
5. Advanced Analytics & Methods
Bayesian Methods Maturing
Tools Reaching Production:
- BayesERtools (Genentech) - Exposure-response with Stan
- bmstate (Generable) - Multistate survival models
- Stan debugging workshop - Making Bayesian accessible
Advantages in Pharma:
- Prior information integration
- Uncertainty quantification
- Small sample performance
- Complex model flexibility
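Prior information integration is easiest to see in the textbook conjugate case: a Beta prior on a response rate combined with binomial data gives a Beta posterior. The Python sketch below uses invented numbers and is a generic illustration, unrelated to the specific models in BayesERtools or bmstate.

```python
def beta_binomial_update(prior_a, prior_b, responders, n):
    """Beta(a, b) prior plus responders out of n gives Beta(a + x, b + n - x)."""
    return prior_a + responders, prior_b + (n - responders)


def beta_mean(a, b):
    return a / (a + b)


# Assumed prior: Beta(4, 6), roughly 10 historical patients with a 40% response rate.
prior = (4, 6)

# Assumed new study data: 3 responders out of 10.
posterior = beta_binomial_update(*prior, responders=3, n=10)

print("prior mean:", beta_mean(*prior))
print("sample rate:", 3 / 10)
print("posterior:", posterior, "mean:", beta_mean(*posterior))
```

The posterior mean falls between the prior mean and the observed rate, which is the small-sample behaviour listed above: with little new data the estimate is stabilised by the prior, and with more data the prior's influence shrinks.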
High-Performance Computing
Polars for Clinical Data:
- Reported as 10-100x faster than pandas (workload-dependent)
- Apache Arrow native
- Lazy evaluation
- Parallel processing out-of-the-box
HPC Integration:
- Offloading Shiny computations to clusters
- Maintaining interactive UX
- Resource optimization
Machine Learning
TabPFN - Novel deep learning for tabular data:
- No task-specific training required (pre-trained on synthetic datasets drawn from a prior)
- Fast inference
- Bayesian-like uncertainty
- Promising for exploratory analysis
Synthetic Data:
- {synthpop} for privacy-preserving data sharing
- Software testing without PHI
- Model training on synthetic patients
6. Data Quality & Validation
{pointblank} Adoption
Comprehensive data validation framework:
Use Cases:
- Quick dataset understanding (scan_data())
- Validation at scale (35+ tables daily)
- Beautiful automated documentation
- Integration with databases, Arrow, Shiny
Pharmaceutical Applications:
- SDTM compliance checks
- ADaM dataset verification
- Cross-domain consistency
- Longitudinal data integrity
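The kinds of checks listed above can be pictured as small rule functions that return the offending row positions. The sketch below is plain Python and does not use or imitate the {pointblank} API; the rule names and the records are invented, while the variable names follow SDTM conventions.

```python
def check_required(rows, column):
    """Row indices where a required value is missing or empty."""
    return [i for i, row in enumerate(rows) if row.get(column) in (None, "")]


def check_allowed(rows, column, allowed):
    """Row indices whose value is outside the allowed set."""
    return [i for i, row in enumerate(rows) if row.get(column) not in allowed]


def check_foreign_key(child, parent, key):
    """Row indices in child whose key has no match in parent (cross-domain check)."""
    parent_keys = {row[key] for row in parent}
    return [i for i, row in enumerate(child) if row.get(key) not in parent_keys]


# Fictional DM and AE records with deliberate problems.
dm = [
    {"USUBJID": "001", "SEX": "M"},
    {"USUBJID": "002", "SEX": "X"},
    {"USUBJID": "", "SEX": "F"},
]
ae = [
    {"USUBJID": "001", "AETERM": "HEADACHE"},
    {"USUBJID": "003", "AETERM": "NAUSEA"},
]

report = {
    "DM.USUBJID is populated": check_required(dm, "USUBJID"),
    "DM.SEX in allowed values": check_allowed(dm, "SEX", {"M", "F", "U"}),
    "AE.USUBJID exists in DM": check_foreign_key(ae, dm, "USUBJID"),
}

for rule, failures in report.items():
    status = "PASS" if not failures else f"FAIL at rows {failures}"
    print(f"{rule}: {status}")
```

Returning row positions rather than a bare pass/fail makes the report usable as documentation of exactly which records need attention.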
Automated Traceability
DevOps Principles in Pharma:
Graticule's approach:
- Docker containers for reproducibility
- CI/CD for automatic reanalysis
- Version control with validated outputs
- Cloud storage (AWS S3) for results
Benefits:
- Clear audit trail
- Reproducible analyses
- Automated documentation
- Integrated QC
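A simple way to get an audit trail from such a pipeline is to record, for every run, the code version, the container image, and a hash of each output. The Python sketch below illustrates that idea with in-memory data; it is not Graticule's implementation, and the commit identifiers, image tags, and file names are placeholders.

```python
import hashlib
import json


def sha256_bytes(data):
    return hashlib.sha256(data).hexdigest()


def build_manifest(commit, image, outputs):
    """Record code version, environment, and a content hash per output file."""
    return {
        "git_commit": commit,
        "container_image": image,
        "outputs": {name: sha256_bytes(content) for name, content in sorted(outputs.items())},
    }


def changed_outputs(old, new):
    """Names of outputs that were added, removed, or whose content changed."""
    names = old["outputs"].keys() | new["outputs"].keys()
    return sorted(n for n in names if old["outputs"].get(n) != new["outputs"].get(n))


# Placeholder values; in a pipeline these would come from Git, Docker, and real files.
run_1 = build_manifest(
    "abc1234", "analysis:1.0",
    {"t_dm.rtf": b"demographics table", "t_ae.rtf": b"AE table v1"},
)
run_2 = build_manifest(
    "def5678", "analysis:1.0",
    {"t_dm.rtf": b"demographics table", "t_ae.rtf": b"AE table v2"},
)

print(json.dumps(run_2, indent=2))
print("changed since previous run:", changed_outputs(run_1, run_2))
```

Storing the manifest next to the results (for example in the same bucket) ties each output to the exact code and environment that produced it.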
7. Specialized Applications
Real-World Evidence
Pfizer's RWD Programming:
- Custom R package for database queries
- Dual R/SAS syntax support
- Shiny apps for workflow support
- Quarto documentation
Niche Medical Devices
Abbott's Synthetic Data:
- {synthpop} for diagnostics
- Privacy-preserving test data
- Validation datasets
Oncology Innovations
PMDA Inquiries:
- {cards} for rapid response (Japan regulatory)
- Structured ARD approach
- Accelerated turnaround
Pharmacokinetics
aNCA (Pharmaverse):
- Open-source NCA software
- PKNCA backend (200+ parameters)
- 100% test coverage
- Validated against commercial tools (±0.1%)
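For readers less familiar with NCA, the sketch below computes two standard parameters, AUC by the linear trapezoidal rule and Cmax/Tmax, and then applies a ±0.1% relative tolerance check of the kind used when comparing against a reference tool. It is a generic Python illustration with invented concentrations and reference values, not the PKNCA or aNCA implementation.

```python
def auc_linear_trapezoid(times, concs):
    """AUC from first to last time point: sum of (dt * mean of adjacent concentrations)."""
    if len(times) != len(concs) or len(times) < 2:
        raise ValueError("need at least two matching time/concentration points")
    auc = 0.0
    for i in range(1, len(times)):
        auc += (times[i] - times[i - 1]) * (concs[i] + concs[i - 1]) / 2
    return auc


def cmax_tmax(times, concs):
    """Maximum observed concentration and the first time it occurs."""
    i = max(range(len(concs)), key=concs.__getitem__)
    return concs[i], times[i]


def within_relative_tolerance(value, reference, tol=0.001):
    """True when value is within tol (0.001 = 0.1%) of the reference."""
    return abs(value - reference) <= tol * abs(reference)


# Invented profile: time in hours, concentration in ng/mL.
times = [0, 1, 2, 4, 8]
concs = [0, 10, 8, 4, 1]

auc = auc_linear_trapezoid(times, concs)
cmax, tmax = cmax_tmax(times, concs)
print("AUC:", auc, "Cmax:", cmax, "Tmax:", tmax)

# Hypothetical reference values standing in for a commercial tool's output.
print(within_relative_tolerance(auc, 36.02))
print(within_relative_tolerance(auc, 36.1))
```

The interval contributions here are 5, 9, 12, and 10, giving an AUC of 36; a reference of 36.02 is inside the 0.1% band and 36.1 is outside it.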
8. Regional Insights
Asia/Pacific Innovations
Strengths:
- Strong PMDA regulatory focus
- Practical automation tools
- Under-represented population research
- Genetic diversity considerations
Notable Presentations:
- SHIONOGI's open-source culture
- {meRlin} AI assistant
- PMDA e-CRT automation
- Vibe Coding approaches
Global Collaboration
The conference demonstrated a truly global R pharma community:
- Workshops from US, Europe, Asia
- Shared challenges and solutions
- Pharmaverse as unifying force
9. Future Directions
Short-Term (2025-2026)
AI Integration:
- More validated AI applications in production
- MCP ecosystem expansion
- Regulatory guidance on AI in submissions
- Cost-effective on-premise LLM solutions
Automation:
- ARS/ARM adoption across industry
- Push-button TFL generation becomes standard
- Automated slide decks and CSRs routine
Validation:
- Shared metric repositories operational
- Risk-based validation industry standard
- Shiny app validation frameworks mature
Medium-Term (2027-2028)
Tool Consolidation:
- Language-agnostic analysis platforms
- R/Python/Julia interoperability
- Cloud-native pharma analytics
Regulatory:
- AI-specific guidance documents
- Electronic submissions with code
- Real-time regulatory review platforms
Skills:
- AI/LLM literacy standard for programmers
- Bayesian methods more accessible
- DevOps practices ubiquitous
Long-Term (2029+)
Transformative Possibilities:
AI-Designed Clinical Trials
- LLMs suggesting optimal designs
- Automated SAP generation
- Real-time adaptation
Fully Automated TFLs
- From data lock to submission package
- Human review only, no coding
- Multi-language outputs for global submissions
Real-Time Safety Monitoring
- Continuous analysis during trials
- AI-detected signals
- Predictive adverse event modeling
Personalized Trial Design
- Bayesian adaptive with real-time learning
- Biomarker-driven enrollment
- Precision medicine integration
10. Challenges & Barriers
Technical Challenges
Validation Complexity:
- Non-deterministic AI outputs
- Rapidly evolving tools
- Keeping pace with innovation
Integration Issues:
- Legacy system compatibility
- Data silos
- IT security constraints
Organizational Challenges
Cultural Resistance:
- "SAS is what we've always used"
- Fear of change
- Risk aversion in regulated environment
Resource Constraints:
- Training investment required
- Dual maintenance (SAS + R) during transition
- Validation documentation burden
Regulatory Challenges
Uncertainty:
- Limited AI/LLM guidance
- Evolving expectations
- Regional variations (FDA vs EMA vs PMDA)
Documentation:
- What level of detail for AI systems?
- How to handle model updates?
- Third-party API dependencies
11. Success Patterns
What Works
Based on successful implementations presented:
1. Executive Sponsorship
- Clear targets (GSK's 50%)
- Resource commitment
- Long-term vision
2. Gradual Transformation
- Multi-wave approach
- Pilot programs
- Learn and adapt
3. Support Infrastructure
- Training programs (not just one-time)
- Help desks and office hours
- Documentation and resources
4. Community Building
- Internal R user groups
- External pharmaverse participation
- Shared learning culture
5. Platform Investment
- Posit Workbench
- Version control (Git/GitHub)
- CI/CD pipelines
- Cloud infrastructure
6. Standards Adoption
- CDISC ARS/ARM
- Pharmaverse conventions
- Style guides and linters
12. Actionable Insights
For Organizations Starting R Adoption
- Start with low-risk projects (exploratory analysis, visualizations)
- Invest in training early and continuously
- Build internal community (lunch & learns, Slack channels)
- Adopt pharmaverse packages (don't reinvent)
- Establish validation framework from day one
- Use Posit products (Workbench, Connect, Package Manager)
For Organizations Scaling R
- Embrace automation (ARS/ARM, template systems)
- Implement CI/CD for reproducibility
- Explore AI carefully (privacy first, validate thoroughly)
- Contribute to pharmaverse (share internal packages)
- Formalize support model (beyond training)
- Plan SAS sunset (don't maintain both indefinitely)
For R Professionals
- Learn AI/LLM basics (will be essential skill)
- Master one reporting framework (officer/flextable or Quarto)
- Understand validation (not just coding)
- Contribute to open source (career differentiator)
- Stay current (R/Pharma, webinars, blogs)
- Network (pharmaverse Slack, conferences)
13. The Bigger Picture
R/Pharma 2025 in Context
This conference revealed an industry at an inflection point:
From:
- Proprietary, siloed tools
- Manual, repetitive coding
- Conservative, risk-averse culture
- Isolated company solutions
To:
- Open-source, collaborative development
- Automated, standards-driven workflows
- Innovative, evidence-based adoption
- Shared industry platforms
Why This Matters
For Patients:
- Faster drug development
- More rigorous analysis
- Transparent, reproducible science
- Precision medicine enablement
For Industry:
- Reduced costs
- Faster time to market
- Better talent recruitment
- Competitive advantage
For Science:
- Reproducibility crisis addressed
- Method innovation accelerated
- Global collaboration enabled
- Democratized advanced analytics
14. How R/Pharma Has Evolved
Conference Evolution (2018-2025)
Based on publicly available information and this year's content, here's how the conference themes have shifted:
2018-2020: Foundation Years
Key Themes:
- Package development basics - Building pharma packages
- Basic Shiny apps - Interactive visualizations
- SAS vs R debates - "Should we switch?"
- Simple reporting - Basic TFLs
Sentiment: Cautious optimism, early adopters sharing success stories
Notable:
- Pharmaverse not yet established
- Validation was major concern
- FDA software neutrality clarification (2015) still fresh
2021-2022: Acceleration Phase
Key Themes:
- COVID acceleration - Remote work drove R adoption
- Pharmaverse launch - Collaborative ecosystem born
- Validation frameworks - {riskmetric}, R Validation Hub
- Production packages - {admiral}, {teal} gaining traction
Sentiment: Growing confidence, momentum building
Milestones:
- Major pharma (Roche, Novartis) sharing production solutions
- Regulatory submissions with R increasing
- Training programs formalized (GSK example)
2023-2024: Mainstream Adoption
Key Themes:
- CDISC integration - Analysis Results Standard (ARS/ARM)
- Automation focus - Template-based TFL generation
- Advanced analytics - Bayesian methods, ML
- Multi-language - R + Python workflows
Sentiment: R is mainstream, focus shifts to optimization
Evidence:
- Multiple talks on automation (not just feasibility)
- Validation becoming standardized, less controversial
- Advanced topics (multistate models, Stan) gaining space
2025: AI Revolution Year
Dominant Themes:
- AI/LLM explosion - 8+ sessions (50% increase)
- Enterprise transformation - GSK's 50%+ achievement
- Regulatory maturity - GxP-ready AI, validated Shiny
- Production at scale - From "can we?" to "how fast?"
Shift in Discourse:
- 2018: "Is R viable for pharma?"
- 2022: "How do we migrate from SAS?"
- 2025: "How do we integrate AI with R?"
What's New in 2025:
AI Dominance
- 2018-2022: Barely mentioned
- 2023: Experimental projects
- 2024: Pilot implementations
- 2025: Production systems ({llumen}, multi-agent copilots, enterprise frameworks)
Regulatory Confidence
- 2018-2020: "Will FDA accept this?"
- 2021-2022: "Here's how we validated"
- 2023-2024: "Validation frameworks established"
- 2025: "Validated AI in GxP" - previously unthinkable
Speed of Innovation
- 2018: Yearly package updates
- 2022: Quarterly releases (pharmaverse)
- 2025: Real-time AI integration - tools released months ago
Collaboration Level
- 2018: Individual company solutions
- 2020: Pharmaverse begins
- 2023: Cross-company working groups
- 2025: Shared AI infrastructure (MCP, metric repos)
Topic Frequency Evolution
| Topic | 2020 | 2022 | 2024 | 2025 |
|---|---|---|---|---|
| Package Development | 🔥🔥🔥 | 🔥🔥 | 🔥 | 🔥 |
| Validation | 🔥🔥🔥 | 🔥🔥🔥 | 🔥🔥 | 🔥 |
| Shiny Apps | 🔥🔥 | 🔥🔥🔥 | 🔥🔥 | 🔥🔥 |
| Clinical Reporting | 🔥🔥 | 🔥🔥 | 🔥🔥🔥 | 🔥🔥🔥 |
| Bayesian Methods | 🔥 | 🔥 | 🔥🔥 | 🔥🔥 |
| AI/LLM | - | - | 🔥 | 🔥🔥🔥🔥 |
| Automation | 🔥 | 🔥🔥 | 🔥🔥 | 🔥🔥🔥 |
| Python Integration | 🔥 | 🔥 | 🔥🔥 | 🔥🔥 |
Interpretation:
- Traditional topics (package dev, validation) declining as they become solved problems
- AI/LLM emerged from nowhere to dominate the conversation
- Automation shifted from "nice to have" to strategic imperative
- Python integration normalized (not R vs Python, but R and Python)
What Changed Between 2024 and 2025?
Major Shifts:
{ellmer} Package Launch (2024)
- Made LLM integration trivial in R
- Multiple workshops/presentations built on it
- Enterprise adoption accelerated
Claude/ChatGPT API Maturity
- Function calling standardized
- Model Context Protocol (MCP) emerged
- Privacy-preserving deployments viable
GSK Milestone
- 50%+ R code achieved
- Proof that full transformation possible
- Other pharma following suit
CDISC ARS Adoption
- Analysis Results Standard gaining traction
- Mosaic (Novartis) and others automating TFLs
- Machine-readable results enabling AI
Posit Product Evolution
- Positron IDE launched (built on Code OSS, the open-source base of VS Code, for R and Python)
- Better Python support
- AI assistants integrated
What Stayed Constant:
- Pharmaverse continues thriving
- Validation remains priority (but less controversial)
- Collaboration over competition
- Open-source commitment
Predictions for R/Pharma 2026
Based on the trajectory, expect:
Likely Topics:
- "AI in Submissions" - First AI-powered analysis in regulatory filing
- "Real-time Clinical Trial Analysis" - Adaptive designs with AI
- "Synthetic Patient Generation" - Privacy-preserving AI for trials
- "Quantum ML for Drug Discovery" - Cutting edge
- "Fully Automated Study Reports" - From data lock to CSR in hours
Maturing Topics:
- LLM integration will be assumed (not taught from scratch)
- Validation frameworks standardized across industry
- Python-R seamless interop (not separate tracks)
Declining Topics:
- "Intro to R packages" (basics widely known)
- "Why leave SAS?" (decision already made)
- "Can we validate R?" (settled question)
Bold Prediction:
R/Pharma 2026 will feature the first fully AI-generated clinical study report accepted by a regulatory agency. The discussion will shift from "Is this possible?" to "What should humans still review?"
15. Conclusion
R/Pharma 2025 demonstrated that open-source pharmaceutical analytics is not the future; it's the present. With AI integration, regulatory acceptance, and enterprise adoption converging, the question is no longer "Should we adopt R?" but "How fast can we transform?"
The conference showcased:
- Production-ready AI systems
- Validated, GxP-compliant workflows
- Major pharma success stories (GSK 50%+)
- Thriving ecosystem (pharmaverse)
- Regulatory confidence (FDA neutrality)
The momentum is undeniable. The future is open source. The time is now.
- R/Pharma website: rinpharma.com
- Pharmaverse: pharmaverse.org
- Posit: posit.co
- R Validation Hub: pharmar.org
Analysis compiled from R/Pharma 2025 Conference materials | Last updated: November 2025