Sanayana Computational Genomics Group

Pioneering AI-Driven Solutions for Rare Disease Genomics

Our Mission

The Sanayana Computational Genomics Group is committed to advancing the frontiers of genomic medicine. We harness the power of artificial intelligence and machine learning to decipher the complexities of the human genome, with a primary focus on improving diagnostic rates and therapeutic insights for rare diseases. Our interdisciplinary team integrates expertise in clinical genetics, computational biology, and AI to translate research into tangible clinical impact.

The Diagnostic Odyssey in Rare Diseases

Rare diseases collectively affect millions worldwide, yet the diagnostic journey is often protracted and inconclusive. A significant proportion of these cases, estimated at over 40%, remain unresolved due to the challenges in interpreting non-coding genomic variants (Global Rare Diseases Initiative Report, 2023). Current analytical pipelines struggle to synthesize heterogeneous data types, including unstructured clinical narratives and complex genomic signals.

Figure: Current Diagnostic Landscape for Suspected Genetic Rare Diseases. Diagnosed (~60%); Undiagnosed (~40%).

Project: CLINVAR-ASP

To address this critical gap, we are developing CLINVAR-ASP (Clinical Variant Reasoning via AI Semantic Processing), an innovative framework designed to enhance the interpretation of non-coding variants.

Methodological Framework

CLINVAR-ASP High-Level Workflow
1. Data Ingestion (Clinical Notes, Genomics)
2. Claude AI Semantic Processing & Evidence Synthesis
3. Multi-Modal Integration & Conservation Analysis
4. Variant Prioritization & Clinical Report
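
The workflow stages above can be sketched as a chain of small functions. This is only an illustrative skeleton under stated assumptions, not the CLINVAR-ASP implementation: every name here (`Case`, `RankedVariant`, `ingest`, `extract_evidence`, `integrate`, `prioritize`, `render_report`, `keyword_extractor`) is hypothetical, the variant identifiers and scores are made-up toy values, the language-model step is replaced by a simple keyword matcher so that no real API is called, and the additive scoring rule is a placeholder chosen for readability.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Case:
    case_id: str
    clinical_note: str   # unstructured clinical narrative
    variants: list[str]  # hypothetical variant identifiers


@dataclass
class RankedVariant:
    variant: str
    score: float


def ingest(case_id: str, note: str, variants: list[str]) -> Case:
    """Stage 1: bundle clinical notes and genomic data into one record."""
    return Case(case_id, note.strip(), list(variants))


def extract_evidence(case: Case, extractor: Callable[[str], list[str]]) -> list[str]:
    """Stage 2: semantic processing. `extractor` stands in for the
    language-model step; no real API is called in this sketch."""
    return extractor(case.clinical_note)


def integrate(
    case: Case,
    phenotypes: list[str],
    conservation: dict[str, float],
    known_links: dict[str, set[str]],
) -> list[RankedVariant]:
    """Stage 3: combine a conservation score with phenotype overlap.
    The additive score is a toy assumption, not the project's method."""
    observed = set(phenotypes)
    return [
        RankedVariant(v, conservation.get(v, 0.0) + len(known_links.get(v, set()) & observed))
        for v in case.variants
    ]


def prioritize(scored: list[RankedVariant]) -> list[RankedVariant]:
    """Stage 4a: order variants with the highest score first."""
    return sorted(scored, key=lambda r: r.score, reverse=True)


def render_report(case: Case, ranked: list[RankedVariant]) -> str:
    """Stage 4b: produce a plain-text summary of the ranking."""
    lines = [f"Case {case.case_id}"]
    lines += [f"{i}. {r.variant} (score {r.score:.2f})" for i, r in enumerate(ranked, 1)]
    return "\n".join(lines)


def keyword_extractor(note: str) -> list[str]:
    """Hypothetical stand-in extractor: simple keyword matching."""
    vocabulary = ("seizures", "hypotonia", "ataxia")
    return [term for term in vocabulary if term in note.lower()]


# Toy inputs, invented purely for illustration.
case = ingest("case-1", "  Infant with seizures and hypotonia.  ", ["varA", "varB"])
phenotypes = extract_evidence(case, keyword_extractor)
scored = integrate(
    case,
    phenotypes,
    conservation={"varA": 0.2, "varB": 0.9},
    known_links={"varA": {"seizures", "hypotonia"}},
)
ranked = prioritize(scored)
report = render_report(case, ranked)
print(report)
```

Passing the extractor in as a callable keeps the semantic-processing stage swappable, so a model-backed extractor could replace the keyword matcher without touching the other stages.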

Key components of CLINVAR-ASP include:

The conceptual basis for CLINVAR-ASP builds upon our preliminary work in AI-driven variant prioritization [Mehta A, Kapoor R, et al. (2023). Enhancing Non-Coding Variant Interpretation with Large Language Models. *bioRxiv* doi:10.1101/2024.05.30.596789].

The Intellectual Role of Claude AI

Anthropic's Claude is not merely a tool but an intellectual partner in CLINVAR-ASP, tasked with complex cognitive functions:

Figure: Comparative Analysis Time per Case (Pilot Study). Manual Curation: ~8 hrs; Claude-Assisted: ~0.5 hrs.

Our pilot studies indicate that Claude integration can reduce literature curation and initial hypothesis generation time per case from approximately 8 hours to under 30 minutes (at least a 16-fold reduction), while improving the accuracy of phenotype-variant mapping by an estimated 32-41% [Sharma P, Mehta A. (2024). Pilot Validation of an AI-Augmented Workflow for Rare Disease Diagnostics. *Proc. AI Med. Conf.*, 112-115].

Our Team

The Sanayana Group comprises a dedicated team of researchers and clinicians:

Dr. Arjun Mehta: Principal Investigator. Expertise: Computational Biology, Genomic AI.
Dr. Priya Sharma: Co-Investigator. Expertise: Clinical Genetics, Rare Disease Diagnostics.
Rahul Kapoor, MS: Lead Bioinformatician. Expertise: Machine Learning, NLP Pipelines.
Neha Patel: PhD Candidate. Expertise: Biomedical NLP, Data Curation.

Supported by a dynamic group of postgraduate researchers and clinical scientists.

Anticipated Impact & Outcomes

Successful completion of the CLINVAR-ASP project is anticipated to yield:

Beyond diagnostics, the methodologies developed hold promise for applications in agricultural genomics (e.g., crop improvement with CGIAR) and accelerating target identification in drug discovery programs.

Project Timeline

The CLINVAR-ASP project is structured over a 12-month execution plan:

M1-3: Data Curation & Claude Pipeline Dev.
M4-6: Model Training & Initial Validation
M7-9: Clinical Pilot & Iterative Refinement
M10-12: Deployment, Dissemination & Reporting

Selected Publications & Preprints

Collaborations & Support

This research is strengthened by collaborations with NHS Genomic Medicine Centres (London, Manchester, Birmingham) and benefits from institutional support. We are actively seeking API credit support from Anthropic's AI for Science Program to maximize the computational scope and impact of this project. All research involving patient data is conducted under IRB approval (IRB-2024-7890) and in compliance with HIPAA.