# ClinicalTrials.gov Tool: Current State & Future Improvements **Status**: Currently Implemented **Priority**: High (Core Data Source for Drug Repurposing) --- ## Current Implementation ### What We Have (`src/tools/clinicaltrials.py`) - V2 API search via `clinicaltrials.gov/api/v2/studies` - Filters: `INTERVENTIONAL` study type, `RECRUITING` status - Returns: NCT ID, title, conditions, interventions, phase, status - Query preprocessing via shared `query_utils.py` ### Current Strengths 1. **Good Filtering**: Already filtering for interventional + recruiting 2. **V2 API**: Using the modern API (v1 deprecated) 3. **Phase Info**: Extracting trial phases for drug development context ### Current Limitations 1. **No Outcome Data**: Missing primary/secondary outcomes 2. **No Eligibility Criteria**: Missing inclusion/exclusion details 3. **No Sponsor Info**: Missing who's running the trial 4. **No Result Data**: For completed trials, no efficacy data 5. **Limited Drug Mapping**: No integration with drug databases --- ## API Capabilities We're Not Using ### Fields We Could Request ```python # Current fields fields = ["NCTId", "BriefTitle", "Condition", "InterventionName", "Phase", "OverallStatus"] # Additional valuable fields additional_fields = [ "PrimaryOutcomeMeasure", # What are they measuring? "SecondaryOutcomeMeasure", # Secondary endpoints "EligibilityCriteria", # Who can participate? "LeadSponsorName", # Who's funding? "ResultsFirstPostDate", # Has results? "StudyFirstPostDate", # When started? "CompletionDate", # When finished? "EnrollmentCount", # Sample size "InterventionDescription", # Drug details "ArmGroupLabel", # Treatment arms "InterventionOtherName", # Drug aliases ] ``` ### Filter Enhancements ```python # Current aggFilters = "studyType:INTERVENTIONAL,status:RECRUITING" # Could add "status:RECRUITING,ACTIVE_NOT_RECRUITING,COMPLETED" # Include completed for results "phase:PHASE2,PHASE3" # Only later-stage trials "resultsFirstPostDateRange:2020-01-01_" # Trials with posted results ``` --- ## Recommended Improvements ### Phase 1: Richer Metadata ```python EXTENDED_FIELDS = [ "NCTId", "BriefTitle", "OfficialTitle", "Condition", "InterventionName", "InterventionDescription", "InterventionOtherName", # Drug synonyms! "Phase", "OverallStatus", "PrimaryOutcomeMeasure", "EnrollmentCount", "LeadSponsorName", "StudyFirstPostDate", ] ``` ### Phase 2: Results Retrieval For completed trials, we can get actual efficacy data: ```python async def get_trial_results(nct_id: str) -> dict | None: """Fetch results for completed trials.""" url = f"https://clinicaltrials.gov/api/v2/studies/{nct_id}" params = { "fields": "ResultsSection", } # Returns outcome measures and statistics ``` ### Phase 3: Drug Name Normalization Map intervention names to standard identifiers: ```python # Problem: "Metformin", "Metformin HCl", "Glucophage" are the same drug # Solution: Use RxNorm or DrugBank for normalization async def normalize_drug_name(intervention: str) -> str: """Normalize drug name via RxNorm API.""" url = f"https://rxnav.nlm.nih.gov/REST/rxcui.json?name={intervention}" # Returns standardized RxCUI ``` --- ## Integration Opportunities ### With PubMed Cross-reference trials with publications: ```python # ClinicalTrials.gov provides PMID links # Can correlate trial results with published papers ``` ### With DrugBank/ChEMBL Map interventions to: - Mechanism of action - Known targets - Adverse effects - Drug-drug interactions --- ## Python Libraries to Consider | Library | Purpose | Notes | |---------|---------|-------| | [pytrials](https://pypi.org/project/pytrials/) | CT.gov wrapper | V2 API support unclear | | [clinicaltrials](https://github.com/ebmdatalab/clinicaltrials-act-tracker) | Data tracking | More for analysis | | [drugbank-downloader](https://pypi.org/project/drugbank-downloader/) | Drug mapping | Requires license | --- ## API Quirks & Gotchas 1. **Rate Limiting**: Undocumented, be conservative 2. **Pagination**: Max 1000 results per request 3. **Field Names**: Case-sensitive, camelCase 4. **Empty Results**: Some fields may be null even if requested 5. **Status Changes**: Trials change status frequently --- ## Example Enhanced Query ```python async def search_drug_repurposing_trials( drug_name: str, condition: str, include_completed: bool = True, ) -> list[Evidence]: """Search for trials repurposing a drug for a new condition.""" statuses = ["RECRUITING", "ACTIVE_NOT_RECRUITING"] if include_completed: statuses.append("COMPLETED") params = { "query.intr": drug_name, "query.cond": condition, "filter.overallStatus": ",".join(statuses), "filter.studyType": "INTERVENTIONAL", "fields": ",".join(EXTENDED_FIELDS), "pageSize": 50, } ``` --- ## Sources - [ClinicalTrials.gov API Documentation](https://clinicaltrials.gov/data-api/api) - [CT.gov Field Definitions](https://clinicaltrials.gov/data-api/about-api/study-data-structure) - [RxNorm API](https://lhncbc.nlm.nih.gov/RxNav/APIs/api-RxNorm.findRxcuiByString.html)