META_PLAN: DeepBoner Stabilization Roadmap
Created: 2025-12-03
Status: Active
Purpose: Single source of truth for what to do next before adding features
Executive Summary
Codebase Health: PRODUCTION-READY
- 317 tests passing
- No type errors (mypy clean)
- No linting issues (ruff clean)
- 1 open bug (P3 - low-priority UX)
Key Finding: Architecture is sound. Both high-impact specs (SPEC_13, SPEC_14) are now implemented. Documentation is sprawling but mostly accurate.
Recommendation: Clean up tech debt (Anthropic wiring), then organize docs.
Current State Assessment
Documentation Status
| Document | Status | Action |
|---|---|---|
| docs/STATUS_LLAMAINDEX_INTEGRATION.md | DONE | Keep as-is |
| docs/specs/archive/SPEC_13_EVIDENCE_DEDUPLICATION.md | ✅ IMPLEMENTED | Archived |
| docs/specs/archive/SPEC_14_CLINICALTRIALS_OUTCOMES.md | ✅ IMPLEMENTED | Archived |
| docs/future-roadmap/TOOL_ANALYSIS_CRITICAL.md | ANALYSIS DONE | Reference for future |
| docs/ARCHITECTURE.md | PARTIAL | Expand with diagrams |
| docs/architecture/system_registry.md | DONE | Canonical SSOT for wiring |
Architecture Status
| Component | Status | Notes |
|---|---|---|
| src/orchestrators/ | COMPLETE | Factory pattern, protocols |
| src/clients/ | COMPLETE | OpenAI/HuggingFace working, Anthropic partial (tech debt) |
| src/tools/ | COMPLETE | Deduplication + outcomes extraction done |
| src/agents/ | FUNCTIONAL | All agents wired, some experimental |
| src/services/ | COMPLETE | Embeddings, RAG, memory all working |
Open Issues
| Issue | Priority | Effort |
|---|---|---|
| SPEC_13 Evidence Deduplication | ✅ DONE | |
| SPEC_14 ClinicalTrials Outcomes | ✅ DONE | |
| Remove Anthropic wiring (P3) | P3 | 1 hour |
| Expand ARCHITECTURE.md | MEDIUM | 2 hours |
| P3 Progress Bar positioning | P3 | 30 min |
The Next 5 Steps
Step 1: Implement SPEC_13 - Evidence Deduplication ✅ COMPLETE
Priority: HIGH (DONE) | Effort: 3-4 hours | Impact: 30-50% token savings
✅ COMPLETED - Deduplication now removes duplicate papers returned by PubMed, Europe PMC, and OpenAlex.
Files modified:
- src/tools/search_handler.py - Added extract_paper_id() and deduplicate_evidence()
- src/tools/openalex.py - Extracts PMID from work.ids.pmid
- tests/unit/tools/test_search_handler.py - 22 dedup tests
- tests/integration/test_search_deduplication.py - Integration test
Spec: docs/specs/archive/SPEC_13_EVIDENCE_DEDUPLICATION.md (Status: Implemented)
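A minimal sketch of ID-based deduplication, for orientation only. The function names match those listed above, but the bodies, the Evidence dataclass, and the URL patterns are assumptions, not the code in src/tools/search_handler.py.

```python
from __future__ import annotations

import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Evidence:
    """Hypothetical stand-in for the project's evidence model."""

    title: str
    url: str
    source: str


# Illustrative URL shapes only; the real handler may match other forms.
_PMID_PATTERNS = (
    re.compile(r"pubmed\.ncbi\.nlm\.nih\.gov/(\d+)"),
    re.compile(r"europepmc\.org/(?:article|abstract)/MED/(\d+)", re.IGNORECASE),
)
_DOI_PATTERN = re.compile(r"doi\.org/(10\.\d{4,9}/\S+)", re.IGNORECASE)


def extract_paper_id(evidence: Evidence) -> str | None:
    """Return a normalized ID ('PMID:123' or 'DOI:10.x/y'), or None if absent."""
    for pattern in _PMID_PATTERNS:
        match = pattern.search(evidence.url)
        if match:
            return f"PMID:{match.group(1)}"
    match = _DOI_PATTERN.search(evidence.url)
    if match:
        return f"DOI:{match.group(1).lower()}"
    return None


def deduplicate_evidence(items: list[Evidence]) -> list[Evidence]:
    """Keep the first item per paper ID; items without an ID are always kept."""
    seen: set[str] = set()
    unique: list[Evidence] = []
    for item in items:
        paper_id = extract_paper_id(item)
        if paper_id is not None:
            if paper_id in seen:
                continue  # same paper already returned by another source
            seen.add(paper_id)
        unique.append(item)
    return unique
```

Keeping items that have no recognizable ID is a deliberate choice in this sketch: dropping them would risk discarding distinct papers.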
Step 2: Implement SPEC_14 - ClinicalTrials Outcomes ✅ COMPLETE
Priority: HIGH (DONE) | Effort: 2-3 hours | Impact: Critical efficacy data
✅ COMPLETED - ClinicalTrials now extracts outcome measures and results status.
Files modified:
- src/tools/clinicaltrials.py - Added OutcomesModule, HasResults fields, _extract_primary_outcome()
- tests/unit/tools/test_clinicaltrials.py - 4 outcome tests + 2 integration tests
Spec: docs/specs/archive/SPEC_14_CLINICALTRIALS_OUTCOMES.md (Status: Implemented)
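A rough sketch of what outcome extraction could look like. It assumes the study record is a dict shaped like a ClinicalTrials.gov API v2 response (protocolSection.outcomesModule.primaryOutcomes plus a top-level hasResults flag). The function names and output format are hypothetical and are not taken from src/tools/clinicaltrials.py.

```python
from __future__ import annotations

from typing import Any


def extract_primary_outcome(study: dict[str, Any]) -> str | None:
    """Return the first primary outcome as 'measure (time frame)', or None."""
    outcomes = (
        study.get("protocolSection", {})
        .get("outcomesModule", {})
        .get("primaryOutcomes", [])
    )
    if not outcomes:
        return None
    first = outcomes[0]
    measure = first.get("measure", "").strip()
    if not measure:
        return None
    time_frame = first.get("timeFrame", "").strip()
    return f"{measure} ({time_frame})" if time_frame else measure


def summarize_results_status(study: dict[str, Any]) -> str:
    """Map the assumed hasResults flag to a short human-readable status."""
    return "results posted" if study.get("hasResults") else "no results posted"
```

Every lookup uses .get() with a default so that a study missing the outcomes module yields None rather than raising.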
Step 3: Remove Anthropic Tech Debt
Priority: P3 | Effort: 1 hour | Impact: Code clarity
Anthropic is partially wired but NOT supported (it has no embeddings API), and the leftover wiring creates confusion about which providers work.
Files to modify:
- src/utils/config.py - Remove ANTHROPIC_API_KEY handling
- src/clients/factory.py - Remove Anthropic case
- src/agent_factory/judges.py - Remove Anthropic references
- CLAUDE.md - Update documentation
Doc: docs/future-roadmap/P3_REMOVE_ANTHROPIC_PARTIAL_WIRING.md
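One way to confirm the cleanup is complete is to scan the repository for leftover references. The helper below is a hypothetical sketch, not an existing project script; the set of file suffixes it scans is an assumption.

```python
from __future__ import annotations

from pathlib import Path


def find_references(
    root: Path,
    needle: str = "anthropic",
    suffixes: tuple[str, ...] = (".py", ".md", ".toml"),
) -> list[tuple[Path, int, str]]:
    """Return (path, line number, line) for each case-insensitive match under root."""
    hits: list[tuple[Path, int, str]] = []
    for path in sorted(root.rglob("*")):
        if not path.is_file() or path.suffix not in suffixes:
            continue
        try:
            text = path.read_text(encoding="utf-8")
        except (UnicodeDecodeError, OSError):
            continue  # skip unreadable or non-UTF-8 files
        for line_no, line in enumerate(text.splitlines(), start=1):
            if needle in line.lower():
                hits.append((path, line_no, line.strip()))
    return hits


if __name__ == "__main__":
    for path, line_no, line in find_references(Path(".")):
        print(f"{path}:{line_no}: {line}")
```

A scan like this will also flag legitimate mentions, such as CLAUDE.md itself or this roadmap, so the output needs a manual pass rather than a hard zero-hits gate.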
Step 4: Documentation Consolidation
Priority: MEDIUM | Effort: 2 hours | Impact: Developer clarity
Create single canonical architecture doc with:
- System flow diagram
- Component interaction map
- Error handling patterns
- Deployment topology
Output: Expanded docs/ARCHITECTURE.md
Step 5: Create Implementation Status Matrix
Priority: LOW | Effort: 1 hour | Impact: Project tracking
Update docs/index.md or create docs/IMPLEMENTATION_STATUS.md with:
- Phase completion tracking (14 phases)
- Post-hackathon roadmap status
- Clear DONE vs TODO markers
What NOT To Do (Yet)
- Add new features - Stabilize first
- Add new LLM providers - OpenAI/HuggingFace cover all use cases
- Build Neo4j knowledge graph - Overkill for current needs
- Implement full-text retrieval - Phase 15+ (after stabilization)
- Add MeSH term expansion - Phase 15+ (optimization)
Documentation Sprawl Analysis
Total docs: 91 markdown files in docs/
Organization:
docs/
├── architecture/     # Canonical architecture docs (4 files)
├── brainstorming/    # Ideas, not commitments (6 files)
├── bugs/             # Active bugs + archive (25+ files)
├── decisions/        # ADRs from Nov 2025 (2 files)
├── development/      # Dev guides (1 file)
├── future-roadmap/   # Deferred work (5 files)
├── guides/           # User guides (1 file)
├── implementation/   # Phase docs 1-14 (15 files)
├── specs/            # Feature specs (4 files)
├── ARCHITECTURE.md   # High-level overview
└── index.md          # Entry point
Recommendation: Structure is fine. Both SPEC_13 and SPEC_14 are now implemented.
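The per-directory file counts above drift as docs are added or archived. A small helper along these lines could recompute them. It is a hypothetical sketch that assumes it is pointed at the docs/ directory and counts only .md files.

```python
from __future__ import annotations

from collections import Counter
from pathlib import Path


def count_markdown_files(docs_root: Path) -> Counter[str]:
    """Count .md files per top-level subdirectory of docs_root (recursively)."""
    counts: Counter[str] = Counter()
    for path in docs_root.rglob("*.md"):
        relative = path.relative_to(docs_root)
        # Files directly under docs_root go into a separate bucket.
        bucket = relative.parts[0] if len(relative.parts) > 1 else "(top level)"
        counts[bucket] += 1
    return counts


if __name__ == "__main__":
    totals = count_markdown_files(Path("docs"))
    for name, count in sorted(totals.items()):
        print(f"{name}: {count}")
    print(f"total: {sum(totals.values())}")
```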
Success Criteria
After completing Steps 1-5:
- Evidence deduplication reduces duplicate papers by 80%+ ✅
- ClinicalTrials shows outcome measures and results status ✅
- No Anthropic references in codebase
- ARCHITECTURE.md has flow diagrams
- All 14 implementation phases marked DONE/TODO
Decision Log
| Date | Decision | Rationale |
|---|---|---|
| 2025-12-03 | Implement specs before doc cleanup | Specs are ready, high impact |
| 2025-12-03 | Remove Anthropic over adding Gemini | Tech debt cleanup > new features |
| 2025-12-03 | Defer full-text retrieval | Stabilize core first |
| 2025-12-03 | Mark SPEC_13 complete | All acceptance criteria verified, PR #122 |
| 2025-12-03 | Mark SPEC_14 complete | All acceptance criteria verified (was already implemented) |
References
- docs/architecture/system_registry.md - Decorator/marker/tool wiring SSOT
- docs/bugs/ACTIVE_BUGS.md - Current bug tracking
- CLAUDE.md - Development commands and patterns