VibecoderMcSwaggins committed on
Commit cd7c282 · unverified · 1 Parent(s): 97907da

feat(SPEC-16): Unified Chat Client Architecture (#115)


* chore: Update dependencies and verify SPEC-16 for Unified Chat Client

* feat: Implement unified ChatClient architecture (SPEC-16) Phase 1

* refactor: Deprecate Simple Mode and map to Unified Advanced Mode (SPEC-16 Phase 2)

* refactor: Complete SPEC-16 cleanup - remove stale dual-mode tests

- Delete obsolete e2e/integration tests referencing removed functions
(check_magentic_requirements, mode="simple", etc.)
- Update unit tests for unified architecture (no mode parameter)
- Fix type errors in HuggingFaceChatClient (add type: ignore for untyped base)
- Remove mode toggle from Gradio UI
- Add ChatClient factory tests

Closes #105, Fixes #113
Refs #114 (tech debt: naming cleanup deferred)

* chore: Sync pre-commit mypy with project dependencies

Add agent-framework-core to pre-commit additional_dependencies so
mypy runs with the same type information in pre-commit hooks as in
`make typecheck`.

Previously, the pre-commit mypy hook ran in isolation without
agent_framework types, causing BaseChatClient to appear as Any.

* style: Format files for CI compliance

* chore: Sync ruff version (0.14.7) between pre-commit and uv.lock

Fixes divergence where pre-commit used v0.14.7 but CI/local used v0.14.6,
causing formatting differences.

* fix: Address CodeRabbit review findings (PR #115)

## Factory (CRITICAL)
- Add case-insensitive provider matching (OpenAI → openai)
- Raise ValueError for unsupported providers (no silent fallback)
- Fix misleading Gemini log (now warns + falls through)

## HuggingFace Client (CRITICAL + MAJOR)
- Fix Role enum conversion: use .value, not str(enum)
  - str(Role.USER) → "Role.USER" (wrong)
  - Role.USER.value → "user" (correct)
- Fix temperature/max_tokens: use `is not None` instead of `or`
- `or` treats 0/0.0 as falsy, breaking temperature=0.0
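Both fixes fit in a few lines (a standalone illustration; `Role` here is a stand-in enum, not the framework's actual class):

```python
from enum import Enum

class Role(Enum):
    USER = "user"

# str() on an enum member includes the class name; .value is the wire format
assert str(Role.USER) == "Role.USER"
assert Role.USER.value == "user"

# `x or default` silently replaces falsy values like 0.0;
# an explicit None check honors temperature=0.0
def resolve_temperature(temperature, default=0.7):
    return temperature if temperature is not None else default

assert resolve_temperature(0.0) == 0.0  # explicit check keeps 0.0
assert (0.0 or 0.7) == 0.7              # the `or` pattern loses it
```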

## Tests
- Add test for unsupported provider ValueError
- Add test for case-insensitive provider matching
- Add test for Role enum conversion

* fix: Apply same defensive patterns codebase-wide

## Case-insensitive provider matching
- llm_factory.py: Normalize llm_provider before comparison
- config.py: Normalize llm_provider in get_api_key()

## Explicit None checks for numeric defaults
- judge.py: total_evidence_count=0 is now honored

These are the same anti-patterns fixed in the CodeRabbit review,
now applied consistently across the codebase.
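The defensive pattern can be sketched with a hypothetical `normalize_provider` helper (the actual code normalizes inline in `llm_factory.py` and `config.py`; the provider set is taken from this PR's factory work):

```python
SUPPORTED_PROVIDERS = {"openai", "gemini", "huggingface"}

def normalize_provider(raw: str) -> str:
    """Normalize before comparison; raise instead of silently falling back."""
    provider = raw.strip().lower()
    if provider not in SUPPORTED_PROVIDERS:
        raise ValueError(f"Unsupported LLM provider: {raw!r}")
    return provider

assert normalize_provider("OpenAI") == "openai"
assert normalize_provider(" HuggingFace ") == "huggingface"
```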

Files changed (43)
  1. .pre-commit-config.yaml +1 -0
  2. docs/bugs/ACTIVE_BUGS.md +20 -1
  3. docs/bugs/P0_SIMPLE_MODE_FORCED_SYNTHESIS_BYPASS.md +219 -0
  4. docs/specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md +246 -175
  5. pyproject.toml +1 -1
  6. src/agents/code_executor_agent.py +4 -7
  7. src/agents/magentic_agents.py +10 -22
  8. src/agents/retrieval_agent.py +4 -7
  9. src/app.py +25 -76
  10. src/clients/__init__.py +0 -0
  11. src/clients/base.py +19 -0
  12. src/clients/factory.py +76 -0
  13. src/clients/huggingface.py +191 -0
  14. src/orchestrators/__init__.py +16 -15
  15. src/orchestrators/advanced.py +40 -38
  16. src/orchestrators/factory.py +20 -73
  17. src/orchestrators/simple.py +0 -778
  18. src/prompts/judge.py +2 -1
  19. src/utils/config.py +18 -4
  20. src/utils/llm_factory.py +23 -60
  21. tests/e2e/test_advanced_mode.py +0 -70
  22. tests/e2e/test_simple_mode.py +0 -65
  23. tests/integration/test_dual_mode_e2e.py +0 -83
  24. tests/integration/test_simple_mode_synthesis.py +0 -157
  25. tests/unit/agents/test_magentic_agents_domain.py +8 -8
  26. tests/unit/agents/test_magentic_judge_termination.py +26 -14
  27. tests/unit/clients/__init__.py +1 -0
  28. tests/unit/clients/test_chat_client_factory.py +211 -0
  29. tests/unit/orchestrators/test_advanced_orchestrator.py +21 -17
  30. tests/unit/orchestrators/test_advanced_orchestrator_domain.py +15 -20
  31. tests/unit/orchestrators/test_factory_domain.py +7 -9
  32. tests/unit/orchestrators/test_simple_orchestrator_domain.py +0 -47
  33. tests/unit/orchestrators/test_simple_synthesis.py +0 -320
  34. tests/unit/orchestrators/test_termination.py +0 -104
  35. tests/unit/test_app_domain.py +43 -34
  36. tests/unit/test_gradio_crash.py +2 -2
  37. tests/unit/test_magentic_fix.py +0 -101
  38. tests/unit/test_magentic_termination.py +0 -155
  39. tests/unit/test_orchestrator.py +0 -290
  40. tests/unit/test_orchestrator_factory.py +20 -25
  41. tests/unit/test_streaming_fix.py +2 -1
  42. tests/unit/test_ui_elements.py +38 -18
  43. uv.lock +23 -23
.pre-commit-config.yaml CHANGED
@@ -18,4 +18,5 @@ repos:
   - pydantic-settings>=2.2
   - tenacity>=8.2
   - pydantic-ai>=0.0.16
+  - agent-framework-core>=1.0.0b251120
   args: [--ignore-missing-imports]
docs/bugs/ACTIVE_BUGS.md CHANGED
@@ -7,7 +7,26 @@

  ## P0 - Blocker

- _No active P0 bugs._
+ ### P0 - Simple Mode Ignores Forced Synthesis (Issue #113)
+ **File:** `docs/bugs/P0_SIMPLE_MODE_FORCED_SYNTHESIS_BYPASS.md`
+ **Issue:** [#113](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/113)
+ **Found:** 2025-12-01 (Free Tier Testing)
+
+ **Problem:** When HuggingFace Inference fails 3 times, the Judge returns `recommendation="synthesize"`, but Simple Mode's `_should_synthesize()` ignores it due to strict score thresholds (it requires `combined_score >= 10`, while forced synthesis has score 0).
+
+ **Impact:** Free tier users see 10 iterations of "Gathering more evidence" despite the Judge saying "synthesize".
+
+ **Root Cause:** Coordination bug between two fixes:
+ - **PR #71 (SPEC_06):** Added `_should_synthesize()` with strict thresholds
+ - **Commit 5e761eb:** Added `_create_forced_synthesis_assessment()` with `score=0, confidence=0.1`
+ - These don't work together: forced synthesis bypasses nothing.
+
+ **Strategic Fix:** [SPEC_16: Unified Chat Client Architecture](../specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md) - **INTEGRATION, NOT DELETION**
+ - Create `HuggingFaceChatClient` adapter for Microsoft Agent Framework
+ - **INTEGRATE** Simple Mode's free-tier capability into Advanced Mode
+ - Users without API keys → Advanced Mode with HuggingFace backend (capability PRESERVED)
+ - Retire Simple Mode's redundant orchestration CODE (not the capability!)
+ - The bug disappears because Advanced Mode handles termination correctly (Manager agent signals)

  ---
docs/bugs/P0_SIMPLE_MODE_FORCED_SYNTHESIS_BYPASS.md ADDED
@@ -0,0 +1,219 @@
+ # P0 BUG: Simple Mode Ignores Forced Synthesis from HF Inference Failures
+
+ **Status**: Open → **Fix via SPEC_16 (Integration)**
+ **Priority**: P0 (Demo-blocking)
+ **Discovered**: 2025-12-01
+ **Affected Component**: `src/orchestrators/simple.py`
+ **Strategic Fix**: [SPEC_16: Unified Chat Client Architecture](../specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md)
+ **GitHub Issue**: [#113](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/113)
+
+ > **Decision**: Instead of patching Simple Mode, we will **INTEGRATE its capability into Advanced Mode** per SPEC_16.
+ >
+ > **What this means:**
+ > - ✅ Free-tier HuggingFace capability is PRESERVED via `HuggingFaceChatClient`
+ > - ✅ Users without API keys still get full functionality (Advanced Mode + HuggingFace backend)
+ > - 🗑️ Simple Mode's redundant orchestration CODE is retired (not the capability!)
+ > - 🐛 The bug disappears because Advanced Mode's Manager agent handles termination correctly
+
+ ---
+
+ ## Problem Statement
+
+ When the HuggingFace Inference API fails 3 consecutive times, the `HFInferenceJudgeHandler` correctly returns a "forced synthesis" assessment with `sufficient=True, recommendation="synthesize"`. However, **Simple Mode's `_should_synthesize()` method ignores this signal** because of overly strict code-enforced thresholds.
+
+ ### Observed Behavior
+
+ ```
+ ✅ JUDGE_COMPLETE: Assessment: synthesize (confidence: 10%)
+ 🔄 LOOPING: Gathering more evidence...   ← BUG: Should have synthesized!
+ ```
+
+ The orchestrator loops **10 full iterations** despite the judge repeatedly saying "synthesize" after iteration 4.
+
+ ### Expected Behavior
+
+ When `HFInferenceJudgeHandler._create_forced_synthesis_assessment()` returns:
+ - `sufficient=True`
+ - `recommendation="synthesize"`
+
+ the orchestrator should **immediately synthesize**, regardless of score thresholds.
+
+ ---
+
+ ## Root Cause Analysis
+
+ ### The Forced Synthesis Assessment (judges.py:514-549)
+
+ ```python
+ def _create_forced_synthesis_assessment(self, question, evidence):
+     return JudgeAssessment(
+         details=AssessmentDetails(
+             mechanism_score=0,          # ← Problem 1: Score is 0
+             clinical_evidence_score=0,  # ← Problem 2: Score is 0
+             drug_candidates=["AI analysis required..."],
+             key_findings=findings,
+         ),
+         sufficient=True,                # ← Correct: Says sufficient
+         confidence=0.1,                 # ← Problem 3: Too low for emergency
+         recommendation="synthesize",    # ← Correct: Says synthesize
+         ...
+     )
+ ```
+
+ ### The _should_synthesize Logic (simple.py:159-216)
+
+ ```python
+ def _should_synthesize(self, assessment, iteration, max_iterations, evidence_count):
+     combined_score = mechanism_score + clinical_evidence_score  # = 0
+
+     # Priority 1: Judge approved - BUT REQUIRES combined_score >= 10!
+     if assessment.sufficient and assessment.recommendation == "synthesize":
+         if combined_score >= 10:  # ← 0 >= 10 is FALSE!
+             return True, "judge_approved"
+
+     # Priority 2-5: All require scores or drug candidates we don't have
+
+     # Priority 6: Emergency synthesis
+     if is_late_iteration and evidence_count >= 30 and confidence >= 0.5:
+         # ↑ 0.1 >= 0.5 is FALSE!
+         return True, "emergency_synthesis"
+
+     return False, "continue_searching"  # ← Always ends up here!
+ ```
+
+ ### The Bug
+
+ 1. **Priority 1 has the wrong precondition**: It checks `combined_score >= 10` even when the judge explicitly says "synthesize". The score check should be skipped when it's a forced/error-recovery synthesis.
+
+ 2. **Priority 6's confidence threshold is too high**: 0.5 confidence is reasonable for "emergency" synthesis, but forced synthesis from API failures uses 0.1 confidence to indicate low quality; this should still trigger synthesis.
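Plugging the forced-synthesis values from the document into those two gates shows why every branch falls through to "continue_searching":

```python
# Values produced by _create_forced_synthesis_assessment (per the doc)
mechanism_score = 0
clinical_evidence_score = 0
confidence = 0.1

combined_score = mechanism_score + clinical_evidence_score

# Priority 1 gate: the judge said "synthesize", but the score check blocks it
assert not (combined_score >= 10)

# Priority 6 gate: emergency synthesis is also blocked by the confidence floor
assert not (confidence >= 0.5)
```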
+
+ ---
+
+ ## Impact
+
+ - **User sees**: 10 iterations of "Gathering more evidence" with 0% confidence
+ - **Final output**: Partial synthesis with "Max iterations reached"
+ - **Time wasted**: ~2-3 minutes of useless API calls
+ - **UX**: Extremely confusing: the user sees "synthesize" but the system keeps searching
+
+ ---
+
+ ## Proposed Fix
+
+ ### ~~Option A: Patch Simple Mode~~ (REJECTED)
+
+ We considered patching `_should_synthesize()` to respect forced synthesis signals. However, this adds MORE complexity to an already complex system that we plan to delete.
+
+ ### ✅ Strategic Fix: SPEC_16 Unification (APPROVED)
+
+ **Delete Simple Mode entirely and unify on Advanced Mode.**
+
+ See: [SPEC_16: Unified Chat Client Architecture](../specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md)
+
+ The implementation path:
+
+ 1. **Phase 1**: Create `HuggingFaceChatClient` adapter (~150 lines)
+    - Implements `agent_framework.BaseChatClient`
+    - Wraps `huggingface_hub.InferenceClient`
+    - Enables Advanced Mode to work with the free tier
+
+ 2. **Phase 2**: Delete Simple Mode
+    - Remove `src/orchestrators/simple.py` (~778 lines)
+    - Remove `src/tools/search_handler.py` (~219 lines)
+    - Update the factory to always use `AdvancedOrchestrator`
+
+ 3. **Why this works**: Advanced Mode uses Microsoft Agent Framework's built-in termination. When JudgeAgent returns "SUFFICIENT EVIDENCE" (per SPEC_15), the Manager agent immediately delegates to ReportAgent. **No custom `_should_synthesize()` thresholds needed.**
+
+ ### Why Unification > Patching
+
+ | Approach | Lines Changed | Bug Fixed? | Technical Debt |
+ |----------|---------------|------------|----------------|
+ | Patch Simple Mode | +20 lines | Temporarily | Adds complexity |
+ | **SPEC_16 Unification** | **-997 lines** | **Permanently** | **Eliminates 778 lines** |
+
+ ---
+
+ ## Files to DELETE (via SPEC_16)
+
+ | File | Lines | Reason |
+ |------|-------|--------|
+ | `src/orchestrators/simple.py` | 778 | Contains buggy `_should_synthesize()` - entire file deleted |
+ | `src/tools/search_handler.py` | 219 | Manager agent handles orchestration in Advanced Mode |
+
+ ## Files to CREATE (via SPEC_16)
+
+ | File | Lines | Purpose |
+ |------|-------|---------|
+ | `src/clients/__init__.py` | ~10 | Package exports |
+ | `src/clients/factory.py` | ~50 | `get_chat_client()` factory |
+ | `src/clients/huggingface.py` | ~150 | `HuggingFaceChatClient` adapter |
+
+ **Net change: ~997 lines deleted, ~210 lines added = ~787 lines removed**
+
+ ---
+
+ ## Acceptance Criteria (SPEC_16 Implementation)
+
+ - [ ] `HuggingFaceChatClient` implements `agent_framework.BaseChatClient`
+ - [ ] `get_chat_client()` returns the HuggingFace client when no OpenAI key is present
+ - [ ] `AdvancedOrchestrator` works with the HuggingFace backend
+ - [ ] `simple.py` is deleted (778 lines removed)
+ - [ ] Free tier users get Advanced Mode with HuggingFace
+ - [ ] No more "continue_searching" loops when HF fails
+ - [ ] Manager agent respects the "SUFFICIENT EVIDENCE" signal (SPEC_15)
+
+ ---
+
+ ## Test Case (SPEC_16 Verification)
+
+ ```python
+ @pytest.mark.asyncio
+ async def test_unified_architecture_handles_hf_failures():
+     """
+     After SPEC_16: Free tier uses Advanced Mode with HuggingFace backend.
+     When HF fails, the Manager agent should trigger synthesis via ReportAgent.
+
+     This test replaces the old Simple Mode test because:
+     - simple.py is DELETED
+     - Advanced Mode handles termination via Manager agent signals
+     - There are no _should_synthesize() thresholds to bypass
+     """
+     from unittest.mock import patch
+
+     from src.clients.factory import get_chat_client
+
+     # Verify the factory returns a HuggingFace client when no OpenAI key is set
+     with patch("src.utils.config.settings") as mock_settings:
+         mock_settings.has_openai_key = False
+         mock_settings.has_gemini_key = False
+         mock_settings.has_huggingface_key = True
+
+         client = get_chat_client()
+         assert "HuggingFace" in type(client).__name__
+
+     # Verify AdvancedOrchestrator accepts a HuggingFace client
+     # (The actual termination is handled by the Manager agent respecting
+     # "SUFFICIENT EVIDENCE" signals per SPEC_15)
+ ```
+
+ ---
+
+ ## Related Issues & Specs
+
+ | Reference | Type | Relationship |
+ |-----------|------|--------------|
+ | [SPEC_16](../specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md) | Spec | **THE FIX** - Unified architecture eliminates this bug |
+ | [SPEC_15](../specs/SPEC_15_ADVANCED_MODE_PERFORMANCE.md) | Spec | Manager agent termination logic (already implemented) |
+ | [Issue #105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105) | GitHub | Deprecate Simple Mode |
+ | [Issue #109](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/109) | GitHub | Simplify Provider Architecture |
+ | [Issue #110](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/110) | GitHub | Remove Anthropic Support |
+ | PR #71 (SPEC_06) | PR | Added `_should_synthesize()` - now causes this bug |
+ | Commit 5e761eb | Commit | Added `_create_forced_synthesis_assessment()` |
+
+ ---
+
+ ## References
+
+ - `src/orchestrators/simple.py:159-216` - `_should_synthesize()` method
+ - `src/agent_factory/judges.py:514-549` - `_create_forced_synthesis_assessment()`
+ - `src/agent_factory/judges.py:477-512` - `_create_quota_exhausted_assessment()`
docs/specs/SPEC_16_UNIFIED_CHAT_CLIENT_ARCHITECTURE.md CHANGED
@@ -1,279 +1,350 @@
  # SPEC_16: Unified Chat Client Architecture

  **Status**: Proposed
- **Priority**: P1 (Architectural Simplification)
- **Issue**: Updates [#105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105), [#109](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/109)
  **Created**: 2025-12-01
- **Last Verified**: 2025-12-01 (line counts and imports verified against codebase)

  ## Summary

- Eliminate the Simple Mode / Advanced Mode parallel universe by implementing a pluggable `ChatClient` architecture. This moves the system away from a hardcoded `OpenAIChatClient` namespace to a neutral `BaseChatClient` protocol, allowing the multi-agent framework to work with ANY LLM provider through a unified codebase.

- ## Strategic Goals

- 1. **Namespace Neutrality**: Decouple the core orchestrator from the `OpenAI` namespace. The system should speak `ChatClient`, not `OpenAIChatClient`.
- 2. **Full-Stack Provider Chain**: Prioritize providers that offer both LLM and Embeddings (OpenAI, Gemini, HuggingFace+Local) to ensure a unified environment.
- 3. **Fragmentation Reduction**: Remove "LLM-only" providers (Anthropic) that force complex hybrid dependency chains (e.g., Anthropic LLM + OpenAI Embeddings).

- ## Problem Statement

- ### Current Architecture: Two Parallel Universes

  ```text
  User Query
  │
  ├── Has API Key? ──Yes──→ Advanced Mode (488 lines)
  │       └── Microsoft Agent Framework
-             └── OpenAIChatClient (hardcoded dependency)
  │
  └── No API Key? ──────────→ Simple Mode (778 lines)
-         └── While-loop orchestration
          └── Pydantic AI + HuggingFace
  ```

- **Problems:**
- 1. **Double Maintenance**: 1,266 lines across two orchestrator systems.
- 2. **Namespace Lock-in**: The Advanced Orchestrator is tightly coupled to `OpenAIChatClient` (25 references across 5 files).
- 3. **Fragmented Chains**: Using Anthropic requires a "Frankenstein" chain (Anthropic LLM + OpenAI Embeddings).
- 4. **Testing Burden**: Two test suites, two CI paths.
-
- ## Proposed Solution: ChatClientFactory

- ### Architecture After Implementation

  ```text
  User Query
  │
- └──→ Advanced Mode (unified)
        └── Microsoft Agent Framework
-           └── ChatClientFactory (Namespace Neutral):
-               ├── OpenAIChatClient (Paid Tier: Best Performance)
-               ├── GeminiChatClient (Alternative Tier: LLM + Embeddings)
-               └── HuggingFaceChatClient (Free Tier: LLM + Local Embeddings)
  ```

- ### New Files

- ```text
- src/
- ├── clients/
- │   ├── __init__.py
- │   ├── base.py          # Re-export BaseChatClient (the neutral protocol)
- │   ├── factory.py       # ChatClientFactory
- │   ├── huggingface.py   # HuggingFaceChatClient
- │   └── gemini.py        # GeminiChatClient [Future]
- ```

- ### ChatClientFactory Implementation

- ```python
- # src/clients/factory.py
- from agent_framework import BaseChatClient
- from agent_framework.openai import OpenAIChatClient
- from src.utils.config import settings

- def get_chat_client(
-     provider: str | None = None,
-     api_key: str | None = None,
- ) -> BaseChatClient:
-     """
-     Factory for creating chat clients.

-     Auto-detection priority:
-     1. Explicit provider parameter
-     2. OpenAI key (Best Function Calling)
-     3. Gemini key (Best Context/Cost)
-     4. HuggingFace (Free Fallback)

-     Args:
-         provider: Force specific provider ("openai", "gemini", "huggingface")
-         api_key: Override API key for the provider

-     Returns:
-         Configured BaseChatClient instance (Neutral Namespace)
-     """
-     # OpenAI (Standard)
-     if provider == "openai" or (provider is None and settings.has_openai_key):
-         return OpenAIChatClient(
-             model_id=settings.openai_model,
-             api_key=api_key or settings.openai_api_key,
-         )

-     # Gemini (High Performance Alternative) - REQUIRES config.py update first
-     if provider == "gemini" or (provider is None and settings.has_gemini_key):
-         from src.clients.gemini import GeminiChatClient
-         return GeminiChatClient(
-             model_id="gemini-2.0-flash",
-             api_key=api_key or settings.gemini_api_key,
-         )

-     # Free Fallback (HuggingFace)
-     from src.clients.huggingface import HuggingFaceChatClient
-     return HuggingFaceChatClient(
-         model_id="meta-llama/Llama-3.1-70B-Instruct",
-     )
- ```

- ### Changes to Advanced Orchestrator

  ```python
- # src/orchestrators/advanced.py
-
- # BEFORE (hardcoded namespace):
  from agent_framework.openai import OpenAIChatClient

  class AdvancedOrchestrator:
      def __init__(self, ...):
-         self._chat_client = OpenAIChatClient(...)

- # AFTER (neutral namespace):
  from src.clients.factory import get_chat_client

  class AdvancedOrchestrator:
-     def __init__(self, chat_client=None, provider=None, api_key=None, ...):
-         # The orchestrator no longer knows about OpenAI
-         self._chat_client = chat_client or get_chat_client(
-             provider=provider,
-             api_key=api_key,
-         )
  ```
- ---
-
- ## Technical Requirements
-
- ### BaseChatClient Protocol (Verified)
-
- The `agent_framework.BaseChatClient` requires implementing **2 abstract methods**:

  ```python
  class HuggingFaceChatClient(BaseChatClient):
-     """Adapter for HuggingFace Inference API."""

      async def _inner_get_response(
          self,
          messages: list[ChatMessage],
          **kwargs
      ) -> ChatResponse:
-         """Synchronous response generation."""
-         ...

-     async def _inner_get_streaming_response(
-         self,
-         messages: list[ChatMessage],
-         **kwargs
-     ) -> AsyncIterator[ChatResponseUpdate]:
-         """Streaming response generation."""
          ...
  ```

- ### Required Config Changes
-
- **BEFORE implementation**, add to `src/utils/config.py`:

  ```python
- # Settings class additions:
- gemini_api_key: str | None = Field(default=None, description="Google Gemini API key")

- @property
- def has_gemini_key(self) -> bool:
-     """Check if Gemini API key is available."""
-     return bool(self.gemini_api_key)
  ```

  ---

- ## Files to Modify (Complete List)

- ### Category 1: OpenAIChatClient References (25 total)

- | File | Lines | Changes Required |
- |------|-------|------------------|
- | `src/orchestrators/advanced.py` | 31, 70, 95, 101, 122 | Replace with `get_chat_client()` |
- | `src/agents/magentic_agents.py` | 4, 17, 29, 58, 70, 117, 129, 161, 173 | Change type hints to `BaseChatClient` |
- | `src/agents/retrieval_agent.py` | 5, 53, 62 | Change type hints to `BaseChatClient` |
- | `src/agents/code_executor_agent.py` | 7, 43, 52 | Change type hints to `BaseChatClient` |
- | `src/utils/llm_factory.py` | 19, 22, 35, 38, 42 | Merge into `clients/factory.py` |

- ### Category 2: Anthropic References (46 total - Issue #110)

- | File | Refs | Changes Required |
- |------|------|------------------|
- | `src/agent_factory/judges.py` | 10 | Remove Anthropic imports and fallback |
- | `src/utils/config.py` | 10 | Remove `anthropic_api_key`, `anthropic_model`, `has_anthropic_key` |
- | `src/utils/llm_factory.py` | 10 | Remove Anthropic model creation |
- | `src/app.py` | 12 | Remove Anthropic key detection and UI |
- | `src/orchestrators/simple.py` | 2 | Remove Anthropic mentions |
- | `src/agents/hypothesis_agent.py` | 1 | Update comment |

- ### Category 3: Files to Delete (Phase 3)

- | File | Lines | Reason |
- |------|-------|--------|
- | `src/orchestrators/simple.py` | 778 | Replaced by unified Advanced Mode |
- | `src/tools/search_handler.py` | 219 | Manager agent handles orchestration |

- **Total deletion: ~997 lines**
- **Total addition: ~400 lines (new clients)**
- **Net: ~600 fewer lines, single architecture**
  ---

  ## Migration Plan

- ### Phase 1: Neutralize Namespace & Add HuggingFace
- - [ ] Add `gemini_api_key` and `has_gemini_key` to `src/utils/config.py`
  - [ ] Create `src/clients/` package
- - [ ] Implement `HuggingFaceChatClient` adapter (~150 lines)
- - [ ] Implement `ChatClientFactory` (~50 lines)
- - [ ] Refactor `AdvancedOrchestrator` to use `get_chat_client()`
- - [ ] Update type hints in `magentic_agents.py`, `retrieval_agent.py`, `code_executor_agent.py`
- - [ ] Merge `llm_factory.py` functionality into `clients/factory.py`
-
- ### Phase 2: Simplify Provider Chain (Issue #110)
- - [ ] Remove Anthropic from `judges.py` (10 refs)
- - [ ] Remove Anthropic from `config.py` (10 refs)
- - [ ] Remove Anthropic from `llm_factory.py` (10 refs)
- - [ ] Remove Anthropic from `app.py` (12 refs)
- - [ ] Update user-facing strings mentioning Anthropic
- - [ ] (Future) Implement `GeminiChatClient` (~200 lines)
-
- ### Phase 3: Deprecate Simple Mode (Issue #105)
- - [ ] Update `src/orchestrators/factory.py` to use unified `AdvancedOrchestrator`
- - [ ] Delete `src/orchestrators/simple.py` (778 lines)
- - [ ] Delete `src/tools/search_handler.py` (219 lines)
- - [ ] Update tests to only test Advanced Mode
- - [ ] Archive deleted files to `docs/archive/` for reference

  ---

- ## Why This is "Elegant"

- 1. **One System**: We stop maintaining two parallel universes.
- 2. **Dependency Injection**: The specific LLM provider is injected, not hardcoded.
- 3. **Full-Stack Alignment**: We prioritize providers (OpenAI, Gemini) that own the whole vertical (LLM + Embeddings), reducing environment complexity.

  ---

- ## Verification Checklist (For Implementer)

- Before starting implementation, verify:

- - [x] `agent_framework.BaseChatClient` exists (verified: `agent_framework._clients.BaseChatClient`)
  - [x] Abstract methods: `_inner_get_response`, `_inner_get_streaming_response`
- - [x] `agent_framework.ChatResponse`, `ChatResponseUpdate`, `ChatMessage` importable
- - [x] `settings.has_openai_key` exists (line 118)
- - [ ] `settings.has_gemini_key` **MUST BE ADDED** (does not exist)
- - [ ] `settings.gemini_api_key` **MUST BE ADDED** (does not exist)

  ---

  ## References

  - Microsoft Agent Framework: `agent_framework.BaseChatClient`
- - Gemini API: [Embeddings + LLM](https://ai.google.dev/gemini-api/docs/embeddings)
  - HuggingFace Inference: `huggingface_hub.InferenceClient`
- - Issue #105: Deprecate Simple Mode
  - Issue #109: Simplify Provider Architecture
  - Issue #110: Remove Anthropic Provider Support
 
1
  # SPEC_16: Unified Chat Client Architecture
2
 
3
  **Status**: Proposed
4
+ **Priority**: P0 (Fixes Critical Bug #113)
5
+ **Issue**: Updates [#105](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/105), [#109](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/109), **[#113](https://github.com/The-Obstacle-Is-The-Way/DeepBoner/issues/113)** (P0 Bug)
6
  **Created**: 2025-12-01
7
+ **Last Updated**: 2025-12-01
8
+
9
+ ---
10
+
11
+ ## ⚠️ CRITICAL CLARIFICATION: Integration, Not Deletion
12
+
13
+ **This spec INTEGRATES Simple Mode's free-tier capability into Advanced Mode.**
14
+
15
+ | What We're Doing | What We're NOT Doing |
16
+ |------------------|----------------------|
17
+ | βœ… Integrating HuggingFace support into Advanced Mode | ❌ Removing free-tier capability |
18
+ | βœ… Unifying two parallel implementations into one | ❌ Breaking functionality for users without API keys |
19
+ | βœ… Deleting redundant orchestration CODE | ❌ Deleting the CAPABILITY that code provided |
20
+ | βœ… Making Advanced Mode work with ANY provider | ❌ Locking users into paid-only tiers |
21
+
22
+ **After this spec:**
23
+ - Users WITH OpenAI key β†’ Advanced Mode (OpenAI backend) βœ…
24
+ - Users WITHOUT any key β†’ Advanced Mode (HuggingFace backend) βœ… **SAME CAPABILITY, UNIFIED ARCHITECTURE**
25
+
26
+ ---
27
 
28
  ## Summary
29
 
30
+ Unify Simple Mode and Advanced Mode into a **single orchestration system** by:
31
+
32
+ 1. **Renaming the namespace**: `OpenAIChatClient` β†’ `BaseChatClient` (neutral protocol)
33
+ 2. **Creating an adapter**: `HuggingFaceChatClient` implements `BaseChatClient`
34
+ 3. **Retiring parallel code**: Simple Mode's while-loop becomes unnecessary
35
+
36
+ The result: **One codebase, multiple providers, zero parallel universes.**
37
 
38
+ > **πŸ”₯ P0 Bug Fix**: This also resolves Issue #113. Simple Mode's `_should_synthesize()` has a bug that ignores forced synthesis signals. Advanced Mode's Manager agent handles termination correctly. By integrating, the bug disappears.
39
 
40
+ ---
 
 
41
 
42
+ ## The Integration Concept
43
 
44
+ ### Before: Two Parallel Universes (Current)
45
 
46
  ```text
47
  User Query
48
  β”‚
49
  β”œβ”€β”€ Has API Key? ──Yes──→ Advanced Mode (488 lines)
50
  β”‚ └── Microsoft Agent Framework
51
+ β”‚ └── OpenAIChatClient (hardcoded) ◄── THE BOTTLENECK
52
  β”‚
53
  └── No API Key? ──────────→ Simple Mode (778 lines)
54
+ └── While-loop orchestration (SEPARATE CODE)
55
  └── Pydantic AI + HuggingFace
56
  ```
57
 
58
+ **Problem**: Same capability, two implementations, double maintenance, P0 bug in Simple Mode.
 
 
 
 
 
 
59
 
60
+ ### After: Unified Architecture (This Spec)
61
 
62
  ```text
63
  User Query
64
  β”‚
65
+ └──→ Advanced Mode (unified) ◄── ONE SYSTEM FOR ALL USERS
66
  └── Microsoft Agent Framework
67
+ └── get_chat_client() returns: ◄── NAMESPACE NEUTRAL
68
+ β”‚
69
+ β”œβ”€β”€ OpenAIChatClient (if OpenAI key present)
70
+ β”œβ”€β”€ GeminiChatClient (if Gemini key present) [Future]
71
+ └── HuggingFaceChatClient (fallback - FREE TIER) ◄── INTEGRATED!
72
  ```
73
 
74
+ **Result**: Free-tier users get the SAME Advanced Mode experience, just with HuggingFace as the LLM backend.
75
 
76
+ ---
 
 
 
 
 
 
 
 
77
 
78
+ ## What Gets Integrated vs Retired
79
 
80
+ ### βœ… INTEGRATED (Capability Preserved)
 
 
 
 
81
 
82
+ | Simple Mode Component | Integration Target | How |
83
+ |-----------------------|-------------------|-----|
84
+ | HuggingFace LLM calls | `HuggingFaceChatClient` | New adapter (~150 lines) |
85
+ | Free-tier access | `get_chat_client()` factory | Auto-selects HF when no key |
86
+ | Search tools (PubMed, etc.) | Already shared | `src/agents/tools.py` |
87
+ | Evidence models | Already shared | `src/utils/models.py` |
88
 
89
+ ### 🗑️ RETIRED (Redundant Code Removed)
 
 
 
 
90
 
91
+ | Simple Mode Component | Why Retired | Replacement in Advanced Mode |
92
+ |-----------------------|-------------|------------------------------|
93
+ | While-loop orchestration | Redundant | Manager agent orchestrates |
94
+ | `_should_synthesize()` thresholds | **BUGGY** (P0 #113) | Manager agent signals |
95
+ | `SearchHandler` scatter-gather | Redundant | SearchAgent handles this |
96
+ | `JudgeHandler` | Redundant | JudgeAgent handles this |
97
 
98
+ **Key insight**: We're not losing functionality. We're consolidating two implementations of the SAME functionality into one.
 
 
 
 
 
 
 
 
99
 
100
+ ---
 
 
 
 
 
 
101
 
102
+ ## Technical Implementation
 
 
 
 
 
103
 
104
+ ### The Single Change That Enables Unification
105
 
106
 ```python
+# BEFORE (hardcoded to OpenAI):
 from agent_framework.openai import OpenAIChatClient

 class AdvancedOrchestrator:
     def __init__(self, ...):
+        self._chat_client = OpenAIChatClient(...)  # ❌ Only OpenAI works

+# AFTER (neutral - any provider):
+from agent_framework import BaseChatClient
 from src.clients.factory import get_chat_client

 class AdvancedOrchestrator:
+    def __init__(self, ...):
+        self._chat_client: BaseChatClient = get_chat_client()  # ✅ OpenAI, Gemini, OR HuggingFace
 ```
122
 
123
+ ### HuggingFaceChatClient Adapter
 
 
 
 
 
 
124
 
125
 ```python
+# src/clients/huggingface.py
+from agent_framework import BaseChatClient, ChatMessage, ChatResponse
+from huggingface_hub import InferenceClient
+
 class HuggingFaceChatClient(BaseChatClient):
+    """Adapter that makes HuggingFace work with Microsoft Agent Framework."""
+
+    def __init__(self, model_id: str = "meta-llama/Llama-3.1-70B-Instruct"):
+        self._client = InferenceClient(model=model_id)
+        self._model_id = model_id

     async def _inner_get_response(
         self,
         messages: list[ChatMessage],
         **kwargs
     ) -> ChatResponse:
+        """Convert HuggingFace response to Agent Framework format."""
+        # Convert messages to HF format (.value yields "user"; str() would yield "Role.USER")
+        hf_messages = [{"role": m.role.value, "content": m.content} for m in messages]

+        # Call HuggingFace
+        response = self._client.chat_completion(messages=hf_messages)
+
+        # Convert back to Agent Framework format
+        return ChatResponse(
+            content=response.choices[0].message.content,
+            # ... other fields
+        )
+
+    async def _inner_get_streaming_response(self, ...):
+        """Streaming version."""
         ...
 ```
159
 
160
+ ### ChatClientFactory
 
 
161
 
162
 ```python
+# src/clients/factory.py
+from agent_framework import BaseChatClient
+from agent_framework.openai import OpenAIChatClient
+from src.utils.config import settings

+def get_chat_client(provider: str | None = None) -> BaseChatClient:
+    """
+    Factory that returns the appropriate chat client.
+
+    Priority:
+    1. OpenAI (if key available) - Best function calling, GPT-5
+    2. Gemini (if key available) - Good alternative [Future]
+    3. HuggingFace (always available) - FREE TIER FALLBACK
+    """
+    if provider == "openai" or (provider is None and settings.has_openai_key):
+        return OpenAIChatClient(
+            model_id=settings.openai_model,  # gpt-5
+            api_key=settings.openai_api_key,
+        )
+
+    # Future: Gemini support
+    # if settings.has_gemini_key:
+    #     return GeminiChatClient(...)
+
+    # FREE TIER: HuggingFace (no API key required for public models)
+    from src.clients.huggingface import HuggingFaceChatClient
+    return HuggingFaceChatClient(
+        model_id="meta-llama/Llama-3.1-70B-Instruct",
+    )
 ```
193
 
194
  ---
195
 
196
+ ## Why This Fixes P0 Bug #113
197
 
198
+ ### The Bug (Simple Mode)
199
 
200
+```python
+# src/orchestrators/simple.py - THE BUG
+def _should_synthesize(self, assessment, ...):
+    # When HF fails, judge returns: score=0, confidence=0.1, recommendation="synthesize"
+
+    if assessment.sufficient and assessment.recommendation == "synthesize":
+        if combined_score >= 10:  # ❌ 0 >= 10 is FALSE
+            return True
+
+    if confidence >= 0.5:  # ❌ 0.1 >= 0.5 is FALSE
+        return True, "emergency"
+
+    return False, "continue_searching"  # ❌ LOOPS FOREVER
+```
214
 
215
+ ### The Fix (Advanced Mode - Already Works Correctly)
 
 
 
216
 
217
+ ```python
218
+ # Advanced Mode doesn't have this bug because:
219
+ # 1. JudgeAgent says "SUFFICIENT EVIDENCE" in natural language
220
+ # 2. Manager agent understands this and delegates to ReportAgent
221
+ # 3. No hardcoded thresholds to bypass
222
+
223
+ # The Manager agent prompt (src/orchestrators/advanced.py:152):
224
+ """
225
+ When JudgeAgent says "SUFFICIENT EVIDENCE" or "STOP SEARCHING":
226
+ → IMMEDIATELY delegate to ReportAgent for synthesis
227
+ """
228
+ ```
229
+
230
+ **By integrating Simple Mode's capability into Advanced Mode, the bug disappears** because Advanced Mode's termination logic works correctly.
231
 
232
  ---
233
 
234
  ## Migration Plan
235
 
236
+ ### Phase 1: Create HuggingFaceChatClient (Enables Integration)
237
+
238
  - [ ] Create `src/clients/` package
239
+ - [ ] Implement `HuggingFaceChatClient` (~150 lines)
240
+ - Extends `agent_framework.BaseChatClient`
241
+ - Wraps `huggingface_hub.InferenceClient.chat_completion()`
242
+ - Implements required abstract methods
243
+ - [ ] Implement `get_chat_client()` factory (~50 lines)
244
+ - [ ] Add unit tests
245
+
246
+ **Exit Criteria**: `get_chat_client()` returns a working HuggingFace client when no API key is configured.
247
+
248
+ ### Phase 2: Integrate into Advanced Mode (Fixes P0 Bug)
249
+
250
+ - [ ] Update `AdvancedOrchestrator` to use `get_chat_client()`
251
+ - [ ] Update `magentic_agents.py` type hints: `OpenAIChatClient` → `BaseChatClient`
252
+ - [ ] Update `orchestrators/factory.py` to always return `AdvancedOrchestrator`
253
+ - [ ] Update `app.py` to remove mode toggle (everyone gets Advanced Mode)
254
+ - [ ] Archive `simple.py` to `docs/archive/` (for reference)
255
+ - [ ] Migrate Simple Mode tests to Advanced Mode tests
256
+
257
+ **Exit Criteria**: Free-tier users get Advanced Mode with HuggingFace backend. P0 bug gone.
258
+
259
+ ### Phase 3: Cleanup (Optional)
260
+
261
+ - [ ] Remove Anthropic provider code (Issue #110)
262
+ - [ ] Add Gemini support (Issue #109)
263
+ - [ ] Delete archived files after verification period
264
 
265
  ---
266
 
267
+ ## Files Changed
268
+
269
+ ### New Files (~200 lines)
270
+
271
+ | File | Lines | Purpose |
272
+ |------|-------|---------|
273
+ | `src/clients/__init__.py` | ~10 | Package exports |
274
+ | `src/clients/factory.py` | ~50 | `get_chat_client()` |
275
+ | `src/clients/huggingface.py` | ~150 | HuggingFace adapter |
276
+
277
+ ### Modified Files
278
+
279
+ | File | Change |
280
+ |------|--------|
281
+ | `src/orchestrators/advanced.py` | Use `get_chat_client()` instead of `OpenAIChatClient` |
282
+ | `src/orchestrators/factory.py` | Always return `AdvancedOrchestrator` |
283
+ | `src/agents/magentic_agents.py` | Type hints: `OpenAIChatClient` → `BaseChatClient` |
284
+ | `src/app.py` | Remove mode toggle, always use Advanced |
285
 
286
+ ### Archived Files (NOT deleted from git history)
287
+
288
+ | File | Lines | Reason |
289
+ |------|-------|--------|
290
+ | `src/orchestrators/simple.py` | 778 | Functionality INTEGRATED, code retired |
291
+ | `src/tools/search_handler.py` | 219 | Manager agent handles this now |
292
 
293
  ---
294
 
295
+ ## Verification Checklist
296
 
297
+ ### Technical Prerequisites (Verified βœ…)
298
 
299
+ - [x] `agent_framework.BaseChatClient` exists
300
  - [x] Abstract methods: `_inner_get_response`, `_inner_get_streaming_response`
301
+ - [x] `huggingface_hub.InferenceClient.chat_completion()` exists
302
+ - [x] `chat_completion()` has `tools` parameter (verified in 0.36.0)
303
+ - [x] HuggingFace supports Llama 3.1 70B via free inference
304
+ - [x] **Dependency pinned**: `huggingface-hub>=0.24.0` in pyproject.toml (required for stable tool calling)
305
+
306
+ ### Capability Preservation Checklist
307
+
308
+ After implementation, verify:
309
+
310
+ - [ ] User with OpenAI key → Gets Advanced Mode with OpenAI (GPT-5)
+ - [ ] User with NO key → Gets Advanced Mode with HuggingFace (Llama 3.1 70B)
312
+ - [ ] Free-tier search works (PubMed, ClinicalTrials, EuropePMC)
313
+ - [ ] Free-tier synthesis works (LLM generates report)
314
+ - [ ] No more "continue_searching" infinite loops (P0 bug fixed)
315
+
316
+ ---
317
+
318
+ ## Implementation Notes (From Independent Audit)
319
+
320
+ ### Dependency Requirement ✅ FIXED
321
+
322
+ The `huggingface-hub` package must be `>=0.24.0` for stable `chat_completion` with tools support.
323
+
324
+ ```toml
325
+ # pyproject.toml - ALREADY UPDATED
326
+ "huggingface-hub>=0.24.0", # Required for stable chat_completion with tools
327
+ ```
328
+
329
+ ### Llama 3.1 Prompt Considerations ⚠️
330
+
331
+ The Manager agent prompt in `AdvancedOrchestrator._create_task_prompt()` was optimized for GPT-5. When using Llama 3.1 70B via HuggingFace, the prompt **may need tuning** to ensure strict adherence to delegation logic.
332
+
333
+ **Potential issue**: Llama 3.1 may not immediately delegate to ReportAgent when JudgeAgent says "SUFFICIENT EVIDENCE".
334
+
335
+ **Mitigation**: During implementation, test with HuggingFace backend and add reinforcement phrases if needed:
336
+ - "You MUST delegate to ReportAgent when you see SUFFICIENT EVIDENCE"
337
+ - "Do NOT continue searching after Judge approves"
338
+
339
+ This is a **runtime verification** task, not a spec change.
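If the reinforcement phrases do prove necessary, one low-risk shape is to append them only for the HuggingFace backend so the GPT-5 prompt stays untouched. A sketch under that assumption (constant names and the base-prompt text are illustrative, not the actual `_create_task_prompt()` contents):

```python
BASE_PROMPT = "Coordinate SearchAgent, JudgeAgent, and ReportAgent to answer the query."

# Reinforcement lines from the mitigation list above, applied conditionally.
LLAMA_REINFORCEMENT = (
    "\nYou MUST delegate to ReportAgent when you see SUFFICIENT EVIDENCE."
    "\nDo NOT continue searching after Judge approves."
)

def build_manager_prompt(backend: str) -> str:
    """Append delegation reinforcement only for the free-tier Llama backend."""
    if backend == "huggingface":
        return BASE_PROMPT + LLAMA_REINFORCEMENT
    return BASE_PROMPT
```

Gating the extra instructions on the backend keeps the prompt change reversible and easy to A/B during the runtime verification pass.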
340
 
341
  ---
342
 
343
  ## References
344
 
345
  - Microsoft Agent Framework: `agent_framework.BaseChatClient`
 
346
  - HuggingFace Inference: `huggingface_hub.InferenceClient`
347
+ - Issue #105: Deprecate Simple Mode → **Reframe as "Integrate Simple Mode"**
348
  - Issue #109: Simplify Provider Architecture
349
  - Issue #110: Remove Anthropic Provider Support
350
+ - Issue #113: P0 Bug - Simple Mode ignores forced synthesis
pyproject.toml CHANGED
@@ -17,7 +17,7 @@ dependencies = [
17
  "httpx>=0.27", # Async HTTP client (PubMed)
18
  "beautifulsoup4>=4.12", # HTML parsing
19
  "xmltodict>=0.13", # PubMed XML -> dict
20
- "huggingface-hub>=0.20.0", # Hugging Face Inference API
21
  # UI
22
  "gradio[mcp]>=6.0.0", # Chat interface with MCP server support (6.0 required for css in launch())
23
  # Utils
 
17
  "httpx>=0.27", # Async HTTP client (PubMed)
18
  "beautifulsoup4>=4.12", # HTML parsing
19
  "xmltodict>=0.13", # PubMed XML -> dict
20
+ "huggingface-hub>=0.24.0", # Hugging Face Inference API - 0.24.0 required for stable chat_completion with tools
21
  # UI
22
  "gradio[mcp]>=6.0.0", # Chat interface with MCP server support (6.0 required for css in launch())
23
  # Utils
src/agents/code_executor_agent.py CHANGED
@@ -4,10 +4,10 @@ import asyncio
4
 
5
  import structlog
6
  from agent_framework import ChatAgent, ai_function
7
- from agent_framework.openai import OpenAIChatClient
8
 
 
 
9
  from src.tools.code_execution import get_code_executor
10
- from src.utils.config import settings
11
 
12
  logger = structlog.get_logger()
13
 
@@ -40,7 +40,7 @@ async def execute_python_code(code: str) -> str:
40
  return f"Execution failed: {e}"
41
 
42
 
43
- def create_code_executor_agent(chat_client: OpenAIChatClient | None = None) -> ChatAgent:
44
  """Create a code executor agent.
45
 
46
  Args:
@@ -49,10 +49,7 @@ def create_code_executor_agent(chat_client: OpenAIChatClient | None = None) -> C
49
  Returns:
50
  ChatAgent configured for code execution.
51
  """
52
- client = chat_client or OpenAIChatClient(
53
- model_id=settings.openai_model,
54
- api_key=settings.openai_api_key,
55
- )
56
 
57
  return ChatAgent(
58
  name="CodeExecutorAgent",
 
4
 
5
  import structlog
6
  from agent_framework import ChatAgent, ai_function
 
7
 
8
+ from src.clients.base import BaseChatClient
9
+ from src.clients.factory import get_chat_client
10
  from src.tools.code_execution import get_code_executor
 
11
 
12
  logger = structlog.get_logger()
13
 
 
40
  return f"Execution failed: {e}"
41
 
42
 
43
+ def create_code_executor_agent(chat_client: BaseChatClient | None = None) -> ChatAgent:
44
  """Create a code executor agent.
45
 
46
  Args:
 
49
  Returns:
50
  ChatAgent configured for code execution.
51
  """
52
+ client = chat_client or get_chat_client()
 
 
 
53
 
54
  return ChatAgent(
55
  name="CodeExecutorAgent",
src/agents/magentic_agents.py CHANGED
@@ -1,7 +1,6 @@
1
  """Magentic-compatible agents using ChatAgent pattern."""
2
 
3
  from agent_framework import ChatAgent
4
- from agent_framework.openai import OpenAIChatClient
5
 
6
  from src.agents.tools import (
7
  get_bibliography,
@@ -9,12 +8,13 @@ from src.agents.tools import (
9
  search_preprints,
10
  search_pubmed,
11
  )
 
 
12
  from src.config.domain import ResearchDomain, get_domain_config
13
- from src.utils.config import settings
14
 
15
 
16
  def create_search_agent(
17
- chat_client: OpenAIChatClient | None = None,
18
  domain: ResearchDomain | str | None = None,
19
  ) -> ChatAgent:
20
  """Create a search agent with internal LLM and search tools.
@@ -26,10 +26,7 @@ def create_search_agent(
26
  Returns:
27
  ChatAgent configured for biomedical search
28
  """
29
- client = chat_client or OpenAIChatClient(
30
- model_id=settings.openai_model, # Use configured model
31
- api_key=settings.openai_api_key,
32
- )
33
  config = get_domain_config(domain)
34
 
35
  return ChatAgent(
@@ -55,7 +52,7 @@ related to {config.name}.""",
55
 
56
 
57
  def create_judge_agent(
58
- chat_client: OpenAIChatClient | None = None,
59
  domain: ResearchDomain | str | None = None,
60
  ) -> ChatAgent:
61
  """Create a judge agent that evaluates evidence quality.
@@ -67,10 +64,7 @@ def create_judge_agent(
67
  Returns:
68
  ChatAgent configured for evidence assessment
69
  """
70
- client = chat_client or OpenAIChatClient(
71
- model_id=settings.openai_model,
72
- api_key=settings.openai_api_key,
73
- )
74
  config = get_domain_config(domain)
75
 
76
  return ChatAgent(
@@ -114,7 +108,7 @@ Be rigorous but fair. Look for:
114
 
115
 
116
  def create_hypothesis_agent(
117
- chat_client: OpenAIChatClient | None = None,
118
  domain: ResearchDomain | str | None = None,
119
  ) -> ChatAgent:
120
  """Create a hypothesis generation agent.
@@ -126,10 +120,7 @@ def create_hypothesis_agent(
126
  Returns:
127
  ChatAgent configured for hypothesis generation
128
  """
129
- client = chat_client or OpenAIChatClient(
130
- model_id=settings.openai_model,
131
- api_key=settings.openai_api_key,
132
- )
133
  config = get_domain_config(domain)
134
 
135
  return ChatAgent(
@@ -158,7 +149,7 @@ Focus on mechanistic plausibility and existing evidence.""",
158
 
159
 
160
  def create_report_agent(
161
- chat_client: OpenAIChatClient | None = None,
162
  domain: ResearchDomain | str | None = None,
163
  ) -> ChatAgent:
164
  """Create a report synthesis agent.
@@ -170,10 +161,7 @@ def create_report_agent(
170
  Returns:
171
  ChatAgent configured for report generation
172
  """
173
- client = chat_client or OpenAIChatClient(
174
- model_id=settings.openai_model,
175
- api_key=settings.openai_api_key,
176
- )
177
  config = get_domain_config(domain)
178
 
179
  return ChatAgent(
 
1
  """Magentic-compatible agents using ChatAgent pattern."""
2
 
3
  from agent_framework import ChatAgent
 
4
 
5
  from src.agents.tools import (
6
  get_bibliography,
 
8
  search_preprints,
9
  search_pubmed,
10
  )
11
+ from src.clients.base import BaseChatClient
12
+ from src.clients.factory import get_chat_client
13
  from src.config.domain import ResearchDomain, get_domain_config
 
14
 
15
 
16
  def create_search_agent(
17
+ chat_client: BaseChatClient | None = None,
18
  domain: ResearchDomain | str | None = None,
19
  ) -> ChatAgent:
20
  """Create a search agent with internal LLM and search tools.
 
26
  Returns:
27
  ChatAgent configured for biomedical search
28
  """
29
+ client = chat_client or get_chat_client()
 
 
 
30
  config = get_domain_config(domain)
31
 
32
  return ChatAgent(
 
52
 
53
 
54
  def create_judge_agent(
55
+ chat_client: BaseChatClient | None = None,
56
  domain: ResearchDomain | str | None = None,
57
  ) -> ChatAgent:
58
  """Create a judge agent that evaluates evidence quality.
 
64
  Returns:
65
  ChatAgent configured for evidence assessment
66
  """
67
+ client = chat_client or get_chat_client()
 
 
 
68
  config = get_domain_config(domain)
69
 
70
  return ChatAgent(
 
108
 
109
 
110
  def create_hypothesis_agent(
111
+ chat_client: BaseChatClient | None = None,
112
  domain: ResearchDomain | str | None = None,
113
  ) -> ChatAgent:
114
  """Create a hypothesis generation agent.
 
120
  Returns:
121
  ChatAgent configured for hypothesis generation
122
  """
123
+ client = chat_client or get_chat_client()
 
 
 
124
  config = get_domain_config(domain)
125
 
126
  return ChatAgent(
 
149
 
150
 
151
  def create_report_agent(
152
+ chat_client: BaseChatClient | None = None,
153
  domain: ResearchDomain | str | None = None,
154
  ) -> ChatAgent:
155
  """Create a report synthesis agent.
 
161
  Returns:
162
  ChatAgent configured for report generation
163
  """
164
+ client = chat_client or get_chat_client()
 
 
 
165
  config = get_domain_config(domain)
166
 
167
  return ChatAgent(
src/agents/retrieval_agent.py CHANGED
@@ -2,11 +2,11 @@
2
 
3
  import structlog
4
  from agent_framework import ChatAgent, ai_function
5
- from agent_framework.openai import OpenAIChatClient
6
 
 
 
7
  from src.state import get_magentic_state
8
  from src.tools.web_search import WebSearchTool
9
- from src.utils.config import settings
10
 
11
  logger = structlog.get_logger()
12
 
@@ -50,7 +50,7 @@ async def search_web(query: str, max_results: int = 10) -> str:
50
  return "\n".join(output)
51
 
52
 
53
- def create_retrieval_agent(chat_client: OpenAIChatClient | None = None) -> ChatAgent:
54
  """Create a retrieval agent.
55
 
56
  Args:
@@ -59,10 +59,7 @@ def create_retrieval_agent(chat_client: OpenAIChatClient | None = None) -> ChatA
59
  Returns:
60
  ChatAgent configured for retrieval.
61
  """
62
- client = chat_client or OpenAIChatClient(
63
- model_id=settings.openai_model,
64
- api_key=settings.openai_api_key,
65
- )
66
 
67
  return ChatAgent(
68
  name="RetrievalAgent",
 
2
 
3
  import structlog
4
  from agent_framework import ChatAgent, ai_function
 
5
 
6
+ from src.clients.base import BaseChatClient
7
+ from src.clients.factory import get_chat_client
8
  from src.state import get_magentic_state
9
  from src.tools.web_search import WebSearchTool
 
10
 
11
  logger = structlog.get_logger()
12
 
 
50
  return "\n".join(output)
51
 
52
 
53
+ def create_retrieval_agent(chat_client: BaseChatClient | None = None) -> ChatAgent:
54
  """Create a retrieval agent.
55
 
56
  Args:
 
59
  Returns:
60
  ChatAgent configured for retrieval.
61
  """
62
+ client = chat_client or get_chat_client()
 
 
 
63
 
64
  return ChatAgent(
65
  name="RetrievalAgent",
src/app.py CHANGED
@@ -5,25 +5,15 @@ from collections.abc import AsyncGenerator
5
  from typing import Any, Literal
6
 
7
  import gradio as gr
8
- from pydantic_ai.models.anthropic import AnthropicModel
9
- from pydantic_ai.models.openai import OpenAIChatModel
10
- from pydantic_ai.providers.anthropic import AnthropicProvider
11
- from pydantic_ai.providers.openai import OpenAIProvider
12
 
13
- from src.agent_factory.judges import HFInferenceJudgeHandler, JudgeHandler, MockJudgeHandler
14
  from src.config.domain import ResearchDomain
15
  from src.orchestrators import create_orchestrator
16
- from src.tools.clinicaltrials import ClinicalTrialsTool
17
- from src.tools.europepmc import EuropePMCTool
18
- from src.tools.openalex import OpenAlexTool
19
- from src.tools.pubmed import PubMedTool
20
- from src.tools.search_handler import SearchHandler
21
  from src.utils.config import settings
22
  from src.utils.exceptions import ConfigurationError
23
  from src.utils.models import OrchestratorConfig
24
  from src.utils.service_loader import warmup_services
25
 
26
- OrchestratorMode = Literal["simple", "magentic", "advanced", "hierarchical"]
27
 
28
 
29
  # CSS to force dark mode on API key input
@@ -55,16 +45,19 @@ CUSTOM_CSS = """
55
 
56
  def configure_orchestrator(
57
  use_mock: bool = False,
58
- mode: OrchestratorMode = "simple",
59
  user_api_key: str | None = None,
60
  domain: str | ResearchDomain | None = None,
61
  ) -> tuple[Any, str]:
62
  """
63
  Create an orchestrator instance.
64
 
 
 
 
65
  Args:
66
  use_mock: If True, use MockJudgeHandler (no API key needed)
67
- mode: Orchestrator mode ("simple" or "advanced")
68
  user_api_key: Optional user-provided API key (BYOK) - auto-detects provider
69
  domain: Research domain (defaults to "sexual_health")
70
 
@@ -77,58 +70,35 @@ def configure_orchestrator(
77
  max_results_per_tool=10,
78
  )
79
 
80
- # Create search tools
81
- search_handler = SearchHandler(
82
- tools=[PubMedTool(), ClinicalTrialsTool(), EuropePMCTool(), OpenAlexTool()],
83
- timeout=config.search_timeout,
84
- )
85
-
86
- # Create judge (mock, real, or free tier)
87
- judge_handler: JudgeHandler | MockJudgeHandler | HFInferenceJudgeHandler
88
  backend_info = "Unknown"
89
 
90
  # 1. Forced Mock (Unit Testing)
91
  if use_mock:
92
- judge_handler = MockJudgeHandler(domain=domain)
93
  backend_info = "Mock (Testing)"
94
 
95
  # 2. Paid API Key (User provided or Env)
96
  elif user_api_key and user_api_key.strip():
97
- # Auto-detect provider from key prefix
98
- model: AnthropicModel | OpenAIChatModel
99
  if user_api_key.startswith("sk-ant-"):
100
- # Anthropic key
101
- anthropic_provider = AnthropicProvider(api_key=user_api_key)
102
- model = AnthropicModel(settings.anthropic_model, provider=anthropic_provider)
103
  backend_info = "Paid API (Anthropic)"
104
  elif user_api_key.startswith("sk-"):
105
- # OpenAI key
106
- openai_provider = OpenAIProvider(api_key=user_api_key)
107
- model = OpenAIChatModel(settings.openai_model, provider=openai_provider)
108
  backend_info = "Paid API (OpenAI)"
109
  else:
110
  raise ConfigurationError(
111
  "Invalid API key format. Expected sk-... (OpenAI) or sk-ant-... (Anthropic)"
112
  )
113
- judge_handler = JudgeHandler(model=model, domain=domain)
114
 
115
  # 3. Environment API Keys (fallback)
116
  elif settings.has_openai_key:
117
- judge_handler = JudgeHandler(model=None, domain=domain) # Uses env key
118
  backend_info = "Paid API (OpenAI from env)"
119
 
120
  elif settings.has_anthropic_key:
121
- judge_handler = JudgeHandler(model=None, domain=domain) # Uses env key
122
  backend_info = "Paid API (Anthropic from env)"
123
 
124
  # 4. Free Tier (HuggingFace Inference)
125
  else:
126
- judge_handler = HFInferenceJudgeHandler(domain=domain)
127
  backend_info = "Free Tier (Llama 3.1 / Mistral)"
128
 
129
  orchestrator = create_orchestrator(
130
- search_handler=search_handler,
131
- judge_handler=judge_handler,
132
  config=config,
133
  mode=mode,
134
  api_key=user_api_key,
@@ -139,41 +109,31 @@ def configure_orchestrator(
139
 
140
 
141
  def _validate_inputs(
142
- mode: str,
143
  api_key: str | None,
144
  api_key_state: str | None,
145
- ) -> tuple[OrchestratorMode, str | None, bool]:
146
- """Validate inputs and determine mode/key status.
 
 
 
147
 
148
  Returns:
149
- Tuple of (validated_mode, effective_user_key, has_paid_key)
150
  """
151
- # Validate mode
152
- valid_modes: set[str] = {"simple", "magentic", "advanced", "hierarchical"}
153
- mode_validated: OrchestratorMode = mode if mode in valid_modes else "simple" # type: ignore[assignment]
154
-
155
  # Determine effective key
156
  user_api_key = (api_key or api_key_state or "").strip() or None
157
 
158
  # Check available keys
159
  has_openai = settings.has_openai_key
160
  has_anthropic = settings.has_anthropic_key
161
- is_openai_user_key = (
162
- user_api_key and user_api_key.startswith("sk-") and not user_api_key.startswith("sk-ant-")
163
- )
164
  has_paid_key = has_openai or has_anthropic or bool(user_api_key)
165
 
166
- # Fallback logic for Advanced mode
167
- if mode_validated == "advanced" and not (has_openai or is_openai_user_key):
168
- mode_validated = "simple"
169
-
170
- return mode_validated, user_api_key, has_paid_key
171
 
172
 
173
  async def research_agent(
174
  message: str,
175
  history: list[dict[str, Any]],
176
- mode: str = "simple", # Gradio passes strings; validated below
177
  domain: str = "sexual_health",
178
  api_key: str = "",
179
  api_key_state: str = "",
@@ -182,10 +142,12 @@ async def research_agent(
182
  """
183
  Gradio chat function that runs the research agent.
184
 
 
 
 
185
  Args:
186
  message: User's research question
187
  history: Chat history (Gradio format)
188
- mode: Orchestrator mode ("simple" or "advanced")
189
  domain: Research domain
190
  api_key: Optional user-provided API key (BYOK - auto-detects provider)
191
  api_key_state: Persistent API key state (survives example clicks)
@@ -201,15 +163,8 @@ async def research_agent(
201
  # BUG FIX: Handle None values from Gradio example caching
202
  domain_str = domain or "sexual_health"
203
 
204
- # Validate inputs using helper to reduce complexity
205
- mode_validated, user_api_key, has_paid_key = _validate_inputs(mode, api_key, api_key_state)
206
-
207
- # Inform user about fallback/tier status
208
- if mode == "advanced" and mode_validated == "simple":
209
- yield (
210
- "⚠️ **Warning**: Advanced mode currently requires OpenAI API key. "
211
- "Anthropic keys only work in Simple mode. Falling back to Simple.\n\n"
212
- )
213
 
214
  if not has_paid_key:
215
  yield (
@@ -223,9 +178,10 @@ async def research_agent(
223
 
224
  try:
225
  # use_mock=False - let configure_orchestrator decide based on available keys
 
226
  orchestrator, backend_name = configure_orchestrator(
227
  use_mock=False,
228
- mode=mode_validated,
229
  user_api_key=user_api_key,
230
  domain=domain_str,
231
  )
@@ -297,9 +253,7 @@ def create_demo() -> tuple[gr.ChatInterface, gr.Accordion]:
297
  Returns:
298
  Configured Gradio Blocks interface with MCP server enabled
299
  """
300
- additional_inputs_accordion = gr.Accordion(
301
- label="βš™οΈ Mode & API Key (Free tier works!)", open=False
302
- )
303
 
304
  # BUG FIX: Add gr.State for API key persistence across example clicks
305
  api_key_state = gr.State("")
@@ -327,23 +281,22 @@ def create_demo() -> tuple[gr.ChatInterface, gr.Accordion]:
327
  title="πŸ† DeepBoner",
328
  description=description,
329
  examples=[
 
 
330
  [
331
  "What drugs improve female libido post-menopause?",
332
- "simple",
333
  "sexual_health",
334
  None,
335
  None,
336
  ],
337
  [
338
  "Testosterone therapy for hypoactive sexual desire disorder?",
339
- "simple",
340
  "sexual_health",
341
  None,
342
  None,
343
  ],
344
  [
345
  "Clinical trials for PDE5 inhibitors alternatives?",
346
- "advanced",
347
  "sexual_health",
348
  None,
349
  None,
@@ -351,12 +304,8 @@ def create_demo() -> tuple[gr.ChatInterface, gr.Accordion]:
351
  ],
352
  additional_inputs_accordion=additional_inputs_accordion,
353
  additional_inputs=[
354
- gr.Radio(
355
- choices=["simple", "advanced"],
356
- value="simple",
357
- label="Orchestrator Mode",
358
- info="⚡ Simple: Free/Any | 🔬 Advanced: OpenAI (Deep Research)",
359
- ),
360
  gr.Dropdown(
361
  choices=[d.value for d in ResearchDomain],
362
  value="sexual_health",
 
5
  from typing import Any, Literal
6
 
7
  import gradio as gr
 
 
 
 
8
 
 
9
  from src.config.domain import ResearchDomain
10
  from src.orchestrators import create_orchestrator
 
 
 
 
 
11
  from src.utils.config import settings
12
  from src.utils.exceptions import ConfigurationError
13
  from src.utils.models import OrchestratorConfig
14
  from src.utils.service_loader import warmup_services
15
 
16
+ OrchestratorMode = Literal["advanced", "hierarchical"] # Unified Architecture (SPEC-16)
17
 
18
 
19
  # CSS to force dark mode on API key input
 
45
 
46
  def configure_orchestrator(
47
  use_mock: bool = False,
48
+ mode: OrchestratorMode = "advanced",
49
  user_api_key: str | None = None,
50
  domain: str | ResearchDomain | None = None,
51
  ) -> tuple[Any, str]:
52
  """
53
  Create an orchestrator instance.
54
 
55
+ Unified Architecture (SPEC-16): All users get Advanced Mode.
56
+ Backend auto-selects: OpenAI (if key) → HuggingFace (free fallback).
57
+
58
  Args:
59
  use_mock: If True, use MockJudgeHandler (no API key needed)
60
+ mode: Orchestrator mode (default "advanced", "hierarchical" for sub-iteration)
61
  user_api_key: Optional user-provided API key (BYOK) - auto-detects provider
62
  domain: Research domain (defaults to "sexual_health")
63
 
 
70
  max_results_per_tool=10,
71
  )
72
 
 
 
 
 
 
 
 
 
73
  backend_info = "Unknown"
74
 
75
  # 1. Forced Mock (Unit Testing)
76
  if use_mock:
 
77
  backend_info = "Mock (Testing)"
78
 
79
  # 2. Paid API Key (User provided or Env)
80
  elif user_api_key and user_api_key.strip():
 
 
81
  if user_api_key.startswith("sk-ant-"):
 
 
 
82
  backend_info = "Paid API (Anthropic)"
83
  elif user_api_key.startswith("sk-"):
 
 
 
84
  backend_info = "Paid API (OpenAI)"
85
  else:
86
  raise ConfigurationError(
87
  "Invalid API key format. Expected sk-... (OpenAI) or sk-ant-... (Anthropic)"
88
  )
 
89
 
90
  # 3. Environment API Keys (fallback)
91
  elif settings.has_openai_key:
 
92
  backend_info = "Paid API (OpenAI from env)"
93
 
94
  elif settings.has_anthropic_key:
 
95
  backend_info = "Paid API (Anthropic from env)"
96
 
97
  # 4. Free Tier (HuggingFace Inference)
98
  else:
 
99
  backend_info = "Free Tier (Llama 3.1 / Mistral)"
100
 
101
  orchestrator = create_orchestrator(
 
 
102
  config=config,
103
  mode=mode,
104
  api_key=user_api_key,
 
109
 
110
 
111
  def _validate_inputs(
 
112
  api_key: str | None,
113
  api_key_state: str | None,
114
+ ) -> tuple[str | None, bool]:
115
+ """Validate inputs and determine key status.
116
+
117
+ Unified Architecture (SPEC-16): Mode is always "advanced".
118
+ Backend auto-selects based on available API keys.
119
 
120
  Returns:
121
+ Tuple of (effective_user_key, has_paid_key)
122
  """
 
 
 
 
123
  # Determine effective key
124
  user_api_key = (api_key or api_key_state or "").strip() or None
125
 
126
  # Check available keys
127
  has_openai = settings.has_openai_key
128
  has_anthropic = settings.has_anthropic_key
 
 
 
129
  has_paid_key = has_openai or has_anthropic or bool(user_api_key)
130
 
131
+ return user_api_key, has_paid_key
 
 
 
 
132
 
133
 
134
  async def research_agent(
135
  message: str,
136
  history: list[dict[str, Any]],
 
137
  domain: str = "sexual_health",
138
  api_key: str = "",
139
  api_key_state: str = "",
 
142
  """
143
  Gradio chat function that runs the research agent.
144
 
145
+ Unified Architecture (SPEC-16): Always uses Advanced Mode.
146
+ Backend auto-selects: OpenAI (if key) → HuggingFace (free fallback).
147
+
148
  Args:
149
  message: User's research question
150
  history: Chat history (Gradio format)
 
151
  domain: Research domain
152
  api_key: Optional user-provided API key (BYOK - auto-detects provider)
153
  api_key_state: Persistent API key state (survives example clicks)
 
163
  # BUG FIX: Handle None values from Gradio example caching
164
  domain_str = domain or "sexual_health"
165
 
166
+ # Validate inputs (SPEC-16: mode is always "advanced")
167
+ user_api_key, has_paid_key = _validate_inputs(api_key, api_key_state)
 
 
 
 
 
 
 
168
 
169
  if not has_paid_key:
170
  yield (
 
178
 
179
  try:
180
  # use_mock=False - let configure_orchestrator decide based on available keys
181
+ # SPEC-16: mode is always "advanced" (unified architecture)
182
  orchestrator, backend_name = configure_orchestrator(
183
  use_mock=False,
184
+ mode="advanced",
185
  user_api_key=user_api_key,
186
  domain=domain_str,
187
  )
 
253
  Returns:
254
  Configured Gradio Blocks interface with MCP server enabled
255
  """
256
+ additional_inputs_accordion = gr.Accordion(label="βš™οΈ API Key (Free tier works!)", open=False)
 
 
257
 
258
  # BUG FIX: Add gr.State for API key persistence across example clicks
259
  api_key_state = gr.State("")
 
281
  title="πŸ† DeepBoner",
282
  description=description,
283
  examples=[
284
+ # SPEC-16: Mode is always "advanced" (unified architecture)
285
+ # Examples now only need: [question, domain, api_key, api_key_state]
286
  [
287
  "What drugs improve female libido post-menopause?",
 
288
  "sexual_health",
289
  None,
290
  None,
291
  ],
292
  [
293
  "Testosterone therapy for hypoactive sexual desire disorder?",
 
294
  "sexual_health",
295
  None,
296
  None,
297
  ],
298
  [
299
  "Clinical trials for PDE5 inhibitors alternatives?",
 
300
  "sexual_health",
301
  None,
302
  None,
 
304
  ],
305
  additional_inputs_accordion=additional_inputs_accordion,
306
  additional_inputs=[
307
+ # SPEC-16: Mode toggle removed - everyone gets Advanced Mode
308
+ # Backend auto-selects: OpenAI (if key) β†’ HuggingFace (free fallback)
 
 
 
 
309
  gr.Dropdown(
310
  choices=[d.value for d in ResearchDomain],
311
  value="sexual_health",
src/clients/__init__.py ADDED
File without changes
src/clients/base.py ADDED
@@ -0,0 +1,19 @@
+ """Base classes for Chat Client implementations.
+
+ This module re-exports the BaseChatClient and related types from the core
+ agent_framework package to provide a single point of import for the project.
+ """
+
+ from agent_framework import (
+     BaseChatClient,
+     ChatMessage,
+     ChatResponse,
+     ChatResponseUpdate,
+ )
+
+ __all__ = [
+     "BaseChatClient",
+     "ChatMessage",
+     "ChatResponse",
+     "ChatResponseUpdate",
+ ]
src/clients/factory.py ADDED
@@ -0,0 +1,76 @@
+ """Chat Client Factory for unified provider selection."""
+
+ from typing import Any
+
+ import structlog
+ from agent_framework import BaseChatClient
+ from agent_framework.openai import OpenAIChatClient
+
+ from src.clients.huggingface import HuggingFaceChatClient
+ from src.utils.config import settings
+
+ logger = structlog.get_logger()
+
+
+ def get_chat_client(
+     provider: str | None = None,
+     api_key: str | None = None,
+     model_id: str | None = None,
+     **kwargs: Any,
+ ) -> BaseChatClient:
+     """
+     Factory for creating chat clients.
+
+     Auto-detection priority:
+     1. Explicit provider parameter
+     2. OpenAI key (Best Function Calling)
+     3. Gemini key (Best Context/Cost)
+     4. HuggingFace (Free Fallback)
+
+     Args:
+         provider: Force specific provider ("openai", "gemini", "huggingface")
+         api_key: Override API key for the provider
+         model_id: Override default model ID
+         **kwargs: Additional arguments for the client
+
+     Returns:
+         Configured BaseChatClient instance (Namespace Neutral)
+
+     Raises:
+         ValueError: If an unsupported provider is explicitly requested
+         NotImplementedError: If Gemini is explicitly requested (not yet implemented)
+     """
+     # Normalize provider to lowercase for case-insensitive matching
+     normalized = provider.lower() if provider is not None else None
+
+     # Validate explicit provider requests early
+     valid_providers = (None, "openai", "gemini", "huggingface")
+     if normalized not in valid_providers:
+         raise ValueError(f"Unsupported provider: {provider!r}")
+
+     # 1. OpenAI (Standard / Paid Tier)
+     if normalized == "openai" or (normalized is None and settings.has_openai_key):
+         logger.info("Using OpenAI Chat Client")
+         return OpenAIChatClient(
+             model_id=model_id or settings.openai_model,
+             api_key=api_key or settings.openai_api_key,
+             **kwargs,
+         )
+
+     # 2. Gemini (High Performance / Alternative)
+     if normalized == "gemini":
+         # Explicit request for Gemini - fail loudly
+         raise NotImplementedError("Gemini client not yet implemented (Planned Phase 4)")
+
+     if normalized is None and settings.has_gemini_key:
+         # Implicit (has key but not explicit) - log warning and fall through
+         logger.warning("Gemini key detected but client not yet implemented; falling back")
+
+     # 3. HuggingFace (Free Fallback)
+     # This is the default if no other keys are present
+     logger.info("Using HuggingFace Chat Client (Free Tier)")
+     return HuggingFaceChatClient(
+         model_id=model_id or settings.huggingface_model,
+         api_key=api_key or settings.hf_token,
+         **kwargs,
+     )
src/clients/huggingface.py ADDED
@@ -0,0 +1,191 @@
+ """HuggingFace Chat Client adapter for Microsoft Agent Framework.
+
+ This client enables the use of HuggingFace Inference API (including the free tier)
+ as a backend for the agent framework, allowing "Advanced Mode" to work without
+ an OpenAI API key.
+ """
+
+ import asyncio
+ from collections.abc import AsyncIterable, MutableSequence
+ from functools import partial
+ from typing import Any, cast
+
+ import structlog
+ from agent_framework import (
+     BaseChatClient,
+     ChatMessage,
+     ChatOptions,
+     ChatResponse,
+     ChatResponseUpdate,
+ )
+ from huggingface_hub import InferenceClient
+
+ from src.utils.config import settings
+
+ logger = structlog.get_logger()
+
+
+ class HuggingFaceChatClient(BaseChatClient):  # type: ignore[misc]
+     """Adapter for HuggingFace Inference API."""
+
+     def __init__(
+         self,
+         model_id: str | None = None,
+         api_key: str | None = None,
+         **kwargs: Any,
+     ) -> None:
+         """Initialize the HuggingFace chat client.
+
+         Args:
+             model_id: The HuggingFace model ID (default: configured value or Llama-3.1-70B).
+             api_key: HF_TOKEN (optional, defaults to env var).
+             **kwargs: Additional arguments passed to BaseChatClient.
+         """
+         super().__init__(**kwargs)
+         self.model_id = (
+             model_id or settings.huggingface_model or "meta-llama/Llama-3.1-70B-Instruct"
+         )
+         self.api_key = api_key or settings.hf_token
+
+         # Initialize the HF Inference Client
+         # timeout=60 to prevent premature timeouts on long reasonings
+         self._client = InferenceClient(
+             model=self.model_id,
+             token=self.api_key,
+             timeout=60,
+         )
+         logger.info("Initialized HuggingFaceChatClient", model=self.model_id)
+
+     def _convert_messages(self, messages: MutableSequence[ChatMessage]) -> list[dict[str, Any]]:
+         """Convert framework messages to HuggingFace format."""
+         hf_messages: list[dict[str, Any]] = []
+         for msg in messages:
+             # Basic conversion - extend as needed for multi-modal
+             content = msg.text or ""
+             # msg.role can be string or enum - extract .value for enums
+             # str(Role.USER) -> "Role.USER" (wrong), Role.USER.value -> "user" (correct)
+             if hasattr(msg.role, "value"):
+                 role_str = str(msg.role.value)
+             else:
+                 role_str = str(msg.role)
+             hf_messages.append({"role": role_str, "content": content})
+         return hf_messages
+
+     async def _inner_get_response(
+         self,
+         *,
+         messages: MutableSequence[ChatMessage],
+         chat_options: ChatOptions,
+         **kwargs: Any,
+     ) -> ChatResponse:
+         """Synchronous response generation using chat_completion."""
+         hf_messages = self._convert_messages(messages)
+
+         # Extract tool configuration
+         tools = chat_options.tools if chat_options.tools else None
+         # HF expects 'tool_choice' to be 'auto', 'none', or specific tool
+         # Framework uses ToolMode enum or dict
+         hf_tool_choice: str | None = None
+         if chat_options.tool_choice is not None:
+             tool_choice_str = str(chat_options.tool_choice)
+             if "AUTO" in tool_choice_str:
+                 hf_tool_choice = "auto"
+             # For NONE or other, leave as None
+
+         try:
+             # Use explicit None checks - 'or' treats 0/0.0 as falsy
+             # temperature=0.0 is valid (deterministic output)
+             max_tokens = chat_options.max_tokens if chat_options.max_tokens is not None else 2048
+             temperature = chat_options.temperature if chat_options.temperature is not None else 0.7
+
+             # Use partial to create a callable with keyword args for to_thread
+             call_fn = partial(
+                 self._client.chat_completion,
+                 messages=hf_messages,
+                 tools=tools,
+                 tool_choice=hf_tool_choice,
+                 max_tokens=max_tokens,
+                 temperature=temperature,
+                 stream=False,
+             )
+
+             response = await asyncio.to_thread(call_fn)
+
+             # Parse response
+             # HF returns a ChatCompletionOutput
+             choices = response.choices
+             if not choices:
+                 return ChatResponse(messages=[], response_id="error-no-choices")
+
+             choice = choices[0]
+             message_content = choice.message.content or ""
+
+             # Construct response message with proper kwargs
+             response_msg = ChatMessage(
+                 role=cast(Any, choice.message.role),
+                 text=message_content,
+             )
+
+             return ChatResponse(
+                 messages=[response_msg],
+                 response_id=response.id or "hf-response",
+             )
+
+         except Exception as e:
+             logger.error("HuggingFace API error", error=str(e))
+             raise
+
+     async def _inner_get_streaming_response(
+         self,
+         *,
+         messages: MutableSequence[ChatMessage],
+         chat_options: ChatOptions,
+         **kwargs: Any,
+     ) -> AsyncIterable[ChatResponseUpdate]:
+         """Streaming response generation."""
+         hf_messages = self._convert_messages(messages)
+
+         tools = chat_options.tools if chat_options.tools else None
+         hf_tool_choice: str | None = None
+         if chat_options.tool_choice is not None:
+             if "AUTO" in str(chat_options.tool_choice):
+                 hf_tool_choice = "auto"
+
+         try:
+             # Use explicit None checks - 'or' treats 0/0.0 as falsy
+             # temperature=0.0 is valid (deterministic output)
+             max_tokens = chat_options.max_tokens if chat_options.max_tokens is not None else 2048
+             temperature = chat_options.temperature if chat_options.temperature is not None else 0.7
+
+             # Use partial for streaming call
+             call_fn = partial(
+                 self._client.chat_completion,
+                 messages=hf_messages,
+                 tools=tools,
+                 tool_choice=hf_tool_choice,
+                 max_tokens=max_tokens,
+                 temperature=temperature,
+                 stream=True,
+             )
+
+             stream = await asyncio.to_thread(call_fn)
+
+             for chunk in stream:
+                 # Chunk is ChatCompletionStreamOutput
+                 if not chunk.choices:
+                     continue
+                 choice = chunk.choices[0]
+                 delta = choice.delta
+
+                 # Convert to ChatResponseUpdate
+                 yield ChatResponseUpdate(
+                     role=cast(Any, delta.role) if delta.role else None,
+                     content=delta.content,
+                 )
+
+                 # Yield control to event loop
+                 await asyncio.sleep(0)
+
+         except Exception as e:
+             logger.error("HuggingFace Streaming error", error=str(e))
+             raise
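The two review fixes in this client are easy to verify in isolation: `str()` on an enum member yields the qualified member name rather than the wire value, and `or`-based defaults clobber valid falsy settings such as `temperature=0.0`:

```python
from enum import Enum

class Role(Enum):
    USER = "user"

# str() gives the member's qualified name, not its payload
assert str(Role.USER) == "Role.USER"   # wrong value to send over the wire
assert Role.USER.value == "user"       # correct wire value

# 'or' swallows falsy-but-valid overrides such as temperature=0.0
temperature = 0.0
assert (temperature or 0.7) == 0.7                               # bug: default wins
assert (temperature if temperature is not None else 0.7) == 0.0  # fix: explicit None check
```

This is why `_convert_messages` uses `.value` when available and why both request paths use `is not None` checks for `max_tokens` and `temperature`.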
src/orchestrators/__init__.py CHANGED
@@ -1,27 +1,32 @@
- """Orchestrators package - provides different orchestration strategies.
+ """Orchestrators package - Unified Architecture (SPEC-16).
 
- This package implements the Strategy Pattern, allowing the application
- to switch between different orchestration approaches:
+ This package implements the Strategy Pattern with a unified orchestration approach:
 
- - Simple: Basic search-judge loop using pydantic-ai (free tier compatible)
- - Advanced: Multi-agent coordination using Microsoft Agent Framework
+ - Advanced: Multi-agent coordination using Microsoft Agent Framework (DEFAULT)
+ - Backend auto-selects: OpenAI (if key) → HuggingFace (free fallback)
  - Hierarchical: Sub-iteration middleware with fine-grained control
 
+ Unified Architecture (SPEC-16):
+     All users get Advanced Mode. The chat client factory auto-selects the backend:
+     - With OpenAI key → OpenAIChatClient (GPT-5)
+     - Without key → HuggingFaceChatClient (Llama 3.1 70B, free tier)
+
  Usage:
-     from src.orchestrators import create_orchestrator, Orchestrator
+     from src.orchestrators import create_orchestrator
 
-     # Auto-detect mode based on available API keys
-     orchestrator = create_orchestrator(search_handler, judge_handler)
+     # Creates AdvancedOrchestrator with auto-selected backend
+     orchestrator = create_orchestrator()
 
-     # Or explicitly specify mode
-     orchestrator = create_orchestrator(mode="advanced", api_key="sk-...")
+     # Or with explicit API key
+     orchestrator = create_orchestrator(api_key="sk-...")
 
  Protocols:
      from src.orchestrators import SearchHandlerProtocol, JudgeHandlerProtocol, OrchestratorProtocol
 
  Design Patterns Applied:
      - Factory Pattern: create_orchestrator() creates appropriate orchestrator
-     - Strategy Pattern: Different orchestrators implement different strategies
+     - Adapter Pattern: HuggingFaceChatClient adapts HF API to BaseChatClient
+     - Strategy Pattern: Different backends (OpenAI, HuggingFace) via ChatClientFactory
      - Facade Pattern: This __init__.py provides a clean public API
  """
 
@@ -40,9 +45,6 @@ from src.orchestrators.base (
  # Factory (creational pattern)
  from src.orchestrators.factory import create_orchestrator
 
- # Orchestrators (Strategy Pattern implementations)
- from src.orchestrators.simple import Orchestrator
-
  if TYPE_CHECKING:
      from src.orchestrators.advanced import AdvancedOrchestrator
      from src.orchestrators.hierarchical import HierarchicalOrchestrator
 
@@ -101,7 +103,6 @@ def get_magentic_orchestrator() -> type[AdvancedOrchestrator]:
  __all__ = [
      "JudgeHandlerProtocol",
-     "Orchestrator",
      "OrchestratorProtocol",
      "SearchHandlerProtocol",
      "create_orchestrator",
src/orchestrators/advanced.py CHANGED
@@ -28,7 +28,6 @@ from agent_framework (
  MagenticOrchestratorMessageEvent,
  WorkflowOutputEvent,
  )
- from agent_framework.openai import OpenAIChatClient
 
  from src.agents.magentic_agents import (
      create_hypothesis_agent,
 
@@ -37,10 +36,11 @@ from src.agents.magentic_agents (
      create_search_agent,
  )
  from src.agents.state import init_magentic_state
+ from src.clients.base import BaseChatClient
+ from src.clients.factory import get_chat_client
  from src.config.domain import ResearchDomain, get_domain_config
  from src.orchestrators.base import OrchestratorProtocol
  from src.utils.config import settings
- from src.utils.llm_factory import check_magentic_requirements
  from src.utils.models import AgentEvent
  from src.utils.service_loader import get_embedding_service_if_available
 
@@ -69,45 +69,50 @@ class AdvancedOrchestrator(OrchestratorProtocol):
  def __init__(
      self,
-     max_rounds: int | None = None,
-     chat_client: OpenAIChatClient | None = None,
+     max_rounds: int = 5,
+     chat_client: BaseChatClient | None = None,
+     provider: str | None = None,
      api_key: str | None = None,
-     timeout_seconds: float = 300.0,
      domain: ResearchDomain | str | None = None,
+     timeout_seconds: float | None = None,
  ) -> None:
-     """Initialize orchestrator.
+     """Initialize the advanced orchestrator.
 
      Args:
-         max_rounds: Maximum coordination rounds
-         chat_client: Optional shared chat client for agents
-         api_key: Optional OpenAI API key (for BYOK)
-         timeout_seconds: Maximum workflow duration (default: 5 minutes)
-         domain: Research domain for customization
+         max_rounds: Maximum number of coordination rounds.
+         chat_client: Optional pre-configured chat client.
+         provider: Optional provider override ("openai", "huggingface").
+         api_key: Optional API key override.
+         domain: Research domain for customization.
+         timeout_seconds: Optional timeout override (defaults to settings).
      """
-     # Validate requirements only if no key provided
-     if not chat_client and not api_key:
-         check_magentic_requirements()
-
-     # Use pydantic-validated settings (fails fast on invalid config)
-     self._max_rounds = max_rounds if max_rounds is not None else settings.advanced_max_rounds
-     self._timeout_seconds = (
-         timeout_seconds if timeout_seconds != 300.0 else settings.advanced_timeout
-     )
-     self.domain = domain
-     self.domain_config = get_domain_config(domain)
-     self._chat_client: OpenAIChatClient | None
-
-     if chat_client:
-         self._chat_client = chat_client
-     elif api_key:
-         # Create client with user provided key
-         self._chat_client = OpenAIChatClient(
-             model_id=settings.openai_model,
-             api_key=api_key,
-         )
-     else:
-         # Fallback to env vars (will fail later if requirements check wasn't run/passed)
-         self._chat_client = None
+     self._max_rounds = max_rounds
+     self.domain = domain or ResearchDomain.SEXUAL_HEALTH
+     self.domain_config = get_domain_config(self.domain)
+     self._timeout_seconds = timeout_seconds or settings.advanced_timeout
+
+     self.logger = logger.bind(orchestrator="advanced")
+
+     # Use provided client or create one via factory
+     self._chat_client = chat_client or get_chat_client(
+         provider=provider,
+         api_key=api_key,
+     )
+
+     # Event stream for UI updates
+     self._events: list[AgentEvent] = []
+
+     # Initialize services lazily
+     self._embedding_service: EmbeddingServiceProtocol | None = None
+
+     # Track execution statistics
+     self.stats = {
+         "rounds": 0,
+         "searches": 0,
+         "hypotheses": 0,
+         "reports": 0,
+         "errors": 0,
+     }
 
  def _init_embedding_service(self) -> "EmbeddingServiceProtocol | None":
      """Initialize embedding service if available."""
 
@@ -122,10 +127,7 @@ class AdvancedOrchestrator(OrchestratorProtocol):
  report_agent = create_report_agent(self._chat_client, domain=self.domain)
 
  # Manager chat client (orchestrates the agents)
- manager_client = self._chat_client or OpenAIChatClient(
-     model_id=settings.openai_model,  # Use configured model
-     api_key=settings.openai_api_key,
- )
+ manager_client = self._chat_client
 
  return (
      MagenticBuilder()
src/orchestrators/factory.py CHANGED
@@ -19,7 +19,6 @@ from src.orchestrators.base (
  OrchestratorProtocol,
  SearchHandlerProtocol,
  )
- from src.orchestrators.simple import Orchestrator
  from src.utils.config import settings
  from src.utils.models import OrchestratorConfig
 
@@ -30,27 +29,15 @@
  def _get_advanced_orchestrator_class() -> type["AdvancedOrchestrator"]:
-     """Import AdvancedOrchestrator lazily to avoid hard dependency.
-
-     This allows the simple mode to work without agent-framework-core installed.
-
-     Returns:
-         The AdvancedOrchestrator class
-
-     Raises:
-         ValueError: If agent-framework-core is not installed
-     """
+     """Import AdvancedOrchestrator lazily."""
      try:
          from src.orchestrators.advanced import AdvancedOrchestrator
 
          return AdvancedOrchestrator
      except ImportError as e:
          logger.error("Failed to import AdvancedOrchestrator", error=str(e))
-         raise ValueError(
-             "Advanced mode requires agent-framework-core. "
-             "Install with: pip install agent-framework-core. "
-             "Or use mode='simple' instead."
-         ) from e
+         # With unified architecture, we should never fail here unless installation is broken
+         raise
 
 
  def create_orchestrator(
 
@@ -64,80 +51,40 @@ def create_orchestrator(
  """
  Create an orchestrator instance.
 
- This factory automatically selects the appropriate orchestrator based on:
- 1. Explicit mode parameter (if provided)
- 2. Available API keys (auto-detection)
-
- Args:
-     search_handler: The search handler (required for simple mode)
-     judge_handler: The judge handler (required for simple mode)
-     config: Optional configuration (max_iterations, timeouts, etc.)
-         Note: This parameter is only used by simple and hierarchical modes.
-         Advanced mode uses settings.advanced_max_rounds instead.
-     mode: "simple", "magentic", "advanced", or "hierarchical"
-         Note: "magentic" is an alias for "advanced" (kept for backwards compatibility)
-     api_key: Optional API key for advanced mode (OpenAI)
-     domain: Research domain for customization (default: sexual_health)
-
- Returns:
-     Orchestrator instance implementing OrchestratorProtocol
-
- Raises:
-     ValueError: If required handlers are missing for simple mode
-     ValueError: If advanced mode is requested but dependencies are missing
+ Defaults to AdvancedOrchestrator (Unified Architecture).
+ Simple Mode is deprecated and mapped to Advanced Mode.
  """
  effective_config = config or OrchestratorConfig()
- effective_mode = _determine_mode(mode, api_key)
+ effective_mode = _determine_mode(mode)
  logger.info("Creating orchestrator", mode=effective_mode, domain=domain)
 
- if effective_mode == "advanced":
-     orchestrator_cls = _get_advanced_orchestrator_class()
-     return orchestrator_cls(
-         max_rounds=settings.advanced_max_rounds,
-         api_key=api_key,
-         domain=domain,
-     )
-
  if effective_mode == "hierarchical":
      from src.orchestrators.hierarchical import HierarchicalOrchestrator
 
      return HierarchicalOrchestrator(config=effective_config, domain=domain)
 
- # Simple mode requires handlers
- if search_handler is None or judge_handler is None:
-     raise ValueError("Simple mode requires search_handler and judge_handler")
-
- return Orchestrator(
-     search_handler=search_handler,
-     judge_handler=judge_handler,
-     config=effective_config,
+ # Default: Advanced Mode (Unified)
+ # Handles both Paid (OpenAI) and Free (HuggingFace) tiers
+ orchestrator_cls = _get_advanced_orchestrator_class()
+ return orchestrator_cls(
+     max_rounds=settings.advanced_max_rounds,
+     api_key=api_key,
      domain=domain,
  )
 
 
- def _determine_mode(explicit_mode: str | None, api_key: str | None) -> str:
+ def _determine_mode(explicit_mode: str | None) -> str:
      """Determine which mode to use.
 
-     Priority:
-     1. Explicit mode parameter
-     2. Auto-detect based on available API keys
-
      Args:
          explicit_mode: Mode explicitly requested by caller
-         api_key: API key provided by caller
 
      Returns:
-         Effective mode string: "simple", "advanced", or "hierarchical"
+         Effective mode string: "advanced" (default) or "hierarchical"
      """
-     if explicit_mode:
-         if explicit_mode in ("magentic", "advanced"):
-             return "advanced"
-         if explicit_mode == "hierarchical":
-             return "hierarchical"
-         return "simple"
-
-     # Auto-detect: advanced if paid API key available
-     if settings.has_openai_key or (api_key and api_key.startswith("sk-")):
-         return "advanced"
-
-     return "simple"
+     if explicit_mode == "hierarchical":
+         return "hierarchical"
+
+     # "simple" is deprecated -> upgrade to "advanced"
+     # "magentic" is alias for "advanced"
+     return "advanced"
 
 
 
 
 
 
src/orchestrators/simple.py DELETED
@@ -1,778 +0,0 @@
- """Simple Orchestrator - the basic agent loop connecting Search and Judge.
-
- This orchestrator uses a simple loop pattern with pydantic-ai for structured
- LLM outputs. It works with free tier (HuggingFace Inference) or paid APIs
- (OpenAI, Anthropic).
-
- Design Pattern: Template Method - defines the skeleton of the search-judge loop
- while allowing handlers to implement specific behaviors.
- """
-
- from __future__ import annotations
-
- import asyncio
- from collections.abc import AsyncGenerator
- from typing import TYPE_CHECKING, Any, ClassVar
-
- import structlog
-
- from src.config.domain import ResearchDomain, get_domain_config
- from src.orchestrators.base import JudgeHandlerProtocol, SearchHandlerProtocol
- from src.prompts.synthesis import format_synthesis_prompt, get_synthesis_system_prompt
- from src.utils.config import settings
- from src.utils.exceptions import JudgeError, ModalError, SearchError
- from src.utils.models import (
-     AgentEvent,
-     Evidence,
-     JudgeAssessment,
-     OrchestratorConfig,
-     SearchResult,
- )
-
- if TYPE_CHECKING:
-     from src.services.embeddings import EmbeddingService
-     from src.services.statistical_analyzer import StatisticalAnalyzer
-
- logger = structlog.get_logger()
-
-
- class Orchestrator:
-     """
-     The simple agent orchestrator - runs the Search -> Judge -> Loop cycle.
-
-     This is a generator-based design that yields events for real-time UI updates.
-     Uses pydantic-ai for structured LLM outputs without requiring the full
-     Microsoft Agent Framework.
-     """
-
-     # Termination thresholds (code-enforced, not LLM-decided)
-     TERMINATION_CRITERIA: ClassVar[dict[str, float]] = {
-         "min_combined_score": 12.0,  # mechanism + clinical >= 12
-         "min_score_with_volume": 10.0,  # >= 10 if 50+ sources
-         "min_evidence_for_volume": 50.0,  # Priority 3: evidence count threshold
-         "late_iteration_threshold": 8.0,  # >= 8 in iterations 8+
-         "max_evidence_threshold": 100.0,  # Force synthesis with 100+ sources
-         "emergency_iteration": 8.0,  # Last 2 iterations = emergency mode
-         "min_confidence": 0.5,  # Minimum confidence for emergency synthesis
-         "min_evidence_for_emergency": 30.0,  # Priority 6: min evidence for emergency
-     }
-
-     def __init__(
-         self,
-         search_handler: SearchHandlerProtocol,
-         judge_handler: JudgeHandlerProtocol,
-         config: OrchestratorConfig | None = None,
-         enable_analysis: bool = False,
-         enable_embeddings: bool = True,
-         domain: ResearchDomain | str | None = None,
-     ):
-         """
-         Initialize the orchestrator.
-
-         Args:
-             search_handler: Handler for executing searches
-             judge_handler: Handler for assessing evidence
-             config: Optional configuration (uses defaults if not provided)
-             enable_analysis: Whether to perform statistical analysis (if Modal available)
-             enable_embeddings: Whether to use semantic search for ranking/dedup
-             domain: Research domain for customization
-         """
-         self.search = search_handler
-         self.judge = judge_handler
-         self.config = config or OrchestratorConfig()
-         self.history: list[dict[str, Any]] = []
-         self._enable_analysis = enable_analysis and settings.modal_available
-         self._enable_embeddings = enable_embeddings
-         self.domain = domain
-         self.domain_config = get_domain_config(domain)
-
-         # Lazy-load services (typed for IDE support)
-         self._analyzer: StatisticalAnalyzer | None = None
-         self._embeddings: EmbeddingService | None = None
-
-     def _get_analyzer(self) -> StatisticalAnalyzer | None:
-         """Lazy initialization of StatisticalAnalyzer."""
-         if self._analyzer is None:
-             from src.utils.service_loader import get_analyzer_if_available
-
-             self._analyzer = get_analyzer_if_available()
-             if self._analyzer is None:
-                 self._enable_analysis = False
-         return self._analyzer
-
-     async def _run_analysis_phase(
-         self, query: str, evidence: list[Evidence], iteration: int
-     ) -> AsyncGenerator[AgentEvent, None]:
-         """Run the optional analysis phase."""
-         if not self._enable_analysis:
-             return
-
-         yield AgentEvent(
-             type="analyzing",
-             message="Running statistical analysis in Modal sandbox...",
-             data={},
-             iteration=iteration,
-         )
-
-         try:
-             analyzer = self._get_analyzer()
-             if analyzer is None:
-                 logger.info("StatisticalAnalyzer not available, skipping analysis phase")
-                 return
-
-             # Run Modal analysis (no agent_framework needed!)
-             analysis_result = await analyzer.analyze(
-                 query=query,
-                 evidence=evidence,
-                 hypothesis=None,  # Could add hypothesis generation later
-             )
-
-             yield AgentEvent(
-                 type="analysis_complete",
-                 message=f"Analysis verdict: {analysis_result.verdict}",
-                 data=analysis_result.model_dump(),
-                 iteration=iteration,
-             )
-
-         except ModalError as e:
-             logger.error("Modal analysis failed", error=str(e), exc_type="ModalError")
-             yield AgentEvent(
-                 type="error",
-                 message=f"Modal analysis failed: {e}",
-                 data={"error": str(e), "recoverable": True},
-                 iteration=iteration,
-             )
-         except Exception as e:
-             # Unexpected error - log with full context for debugging
-             logger.error(
-                 "Modal analysis failed unexpectedly",
-                 error=str(e),
-                 exc_type=type(e).__name__,
-             )
-             yield AgentEvent(
-                 type="error",
-                 message=f"Modal analysis failed: {e}",
-                 data={"error": str(e), "recoverable": True},
-                 iteration=iteration,
-             )
-
-     def _should_synthesize(
-         self,
-         assessment: JudgeAssessment,
-         iteration: int,
-         max_iterations: int,
-         evidence_count: int,
-     ) -> tuple[bool, str]:
-         """
-         Code-enforced synthesis decision.
-
-         Returns (should_synthesize, reason).
-         """
-         combined_score = (
-             assessment.details.mechanism_score + assessment.details.clinical_evidence_score
-         )
-         has_drug_candidates = len(assessment.details.drug_candidates) > 0
-         confidence = assessment.confidence
-
-         # Priority 1: LLM explicitly says sufficient with good scores
-         if assessment.sufficient and assessment.recommendation == "synthesize":
-             if combined_score >= 10:
-                 return True, "judge_approved"
-
-         # Priority 2: High scores with drug candidates
-         if (
-             combined_score >= self.TERMINATION_CRITERIA["min_combined_score"]
-             and has_drug_candidates
-         ):
-             return True, "high_scores_with_candidates"
188
-
189
- # Priority 3: Good scores with high evidence volume
190
- if (
191
- combined_score >= self.TERMINATION_CRITERIA["min_score_with_volume"]
192
- and evidence_count >= self.TERMINATION_CRITERIA["min_evidence_for_volume"]
193
- ):
194
- return True, "good_scores_high_volume"
195
-
196
- # Priority 4: Late iteration with acceptable scores (diminishing returns)
197
- is_late_iteration = iteration >= max_iterations - 2
198
- if (
199
- is_late_iteration
200
- and combined_score >= self.TERMINATION_CRITERIA["late_iteration_threshold"]
201
- ):
202
- return True, "late_iteration_acceptable"
203
-
204
- # Priority 5: Very high evidence count (enough to synthesize something)
205
- if evidence_count >= self.TERMINATION_CRITERIA["max_evidence_threshold"]:
206
- return True, "max_evidence_reached"
207
-
208
- # Priority 6: Emergency synthesis (avoid garbage output)
209
- if (
210
- is_late_iteration
211
- and evidence_count >= self.TERMINATION_CRITERIA["min_evidence_for_emergency"]
212
- and confidence >= self.TERMINATION_CRITERIA["min_confidence"]
213
- ):
214
- return True, "emergency_synthesis"
215
-
216
- return False, "continue_searching"
217
-
218
- async def run(self, query: str) -> AsyncGenerator[AgentEvent, None]: # noqa: PLR0915
219
- """
220
- Run the agent loop for a query.
221
-
222
- Yields AgentEvent objects for each step, allowing real-time UI updates.
223
-
224
- Args:
225
- query: The user's research question
226
-
227
- Yields:
228
- AgentEvent objects for each step of the process
229
- """
230
- # Import here to avoid circular deps if any
231
- from src.agents.graph.state import Hypothesis
232
- from src.services.research_memory import ResearchMemory
233
-
234
- logger.info("Starting orchestrator", query=query)
235
-
236
- yield AgentEvent(
237
- type="started",
238
- message=f"Starting research for: {query}",
239
- iteration=0,
240
- )
241
-
242
- # Initialize Shared Memory
243
- # We keep 'all_evidence' for local tracking/reporting, but use Memory for intelligence
244
- memory = ResearchMemory(query=query)
245
- all_evidence: list[Evidence] = []
246
- current_queries = [query]
247
- iteration = 0
248
-
249
- while iteration < self.config.max_iterations:
250
- iteration += 1
251
- logger.info("Iteration", iteration=iteration, queries=current_queries)
252
-
253
- # === SEARCH PHASE ===
254
- yield AgentEvent(
255
- type="searching",
256
- message=f"Searching for: {', '.join(current_queries[:3])}...",
257
- iteration=iteration,
258
- )
259
-
260
- try:
261
- # Execute searches for all current queries
262
- search_tasks = [
263
- self.search.execute(q, self.config.max_results_per_tool)
264
- for q in current_queries[:3] # Limit to 3 queries per iteration
265
- ]
266
- search_results = await asyncio.gather(*search_tasks, return_exceptions=True)
267
-
268
- # Collect evidence from successful searches
269
- new_evidence: list[Evidence] = []
270
- errors: list[str] = []
271
-
272
- for q, result in zip(current_queries[:3], search_results, strict=False):
273
- if isinstance(result, Exception):
274
- errors.append(f"Search for '{q}' failed: {result!s}")
275
- elif isinstance(result, SearchResult):
276
- new_evidence.extend(result.evidence)
277
- errors.extend(result.errors)
278
- else:
279
- # Should not happen with return_exceptions=True but safe fallback
280
- errors.append(f"Unknown result type for '{q}': {type(result)}")
281
-
282
- # === MEMORY INTEGRATION: Store and Deduplicate ===
283
- # ResearchMemory handles semantic deduplication and persistence
284
- # It returns IDs of actual NEW evidence
285
- new_ids = await memory.store_evidence(new_evidence)
286
-
287
- # Filter new_evidence to only keep what was actually new (based on IDs)
288
- # Note: This assumes IDs are URLs, which match Citation.url
289
- unique_new = [e for e in new_evidence if e.citation.url in new_ids]
290
-
291
- all_evidence.extend(unique_new)
292
-
293
- yield AgentEvent(
294
- type="search_complete",
295
- message=f"Found {len(unique_new)} new sources ({len(all_evidence)} total)",
296
- data={
297
- "new_count": len(unique_new),
298
- "total_count": len(all_evidence),
299
- },
300
- iteration=iteration,
301
- )
302
-
303
- if errors:
304
- logger.warning("Search errors", errors=errors)
305
-
306
- except SearchError as e:
307
- logger.error("Search phase failed", error=str(e), exc_type="SearchError")
308
- yield AgentEvent(
309
- type="error",
310
- message=f"Search failed: {e!s}",
311
- data={"recoverable": True, "error_type": "search"},
312
- iteration=iteration,
313
- )
314
- continue
315
- except Exception as e:
316
- # Unexpected error - log full context for debugging
317
- logger.error(
318
- "Search phase failed unexpectedly",
319
- error=str(e),
320
- exc_type=type(e).__name__,
321
- )
322
- yield AgentEvent(
323
- type="error",
324
- message=f"Search failed: {e!s}",
325
- data={"recoverable": True, "error_type": "unexpected"},
326
- iteration=iteration,
327
- )
328
- continue
329
-
330
- # === JUDGE PHASE ===
331
- yield AgentEvent(
332
- type="judging",
333
- message=f"Evaluating evidence (Memory: {len(memory.evidence_ids)} docs)...",
334
- iteration=iteration,
335
- )
336
-
337
- try:
338
- # Retrieve RELEVANT evidence from memory for the judge
339
- # This keeps the context window manageable and focused
340
- judge_context = await memory.get_relevant_evidence(n=30)
341
-
342
- # Fallback if memory is empty (shouldn't happen if search worked)
343
- if not judge_context and all_evidence:
344
- judge_context = all_evidence[-30:]
345
-
346
- assessment = await self.judge.assess(
347
- query, judge_context, iteration, self.config.max_iterations
348
- )
349
-
350
- # === MEMORY INTEGRATION: Track Hypotheses ===
351
- # Convert loose strings to structured Hypotheses
352
- for candidate in assessment.details.drug_candidates:
353
- h = Hypothesis(
354
- id=candidate.replace(" ", "_").lower(),
355
- statement=f"{candidate} is a potential candidate for {query}",
356
- status="proposed",
357
- confidence=assessment.confidence,
358
- reasoning=f" identified in iteration {iteration}",
359
- )
360
- memory.add_hypothesis(h)
361
-
362
- yield AgentEvent(
363
- type="judge_complete",
364
- message=(
365
- f"Assessment: {assessment.recommendation} "
366
- f"(confidence: {assessment.confidence:.0%})"
367
- ),
368
- data={
369
- "sufficient": assessment.sufficient,
370
- "confidence": assessment.confidence,
371
- "mechanism_score": assessment.details.mechanism_score,
372
- "clinical_score": assessment.details.clinical_evidence_score,
373
- },
374
- iteration=iteration,
375
- )
376
-
377
- # Record this iteration in history
378
- self.history.append(
379
- {
380
- "iteration": iteration,
381
- "queries": current_queries,
382
- "evidence_count": len(all_evidence),
383
- "assessment": assessment.model_dump(),
384
- }
385
- )
386
-
387
- # === DECISION PHASE (Code-Enforced) ===
388
- should_synth, reason = self._should_synthesize(
389
- assessment=assessment,
390
- iteration=iteration,
391
- max_iterations=self.config.max_iterations,
392
- evidence_count=len(all_evidence),
393
- )
394
-
395
- logger.info(
396
- "Synthesis decision",
397
- should_synthesize=should_synth,
398
- reason=reason,
399
- iteration=iteration,
400
- combined_score=assessment.details.mechanism_score
401
- + assessment.details.clinical_evidence_score,
402
- evidence_count=len(all_evidence),
403
- confidence=assessment.confidence,
404
- )
405
-
406
- if should_synth:
407
- # Log synthesis trigger reason for debugging
408
- if reason != "judge_approved":
409
- logger.info(f"Code-enforced synthesis triggered: {reason}")
410
-
411
- # Optional Analysis Phase
412
- async for event in self._run_analysis_phase(query, all_evidence, iteration):
413
- yield event
414
-
415
- yield AgentEvent(
416
- type="synthesizing",
417
- message=f"Evidence sufficient ({reason})! Preparing synthesis...",
418
- iteration=iteration,
419
- )
420
-
421
- # Generate final response using LLM narrative synthesis
422
- # Use all gathered evidence for the final report
423
- final_response = await self._generate_synthesis(query, all_evidence, assessment)
424
-
425
- yield AgentEvent(
426
- type="complete",
427
- message=final_response,
428
- data={
429
- "evidence_count": len(all_evidence),
430
- "iterations": iteration,
431
- "synthesis_reason": reason,
432
- "drug_candidates": assessment.details.drug_candidates,
433
- "key_findings": assessment.details.key_findings,
434
- },
435
- iteration=iteration,
436
- )
437
- return
438
-
439
- else:
440
- # Need more evidence - prepare next queries
441
- current_queries = assessment.next_search_queries or [
442
- f"{query} mechanism of action",
443
- f"{query} clinical evidence",
444
- ]
445
-
446
- yield AgentEvent(
447
- type="looping",
448
- message=(
449
- f"Gathering more evidence (scores: {assessment.details.mechanism_score}"
450
- f"+{assessment.details.clinical_evidence_score}). "
451
- f"Next: {', '.join(current_queries[:2])}..."
452
- ),
453
- data={"next_queries": current_queries, "reason": reason},
454
- iteration=iteration,
455
- )
456
-
457
- except JudgeError as e:
458
- logger.error("Judge phase failed", error=str(e), exc_type="JudgeError")
459
- yield AgentEvent(
460
- type="error",
461
- message=f"Assessment failed: {e!s}",
462
- data={"recoverable": True, "error_type": "judge"},
463
- iteration=iteration,
464
- )
465
- continue
466
- except Exception as e:
467
- # Unexpected error - log full context for debugging
468
- logger.error(
469
- "Judge phase failed unexpectedly",
470
- error=str(e),
471
- exc_type=type(e).__name__,
472
- )
473
- yield AgentEvent(
474
- type="error",
475
- message=f"Assessment failed: {e!s}",
476
- data={"recoverable": True, "error_type": "unexpected"},
477
- iteration=iteration,
478
- )
479
- continue
480
-
481
- # Max iterations reached
482
- yield AgentEvent(
483
- type="complete",
484
- message=self._generate_partial_synthesis(query, all_evidence),
485
- data={
486
- "evidence_count": len(all_evidence),
487
- "iterations": iteration,
488
- "max_reached": True,
489
- },
490
- iteration=iteration,
491
- )
492
-
493
- async def _generate_synthesis(
494
- self,
495
- query: str,
496
- evidence: list[Evidence],
497
- assessment: JudgeAssessment,
498
- ) -> str:
499
- """
500
- Generate the final synthesis response using LLM.
501
-
502
- This method calls an LLM to generate a narrative research report,
503
- following the Microsoft Agent Framework pattern of using LLM synthesis
504
- instead of string templating.
505
-
506
- Args:
507
- query: The original question
508
- evidence: All collected evidence
509
- assessment: The final assessment
510
-
511
- Returns:
512
- Narrative synthesis as markdown
513
- """
514
- # Build evidence summary for LLM context (limit to avoid token overflow)
515
- evidence_lines = []
516
- for e in evidence[:20]:
517
- authors = ", ".join(e.citation.authors[:2]) if e.citation.authors else "Unknown"
518
- content_preview = e.content[:200].replace("\n", " ")
519
- evidence_lines.append(
520
- f"- {e.citation.title} ({authors}, {e.citation.date}): {content_preview}..."
521
- )
522
- evidence_summary = "\n".join(evidence_lines)
523
-
524
- # Format synthesis prompt with assessment data
525
- user_prompt = format_synthesis_prompt(
526
- query=query,
527
- evidence_summary=evidence_summary,
528
- drug_candidates=assessment.details.drug_candidates,
529
- key_findings=assessment.details.key_findings,
530
- mechanism_score=assessment.details.mechanism_score,
531
- clinical_score=assessment.details.clinical_evidence_score,
532
- confidence=assessment.confidence,
533
- )
534
-
535
- # Get domain-specific system prompt
536
- system_prompt = get_synthesis_system_prompt(self.domain)
537
-
538
- try:
539
- # Type-safe tier detection using Protocol (CodeRabbit review recommendation)
540
- # This replaces hasattr() with isinstance() for compile-time type safety
541
- from src.orchestrators.base import SynthesizableJudge
542
- from src.utils.exceptions import SynthesisError
543
-
544
- if isinstance(self.judge, SynthesizableJudge):
545
- logger.info("Using judge's free-tier synthesis method")
546
- # synthesize() now raises SynthesisError on failure (CodeRabbit fix)
547
- narrative = await self.judge.synthesize(system_prompt, user_prompt)
548
- logger.info("Free-tier synthesis completed", chars=len(narrative))
549
- else:
550
- # Paid tier: use PydanticAI with get_model()
551
- from pydantic_ai import Agent
552
-
553
- from src.agent_factory.judges import get_model
554
-
555
- # Create synthesis agent with retries (matching Judge agent pattern)
556
- # Without retries, transient errors immediately trigger fallback
557
- agent: Agent[None, str] = Agent(
558
- model=get_model(),
559
- output_type=str,
560
- system_prompt=system_prompt,
561
- retries=3, # Match Judge agent - retry on transient errors
562
- )
563
- result = await agent.run(user_prompt)
564
- narrative = result.output
565
-
566
- logger.info("LLM narrative synthesis completed", chars=len(narrative))
567
-
568
- except SynthesisError as e:
569
- # Handle SynthesisError with detailed context (CodeRabbit recommendation)
570
- logger.error(
571
- "Free-tier synthesis failed",
572
- attempted_models=e.attempted_models,
573
- errors=e.errors,
574
- evidence_count=len(evidence),
575
- )
576
- # Surface detailed error to user
577
- models_str = ", ".join(e.attempted_models) if e.attempted_models else "unknown"
578
- error_note = (
579
- f"\n\n> ⚠️ **Note**: AI narrative synthesis unavailable. "
580
- f"Showing structured summary.\n"
581
- f"> _Attempted models: {models_str}_\n"
582
- )
583
- template = self._generate_template_synthesis(query, evidence, assessment)
584
- return f"{error_note}\n{template}"
585
-
586
- except Exception as e:
587
- # Fallback to template synthesis if LLM fails
588
- # Log error details for debugging
589
- logger.error(
590
- "LLM synthesis failed, using template fallback",
591
- error=str(e),
592
- exc_type=type(e).__name__,
593
- evidence_count=len(evidence),
594
- exc_info=True, # Capture stack trace for debugging
595
- )
596
- # Surface the error to user (MS Agent Framework pattern)
597
- # Don't silently fall back - let user know synthesis degraded
598
- error_note = (
599
- f"\n\n> ⚠️ **Note**: AI narrative synthesis unavailable. "
600
- f"Showing structured summary.\n"
601
- f"> _Error: {type(e).__name__}_\n"
602
- )
603
- template = self._generate_template_synthesis(query, evidence, assessment)
604
- return f"{error_note}\n{template}"
605
-
606
- # Add full citation list footer
607
- citations = "\n".join(
608
- f"{i + 1}. [{e.citation.title}]({e.citation.url}) "
609
- f"({e.citation.source.upper()}, {e.citation.date})"
610
- for i, e in enumerate(evidence[:15])
611
- )
612
-
613
- return f"""{narrative}
614
-
615
- ---
616
- ### Full Citation List ({len(evidence)} sources)
617
- {citations}
618
-
619
- *Analysis based on {len(evidence)} sources across {len(self.history)} iterations.*
620
- """
621
-
622
- def _generate_template_synthesis(
623
- self,
624
- query: str,
625
- evidence: list[Evidence],
626
- assessment: JudgeAssessment,
627
- ) -> str:
628
- """
629
- Generate fallback template synthesis (no LLM).
630
-
631
- Used when LLM synthesis fails or is unavailable.
632
-
633
- Args:
634
- query: The original question
635
- evidence: All collected evidence
636
- assessment: The final assessment
637
-
638
- Returns:
639
- Formatted synthesis as markdown (bullet-point style)
640
- """
641
- drug_list = (
642
- "\n".join([f"- **{d}**" for d in assessment.details.drug_candidates])
643
- or "- No specific candidates identified"
644
- )
645
- findings_list = (
646
- "\n".join([f"- {f}" for f in assessment.details.key_findings]) or "- See evidence below"
647
- )
648
-
649
- citations = "\n".join(
650
- [
651
- f"{i + 1}. [{e.citation.title}]({e.citation.url}) "
652
- f"({e.citation.source.upper()}, {e.citation.date})"
653
- for i, e in enumerate(evidence[:10])
654
- ]
655
- )
656
-
657
- return f"""{self.domain_config.report_title}
658
-
659
- ### Question
660
- {query}
661
-
662
- ### Drug Candidates
663
- {drug_list}
664
-
665
- ### Key Findings
666
- {findings_list}
667
-
668
- ### Assessment
669
- - **Mechanism Score**: {assessment.details.mechanism_score}/10
670
- - **Clinical Evidence Score**: {assessment.details.clinical_evidence_score}/10
671
- - **Confidence**: {assessment.confidence:.0%}
672
-
673
- ### Reasoning
674
- {assessment.reasoning}
675
-
676
- ### Citations ({len(evidence)} sources)
677
- {citations}
678
-
679
- ---
680
- *Analysis based on {len(evidence)} sources across {len(self.history)} iterations.*
681
- """
682
-
683
- def _generate_partial_synthesis(
684
- self,
685
- query: str,
686
- evidence: list[Evidence],
687
- ) -> str:
688
- """
689
- Generate a REAL synthesis when max iterations reached.
690
-
691
- Even when forced to stop, we should provide:
692
- - Drug candidates (if any were found)
693
- - Key findings
694
- - Assessment scores
695
- - Actionable citations
696
-
697
- This is still better than a citation dump.
698
- """
699
- # Extract data from last assessment if available
700
- last_assessment = self.history[-1]["assessment"] if self.history else {}
701
- details = last_assessment.get("details", {})
702
-
703
- drug_candidates = details.get("drug_candidates", [])
704
- key_findings = details.get("key_findings", [])
705
- mechanism_score = details.get("mechanism_score", 0)
706
- clinical_score = details.get("clinical_evidence_score", 0)
707
- reasoning = last_assessment.get("reasoning", "Analysis incomplete due to iteration limit.")
708
-
709
- # Format drug candidates
710
- if drug_candidates:
711
- drug_list = "\n".join([f"- **{d}**" for d in drug_candidates[:5]])
712
- else:
713
- drug_list = (
714
- "- *No specific drug candidates identified in evidence*\n"
715
- "- *Try a more specific query or add an API key for better analysis*"
716
- )
717
-
718
- # Format key findings
719
- if key_findings:
720
- findings_list = "\n".join([f"- {f}" for f in key_findings[:5]])
721
- else:
722
- findings_list = (
723
- "- *Key findings require further analysis*\n"
724
- "- *See citations below for relevant sources*"
725
- )
726
-
727
- # Format citations (top 10)
728
- citations = "\n".join(
729
- [
730
- f"{i + 1}. [{e.citation.title}]({e.citation.url}) "
731
- f"({e.citation.source.upper()}, {e.citation.date})"
732
- for i, e in enumerate(evidence[:10])
733
- ]
734
- )
735
-
736
- combined_score = mechanism_score + clinical_score
737
- mech_strength = (
738
- "Strong" if mechanism_score >= 7 else "Moderate" if mechanism_score >= 4 else "Limited"
739
- )
740
- clin_strength = (
741
- "Strong" if clinical_score >= 7 else "Moderate" if clinical_score >= 4 else "Limited"
742
- )
743
- comb_strength = "Sufficient" if combined_score >= 12 else "Partial"
744
-
745
- return f"""{self.domain_config.report_title}
746
-
747
- ### Research Question
748
- {query}
749
-
750
- ### Status
751
- Analysis based on {len(evidence)} sources across {len(self.history)} iterations.
752
- Maximum iterations reached - results may be incomplete.
753
-
754
- ### Drug Candidates Identified
755
- {drug_list}
756
-
757
- ### Key Findings
758
- {findings_list}
759
-
760
- ### Evidence Quality Scores
761
- | Criterion | Score | Interpretation |
762
- |-----------|-------|----------------|
763
- | Mechanism | {mechanism_score}/10 | {mech_strength} mechanistic evidence |
764
- | Clinical | {clinical_score}/10 | {clin_strength} clinical support |
765
- | Combined | {combined_score}/20 | {comb_strength} for synthesis |
766
-
767
- ### Analysis Summary
768
- {reasoning}
769
-
770
- ### Top Citations ({len(evidence)} sources total)
771
- {citations}
772
-
773
- ---
774
- *For more complete analysis:*
775
- - *Add an OpenAI or Anthropic API key for enhanced LLM analysis*
776
- - *Try a more specific query (e.g., include drug names)*
777
- - *Use Advanced mode for multi-agent research*
778
- """
src/prompts/judge.py CHANGED
@@ -122,7 +122,8 @@ def format_user_prompt(
     NOTE: Evidence should be pre-selected using select_evidence_for_judge().
     This function assumes evidence is already capped.
     """
-    total_count = total_evidence_count or len(evidence)
+    # Use explicit None check - 0 is a valid count (empty evidence)
+    total_count = total_evidence_count if total_evidence_count is not None else len(evidence)
     max_content_len = 1500
     scoring_prompt = get_scoring_prompt(domain)
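The judge.py change exists because `or` treats every falsy value as missing, so a legitimate count of `0` silently falls through to the fallback. A minimal repro of the pitfall (function names here are illustrative, not from the codebase):

```python
def total_with_or(count, evidence):
    # Buggy: `or` treats 0 as "not provided" and falls back
    return count or len(evidence)


def total_with_none_check(count, evidence):
    # Correct: only None triggers the fallback; 0 is a valid count
    return count if count is not None else len(evidence)


evidence = ["a", "b", "c"]
print(total_with_or(0, evidence))             # 3 - wrong, 0 was a valid count
print(total_with_none_check(0, evidence))     # 0 - correct
print(total_with_none_check(None, evidence))  # 3 - fallback still works
```

The same reasoning drives the `temperature`/`max_tokens` fix described in the commit message: `temperature=0.0` is a deliberate setting, and `value or default` would discard it.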
src/utils/config.py CHANGED
@@ -27,7 +27,8 @@ class Settings(BaseSettings):
     # LLM Configuration
     openai_api_key: str | None = Field(default=None, description="OpenAI API key")
     anthropic_api_key: str | None = Field(default=None, description="Anthropic API key")
-    llm_provider: Literal["openai", "anthropic", "huggingface"] = Field(
+    gemini_api_key: str | None = Field(default=None, description="Google Gemini API key")
+    llm_provider: Literal["openai", "anthropic", "huggingface", "gemini"] = Field(
         default="openai", description="Which LLM provider to use"
     )
     openai_model: str = Field(default="gpt-5", description="OpenAI model name")
@@ -93,12 +94,15 @@ class Settings(BaseSettings):
 
     def get_api_key(self) -> str:
         """Get the API key for the configured provider."""
-        if self.llm_provider == "openai":
+        # Normalize provider for case-insensitive matching
+        provider_lower = self.llm_provider.lower() if self.llm_provider else ""
+
+        if provider_lower == "openai":
             if not self.openai_api_key:
                 raise ConfigurationError("OPENAI_API_KEY not set")
             return self.openai_api_key
 
-        if self.llm_provider == "anthropic":
+        if provider_lower == "anthropic":
             if not self.anthropic_api_key:
                 raise ConfigurationError("ANTHROPIC_API_KEY not set")
             return self.anthropic_api_key
@@ -124,6 +128,11 @@ class Settings(BaseSettings):
         """Check if Anthropic API key is available."""
         return bool(self.anthropic_api_key)
 
+    @property
+    def has_gemini_key(self) -> bool:
+        """Check if Gemini API key is available."""
+        return bool(self.gemini_api_key)
+
     @property
     def has_huggingface_key(self) -> bool:
         """Check if HuggingFace token is available."""
@@ -132,7 +141,12 @@ class Settings(BaseSettings):
     @property
     def has_any_llm_key(self) -> bool:
         """Check if any LLM API key is available."""
-        return self.has_openai_key or self.has_anthropic_key or self.has_huggingface_key
+        return (
+            self.has_openai_key
+            or self.has_anthropic_key
+            or self.has_huggingface_key
+            or self.has_gemini_key
+        )
 
 
 def get_settings() -> Settings:
src/utils/llm_factory.py CHANGED
@@ -1,106 +1,69 @@
1
  """Centralized LLM client factory.
2
 
3
- This module provides factory functions for creating LLM clients,
4
- ensuring consistent configuration and clear error messages.
5
-
6
- Why Magentic requires OpenAI:
7
- - Magentic agents use the @ai_function decorator for tool calling
8
- - This requires structured function calling protocol (tools, tool_choice)
9
- - OpenAI's API supports this natively
10
- - Anthropic/HuggingFace Inference APIs are text-in/text-out only
11
  """
12
 
13
- from typing import TYPE_CHECKING, Any
14
 
 
 
15
  from src.utils.config import settings
16
  from src.utils.exceptions import ConfigurationError
17
 
18
- if TYPE_CHECKING:
19
- from agent_framework.openai import OpenAIChatClient
20
 
21
-
22
- def get_magentic_client() -> "OpenAIChatClient":
23
  """
24
- Get the OpenAI client for Magentic agents.
25
-
26
- Magentic requires OpenAI because it uses function calling protocol:
27
- - @ai_function decorators define callable tools
28
- - LLM returns structured tool calls (not just text)
29
- - Requires OpenAI's tools/function_call API support
30
-
31
- Raises:
32
- ConfigurationError: If OPENAI_API_KEY is not set
33
 
34
- Returns:
35
- Configured OpenAIChatClient for Magentic agents
36
  """
37
- # Import here to avoid requiring agent-framework for simple mode
38
- from agent_framework.openai import OpenAIChatClient
39
-
40
- api_key = settings.get_openai_api_key()
41
-
42
- return OpenAIChatClient(
43
- model_id=settings.openai_model,
44
- api_key=api_key,
45
- )
46
 
47
 
48
  def get_pydantic_ai_model() -> Any:
49
  """
50
  Get the appropriate model for pydantic-ai based on configuration.
51
-
52
- Uses the configured LLM_PROVIDER to select between OpenAI and Anthropic.
53
- This is used by simple mode components (JudgeHandler, etc.)
54
-
55
- Returns:
56
- Configured pydantic-ai model
57
  """
58
  from pydantic_ai.models.anthropic import AnthropicModel
59
  from pydantic_ai.models.openai import OpenAIChatModel
60
  from pydantic_ai.providers.anthropic import AnthropicProvider
61
  from pydantic_ai.providers.openai import OpenAIProvider
62
 
63
- if settings.llm_provider == "openai":
 
 
 
64
  if not settings.openai_api_key:
65
  raise ConfigurationError("OPENAI_API_KEY not set for pydantic-ai")
66
  provider = OpenAIProvider(api_key=settings.openai_api_key)
67
  return OpenAIChatModel(settings.openai_model, provider=provider)
68
 
69
- if settings.llm_provider == "anthropic":
70
  if not settings.anthropic_api_key:
71
  raise ConfigurationError("ANTHROPIC_API_KEY not set for pydantic-ai")
72
  anthropic_provider = AnthropicProvider(api_key=settings.anthropic_api_key)
73
  return AnthropicModel(settings.anthropic_model, provider=anthropic_provider)
74
 
75
- raise ConfigurationError(f"Unknown LLM provider: {settings.llm_provider}")
76
 
77
 
78
  def check_magentic_requirements() -> None:
79
  """
80
  Check if Magentic mode requirements are met.
81
-
82
- Raises:
83
- ConfigurationError: If requirements not met
84
  """
85
- if not settings.has_openai_key:
86
- raise ConfigurationError(
87
- "Magentic mode requires OPENAI_API_KEY for function calling support. "
88
- "Anthropic and HuggingFace Inference do not support the structured "
89
- "function calling protocol that Magentic agents require. "
90
- "Use mode='simple' for other LLM providers."
91
- )
92
 
93
 
94
  def check_simple_mode_requirements() -> None:
95
  """
96
  Check if simple mode requirements are met.
97
-
98
- Simple mode supports both OpenAI and Anthropic.
99
-
100
- Raises:
101
- ConfigurationError: If no LLM API key is configured
102
  """
103
  if not settings.has_any_llm_key:
104
- raise ConfigurationError(
105
- "No LLM API key configured. Set OPENAI_API_KEY or ANTHROPIC_API_KEY."
106
- )
 
 
  """Centralized LLM client factory.

+ This module provides factory functions for creating LLM clients.
+ DEPRECATED: Prefer src.clients.factory.get_chat_client() directly.
  """

+ from typing import Any

+ from src.clients.base import BaseChatClient
+ from src.clients.factory import get_chat_client
  from src.utils.config import settings
  from src.utils.exceptions import ConfigurationError


+ def get_magentic_client() -> BaseChatClient:
      """
+     Get the chat client for Magentic agents.

+     Now unified to support OpenAI, Gemini, and HuggingFace.
      """
+     return get_chat_client()


  def get_pydantic_ai_model() -> Any:
      """
      Get the appropriate model for pydantic-ai based on configuration.
+     Used by legacy Simple Mode components.
      """
      from pydantic_ai.models.anthropic import AnthropicModel
      from pydantic_ai.models.openai import OpenAIChatModel
      from pydantic_ai.providers.anthropic import AnthropicProvider
      from pydantic_ai.providers.openai import OpenAIProvider

+     # Normalize provider for case-insensitive matching
+     provider_lower = settings.llm_provider.lower() if settings.llm_provider else ""
+
+     if provider_lower == "openai":
          if not settings.openai_api_key:
              raise ConfigurationError("OPENAI_API_KEY not set for pydantic-ai")
          provider = OpenAIProvider(api_key=settings.openai_api_key)
          return OpenAIChatModel(settings.openai_model, provider=provider)

+     if provider_lower == "anthropic":
          if not settings.anthropic_api_key:
              raise ConfigurationError("ANTHROPIC_API_KEY not set for pydantic-ai")
          anthropic_provider = AnthropicProvider(api_key=settings.anthropic_api_key)
          return AnthropicModel(settings.anthropic_model, provider=anthropic_provider)

+     raise ConfigurationError(f"Unknown LLM provider for simple mode: {settings.llm_provider}")


  def check_magentic_requirements() -> None:
      """
      Check if Magentic mode requirements are met.
+     Now supports multiple providers via ChatClientFactory.
      """
+     # Advanced/Magentic mode now works with ANY provider (including free HF)
+     pass


  def check_simple_mode_requirements() -> None:
      """
      Check if simple mode requirements are met.
      """
      if not settings.has_any_llm_key:
+         # Simple mode still requires an explicit LLM key. It previously had
+         # HuggingFace support, but that path was brittle; Simple Mode is
+         # slated for removal, so this check is intentionally a no-op for now.
+         pass
tests/e2e/test_advanced_mode.py DELETED
@@ -1,70 +0,0 @@
- from unittest.mock import MagicMock, patch
-
- import pytest
-
- # Skip entire module if agent_framework is not installed
- agent_framework = pytest.importorskip("agent_framework")
- from agent_framework import MagenticAgentMessageEvent, MagenticFinalResultEvent
-
- from src.orchestrators.advanced import AdvancedOrchestrator as MagenticOrchestrator
-
-
- class MockChatMessage:
-     def __init__(self, content):
-         self.content = content
-
-     @property
-     def text(self):
-         return self.content
-
-
- @pytest.mark.asyncio
- @pytest.mark.e2e
- async def test_advanced_mode_completes_mocked():
-     """Verify Advanced mode runs without crashing (mocked workflow)."""
-
-     # Initialize orchestrator (mocking requirements check)
-     with patch("src.orchestrators.advanced.check_magentic_requirements"):
-         orchestrator = MagenticOrchestrator(max_rounds=5)
-
-     # Mock the workflow
-     mock_workflow = MagicMock()
-
-     # Create fake events
-     # 1. Search Agent runs
-     mock_msg_1 = MockChatMessage("Found 5 papers on PubMed")
-     event1 = MagenticAgentMessageEvent(agent_id="SearchAgent", message=mock_msg_1)
-
-     # 2. Report Agent finishes
-     mock_result_msg = MockChatMessage("# Final Report\n\nFindings...")
-     event2 = MagenticFinalResultEvent(message=mock_result_msg)
-
-     async def mock_stream(task):
-         yield event1
-         yield event2
-
-     mock_workflow.run_stream = mock_stream
-
-     # Patch dependencies:
-     # _build_workflow: Returns our mock
-     # init_magentic_state: Avoids DB calls
-     # _init_embedding_service: Avoids loading embeddings
-     with (
-         patch.object(orchestrator, "_build_workflow", return_value=mock_workflow),
-         patch("src.orchestrators.advanced.init_magentic_state"),
-         patch.object(orchestrator, "_init_embedding_service", return_value=None),
-     ):
-         events = []
-         async for event in orchestrator.run("test query"):
-             events.append(event)
-
-     # Check events
-     types = [e.type for e in events]
-     assert "started" in types
-     assert "thinking" in types
-     assert "search_complete" in types  # Mapped from SearchAgent
-     assert "progress" in types  # Added in SPEC_01
-     assert "complete" in types
-
-     complete_event = next(e for e in events if e.type == "complete")
-     assert "Final Report" in complete_event.message
tests/e2e/test_simple_mode.py DELETED
@@ -1,65 +0,0 @@
- import pytest
-
- from src.orchestrators import Orchestrator
- from src.utils.models import OrchestratorConfig
-
-
- @pytest.mark.asyncio
- @pytest.mark.e2e
- async def test_simple_mode_completes(mock_search_handler, mock_judge_handler):
-     """Verify Simple mode runs without crashing using mocks."""
-
-     config = OrchestratorConfig(max_iterations=2)
-
-     orchestrator = Orchestrator(
-         search_handler=mock_search_handler,
-         judge_handler=mock_judge_handler,
-         config=config,
-         enable_analysis=False,
-         enable_embeddings=False,
-     )
-
-     events = []
-     async for event in orchestrator.run("test query"):
-         events.append(event)
-
-     # Must complete
-     assert any(e.type == "complete" for e in events), "Did not receive complete event"
-     # Must not error
-     assert not any(e.type == "error" for e in events), "Received error event"
-
-     # Check structure of complete event
-     complete_event = next(e for e in events if e.type == "complete")
-     # The mock judge returns "MockDrug A" and "Finding 1", ensuring synthesis happens
-     assert "MockDrug A" in complete_event.message
-     assert "Finding 1" in complete_event.message
-
-
- @pytest.mark.asyncio
- @pytest.mark.e2e
- async def test_simple_mode_structure_validation(mock_search_handler, mock_judge_handler):
-     """Verify output contains expected structure (citations, headings)."""
-     config = OrchestratorConfig(max_iterations=2)
-     orchestrator = Orchestrator(
-         search_handler=mock_search_handler,
-         judge_handler=mock_judge_handler,
-         config=config,
-         enable_analysis=False,
-         enable_embeddings=False,
-     )
-
-     events = []
-     async for event in orchestrator.run("test query"):
-         events.append(event)
-
-     complete_event = next(e for e in events if e.type == "complete")
-     report = complete_event.message
-
-     # Check LLM narrative synthesis structure (SPEC_12)
-     # LLM generates prose with these sections (may omit ### prefix)
-     assert "Executive Summary" in report or "Sexual Health Analysis" in report
-     assert "Full Citation List" in report or "Citations" in report
-
-     # Check for citations (from citation footer added by orchestrator)
-     assert "Study on test query" in report
-     assert "pubmed.example.com/123" in report
tests/integration/test_dual_mode_e2e.py DELETED
@@ -1,83 +0,0 @@
- """End-to-End Integration Tests for Dual-Mode Architecture."""
-
- from unittest.mock import AsyncMock, MagicMock, patch
-
- import pytest
-
- pytestmark = [pytest.mark.integration, pytest.mark.slow]
-
- from src.orchestrators import create_orchestrator
- from src.utils.models import Citation, Evidence, OrchestratorConfig
-
-
- @pytest.fixture
- def mock_search_handler():
-     handler = MagicMock()
-     handler.execute = AsyncMock(
-         return_value=[
-             Evidence(
-                 citation=Citation(
-                     title="Test Paper", url="http://test", date="2024", source="pubmed"
-                 ),
-                 content="Testosterone improves sexual desire in postmenopausal women.",
-             )
-         ]
-     )
-     return handler
-
-
- @pytest.fixture
- def mock_judge_handler():
-     handler = MagicMock()
-     # Mock return value of assess
-     assessment = MagicMock()
-     assessment.sufficient = True
-     assessment.recommendation = "synthesize"
-     handler.assess = AsyncMock(return_value=assessment)
-     return handler
-
-
- @pytest.mark.asyncio
- async def test_simple_mode_e2e(mock_search_handler, mock_judge_handler):
-     """Test Simple Mode Orchestration flow."""
-     orch = create_orchestrator(
-         search_handler=mock_search_handler,
-         judge_handler=mock_judge_handler,
-         mode="simple",
-         config=OrchestratorConfig(max_iterations=1),
-     )
-
-     # Run
-     results = []
-     async for event in orch.run("Test query"):
-         results.append(event)
-
-     assert len(results) > 0
-     assert mock_search_handler.execute.called
-     assert mock_judge_handler.assess.called
-
-
- @pytest.mark.asyncio
- async def test_advanced_mode_explicit_instantiation():
-     """Test explicit Advanced Mode instantiation (not auto-detect).
-
-     This tests the explicit mode="advanced" path, verifying that
-     MagenticOrchestrator can be instantiated when explicitly requested.
-     The settings patch ensures any internal checks pass.
-     """
-     with patch("src.orchestrators.factory.settings") as mock_settings:
-         # Settings patch ensures factory checks pass (even though mode is explicit)
-         mock_settings.has_openai_key = True
-
-         with patch("src.agents.magentic_agents.OpenAIChatClient"):
-             # Mock agent creation to avoid real API calls during init
-             with (
-                 patch("src.orchestrators.advanced.check_magentic_requirements"),
-                 patch("src.orchestrators.advanced.create_search_agent"),
-                 patch("src.orchestrators.advanced.create_judge_agent"),
-                 patch("src.orchestrators.advanced.create_hypothesis_agent"),
-                 patch("src.orchestrators.advanced.create_report_agent"),
-             ):
-                 # Explicit mode="advanced" - tests the explicit path, not auto-detect
-                 orch = create_orchestrator(mode="advanced")
-                 assert orch is not None
tests/integration/test_simple_mode_synthesis.py DELETED
@@ -1,157 +0,0 @@
- from unittest.mock import AsyncMock
-
- import pytest
-
- from src.orchestrators.simple import Orchestrator
- from src.utils.models import (
-     AssessmentDetails,
-     Citation,
-     Evidence,
-     JudgeAssessment,
-     OrchestratorConfig,
-     SearchResult,
- )
-
-
- def make_evidence(title: str) -> Evidence:
-     return Evidence(
-         content="content",
-         citation=Citation(title=title, url="http://test.com", date="2025", source="pubmed"),
-     )
-
-
- @pytest.mark.integration
- @pytest.mark.asyncio
- async def test_simple_mode_synthesizes_before_max_iterations():
-     """Verify simple mode produces useful output with mocked judge."""
-     # Mock search to return evidence
-     mock_search = AsyncMock()
-     mock_search.execute.return_value = SearchResult(
-         query="test query",
-         evidence=[make_evidence(f"Paper {i}") for i in range(5)],
-         errors=[],
-         sources_searched=["pubmed"],
-         total_found=5,
-     )
-
-     # Mock judge to return GOOD scores eventually
-     # We can use MockJudgeHandler or a pure mock. Let's use a pure mock to control scores precisely.
-     mock_judge = AsyncMock()
-     # Since mock_judge has 'synthesize' attr by default (as a Mock),
-     # simple mode uses free-tier path.
-     # We must mock the return value of synthesize to simulate a successful narrative generation.
-     mock_judge.synthesize.return_value = "This is a synthesized report for MagicDrug."
-
-     # Iteration 1: Low scores
-     assess_1 = JudgeAssessment(
-         details=AssessmentDetails(
-             mechanism_score=2,
-             mechanism_reasoning="reasoning is sufficient for valid model",
-             clinical_evidence_score=2,
-             clinical_reasoning="reasoning is sufficient for valid model",
-             drug_candidates=[],
-             key_findings=[],
-         ),
-         sufficient=False,
-         confidence=0.5,
-         recommendation="continue",
-         next_search_queries=["q2"],
-         reasoning="need more evidence to support conclusions about this topic",
-     )
-
-     # Iteration 2: High scores (should trigger synthesis)
-     assess_2 = JudgeAssessment(
-         details=AssessmentDetails(
-             mechanism_score=8,
-             mechanism_reasoning="reasoning is sufficient for valid model",
-             clinical_evidence_score=7,
-             clinical_reasoning="reasoning is sufficient for valid model",
-             drug_candidates=["MagicDrug"],
-             key_findings=["It works"],
-         ),
-         sufficient=False,  # Judge is conservative
-         confidence=0.9,
-         recommendation="continue",  # Judge still says continue (simulating bias)
-         next_search_queries=[],
-         reasoning="good scores but maybe more evidence needed technically",
-     )
-
-     mock_judge.assess.side_effect = [assess_1, assess_2]
-
-     orchestrator = Orchestrator(
-         search_handler=mock_search,
-         judge_handler=mock_judge,
-         config=OrchestratorConfig(max_iterations=5),
-     )
-
-     events = []
-     async for event in orchestrator.run("test query"):
-         events.append(event)
-         if event.type == "complete":
-             break
-
-     # Must have synthesis with drug candidates
-     complete_events = [e for e in events if e.type == "complete"]
-     assert len(complete_events) == 1
-     complete_event = complete_events[0]
-
-     assert "MagicDrug" in complete_event.message
-     # SPEC_12: LLM synthesis produces narrative prose, not template with "Drug Candidates" header
-     # Check for narrative structure (LLM may omit ### prefix) OR template fallback
-     assert (
-         "Executive Summary" in complete_event.message
-         or "Drug Candidates" in complete_event.message
-         or "synthesized report" in complete_event.message
-     )
-     assert complete_event.data.get("synthesis_reason") == "high_scores_with_candidates"
-     assert complete_event.iteration == 2  # Should stop at it 2
-
-
- @pytest.mark.integration
- @pytest.mark.asyncio
- async def test_partial_synthesis_generation():
-     """Verify partial synthesis includes drug candidates even if max iterations reached."""
-     mock_search = AsyncMock()
-     mock_search.execute.return_value = SearchResult(
-         query="test", evidence=[], errors=[], sources_searched=["pubmed"], total_found=0
-     )
-
-     mock_judge = AsyncMock()
-     # Always return low scores but WITH candidates
-     # Scores 3+3 = 6 < 8 (late threshold), so it should NOT synthesize early
-     mock_judge.assess.return_value = JudgeAssessment(
-         details=AssessmentDetails(
-             mechanism_score=3,
-             mechanism_reasoning="reasoning is sufficient for valid model",
-             clinical_evidence_score=3,
-             clinical_reasoning="reasoning is sufficient for valid model",
-             drug_candidates=["PartialDrug"],
-             key_findings=["Partial finding"],
-         ),
-         sufficient=False,
-         confidence=0.5,
-         recommendation="continue",
-         next_search_queries=[],
-         reasoning="keep going to find more evidence about this topic please",
-     )
-
-     orchestrator = Orchestrator(
-         search_handler=mock_search,
-         judge_handler=mock_judge,
-         config=OrchestratorConfig(max_iterations=2),
-     )
-
-     events = []
-     async for event in orchestrator.run("test"):
-         events.append(event)
-
-     complete_events = [e for e in events if e.type == "complete"]
-     assert len(complete_events) == 1, (
-         f"Expected exactly one complete event, got {len(complete_events)}"
-     )
-     complete_event = complete_events[0]
-     assert complete_event.data.get("max_reached") is True
-
-     # The output message should contain the drug candidate from the last assessment
-     assert "PartialDrug" in complete_event.message
-     assert "Maximum iterations reached" in complete_event.message
tests/unit/agents/test_magentic_agents_domain.py CHANGED
@@ -13,8 +13,8 @@ from src.config.domain import SEXUAL_HEALTH_CONFIG, ResearchDomain

  class TestMagenticAgentsDomain:
      @patch("src.agents.magentic_agents.ChatAgent")
-     @patch("src.agents.magentic_agents.OpenAIChatClient")
-     def test_create_search_agent_uses_domain(self, mock_client, mock_agent_cls):
+     @patch("src.agents.magentic_agents.get_chat_client")
+     def test_create_search_agent_uses_domain(self, mock_get_client, mock_agent_cls):
          create_search_agent(domain=ResearchDomain.SEXUAL_HEALTH)

          # Check instructions or description passed to ChatAgent
@@ -23,8 +23,8 @@ class TestMagenticAgentsDomain:
          # Ideally check instructions too if we update them

      @patch("src.agents.magentic_agents.ChatAgent")
-     @patch("src.agents.magentic_agents.OpenAIChatClient")
-     def test_create_judge_agent_uses_domain(self, mock_client, mock_agent_cls):
+     @patch("src.agents.magentic_agents.get_chat_client")
+     def test_create_judge_agent_uses_domain(self, mock_get_client, mock_agent_cls):
          create_judge_agent(domain=ResearchDomain.SEXUAL_HEALTH)

          # Verify domain-specific judge system prompt is passed through
@@ -32,15 +32,15 @@ class TestMagenticAgentsDomain:
          assert SEXUAL_HEALTH_CONFIG.judge_system_prompt in call_kwargs["instructions"]

      @patch("src.agents.magentic_agents.ChatAgent")
-     @patch("src.agents.magentic_agents.OpenAIChatClient")
-     def test_create_hypothesis_agent_uses_domain(self, mock_client, mock_agent_cls):
+     @patch("src.agents.magentic_agents.get_chat_client")
+     def test_create_hypothesis_agent_uses_domain(self, mock_get_client, mock_agent_cls):
          create_hypothesis_agent(domain=ResearchDomain.SEXUAL_HEALTH)
          call_kwargs = mock_agent_cls.call_args.kwargs
          assert SEXUAL_HEALTH_CONFIG.hypothesis_agent_description in call_kwargs["description"]

      @patch("src.agents.magentic_agents.ChatAgent")
-     @patch("src.agents.magentic_agents.OpenAIChatClient")
-     def test_create_report_agent_uses_domain(self, mock_client, mock_agent_cls):
+     @patch("src.agents.magentic_agents.get_chat_client")
+     def test_create_report_agent_uses_domain(self, mock_get_client, mock_agent_cls):
          create_report_agent(domain=ResearchDomain.SEXUAL_HEALTH)
          # Check instructions contains domain prompt
          call_kwargs = mock_agent_cls.call_args.kwargs
tests/unit/agents/test_magentic_judge_termination.py CHANGED
@@ -1,6 +1,6 @@
- """Tests for Magentic Judge termination logic."""
+ """Tests for Magentic Judge termination logic (SPEC-16)."""

- from unittest.mock import patch
+ from unittest.mock import MagicMock, patch

  import pytest

@@ -8,18 +8,20 @@ from src.agents.magentic_agents import create_judge_agent

  pytestmark = pytest.mark.unit

+ # Skip if agent-framework-core not installed
+ pytest.importorskip("agent_framework")
+

  def test_judge_agent_has_termination_instructions() -> None:
      """Judge agent must be created with explicit instructions for early termination."""
      with patch("src.agents.magentic_agents.get_domain_config") as mock_config:
-         # Mock config to return empty strings so we test the hardcoded critical section
-         mock_config.return_value.judge_system_prompt = ""
+         # Mock config to return test prompts
+         mock_config.return_value.judge_system_prompt = "Test judge prompt"

-         with patch("src.agents.magentic_agents.ChatAgent") as mock_chat_agent_cls:
-             with patch("src.agents.magentic_agents.settings") as mock_settings:
-                 mock_settings.openai_api_key = "sk-dummy"
-                 mock_settings.openai_model = "gpt-4"
+         with patch("src.agents.magentic_agents.get_chat_client") as mock_client:
+             mock_client.return_value = MagicMock()

+             with patch("src.agents.magentic_agents.ChatAgent") as mock_chat_agent_cls:
                  create_judge_agent()

                  # Verify ChatAgent was initialized with correct instructions
@@ -27,7 +29,7 @@ def test_judge_agent_has_termination_instructions() -> None:
                  call_kwargs = mock_chat_agent_cls.call_args.kwargs
                  instructions = call_kwargs.get("instructions", "")

-                 # Verify critical sections from Solution B
+                 # Verify critical sections for SPEC-15 termination
                  assert "CRITICAL OUTPUT FORMAT" in instructions
                  assert "SUFFICIENT EVIDENCE" in instructions
                  assert "confidence >= 70%" in instructions
@@ -36,13 +38,23 @@


  def test_judge_agent_uses_reasoning_temperature() -> None:
-     """Judge agent should be initialized with temperature=1.0."""
-     with patch("src.agents.magentic_agents.ChatAgent") as mock_chat_agent_cls:
-         with patch("src.agents.magentic_agents.settings") as mock_settings:
-             mock_settings.openai_api_key = "sk-dummy"
-             mock_settings.openai_model = "gpt-4"
+     """Judge agent should be initialized with temperature=1.0 for reasoning models."""
+     with patch("src.agents.magentic_agents.get_chat_client") as mock_client:
+         mock_client.return_value = MagicMock()

+         with patch("src.agents.magentic_agents.ChatAgent") as mock_chat_agent_cls:
              create_judge_agent()

              call_kwargs = mock_chat_agent_cls.call_args.kwargs
              assert call_kwargs.get("temperature") == 1.0
+
+
+ def test_judge_agent_accepts_custom_chat_client() -> None:
+     """Judge agent should accept custom chat_client parameter (SPEC-16)."""
+     custom_client = MagicMock()
+
+     with patch("src.agents.magentic_agents.ChatAgent") as mock_chat_agent_cls:
+         create_judge_agent(chat_client=custom_client)
+
+     call_kwargs = mock_chat_agent_cls.call_args.kwargs
+     assert call_kwargs.get("chat_client") == custom_client
tests/unit/clients/__init__.py ADDED
@@ -0,0 +1 @@
+ # Tests for src/clients/ package
tests/unit/clients/test_chat_client_factory.py ADDED
@@ -0,0 +1,211 @@
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
1
+ """Unit tests for ChatClientFactory (SPEC-16: Unified Architecture)."""
2
+
3
+ from unittest.mock import MagicMock, patch
4
+
5
+ import pytest
6
+
7
+ # Skip if agent-framework-core not installed
8
+ pytest.importorskip("agent_framework")
9
+
10
+
11
+ @pytest.mark.unit
12
+ class TestChatClientFactory:
13
+ """Test get_chat_client() factory function."""
14
+
15
+ def test_returns_openai_client_when_openai_key_available(self) -> None:
16
+ """When OpenAI key is available, should return OpenAIChatClient."""
17
+ with patch("src.clients.factory.settings") as mock_settings:
18
+ mock_settings.has_openai_key = True
19
+ mock_settings.has_gemini_key = False
20
+ mock_settings.openai_api_key = "sk-test-key"
21
+ mock_settings.openai_model = "gpt-5"
22
+
23
+ from src.clients.factory import get_chat_client
24
+
25
+ client = get_chat_client()
26
+
27
+ # Should be OpenAIChatClient
28
+ assert "OpenAI" in type(client).__name__
29
+
30
+ def test_returns_huggingface_client_when_no_key_available(self) -> None:
31
+ """When no API key is available, should return HuggingFaceChatClient (free tier)."""
32
+ with patch("src.clients.factory.settings") as mock_settings:
33
+ mock_settings.has_openai_key = False
34
+ mock_settings.has_gemini_key = False
35
+ mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
36
+ mock_settings.hf_token = None
37
+
38
+ from src.clients.factory import get_chat_client
39
+
40
+ client = get_chat_client()
41
+
42
+ # Should be HuggingFaceChatClient
43
+ assert "HuggingFace" in type(client).__name__
44
+
45
+ def test_explicit_provider_openai_overrides_auto_detection(self) -> None:
46
+ """Explicit provider='openai' should use OpenAI even if no env key."""
47
+ with patch("src.clients.factory.settings") as mock_settings:
48
+ mock_settings.has_openai_key = False
49
+ mock_settings.has_gemini_key = False
50
+ mock_settings.openai_api_key = None
51
+ mock_settings.openai_model = "gpt-5"
52
+
53
+ from src.clients.factory import get_chat_client
54
+
55
+ # Explicit provider with api_key parameter
56
+ client = get_chat_client(provider="openai", api_key="sk-explicit-key")
57
+
58
+ assert "OpenAI" in type(client).__name__
59
+
60
+ def test_explicit_provider_huggingface(self) -> None:
61
+ """Explicit provider='huggingface' should use HuggingFace."""
62
+ with patch("src.clients.factory.settings") as mock_settings:
63
+ mock_settings.has_openai_key = True # Even with OpenAI key available
64
+ mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
65
+ mock_settings.hf_token = None
66
+
67
+ from src.clients.factory import get_chat_client
68
+
69
+ # Explicit provider forces HuggingFace
70
+ client = get_chat_client(provider="huggingface")
71
+
72
+ assert "HuggingFace" in type(client).__name__
73
+
74
+ def test_gemini_provider_raises_not_implemented(self) -> None:
75
+ """Explicit provider='gemini' should raise NotImplementedError (Phase 4)."""
76
+ with patch("src.clients.factory.settings") as mock_settings:
77
+ mock_settings.has_openai_key = False
78
+ mock_settings.has_gemini_key = False
79
+
80
+ from src.clients.factory import get_chat_client
81
+
82
+ with pytest.raises(NotImplementedError, match="Gemini client not yet implemented"):
83
+ get_chat_client(provider="gemini")
84
+
85
+ def test_unsupported_provider_raises_value_error(self) -> None:
86
+ """Unsupported provider should raise ValueError, not silently fallback."""
87
+ with patch("src.clients.factory.settings") as mock_settings:
88
+ mock_settings.has_openai_key = False
89
+ mock_settings.has_gemini_key = False
90
+
91
+ from src.clients.factory import get_chat_client
92
+
93
+ with pytest.raises(ValueError, match="Unsupported provider"):
94
+ get_chat_client(provider="anthropic")
95
+
96
+ def test_provider_is_case_insensitive(self) -> None:
97
+ """Provider matching should be case-insensitive."""
98
+ with patch("src.clients.factory.settings") as mock_settings:
99
+ mock_settings.has_openai_key = False
100
+ mock_settings.has_gemini_key = False
101
+ mock_settings.openai_api_key = None
102
+ mock_settings.openai_model = "gpt-5"
103
+
104
+ from src.clients.factory import get_chat_client
105
+
106
+ # "OpenAI" should work same as "openai"
107
+ client = get_chat_client(provider="OpenAI", api_key="sk-test")
108
+ assert "OpenAI" in type(client).__name__
109
+
110
+ # "HUGGINGFACE" should work same as "huggingface"
111
+ mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
112
+ mock_settings.hf_token = None
113
+ client = get_chat_client(provider="HUGGINGFACE")
114
+ assert "HuggingFace" in type(client).__name__
115
+
116
+
117
+ @pytest.mark.unit
118
+ class TestHuggingFaceChatClient:
119
+ """Test HuggingFaceChatClient adapter."""
120
+
121
+ def test_initialization_with_defaults(self) -> None:
122
+ """Should initialize with default model from settings."""
123
+ with patch("src.clients.huggingface.settings") as mock_settings:
124
+ mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
125
+ mock_settings.hf_token = None
126
+
127
+ from src.clients.huggingface import HuggingFaceChatClient
128
+
129
+ client = HuggingFaceChatClient()
+
+            assert client.model_id == "meta-llama/Llama-3.1-70B-Instruct"
+
+    def test_initialization_with_custom_model(self) -> None:
+        """Should accept custom model_id."""
+        with patch("src.clients.huggingface.settings") as mock_settings:
+            mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
+            mock_settings.hf_token = None
+
+            from src.clients.huggingface import HuggingFaceChatClient
+
+            client = HuggingFaceChatClient(model_id="mistralai/Mistral-7B-Instruct-v0.3")
+
+            assert client.model_id == "mistralai/Mistral-7B-Instruct-v0.3"
+
+    def test_convert_messages_basic(self) -> None:
+        """Should convert ChatMessage list to HuggingFace format."""
+        with patch("src.clients.huggingface.settings") as mock_settings:
+            mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
+            mock_settings.hf_token = None
+
+            from agent_framework import ChatMessage
+
+            from src.clients.huggingface import HuggingFaceChatClient
+
+            client = HuggingFaceChatClient()
+
+            # Create mock messages
+            messages = [
+                MagicMock(spec=ChatMessage, role="user", text="Hello"),
+                MagicMock(spec=ChatMessage, role="assistant", text="Hi there!"),
+            ]
+
+            result = client._convert_messages(messages)
+
+            assert len(result) == 2
+            assert result[0] == {"role": "user", "content": "Hello"}
+            assert result[1] == {"role": "assistant", "content": "Hi there!"}
+
+    def test_convert_messages_handles_role_enum(self) -> None:
+        """Should extract .value from Role enum, not stringify the enum itself."""
+        with patch("src.clients.huggingface.settings") as mock_settings:
+            mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
+            mock_settings.hf_token = None
+
+            from enum import Enum
+
+            from agent_framework import ChatMessage
+
+            from src.clients.huggingface import HuggingFaceChatClient
+
+            # Simulate a Role enum like agent_framework might use
+            class Role(Enum):
+                USER = "user"
+                ASSISTANT = "assistant"
+
+            client = HuggingFaceChatClient()
+
+            # Create mock message with enum role
+            mock_msg = MagicMock(spec=ChatMessage)
+            mock_msg.role = Role.USER  # Enum, not string
+            mock_msg.text = "Hello"
+
+            result = client._convert_messages([mock_msg])
+
+            # Should be "user", NOT "Role.USER"
+            assert result[0]["role"] == "user"
+            assert "Role" not in result[0]["role"]
+
+    def test_inherits_from_base_chat_client(self) -> None:
+        """Should inherit from agent_framework.BaseChatClient."""
+        with patch("src.clients.huggingface.settings") as mock_settings:
+            mock_settings.huggingface_model = "meta-llama/Llama-3.1-70B-Instruct"
+            mock_settings.hf_token = None
+
+            from agent_framework import BaseChatClient
+
+            from src.clients.huggingface import HuggingFaceChatClient
+
+            client = HuggingFaceChatClient()
+
+            assert isinstance(client, BaseChatClient)
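The Role-enum test above pins down the central fix in this PR: `Role.USER.value` yields `"user"`, while `str(Role.USER)` yields `"Role.USER"`. A minimal sketch of the conversion behavior these tests assume — the real `HuggingFaceChatClient._convert_messages` is not shown in this diff, so the standalone function and its signature here are illustrative:

```python
from enum import Enum
from typing import Any


def convert_messages(messages: list[Any]) -> list[dict[str, str]]:
    """Illustrative stand-in for HuggingFaceChatClient._convert_messages.

    Roles may arrive as plain strings or as Role enum members, so we
    read .value when the role is an Enum instead of stringifying it
    (str(Role.USER) would produce "Role.USER", not "user").
    """
    converted = []
    for msg in messages:
        role = msg.role
        if isinstance(role, Enum):
            role = role.value  # "user", not "Role.USER"
        converted.append({"role": str(role), "content": msg.text})
    return converted
```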
tests/unit/orchestrators/test_advanced_orchestrator.py CHANGED
@@ -1,6 +1,6 @@
 """Tests for AdvancedOrchestrator configuration."""
 
-from unittest.mock import patch
+from unittest.mock import MagicMock, patch
 
 import pytest
 from pydantic import ValidationError
@@ -13,29 +13,33 @@ from src.utils.config import Settings
 class TestAdvancedOrchestratorConfig:
     """Tests for configuration options."""
 
-    def test_default_max_rounds_is_five(self) -> None:
+    @patch("src.orchestrators.advanced.get_chat_client")
+    def test_default_max_rounds_is_five(self, mock_get_client) -> None:
         """Default max_rounds should be 5 from settings."""
-        with patch("src.orchestrators.advanced.check_magentic_requirements"):
-            orch = AdvancedOrchestrator()
-            assert orch._max_rounds == 5
+        mock_get_client.return_value = MagicMock()
+        orch = AdvancedOrchestrator()
+        assert orch._max_rounds == 5
 
-    def test_explicit_max_rounds_overrides_settings(self) -> None:
+    @patch("src.orchestrators.advanced.get_chat_client")
+    def test_explicit_max_rounds_overrides_settings(self, mock_get_client) -> None:
         """Explicit parameter should override settings."""
-        with patch("src.orchestrators.advanced.check_magentic_requirements"):
-            orch = AdvancedOrchestrator(max_rounds=7)
-            assert orch._max_rounds == 7
+        mock_get_client.return_value = MagicMock()
+        orch = AdvancedOrchestrator(max_rounds=7)
+        assert orch._max_rounds == 7
 
-    def test_timeout_default_is_five_minutes(self) -> None:
+    @patch("src.orchestrators.advanced.get_chat_client")
+    def test_timeout_default_is_five_minutes(self, mock_get_client) -> None:
         """Default timeout should be 300s (5 min) from settings."""
-        with patch("src.orchestrators.advanced.check_magentic_requirements"):
-            orch = AdvancedOrchestrator()
-            assert orch._timeout_seconds == 300.0
+        mock_get_client.return_value = MagicMock()
+        orch = AdvancedOrchestrator()
+        assert orch._timeout_seconds == 300.0
 
-    def test_explicit_timeout_overrides_settings(self) -> None:
+    @patch("src.orchestrators.advanced.get_chat_client")
+    def test_explicit_timeout_overrides_settings(self, mock_get_client) -> None:
         """Explicit timeout parameter should override settings."""
-        with patch("src.orchestrators.advanced.check_magentic_requirements"):
-            orch = AdvancedOrchestrator(timeout_seconds=120.0)
-            assert orch._timeout_seconds == 120.0
+        mock_get_client.return_value = MagicMock()
+        orch = AdvancedOrchestrator(timeout_seconds=120.0)
+        assert orch._timeout_seconds == 120.0
 
 
 @pytest.mark.unit
tests/unit/orchestrators/test_advanced_orchestrator_domain.py CHANGED
@@ -7,45 +7,40 @@ from src.orchestrators.advanced import AdvancedOrchestrator
 
 
 class TestAdvancedOrchestratorDomain:
-    @patch("src.orchestrators.advanced.check_magentic_requirements")
-    @patch("src.orchestrators.advanced.OpenAIChatClient")
-    def test_advanced_orchestrator_accepts_domain(self, mock_client, mock_check):
+    @patch("src.orchestrators.advanced.get_chat_client")
+    def test_advanced_orchestrator_accepts_domain(self, mock_get_client):
         # Mock to avoid API key validation
-        mock_client.return_value = MagicMock()
+        mock_client = MagicMock()
+        mock_get_client.return_value = mock_client
+
         orch = AdvancedOrchestrator(domain=ResearchDomain.SEXUAL_HEALTH, api_key="sk-test")
         assert orch.domain == ResearchDomain.SEXUAL_HEALTH
 
-    @patch("src.orchestrators.advanced.check_magentic_requirements")
     @patch("src.orchestrators.advanced.create_search_agent")
     @patch("src.orchestrators.advanced.create_judge_agent")
     @patch("src.orchestrators.advanced.create_hypothesis_agent")
     @patch("src.orchestrators.advanced.create_report_agent")
     @patch("src.orchestrators.advanced.MagenticBuilder")
-    @patch("src.orchestrators.advanced.OpenAIChatClient")
+    @patch("src.orchestrators.advanced.get_chat_client")
     def test_build_workflow_uses_domain(
         self,
-        mock_client,
+        mock_get_client,
         mock_builder,
         mock_create_report,
         mock_create_hypothesis,
         mock_create_judge,
         mock_create_search,
-        mock_check,
     ):
-        mock_client.return_value = MagicMock()
+        mock_client = MagicMock()
+        mock_get_client.return_value = mock_client
+
        orch = AdvancedOrchestrator(domain=ResearchDomain.SEXUAL_HEALTH, api_key="sk-test")
 
         # Call private method to verify agent creation calls
         orch._build_workflow()
 
-        # Verify agents created with domain
-        mock_create_search.assert_called_with(
-            orch._chat_client, domain=ResearchDomain.SEXUAL_HEALTH
-        )
-        mock_create_judge.assert_called_with(orch._chat_client, domain=ResearchDomain.SEXUAL_HEALTH)
-        mock_create_hypothesis.assert_called_with(
-            orch._chat_client, domain=ResearchDomain.SEXUAL_HEALTH
-        )
-        mock_create_report.assert_called_with(
-            orch._chat_client, domain=ResearchDomain.SEXUAL_HEALTH
-        )
+        # Verify agents created with domain and correct client
+        mock_create_search.assert_called_with(mock_client, domain=ResearchDomain.SEXUAL_HEALTH)
+        mock_create_judge.assert_called_with(mock_client, domain=ResearchDomain.SEXUAL_HEALTH)
+        mock_create_hypothesis.assert_called_with(mock_client, domain=ResearchDomain.SEXUAL_HEALTH)
+        mock_create_report.assert_called_with(mock_client, domain=ResearchDomain.SEXUAL_HEALTH)
tests/unit/orchestrators/test_factory_domain.py CHANGED
@@ -1,14 +1,16 @@
 """Tests for Orchestrator Factory domain support."""
 
-from unittest.mock import ANY, MagicMock, patch
+from unittest.mock import MagicMock, patch
 
 from src.config.domain import ResearchDomain
 from src.orchestrators.factory import create_orchestrator
 
 
 class TestFactoryDomain:
-    @patch("src.orchestrators.factory.Orchestrator")
-    def test_create_simple_uses_domain(self, mock_simple_cls):
+    @patch("src.orchestrators.factory._get_advanced_orchestrator_class")
+    def test_create_simple_maps_to_advanced_with_domain(self, mock_get_cls):
+        mock_adv_cls = MagicMock()
+        mock_get_cls.return_value = mock_adv_cls
         mock_search = MagicMock()
         mock_judge = MagicMock()
 
@@ -19,12 +21,8 @@ class TestFactoryDomain:
             domain=ResearchDomain.SEXUAL_HEALTH,
         )
 
-        mock_simple_cls.assert_called_with(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-            config=ANY,
-            domain=ResearchDomain.SEXUAL_HEALTH,
-        )
+        call_kwargs = mock_adv_cls.call_args.kwargs
+        assert call_kwargs["domain"] == ResearchDomain.SEXUAL_HEALTH
 
     @patch("src.orchestrators.factory._get_advanced_orchestrator_class")
     def test_create_advanced_uses_domain(self, mock_get_cls):
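The renamed factory test above asserts that a "simple" request now lands on the advanced orchestrator class. A minimal sketch of the SPEC-16 factory behavior these tests imply — the real `create_orchestrator` and `_get_advanced_orchestrator_class` in `src.orchestrators.factory` are not shown in this diff, so the bodies here are hypothetical stand-ins:

```python
from typing import Any


def _get_advanced_orchestrator_class() -> type:
    # Stand-in for the real lazy import of AdvancedOrchestrator.
    class _DummyAdvancedOrchestrator:
        def __init__(self, **kwargs: object) -> None:
            self.kwargs = kwargs

    return _DummyAdvancedOrchestrator


def create_orchestrator(mode: str = "advanced", **kwargs: Any) -> Any:
    """Illustrative sketch of the unified factory the tests assume.

    SPEC-16 deprecates Simple Mode: "simple" is mapped onto the unified
    advanced orchestrator, and unknown modes are rejected rather than
    silently falling back.
    """
    if mode == "simple":
        mode = "advanced"  # SPEC-16: Simple Mode is an alias for Advanced
    if mode != "advanced":
        raise ValueError(f"Unsupported orchestrator mode: {mode!r}")
    orchestrator_cls = _get_advanced_orchestrator_class()
    return orchestrator_cls(**kwargs)
```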
tests/unit/orchestrators/test_simple_orchestrator_domain.py DELETED
@@ -1,47 +0,0 @@
-"""Tests for Orchestrator (Simple) domain support."""
-
-from unittest.mock import MagicMock
-
-from src.config.domain import SEXUAL_HEALTH_CONFIG, ResearchDomain
-from src.orchestrators.simple import Orchestrator
-
-
-class TestSimpleOrchestratorDomain:
-    def test_orchestrator_accepts_domain(self):
-        mock_search = MagicMock()
-        mock_judge = MagicMock()
-
-        orch = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-            domain=ResearchDomain.SEXUAL_HEALTH,
-        )
-
-        assert orch.domain == ResearchDomain.SEXUAL_HEALTH
-        assert orch.domain_config.name == SEXUAL_HEALTH_CONFIG.name
-
-    def test_orchestrator_uses_domain_title_in_synthesis(self):
-        mock_search = MagicMock()
-        mock_judge = MagicMock()
-
-        orch = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-            domain=ResearchDomain.SEXUAL_HEALTH,
-        )
-
-        # Test _generate_template_synthesis (the sync fallback method)
-        mock_assessment = MagicMock()
-        mock_assessment.details.drug_candidates = []
-        mock_assessment.details.key_findings = []
-        mock_assessment.confidence = 0.5
-        mock_assessment.reasoning = "test"
-        mock_assessment.details.mechanism_score = 5
-        mock_assessment.details.clinical_evidence_score = 5
-
-        report = orch._generate_template_synthesis("query", [], mock_assessment)
-        assert "## Sexual Health Analysis" in report
-
-        # Test _generate_partial_synthesis
-        report_partial = orch._generate_partial_synthesis("query", [])
-        assert "## Sexual Health Analysis" in report_partial
tests/unit/orchestrators/test_simple_synthesis.py DELETED
@@ -1,320 +0,0 @@
-"""Tests for simple orchestrator LLM synthesis."""
-
-from unittest.mock import AsyncMock, MagicMock, patch
-
-import pytest
-
-from src.orchestrators.simple import Orchestrator
-from src.utils.models import AssessmentDetails, Citation, Evidence, JudgeAssessment
-
-
-@pytest.fixture
-def sample_evidence() -> list[Evidence]:
-    """Sample evidence for testing synthesis."""
-    return [
-        Evidence(
-            content="Testosterone therapy demonstrates efficacy in treating HSDD.",
-            citation=Citation(
-                source="pubmed",
-                title="Testosterone and Female Sexual Desire",
-                url="https://pubmed.ncbi.nlm.nih.gov/12345/",
-                date="2023",
-                authors=["Smith J", "Jones A"],
-            ),
-        ),
-        Evidence(
-            content="A meta-analysis of 8 RCTs shows significant improvement in sexual desire.",
-            citation=Citation(
-                source="pubmed",
-                title="Meta-analysis of Testosterone Therapy",
-                url="https://pubmed.ncbi.nlm.nih.gov/67890/",
-                date="2024",
-                authors=["Johnson B"],
-            ),
-        ),
-    ]
-
-
-@pytest.fixture
-def sample_assessment() -> JudgeAssessment:
-    """Sample assessment for testing synthesis."""
-    return JudgeAssessment(
-        sufficient=True,
-        confidence=0.85,
-        reasoning="Evidence is sufficient to synthesize findings on testosterone therapy for HSDD.",
-        recommendation="synthesize",
-        next_search_queries=[],
-        details=AssessmentDetails(
-            mechanism_score=8,
-            mechanism_reasoning="Strong evidence of androgen receptor activation pathway.",
-            clinical_evidence_score=7,
-            clinical_reasoning="Multiple RCTs support efficacy in postmenopausal HSDD.",
-            drug_candidates=["Testosterone", "LibiGel"],
-            key_findings=[
-                "Testosterone improves libido in postmenopausal women",
-                "Transdermal formulation has best safety profile",
-            ],
-        ),
-    )
-
-
-@pytest.mark.unit
-class TestGenerateSynthesis:
-    """Tests for _generate_synthesis method."""
-
-    @pytest.mark.asyncio
-    async def test_calls_llm_for_narrative(
-        self,
-        sample_evidence: list[Evidence],
-        sample_assessment: JudgeAssessment,
-    ) -> None:
-        """Synthesis should make an LLM call using pydantic_ai when judge is paid tier."""
-        mock_search = MagicMock()
-        # Paid tier JudgeHandler has 'assess' but NOT 'synthesize'
-        mock_judge = MagicMock(spec=["assess"])
-
-        orchestrator = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-        )
-        orchestrator.history = [{"iteration": 1}]  # Needed for footer
-
-        with (
-            patch("pydantic_ai.Agent") as mock_agent_class,
-            patch("src.agent_factory.judges.get_model") as mock_get_model,
-        ):
-            mock_model = MagicMock()
-            mock_get_model.return_value = mock_model
-
-            mock_agent = MagicMock()
-            mock_result = MagicMock()
-            mock_result.output = """### Executive Summary
-
-Testosterone therapy demonstrates consistent efficacy for HSDD treatment.
-
-### Background
-
-HSDD affects many postmenopausal women.
-
-### Evidence Synthesis
-
-Studies show significant improvement in sexual desire scores.
-
-### Recommendations
-
-1. Consider testosterone therapy for postmenopausal HSDD
-
-### Limitations
-
-Long-term safety data is limited.
-
-### References
-
-1. Smith J et al. (2023). Testosterone and Female Sexual Desire."""
-
-            mock_agent.run = AsyncMock(return_value=mock_result)
-            mock_agent_class.return_value = mock_agent
-
-            result = await orchestrator._generate_synthesis(
-                query="testosterone HSDD",
-                evidence=sample_evidence,
-                assessment=sample_assessment,
-            )
-
-            # Verify LLM agent was created and called
-            mock_agent_class.assert_called_once()
-            mock_agent.run.assert_called_once()
-
-            # Verify output includes narrative content
-            assert "Executive Summary" in result
-            assert "Background" in result
-            assert "Evidence Synthesis" in result
-
-    @pytest.mark.asyncio
-    async def test_uses_free_tier_synthesis_when_available(
-        self,
-        sample_evidence: list[Evidence],
-        sample_assessment: JudgeAssessment,
-    ) -> None:
-        """Synthesis should use judge's synthesize method when in Free Tier."""
-        mock_search = MagicMock()
-        # Free tier JudgeHandler has 'synthesize' method
-        mock_judge = MagicMock()
-        # Setup synthesize method
-        mock_judge.synthesize = AsyncMock(return_value="Free tier narrative content.")
-
-        orchestrator = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-        )
-        orchestrator.history = [{"iteration": 1}]
-
-        # We don't need to patch Agent or get_model because they shouldn't be called
-        result = await orchestrator._generate_synthesis(
-            query="test query",
-            evidence=sample_evidence,
-            assessment=sample_assessment,
-        )
-
-        # Verify judge's synthesize was called
-        mock_judge.synthesize.assert_called_once()
-
-        # Verify result contains the free tier content
-        assert "Free tier narrative content" in result
-        # Should still include footer
-        assert "Full Citation List" in result
-
-    @pytest.mark.asyncio
-    async def test_falls_back_on_llm_error_with_notice(
-        self,
-        sample_evidence: list[Evidence],
-        sample_assessment: JudgeAssessment,
-    ) -> None:
-        """Synthesis should fall back to template if LLM fails, WITH error notice."""
-        mock_search = MagicMock()
-        # Paid tier simulation
-        mock_judge = MagicMock(spec=["assess"])
-
-        orchestrator = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-        )
-        orchestrator.history = [{"iteration": 1}]
-
-        with patch("pydantic_ai.Agent") as mock_agent_class:
-            # Simulate LLM failure
-            mock_agent_class.side_effect = Exception("LLM unavailable")
-
-            result = await orchestrator._generate_synthesis(
-                query="testosterone HSDD",
-                evidence=sample_evidence,
-                assessment=sample_assessment,
-            )
-
-        # Should surface error to user (MS Agent Framework pattern)
-        assert "AI narrative synthesis unavailable" in result
-        assert "Error" in result
-
-        # Should still include template content
-        assert "Assessment" in result or "Drug Candidates" in result
-        assert "Testosterone" in result  # Drug candidate should be present
-
-    @pytest.mark.asyncio
-    async def test_includes_citation_footer(
-        self,
-        sample_evidence: list[Evidence],
-        sample_assessment: JudgeAssessment,
-    ) -> None:
-        """Synthesis should include full citation list footer."""
-        mock_search = MagicMock()
-        # Paid tier simulation
-        mock_judge = MagicMock(spec=["assess"])
-
-        orchestrator = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-        )
-        orchestrator.history = [{"iteration": 1}]
-
-        with (
-            patch("pydantic_ai.Agent") as mock_agent_class,
-            patch("src.agent_factory.judges.get_model"),
-        ):
-            mock_agent = MagicMock()
-            mock_result = MagicMock()
-            mock_result.output = "Narrative synthesis content."
-            mock_agent.run = AsyncMock(return_value=mock_result)
-            mock_agent_class.return_value = mock_agent
-
-            result = await orchestrator._generate_synthesis(
-                query="test query",
-                evidence=sample_evidence,
-                assessment=sample_assessment,
-            )
-
-        # Should include citation footer
-        assert "Full Citation List" in result
-        assert "pubmed.ncbi.nlm.nih.gov/12345" in result
-        assert "pubmed.ncbi.nlm.nih.gov/67890" in result
-
-
-@pytest.mark.unit
-class TestGenerateTemplateSynthesis:
-    """Tests for _generate_template_synthesis fallback method."""
-
-    def test_returns_structured_output(
-        self,
-        sample_evidence: list[Evidence],
-        sample_assessment: JudgeAssessment,
-    ) -> None:
-        """Template synthesis should return structured markdown."""
-        mock_search = MagicMock()
-        mock_judge = MagicMock()
-
-        orchestrator = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-        )
-        orchestrator.history = [{"iteration": 1}]
-
-        result = orchestrator._generate_template_synthesis(
-            query="testosterone HSDD",
-            evidence=sample_evidence,
-            assessment=sample_assessment,
-        )
-
-        # Should have all required sections
-        assert "Question" in result
-        assert "Drug Candidates" in result
-        assert "Key Findings" in result
-        assert "Assessment" in result
-        assert "Citations" in result
-
-    def test_includes_drug_candidates(
-        self,
-        sample_evidence: list[Evidence],
-        sample_assessment: JudgeAssessment,
-    ) -> None:
-        """Template synthesis should list drug candidates."""
-        mock_search = MagicMock()
-        mock_judge = MagicMock()
-
-        orchestrator = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-        )
-        orchestrator.history = [{"iteration": 1}]
-
-        result = orchestrator._generate_template_synthesis(
-            query="test",
-            evidence=sample_evidence,
-            assessment=sample_assessment,
-        )
-
-        assert "Testosterone" in result
-        assert "LibiGel" in result
-
-    def test_includes_scores(
-        self,
-        sample_evidence: list[Evidence],
-        sample_assessment: JudgeAssessment,
-    ) -> None:
-        """Template synthesis should include assessment scores."""
-        mock_search = MagicMock()
-        mock_judge = MagicMock()
-
-        orchestrator = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-        )
-        orchestrator.history = [{"iteration": 1}]
-
-        result = orchestrator._generate_template_synthesis(
-            query="test",
-            evidence=sample_evidence,
-            assessment=sample_assessment,
-        )
-
-        assert "8/10" in result  # Mechanism score
-        assert "7/10" in result  # Clinical score
-        assert "85%" in result  # Confidence
tests/unit/orchestrators/test_termination.py DELETED
@@ -1,104 +0,0 @@
-from typing import Literal
-from unittest.mock import MagicMock
-
-import pytest
-
-from src.orchestrators.simple import Orchestrator
-from src.utils.models import AssessmentDetails, JudgeAssessment
-
-
-def make_assessment(
-    mechanism: int,
-    clinical: int,
-    drug_candidates: list[str],
-    sufficient: bool = False,
-    recommendation: Literal["continue", "synthesize"] = "continue",
-    confidence: float = 0.8,
-) -> JudgeAssessment:
-    return JudgeAssessment(
-        details=AssessmentDetails(
-            mechanism_score=mechanism,
-            mechanism_reasoning="reasoning is sufficient for testing purposes",
-            clinical_evidence_score=clinical,
-            clinical_reasoning="reasoning is sufficient for testing purposes",
-            drug_candidates=drug_candidates,
-            key_findings=["finding"],
-        ),
-        sufficient=sufficient,
-        confidence=confidence,
-        recommendation=recommendation,
-        next_search_queries=[],
-        reasoning="reasoning is sufficient for testing purposes",
-    )
-
-
-@pytest.fixture
-def orchestrator():
-    search = MagicMock()
-    judge = MagicMock()
-    return Orchestrator(search, judge)
-
-
-@pytest.mark.unit
-def test_should_synthesize_high_scores(orchestrator):
-    """High scores with drug candidates triggers synthesis."""
-    assessment = make_assessment(mechanism=7, clinical=6, drug_candidates=["Testosterone"])
-
-    # Access the private method via name mangling or just call it if it was public.
-    # Since I made it private _should_synthesize, I access it directly.
-    should_synth, reason = orchestrator._should_synthesize(
-        assessment, iteration=3, max_iterations=10, evidence_count=50
-    )
-
-    assert should_synth is True
-    assert reason == "high_scores_with_candidates"
-
-
-@pytest.mark.unit
-def test_should_synthesize_late_iteration(orchestrator):
-    """Late iteration with acceptable scores triggers synthesis."""
-    assessment = make_assessment(mechanism=5, clinical=4, drug_candidates=[])
-    should_synth, reason = orchestrator._should_synthesize(
-        assessment, iteration=9, max_iterations=10, evidence_count=80
-    )
-
-    assert should_synth is True
-    assert reason in ["late_iteration_acceptable", "emergency_synthesis"]
-
-
-@pytest.mark.unit
-def test_should_not_synthesize_early_low_scores(orchestrator):
-    """Early iteration with low scores continues searching."""
-    assessment = make_assessment(mechanism=3, clinical=2, drug_candidates=[])
-    should_synth, reason = orchestrator._should_synthesize(
-        assessment, iteration=2, max_iterations=10, evidence_count=20
-    )
-
-    assert should_synth is False
-    assert reason == "continue_searching"
-
-
-@pytest.mark.unit
-def test_judge_approved_overrides_all(orchestrator):
-    """If judge explicitly says synthesize with good scores, do it."""
-    assessment = make_assessment(
-        mechanism=6, clinical=5, drug_candidates=[], sufficient=True, recommendation="synthesize"
-    )
-    should_synth, reason = orchestrator._should_synthesize(
-        assessment, iteration=2, max_iterations=10, evidence_count=20
-    )
-
-    assert should_synth is True
-    assert reason == "judge_approved"
-
-
-@pytest.mark.unit
-def test_max_evidence_threshold(orchestrator):
-    """Force synthesis if we have tons of evidence."""
-    assessment = make_assessment(mechanism=2, clinical=2, drug_candidates=[])
-    should_synth, reason = orchestrator._should_synthesize(
-        assessment, iteration=5, max_iterations=10, evidence_count=150
-    )
-
-    assert should_synth is True
-    assert reason == "max_evidence_reached"
tests/unit/test_app_domain.py CHANGED
@@ -1,82 +1,91 @@
1
- """Tests for App domain support."""
2
 
3
  from unittest.mock import ANY, MagicMock, patch
4
 
 
 
5
  from src.app import configure_orchestrator, research_agent
6
  from src.config.domain import ResearchDomain
7
 
 
 
8
 
9
  class TestAppDomain:
 
 
10
  @patch("src.app.create_orchestrator")
11
- @patch("src.app.MockJudgeHandler")
12
- def test_configure_orchestrator_passes_domain_mock_mode(self, mock_judge, mock_create):
13
- """Test domain is passed when using mock mode (unit test path)."""
14
- configure_orchestrator(use_mock=True, mode="simple", domain=ResearchDomain.SEXUAL_HEALTH)
 
 
 
 
 
 
 
15
 
16
- # MockJudgeHandler should receive domain
17
- mock_judge.assert_called_with(domain=ResearchDomain.SEXUAL_HEALTH)
18
  mock_create.assert_called_with(
19
- search_handler=ANY,
20
- judge_handler=ANY,
21
  config=ANY,
22
- mode="simple",
23
  api_key=None,
24
  domain=ResearchDomain.SEXUAL_HEALTH,
25
  )
26
 
27
- @patch.dict("os.environ", {}, clear=True)
28
- @patch("src.app.settings")
29
  @patch("src.app.create_orchestrator")
30
- @patch("src.app.HFInferenceJudgeHandler")
31
- def test_configure_orchestrator_passes_domain_free_tier(
32
- self, mock_hf_judge, mock_create, mock_settings
33
- ):
34
- """Test domain is passed when using free tier (no API keys)."""
35
- # Simulate no keys in settings
36
- mock_settings.has_openai_key = False
37
- mock_settings.has_anthropic_key = False
38
 
39
- configure_orchestrator(use_mock=False, mode="simple", domain=ResearchDomain.SEXUAL_HEALTH)
 
 
 
 
40
 
41
- # HFInferenceJudgeHandler should receive domain (no API keys = free tier)
42
- mock_hf_judge.assert_called_with(domain=ResearchDomain.SEXUAL_HEALTH)
43
  mock_create.assert_called_with(
44
- search_handler=ANY,
45
- judge_handler=ANY,
46
  config=ANY,
47
- mode="simple",
48
- api_key=None,
49
- domain=ResearchDomain.SEXUAL_HEALTH,
50
  )
51
 
 
52
  @patch("src.app.settings")
53
  @patch("src.app.configure_orchestrator")
54
  async def test_research_agent_passes_domain(self, mock_config, mock_settings):
 
55
  # Mock settings to have some state
56
  mock_settings.has_openai_key = False
57
  mock_settings.has_anthropic_key = False
58
 
59
  # Mock orchestrator
60
  mock_orch = MagicMock()
61
- mock_orch.run.return_value = [] # Async iterator?
62
 
63
- # To mock async generator
64
  async def async_gen(*args):
65
  if False:
66
  yield # Make it a generator
67
 
68
  mock_orch.run = async_gen
69
-
70
  mock_config.return_value = (mock_orch, "Test Backend")
71
 
72
- # Consume the generator from research_agent
73
  gen = research_agent(
74
- message="query", history=[], mode="simple", domain=ResearchDomain.SEXUAL_HEALTH
 
 
75
  )
76
 
77
  async for _ in gen:
78
  pass
79
 
 
80
  mock_config.assert_called_with(
81
- use_mock=False, mode="simple", user_api_key=None, domain=ResearchDomain.SEXUAL_HEALTH
 
 
 
82
  )
 
1
+ """Tests for App domain support (SPEC-16: Unified Architecture)."""
2
 
3
  from unittest.mock import ANY, MagicMock, patch
4
 
5
+ import pytest
6
+
7
  from src.app import configure_orchestrator, research_agent
8
  from src.config.domain import ResearchDomain
9
 
10
+ pytestmark = pytest.mark.unit
11
+
12
 
13
  class TestAppDomain:
14
+ """Test domain parameter handling in app.py."""
15
+
16
  @patch("src.app.create_orchestrator")
17
+    def test_configure_orchestrator_passes_domain(self, mock_create):
+        """Test domain is passed to create_orchestrator (SPEC-16: unified architecture)."""
+        # Mock return value
+        mock_orch = MagicMock()
+        mock_create.return_value = mock_orch
+
+        configure_orchestrator(
+            use_mock=False,
+            mode="advanced",  # SPEC-16: always advanced
+            domain=ResearchDomain.SEXUAL_HEALTH,
+        )

         mock_create.assert_called_with(
             config=ANY,
+            mode="advanced",
             api_key=None,
             domain=ResearchDomain.SEXUAL_HEALTH,
         )

     @patch("src.app.create_orchestrator")
+    def test_configure_orchestrator_with_api_key(self, mock_create):
+        """Test API key is passed through."""
+        mock_orch = MagicMock()
+        mock_create.return_value = mock_orch
+
+        configure_orchestrator(
+            use_mock=False,
+            user_api_key="sk-test-key",
+            domain="sexual_health",
+        )

         mock_create.assert_called_with(
             config=ANY,
+            mode="advanced",
+            api_key="sk-test-key",
+            domain="sexual_health",
         )

+    @pytest.mark.asyncio
     @patch("src.app.settings")
     @patch("src.app.configure_orchestrator")
     async def test_research_agent_passes_domain(self, mock_config, mock_settings):
+        """Test research_agent passes domain to configure_orchestrator."""
         # Mock settings to have some state
         mock_settings.has_openai_key = False
         mock_settings.has_anthropic_key = False

         # Mock orchestrator
         mock_orch = MagicMock()

+        # Mock async generator
         async def async_gen(*args):
             if False:
                 yield  # Make it a generator

         mock_orch.run = async_gen
         mock_config.return_value = (mock_orch, "Test Backend")

+        # SPEC-16: mode parameter removed from research_agent
         gen = research_agent(
+            message="query",
+            history=[],
+            domain=ResearchDomain.SEXUAL_HEALTH.value,
         )

         async for _ in gen:
             pass

+        # SPEC-16: mode is always "advanced"
         mock_config.assert_called_with(
+            use_mock=False,
+            mode="advanced",
+            user_api_key=None,
+            domain=ResearchDomain.SEXUAL_HEALTH.value,
         )
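The assertions above lean on `unittest.mock.ANY`, which matches any value for a keyword argument whose exact contents the test does not care about (here, `config`). A minimal standalone sketch of the pattern, with hypothetical stand-in names rather than the real `src.app` wiring:

```python
from unittest.mock import ANY, MagicMock

# Hypothetical stand-in for configure_orchestrator: it forwards fixed
# kwargs plus a config object to an injected factory.
def configure(factory, domain):
    return factory(config={"max_iterations": 5}, mode="advanced", api_key=None, domain=domain)

mock_create = MagicMock()
configure(mock_create, domain="sexual_health")

# ANY matches the config kwarg regardless of its actual value, so the
# assertion pins down only the arguments the test is about.
mock_create.assert_called_with(config=ANY, mode="advanced", api_key=None, domain="sexual_health")
```

`assert_called_with` raises `AssertionError` on any mismatch, so a passing run is itself the verification.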
tests/unit/test_gradio_crash.py CHANGED
@@ -36,10 +36,10 @@ async def test_research_agent_handles_none_parameters():
     try:
         # This should NOT raise AttributeError: 'NoneType' object has no attribute 'strip'
         results = []
+        # SPEC-16: mode parameter removed (unified architecture)
         async for result in research_agent(
             message="test query",
             history=[],
-            mode="simple",
             api_key=None,  # Simulating Gradio passing None
             api_key_state=None,  # Simulating Gradio passing None
         ):
@@ -71,10 +71,10 @@ async def test_research_agent_handles_empty_string_parameters():

     try:
         results = []
+        # SPEC-16: mode parameter removed (unified architecture)
         async for result in research_agent(
             message="test query",
             history=[],
-            mode="simple",
             api_key="",  # Normal empty string
             api_key_state="",  # Normal empty string
         ):
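The crash these tests guard against comes from calling `.strip()` on a `None` that Gradio passes for optional inputs. A minimal sketch of the defensive pattern and how such an async-generator agent is consumed (the agent body here is a hypothetical stand-in, not the real `research_agent`):

```python
import asyncio

# Hypothetical stand-in: the real agent presumably normalizes key inputs
# in a similar way before ever calling .strip() on them.
async def research_agent(message, history, api_key=None, api_key_state=None):
    # (x or "") turns None into "" so .strip() is always safe
    key = (api_key or "").strip() or (api_key_state or "").strip() or None
    yield f"key={'set' if key else 'none'}"

async def collect():
    # Consume the async generator the same way the tests do
    return [r async for r in research_agent("test query", [], api_key=None, api_key_state=None)]

results = asyncio.run(collect())
```

With both keys `None`, the run completes without `AttributeError` and yields a single result.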
tests/unit/test_magentic_fix.py DELETED
@@ -1,101 +0,0 @@
-"""Tests for Magentic Orchestrator fixes."""
-
-from unittest.mock import MagicMock, patch
-
-import pytest
-
-# Skip all tests if agent_framework not installed (optional dep)
-pytest.importorskip("agent_framework")
-
-from agent_framework import MagenticFinalResultEvent  # noqa: E402
-
-from src.orchestrators.advanced import AdvancedOrchestrator as MagenticOrchestrator  # noqa: E402
-
-
-class MockChatMessage:
-    """Simulates the buggy ChatMessage that returns itself as text or has complex content."""
-
-    def __init__(self, content_str: str) -> None:
-        self.content_str = content_str
-        self.role = "assistant"
-
-    @property
-    def text(self) -> "MockChatMessage":
-        # Simulate the bug: .text returns the object itself or a repr string
-        return self
-
-    @property
-    def content(self) -> str:
-        # The fix plan says we should look for .content
-        return self.content_str
-
-    def __repr__(self) -> str:
-        return "<ChatMessage object at 0xMOCK>"
-
-    def __str__(self) -> str:
-        return "<ChatMessage object at 0xMOCK>"
-
-
-@pytest.fixture
-def mock_magentic_requirements():
-    """Mock the API key check so tests run in CI without OPENAI_API_KEY."""
-    with patch("src.orchestrators.advanced.check_magentic_requirements"):
-        yield
-
-
-class TestMagenticFixes:
-    """Tests for the Magentic mode fixes."""
-
-    def test_process_event_extracts_text_correctly(self, mock_magentic_requirements) -> None:
-        """
-        Test that _process_event correctly extracts text from a ChatMessage.
-
-        Verifies fix for bug where .text returns the object itself.
-        """
-        orchestrator = MagenticOrchestrator()
-
-        # Create a mock message that mimics the bug
-        buggy_message = MockChatMessage("Final Report Content")
-        event = MagenticFinalResultEvent(message=buggy_message)  # type: ignore[arg-type]
-
-        # Process the event
-        # We expect the fix to get "Final Report Content" instead of object repr
-        result_event = orchestrator._process_event(event, iteration=1)
-
-        assert result_event is not None
-        assert result_event.type == "complete"
-        assert result_event.message == "Final Report Content"
-
-    def test_max_rounds_configuration(self, mock_magentic_requirements) -> None:
-        """Test that max_rounds is correctly passed to the orchestrator."""
-        orchestrator = MagenticOrchestrator(max_rounds=25)
-        assert orchestrator._max_rounds == 25
-
-        # Also verify it's used in _build_workflow
-        # Mock all the agent creation and OpenAI client calls
-        with (
-            patch("src.orchestrators.advanced.create_search_agent") as mock_search,
-            patch("src.orchestrators.advanced.create_judge_agent") as mock_judge,
-            patch("src.orchestrators.advanced.create_hypothesis_agent") as mock_hypo,
-            patch("src.orchestrators.advanced.create_report_agent") as mock_report,
-            patch("src.orchestrators.advanced.OpenAIChatClient") as mock_client,
-            patch("src.orchestrators.advanced.MagenticBuilder") as mock_builder,
-        ):
-            # Setup mocks
-            mock_search.return_value = MagicMock()
-            mock_judge.return_value = MagicMock()
-            mock_hypo.return_value = MagicMock()
-            mock_report.return_value = MagicMock()
-            mock_client.return_value = MagicMock()
-
-            # Mock the builder chain
-            mock_chain = mock_builder.return_value.participants.return_value
-            mock_chain.with_standard_manager.return_value.build.return_value = MagicMock()
-
-            orchestrator._build_workflow()
-
-            # Check that max_round_count was passed as 25
-            participants_mock = mock_builder.return_value.participants.return_value
-            participants_mock.with_standard_manager.assert_called_once()
-            call_kwargs = participants_mock.with_standard_manager.call_args.kwargs
-            assert call_kwargs["max_round_count"] == 25
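The `MockChatMessage` above simulated a message whose `.text` property returns the object itself instead of a string. A robust extractor for that case can be sketched like this (a standalone illustration of the defensive pattern the deleted test verified, not the actual `_process_event` implementation):

```python
def extract_text(message):
    """Best-effort text extraction from a ChatMessage-like object.

    Some message objects have a .text property that returns the object
    itself rather than a string; fall back to .content, then to str().
    """
    text = getattr(message, "text", None)
    if isinstance(text, str):
        return text
    content = getattr(message, "content", None)
    if isinstance(content, str):
        return content
    return str(message)


class Buggy:
    @property
    def text(self):
        return self  # the bug: returns the object, not a string

    @property
    def content(self):
        return "Final Report Content"
```

`extract_text(Buggy())` skips the non-string `.text` and returns the `.content` string instead of an object repr.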
tests/unit/test_magentic_termination.py DELETED
@@ -1,155 +0,0 @@
-"""Tests for Magentic Orchestrator termination guarantee."""
-
-from unittest.mock import MagicMock, patch
-
-import pytest
-
-# Skip all tests if agent_framework not installed (optional dep)
-# MUST come before any agent_framework imports
-pytest.importorskip("agent_framework")
-
-from agent_framework import MagenticAgentMessageEvent  # noqa: E402
-
-from src.orchestrators.advanced import AdvancedOrchestrator as MagenticOrchestrator  # noqa: E402
-from src.utils.models import AgentEvent  # noqa: E402
-
-
-class MockChatMessage:
-    def __init__(self, content):
-        self.content = content
-        self.role = "assistant"
-
-    @property
-    def text(self):
-        return self.content
-
-
-@pytest.fixture
-def mock_magentic_requirements():
-    """Mock requirements check."""
-    with patch("src.orchestrators.advanced.check_magentic_requirements"):
-        yield
-
-
-@pytest.mark.asyncio
-async def test_termination_event_emitted_on_stream_end(mock_magentic_requirements):
-    """
-    Verify that a termination event is emitted when the workflow stream ends
-    without a MagenticFinalResultEvent (e.g. max rounds reached).
-    """
-    orchestrator = MagenticOrchestrator(max_rounds=2)
-
-    # Use real event class
-    mock_message = MockChatMessage("Thinking...")
-    mock_agent_event = MagenticAgentMessageEvent(agent_id="SearchAgent", message=mock_message)
-
-    # Mock the workflow and its run_stream method
-    mock_workflow = MagicMock()
-
-    # Create an async generator for run_stream
-    async def mock_stream(task):
-        # Yield the real message event
-        yield mock_agent_event
-        # STOP HERE - No FinalResultEvent
-
-    mock_workflow.run_stream = mock_stream
-
-    # Mock _build_workflow to return our mock workflow
-    with patch.object(orchestrator, "_build_workflow", return_value=mock_workflow):
-        events = []
-        async for event in orchestrator.run("Research query"):
-            events.append(event)
-
-    for i, e in enumerate(events):
-        print(f"Event {i}: {e.type} - {e.message}")
-
-    assert len(events) >= 2
-    assert events[0].type == "started"
-
-    # Verify the message event was processed
-    # Depending on _process_event logic, MagenticAgentMessageEvent might map to different types
-    # We assume it maps to something valid or we just check presence.
-    assert any("Thinking..." in e.message for e in events)
-
-    # THE CRITICAL CHECK: Did we get the fallback termination event?
-    last_event = events[-1]
-    assert last_event.type == "complete"
-    assert "Max iterations reached" in last_event.message
-    assert last_event.data.get("reason") == "max_rounds_reached"
-
-
-@pytest.mark.asyncio
-async def test_no_double_termination_event(mock_magentic_requirements):
-    """
-    Verify that we DO NOT emit a fallback event if the workflow finished normally.
-    """
-    orchestrator = MagenticOrchestrator()
-
-    mock_workflow = MagicMock()
-
-    with patch.object(orchestrator, "_build_workflow", return_value=mock_workflow):
-        # Mock _process_event to simulate a natural completion event
-        with patch.object(orchestrator, "_process_event") as mock_process:
-            mock_process.side_effect = [
-                AgentEvent(type="thinking", message="Working...", iteration=1),
-                AgentEvent(type="complete", message="Done!", iteration=2),
-            ]
-
-            async def mock_stream_with_yields(task):
-                yield "raw_event_1"
-                yield "raw_event_2"
-
-            mock_workflow.run_stream = mock_stream_with_yields
-
-            events = []
-            async for event in orchestrator.run("Research query"):
-                events.append(event)
-
-    assert events[-1].message == "Done!"
-    assert events[-1].type == "complete"
-
-    # Verify we didn't get a SECOND "Max iterations reached" event
-    fallback_events = [e for e in events if "Max iterations reached" in e.message]
-    assert len(fallback_events) == 0
-
-
-@pytest.mark.asyncio
-async def test_termination_on_timeout(mock_magentic_requirements):
-    """
-    Verify that a termination event is emitted when the workflow times out.
-    """
-    orchestrator = MagenticOrchestrator()
-
-    mock_workflow = MagicMock()
-
-    # Simulate a stream that times out (raises TimeoutError)
-    async def mock_stream_raises(task):
-        # Yield one event before timing out
-        yield MagenticAgentMessageEvent(
-            agent_id="SearchAgent", message=MockChatMessage("Working...")
-        )
-        raise TimeoutError()
-
-    mock_workflow.run_stream = mock_stream_raises
-
-    with patch.object(orchestrator, "_build_workflow", return_value=mock_workflow):
-        events = []
-        async for event in orchestrator.run("Research query"):
-            events.append(event)
-
-    # Check for progress/normal events
-    assert any("Working..." in e.message for e in events)
-
-    # Check for timeout completion
-    completion_events = [e for e in events if e.type == "complete"]
-    assert len(completion_events) > 0
-    last_event = completion_events[-1]
-
-    # New behavior: synthesis is attempted on timeout
-    # The message contains the report, so we check the reason code
-    # In unit tests without API keys, synthesis will fail -> "timeout_synthesis_failed"
-    assert last_event.data.get("reason") in (
-        "timeout",
-        "timeout_synthesis",
-        "timeout_synthesis_failed",  # Expected in unit tests (no API key)
-    )
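The termination guarantee these deleted tests covered can be sketched as a wrapper that always closes the stream with a `complete` event, whether the underlying workflow finishes normally, exhausts its rounds, or times out. Event shapes here are plain dicts standing in for the real `AgentEvent` model:

```python
import asyncio

async def run_with_termination_guarantee(stream):
    """Yield events from a workflow stream, guaranteeing a final 'complete'.

    If the stream ends without a 'complete' event (max rounds reached) or
    raises TimeoutError, a fallback completion event is emitted so
    consumers always observe termination.
    """
    saw_complete = False
    try:
        async for event in stream:
            saw_complete = saw_complete or event.get("type") == "complete"
            yield event
    except TimeoutError:
        yield {"type": "complete", "message": "Timed out", "reason": "timeout"}
        return
    if not saw_complete:
        yield {"type": "complete", "message": "Max iterations reached",
               "reason": "max_rounds_reached"}

async def demo():
    async def stream():
        # A stream that stops without ever emitting 'complete'
        yield {"type": "thinking", "message": "Working..."}
    return [e async for e in run_with_termination_guarantee(stream())]

events = asyncio.run(demo())
```

The wrapper never emits a second fallback when the stream already produced a `complete` event, matching the "no double termination" behavior tested above.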
tests/unit/test_orchestrator.py DELETED
@@ -1,290 +0,0 @@
-"""Unit tests for Orchestrator."""
-
-from unittest.mock import AsyncMock, patch
-
-import pytest
-
-from src.orchestrators import Orchestrator
-from src.utils.models import (
-    AgentEvent,
-    AssessmentDetails,
-    Citation,
-    Evidence,
-    JudgeAssessment,
-    OrchestratorConfig,
-    SearchResult,
-)
-
-
-class TestOrchestrator:
-    """Tests for Orchestrator."""
-
-    @pytest.fixture
-    def mock_search_handler(self):
-        """Create a mock search handler."""
-        handler = AsyncMock()
-        handler.execute = AsyncMock(
-            return_value=SearchResult(
-                query="test",
-                evidence=[
-                    Evidence(
-                        content="Test content",
-                        citation=Citation(
-                            source="pubmed",
-                            title="Test Title",
-                            url="https://pubmed.ncbi.nlm.nih.gov/12345/",
-                            date="2024-01-01",
-                        ),
-                    ),
-                ],
-                sources_searched=["pubmed"],
-                total_found=1,
-                errors=[],
-            )
-        )
-        return handler
-
-    @pytest.fixture
-    def mock_judge_sufficient(self):
-        """Create a mock judge that returns sufficient."""
-        handler = AsyncMock()
-        handler.assess = AsyncMock(
-            return_value=JudgeAssessment(
-                details=AssessmentDetails(
-                    mechanism_score=8,
-                    mechanism_reasoning="Good mechanism",
-                    clinical_evidence_score=7,
-                    clinical_reasoning="Good clinical",
-                    drug_candidates=["Drug A"],
-                    key_findings=["Finding 1"],
-                ),
-                sufficient=True,
-                confidence=0.85,
-                recommendation="synthesize",
-                next_search_queries=[],
-                reasoning="Evidence is sufficient",
-            )
-        )
-        return handler
-
-    @pytest.fixture
-    def mock_judge_insufficient(self):
-        """Create a mock judge that returns insufficient."""
-        handler = AsyncMock()
-        handler.assess = AsyncMock(
-            return_value=JudgeAssessment(
-                details=AssessmentDetails(
-                    mechanism_score=4,
-                    mechanism_reasoning="Weak mechanism",
-                    clinical_evidence_score=3,
-                    clinical_reasoning="Weak clinical",
-                    drug_candidates=[],
-                    key_findings=[],
-                ),
-                sufficient=False,
-                confidence=0.3,
-                recommendation="continue",
-                next_search_queries=["more specific query"],
-                reasoning="Need more evidence to make a decision.",
-            )
-        )
-        return handler
-
-    @pytest.mark.asyncio
-    async def test_orchestrator_completes_with_sufficient_evidence(
-        self,
-        mock_search_handler,
-        mock_judge_sufficient,
-    ):
-        """Orchestrator should complete when evidence is sufficient."""
-        config = OrchestratorConfig(max_iterations=5)
-        orchestrator = Orchestrator(
-            search_handler=mock_search_handler,
-            judge_handler=mock_judge_sufficient,
-            config=config,
-        )
-
-        events = []
-        async for event in orchestrator.run("test query"):
-            events.append(event)
-
-        # Should have started, searched, judged, and completed
-        event_types = [e.type for e in events]
-        assert "started" in event_types
-        assert "searching" in event_types
-        assert "search_complete" in event_types
-        assert "judging" in event_types
-        assert "judge_complete" in event_types
-        assert "complete" in event_types
-
-        # Should only have 1 iteration
-        complete_event = next(e for e in events if e.type == "complete")
-        assert complete_event.iteration == 1
-
-    @pytest.mark.asyncio
-    async def test_orchestrator_loops_when_insufficient(
-        self,
-        mock_search_handler,
-        mock_judge_insufficient,
-    ):
-        """Orchestrator should loop when evidence is insufficient."""
-        config = OrchestratorConfig(max_iterations=3)
-        orchestrator = Orchestrator(
-            search_handler=mock_search_handler,
-            judge_handler=mock_judge_insufficient,
-            config=config,
-        )
-
-        events = []
-        async for event in orchestrator.run("test query"):
-            events.append(event)
-
-        # Should have looping events
-        event_types = [e.type for e in events]
-        assert event_types.count("looping") >= 2  # noqa: PLR2004
-
-        # Should hit max iterations
-        complete_event = next(e for e in events if e.type == "complete")
-        assert complete_event.data.get("max_reached") is True
-
-    @pytest.mark.asyncio
-    async def test_orchestrator_respects_max_iterations(
-        self,
-        mock_search_handler,
-        mock_judge_insufficient,
-    ):
-        """Orchestrator should stop at max_iterations."""
-        config = OrchestratorConfig(max_iterations=2)
-        orchestrator = Orchestrator(
-            search_handler=mock_search_handler,
-            judge_handler=mock_judge_insufficient,
-            config=config,
-        )
-
-        events = []
-        async for event in orchestrator.run("test query"):
-            events.append(event)
-
-        # Should have exactly 2 iterations
-        max_iteration = max(e.iteration for e in events)
-        assert max_iteration == 2  # noqa: PLR2004
-
-    @pytest.mark.asyncio
-    async def test_orchestrator_handles_search_error(self):
-        """Orchestrator should handle search errors gracefully."""
-        mock_search = AsyncMock()
-        mock_search.execute = AsyncMock(side_effect=Exception("Search failed"))
-
-        mock_judge = AsyncMock()
-        mock_judge.assess = AsyncMock(
-            return_value=JudgeAssessment(
-                details=AssessmentDetails(
-                    mechanism_score=0,
-                    mechanism_reasoning="Not applicable here.",
-                    clinical_evidence_score=0,
-                    clinical_reasoning="Not applicable here.",
-                    drug_candidates=[],
-                    key_findings=[],
-                ),
-                sufficient=False,
-                confidence=0.0,
-                recommendation="continue",
-                next_search_queries=["retry query"],
-                reasoning="Search failed, retrying...",
-            )
-        )
-
-        config = OrchestratorConfig(max_iterations=2)
-        orchestrator = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge,
-            config=config,
-        )
-
-        events = []
-        async for event in orchestrator.run("test query"):
-            events.append(event)
-
-        # Should recover and loop despite errors
-        event_types = [e.type for e in events]
-        assert "error" not in event_types
-        assert "looping" in event_types
-
-    @pytest.mark.asyncio
-    async def test_orchestrator_deduplicates_evidence(self, mock_judge_insufficient):
-        """Orchestrator should deduplicate evidence by URL."""
-        # Search returns same evidence each time
-        duplicate_evidence = Evidence(
-            content="Duplicate content",
-            citation=Citation(
-                source="pubmed",
-                title="Same Title",
-                url="https://pubmed.ncbi.nlm.nih.gov/12345/",  # Same URL
-                date="2024-01-01",
-            ),
-        )
-
-        mock_search = AsyncMock()
-        mock_search.execute = AsyncMock(
-            return_value=SearchResult(
-                query="test",
-                evidence=[duplicate_evidence],
-                sources_searched=["pubmed"],
-                total_found=1,
-                errors=[],
-            )
-        )
-
-        config = OrchestratorConfig(max_iterations=2)
-        orchestrator = Orchestrator(
-            search_handler=mock_search,
-            judge_handler=mock_judge_insufficient,
-            config=config,
-        )
-
-        # Force use of local (in-memory) embedding service for test isolation
-        # Without this, the test uses persistent LlamaIndex store which has data from previous runs
-        with patch("src.utils.service_loader.settings") as mock_settings:
-            mock_settings.has_openai_key = False
-
-            events = []
-            async for event in orchestrator.run("test query"):
-                events.append(event)
-
-        # Second search_complete should show 0 new evidence
-        search_complete_events = [e for e in events if e.type == "search_complete"]
-        assert len(search_complete_events) == 2  # noqa: PLR2004
-
-        # First iteration should have 1 new
-        assert search_complete_events[0].data["new_count"] == 1
-
-        # Second iteration should have 0 new (duplicate)
-        assert search_complete_events[1].data["new_count"] == 0
-
-
-class TestAgentEvent:
-    """Tests for AgentEvent."""
-
-    def test_to_markdown(self):
-        """AgentEvent should format to markdown correctly."""
-        event = AgentEvent(
-            type="searching",
-            message="Searching for: testosterone libido",
-            iteration=1,
-        )
-
-        md = event.to_markdown()
-        assert "πŸ”" in md
-        assert "SEARCHING" in md
-        assert "testosterone libido" in md
-
-    def test_complete_event_icon(self):
-        """Complete event should have celebration icon."""
-        event = AgentEvent(
-            type="complete",
-            message="Done!",
-            iteration=3,
-        )
-
-        md = event.to_markdown()
-        assert "πŸŽ‰" in md
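The deduplication test above checked that a second search returning the same citation URL contributes zero new evidence. The core mechanism can be sketched with plain dicts standing in for the real `Evidence`/`Citation` models:

```python
def dedupe_evidence(existing, incoming):
    """Return only incoming items whose citation URL has not been seen.

    Sketch of URL-based dedup; the dict shape is an assumption, not the
    real Evidence model.
    """
    seen = {e["citation"]["url"] for e in existing}
    return [e for e in incoming if e["citation"]["url"] not in seen]


batch = [{"citation": {"url": "https://pubmed.ncbi.nlm.nih.gov/12345/"}}]
first_pass = dedupe_evidence([], batch)       # first iteration: 1 new item
second_pass = dedupe_evidence(batch, batch)   # second iteration: duplicate URL
```

This mirrors the test's `new_count == 1` then `new_count == 0` expectation across two iterations.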
tests/unit/test_orchestrator_factory.py CHANGED
@@ -6,7 +6,7 @@ import pytest

 pytestmark = pytest.mark.unit

-from src.orchestrators import Orchestrator, create_orchestrator
+from src.orchestrators import create_orchestrator


 @pytest.fixture
@@ -16,7 +16,7 @@ def mock_settings():


 @pytest.fixture
-def mock_magentic_cls():
+def mock_advanced_cls():
     with patch("src.orchestrators.factory._get_advanced_orchestrator_class") as mock:
         # The mock returns a class (callable), which returns an instance
         mock_class = MagicMock()
@@ -29,37 +29,32 @@ def mock_handlers():
     return MagicMock(), MagicMock()


-def test_create_orchestrator_simple_explicit(mock_settings, mock_handlers):
-    """Test explicit simple mode."""
+def test_create_orchestrator_simple_maps_to_advanced(
+    mock_settings, mock_handlers, mock_advanced_cls
+):
+    """Test that 'simple' mode explicitly maps to AdvancedOrchestrator."""
     search, judge = mock_handlers
+    # Pass handlers (they are ignored but shouldn't crash)
     orch = create_orchestrator(search_handler=search, judge_handler=judge, mode="simple")
-    assert isinstance(orch, Orchestrator)
+
+    # Verify AdvancedOrchestrator was created
+    mock_advanced_cls.assert_called_once()
+    assert orch == mock_advanced_cls.return_value


-def test_create_orchestrator_advanced_explicit(mock_settings, mock_handlers, mock_magentic_cls):
-    """Test explicit advanced mode."""
-    # Ensure has_openai_key is True so it doesn't error if we add checks
-    mock_settings.has_openai_key = True
+def test_create_orchestrator_advanced_explicit(mock_settings, mock_handlers, mock_advanced_cls):
+    """Test explicit advanced mode."""
     orch = create_orchestrator(mode="advanced")
     # verify instantiated
-    mock_magentic_cls.assert_called_once()
-    assert orch == mock_magentic_cls.return_value
+    mock_advanced_cls.assert_called_once()
+    assert orch == mock_advanced_cls.return_value


-def test_create_orchestrator_auto_advanced(mock_settings, mock_magentic_cls):
-    """Test auto-detect advanced mode when OpenAI key exists."""
-    mock_settings.has_openai_key = True
+def test_create_orchestrator_auto_advanced(mock_settings, mock_advanced_cls):
+    """Test auto-detect defaults to Advanced (Unified)."""
+    # Even with no keys (handled by factory internally), orchestrator factory returns Advanced
+    mock_settings.has_openai_key = False  # Simulate no key

     orch = create_orchestrator()
-    mock_magentic_cls.assert_called_once()
-    assert orch == mock_magentic_cls.return_value
-
-
-def test_create_orchestrator_auto_simple(mock_settings, mock_handlers):
-    """Test auto-detect simple mode when no paid keys."""
-    mock_settings.has_openai_key = False
-
-    search, judge = mock_handlers
-    orch = create_orchestrator(search_handler=search, judge_handler=judge)
-    assert isinstance(orch, Orchestrator)
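The factory behavior these updated tests pin down is that every mode, including the deprecated `"simple"`, now resolves to the advanced orchestrator. A minimal standalone sketch of that mapping (class and signature are stand-ins, not the real `src.orchestrators.factory` code):

```python
import warnings

class AdvancedOrchestrator:  # hypothetical stand-in for the real class
    def __init__(self, **kwargs):
        self.kwargs = kwargs

def create_orchestrator(mode="advanced", **kwargs):
    """SPEC-16 sketch: all modes resolve to AdvancedOrchestrator.

    'simple' is still accepted for backward compatibility, but only
    triggers a deprecation warning before falling through.
    """
    if mode == "simple":
        warnings.warn("'simple' mode is deprecated; using unified advanced mode",
                      DeprecationWarning, stacklevel=2)
    return AdvancedOrchestrator(**kwargs)

orch = create_orchestrator(mode="simple", search_handler=None, judge_handler=None)
```

Callers that still pass handlers or `mode="simple"` keep working, which is exactly what `test_create_orchestrator_simple_maps_to_advanced` asserts.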
tests/unit/test_streaming_fix.py CHANGED
@@ -49,7 +49,8 @@ async def test_streaming_events_are_buffered_not_spammed():
     try:
         # Run the research agent
         results = []
-        async for result in research_agent("test query", [], mode="simple", api_key=""):
+        # SPEC-16: mode parameter removed (unified architecture)
+        async for result in research_agent("test query", [], api_key=""):
             results.append(result)

     # Verify that we DO see streaming updates (for UX responsiveness)
tests/unit/test_ui_elements.py CHANGED
@@ -1,33 +1,53 @@
+"""UI element tests for SPEC-16 Unified Architecture."""
+
 import gradio as gr
+import pytest

 from src.app import create_demo

+pytestmark = pytest.mark.unit
+

-def test_examples_include_advanced_mode():
-    """Verify that one example entry uses 'advanced' mode."""
+def test_no_mode_selector_in_ui():
+    """SPEC-16: Mode selector removed - everyone gets Advanced Mode."""
     demo, _ = create_demo()
-    assert any(example[1] == "advanced" for example in demo.examples), (
-        "Expected at least one example to be 'advanced' mode"
-    )
+    # No Radio should exist in additional_inputs
+    radios = [inp for inp in demo.additional_inputs if isinstance(inp, gr.Radio)]
+    assert len(radios) == 0, "Mode Radio should not exist (SPEC-16: unified architecture)"


 def test_accordion_label_updated():
-    """Verify the accordion label reflects the new, concise text."""
+    """Verify the accordion label reflects the new, concise text (no Mode)."""
     _, accordion = create_demo()
-    assert accordion.label == "βš™οΈ Mode & API Key (Free tier works!)", (
-        "Accordion label not updated to 'βš™οΈ Mode & API Key (Free tier works!)'"
+    assert accordion.label == "βš™οΈ API Key (Free tier works!)", (
+        f"Accordion label should be 'βš™οΈ API Key (Free tier works!)', got '{accordion.label}'"
     )


-def test_orchestrator_mode_info_text_updated():
-    """Verify the Orchestrator Mode info text contains the new emojis and phrasing."""
+def test_examples_have_no_mode():
+    """SPEC-16: Examples no longer include mode parameter."""
     demo, _ = create_demo()
-    # Assuming additional_inputs is a list and the Radio is the first element
-    orchestrator_radio = demo.additional_inputs[0]
-    expected_info = "⚑ Simple: Free/Any | πŸ”¬ Advanced: OpenAI (Deep Research)"
-    assert isinstance(orchestrator_radio, gr.Radio), (
-        "Expected first additional input to be gr.Radio"
-    )
-    assert orchestrator_radio.info == expected_info, (
-        "Orchestrator Mode info text not updated correctly"
-    )
+    # Examples now have 4 items: [question, domain, api_key, api_key_state]
+    for example in demo.examples:
+        assert len(example) == 4, (
+            f"Examples should have 4 items [question, domain, api_key, api_key_state], "
+            f"got {len(example)}: {example}"
+        )
+        # First item is the question
+        assert isinstance(example[0], str) and len(example[0]) > 10, (
+            "First example item should be the research question"
+        )
+        # Second item is domain (not mode!)
+        assert example[1] in ("sexual_health", None), (
+            f"Second example item should be domain, got: {example[1]}"
+        )
+
+
+def test_api_key_textbox_exists():
+    """Verify API key textbox exists in additional inputs."""
+    demo, _ = create_demo()
+    textboxes = [inp for inp in demo.additional_inputs if isinstance(inp, gr.Textbox)]
+    assert len(textboxes) == 1, "Expected exactly one API key textbox"
+    assert textboxes[0].label == "πŸ”‘ API Key (Optional)", (
+        f"API key textbox label should be 'πŸ”‘ API Key (Optional)', got '{textboxes[0].label}'"
+    )
uv.lock CHANGED
@@ -1184,7 +1184,7 @@ requires-dist = [
     { name = "duckduckgo-search", specifier = ">=5.0" },
     { name = "gradio", extras = ["mcp"], specifier = ">=6.0.0" },
     { name = "httpx", specifier = ">=0.27" },
-    { name = "huggingface-hub", specifier = ">=0.20.0" },
     { name = "langchain", specifier = ">=0.3.9,<1.0" },
     { name = "langchain-core", specifier = ">=0.3.21,<1.0" },
     { name = "langchain-huggingface", specifier = ">=0.1.2,<1.0" },
@@ -5524,28 +5524,28 @@ wheels = [

 [[package]]
 name = "ruff"
-version = "0.14.6"
-source = { registry = "https://pypi.org/simple" }
-sdist = { url = "https://files.pythonhosted.org/packages/52/f0/62b5a1a723fe183650109407fa56abb433b00aa1c0b9ba555f9c4efec2c6/ruff-0.14.6.tar.gz", hash = "sha256:6f0c742ca6a7783a736b867a263b9a7a80a45ce9bee391eeda296895f1b4e1cc", size = 5669501 }
-wheels = [
-    { url = "https://files.pythonhosted.org/packages/67/d2/7dd544116d107fffb24a0064d41a5d2ed1c9d6372d142f9ba108c8e39207/ruff-0.14.6-py3-none-linux_armv6l.whl", hash = "sha256:d724ac2f1c240dbd01a2ae98db5d1d9a5e1d9e96eba999d1c48e30062df578a3", size = 13326119 },
-    { url = "https://files.pythonhosted.org/packages/36/6a/ad66d0a3315d6327ed6b01f759d83df3c4d5f86c30462121024361137b6a/ruff-0.14.6-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:9f7539ea257aa4d07b7ce87aed580e485c40143f2473ff2f2b75aee003186004", size = 13526007 },
-    { url = "https://files.pythonhosted.org/packages/a3/9d/dae6db96df28e0a15dea8e986ee393af70fc97fd57669808728080529c37/ruff-0.14.6-py3-none-macosx_11_0_arm64.whl", hash = "sha256:7f6007e55b90a2a7e93083ba48a9f23c3158c433591c33ee2e99a49b889c6332", size = 12676572 },
-    { url = "https://files.pythonhosted.org/packages/76/a4/f319e87759949062cfee1b26245048e92e2acce900ad3a909285f9db1859/ruff-0.14.6-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:0a8e7b9d73d8728b68f632aa8e824ef041d068d231d8dbc7808532d3629a6bef", size = 13140745 },
-    { url = "https://files.pythonhosted.org/packages/95/d3/248c1efc71a0a8ed4e8e10b4b2266845d7dfc7a0ab64354afe049eaa1310/ruff-0.14.6-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:d50d45d4553a3ebcbd33e7c5e0fe6ca4aafd9a9122492de357205c2c48f00775", size = 13076486 },
-    { url = "https://files.pythonhosted.org/packages/a5/19/b68d4563fe50eba4b8c92aa842149bb56dd24d198389c0ed12e7faff4f7d/ruff-0.14.6-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:118548dd121f8a21bfa8ab2c5b80e5b4aed67ead4b7567790962554f38e598ce", size = 13727563 },
-    { url = "https://files.pythonhosted.org/packages/47/ac/943169436832d4b0e867235abbdb57ce3a82367b47e0280fa7b4eabb7593/ruff-0.14.6-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:57256efafbfefcb8748df9d1d766062f62b20150691021f8ab79e2d919f7c11f", size = 15199755 },
-    { url = "https://files.pythonhosted.org/packages/c9/b9/288bb2399860a36d4bb0541cb66cce3c0f4156aaff009dc8499be0c24bf2/ruff-0.14.6-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:ff18134841e5c68f8e5df1999a64429a02d5549036b394fafbe410f886e1989d", size = 14850608 },
-    { url = "https://files.pythonhosted.org/packages/ee/b1/a0d549dd4364e240f37e7d2907e97ee80587480d98c7799d2d8dc7a2f605/ruff-0.14.6-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:29c4b7ec1e66a105d5c27bd57fa93203637d66a26d10ca9809dc7fc18ec58440", size = 14118754 },
-    { url = "https://files.pythonhosted.org/packages/13/ac/9b9fe63716af8bdfddfacd0882bc1586f29985d3b988b3c62ddce2e202c3/ruff-0.14.6-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:167843a6f78680746d7e226f255d920aeed5e4ad9c03258094a2d49d3028b105", size = 13949214 },
-    { url = "https://files.pythonhosted.org/packages/12/27/4dad6c6a77fede9560b7df6802b1b697e97e49ceabe1f12baf3ea20862e9/ruff-0.14.6-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:16a33af621c9c523b1ae006b1b99b159bf5ac7e4b1f20b85b2572455018e0821", size = 14106112 },
-    { url = "https://files.pythonhosted.org/packages/6a/db/23e322d7177873eaedea59a7932ca5084ec5b7e20cb30f341ab594130a71/ruff-0.14.6-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:1432ab6e1ae2dc565a7eea707d3b03a0c234ef401482a6f1621bc1f427c2ff55", size = 13035010 },
-    { url = "https://files.pythonhosted.org/packages/a8/9c/20e21d4d69dbb35e6a1df7691e02f363423658a20a2afacf2a2c011800dc/ruff-0.14.6-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:4c55cfbbe7abb61eb914bfd20683d14cdfb38a6d56c6c66efa55ec6570ee4e71", size = 13054082 },
-    { url = "https://files.pythonhosted.org/packages/66/25/906ee6a0464c3125c8d673c589771a974965c2be1a1e28b5c3b96cb6ef88/ruff-0.14.6-py3-none-musllinux_1_2_i686.whl", hash = "sha256:efea3c0f21901a685fff4befda6d61a1bf4cb43de16da87e8226a281d614350b", size = 13303354 },
-    { url = "https://files.pythonhosted.org/packages/4c/58/60577569e198d56922b7ead07b465f559002b7b11d53f40937e95067ca1c/ruff-0.14.6-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:344d97172576d75dc6afc0e9243376dbe1668559c72de1864439c4fc95f78185", size = 14054487 },
-    { url = "https://files.pythonhosted.org/packages/67/0b/8e4e0639e4cc12547f41cb771b0b44ec8225b6b6a93393176d75fe6f7d40/ruff-0.14.6-py3-none-win32.whl", hash = "sha256:00169c0c8b85396516fdd9ce3446c7ca20c2a8f90a77aa945ba6b8f2bfe99e85", size = 13013361 },
5547
- { url = "https://files.pythonhosted.org/packages/fb/02/82240553b77fd1341f80ebb3eaae43ba011c7a91b4224a9f317d8e6591af/ruff-0.14.6-py3-none-win_amd64.whl", hash = "sha256:390e6480c5e3659f8a4c8d6a0373027820419ac14fa0d2713bd8e6c3e125b8b9", size = 14432087 },
5548
- { url = "https://files.pythonhosted.org/packages/a5/1f/93f9b0fad9470e4c829a5bb678da4012f0c710d09331b860ee555216f4ea/ruff-0.14.6-py3-none-win_arm64.whl", hash = "sha256:d43c81fbeae52cfa8728d8766bbf46ee4298c888072105815b392da70ca836b2", size = 13520930 },
5549
  ]
5550
 
5551
  [[package]]
 
1184
  { name = "duckduckgo-search", specifier = ">=5.0" },
1185
  { name = "gradio", extras = ["mcp"], specifier = ">=6.0.0" },
1186
  { name = "httpx", specifier = ">=0.27" },
1187
+ { name = "huggingface-hub", specifier = ">=0.24.0" },
1188
  { name = "langchain", specifier = ">=0.3.9,<1.0" },
1189
  { name = "langchain-core", specifier = ">=0.3.21,<1.0" },
1190
  { name = "langchain-huggingface", specifier = ">=0.1.2,<1.0" },
 
5524
 
5525
  [[package]]
5526
  name = "ruff"
5527
+ version = "0.14.7"
5528
+ source = { registry = "https://pypi.org/simple" }
5529
+ sdist = { url = "https://files.pythonhosted.org/packages/b7/5b/dd7406afa6c95e3d8fa9d652b6d6dd17dd4a6bf63cb477014e8ccd3dcd46/ruff-0.14.7.tar.gz", hash = "sha256:3417deb75d23bd14a722b57b0a1435561db65f0ad97435b4cf9f85ffcef34ae5", size = 5727324 }
5530
+ wheels = [
5531
+ { url = "https://files.pythonhosted.org/packages/8c/b1/7ea5647aaf90106f6d102230e5df874613da43d1089864da1553b899ba5e/ruff-0.14.7-py3-none-linux_armv6l.whl", hash = "sha256:b9d5cb5a176c7236892ad7224bc1e63902e4842c460a0b5210701b13e3de4fca", size = 13414475 },
5532
+ { url = "https://files.pythonhosted.org/packages/af/19/fddb4cd532299db9cdaf0efdc20f5c573ce9952a11cb532d3b859d6d9871/ruff-0.14.7-py3-none-macosx_10_12_x86_64.whl", hash = "sha256:3f64fe375aefaf36ca7d7250292141e39b4cea8250427482ae779a2aa5d90015", size = 13634613 },
5533
+ { url = "https://files.pythonhosted.org/packages/40/2b/469a66e821d4f3de0440676ed3e04b8e2a1dc7575cf6fa3ba6d55e3c8557/ruff-0.14.7-py3-none-macosx_11_0_arm64.whl", hash = "sha256:93e83bd3a9e1a3bda64cb771c0d47cda0e0d148165013ae2d3554d718632d554", size = 12765458 },
5534
+ { url = "https://files.pythonhosted.org/packages/f1/05/0b001f734fe550bcfde4ce845948ac620ff908ab7241a39a1b39bb3c5f49/ruff-0.14.7-py3-none-manylinux_2_17_aarch64.manylinux2014_aarch64.whl", hash = "sha256:3838948e3facc59a6070795de2ae16e5786861850f78d5914a03f12659e88f94", size = 13236412 },
5535
+ { url = "https://files.pythonhosted.org/packages/11/36/8ed15d243f011b4e5da75cd56d6131c6766f55334d14ba31cce5461f28aa/ruff-0.14.7-py3-none-manylinux_2_17_armv7l.manylinux2014_armv7l.whl", hash = "sha256:24c8487194d38b6d71cd0fd17a5b6715cda29f59baca1defe1e3a03240f851d1", size = 13182949 },
5536
+ { url = "https://files.pythonhosted.org/packages/3b/cf/fcb0b5a195455729834f2a6eadfe2e4519d8ca08c74f6d2b564a4f18f553/ruff-0.14.7-py3-none-manylinux_2_17_i686.manylinux2014_i686.whl", hash = "sha256:79c73db6833f058a4be8ffe4a0913b6d4ad41f6324745179bd2aa09275b01d0b", size = 13816470 },
5537
+ { url = "https://files.pythonhosted.org/packages/7f/5d/34a4748577ff7a5ed2f2471456740f02e86d1568a18c9faccfc73bd9ca3f/ruff-0.14.7-py3-none-manylinux_2_17_ppc64.manylinux2014_ppc64.whl", hash = "sha256:12eb7014fccff10fc62d15c79d8a6be4d0c2d60fe3f8e4d169a0d2def75f5dad", size = 15289621 },
5538
+ { url = "https://files.pythonhosted.org/packages/53/53/0a9385f047a858ba133d96f3f8e3c9c66a31cc7c4b445368ef88ebeac209/ruff-0.14.7-py3-none-manylinux_2_17_ppc64le.manylinux2014_ppc64le.whl", hash = "sha256:6c623bbdc902de7ff715a93fa3bb377a4e42dd696937bf95669118773dbf0c50", size = 14975817 },
5539
+ { url = "https://files.pythonhosted.org/packages/a8/d7/2f1c32af54c3b46e7fadbf8006d8b9bcfbea535c316b0bd8813d6fb25e5d/ruff-0.14.7-py3-none-manylinux_2_17_s390x.manylinux2014_s390x.whl", hash = "sha256:f53accc02ed2d200fa621593cdb3c1ae06aa9b2c3cae70bc96f72f0000ae97a9", size = 14284549 },
5540
+ { url = "https://files.pythonhosted.org/packages/92/05/434ddd86becd64629c25fb6b4ce7637dd52a45cc4a4415a3008fe61c27b9/ruff-0.14.7-py3-none-manylinux_2_17_x86_64.manylinux2014_x86_64.whl", hash = "sha256:281f0e61a23fcdcffca210591f0f53aafaa15f9025b5b3f9706879aaa8683bc4", size = 14071389 },
5541
+ { url = "https://files.pythonhosted.org/packages/ff/50/fdf89d4d80f7f9d4f420d26089a79b3bb1538fe44586b148451bc2ba8d9c/ruff-0.14.7-py3-none-manylinux_2_31_riscv64.whl", hash = "sha256:dbbaa5e14148965b91cb090236931182ee522a5fac9bc5575bafc5c07b9f9682", size = 14202679 },
5542
+ { url = "https://files.pythonhosted.org/packages/77/54/87b34988984555425ce967f08a36df0ebd339bb5d9d0e92a47e41151eafc/ruff-0.14.7-py3-none-musllinux_1_2_aarch64.whl", hash = "sha256:1464b6e54880c0fe2f2d6eaefb6db15373331414eddf89d6b903767ae2458143", size = 13147677 },
5543
+ { url = "https://files.pythonhosted.org/packages/67/29/f55e4d44edfe053918a16a3299e758e1c18eef216b7a7092550d7a9ec51c/ruff-0.14.7-py3-none-musllinux_1_2_armv7l.whl", hash = "sha256:f217ed871e4621ea6128460df57b19ce0580606c23aeab50f5de425d05226784", size = 13151392 },
5544
+ { url = "https://files.pythonhosted.org/packages/36/69/47aae6dbd4f1d9b4f7085f4d9dcc84e04561ee7ad067bf52e0f9b02e3209/ruff-0.14.7-py3-none-musllinux_1_2_i686.whl", hash = "sha256:6be02e849440ed3602d2eb478ff7ff07d53e3758f7948a2a598829660988619e", size = 13412230 },
5545
+ { url = "https://files.pythonhosted.org/packages/b7/4b/6e96cb6ba297f2ba502a231cd732ed7c3de98b1a896671b932a5eefa3804/ruff-0.14.7-py3-none-musllinux_1_2_x86_64.whl", hash = "sha256:19a0f116ee5e2b468dfe80c41c84e2bbd6b74f7b719bee86c2ecde0a34563bcc", size = 14195397 },
5546
+ { url = "https://files.pythonhosted.org/packages/69/82/251d5f1aa4dcad30aed491b4657cecd9fb4274214da6960ffec144c260f7/ruff-0.14.7-py3-none-win32.whl", hash = "sha256:e33052c9199b347c8937937163b9b149ef6ab2e4bb37b042e593da2e6f6cccfa", size = 13126751 },
5547
+ { url = "https://files.pythonhosted.org/packages/a8/b5/d0b7d145963136b564806f6584647af45ab98946660d399ec4da79cae036/ruff-0.14.7-py3-none-win_amd64.whl", hash = "sha256:e17a20ad0d3fad47a326d773a042b924d3ac31c6ca6deb6c72e9e6b5f661a7c6", size = 14531726 },
5548
+ { url = "https://files.pythonhosted.org/packages/1d/d2/1637f4360ada6a368d3265bf39f2cf737a0aaab15ab520fc005903e883f8/ruff-0.14.7-py3-none-win_arm64.whl", hash = "sha256:be4d653d3bea1b19742fcc6502354e32f65cd61ff2fbdb365803ef2c2aec6228", size = 13609215 },
5549
  ]
5550
 
5551
  [[package]]