Spaces:

VibecoderMcSwaggins
/

DeepBoner

Paused

App Files Files Community

DeepBoner / docs /bugs /archive /P3_ARCHITECTURAL_GAP_STRUCTURED_MEMORY.md

VibecoderMcSwaggins

feat(search): SPEC_13 Evidence Deduplication (#98)

2c5db87 unverified 21 days ago

preview code

raw

history blame

6.18 kB

P3: Missing Structured Cognitive Memory (Shared Blackboard)

Status: OPEN Priority: P3 (Architecture/Enhancement) Found By: Deep Codebase Investigation Date: 2025-11-29 Spec: SPEC_07_LANGGRAPH_MEMORY_ARCH.md

Executive Summary

DeepBoner's AdvancedOrchestrator has Data Memory (vector store for papers) but lacks Cognitive Memory (structured state for hypotheses, conflicts, and research plan). This causes "context drift" on long runs and prevents intelligent conflict resolution.

Current Architecture (What We Have)

1. MagenticState (`src/agents/state.py:18-91`)

class MagenticState(BaseModel):
    evidence: list[Evidence] = Field(default_factory=list)
    embedding_service: Any = None  # ChromaDB connection

    def add_evidence(self, new_evidence: list[Evidence]) -> int: ...
    async def search_related(self, query: str, n_results: int = 5) -> list[Evidence]: ...

What it does: Stores Evidence objects, URL-based deduplication, semantic search via embeddings.
What it DOESN'T do: Track hypotheses, conflicts, or research plan status.

2. EmbeddingService (`src/services/embeddings.py:29-180`)

self._client = chromadb.Client()  # In-memory (Line 44)
self._collection = self._client.create_collection(
    name=f"evidence_{uuid.uuid4().hex}",  # Random name per session (Line 45-47)
    ...
)

What it does: In-session semantic search/deduplication.
Limitation: New collection per session, no persistence despite settings.chroma_db_path existing.

3. AdvancedOrchestrator (`src/orchestrators/advanced.py:51-371`)

Uses Microsoft's agent-framework-core (MagenticBuilder)
State is implicit in chat history passed between agents
Manager decides next step by reading conversation, not structured state

The Problem

Issue	Impact	Evidence
No Hypothesis Tracking	Can't update hypothesis confidence systematically	`MagenticState` has no `hypotheses` field
No Conflict Detection	Contradictory sources are ignored	No `conflicts` list to flag Source A vs Source B
Context Drift	Manager forgets original query after 50+ messages	State lives only in chat, not structured object
No Plan State	Can't pause/resume research	No `research_plan` or `next_step` tracking

The Solution: LangGraph State Graph (Nov 2025 Best Practice)

Why LangGraph?

Based on comprehensive analysis:

Explicit State Schema: TypedDict/Pydantic model that ALL agents read/write
State Reducers: Annotated[List[X], operator.add] for appending (not overwriting)
HuggingFace Compatible: Works with langchain-huggingface (Llama 3.1)
Production-Ready: MongoDB checkpointer for persistence, SQLite for dev

Target Architecture

# src/agents/graph/state.py (IMPLEMENTED)
from typing import Annotated, TypedDict, Literal
import operator
from pydantic import BaseModel, Field
from langchain_core.messages import BaseMessage

class Hypothesis(BaseModel):
    id: str
    statement: str
    status: Literal["proposed", "validating", "confirmed", "refuted"]
    confidence: float
    supporting_evidence_ids: list[str]
    contradicting_evidence_ids: list[str]

class Conflict(BaseModel):
    id: str
    description: str
    source_a_id: str
    source_b_id: str
    status: Literal["open", "resolved"]
    resolution: str | None

class ResearchState(TypedDict):
    query: str  # Immutable original question
    hypotheses: Annotated[list[Hypothesis], operator.add]
    conflicts: Annotated[list[Conflict], operator.add]
    evidence_ids: Annotated[list[str], operator.add]  # Links to ChromaDB
    messages: Annotated[list[BaseMessage], operator.add]
    next_step: Literal["search", "judge", "resolve", "synthesize", "finish"]
    iteration_count: int

Implementation Dependencies

Package	Purpose	Install
`langgraph>=0.2`	State graph framework	`uv add langgraph`
`langchain>=0.3`	Base abstractions	`uv add langchain`
`langchain-huggingface`	Llama 3.1 integration	`uv add langchain-huggingface`
`langgraph-checkpoint-sqlite`	Dev persistence	`uv add langgraph-checkpoint-sqlite`

Note: MongoDB checkpointer (langgraph-checkpoint-mongodb) recommended for production per MongoDB blog.

Alternative Considered: Mem0

Mem0 specializes in long-term memory and outperformed OpenAI by 26% in benchmarks. However:

Mem0 excels at: User personalization, cross-session memory
LangGraph excels at: Workflow orchestration, state machines
Verdict: Use LangGraph for orchestration + optionally add Mem0 for user-level memory later

Quick Win (Separate from LangGraph)

Enable ChromaDB persistence in src/services/embeddings.py:44:

# FROM:
self._client = chromadb.Client()  # In-memory

# TO:
self._client = chromadb.PersistentClient(path=settings.chroma_db_path)

This alone gives cross-session evidence persistence (P3_ARCHITECTURAL_GAP_EPHEMERAL_MEMORY fix).