corgi-qwen3-vl-demo / PROJECT_PLAN.md
dung-vpt-uney
Deploy latest CoRGI Gradio demo
b6a01d6
# CoRGI Custom Demo — Project Plan
## Context
- **Objective**: ship a runnable CoRGI demo (CLI + Gradio) powered entirely by Qwen3-VL for structured reasoning, ROI evidence extraction, and answer synthesis.
- **Scope**: stay within the `corgi_custom` package, reuse Qwen3-VL cookbooks where possible, keep dependency footprint minimal (no extra detectors/rerankers).
- **Environment**: Conda env `pytorch`, default VLM `Qwen/Qwen3-VL-8B-Thinking`.
## Milestones
| Status | Milestone | Notes |
| --- | --- | --- |
| ✅ | Core pipeline skeleton (dataclasses, parsers, Qwen client wrappers) | Already merged in repo. |
| ✅ | Project documentation & progress tracking scaffolding | Plan + progress log committed. |
| ✅ | CLI runner that prints step-by-step pipeline output | Supports overrides + JSON export. |
| ✅ | Gradio demo mirroring CLI functionality | Blocks UI with markdown report messaging. |
| ✅ | Automated tests for new modules | CLI + Gradio helpers covered with unit tests. |
| ✅ | HF Space deployment automation | Bash script + app harness for zerogpu Spaces. |
| 🟡 | Final verification (unit tests, smoke instructions) | Document how to run `pytest` and the demos. |
## Work Breakdown Structure
1. **Docs & Tracking**
- [x] Finalize plan and progress log templates.
- [x] Document environment setup expectations.
2. **Pipeline UX**
- [x] Implement CLI entrypoint (`corgi.cli:main`).
- [x] Provide structured stdout for steps/evidence/answer.
- [x] Allow optional JSON dump for downstream tooling.
3. **Interactive Demo**
- [x] Build Gradio app harness (image upload + question textbox).
- [ ] Stream progress (optional) and display textual reasoning/evidence.
- [x] Handle model loading errors gracefully.
4. **Testing & Tooling**
- [x] Add fixture-friendly helpers to avoid heavy model loads in tests.
- [x] Write unit tests for CLI argument parsing + formatting.
- [ ] Add regression test for pipeline serialization.
5. **Docs & Hand-off**
- [ ] Update README/demo instructions.
- [ ] Provide sample command sequences for CLI/Gradio.
- [ ] Capture open risks & future enhancements.
6. **Deployment & Ops**
- [x] Add Hugging Face Space entrypoint (`app.py`).
- [x] Write deployment helper script (`scripts/push_space.sh`).
- [ ] Add automated checklists/logs for Space updates.
## Risks & Mitigations
- **Model loading latency / VRAM** → expose config knobs and mention 4B fallback.
- **Parsing drift from Qwen outputs** → keep parser tolerant; add debug flag to dump raw responses.
- **Test runtime** → mock Qwen client via fixtures; avoid loading real model in unit tests.
## Progress Tracking
- Refer to `PROGRESS_LOG.md` for dated status updates.
- Update milestone table whenever a deliverable completes.