| title: Pip - Emotional AI Companion | |
| emoji: 🫧 | |
| colorFrom: blue | |
| colorTo: purple | |
| sdk: gradio | |
| sdk_version: 6.0.1 | |
| app_file: app.py | |
| pinned: true | |
| license: mit | |
| short_description: A blob friend who transforms your feelings into visual art | |
| tags: | |
| - mcp-in-action-track-creative | |
| - mcp-in-action-track-consumer | |
| - agents | |
| - mcp | |
| > 🎥 **Demo Video**: https://youtu.be/bWDj4gyngNI | |
| > | |
| > 📢 **Social Post**: https://x.com/07amit10/status/1995270517251801162 | |
| > | |
| > 👥 **Team**: @Itsjustamit | |
| # 🫧 Pip - Your Emotional AI Companion | |
| **Pip is a cute blob companion that understands your emotions and responds with conversation, context-specific imagery, and soothing voice.** | |
| Not a generic assistant - Pip is an emotional friend who knows when to reflect, celebrate, or gently intervene. | |
| --- | |
| ## ✨ What Makes Pip Special | |
| ### Emotional Intelligence | |
| Pip doesn't just respond - it **understands**. Using Claude's nuanced emotional analysis, Pip detects: | |
| - Multiple co-existing emotions | |
| - Emotional intensity | |
| - Underlying needs (validation, comfort, celebration) | |
| - When gentle intervention might help | |
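The four signals listed above map naturally onto a small structured record. The sketch below is illustrative only: the class name, field names, and the `parse_analysis` helper are hypothetical and not taken from Pip's code. The 1-10 intensity range follows the architecture diagram further down.

```python
from dataclasses import dataclass, field

# Hypothetical schema: names are illustrative, not Pip's actual code.
NEEDS = {"validation", "comfort", "celebration"}


@dataclass
class EmotionAnalysis:
    emotions: list[str]              # co-existing emotions
    intensity: int                   # 1 (mild) to 10 (overwhelming)
    needs: list[str] = field(default_factory=list)
    intervention_needed: bool = False

    def __post_init__(self) -> None:
        if not self.emotions:
            raise ValueError("at least one emotion is required")
        if not 1 <= self.intensity <= 10:
            raise ValueError("intensity must be between 1 and 10")
        unknown = set(self.needs) - NEEDS
        if unknown:
            raise ValueError(f"unknown needs: {sorted(unknown)}")


def parse_analysis(raw: dict) -> EmotionAnalysis:
    """Validate a model reply that has already been decoded from JSON."""
    return EmotionAnalysis(
        emotions=list(raw["emotions"]),
        intensity=int(raw["intensity"]),
        needs=list(raw.get("needs", [])),
        intervention_needed=bool(raw.get("intervention_needed", False)),
    )
```

Validating the reply before acting on it means a malformed or out-of-range analysis fails loudly instead of silently steering the rest of the pipeline.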
| ### Context-Specific Imagery | |
| Every image Pip creates is **unique to your conversation**. Not generic stock photos - visual art that captures YOUR emotional moment: | |
| - Mood Alchemist: Transform emotions into magical artifacts | |
| - Day's Artist: Turn your day into impressionistic art | |
| - Dream Weaver: Visualize thoughts in surreal imagery | |
| - Night Companion: Calming visuals for 3am moments | |
| ### Multi-Service Architecture | |
| Pip uses **multiple AI services** intelligently: | |
| | Service | Role | | |
| |---------|------| | |
| | **Anthropic Claude** | Deep emotional analysis, intervention logic | | |
| | **SambaNova** | Fast acknowledgments, prompt enhancement | | |
| | **OpenAI** | Image generation, speech-to-text (Whisper) | | |
| | **Google Gemini** | Image generation (load balanced) | | |
| | **Flux/SDXL** | Artistic image generation (via Modal/HuggingFace) | | |
| | **ElevenLabs** | Expressive voice with emotional tone matching | | |
| ### Low-Latency Design | |
| Pip is designed for **responsiveness**: | |
| - Quick acknowledgment (< 500ms) | |
| - Progressive state changes while processing | |
| - Parallel task execution | |
| - Streaming responses | |
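A minimal sketch of how quick acknowledgment, progressive state changes, and parallel execution could fit together with `asyncio`. Every function here is a hypothetical stand-in (the sleeps simulate service latency), and the state names are borrowed from Pip's expression list; the real orchestration may differ. Streaming is not shown.

```python
import asyncio


# Hypothetical stand-ins for the real service calls.
async def quick_ack(message: str) -> str:
    await asyncio.sleep(0.01)
    return "I hear you..."


async def analyze_emotion(message: str) -> dict:
    await asyncio.sleep(0.05)
    return {"emotions": ["tired"], "intensity": 4}


async def generate_image(prompt: str) -> str:
    await asyncio.sleep(0.05)
    return f"image for: {prompt}"


async def write_reply(message: str, analysis: dict) -> str:
    await asyncio.sleep(0.05)
    return f"That sounds {analysis['emotions'][0]}."


async def respond(message: str, on_state) -> dict:
    # Start the slower analysis immediately, then acknowledge while it runs.
    analysis_task = asyncio.create_task(analyze_emotion(message))
    on_state("listening")
    ack = await quick_ack(message)
    on_state("thinking")
    analysis = await analysis_task
    # The image and the text reply are independent, so run them concurrently.
    image, reply = await asyncio.gather(
        generate_image(message), write_reply(message, analysis)
    )
    on_state("speaking")
    return {"ack": ack, "reply": reply, "image": image}
```

Calling `asyncio.run(respond("long day", print))` prints the state changes in order and returns the acknowledgment, reply, and image together.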
| --- | |
| ## 🎮 How to Use | |
| ### Chat Interface | |
| 1. Type how you're feeling or what's on your mind | |
| 2. Watch Pip's expression change as it processes | |
| 3. Receive a thoughtful response + custom image | |
| 4. Optionally enable voice to hear Pip speak | |
| ### Voice Input | |
| 1. Click the microphone button | |
| 2. Speak your thoughts | |
| 3. Pip transcribes and responds with voice | |
| ### Modes | |
| - **Auto**: Pip decides the best visualization style | |
| - **Alchemist**: Emotions become magical artifacts | |
| - **Artist**: Your day becomes a painting | |
| - **Dream**: Thoughts become surreal visions | |
| - **Night**: Calming imagery for late hours | |
| --- | |
| ## 🤖 MCP Integration | |
| Pip is available as an **MCP (Model Context Protocol) server**. Connect your AI agent! | |
| ### For clients that connect by URL (Cursor, Windsurf, Cline): | |
| ```json | |
| { | |
| "mcpServers": { | |
| "Pip": { | |
| "url": "https://YOUR-SPACE.hf.space/gradio_api/mcp/" | |
| } | |
| } | |
| } | |
| ``` | |
| ### For stdio clients (Claude Desktop): | |
| ```json | |
| { | |
| "mcpServers": { | |
| "Pip": { | |
| "command": "npx", | |
| "args": [ | |
| "mcp-remote", | |
| "https://YOUR-SPACE.hf.space/gradio_api/mcp/sse", | |
| "--transport", | |
| "sse-only" | |
| ] | |
| } | |
| } | |
| } | |
| ``` | |
| ### Available MCP Tools | |
| - `chat_with_pip(message, session_id)` - Talk to Pip | |
| - `generate_mood_artifact(emotion, context)` - Create emotional art | |
| - `get_pip_gallery(session_id)` - View conversation history | |
| - `set_pip_mode(mode, session_id)` - Change interaction mode | |
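To make the tool signatures concrete, here is a rough sketch of three of them as plain Python functions. Gradio can expose typed, documented functions as MCP tools when an app is launched with `mcp_server=True`; whether Pip wires its tools exactly this way is an assumption. The in-memory session store, the lowercase mode identifiers, and the placeholder replies are all hypothetical.

```python
from collections import defaultdict

# Mode names come from the Modes list; lowercase identifiers are an assumption.
VALID_MODES = ("auto", "alchemist", "artist", "dream", "night")

# Hypothetical in-memory session store; a real server would need persistence.
_sessions: dict[str, dict] = defaultdict(lambda: {"mode": "auto", "history": []})


def set_pip_mode(mode: str, session_id: str) -> str:
    """Change the interaction mode for one session."""
    mode = mode.strip().lower()
    if mode not in VALID_MODES:
        return f"Unknown mode '{mode}'. Choose one of: {', '.join(VALID_MODES)}"
    _sessions[session_id]["mode"] = mode
    return f"Mode set to {mode}"


def chat_with_pip(message: str, session_id: str) -> str:
    """Record the message and return a placeholder reply."""
    session = _sessions[session_id]
    session["history"].append(message)
    return f"[{session['mode']}] Pip heard: {message}"


def get_pip_gallery(session_id: str) -> list[str]:
    """Return the messages seen so far in this session."""
    return list(_sessions[session_id]["history"])
```

Keying all state by `session_id` is what lets several agents talk to the same server without seeing each other's history or mode.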
| --- | |
| ## 🧠 The Architecture | |
| ``` | |
| User Input | |
| ↓ | |
| ┌─────────────────────────────────────┐ | |
| │ SambaNova: Quick Acknowledgment │ ← Immediate response | |
| └─────────────────────────────────────┘ | |
| ↓ | |
| ┌─────────────────────────────────────┐ | |
| │ Claude: Emotion Analysis │ ← Deep understanding | |
| │ - Primary emotions │ | |
| │ - Intensity (1-10) │ | |
| │ - Intervention needed? │ | |
| └─────────────────────────────────────┘ | |
| ↓ | |
| ┌─────────────────────────────────────┐ | |
| │ Claude: Action Decision │ ← What should Pip do? | |
| │ - reflect / celebrate / comfort │ | |
| │ - calm / energize / intervene │ | |
| └─────────────────────────────────────┘ | |
| ↓ | |
| ┌─────────────────────────────────────┐ | |
| │ SambaNova: Prompt Enhancement │ ← Create vivid image prompt | |
| │ (Context-specific, never generic) │ | |
| └─────────────────────────────────────┘ | |
| ↓ | |
| ┌─────────────────────────────────────┐ | |
| │ Image Generation (Load Balanced) │ | |
| │ ┌────────┐ ┌────────┐ ┌────────┐ │ | |
| │ │ OpenAI │ │ Gemini │ │ Flux │ │ | |
| │ └────────┘ └────────┘ └────────┘ │ | |
| └─────────────────────────────────────┘ | |
| ↓ | |
| ┌─────────────────────────────────────┐ | |
| │ Claude/SambaNova: Response │ ← Streaming text | |
| │ (Load balanced for efficiency) │ | |
| └─────────────────────────────────────┘ | |
| ↓ | |
| ┌─────────────────────────────────────┐ | |
| │ ElevenLabs: Voice (Optional) │ ← Emotional tone matching | |
| └─────────────────────────────────────┘ | |
| ``` | |
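The diagram says image generation is load balanced but does not name a strategy. One simple possibility is round-robin rotation with fallback, sketched below. The class, the provider stubs, and the choice of round-robin itself are assumptions for illustration, not a description of Pip's actual balancer.

```python
import itertools
from typing import Callable


class RoundRobinBalancer:
    """Rotate through providers, falling back to the next one on failure."""

    def __init__(self, providers: dict[str, Callable[[str], str]]) -> None:
        if not providers:
            raise ValueError("at least one provider is required")
        self._providers = providers
        self._order = itertools.cycle(list(providers))

    def generate(self, prompt: str) -> tuple[str, str]:
        errors = []
        # Try each provider at most once per request.
        for _ in range(len(self._providers)):
            name = next(self._order)
            try:
                return name, self._providers[name](prompt)
            except Exception as exc:  # a real client would catch narrower errors
                errors.append(f"{name}: {exc}")
        raise RuntimeError("all providers failed: " + "; ".join(errors))


# Hypothetical provider stubs standing in for the real API clients.
def openai_image(prompt: str) -> str:
    return f"openai:{prompt}"


def gemini_image(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def flux_image(prompt: str) -> str:
    return f"flux:{prompt}"


balancer = RoundRobinBalancer(
    {"openai": openai_image, "gemini": gemini_image, "flux": flux_image}
)
```

With the simulated outage above, successive calls to `balancer.generate(...)` alternate between the two healthy providers, and a request only fails when every provider has failed.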
| --- | |
| ## 🎨 Pip's Expressions | |
| Pip has **10 distinct emotional states** with unique animations: | |
| - Neutral (gentle wobble) | |
| - Happy (bouncing) | |
| - Sad (drooping) | |
| - Thinking (looking up, swaying) | |
| - Concerned (worried eyebrows, shaking) | |
| - Excited (energetic bouncing with sparkles) | |
| - Sleepy (half-closed eyes, breathing) | |
| - Listening (wide eyes, pulsing) | |
| - Attentive (leaning forward) | |
| - Speaking (animated mouth) | |
| --- | |
| ## 💡 Key Features | |
| ### Intervention Without Preaching | |
| When Pip detects concerning emotional signals, it doesn't lecture. Instead: | |
| - Brief acknowledgment | |
| - Gentle redirect to curiosity/wonder | |
| - Show something beautiful or intriguing | |
| - Invite engagement, not advice | |
| ### Not Generic | |
| Every image prompt is crafted from YOUR specific words and context. Pip extracts: | |
| - Specific details you mentioned | |
| - Emotional undertones | |
| - Time/context clues | |
| - Your unique situation | |
| --- | |
| ## 🛠️ Tech Stack | |
| - **Frontend**: Gradio | |
| - **Character**: SVG + CSS animations | |
| - **LLMs**: Anthropic Claude, SambaNova (Llama) | |
| - **Images**: OpenAI DALL-E 3, Google Imagen, Flux | |
| - **Voice**: ElevenLabs (Flash v2.5 for speed, v3 for expression) | |
| - **STT**: OpenAI Whisper | |
| - **Compute**: Modal (for Flux/SDXL) | |
| - **Hosting**: HuggingFace Spaces | |
| --- | |
| ## 🔧 Environment Variables | |
| ``` | |
| ANTHROPIC_API_KEY=your_key | |
| SAMBANOVA_API_KEY=your_key | |
| OPENAI_API_KEY=your_key | |
| GOOGLE_API_KEY=your_key | |
| ELEVENLABS_API_KEY=your_key | |
| HF_TOKEN=your_token  # optional, for HuggingFace models | |
| ``` | |
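A startup check can report missing keys before the first request fails. This sketch assumes every variable except `HF_TOKEN` is required, which may be stricter than the app needs if some providers are optional fallbacks. The `missing_keys` helper is hypothetical.

```python
import os

# Assumption: everything except HF_TOKEN is treated as required.
REQUIRED_KEYS = [
    "ANTHROPIC_API_KEY",
    "SAMBANOVA_API_KEY",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "ELEVENLABS_API_KEY",
]
OPTIONAL_KEYS = ["HF_TOKEN"]


def missing_keys(env=None) -> list[str]:
    """Return the required variable names that are unset or blank."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS if not env.get(key, "").strip()]
```

Calling `missing_keys()` with no argument reads the process environment; passing a dict makes the check easy to test.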
| --- | |
| ## 📝 License | |
| MIT License - Feel free to use, modify, and share! | |
| --- | |
| *Built with 💙 for MCP's 1st Birthday Hackathon 2025* | |
| *Pip uses: Anthropic ($25K), OpenAI ($25), HuggingFace ($25), SambaNova ($25), ElevenLabs ($44), Modal ($250), Blaxel ($250)* | |