	---
	title: Pip - Emotional AI Companion
	emoji: 🫧
	colorFrom: blue
	colorTo: purple
	sdk: gradio
	sdk_version: 6.0.1
	app_file: app.py
	pinned: true
	license: mit
	short_description: A blob friend who transforms your feelings into visual art
	tags:
	- mcp-in-action-track-creative
	- mcp-in-action-track-consumer
	- agents
	- mcp
	---

	> 🎥 Demo Video: https://youtu.be/bWDj4gyngNI
	>
	> 📢 Social Post: https://x.com/07amit10/status/1995270517251801162
	>
	> 👥 Team: @Itsjustamit

	# 🫧 Pip - Your Emotional AI Companion

	Pip is a cute blob companion that understands your emotions and responds with conversation, context-specific imagery, and a soothing voice.

	Not a generic assistant - Pip is an emotional friend who knows when to reflect, celebrate, or gently intervene.

	---

	## ✨ What Makes Pip Special

	### Emotional Intelligence
	Pip doesn't just respond - it understands. Using Claude's nuanced emotional analysis, Pip detects:
	- Multiple co-existing emotions
	- Emotional intensity
	- Underlying needs (validation, comfort, celebration)
	- When gentle intervention might help
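One way to picture the result of this analysis is a small structured record. The sketch below is illustrative only: the `EmotionAnalysis` class and its field names are assumptions, not Pip's actual schema. Only the 1-10 intensity scale comes from the architecture diagram in this README.

```python
from dataclasses import dataclass, field


@dataclass
class EmotionAnalysis:
    """Hypothetical container for one analysis result (field names are assumed)."""

    emotions: list[str]  # co-existing emotions, strongest first
    intensity: int  # 1-10 scale, as in the architecture diagram
    needs: list[str] = field(default_factory=list)  # e.g. validation, comfort
    intervention_suggested: bool = False

    def __post_init__(self) -> None:
        # Reject values outside the documented 1-10 range.
        if not 1 <= self.intensity <= 10:
            raise ValueError("intensity must be between 1 and 10")

    @property
    def primary(self) -> str:
        """Return the strongest emotion, or 'neutral' when none was detected."""
        return self.emotions[0] if self.emotions else "neutral"


analysis = EmotionAnalysis(
    emotions=["anxious", "hopeful"],
    intensity=7,
    needs=["validation"],
)
print(analysis.primary)  # anxious
```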

	### Context-Specific Imagery
	Every image Pip creates is unique to your conversation. Not generic stock photos - visual art that captures YOUR emotional moment:
	- Mood Alchemist: Transform emotions into magical artifacts
	- Day's Artist: Turn your day into impressionistic art
	- Dream Weaver: Visualize thoughts in surreal imagery
	- Night Companion: Calming visuals for 3am moments

	### Multi-Service Architecture
	Pip routes each task to the AI service best suited to it:

	| Service | Role |
	|---------|------|
	| Anthropic Claude | Deep emotional analysis, intervention logic |
	| SambaNova | Fast acknowledgments, prompt enhancement |
	| OpenAI | Image generation, speech-to-text (Whisper) |
	| Google Gemini | Image generation (load balanced) |
	| Flux/SDXL | Artistic image generation (via Modal/HuggingFace) |
	| ElevenLabs | Expressive voice with emotional tone matching |
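The table says image generation is load balanced but does not say how. A minimal sketch, assuming a simple round-robin rotation with fallback to the next provider on failure; the `RoundRobinBalancer` class and the `fake_*` backends are hypothetical stand-ins, not Pip's code or any provider's real API.

```python
from itertools import cycle
from typing import Callable, Optional

ImageBackend = Callable[[str], str]  # prompt in, image reference out


class RoundRobinBalancer:
    """Rotate through backends, skipping any that fail (assumed strategy)."""

    def __init__(self, backends: dict[str, ImageBackend]) -> None:
        if not backends:
            raise ValueError("at least one backend is required")
        self._backends = backends
        self._names = list(backends)
        self._order = cycle(self._names)

    def generate(self, prompt: str) -> tuple[str, str]:
        """Try each backend at most once, starting from the next in rotation."""
        last_error: Optional[Exception] = None
        for _ in range(len(self._names)):
            name = next(self._order)
            try:
                return name, self._backends[name](prompt)
            except Exception as exc:  # one failed provider should not end the turn
                last_error = exc
        raise RuntimeError("all image backends failed") from last_error


def fake_openai(prompt: str) -> str:
    return f"openai:{prompt}"


def fake_gemini(prompt: str) -> str:
    raise TimeoutError("simulated outage")


def fake_flux(prompt: str) -> str:
    return f"flux:{prompt}"


balancer = RoundRobinBalancer(
    {"openai": fake_openai, "gemini": fake_gemini, "flux": fake_flux}
)
first = balancer.generate("a lantern")   # served by openai
second = balancer.generate("a lantern")  # gemini fails, so flux serves it
third = balancer.generate("a lantern")   # rotation wraps back to openai
print(first[0], second[0], third[0])  # openai flux openai
```

Round-robin keeps the example short. A weighted or latency-aware policy would slot into `generate` the same way.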

	### Low-Latency Design
	Pip is designed for responsiveness:
	- Quick acknowledgment (< 500 ms)
	- Progressive state changes while processing
	- Parallel task execution
	- Streaming responses
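The points above can be sketched with `asyncio`. This is an assumption-laden illustration rather than Pip's implementation: the coroutine names, the sleep-based stand-ins for model calls, and the choice of which steps run in parallel are all hypothetical.

```python
import asyncio


async def quick_ack(message: str) -> str:
    await asyncio.sleep(0.01)  # stand-in for a fast model call
    return "I hear you."


async def analyze(message: str) -> dict:
    await asyncio.sleep(0.03)  # stand-in for the slower emotional analysis
    return {"emotion": "tired", "intensity": 6}


async def make_image(analysis: dict) -> str:
    await asyncio.sleep(0.05)  # stand-in for image generation
    return f"image for {analysis['emotion']}"


async def write_reply(analysis: dict) -> str:
    await asyncio.sleep(0.05)  # stand-in for the text response
    return "That sounds like a long day."


async def respond(message: str):
    """Yield (state, payload) pairs so the UI can update before the work is done."""
    yield "listening", await quick_ack(message)
    yield "thinking", None
    analysis = await analyze(message)
    # The image and the reply only depend on the analysis, so run them together.
    image, reply = await asyncio.gather(make_image(analysis), write_reply(analysis))
    yield "speaking", {"image": image, "reply": reply}


async def collect(message: str) -> list:
    return [update async for update in respond(message)]


updates = asyncio.run(collect("long day"))
print([state for state, _ in updates])  # ['listening', 'thinking', 'speaking']
```

Because `respond` is an async generator, the acknowledgment reaches the caller before the analysis has started, which is the property the quick-acknowledgment target relies on.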

	---

	## 🎮 How to Use

	### Chat Interface
	1. Type how you're feeling or what's on your mind
	2. Watch Pip's expression change as it processes
	3. Receive a thoughtful response + custom image
	4. Optionally enable voice to hear Pip speak

	### Voice Input
	1. Click the microphone button
	2. Speak your thoughts
	3. Pip transcribes and responds with voice

	### Modes
	- Auto: Pip decides the best visualization style
	- Alchemist: Emotions become magical artifacts
	- Artist: Your day becomes a painting
	- Dream: Thoughts become surreal visions
	- Night: Calming imagery for late hours
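The mode list maps naturally onto an enum. In the sketch below, the `Mode` enum, the `STYLE_HINTS` table, and especially the time-of-day rule used for Auto are assumptions for illustration; the README only says that Auto lets Pip decide.

```python
from enum import Enum


class Mode(str, Enum):
    AUTO = "auto"
    ALCHEMIST = "alchemist"
    ARTIST = "artist"
    DREAM = "dream"
    NIGHT = "night"


# Style phrases paraphrased from the mode descriptions above.
STYLE_HINTS = {
    Mode.ALCHEMIST: "a magical artifact",
    Mode.ARTIST: "an impressionistic painting",
    Mode.DREAM: "a surreal vision",
    Mode.NIGHT: "calming late-night imagery",
}


def resolve_mode(name: str, hour: int) -> Mode:
    """Turn a user-supplied mode name into a concrete mode."""
    try:
        mode = Mode(name.strip().lower())
    except ValueError:
        mode = Mode.AUTO  # unknown names fall back to Auto
    if mode is not Mode.AUTO:
        return mode
    # Assumed heuristic: Night between 23:00 and 05:00, otherwise Artist.
    return Mode.NIGHT if hour >= 23 or hour < 5 else Mode.ARTIST


print(resolve_mode("Dream", hour=12).value)  # dream
print(STYLE_HINTS[resolve_mode("auto", hour=3)])  # calming late-night imagery
```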

	---

	## 🤖 MCP Integration

	Pip is available as an MCP (Model Context Protocol) server. Connect your AI agent!

	### For SSE-compatible clients (Cursor, Windsurf, Cline):
	```json
	{
	  "mcpServers": {
	    "Pip": {
	      "url": "https://YOUR-SPACE.hf.space/gradio_api/mcp/"
	    }
	  }
	}
	```

	### For stdio clients (Claude Desktop):
	```json
	{
	  "mcpServers": {
	    "Pip": {
	      "command": "npx",
	      "args": [
	        "mcp-remote",
	        "https://YOUR-SPACE.hf.space/gradio_api/mcp/sse",
	        "--transport",
	        "sse-only"
	      ]
	    }
	  }
	}
	```
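If you manage several clients, both configurations can be generated from the Space host name. The helper below is a hypothetical convenience, not part of Pip. The URL paths and `mcp-remote` arguments are copied from the two snippets above, and `YOUR-SPACE.hf.space` remains a placeholder.

```python
import json


def mcp_config(space_host: str, stdio: bool = False, name: str = "Pip") -> dict:
    """Build an mcpServers entry for a URL-based or a stdio-only client."""
    base = f"https://{space_host}/gradio_api/mcp/"
    if stdio:
        # stdio clients bridge to the remote server through mcp-remote.
        entry = {
            "command": "npx",
            "args": ["mcp-remote", base + "sse", "--transport", "sse-only"],
        }
    else:
        entry = {"url": base}
    return {"mcpServers": {name: entry}}


print(json.dumps(mcp_config("YOUR-SPACE.hf.space"), indent=2))
print(json.dumps(mcp_config("YOUR-SPACE.hf.space", stdio=True), indent=2))
```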

	### Available MCP Tools
	- `chat_with_pip(message, session_id)` - Talk to Pip
	- `generate_mood_artifact(emotion, context)` - Create emotional art
	- `get_pip_gallery(session_id)` - View conversation history
	- `set_pip_mode(mode, session_id)` - Change interaction mode

	---

	## 🧠 The Architecture

	```
	              User Input
	                   ↓
	┌─────────────────────────────────────┐
	│ SambaNova: Quick Acknowledgment     │ ← Immediate response
	└─────────────────────────────────────┘
	                   ↓
	┌─────────────────────────────────────┐
	│ Claude: Emotion Analysis            │ ← Deep understanding
	│   - Primary emotions                │
	│   - Intensity (1-10)                │
	│   - Intervention needed?            │
	└─────────────────────────────────────┘
	                   ↓
	┌─────────────────────────────────────┐
	│ Claude: Action Decision             │ ← What should Pip do?
	│   - reflect / celebrate / comfort   │
	│   - calm / energize / intervene     │
	└─────────────────────────────────────┘
	                   ↓
	┌─────────────────────────────────────┐
	│ SambaNova: Prompt Enhancement       │ ← Create vivid image prompt
	│ (Context-specific, never generic)   │
	└─────────────────────────────────────┘
	                   ↓
	┌─────────────────────────────────────┐
	│ Image Generation (Load Balanced)    │
	│  ┌────────┐ ┌────────┐ ┌────────┐   │
	│  │ OpenAI │ │ Gemini │ │  Flux  │   │
	│  └────────┘ └────────┘ └────────┘   │
	└─────────────────────────────────────┘
	                   ↓
	┌─────────────────────────────────────┐
	│ Claude/SambaNova: Response          │ ← Streaming text
	│ (Load balanced for efficiency)      │
	└─────────────────────────────────────┘
	                   ↓
	┌─────────────────────────────────────┐
	│ ElevenLabs: Voice (Optional)        │ ← Emotional tone matching
	└─────────────────────────────────────┘
	```
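The flow in the diagram can be read as a chain of stages that each fill in part of a turn. The sketch below follows the diagram's order but is otherwise hypothetical: the `Turn` and `Services` classes, their field names, and the lambda stubs are assumptions, and no real provider API is called.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional

# The six actions named in the "Action Decision" box.
ACTIONS = ("reflect", "celebrate", "comfort", "calm", "energize", "intervene")


@dataclass
class Turn:
    """Everything produced for one user message (hypothetical structure)."""

    message: str
    ack: str = ""
    analysis: dict = field(default_factory=dict)
    action: str = ""
    image_prompt: str = ""
    image: str = ""
    reply: str = ""
    audio: Optional[bytes] = None


@dataclass
class Services:
    """Injected callables standing in for the external APIs."""

    acknowledge: Callable[[str], str]
    analyze: Callable[[str], dict]
    decide: Callable[[dict], str]
    enhance: Callable[[str, dict], str]
    draw: Callable[[str], str]
    reply: Callable[[str, dict, str], str]
    speak: Callable[[str], bytes]


def run_turn(message: str, svc: Services, voice: bool = False) -> Turn:
    """Run the stages in the diagram's order; voice is the optional last step."""
    turn = Turn(message)
    turn.ack = svc.acknowledge(message)
    turn.analysis = svc.analyze(message)
    turn.action = svc.decide(turn.analysis)
    if turn.action not in ACTIONS:
        raise ValueError(f"unknown action: {turn.action}")
    turn.image_prompt = svc.enhance(message, turn.analysis)
    turn.image = svc.draw(turn.image_prompt)
    turn.reply = svc.reply(message, turn.analysis, turn.action)
    if voice:
        turn.audio = svc.speak(turn.reply)
    return turn


stub = Services(
    acknowledge=lambda m: "Mm, I'm here.",
    analyze=lambda m: {"primary": ["proud"], "intensity": 8, "intervention": False},
    decide=lambda a: "celebrate",
    enhance=lambda m, a: f"confetti over a finish line, inspired by: {m}",
    draw=lambda p: f"<image for '{p}'>",
    reply=lambda m, a, act: "You did it! Tell me everything.",
    speak=lambda text: text.encode("utf-8"),
)
turn = run_turn("I finished my first 10k today", stub)
print(turn.action, turn.audio)  # celebrate None
```

Passing the services in as callables keeps the orchestration testable without API keys. Swapping a stub for a real client would not change `run_turn`.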

	---

	## 🎨 Pip's Expressions

	Pip has 10 distinct emotional states with unique animations:
	- Neutral (gentle wobble)
	- Happy (bouncing)
	- Sad (drooping)
	- Thinking (looking up, swaying)
	- Concerned (worried eyebrows, shaking)
	- Excited (energetic bouncing with sparkles)
	- Sleepy (half-closed eyes, breathing)
	- Listening (wide eyes, pulsing)
	- Attentive (leaning forward)
	- Speaking (animated mouth)

	---

	## 💡 Key Features

	### Intervention Without Preaching
	When Pip detects concerning emotional signals, it doesn't lecture. Instead:
	- Brief acknowledgment
	- Gentle redirect to curiosity/wonder
	- Show something beautiful or intriguing
	- Invite engagement, not advice

	### Not Generic
	Every image prompt is crafted from YOUR specific words and context. Pip extracts:
	- Specific details you mentioned
	- Emotional undertones
	- Time/context clues
	- Your unique situation
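As a rough illustration of how those four ingredients might combine, here is a plain template-based stand-in. In Pip the prompt is written by a language model, so the `build_image_prompt` helper, its parameters, and its output format are assumptions made for this sketch.

```python
from typing import Sequence


def build_image_prompt(
    details: Sequence[str],
    emotions: Sequence[str],
    time_hint: str = "",
    style: str = "",
) -> str:
    """Compose an image prompt from the user's own details (hypothetical helper)."""
    if not details:
        # Without user-specific details the prompt would be generic.
        raise ValueError("need at least one user-specific detail")
    parts = [", ".join(details)]
    if emotions:
        parts.append("mood: " + " and ".join(emotions))
    if time_hint:
        parts.append(time_hint)
    if style:
        parts.append("rendered as " + style)
    return "; ".join(parts)


prompt = build_image_prompt(
    details=["a half-packed suitcase", "rain on the window"],
    emotions=["nervous", "excited"],
    time_hint="late evening",
    style="an impressionistic painting",
)
print(prompt)
```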

	---

	## 🛠️ Tech Stack

	- Frontend: Gradio
	- Character: SVG + CSS animations
	- LLMs: Anthropic Claude, SambaNova (Llama)
	- Images: OpenAI DALL-E 3, Google Imagen, Flux
	- Voice: ElevenLabs (Flash v2.5 for speed, v3 for expression)
	- STT: OpenAI Whisper
	- Compute: Modal (for Flux/SDXL)
	- Hosting: HuggingFace Spaces

	---

	## 🔧 Environment Variables

	```
	ANTHROPIC_API_KEY=your_key
	SAMBANOVA_API_KEY=your_key
	OPENAI_API_KEY=your_key
	GOOGLE_API_KEY=your_key
	ELEVENLABS_API_KEY=your_key
	HF_TOKEN=your_token  # optional, for HuggingFace models
	```
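A startup check can report missing keys before the first request fails. This is a minimal sketch: the `missing_keys` helper is hypothetical, and treating every variable except `HF_TOKEN` as required is an assumption based on the list above.

```python
import os
from typing import Mapping

# Assumed required: every key listed above except the optional HF_TOKEN.
REQUIRED = (
    "ANTHROPIC_API_KEY",
    "SAMBANOVA_API_KEY",
    "OPENAI_API_KEY",
    "GOOGLE_API_KEY",
    "ELEVENLABS_API_KEY",
)


def missing_keys(env: Mapping[str, str] = os.environ) -> list[str]:
    """Return the required variable names that are unset or blank."""
    return [name for name in REQUIRED if not env.get(name, "").strip()]


if __name__ == "__main__":
    missing = missing_keys()
    if missing:
        print("Missing environment variables: " + ", ".join(missing))
    else:
        print("All required keys are set.")
```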

	---

	## 📝 License

	MIT License - Feel free to use, modify, and share!

	---

	Built with 💙 for MCP's 1st Birthday Hackathon 2025

	Pip uses: Anthropic ($25K), OpenAI ($25), HuggingFace ($25), SambaNova ($25), ElevenLabs ($44), Modal ($250), Blaxel ($250)