Spaces: likhonsheikh/memo-text-to-video

likhonsheikh committed 8 days ago

Commit f16ace1 (verified) · 1 parent: c2f1523

Upload README_space.md with huggingface_hub

Files changed (1)

README_space.md +243 -0

README_space.md ADDED

	@@ -0,0 +1,243 @@

+---
+license: apache-2.0
+language:
+- bn
+- en
+tags:
+- transformers
+- safetensors
+- stable-diffusion
+- bangla
+- text-to-video
+- lora
+- scene-planning
+- computer-vision
+- natural-language-processing
+- mlops
+- production-grade
+pipeline_tag: text-to-video
+model-index:
+- name: memo
+  results: []
+---
+# Memo: Production-Grade Transformers + Safetensors Implementation
+![Memo Logo](https://img.shields.io/badge/Memo-Transformers%20%2B%20Safetensors-brightgreen?style=for-the-badge)
+![Transformers](https://img.shields.io/badge/Transformers-4.57.3-blue?style=flat-square)
+![Safetensors](https://img.shields.io/badge/Safetensors-0.7.0-red?style=flat-square)
+![License](https://img.shields.io/badge/License-Apache%202.0-green?style=flat-square)
+## Overview
+This release reworks Memo around **Transformers + Safetensors**, replacing pickle-based weight files and placeholder logic with the model loading, validation, and tier management described below.
+## What We've Built
+### ✅ Core Requirements Met
+1. **Transformers Integration**
+   - Bangla text parsing using `google/mt5-small`
+   - Proper tokenization and model loading
+   - Deterministic scene extraction with controlled parameters
+   - Memory optimization with device mapping
+2. **Safetensors Security**
+   - **MANDATORY** `use_safetensors=True` for all model loading
+   - No .bin, .ckpt, or pickle files anywhere
+   - Model weight validation and security checks
+   - Signature verification for LoRA files
+3. **Production Architecture**
+   - Tier-based model management (Free/Pro/Enterprise)
+   - Memory optimization and performance tuning
+   - Background processing for long-running tasks
+   - Proper error handling and logging
+## File Structure
+```
+📁 Memo/
+├── 📄 requirements.txt                    # Production dependencies
+├── 📁 models/
+│   ├── 📁 text/
+│   │   └── 📄 bangla_parser.py           # Transformer-based Bangla parser
+│   └── 📁 image/
+│       └── 📄 sd_generator.py            # Stable Diffusion + Safetensors
+├── 📁 core/
+│   └── 📄 scene_planner.py               # ML-based scene planning
+├── 📁 data/
+│   └── 📁 lora/
+│       └── 📄 README.md                  # LoRA configuration (safetensors only)
+├── 📁 scripts/
+│   └── 📄 train_scene_lora.py            # Training with safetensors output
+├── 📁 config/
+│   └── 📄 model_tiers.py                 # Tier management system
+└── 📁 api/
+    └── 📄 main.py                        # Production API endpoint
+```
+## Key Features
+### 🔒 Security (Non-Negotiable)
+- **Safetensors-only model loading** - No unsafe formats
+- **Model signature validation** - Verify weight integrity
+- **LoRA security checks** - Ensure only .safetensors files
+- **Memory-safe loading** - Weights are read as plain tensor data, so loading never executes code embedded in a file
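The "safetensors only" rule above can be checked mechanically before any model is loaded. The following is a minimal sketch using only the standard library, not Memo's actual check: the function name `find_unsafe_weight_files` is hypothetical, and the deny list adds `.pt`, `.pth`, `.pkl`, and `.pickle` as assumptions beyond the `.bin`, `.ckpt`, and pickle formats this README names.

```python
from pathlib import Path

# Extensions treated as unsafe in this sketch. The README names .bin, .ckpt and
# pickle files; the remaining suffixes are assumptions added for illustration.
UNSAFE_SUFFIXES = {".bin", ".ckpt", ".pt", ".pth", ".pkl", ".pickle"}


def find_unsafe_weight_files(root):
    """Return the sorted paths under `root` whose suffix is on the deny list."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() in UNSAFE_SUFFIXES
    )
```

A deployment script could call `find_unsafe_weight_files("data/lora")` and refuse to start if the returned list is not empty.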
+### 🚀 Performance
+- **Memory optimization** - xFormers, attention slicing, CPU offload
+- **FP16 precision** - Roughly halves weight memory relative to FP32, typically with little visible quality loss
+- **LCM acceleration** - Faster inference when available
+- **Device mapping** - Optimal GPU/CPU utilization
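The FP16 saving listed above follows directly from bytes per parameter: 4 for FP32, 2 for FP16. Here is a small sketch of that arithmetic. The helper name `weight_memory_gib` and the parameter count in the usage line are illustrative and not taken from Memo. The figure covers weights only, since activations and framework overhead are not modelled.

```python
# Bytes needed to store one parameter at each precision.
BYTES_PER_PARAM = {"fp32": 4, "fp16": 2}


def weight_memory_gib(num_params, dtype):
    """Approximate memory for model weights alone, in GiB."""
    return num_params * BYTES_PER_PARAM[dtype] / 2**30


# Illustrative parameter count, not a measurement of any Memo model.
example_params = 2**30
fp32_gib = weight_memory_gib(example_params, "fp32")
fp16_gib = weight_memory_gib(example_params, "fp16")
print(f"fp32: {fp32_gib:.1f} GiB, fp16: {fp16_gib:.1f} GiB")
```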
+### 🏢 Enterprise Features
+- **Tier-based pricing** - Free/Pro/Enterprise configurations
+- **Resource management** - Memory limits and concurrent request handling
+- **Security compliance** - Audit trails and validation
+- **Scalability** - Background processing and proper async handling
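One common way to enforce a per-tier concurrency limit like the one mentioned above is an `asyncio.Semaphore`. The sketch below assumes an asyncio-based server. `run_with_limit` is a hypothetical helper, not part of Memo's API, and the README does not say how Memo actually implements its limits.

```python
import asyncio


async def run_with_limit(jobs, max_concurrent):
    """Run zero-argument coroutine functions with at most `max_concurrent` in flight."""
    # Created inside the running event loop so it works across Python 3 versions.
    semaphore = asyncio.Semaphore(max_concurrent)

    async def guarded(job):
        # Each job waits here until one of the `max_concurrent` slots is free.
        async with semaphore:
            return await job()

    # gather() preserves the order of `jobs` in the returned list.
    return await asyncio.gather(*(guarded(job) for job in jobs))
```

With the limits listed under Model Tiers, a Pro worker would pass `max_concurrent=3`.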
+## Model Tiers
+### Free Tier
+- Base SDXL model (512x512)
+- 15 inference steps
+- No LoRA
+- 1 concurrent request
+### Pro Tier
+- Base SDXL model (768x768)
+- 25 inference steps
+- Scene LoRA enabled
+- LCM acceleration
+- 3 concurrent requests
+### Enterprise Tier
+- Base SDXL model (1024x1024)
+- 30 inference steps
+- Custom LoRA support
+- LCM acceleration
+- 10 concurrent requests
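The three tiers above map naturally onto a small immutable configuration table. This is a sketch and not the contents of `config/model_tiers.py`. The field names `image_model_id`, `lora_path`, and `lcm_enabled` come from the usage example in this README, while `resolution`, `inference_steps`, `max_concurrent_requests`, the `lookup_tier` helper, and the SDXL repository id are assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class TierConfig:
    image_model_id: str
    resolution: int                # square output size in pixels (assumed field)
    inference_steps: int           # assumed field
    lora_path: Optional[str]       # None means no bundled LoRA
    lcm_enabled: bool
    max_concurrent_requests: int   # assumed field


# Assumed repository id; the README only says "Base SDXL model".
SDXL_BASE = "stabilityai/stable-diffusion-xl-base-1.0"
SCENE_LORA = "data/lora/memo-scene-lora.safetensors"

TIERS = {
    "free": TierConfig(SDXL_BASE, 512, 15, None, False, 1),
    "pro": TierConfig(SDXL_BASE, 768, 25, SCENE_LORA, True, 3),
    # Enterprise supports custom LoRAs, so none is bundled by default here.
    "enterprise": TierConfig(SDXL_BASE, 1024, 30, None, True, 10),
}


def lookup_tier(name):
    """Return the configuration for a tier name, case-insensitively."""
    try:
        return TIERS[name.lower()]
    except KeyError:
        raise ValueError(f"unknown tier: {name!r}") from None
```

Freezing the dataclass keeps a request handler from mutating shared tier settings by accident.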
+## Usage Examples
+### Basic Scene Planning
+```python
+from core.scene_planner import plan_scenes
+scenes = plan_scenes(
+    text_bn="আজকের দিনটি খুব সুন্দর ছিল।",
+    duration=15
+)
+```
+### Tier-Based Generation
+```python
+from config.model_tiers import get_tier_config
+from models.image.sd_generator import get_generator
+config = get_tier_config("pro")
+generator = get_generator(
+    model_id=config.image_model_id,
+    lora_path=config.lora_path,
+    use_lcm=config.lcm_enabled
+)
+frames = generator.generate_frames(
+    prompt="Beautiful landscape scene",
+    frames=5
+)
+```
+### API Usage
+```bash
+curl -X POST "http://localhost:8000/generate" \
+  -H "Content-Type: application/json" \
+  -d '{
+    "text": "আজকের দিনটি খুব সুন্দর ছিল।",
+    "duration": 15,
+    "tier": "pro"
+  }'
+```
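The same request can be built from Python with the standard library. This sketch reuses the endpoint and JSON fields from the curl call above. The helper name `build_generate_request` is hypothetical, and the response format is not documented in this README, so the send step is left as a comment.

```python
import json
import urllib.request


def build_generate_request(text, duration, tier, base_url="http://localhost:8000"):
    """Build (but do not send) a POST request for the /generate endpoint."""
    payload = json.dumps({"text": text, "duration": duration, "tier": tier})
    return urllib.request.Request(
        f"{base_url}/generate",
        data=payload.encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


request = build_generate_request("আজকের দিনটি খুব সুন্দর ছিল।", duration=15, tier="pro")

# To send it against a running server:
# with urllib.request.urlopen(request) as response:
#     print(response.read().decode("utf-8"))
```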
+## Training Custom LoRA
+```python
+from scripts.train_scene_lora import SceneLoRATrainer, TrainingConfig
+config = TrainingConfig(
+    base_model="google/mt5-small",
+    rank=32,
+    alpha=64,
+    save_safetensors=True  # MANDATORY
+)
+trainer = SceneLoRATrainer(config)
+trainer.load_model()
+trainer.setup_lora()
+# `training_data` is not defined in this snippet; prepare your dataset before calling train()
+trainer.train(training_data)
+```
+## Security Validation
+```python
+from config.model_tiers import validate_model_weights_security
+result = validate_model_weights_security("data/lora/memo-scene-lora.safetensors")
+print(f"Secure: {result['is_secure']}")
+print(f"Issues: {result['issues']}")
+```
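For readers curious what a structural check of this kind might involve, here is a hedged sketch. It is not the implementation of `validate_model_weights_security`, and the function name `check_safetensors_file` is hypothetical. It relies only on the published safetensors layout, in which a file starts with an 8-byte little-endian header length followed by a JSON header. The return value mirrors the `is_secure` and `issues` keys used in the snippet above.

```python
import json
import struct
from pathlib import Path


def check_safetensors_file(path):
    """Hypothetical structural check; returns {'is_secure': bool, 'issues': [...]}."""
    issues = []
    file_path = Path(path)
    if file_path.suffix != ".safetensors":
        issues.append(f"unexpected extension: {file_path.suffix or '(none)'}")
    elif not file_path.is_file():
        issues.append("file not found")
    else:
        size = file_path.stat().st_size
        with file_path.open("rb") as handle:
            prefix = handle.read(8)
            if len(prefix) < 8:
                issues.append("file too short to contain a header length")
            else:
                # First 8 bytes: unsigned little-endian length of the JSON header.
                (header_len,) = struct.unpack("<Q", prefix)
                if header_len > size - 8:
                    issues.append("declared header length exceeds file size")
                else:
                    try:
                        header = json.loads(handle.read(header_len))
                    except ValueError:  # covers invalid JSON and bad encodings
                        issues.append("header is not valid JSON")
                    else:
                        if not isinstance(header, dict):
                            issues.append("header is not a JSON object")
    return {"is_secure": not issues, "issues": issues}
```

This only confirms that the file looks like a safetensors container. It does not verify signatures or check tensor contents.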
+## What This Guarantees
+✅ **Transformers-based** - Real ML, not toy logic
+✅ **Safetensors-only** - No pickle deserialization, so loading weights cannot run embedded code
+✅ **Production-ready** - Enterprise architecture
+✅ **Memory optimized** - Proper resource management
+✅ **Tier-based** - Scalable pricing model
+✅ **Audit compliant** - Security validation built-in
+## What This Doesn't Do
+❌ Make GPUs cheap
+❌ Fix bad prompts
+❌ Read your mind
+❌ Guarantee perfect results
+## Next Steps
+If you're serious about production deployment:
+1. **Cold-start optimization** - Preload frequently used models
+2. **Model versioning** - Track changes per tier
+3. **A/B testing** - Compare model performance
+4. **Monitoring** - Track usage and performance metrics
+5. **Load balancing** - Distribute across multiple GPUs
+## Running the System
+```bash
+# Install dependencies
+pip install -r requirements.txt
+# Train custom LoRA
+python scripts/train_scene_lora.py
+# Start API server
+python api/main.py
+# Check health
+curl http://localhost:8000/health
+```
+## Reality Check
+This implementation is now:
+- ✅ **Correct** - Uses proper ML frameworks
+- ✅ **Modern** - Transformers + Safetensors
+- ✅ **Secure** - No unsafe model formats
+- ✅ **Scalable** - Tier-based architecture
+- ✅ **Defensible** - Production-grade security
+If your API claims "state-of-the-art" without these features, you're lying. Memo now actually delivers on that promise.