abdull4h committed · Commit 4f052fd · verified · 1 Parent(s): 3867882

Update README.md

  1. README.md +73 -123
README.md CHANGED
@@ -9,138 +9,88 @@ app_file: app.py
  pinned: false
  ---

- # AI-Powered Bilingual Storyteller & Illustrator
-
- ## Overview
-
- This application generates high-quality stories in both English and Arabic with emotional analysis and optional illustrations. It uses a robust template-based approach combined with AI models to ensure culturally appropriate, engaging, and safe content generation.
-
- ## Key Features
-
- ### 1. Reliable Bilingual Story Generation
- - **English Stories**: High-quality narrative generation with emotional analysis
- - **Arabic Stories**: Template-based system with culturally appropriate content
- - **Automatic Language Detection**: Seamlessly handles input in either language
-
- ### 2. Multiple Creation Modes
- - **Basic Story Mode**: Generate stories from simple prompts
- - **Template Story Mode**: Guided creation using structured templates
- - **Visual Story Mode**: Create stories with illustrated scenes
-
- ### 3. Advanced Visualization
- - Generate scene sequences from stories (1-5 scenes)
- - Multiple artistic styles: realistic, anime, fantasy
- - Automatic prompt enhancement for better image quality
-
- ### 4. Content Safety System
- - Multi-layered content filtering to prevent inappropriate material
- - Language consistency verification
- - Repetition detection to maintain story quality
- - Graceful fallbacks to ensure reliable output
-
- ## Technical Implementation
-
- ### Story Generation Architecture
-
- The system uses a hybrid approach to story generation:
-
- 1. **English Generation**:
- - Uses EleutherAI/gpt-neo-1.3B with optimization for storytelling
- - Enhanced with template options for consistency
-
- 2. **Arabic Generation**:
- - Template-based system with curated high-quality narratives
- - Dynamic template selection based on prompt analysis
- - Parameter extraction to customize stories
- - Multiple fallback mechanisms to ensure appropriate content
-
- 3. **Emotion Analysis**:
- - English: distilbert-based sentiment analysis
- - Arabic: CAMeL-Lab/bert-base-arabic-sentiment when available
- - Cross-lingual sentiment analysis for comprehensive coverage
-
- 4. **Translation Capabilities**:
- - Arabic-to-English: Helsinki-NLP/opus-mt-ar-en
- - English-to-Arabic: Helsinki-NLP/opus-mt-en-ar (when available)
- - Used for cross-lingual operations and image generation
 
  ### Visual Generation
-
- The application uses Stable Diffusion (runwayml/stable-diffusion-v1-5) for image generation with:
-
- - Efficient GPU resource management
  - Scene extraction from story content
- - Style-specific prompt enhancement
- - Comprehensive error handling
-
- ## Usage Instructions
-
- ### Basic Story Generation
- 1. Enter a prompt in English or Arabic
- 2. Select your desired output language
- 3. Click "Generate Story"
- 4. Review your story with emotional analysis
-
- ### Template Story Creation
- 1. Choose a template type (Adventure, Friendship, Fantasy)
- 2. Fill in the template parameters or use defaults
- 3. Select output language
- 4. Generate your customized story
-
- ### Visual Storytelling
- 1. Enter your story prompt
- 2. Choose output language
- 3. Select the number of scenes (1-5)
- 4. Pick your preferred artistic style
- 5. Generate a story with matching illustrations
-
- ## Template System
-
- The application includes a sophisticated template system with:

- - **Adventure Templates**: Exploration and discovery narratives
- - **Friendship Templates**: Stories about connections and relationships
- - **Fantasy Templates**: Tales of magic and extraordinary powers

- Each template category includes multiple variations in both languages, ensuring fresh and engaging content each time. The system automatically:

- 1. Analyzes user prompts for keywords
- 2. Selects the most appropriate template type
- 3. Extracts parameters from the prompt when possible
- 4. Uses default parameters when needed
- 5. Customizes the selected template for a personalized story

- ## Safety Features

- The application prioritizes content safety through:

- 1. **Content Filtering**: Detection of inappropriate terms or patterns
- 2. **Language Consistency**: Verification of output language integrity
- 3. **Quality Control**: Detection of repetitive or nonsensical content
- 4. **Fallback Mechanisms**: Multiple layers of backup generation options
 
  ## Technical Requirements
-
  - Python 3.8+
- - CUDA-capable GPU recommended for image generation
- - Dependencies listed in requirements.txt
-
- ## Future Enhancements
-
- - Enhanced Arabic image prompt understanding
- - Voice narration for stories
- - Interactive branching narratives
- - Additional language support
- - Expanded template library
-
- ## License & Acknowledgements
-
- - [Hugging Face Transformers](https://github.com/huggingface/transformers)
- - [Diffusers](https://github.com/huggingface/diffusers)
- - [CAMeL-Lab](https://huggingface.co/CAMeL-Lab)
- - [Gradio](https://github.com/gradio-app/gradio)
- - [Helsinki-NLP](https://huggingface.co/Helsinki-NLP)
-
- ## Contact
-
- For questions or support, please open an issue in the repository.
 
  pinned: false
  ---

+ # AI-Powered Bilingual Storyteller & Illustrator - Technical Summary
+
+ ## Core Functionality
+ - Generates stories in English and Arabic with emotional analysis and optional illustrations
+ - Uses template-based approach with AI models to ensure quality and safety
+
+ ## Technical Architecture
+
+ ### Story Generation
+
+ #### NLP Pipelines
+ - **English Text Generation Pipeline**:
+ ```python
+ pipeline("text-generation", model="EleutherAI/gpt-neo-1.3B", device="cpu")
+ ```
+
+ - **Arabic Generation**:
+ ```python
+ # Uses MT5 instead of standard pipeline
+ AutoTokenizer.from_pretrained("google/mt5-small")
+ AutoModelForSeq2SeqLM.from_pretrained("google/mt5-small")
+ ```
+
+ - **Sentiment Analysis Pipelines**:
+ ```python
+ # English
+ pipeline("sentiment-analysis", model="distilbert-base-uncased-finetuned-sst-2-english", device="cpu")
+
+ # Arabic
+ pipeline("sentiment-analysis", model="CAMeL-Lab/bert-base-arabic-sentiment", device="cpu")
+ ```
+
+ - **Translation Pipelines**:
+ ```python
+ # Arabic to English
+ pipeline("translation", model="Helsinki-NLP/opus-mt-ar-en", device="cpu")
+
+ # English to Arabic
+ pipeline("translation", model="Helsinki-NLP/opus-mt-en-ar", device="cpu")
+ ```
 
  ### Visual Generation
+ - **Image Generation Pipeline**:
+ ```python
+ pipe = StableDiffusionPipeline.from_pretrained(
+     "runwayml/stable-diffusion-v1-5",
+     torch_dtype=torch.float16
+ )
+ ```
+ - Efficient GPU resource management via @spaces.GPU decorator
  - Scene extraction from story content
 
+ ### Content Safety System
+ - Multi-layered content filtering
+ - Regex pattern detection for inappropriate content
+ - Repetition detection (unique word ratio < 0.4)
+ - Fallback mechanisms to reliable templates
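
The repetition check listed above can be illustrated with a minimal sketch. Only the 0.4 threshold comes from the README; the function names, the tokenization with `\w+`, and the choice to treat empty text as repetitive are assumptions made for illustration, not the application's actual code.

```python
import re

# Threshold taken from the README: text is flagged when the
# unique-word ratio falls below 0.4.
REPETITION_THRESHOLD = 0.4


def unique_word_ratio(text: str) -> float:
    """Return unique words / total words (hypothetical helper)."""
    # \w+ is Unicode-aware in Python 3, so Arabic words are matched too.
    words = re.findall(r"\w+", text.lower())
    if not words:
        # Assumption: empty output gets ratio 0.0 so it is flagged
        # and the caller can fall back to a template.
        return 0.0
    return len(set(words)) / len(words)


def is_repetitive(text: str, threshold: float = REPETITION_THRESHOLD) -> bool:
    """Flag text whose unique-word ratio is below the threshold."""
    return unique_word_ratio(text) < threshold
```

A caller would typically test generated text with `is_repetitive(story)` and switch to a template story when it returns `True`.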

+ ## Implementation Highlights

+ ### MultilingualStoryGenerator Class
+ - Central class managing generation in both languages
+ - Handles language detection, content safety, and sentiment analysis
+ - Template selection logic based on keyword matching
+ - Parameter extraction from prompts
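
The README does not say how language detection is done. One common approach, sketched here purely as an assumption, is to measure the share of letters that fall in the basic Arabic Unicode block (U+0600 to U+06FF). The function name `detect_language` and the 0.5 cutoff are hypothetical.

```python
def detect_language(text: str) -> str:
    """Return "ar" if most letters are Arabic, otherwise "en".

    Hypothetical sketch: the real MultilingualStoryGenerator may use a
    different method.
    """
    letters = [ch for ch in text if ch.isalpha()]
    if not letters:
        # Assumption: default to English when there is nothing to inspect.
        return "en"
    arabic = sum(1 for ch in letters if "\u0600" <= ch <= "\u06FF")
    return "ar" if arabic / len(letters) > 0.5 else "en"
```

Counting letters only keeps digits, punctuation, and whitespace from skewing the ratio for short prompts.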

+ ### Story Templates
+ - Three categories: Adventure, Friendship, Fantasy
+ - Multiple variations in both languages
+ - Dynamic parameter filling
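
Keyword-based template selection with dynamic parameter filling could look roughly like the sketch below. The three category names come from the README; the template strings, keyword sets, default parameters, and function names are all invented for illustration.

```python
import re

# Hypothetical templates and keywords; only the category names are
# taken from the README.
TEMPLATES = {
    "adventure": "{hero} set out to explore {place} and found something unexpected.",
    "friendship": "{hero} met a new friend in {place}, and they helped each other.",
    "fantasy": "In {place}, {hero} discovered a hidden magical power.",
}
KEYWORDS = {
    "adventure": {"explore", "journey", "treasure", "adventure"},
    "friendship": {"friend", "friends", "friendship", "together"},
    "fantasy": {"magic", "dragon", "wizard", "fantasy"},
}
DEFAULTS = {"hero": "Layla", "place": "a distant valley"}


def select_template(prompt: str) -> str:
    """Pick the category whose keywords overlap the prompt the most."""
    words = set(re.findall(r"\w+", prompt.lower()))
    scores = {cat: len(words & kws) for cat, kws in KEYWORDS.items()}
    best = max(scores, key=scores.get)
    # Assumption: fall back to "adventure" when no keyword matches.
    return best if scores[best] > 0 else "adventure"


def fill_template(category: str, params=None) -> str:
    """Fill the template, using defaults for any missing parameter."""
    values = {**DEFAULTS, **(params or {})}
    return TEMPLATES[category].format(**values)
```

For example, `fill_template(select_template("A dragon who learns magic"), {"hero": "Omar"})` selects the fantasy template and substitutes the extracted hero name while keeping the default place.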

+ ### GPU Resource Management
+ - @spaces.GPU decorator for efficient GPU allocation
+ - Pipeline moved to GPU only when needed for image generation
+ - Proper cleanup with torch.cuda.empty_cache() and gc.collect()
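
The move-to-GPU-then-clean-up pattern described above can be sketched as a context manager. This is an assumption about the shape of the code, not the application's implementation: the name `on_gpu` is hypothetical, the `@spaces.GPU` decorator is omitted because it only applies inside Hugging Face Spaces, and the `torch` import is guarded so the sketch also runs where PyTorch is not installed.

```python
import gc
from contextlib import contextmanager

try:
    import torch
except ImportError:  # allows the sketch to run without PyTorch
    torch = None


@contextmanager
def on_gpu(pipe):
    """Move `pipe` to CUDA for the duration of the block, then clean up.

    `pipe` is any object with a `.to(device)` method, for example a
    diffusers pipeline. Hypothetical helper for illustration.
    """
    use_cuda = torch is not None and torch.cuda.is_available()
    if use_cuda:
        pipe.to("cuda")
    try:
        yield pipe
    finally:
        if use_cuda:
            # Release GPU memory as soon as generation is finished.
            pipe.to("cpu")
            torch.cuda.empty_cache()
        gc.collect()
```

Image generation would then run inside `with on_gpu(pipe) as p:`, so the cleanup happens even if generation raises an exception.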

+ ### Error Handling
+ - Comprehensive logging system
+ - Graceful degradation for missing components
+ - Multiple fallback mechanisms
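
Graceful degradation for missing components might be structured as in the following sketch, where an optional model that fails to load is logged and replaced by a neutral default. The logger name, `load_optional`, `analyze_sentiment`, and the neutral result are hypothetical; the only fact relied on is that a Transformers sentiment pipeline returns a list of `{"label", "score"}` dictionaries.

```python
import logging

logger = logging.getLogger("storyteller")  # hypothetical logger name


def load_optional(name, loader):
    """Call `loader()`; log and return None if the component is unavailable."""
    try:
        component = loader()
        logger.info("Loaded %s", name)
        return component
    except Exception as exc:  # broad on purpose: any load failure degrades
        logger.warning("Could not load %s (%s); continuing without it", name, exc)
        return None


def analyze_sentiment(text, analyzer):
    """Use the analyzer if present, otherwise return a neutral result."""
    if analyzer is None:
        # Assumption: a neutral placeholder keeps the rest of the app working.
        return {"label": "NEUTRAL", "score": 0.0}
    return analyzer(text)[0]
```

With this shape, a call such as `load_optional("arabic-sentiment", lambda: pipeline("sentiment-analysis", model="CAMeL-Lab/bert-base-arabic-sentiment"))` would yield `None` instead of crashing the app when the model cannot be downloaded.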

  ## Technical Requirements
  - Python 3.8+
+ - CUDA-capable GPU (for image generation)
+ - Key dependencies: transformers, diffusers, gradio, torch