VideoX Fun
bubbliiiing committed 5f2e9bf (verified) · 1 parent: ff5b6b1

Update README.md

Files changed (1): README.md (+115 -114)

README.md:
---
library_name: videox_fun
license: other
license_name: flux-dev-non-commercial-license
license_link: https://huggingface.co/black-forest-labs/FLUX.2-dev/blob/main/LICENSE.txt
---
# Flux.2-dev-Fun-Controlnet-Union

[![Github](https://img.shields.io/badge/🎬%20Code-Github-blue)](https://github.com/aigc-apps/VideoX-Fun)

# Model features
- This ControlNet is attached to 4 double blocks.
- The model was trained from scratch for 10,000 steps on a dataset of 1 million high-quality images covering both general and human-centric content. Training was performed at 1328 resolution using BFloat16 precision, with a batch size of 64, a learning rate of 2e-5, and a text dropout ratio of 0.10.
- It supports multiple control conditions, including Canny, HED, depth maps, pose estimation, and MLSD, and can be used like a standard ControlNet.
- Inpainting mode is also supported.
- You can adjust `controlnet_conditioning_scale` for stronger control and better detail preservation; the optimal range is 0.65 to 0.80. For better stability, we highly recommend using a detailed prompt. A usage sketch is shown after this list.
- Although Flux.2-dev supports certain image-editing capabilities, its generation speed slows down when handling multiple images, and it sometimes suffers from similarity issues or fails to follow the control images. Compared with edit-based methods, the ControlNet approach adheres more reliably to the control inputs and makes it easier to combine multiple types of control.

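Below is a minimal, hypothetical sketch of where `controlnet_conditioning_scale` fits in a generation call. The pipeline class, import path, and several argument names are placeholders for illustration only; the actual entry point is `examples/flux2_fun/predict_t2i_control.py` in the VideoX-Fun repository.

```python
# Hypothetical sketch: class and module names are placeholders, not the
# verified VideoX-Fun API; see examples/flux2_fun/predict_t2i_control.py
# for the real entry point and argument names.
import torch
from PIL import Image

from videox_fun.pipeline import Flux2FunControlPipeline  # assumed import path

pipe = Flux2FunControlPipeline.from_pretrained(
    "models/Diffusion_Transformer/FLUX.2-dev",
    controlnet_path="models/Personalized_Model/FLUX.2-dev-Fun-Controlnet-Union.safetensors",
    torch_dtype=torch.bfloat16,  # the ControlNet was trained in BFloat16
).to("cuda")

control_image = Image.open("asset/pose.jpg")  # Canny / HED / depth / pose / MLSD map

image = pipe(
    prompt="a detailed prompt is strongly recommended for stability",
    control_image=control_image,
    controlnet_conditioning_scale=0.70,  # recommended range: 0.65-0.80
    height=1328,
    width=1328,  # training resolution was 1328
).images[0]
image.save("results/pose.png")
```

Values toward 0.80 follow the control map more strictly, while values toward 0.65 give the base model more freedom.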
# TODO
- [ ] Train with more data and for more steps.

# Results

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Reference + Mask (Inpainting)</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/ref.jpg" width="100%" /><img src="asset/mask.jpg" width="100%" /></td>
<td><img src="results/inpaint.png" width="100%" /></td>
</tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Pose + Reference</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/pose.jpg" width="100%" /><img src="asset/ref.jpg" width="100%" /></td>
<td><img src="results/pose_ref.png" width="100%" /></td>
</tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Pose</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/pose.jpg" width="100%" /></td>
<td><img src="results/pose.png" width="100%" /></td>
</tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Pose</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/pose2.jpg" width="100%" /></td>
<td><img src="results/pose2.png" width="100%" /></td>
</tr>
</table>

<table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
<tr>
<td>Canny</td>
<td>Output</td>
</tr>
<tr>
<td><img src="asset/canny.jpg" width="100%" /></td>
<td><img src="results/canny.png" width="100%" /></td>
</tr>
</table>
78
+
79
+ <table border="0" style="width: 100%; text-align: left; margin-top: 20px;">
80
+ <tr>
81
+ <td>Canny</td>
82
+ <td>Output</td>
83
+ </tr>
84
+ <tr>
85
+ <td><img src="asset/depth.jpg" width="100%" /></td>
86
+ <td><img src="results/depth.png" width="100%" /></td>
87
+ </tr>
88
+ </table>
89
+
90
+ # Inference
91
+ Go to VideoX-Fun repository for more details.
92
+
93
+ Please git clone VideoX-Fun and mkdirs.
94
+ ```sh
95
+ # clone code
96
+ git clone https://github.com/aigc-apps/VideoX-Fun.git
97
+
98
+ # enter VideoX-Fun's dir
99
+ cd VideoX-Fun
100
+
101
+ # download weights
102
+ mkdir models/Diffusion_Transformer
103
+ mkdir models/Personalized_Model
104
+ ```
105
+
106
+ Then download weights to models/Diffusion_Transformer and models/Personalized_Model.
107
+
108
+ ```
109
+ 📦 models/
110
+ ├── 📂 Diffusion_Transformer/
111
+ │ └── 📂 FLUX.2-dev/
112
+ ├── 📂 Personalized_Model/
113
+ │ └── "models/Personalized_Model/FLUX.2-dev-Fun-Controlnet-Union.safetensors"
114
+ ```
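One possible way to fetch the weights is with `huggingface_hub` (a sketch; the ControlNet repo id below is a placeholder, substitute the id of this model card):

```python
from huggingface_hub import snapshot_download

# Base FLUX.2-dev weights (the FLUX.2-dev license must be accepted on the Hub first).
snapshot_download(
    repo_id="black-forest-labs/FLUX.2-dev",
    local_dir="models/Diffusion_Transformer/FLUX.2-dev",
)

# ControlNet-Union checkpoint; "<controlnet-repo-id>" is a placeholder for the
# repository hosting FLUX.2-dev-Fun-Controlnet-Union.safetensors (this model card).
snapshot_download(
    repo_id="<controlnet-repo-id>",
    local_dir="models/Personalized_Model",
    allow_patterns=["*.safetensors"],
)
```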
115
+
116
  Then run the file `examples/flux2_fun/predict_t2i_control.py`.
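Assuming the repository's Python requirements are installed, it is launched from the VideoX-Fun root with `python examples/flux2_fun/predict_t2i_control.py`; the prompt, control image path, and `controlnet_conditioning_scale` are typically configured inside the script itself, so check the file for the exact variable names.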