Lamapi committed
Commit ab2ac00 · verified · 1 Parent(s): 0dc6590

Update README.md

Files changed (1)
  1. README.md +318 -3
README.md CHANGED
@@ -1,3 +1,318 @@
1
- ---
2
- license: apache-2.0
3
- ---
1
+ ---
2
+ tags:
3
+ - text-generation-inference
4
+ - transformers
5
+ - unsloth
6
+ - qwen3_vl
7
+ - trl
8
+ - sft
9
+ - chemistry
10
+ - code
11
+ - climate
12
+ - art
13
+ - biology
14
+ - finance
15
+ - legal
16
+ - music
17
+ - medical
18
+ - agent
19
+ license: apache-2.0
20
+ language:
21
+ - en
22
+ - ab
23
+ - aa
24
+ - ae
25
+ - af
26
+ - ak
27
+ - am
28
+ - an
29
+ - ar
30
+ - as
31
+ - av
32
+ - ay
33
+ - az
34
+ - ba
35
+ - be
36
+ - bg
37
+ - bh
38
+ - bi
39
+ - bm
40
+ - bn
41
+ - bo
42
+ - br
43
+ - bs
44
+ - ca
45
+ - ce
46
+ - ch
47
+ - co
48
+ - cr
49
+ - cs
50
+ - cu
51
+ - cv
52
+ - cy
53
+ - da
54
+ - de
55
+ - dv
56
+ - dz
57
+ - ee
58
+ - el
59
+ - eo
60
+ - es
61
+ - et
62
+ - eu
63
+ - fa
64
+ - ff
65
+ - fi
66
+ - fj
67
+ - fo
68
+ - fr
69
+ - fy
70
+ - ga
71
+ - gd
72
+ - gl
73
+ - gn
74
+ - gv
75
+ - ha
76
+ - he
77
+ - hi
78
+ - ho
79
+ - gu
80
+ - hr
81
+ - ht
82
+ - hu
83
+ - hz
84
+ - hy
85
+ - id
86
+ - ia
87
+ - ig
88
+ - ie
89
+ - ik
90
+ - ii
91
+ - is
92
+ - io
93
+ - iu
94
+ - it
95
+ - jv
96
+ - ja
97
+ - kg
98
+ - ka
99
+ - kj
100
+ - ki
101
+ - kl
102
+ - kk
103
+ - kn
104
+ - km
105
+ - kr
106
+ - ko
107
+ - ku
108
+ - ks
109
+ - kw
110
+ - kv
111
+ - la
112
+ - ky
113
+ - lg
114
+ - lb
115
+ - ln
116
+ - li
117
+ - lt
118
+ - lo
119
+ - lv
120
+ - lu
121
+ - mg
122
+ - mi
123
+ - mh
124
+ - ml
125
+ - mk
126
+ - mr
127
+ - mn
128
+ - mt
129
+ - ms
130
+ - na
131
+ - my
132
+ - nd
133
+ - nb
134
+ - ng
135
+ - nl
136
+ - ne
137
+ - 'no'
138
+ - nn
139
+ - nv
140
+ - nr
141
+ - oc
142
+ - oj
143
+ - om
144
+ - ny
145
+ - os
146
+ - or
147
+ - pa
148
+ - pi
149
+ - pl
150
+ - ps
151
+ - pt
152
+ - rm
153
+ - rn
154
+ - qu
155
+ - ro
156
+ - ru
157
+ - sn
158
+ - rw
159
+ - so
160
+ - sa
161
+ - sc
162
+ - sd
163
+ pipeline_tag: image-text-to-text
164
+ library_name: transformers
165
+ base_model:
166
+ - thelamapi/next-ocr
167
+ ---
168
+ <img src='bannerocr.png'>
169
+
170
+ # 🖼️ Next OCR 8B
171
+
172
+ ### *Compact OCR AI — Accurate, Fast, Multilingual, Math-Optimized*
173
+
174
+ [![License: MIT](https://img.shields.io/badge/License-MIT-blue.svg)](https://opensource.org/licenses/MIT)
175
+ [![Language: Multilingual](https://img.shields.io/badge/Language-Multilingual-red.svg)]()
176
+ [![HuggingFace](https://img.shields.io/badge/🤗-Lamapi/Next--OCR--orange.svg)](https://huggingface.co/Lamapi/next-ocr)
177
+ [![Discord](https://cdn.modrinth.com/data/cached_images/e84c69448cbf878a167f996d63e1a253437fcea2.png)](https://discord.gg/XgH4EpyPD2)
178
+
179
+ ---
180
+
181
+ ## 📖 Overview
182
+
183
+ **Next OCR 8B** is an **8-billion-parameter model** optimized for **optical character recognition (OCR)**, including **mathematical and tabular content**.
184
+
185
+ It supports **multilingual OCR** (Turkish, English, German, Spanish, French, Chinese, Japanese, Korean, Russian, and others) and handles structured documents such as tables, forms, and formulas.
186
+
187
+ ---
188
+
189
+ ## ⚡ Highlights
190
+
191
+ * 🖼️ Accurate text extraction, including math and tables
192
+ * 🌍 Multilingual support (30+ languages)
193
+ * ⚡ Lightweight and efficient
194
+ * 💬 Instruction-tuned for document understanding and analysis
195
+
196
+ ---
197
+
198
+ ## 📊 Benchmark & Comparison
199
+
200
+ ![image](https://cdn-uploads.huggingface.co/production/uploads/67d46bc5fe6ad6f6511d6f44/wLtEbJ9U3KCJe4OCxvAF7.png)
201
+
202
+ ---
203
+
204
+ | Model | OCR-Bench Accuracy (%) | Multilingual Accuracy (%) | Layout / Table Understanding (%) |
205
+ | ------------------------------- | ------------------------ | ------------------------- | -------------------------------- |
206
+ | **Next OCR** | **99.0** | **96.8** | **95.3** |
207
+ | PaddleOCR | 95.2 | 93.9 | 95.3 |
208
+ | Deepseek OCR | 90.6 | 87.4 | 86.1 |
209
+ | Tesseract | 92.0 | 88.4 | 72.0 |
210
+ | EasyOCR | 90.4 | 84.7 | 78.9 |
211
+ | Google Cloud Vision / DocAI | 98.7 | 95.5 | 93.6 |
212
+ | Amazon Textract | 94.7 | 86.2 | 86.1 |
213
+ | Azure Document Intelligence | 95.1 | 93.6 | 91.4 |
214
+
215
+ ---
216
+
217
+ | Model | Handwriting (%) | Scene Text (%) | Complex Tables (%) |
218
+ | --------------------------- | --------------- | -------------- | ------------------ |
219
+ | **Next OCR** | 92 | 96 | 91 |
220
+ | PaddleOCR | 88 | 92 | 90 |
221
+ | Deepseek OCR | 80 | 85 | 83 |
222
+ | Tesseract | 75 | 88 | 70 |
223
+ | EasyOCR | 78 | 86 | 75 |
224
+ | Google Cloud Vision / DocAI | 90 | 95 | 92 |
225
+ | Amazon Textract | 85 | 90 | 88 |
226
+ | Azure Document Intelligence | 87 | 91 | 89 |
227
+
228
+ ---
229
+
230
+ ## 🚀 Installation & Usage
231
+
232
+ ```python
233
+ import torch
234
+ from PIL import Image
235
+ from transformers import AutoModelForVision2Seq, AutoProcessor
236
+
237
+ model_id = "Lamapi/next-ocr"
238
+ processor = AutoProcessor.from_pretrained(model_id)  # tokenizer + image preprocessing
239
+ model = AutoModelForVision2Seq.from_pretrained(model_id, torch_dtype=torch.float16)
240
+
241
+ img = Image.open("image.jpg")
242
+
243
+ # ATTENTION: The content list must include both an image and text.
244
+ messages = [
245
+     {"role": "system", "content": "You are Next-OCR, a helpful AI assistant trained by Lamapi."},
246
+     {
247
+         "role": "user",
248
+         "content": [
249
+             {"type": "image", "image": img},
250
+             {"type": "text", "text": "Read the text in this image and summarize it."}
251
+         ]
252
+     }
253
+ ]
254
+
255
+ # Render the chat template to a prompt string, then encode it together with the image
256
+ prompt = processor.apply_chat_template(messages, tokenize=False, add_generation_prompt=True)
257
+ inputs = processor(text=prompt, images=[img], return_tensors="pt").to(model.device)
258
+
259
+ with torch.no_grad():
260
+     generated = model.generate(**inputs, max_new_tokens=256)
261
+
262
+ print(processor.decode(generated[0], skip_special_tokens=True))
263
+ ```
264
+
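When the same prompt layout is reused across many images, the message structure from the usage example can be wrapped in a small helper. The sketch below is only an illustration: `build_ocr_messages`, its parameters, and `DEFAULT_SYSTEM_PROMPT` are hypothetical names and not part of the model's or the `transformers` API. It reproduces the system/user layout from the example and nothing more.

```python
from typing import Any

# Default system prompt, following the usage example in this README.
DEFAULT_SYSTEM_PROMPT = "You are Next-OCR, a helpful AI assistant trained by Lamapi."


def build_ocr_messages(
    image: Any,
    instruction: str,
    system_prompt: str = DEFAULT_SYSTEM_PROMPT,
) -> list[dict[str, Any]]:
    """Return a chat-style message list with one image and one text instruction.

    Hypothetical helper: `image` is whatever object the processor accepts
    (for example a PIL image) and is passed through untouched.
    """
    if not instruction.strip():
        # The usage example requires both an image and text in the content list.
        raise ValueError("instruction must be a non-empty string")
    return [
        {"role": "system", "content": system_prompt},
        {
            "role": "user",
            "content": [
                {"type": "image", "image": image},
                {"type": "text", "text": instruction},
            ],
        },
    ]
```

The returned list can be passed to `processor.apply_chat_template(...)` in place of the hand-written `messages` list.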
265
+ ---
266
+
267
+ ## 🧩 Key Features
268
+
269
+ | Feature | Description |
270
+ | -------------------------- | --------------------------------------------------------------- |
271
+ | 🖼️ High-Accuracy OCR | Extracts text from images, documents, and screenshots reliably. |
272
+ | 🇹🇷 Multilingual Support | Works with 30+ languages including Turkish. |
273
+ | ⚡ Lightweight & Efficient | Optimized for resource-constrained environments. |
274
+ | 📄 Layout & Math Awareness | Handles tables, forms, and mathematical formulas. |
275
+ | 🏢 Reliable Outputs | Suitable for enterprise document workflows. |
276
+
277
+ ---
278
+
279
+ ## 📐 Model Specifications
280
+
281
+ | Specification | Details |
282
+ | ----------------- | --------------------------------------------------------- |
283
+ | **Base Model** | Qwen 3 |
284
+ | **Parameters** | 8 Billion |
285
+ | **Architecture** | Vision + Transformer (OCR LLM) |
286
+ | **Modalities** | Image-to-text |
287
+ | **Fine-Tuning** | OCR datasets with multilingual and math/tabular content |
288
+ | **Optimizations** | Quantization-ready, FP16 support |
289
+ | **Primary Focus** | Text extraction, document understanding, mathematical OCR |
290
+
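As a rough guide to the FP16 and quantization entries in the table, the memory needed for the weights of an 8-billion-parameter model can be estimated from the number of bits per parameter. This is back-of-the-envelope arithmetic, not a measurement of this model: it counts weights only (no activations, KV cache, or framework overhead), assumes 1 GB = 10^9 bytes, and `weight_memory_gb` is a hypothetical helper name.

```python
def weight_memory_gb(num_params: float, bits_per_param: int) -> float:
    """Approximate memory for model weights alone, in GB (1 GB = 1e9 bytes)."""
    return num_params * bits_per_param / 8 / 1e9


NUM_PARAMS = 8e9  # "8 Billion" from the specification table

# Assumed precisions for illustration; the README names only FP16 explicitly.
for label, bits in [("FP16", 16), ("8-bit", 8), ("4-bit", 4)]:
    print(f"{label}: ~{weight_memory_gb(NUM_PARAMS, bits):.0f} GB")
```

Under these assumptions, the FP16 weights come to roughly 16 GB, and each halving of the bit width halves that figure.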
291
+ ---
292
+
293
+ ## 🎯 Ideal Use Cases
294
+
295
+ * Document digitization
296
+ * Invoice & receipt processing
297
+ * Multilingual OCR pipelines
298
+ * Table, form, and formula extraction
299
+ * Enterprise document management
300
+
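For table-extraction pipelines, the model's text output usually needs to be turned into structured rows. The sketch below assumes the model was prompted to return tables as pipe-delimited Markdown, which this README does not confirm. `parse_markdown_table` is a hypothetical helper, and the sample string is made-up input, not actual model output.

```python
def parse_markdown_table(text: str) -> list[dict[str, str]]:
    """Parse the first pipe-delimited Markdown table in `text` into row dicts."""
    rows: list[list[str]] = []
    for line in text.splitlines():
        line = line.strip()
        if not (line.startswith("|") and line.endswith("|")):
            if rows:
                break  # the first table has ended
            continue
        cells = [cell.strip() for cell in line[1:-1].split("|")]
        # Skip the header separator row, e.g. | --- | :---: |
        if all(cell and set(cell) <= set("-: ") for cell in cells):
            continue
        rows.append(cells)
    if not rows:
        return []
    header, *body = rows
    return [dict(zip(header, row)) for row in body]


# Made-up sample text standing in for a model response.
sample = """Here is the table:
| Item | Price |
| ---- | ----- |
| Tea  | 2.50  |
| Cake | 4.00  |
"""
print(parse_markdown_table(sample))
```

Cell values are kept as strings, so any numeric conversion or validation is left to the downstream workflow.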
301
+ ---
302
+
303
+ ## 📄 License
304
+
305
+ MIT License — free for commercial & non-commercial use.
306
+
307
+ ---
308
+
309
+ ## 📞 Contact & Support
310
+
311
+ * 📧 Email: [lamapicontact@gmail.com](mailto:lamapicontact@gmail.com)
312
+ * 🤗 HuggingFace: [Lamapi](https://huggingface.co/Lamapi)
313
+
314
+ ---
315
+
316
+ > **Next OCR** — Compact *OCR + math-capable* AI, blending **accuracy**, **speed**, and **multilingual document intelligence**.
317
+
318
+ [![Follow on HuggingFace](https://img.shields.io/badge/Follow-HuggingFace-yellow?logo=huggingface)](https://huggingface.co/Lamapi)