sbintuitions/sarashina2.2-ocr (SB Intuitions)
Tags: Image-to-Text, Transformers, Safetensors, Japanese, English, sarashina2_vision, text-generation, multimodal, ocr, document-understanding, vision-language, custom_code
arXiv: 2503.09208
License: mit
Branch: main
Repository: sarashina2.2-ocr (7.81 GB)
2 contributors
History: 6 commits
Latest commit: "Update README.md" by tkmtakada-sbint (eafb8d4, verified), 5 days ago
Name                                Scan  Size       Storage  Last commit       Updated
assets                              -     -          -        Initial commit    11 days ago
.gitattributes                      Safe  2.46 kB    -        Initial commit    11 days ago
LICENSE                             Safe  1.07 kB    -        Initial commit    11 days ago
README.md                           Safe  10.8 kB    -        Update README.md  5 days ago
chat_template.jinja                 Safe  1.72 kB    -        Initial commit    11 days ago
config.json                         Safe  1.84 kB    -        Initial commit    11 days ago
configuration_sarashina2_vision.py  Safe  3.5 kB     -        Initial commit    11 days ago
generation_config.json              Safe  154 Bytes  -        Initial commit    11 days ago
model.safetensors                   -     7.8 GB     xet      Initial commit    11 days ago
modeling_sarashina2_vision.py       Safe  36.8 kB    -        Initial commit    11 days ago
preprocessor_config.json            Safe  646 Bytes  -        Initial commit    11 days ago
processing_sarashina2_vision.py     Safe  32.2 kB    -        Initial commit    11 days ago
processor_config.json               Safe  152 Bytes  -        Initial commit    11 days ago
special_tokens_map.json             Safe  968 Bytes  -        Initial commit    11 days ago
tokenizer.json                      Safe  6.72 MB    -        Initial commit    11 days ago
tokenizer.model                     Safe  1.83 MB    xet      Initial commit    11 days ago
tokenizer_config.json               Safe  3.93 kB    -        Initial commit    11 days ago
video_preprocessor_config.json      Safe  1.11 kB    -        Initial commit    11 days ago