---
license: apache-2.0
tags:
- trocr
- math-ocr
- handwritten-math
- latex
- computer-vision
- image-to-text
datasets:
- custom-math-dataset
---

# TrOCR Fine-tuned for Mathematical Expressions

This model is a fine-tuned version of [fhswf/TrOCR_Math_handwritten](https://huggingface.co/fhswf/TrOCR_Math_handwritten) that recognizes handwritten mathematical expressions and converts them to LaTeX.

## Model Description

- **Architecture**: VisionEncoderDecoder (ViT image encoder + Transformer text decoder)
- **Base Model**: fhswf/TrOCR_Math_handwritten
- **Training Data**: Custom mathematical expressions dataset
- **Purpose**: Convert images of mathematical equations to LaTeX code

## Intended Uses

- Digitizing handwritten math equations
- Educational applications
- Scientific document processing
- Math notation recognition

## Usage

```python
from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image

# Load the processor (image preprocessing + tokenizer) and the fine-tuned model
processor = TrOCRProcessor.from_pretrained("Ntsako12/TrOCR_Tuned")
model = VisionEncoderDecoderModel.from_pretrained("Ntsako12/TrOCR_Tuned")

# Convert to RGB so the processor always receives a 3-channel image
image = Image.open("math_equation.jpg").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values

# Generate token IDs and decode them into a LaTeX string
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(generated_text)  # Example output: \frac{1}{2} + \frac{3}{4}
```

## Training

- **Epochs**: 10
- **Batch Size**: 16
- **Learning Rate**: 5e-5
- **Framework**: PyTorch with Hugging Face Transformers
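
This card does not say how these settings were passed to the trainer. As a minimal sketch, assuming the standard `Seq2SeqTrainer` workflow from Transformers and assuming the batch size is per device, the values could map onto `Seq2SeqTrainingArguments` roughly as shown here. `HYPERPARAMETERS`, `build_training_args`, `optimizer_steps`, and the output directory are hypothetical names chosen for illustration, not part of the original training script.

```python
import math

# Values taken from this model card; the keys are Seq2SeqTrainingArguments
# parameter names. Treating the batch size as per-device is an assumption.
HYPERPARAMETERS = {
    "num_train_epochs": 10,
    "per_device_train_batch_size": 16,
    "learning_rate": 5e-5,
}


def build_training_args(output_dir: str = "./trocr-math-finetuned"):
    """Build training arguments (hypothetical helper, not the author's script)."""
    # Imported lazily so the rest of this snippet runs without transformers.
    from transformers import Seq2SeqTrainingArguments

    return Seq2SeqTrainingArguments(
        output_dir=output_dir,
        predict_with_generate=True,  # decode with generate() during evaluation
        **HYPERPARAMETERS,
    )


def optimizer_steps(num_examples: int) -> int:
    """Total optimizer steps on one device, with no gradient accumulation."""
    steps_per_epoch = math.ceil(
        num_examples / HYPERPARAMETERS["per_device_train_batch_size"]
    )
    return steps_per_epoch * HYPERPARAMETERS["num_train_epochs"]
```

For a hypothetical dataset of 1,000 images, `optimizer_steps(1000)` gives 630 steps (63 per epoch for 10 epochs). The real dataset size is not stated in this card.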

## Limitations

- Accuracy may vary across handwriting styles
- Complex nested expressions may be harder to recognize correctly
- Requires clear, legibly written mathematical expressions
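
Since nested expressions are where a prediction is most likely to go wrong, a cheap sanity check on the decoded string can flag suspect outputs before they are rendered. The following is a minimal sketch using only the standard library; `has_balanced_braces` is a hypothetical helper, not part of this model or of Transformers, and it checks brace balance only, not full LaTeX validity.

```python
def has_balanced_braces(latex: str) -> bool:
    """Return True if every unescaped '{' has a matching '}' in order."""
    depth = 0
    escaped = False  # True when the previous character was a backslash
    for ch in latex:
        if escaped:
            # Skip the character after a backslash, so \{ and \} are ignored
            escaped = False
            continue
        if ch == "\\":
            escaped = True
        elif ch == "{":
            depth += 1
        elif ch == "}":
            depth -= 1
            if depth < 0:
                # A closing brace appeared before its opening brace
                return False
    return depth == 0
```

A prediction such as `\frac{1}{2` would fail this check, which may be a reasonable cue to retry with a cleaner image or to review the result by hand.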