TrOCR Fine-tuned for Mathematical Expressions

This model is a fine-tuned version of fhswf/TrOCR_Math_handwritten for recognizing handwritten mathematical expressions and converting them to LaTeX format.

Model Description

  • Architecture: VisionEncoderDecoder (ViT image encoder + Transformer text decoder)
  • Base Model: fhswf/TrOCR_Math_handwritten
  • Model Size: 0.6B parameters (F32, safetensors)
  • Training Data: Custom mathematical expressions dataset
  • Purpose: Convert images of handwritten mathematical equations to LaTeX code
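
The encoder/decoder split can be checked directly from the checkpoint. This is an optional inspection sketch; the exact model_type strings it prints depend on the underlying base model and are not documented on this card:

from transformers import VisionEncoderDecoderModel

model = VisionEncoderDecoderModel.from_pretrained("Ntsako12/TrOCR_Tuned")

print(model.config.encoder.model_type)  # image encoder (ViT family)
print(model.config.decoder.model_type)  # autoregressive text decoder
print(f"{sum(p.numel() for p in model.parameters()):,} parameters in total")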

Intended Uses

  • Digitizing handwritten math equations
  • Educational applications
  • Scientific document processing
  • Math notation recognition

Usage

from transformers import TrOCRProcessor, VisionEncoderDecoderModel
from PIL import Image

processor = TrOCRProcessor.from_pretrained("Ntsako12/TrOCR_Tuned")
model = VisionEncoderDecoderModel.from_pretrained("Ntsako12/TrOCR_Tuned")

image = Image.open("math_equation.jpg").convert("RGB")
pixel_values = processor(image, return_tensors="pt").pixel_values
generated_ids = model.generate(pixel_values)
generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]

print(generated_text)  # Example output: \frac{1}{2} + \frac{3}{4}
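
For longer or more deeply nested expressions, tuning the generation settings can help. The values below are purely illustrative (they are not documented settings for this model) and continue from the snippet above:

generated_ids = model.generate(
    pixel_values,
    max_length=128,      # cap on generated token length (illustrative)
    num_beams=4,         # beam search instead of greedy decoding (illustrative)
    early_stopping=True,
)
print(processor.batch_decode(generated_ids, skip_special_tokens=True)[0])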

Training

  • Epochs: 10
  • Batch Size: 16
  • Learning Rate: 5e-5
  • Framework: PyTorch with Hugging Face Transformers
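
For reference, here is a minimal fine-tuning sketch consistent with the settings above. Only the epoch count, batch size, and learning rate come from this card; the base checkpoint, token-id setup, and dataset handling are assumptions and would need to be adapted to your own data:

from transformers import (
    TrOCRProcessor,
    VisionEncoderDecoderModel,
    Seq2SeqTrainer,
    Seq2SeqTrainingArguments,
)

processor = TrOCRProcessor.from_pretrained("fhswf/TrOCR_Math_handwritten")
model = VisionEncoderDecoderModel.from_pretrained("fhswf/TrOCR_Math_handwritten")

# Token ids the VisionEncoderDecoder generation loop expects
# (assumption: the tokenizer exposes cls/pad tokens, as in the standard TrOCR setup)
model.config.decoder_start_token_id = processor.tokenizer.cls_token_id
model.config.pad_token_id = processor.tokenizer.pad_token_id

training_args = Seq2SeqTrainingArguments(
    output_dir="trocr-math-finetuned",
    num_train_epochs=10,             # from this card
    per_device_train_batch_size=16,  # from this card
    learning_rate=5e-5,              # from this card
    predict_with_generate=True,
)

trainer = Seq2SeqTrainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,  # placeholder: items with "pixel_values" and "labels"
)
trainer.train()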

Limitations

  • Performance may vary with different handwriting styles
  • Complex nested expressions might be challenging
  • Requires clear, well-written mathematical expressions
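
Light image cleanup before inference can help with faint or low-contrast handwriting. The steps below are a general suggestion rather than a documented part of this model's pipeline; they reuse the processor from the Usage section:

from PIL import Image, ImageOps

image = Image.open("math_equation.jpg").convert("L")  # grayscale
image = ImageOps.autocontrast(image)                  # stretch contrast across the full range
image = image.convert("RGB")                          # the TrOCR processor expects RGB input

pixel_values = processor(image, return_tensors="pt").pixel_values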