	---
	license: apache-2.0
	tags:
	- trocr
	- math-ocr
	- handwritten-math
	- latex
	- computer-vision
	- image-to-text
	datasets:
	- custom-math-dataset
	---

	# TrOCR Fine-tuned for Mathematical Expressions

	This model is a fine-tuned version of [fhswf/TrOCR_Math_handwritten](https://huggingface.co/fhswf/TrOCR_Math_handwritten) that recognizes handwritten mathematical expressions and converts them to LaTeX.

	## Model Description

	- Architecture: VisionEncoderDecoder (ViT image encoder + Transformer text decoder)
	- Base Model: fhswf/TrOCR_Math_handwritten
	- Training Data: Custom mathematical expressions dataset
	- Purpose: Convert images of mathematical equations to LaTeX code

	## Intended Uses

	- Digitizing handwritten math equations
	- Educational applications
	- Scientific document processing
	- Math notation recognition

	## Usage

	```python
	from transformers import TrOCRProcessor, VisionEncoderDecoderModel
	from PIL import Image

	processor = TrOCRProcessor.from_pretrained("Ntsako12/TrOCR_Tuned")
	model = VisionEncoderDecoderModel.from_pretrained("Ntsako12/TrOCR_Tuned")

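	# Load the image and convert it to RGB
	image = Image.open("math_equation.jpg").convert("RGB")
	# Preprocess the image into a batch of pixel tensors
	pixel_values = processor(image, return_tensors="pt").pixel_values
	# Generate the token ids of the predicted LaTeX string
	generated_ids = model.generate(pixel_values)
	# Decode the first (and only) sequence, dropping special tokens
	generated_text = processor.batch_decode(generated_ids, skip_special_tokens=True)[0]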

	print(generated_text)  # Output: \frac{1}{2} + \frac{3}{4}
	```

	## Training

	- Epochs: 10
	- Batch Size: 16
	- Learning Rate: 5e-5
	- Framework: PyTorch with Hugging Face Transformers

	## Limitations

	- Performance may vary with different handwriting styles
	- Complex nested expressions might be challenging
	- Requires clear, well-written mathematical expressions
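
Since nested expressions are listed as a weak spot, it can help to run a cheap sanity check on the generated LaTeX before using it downstream. The sketch below assumes that some recognition errors show up as unmatched braces, which this card does not confirm. `unbalanced_brace_position` is a hypothetical helper written for illustration. It is not part of this model or of the Transformers library.

```python
from typing import Optional


def unbalanced_brace_position(latex: str) -> Optional[int]:
    """Return the index of the first unmatched brace, or None if balanced.

    A backslash escapes the character after it, so escaped braces
    are treated as literal characters and do not count.
    """
    open_positions = []  # stack of indices of currently open '{'
    i = 0
    while i < len(latex):
        ch = latex[i]
        if ch == "\\":
            i += 2  # skip the backslash and the character it escapes
            continue
        if ch == "{":
            open_positions.append(i)
        elif ch == "}":
            if not open_positions:
                return i  # closing brace with nothing to close
            open_positions.pop()
        i += 1
    # Any brace still open is unmatched; report the outermost one.
    return open_positions[0] if open_positions else None
```

Calling `unbalanced_brace_position(generated_text)` on a model output returns `None` when every group is closed. Otherwise it returns a character index that can be used to flag the prediction for manual review. A balanced result does not mean the expression was recognized correctly. It only rules out one class of malformed output.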