Text recognition (optical character recognition) with deep learning methods, ICCV 2019

achyutkneupane/deep-text-recognition-benchmark

EasyOCR Training

This repository contains the code for training an EasyOCR model on a custom dataset.

Installation

Install the required packages using the following command:

pip install -r requirements.txt

Usage

Prepare the dataset

In this example, the dataset comes from KEC marksheets. The training data is in the train_data folder and the validation data is in the val_data folder.

Each folder has a labels.txt file that contains the labels for the images in that folder, one image per line in the form <image name>.jpg,<label>.
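As a quick sanity check before building the LMDB files, the sketch below compares the number of .jpg images in a folder with the number of non-empty lines in its labels.txt. The function name count_images_and_labels is hypothetical, and the .jpg extension and one-label-per-line layout are assumptions taken from the label parsing used later in this README.

```python
from pathlib import Path


def count_images_and_labels(folder):
    """Return (number of .jpg files, number of non-empty label lines) for a dataset folder."""
    folder = Path(folder)
    # Assumption: all images are .jpg files stored directly in the folder.
    n_images = sum(1 for p in folder.iterdir() if p.suffix.lower() == ".jpg")
    # Assumption: labels.txt holds one labelled image per non-empty line.
    with open(folder / "labels.txt", encoding="utf-8") as f:
        n_labels = sum(1 for line in f if line.strip())
    return n_images, n_labels


if __name__ == "__main__":
    for name in ("train_data", "val_data"):
        # Skip folders that are not present so the script can run anywhere.
        if Path(name, "labels.txt").is_file():
            n_images, n_labels = count_images_and_labels(name)
            status = "OK" if n_images == n_labels else "MISMATCH"
            print(f"{name}: {n_images} images, {n_labels} labels -> {status}")
```

A mismatch usually means an image is missing or a label line is duplicated, either of which is easier to fix before the LMDB is created.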

Distribution

There are 5500 (91.93%) images in the training dataset and 483 (8.07%) images in the validation dataset, a ratio of about 11.39:1.
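The split figures can be recomputed from the two image counts. The helper below is a minimal sketch (the name split_stats is hypothetical) that rounds each value to two decimals.

```python
def split_stats(n_train, n_val):
    """Return (train %, validation %, train-to-validation ratio), each rounded to 2 decimals."""
    total = n_train + n_val
    train_pct = round(100 * n_train / total, 2)
    val_pct = round(100 * n_val / total, 2)
    ratio = round(n_train / n_val, 2)
    return train_pct, val_pct, ratio


print(split_stats(5500, 483))  # (91.93, 8.07, 11.39)
```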

Create LMDB dataset

Before creating the dataset, edit line 47 of create_lmdb_dataset.py so that it reads comma-separated labels instead of tab-separated ones:

From:

imagePath, label = datalist[i].strip('\n').split('\t')
imagePath = os.path.join(inputPath, imagePath)

To:

# Each line is "<name>.jpg,<label>": split once on ".jpg," and restore the extension.
imagePath, label = datalist[i].strip('\n').split('.jpg,', 1)
imagePath += '.jpg'
imagePath = os.path.join(inputPath, imagePath)
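To try the parsing logic outside the script, it can be wrapped in a small standalone function. This is a sketch only: parse_label_line is a hypothetical name, and the sample file name and label are made up for illustration. It splits only on the first '.jpg,', so commas inside the label are preserved.

```python
import os


def parse_label_line(line, input_path):
    """Split one labels.txt line of the form '<name>.jpg,<label>' into (image path, label)."""
    # Split only on the first '.jpg,' so a label that itself contains
    # '.jpg,' does not break the two-value unpacking.
    name, label = line.strip('\n').split('.jpg,', 1)
    image_path = os.path.join(input_path, name + '.jpg')
    return image_path, label


# Hypothetical sample line; real lines come from train_data/labels.txt.
print(parse_label_line('0001.jpg,Engineering Mathematics I\n', 'train_data'))
```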

For the training dataset

python create_lmdb_dataset.py train_data train_data/labels.txt train_lmdb

For the validation dataset

python create_lmdb_dataset.py val_data val_data/labels.txt val_lmdb

Obtaining a pre-trained model

This step uses a pretrained model as the starting point for fine-tuning. Pretrained models can be downloaded from here.

In this project, I used TPS-ResNet-BiLSTM-Attn.pth and placed it in the models folder.

Train the model

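Now we can fine-tune the model with the following command. The --FT flag enables fine-tuning from the checkpoint given by --saved_model. With --num_iter 10 and --valInterval 5, the run lasts only ten iterations and validates every five, so raise both values for a real training run: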

python train.py --train_data train_lmdb --valid_data val_lmdb --select_data "/" --batch_ratio 1.0 --Transformation TPS --FeatureExtraction ResNet --SequenceModeling BiLSTM --Prediction Attn --saved_model models/TPS-ResNet-BiLSTM-Attn.pth --batch_size 8 --data_filtering_off --workers 4 --batch_max_length 80 --num_iter 10 --valInterval 5 --FT
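Because the command is long, it can help to assemble it in a script. The sketch below rebuilds the same argument list and derives the four module flags from the checkpoint file name. The function build_train_command and its parameters are hypothetical, and it assumes the checkpoint is named <Transformation>-<FeatureExtraction>-<SequenceModeling>-<Prediction>.pth, as TPS-ResNet-BiLSTM-Attn.pth is.

```python
import shlex
from pathlib import Path


def build_train_command(checkpoint, train_lmdb="train_lmdb", valid_lmdb="val_lmdb",
                        num_iter=10, val_interval=5):
    """Build the train.py argument list, deriving the module flags from the checkpoint name."""
    # Assumption: the file name starts with the four module names joined by '-'.
    trans, feat, seq, pred = Path(checkpoint).stem.split("-")[:4]
    return [
        "python", "train.py",
        "--train_data", train_lmdb,
        "--valid_data", valid_lmdb,
        "--select_data", "/",
        "--batch_ratio", "1.0",
        "--Transformation", trans,
        "--FeatureExtraction", feat,
        "--SequenceModeling", seq,
        "--Prediction", pred,
        "--saved_model", str(checkpoint),
        "--batch_size", "8",
        "--data_filtering_off",
        "--workers", "4",
        "--batch_max_length", "80",
        "--num_iter", str(num_iter),
        "--valInterval", str(val_interval),
        "--FT",
    ]


cmd = build_train_command("models/TPS-ResNet-BiLSTM-Attn.pth")
# Print the command for inspection instead of running it.
print(shlex.join(cmd))
```

To launch training from the repository root, the list can be passed to subprocess.run(cmd, check=True). Keeping the iteration counts as parameters makes it easy to switch from a short smoke test to a full run.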
