New features

1. Generate data for text detection and text recognition at the same time.

2. Generate Chinese, English, or other-language text of varying length.

3. Generate vertical text.

Command

python main.py --strict --direction vertical

The --direction option selects the direction of the generated text; the default value is horizontal.
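As a rough illustration of how the two flags in the command above behave, here is a minimal argparse sketch. It is not the project's parse_args.py: the function name build_parser, the help strings, and the restriction of --direction to exactly two choices are assumptions made for this example.

```python
import argparse


def build_parser():
    # Hypothetical re-creation of two flags from the command above;
    # the project's real parse_args.py may define them differently.
    parser = argparse.ArgumentParser(description="text generation flags (sketch)")
    parser.add_argument(
        "--strict",
        action="store_true",
        help="only use text whose characters are all supported by the font",
    )
    parser.add_argument(
        "--direction",
        choices=["horizontal", "vertical"],  # assumed set of valid values
        default="horizontal",                # default stated in the README
        help="direction of the generated text",
    )
    return parser


# Equivalent of: python main.py --strict --direction vertical
args = build_parser().parse_args(["--strict", "--direction", "vertical"])
print(args.strict, args.direction)
```

Omitting --direction in this sketch leaves args.direction at "horizontal", matching the default described above.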

Generate text detection data (e.g. PSENet)

Corresponding label in gt_img_1.txt (in the img directory): 542,257,665,257,665,270,542,270,Riders of Rohan

Corresponding label in gt_img_2.txt (in the img directory): 645,6,666,6,666,396,645,396,有恐怖的感受，那就大大地增
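The two labels above each consist of eight comma-separated integers (four corner points) followed by the text. A minimal sketch of reading such a line is shown below; the function names parse_label_line and box_direction are made up for this example, and the height-versus-width test for direction is an assumption, not something the generator is known to do.

```python
def parse_label_line(line):
    """Split 'x1,y1,x2,y2,x3,y3,x4,y4,text' into corner points and text."""
    # Split at most 8 times so that commas inside the text are preserved.
    parts = line.rstrip("\n").split(",", 8)
    if len(parts) != 9:
        raise ValueError("expected 8 coordinates followed by text: %r" % line)
    coords = [int(value) for value in parts[:8]]
    points = [(coords[i], coords[i + 1]) for i in range(0, 8, 2)]
    return points, parts[8]


def box_direction(points):
    """Guess the text direction from the box shape (assumed heuristic)."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    width = max(xs) - min(xs)
    height = max(ys) - min(ys)
    return "vertical" if height > width else "horizontal"


points, text = parse_label_line("542,257,665,257,665,270,542,270,Riders of Rohan")
print(points, text, box_direction(points))
```

For the second label above, the box is 21 pixels wide and 390 pixels tall, so the same heuristic reports it as vertical.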

Generate text recognition data (e.g. CRNN)

Todo

1. Support multiple sentences in one or more languages in a single image.

2. Support curved and multi-directional text in a single image.

Original README (Text Renderer)

Generate text images for training deep learning OCR models (e.g. CRNN). Supports both Latin and non-Latin text.

Setup

Ubuntu 16.04
python 3.5+

Install dependencies:

pip3 install -r requirements.txt

Demo

By default, running python3 main.py generates 20 text images and a labels.txt file in output/default/.

Use your own data to generate images

Run python3 main.py --help to see all optional arguments and their meanings, and put your own data in the corresponding folder.
Configure text effects and their fractions in configs/default.yaml (or create a new config file and pass it with the --config_file option). Here are some examples:

Effect name
Origin(Font size 25)
Perspective Transform
Random Crop
Curve
Light border
Dark border
Random char space big
Random char space small
Middle line
Table line
Under line
Emboss
Reverse color
Blur
Text color
Line color

Run main.py file.

Strict mode

For non-Latin languages (e.g. Chinese), it is very common for a font to support only a limited set of characters. In that case you will get bad results.

Selecting fonts that support all characters in --chars_file is tedious. When you run main.py with the --strict option, the renderer keeps retrying to get text from the corpus during generation until all of its characters are supported by a font.
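The retry behaviour described for --strict can be sketched roughly as follows. This is an illustration under assumptions, not the renderer's actual code: the function name sample_supported_text, the max_retries cap, and sampling a fixed-length window from a single corpus string are all invented for the example.

```python
import random


def sample_supported_text(corpus, supported_chars, length, max_retries=100, rng=random):
    """Draw random snippets until one uses only characters the font supports.

    Returns None if no suitable snippet is found within max_retries
    (the retry cap is an assumption made for this sketch).
    """
    if length <= 0 or length > len(corpus):
        return None
    for _ in range(max_retries):
        start = rng.randrange(0, len(corpus) - length + 1)
        text = corpus[start:start + length]
        # Accept the snippet only if every character is supported.
        if all(ch in supported_chars for ch in text):
            return text
    return None


# 'X' stands in for a character that the chosen font cannot render.
snippet = sample_supported_text("abcXdefg", set("abcdefg"), 3, rng=random.Random(0))
print(snippet)
```

A cap on retries avoids looping forever when the font supports too few of the corpus characters.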

Tools

You can use the check_font.py script to check how many characters in --chars_file your font does not support:

python3 tools/check_font.py

checking font ./data/fonts/eng/Hack-Regular.ttf
chars not supported(4971):
['第', '朱', '广', '沪', '联', '自', '治', '县', '驼', '身', '进', '行', '纳', '税', '防', '火', '墙', '掏', '心', '内', '容', '万', '警','钟', '上', '了', '解'...]
0 fonts support all chars(5071) in ./data/chars/chn.txt:
[]
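The kind of check reported above can be approximated with the sketch below. It is not the code of tools/check_font.py: the function names are invented, and reading the font's character map with the fontTools package is an assumption (the package must be installed separately, and the script may use a different method).

```python
def unsupported_chars(chars, supported_codepoints):
    """Return the characters whose code points the font does not cover."""
    return [ch for ch in chars if ord(ch) not in supported_codepoints]


def load_font_codepoints(font_path):
    """Read the Unicode code points a font maps to glyphs.

    Assumption: uses fontTools, imported lazily so that the pure
    function above works without it.
    """
    from fontTools.ttLib import TTFont

    font = TTFont(font_path)
    cmap = font.getBestCmap() or {}  # None if the font has no Unicode cmap
    return set(cmap)


# Stand-in for a Latin-only font: it covers 'a' and 'b' but not '第'.
missing = unsupported_chars("a第b", {ord("a"), ord("b")})
print("chars not supported(%d):" % len(missing), missing)
```

With a real font, pass load_font_codepoints("./data/fonts/eng/Hack-Regular.ttf") as the second argument and the contents of the chars file as the first.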

Generate image using GPU

If you want to use the GPU to generate images faster, first compile OpenCV with CUDA support.

Then build the Cython part and add the --gpu option when running main.py:

cd libs/gpu
python3 setup.py build_ext --inplace

Debug mode

Running python3 main.py --debug saves images with extra information. You can see how perspectiveTransform works and all bounding/rotated boxes.
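To make the debug output easier to interpret, here is a small pure-Python sketch of the arithmetic behind a perspective transform: each point is multiplied by a 3x3 matrix in homogeneous coordinates and then divided by the third component. The function names and the example matrix are made up for this illustration; the renderer itself may rely on OpenCV for this step.

```python
def apply_homography(matrix, points):
    """Map (x, y) points through a 3x3 perspective matrix."""
    transformed = []
    for x, y in points:
        xh = matrix[0][0] * x + matrix[0][1] * y + matrix[0][2]
        yh = matrix[1][0] * x + matrix[1][1] * y + matrix[1][2]
        w = matrix[2][0] * x + matrix[2][1] * y + matrix[2][2]
        # Dividing by w is what distinguishes a perspective transform
        # from a plain affine one.
        transformed.append((xh / w, yh / w))
    return transformed


def bounding_box(points):
    """Axis-aligned box (x_min, y_min, x_max, y_max) around the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return min(xs), min(ys), max(xs), max(ys)


# Corners of a 100x20 text box and an example matrix with a mild tilt.
corners = [(0, 0), (100, 0), (100, 20), (0, 20)]
tilt = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.005, 0.0, 1.0]]
warped = apply_homography(tilt, corners)
print(warped)
print(bounding_box(warped))
```

The bounding box of the warped corners corresponds to the axis-aligned boxes drawn in the debug images, while the warped corners themselves trace the transformed quadrilateral.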

Todo

See https://github.com/Sanster/text_renderer/projects/1

