Optical Text Recognition

VNOnDB

The ICFHR 2018 Competition on Vietnamese Online Handwritten Text Recognition using the HANDS-VNOnDB database (VNOnDB for short) is the first attempt to bring together researchers working on handwritten text recognition and give them a common benchmark for comparing their approaches to transcribing Vietnamese online handwritten text. The competition aims to encourage research on Vietnamese online handwritten text recognition and to analyze the participants' different approaches.

This competition (VNOnDB2018) is organized within the framework of the ICFHR 2018 competitions by the Nakagawa Laboratory, Department of Computer and Information Sciences, Tokyo University of Agriculture and Technology.

To share ideas and systems with other researchers, we encourage all participants to present their approaches in a conference paper at ICFHR 2018 and to publish their source code after the competition results have been announced.

Task 1: Word level (VNOnDB-Word)

In task 1, segmented handwritten words and their ground truth are provided. We verified the data and removed words containing long-distance delayed strokes, that is, strokes added only after other words, or even a whole sentence, had been written. Task 1 therefore contains only short-distance delayed strokes and evaluates how well recognizers handle them.
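
To make the notion of a delayed stroke concrete, here is a minimal sketch. It assumes each stroke is a list of `(x, y)` pen points in writing order and that text runs left to right; the function name `find_delayed_strokes` and the `min_backtrack` parameter are hypothetical, and the "entirely to the left of earlier ink" rule is only a common heuristic, not the criterion the organizers used. It does not distinguish short-distance from long-distance delays.

```python
def find_delayed_strokes(strokes, min_backtrack=0.0):
    """Return indices of strokes lying entirely left of ink written earlier.

    strokes: list of strokes in writing order; each stroke is a list of (x, y).
    min_backtrack: how far (in x units) the pen must jump back to count.
    This is an illustrative heuristic, not the competition's definition.
    """
    delayed = []
    rightmost = float("-inf")  # rightmost x reached by all earlier strokes
    for idx, stroke in enumerate(strokes):
        stroke_right = max(x for x, _ in stroke)
        if stroke_right < rightmost - min_backtrack:
            delayed.append(idx)
        rightmost = max(rightmost, stroke_right)
    return delayed


# Two strokes written left to right, then a small mark added back over the first.
example = [
    [(0, 0), (10, 0)],
    [(12, 0), (20, 0)],
    [(4, 5), (6, 6)],
]
delayed_indices = find_delayed_strokes(example)
```

In Vietnamese handwriting such backward jumps typically come from diacritics written after the base letters, which is why a recognizer has to cope with them.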

Task 2: Text line level (VNOnDB-Line)

In task 2, text lines and their ground truth are provided. The lines contain both long-distance and short-distance delayed strokes, which makes this task appropriate for evaluating the robustness of systems to different kinds of delayed strokes.

Task 3: Paragraph level (VNOnDB-Paragraph)

In task 3, the input is handwritten text that usually spans multiple text lines, and the ground truth is given at paragraph level as one long sequence of characters. Task 3 is suited to measuring the limits of recognition systems on long sequences with many delayed strokes.

Leaderboard
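
The leaderboards report character error rate (CER) and word error rate (WER). As a minimal sketch of how such scores are commonly computed: both are the Levenshtein edit distance between reference and hypothesis, divided by the reference length, over characters for CER and over whitespace-separated tokens for WER. Expressing the result as a percentage and applying NFC normalization (so that composed and decomposed Vietnamese diacritics compare equal) are assumptions made here, not details confirmed by the competition.

```python
import unicodedata


def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (strings or lists)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # delete r
                            curr[j - 1] + 1,      # insert h
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def _error_rate(ref_units, hyp_units):
    if not ref_units:
        raise ValueError("reference must not be empty")
    return 100.0 * edit_distance(ref_units, hyp_units) / len(ref_units)


def cer(ref, hyp):
    """Character error rate in percent (assumed convention)."""
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    return _error_rate(ref, hyp)


def wer(ref, hyp):
    """Word error rate in percent over whitespace-separated tokens."""
    ref = unicodedata.normalize("NFC", ref)
    hyp = unicodedata.normalize("NFC", hyp)
    return _error_rate(ref.split(), hyp.split())


# One missing diacritic: 1 of 8 characters and 1 of 2 words are wrong.
example_cer = cer("xin chào", "xin chao")
example_wer = wer("xin chào", "xin chao")
```

Because a single wrong diacritic invalidates the whole word, WER is usually much higher than CER on Vietnamese text, which matches the pattern in the tables.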

Task 1: Word level (VNOnDB-Word)

| System | Method | Public CER | Public WER | Secret CER | Secret WER | Paper/Source Code |
|---|---|---|---|---|---|---|
| MyScriptTask1 | Segmentation+Feedforward Neural Network (FNN) & BLSTM+CTC; Syllable-based unigram VTB + others | 2.91 | 6.46 | 6.01 | 12.66 | |
| IVTOVTask1 | 2 BLSTM layers + CTC/Dictionary/VTB | 2.92 | 6.47 | 7.31 | 15.38 | |
| GoogleTask1 | Multi LSTM layers + CTC/Character & word n-gram | 6.09 | 13.18 | 9.81 | 20.45 | |

Task 2: Text line level (VNOnDB-Line)

| System | Method | Public CER | Public WER | Secret CER | Secret WER | Paper/Source Code |
|---|---|---|---|---|---|---|
| MyScriptTask2_1 | Segmentation+FNN & BLSTM+CTC; Syllable-based trigram/VTB | 1.02 | 2.02 | 1.02 | 3.39 | |
| MyScriptTask2_2 | Segmentation+FNN & BLSTM+CTC; Syllable-based trigram/VTB + others | 1.57 | 4.02 | 1.71 | 5.16 | |
| IVTOVTask2 | 2 BLSTM layers + CTC/Dictionary/VTB | 3.24 | 14.11 | 5.65 | 21.07 | |
| GoogleTask2 | Multi LSTM layers + CTC; Character & word n-gram/Other | 6.86 | 19 | 10.26 | 27.05 | |

Task 3: Paragraph level (VNOnDB-Paragraph)

| System | Method | Public CER | Public WER | Secret CER | Secret WER | Paper/Source Code |
|---|---|---|---|---|---|---|
| MyScriptTask3_1 | Segmentation+FNN & BLSTM+CTC; word-based trigram/VTB | 0.78 | 1.38 | 1.92 | 5.81 | |
| MyScriptTask3_2 | Segmentation+FNN & BLSTM+CTC; syllable-based trigram/VTB + others | 1.32 | 3.4 | 2.62 | 7.74 | |
| MyScriptTask3_3 | Segmentation+FNN & BLSTM+CTC with Post-processing for Paragraph; word-based trigram/VTB | 0.4 | 1.05 | 3.69 | 7.84 | |
| IVTOVTask3 | 2 BLSTM layers + CTC/VTB/Dictionary | 3.75 | 16.09 | 7.31 | 24.07 | |
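
Every system listed above uses CTC (connectionist temporal classification) on top of a recurrent network. As a rough illustration of what CTC output looks like, the sketch below applies greedy (best-path) decoding: take the most likely label per frame, merge consecutive repeats, then drop the blank label. The function name, the toy alphabet, and the choice of index 0 for the blank are assumptions for illustration; the source does not say which decoders the participants used, and their systems additionally applied language models or dictionaries.

```python
def ctc_greedy_decode(frame_labels, alphabet, blank=0):
    """Collapse a per-frame label sequence into text using the CTC rule.

    frame_labels: most likely label index for each time step.
    alphabet: maps label index to character; alphabet[blank] is unused.
    blank: index of the CTC blank label (assumed to be 0 here).
    """
    decoded = []
    previous = blank
    for label in frame_labels:
        # Emit only when the label changes and is not the blank.
        if label != previous and label != blank:
            decoded.append(alphabet[label])
        previous = label
    return "".join(decoded)


# Toy alphabet; a real Vietnamese recognizer would cover all accented letters.
toy_alphabet = ["<blank>", "h", "a", "n", "o", "i"]
frames = [1, 1, 0, 2, 2, 0, 0, 3, 4, 4, 5]
decoded_text = ctc_greedy_decode(frames, toy_alphabet)
```

The blank is what lets CTC output a doubled letter: the frames `[1, 0, 1]` decode to two separate `h` characters, whereas `[1, 1]` decodes to one.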

Cinnamon AI Marathon: Handwriting OCR for Vietnamese Address

Given an image of a handwritten line, participants are required to create an OCR model to transcribe the image into text.

Leaderboard

| Model | WER | Method Reference | Code |
|---|---|---|---|
| CRNN | 0.1 | Blog Post | Official |

Miscellaneous

📁 Open sources