Word Segmentation

VLSP 2013

The training set consists of 75k manually word-segmented sentences (about 23 words per sentence on average). The test set consists of 2,120 sentences (about 31 words per sentence) in 10 files, 800001.seg through 800010.seg.

Leaderboard

| Model | F1 | Method | Reference | Code |
| --- | --- | --- | --- | --- |
| UITws-v1 | 98.06 | Nguyen et al. PACLING'19 | | Official |
| RDRsegmenter | 97.90 | Nguyen et al. LREC'18 | | Official |
| jPTDP-v2 | 97.90 | Nguyen et al. CoNLL'18 | Nguyen '18 | Official |
| Biaffine | 97.90 | Dozat and Manning ICLR'17 | Nguyen '18 | |
| UETsegmenter | 97.87 | Nguyen et al. RIVF'16 | | Official |
| JointWPD | 97.78 | | Nguyen '18 | |
| vnTokenizer | 97.33 | Le et al. LATA'08 | | Official |
| JVnSegmenter | 97.06 | Nguyen et al. PACLIC'06 | | Official |
| DongDu | 96.90 | | | Official |
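
For readers who want to compare their own segmenter against the scores above, here is a minimal sketch of a word-level F1 computation. It rests on several assumptions that the source does not state: that each sentence is a whitespace-separated line, that the syllables of a multi-syllable word are joined with underscores (the usual Vietnamese convention), and that F1 is computed over exact word-span matches. The function names `word_spans` and `segmentation_f1` are hypothetical, and the official evaluation script may differ in details.

```python
from typing import List, Set, Tuple


def word_spans(sentence: str) -> Set[Tuple[int, int]]:
    """Convert a segmented sentence into (start, end) syllable-index spans.

    Assumption: words are separated by whitespace and the syllables
    inside a word are joined by underscores, e.g. "Học_sinh".
    """
    spans = set()
    start = 0
    for word in sentence.split():
        # Count the syllables in this word; treat a bare "_" token as one unit.
        n_syllables = len([s for s in word.split("_") if s]) or 1
        spans.add((start, start + n_syllables))
        start += n_syllables
    return spans


def segmentation_f1(gold: List[str], pred: List[str]) -> float:
    """Word-level F1 over parallel lists of gold and predicted sentences."""
    true_pos = n_gold = n_pred = 0
    for gold_sent, pred_sent in zip(gold, pred):
        gold_spans = word_spans(gold_sent)
        pred_spans = word_spans(pred_sent)
        # A predicted word is correct only if both of its boundaries match.
        true_pos += len(gold_spans & pred_spans)
        n_gold += len(gold_spans)
        n_pred += len(pred_spans)
    if true_pos == 0:
        return 0.0
    precision = true_pos / n_pred
    recall = true_pos / n_gold
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    gold = ["Học_sinh học sinh_học"]
    pred = ["Học_sinh học sinh học"]
    print(f"F1 = {segmentation_f1(gold, pred):.4f}")
```

In the toy example, the prediction recovers two of the three gold words and proposes four words in total, so precision is 2/4 and recall is 2/3. Comparing spans over syllable indices rather than raw strings keeps repeated words in one sentence from being conflated.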

VietTreeBank

References

Miscellaneous

📜 Papers

💫 Services:

📁 Open sources