NLP_read_abbreviation

Takes a string as input and expands all abbreviations into English words.

read_abbre_main.py

Reads read_abbre.pkl to expand and clean up abbreviations. Example:
abbre_then_replace('i h8 yuo fuckin c-u-n-t, yuo shold dickh3adddddd fuckkkkkkk di3 @$$h0l3 - i will k1ll u @TEOTD')
OUTPUT >>> 'i hate you fucking cunt you should dickhead fuck die asshole i will kill u it the end of the day'
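
The general idea of dictionary-based abbreviation expansion can be sketched as below. This is only an illustration, not the code in read_abbre_main.py: the `ABBREVIATIONS` table is a hypothetical stand-in for whatever read_abbre.pkl actually stores, and `collapse_repeats` and `expand_abbreviations` are made-up helper names.

```python
import re

# Hypothetical stand-in for the mapping stored in read_abbre.pkl;
# the real file's structure is an assumption here.
ABBREVIATIONS = {
    "h8": "hate",
    "u": "you",
    "teotd": "the end of the day",
}


def collapse_repeats(token):
    """Reduce any run of three or more identical characters to one."""
    return re.sub(r"(.)\1{2,}", r"\1", token)


def expand_abbreviations(text, table):
    """Lowercase the text, split on whitespace, and swap in known expansions."""
    words = []
    for raw in text.lower().split():
        # Drop surrounding punctuation, then squeeze stretched letters.
        token = collapse_repeats(raw.strip("@.,!?-"))
        if not token:
            continue  # the token was punctuation only
        words.append(table.get(token, token))
    return " ".join(words)


print(expand_abbreviations("i h8 u sooooo much", ABBREVIATIONS))
```

A real cleaner would also need spelling correction and digit-to-letter substitution (for inputs such as "k1ll" or "yuo"), which this sketch leaves out.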

big.txt

English word corpus containing 205k unique words (10% of the bad words appear in it).

main.py

Returns the toxicity levels of words.

DEPENDENCY FILES

The following files must be located in the same directory as the main file so that it can load their functions and data:

- read_abbre_main.py
- sonar_func.py
- correct_repeatedBadWords.pkl
- New_allcalled.pkl (too large to store on GitHub, so it is hosted in a Kaggle dataset: https://www.kaggle.com/chadapamettapun/nlp-hatespeech)
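
Before running main.py, it may help to confirm that the dependency files are in place. The snippet below is a minimal sketch of such a check; `REQUIRED_FILES` and `missing_dependencies` are hypothetical names for this example and are not part of the repository.

```python
from pathlib import Path

# File names taken from the dependency list above.
REQUIRED_FILES = [
    "read_abbre_main.py",
    "sonar_func.py",
    "correct_repeatedBadWords.pkl",
    "New_allcalled.pkl",
]


def missing_dependencies(base_dir, required=REQUIRED_FILES):
    """Return the required file names that are not present in base_dir."""
    base = Path(base_dir)
    return [name for name in required if not (base / name).is_file()]


if __name__ == "__main__":
    # Assumes the script is run from the directory that holds main.py.
    missing = missing_dependencies(".")
    if missing:
        print("Missing dependency files:", ", ".join(missing))
    else:
        print("All dependency files found.")
```

If New_allcalled.pkl is reported missing, download it from the Kaggle dataset and place it next to main.py.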

