yokkm/NLP_read_abbreviation

NLP_read_abbreviation

Reads a string as input and expands all abbreviations in it into English words.

read_abbre_main.py

  • reads read_abbre.pkl to clean up abbreviations
  • abbre_then_replace('i h8 yuo fuckin c-u-n-t, yuo shold dickh3adddddd fuckkkkkkk di3 @$$h0l3 - i will k1ll u @TEOTD')
  • OUTPUT >>> 'i hate you fucking cunt you should dickhead fuck die asshole i will kill u it the end of the day'
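
The kind of normalization shown above can be sketched roughly as follows. This is only an illustrative sketch, not the code in read_abbre_main.py: the function name `expand_abbreviations` and the two small lookup tables are hypothetical, and the real mapping lives in read_abbre.pkl, whose contents are not shown here. It looks up whole-token abbreviations first, then undoes simple character substitutions and collapses long runs of a repeated letter.

```python
import re

# Hypothetical lookup tables for illustration only; the repository loads its
# real mapping from read_abbre.pkl.
ABBREVIATIONS = {"h8": "hate", "u": "you", "gr8": "great", "b4": "before"}
LEET = str.maketrans({"3": "e", "0": "o", "1": "i", "@": "a", "$": "s"})


def expand_abbreviations(text):
    """Expand known abbreviations, then undo simple character substitutions."""
    words = []
    for token in text.lower().split():
        token = token.strip(".,!?-")      # drop surrounding punctuation
        if not token:
            continue
        if token in ABBREVIATIONS:        # whole-token abbreviation wins
            words.append(ABBREVIATIONS[token])
            continue
        token = token.translate(LEET)     # e.g. "di3" becomes "die"
        # Collapse a run of three or more identical characters to one.
        token = re.sub(r"(.)\1{2,}", r"\1", token)
        words.append(token)
    return " ".join(words)


print(expand_abbreviations("i h8 u, pl3as3 stoppppp"))
```

Collapsing every long run to a single character is a simplification: a word such as "coool" would come out as "col", so a real pipeline presumably needs a spelling-correction step against a word list such as big.txt.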

big.txt

  • English word corpus containing 205k unique words (10% of the bad words are included in it)

main.py

  • returns the toxicity levels of words

DEPENDENCY FILES

The following files need to be located in the same directory as the main file so that it can load their functions and data:

- read_abbre_main.py
- sonar_func.py
- correct_repeatedBadWords.pkl
- New_allcalled.pkl # this file is too large to be stored on GitHub, so it is stored in a Kaggle dataset
(https://www.kaggle.com/chadapamettapun/nlp-hatespeech)
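
Because a missing file only shows up as an error once main.py tries to load it, a quick check before running may help. The sketch below is an assumption-laden helper, not part of the repository: the names `missing_dependencies` and `load_pickle` are hypothetical, and the file list is copied from the list above (taking the second entry to be sonar_func.py).

```python
import pickle
from pathlib import Path

# File names taken from the dependency list in this README.
REQUIRED_FILES = [
    "read_abbre_main.py",
    "sonar_func.py",
    "correct_repeatedBadWords.pkl",
    "New_allcalled.pkl",  # must be downloaded from the Kaggle dataset first
]


def missing_dependencies(base_dir, required=REQUIRED_FILES):
    """Return the required file names that are absent from base_dir."""
    base = Path(base_dir)
    return [name for name in required if not (base / name).is_file()]


def load_pickle(path):
    """Load one pickle file; only use this on files from a trusted source."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


if __name__ == "__main__":
    missing = missing_dependencies(".")
    if missing:
        print("Missing dependency files:", ", ".join(missing))
    else:
        print("All dependency files found.")
```

Unpickling can execute arbitrary code, so the .pkl files should come only from this repository or the linked Kaggle dataset.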
