
A repository of tools for a digital humanities class. The tools support exploratory data analysis of written texts in English and French.

mbardoe/digitalhumanities


digitalhumanities

The goal of this class is to create wrapper functions around `nltk` and other libraries that make exploratory data analysis of texts in English and French easier to do within a Jupyter notebook. Some functionality that I would like to see:

  • The ability to take a text, and in a single line create a word cloud.
  • Easily create data displays of substring frequency in the text.
  • Show distribution of paragraph and sentence length.
  • Easily show the word frequencies in the text.
  • Determine the grade level of various texts.
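As a rough illustration of what some of these features compute, here is a minimal standard-library sketch of word frequencies, sentence lengths, and a grade-level estimate. It is not this package's implementation: every function name below is hypothetical, the sentence splitter and syllable counter are crude heuristics, and the grade-level formula is the Flesch-Kincaid grade level, which is calibrated for English and should not be read as meaningful for French text.

```python
import re
from collections import Counter


def words(text):
    """Lowercased word tokens: runs of letters (accents included), with inner apostrophes."""
    return re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)*", text.lower())


def sentences(text):
    """Naive sentence split: break after '.', '!' or '?' when followed by whitespace."""
    return [s for s in re.split(r"(?<=[.!?])\s+", text.strip()) if s]


def word_frequencies(text, n=10):
    """The n most common words with their counts."""
    return Counter(words(text)).most_common(n)


def sentence_lengths(text):
    """Number of words in each sentence, in order of appearance."""
    return [len(words(s)) for s in sentences(text)]


def count_syllables(word):
    """Rough English heuristic: one syllable per group of consecutive vowels."""
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))


def flesch_kincaid_grade(text):
    """Flesch-Kincaid grade level, using the heuristic helpers above."""
    w = words(text)
    s = sentences(text)
    syllables = sum(count_syllables(x) for x in w)
    return 0.39 * len(w) / len(s) + 11.8 * syllables / len(w) - 15.59


sample = "The cat sat. The cat ran away! Did the dog see the cat?"
print(word_frequencies(sample, 2))   # [('the', 4), ('cat', 3)]
print(sentence_lengths(sample))      # [3, 4, 6]
print(flesch_kincaid_grade(sample))
```

The list returned by sentence_lengths can be passed to any histogram function to show the distribution of sentence length.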

Example

Below are some examples of how the functions can be used. The goal is for these functions to be easy to use in a Jupyter notebook for quick analysis.

import digitalhumanities
etrangfr=digitalhumanities.Corpus("texts/etrangerfr.txt", 'french')
etrangfr.wordcloud()

Word cloud for the French version of The Stranger

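# etrenglish is assumed to be a Corpus built from the English translation, in the same way as etrangfr above
etrenglish.show_occurences("?")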

Graphs showing question marks in the English translation of The Stranger
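How show_occurences builds its graphs is not shown here. As a hedged sketch of the underlying idea, the code below finds where a substring occurs and counts the occurrences in equal-width slices of the text, which is the kind of data such a graph could plot. The function names and the segments parameter are assumptions made for illustration and are not part of this package.

```python
def occurrence_positions(text, substring):
    """Start index of every non-overlapping occurrence of substring in text."""
    if not substring:
        raise ValueError("substring must not be empty")
    positions = []
    start = text.find(substring)
    while start != -1:
        positions.append(start)
        # Resume the search just past the occurrence that was found.
        start = text.find(substring, start + len(substring))
    return positions


def occurrences_per_segment(text, substring, segments=10):
    """Count occurrences in each of `segments` equal-width slices of the text."""
    counts = [0] * segments
    if not text:
        return counts
    for pos in occurrence_positions(text, substring):
        # Map the character position to a slice index in 0..segments-1.
        index = min(pos * segments // len(text), segments - 1)
        counts[index] += 1
    return counts


text = "a?b?c??d"
print(occurrence_positions(text, "?"))        # [1, 3, 5, 6]
print(occurrences_per_segment(text, "?", 4))  # [1, 1, 1, 1]
```

The per-segment counts can be drawn as a bar chart to see where in a text a character or phrase clusters.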
