This project is part of the Data Science Nanodegree program offered by Udacity in collaboration with Figure Eight (the data provider).
Following a disaster, there are typically many messages sent directly or through social media, right at the time when response organizations have the least capacity to filter them and pull out the most important ones. Often only about one message in every thousand may be relevant to disaster response organizations.
The components built for this project are the following:
- ETL script (data/process_data.py) that cleans the data and loads it into a database file.
- Training script (models/train_classifier.py) that builds a machine learning model using Natural Language Processing techniques to classify messages into several categories and stores the model in a pickle file.
- Website with a Flask backend that classifies any provided text into one or several categories and describes the training dataset through visualizations.
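As a rough illustration of the first component, the sketch below merges messages with their category labels and writes the result to a SQLite table. It is not the code in data/process_data.py: the column names (`id`, `message`, `categories`), the `name-flag;name-flag` label format, the sample rows, and the helper names `parse_categories` and `run_etl` are all assumptions made for the example. Only the table name "Message" comes from this README.

```python
import csv
import io
import sqlite3

# Hypothetical sample data; the real CSV layout is an assumption.
MESSAGES_CSV = "id,message\n1,We need water and food\n2,Is the storm over?\n"
CATEGORIES_CSV = (
    "id,categories\n"
    "1,water-1;food-1;weather-0\n"
    "2,water-0;food-0;weather-1\n"
)


def parse_categories(raw):
    """Turn 'water-1;food-0' into {'water': 1, 'food': 0}."""
    pairs = (item.rsplit("-", 1) for item in raw.split(";"))
    return {name: int(flag) for name, flag in pairs}


def run_etl(messages_csv, categories_csv, conn, table="Message"):
    """Join both CSV texts on id and store one row per message."""
    # Keying by id also drops duplicate message ids (last one wins).
    messages = {
        row["id"]: row["message"]
        for row in csv.DictReader(io.StringIO(messages_csv))
    }
    rows = []
    for row in csv.DictReader(io.StringIO(categories_csv)):
        if row["id"] in messages:
            rows.append({
                "id": int(row["id"]),
                "message": messages[row["id"]],
                **parse_categories(row["categories"]),
            })

    columns = list(rows[0])
    column_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)
    conn.execute(f'DROP TABLE IF EXISTS "{table}"')
    conn.execute(f'CREATE TABLE "{table}" ({column_sql})')
    conn.executemany(
        f'INSERT INTO "{table}" VALUES ({placeholders})',
        [tuple(r[c] for c in columns) for r in rows],
    )
    conn.commit()
    return len(rows)


# An in-memory database keeps the example free of side effects.
conn = sqlite3.connect(":memory:")
print(run_etl(MESSAGES_CSV, CATEGORIES_CSV, conn), "rows written")
```

Storing one 0/1 column per category keeps the table directly usable as a label matrix for a multi-label classifier.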
Run the following commands in the project's root directory to set up your database and model.
- To run the ETL pipeline that cleans the data and stores it in the database:
python data/process_data.py data/disaster_messages.csv data/disaster_categories.csv data/DisasterResponse.db
The table name by default is "Message".
- To run the ML pipeline that trains the classifier and saves it:
python models/train_classifier.py data/DisasterResponse.db models/classifier.pkl
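The training step can be pictured with the minimal sketch below: a text pipeline that predicts several categories at once and survives a pickle round trip. It assumes scikit-learn is installed and is not taken from models/train_classifier.py. The choice of `TfidfVectorizer`, `MultiOutputClassifier` and `RandomForestClassifier`, the toy messages, and the two category columns are all assumptions for illustration.

```python
import pickle

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.multioutput import MultiOutputClassifier
from sklearn.pipeline import Pipeline

# Hypothetical toy data; the real script reads from DisasterResponse.db.
messages = [
    "we need water and food",
    "please send drinking water",
    "the storm destroyed our house",
    "heavy rain and strong wind tonight",
]
# One column per category (assumed here: water, weather).
labels = np.array([[1, 0], [1, 0], [0, 1], [0, 1]])


def build_model():
    """Text features followed by one classifier per category."""
    return Pipeline([
        ("tfidf", TfidfVectorizer()),
        ("clf", MultiOutputClassifier(
            RandomForestClassifier(n_estimators=10, random_state=0))),
    ])


model = build_model()
model.fit(messages, labels)

# Round-trip through pickle in memory; the script writes a .pkl file instead.
restored = pickle.loads(pickle.dumps(model))
prediction = restored.predict(["we have no water"])
print(prediction.shape)  # one row per message, one column per category
```

Wrapping the vectorizer and the classifier in a single pipeline means the pickled object accepts raw text, so a web app can call `predict` without repeating any preprocessing.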
Run the following command in the app's directory to run your web app.
python run.py
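Go to http://0.0.0.0:3001/ in your browser. If that address does not load on your system, http://localhost:3001/ usually reaches the same server.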
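For orientation, here is a minimal sketch of how a Flask backend could expose a classification endpoint on that host and port. It assumes Flask is installed and is not the project's run.py: the `/classify` route, the `query` parameter, the `main` helper, and the keyword rules that stand in for the trained model are all hypothetical.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

# Hypothetical keyword rules standing in for the model in classifier.pkl.
KEYWORDS = {
    "water": ["water", "thirsty"],
    "food": ["food", "hungry"],
}


def classify(text):
    """Return every category whose keywords appear in the text."""
    words = text.lower().split()
    return [
        category
        for category, keys in KEYWORDS.items()
        if any(key in words for key in keys)
    ]


@app.route("/classify")
def classify_route():
    # Example request: /classify?query=We+need+water
    query = request.args.get("query", "")
    return jsonify(query=query, categories=classify(query))


def main():
    # Binding to 0.0.0.0 makes the server listen on all network interfaces.
    app.run(host="0.0.0.0", port=3001)
```

Calling `main()` starts the development server. In a real app, `classify` would load the pickled model once at startup and call its `predict` method.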