- Fork or clone this repo to your local machine.
- Navigate to the project directory and activate the virtualenv
source dev/bin/activate
- Install requirements
pip install -r requirements.txt
from the root directory.
- Navigate to packages -> regression_model, then run
tox
to make sure that everything is passing.
The following Kaggle notebooks include the complete basic analyses and EDA:
Scaffold template - starter template
Notes to consider:
- If you intend to use this project as a reference to build your own, you need to define your environment variables in CircleCI.
- Tests are disabled for this project; a pull request that adds them would be appreciated.
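
As a rough illustration of the first note, the sketch below shows one way Python code run in CI could fail early when a required variable has not been defined. The variable names `HEROKU_API_KEY` and `HEROKU_APP_NAME` and the helper `load_required_env` are hypothetical and are not taken from this project; substitute whatever your own CircleCI project settings define.

```python
import os
from typing import Dict, Mapping, Sequence

# Hypothetical variable names; replace with the ones your CircleCI project defines.
REQUIRED_VARS = ("HEROKU_API_KEY", "HEROKU_APP_NAME")


def load_required_env(
    names: Sequence[str], environ: Mapping[str, str] = os.environ
) -> Dict[str, str]:
    """Return the requested variables, or raise an error listing every missing one."""
    missing = [name for name in names if not environ.get(name)]
    if missing:
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: environ[name] for name in names}


# An explicit mapping is passed here so the sketch runs without real secrets.
fake_env = {"HEROKU_API_KEY": "dummy-key", "HEROKU_APP_NAME": "dummy-app"}
settings = load_required_env(REQUIRED_VARS, fake_env)
print(sorted(settings))
```

Calling `load_required_env(REQUIRED_VARS)` with no second argument reads from the real environment, which is where CircleCI exposes the variables defined in the project settings.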
I structured the scaffold following OOP principles, which separate the concerns of the code:
- processing: folder containing scripts for data wrangling, cleaning, and feature engineering.
- trained_model: folder containing scripts dedicated to building and tuning the model, and any related scripts.
- pipeline.py: file containing all the steps that should be run through sklearn.pipeline.
- predict.py: file dedicated to producing predictions.
- train_pipeline.py: file dedicated to training the model: downloading the dataset, splitting it, applying the pipeline, and so on.
- The resulting model is saved as a .pkl file versioned with the same version as the package.
- Only one API endpoint is deployed to Heroku so far; pull requests are welcome.
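
The division of labour described above (a pipeline definition, a training script that saves a versioned .pkl file, and a prediction script that loads it) can be sketched roughly as follows. This is a minimal illustration and not the project's actual code: the function names, the `PACKAGE_VERSION` constant, the file-name pattern, the chosen preprocessing steps, and the synthetic data are all assumptions made for the example.

```python
import pickle
import tempfile
from pathlib import Path

import numpy as np
from sklearn.impute import SimpleImputer
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

PACKAGE_VERSION = "0.1.0"  # hypothetical; the real package defines its own version


def build_pipeline() -> Pipeline:
    """Role of pipeline.py: chain preprocessing and the estimator in one object."""
    return Pipeline([
        ("imputer", SimpleImputer(strategy="median")),
        ("scaler", StandardScaler()),
        ("regressor", LinearRegression()),
    ])


def run_training(X, y, model_dir: Path) -> Path:
    """Role of train_pipeline.py: split, fit, and save a versioned .pkl file."""
    X_train, X_test, y_train, y_test = train_test_split(
        X, y, test_size=0.2, random_state=0
    )
    pipe = build_pipeline().fit(X_train, y_train)
    print(f"R^2 on held-out data: {pipe.score(X_test, y_test):.3f}")
    # Assumed naming scheme: the package version is embedded in the file name.
    model_path = model_dir / f"regression_model_v{PACKAGE_VERSION}.pkl"
    with model_path.open("wb") as f:
        pickle.dump(pipe, f)
    return model_path


def make_prediction(model_path: Path, X):
    """Role of predict.py: load the saved pipeline and return predictions."""
    with model_path.open("rb") as f:
        pipe = pickle.load(f)
    return pipe.predict(X)


# Synthetic data stands in for the downloaded dataset.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))
y = X @ np.array([2.0, -1.0, 0.5]) + rng.normal(scale=0.1, size=200)
X[::25, 1] = np.nan  # a few missing values for the imputer to handle

model_dir = Path(tempfile.mkdtemp())
saved_path = run_training(X, y, model_dir)
predictions = make_prediction(saved_path, X[:5])
print(saved_path.name, predictions.shape)
```

Because the whole pipeline object is pickled, the prediction step applies the same imputation and scaling that were fitted during training, and the version in the file name makes it clear which package release produced a given model.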
- Build extra API endpoints.
- Build a frontend interface using Streamlit.
- Complete the unit tests.
- Create a configuration file for Travis CI.
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.