AutoEDA is an open-source project designed to automate the data preprocessing workflow, making it easier for data scientists and analysts to prepare their datasets for exploratory data analysis (EDA) and machine learning model building. This toolkit aims to eliminate null values, prepare clean datasets, perform feature engineering, and optimize preprocessing strategies for seamless integration with various machine learning models.
We welcome contributions from the community! Below are the areas where you can contribute:
- Enhancing the Frontend: Improve the user interface and user experience of the application using modern frontend frameworks and libraries.
- Adding Necessary Pages: Implement additional pages like a feedback form or documentation section to enhance user interaction and gather input.
Contributors can also assist with the machine learning and backend work:
- Data Loading: Implement efficient methods to read and store CSV files.
- Data Cleaning: Develop algorithms for handling null values, removing duplicates, and correcting data types.
- Feature Engineering: Introduce techniques to create new features from existing data for improved model performance.
- Model Training: Experiment with various machine learning algorithms to optimize preprocessing strategies.
- Building Functions for Data Processing: Develop different functions that support data cleaning, filtering, and preprocessing procedures.
- Creating APIs: Build APIs for the machine learning model to handle data uploads and downloads efficiently.
- Integration: Ensure seamless integration between the frontend and backend components.
- Dockerization: Dockerize the application to streamline deployment and ensure consistent environments.
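
As a starting point for the Data Loading and Data Cleaning areas listed above, here is a minimal sketch using only the Python standard library. It is not the project's actual implementation: the function names (`load_rows`, `drop_duplicates`, `fill_numeric_nulls`), the sample data, and the choice of mean imputation are all assumptions made for illustration.

```python
import csv
import io
from statistics import mean


def load_rows(csv_text):
    """Parse CSV text into a list of dicts, one per row (hypothetical helper)."""
    return list(csv.DictReader(io.StringIO(csv_text)))


def drop_duplicates(rows):
    """Remove exact duplicate rows, keeping the first occurrence."""
    seen = set()
    unique = []
    for row in rows:
        key = tuple(row.items())
        if key not in seen:
            seen.add(key)
            unique.append(row)
    return unique


def fill_numeric_nulls(rows, column):
    """Convert a column to float and replace empty cells with the column mean.

    Mean imputation is an assumption here; other strategies may suit the data better.
    """
    present = [float(r[column]) for r in rows if (r[column] or "").strip()]
    fallback = mean(present) if present else 0.0
    cleaned = []
    for r in rows:
        value = (r[column] or "").strip()
        cleaned.append({**r, column: float(value) if value else fallback})
    return cleaned


# Made-up sample data: one duplicate row and one missing age.
sample = "name,age\nAna,30\nBen,\nAna,30\nCy,40\n"
rows = fill_numeric_nulls(drop_duplicates(load_rows(sample)), "age")
```

With this sample, the duplicate `Ana` row is dropped and the missing age for `Ben` is filled with the mean of the remaining values. A real contribution would likely read from an uploaded file and support several imputation strategies.
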
The project uses the following technologies:
- React.JS + Vite (for frontend)
- Python 3.x (for backend + model building)
- Docker (for containerization)
Make sure the .gitignore file is present before you commit. Git reads it automatically, so there is nothing to run.
To clone the repository, run the following command:
git clone https://github.com/Nidhi-Satyapriya/AutoEDA-Automated-Data-Preprocessing-Toolkit
Once cloned, follow the instructions in the respective frontend and backend directories for setup and running the application.