Welcome to the NLP Projects repository! It contains the following Natural Language Processing (NLP) projects, developed by [Your Name or Organization]:
- News Classification
- Auto Correct
- Measure Similarity
- Text Summarization
- Email Spam Detection
- Resume Classification
- Knowledge Graph

News Classification
- Description: This project aims to classify news articles into different categories using NLP techniques. It involves text preprocessing, feature extraction, and machine learning classification algorithms.
- Files:
  - `fars_news_v1.2.ipynb`: Jupyter notebook containing the code for news classification.
  - `test.txt`: Sample test data for the classification.
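
As a rough illustration of the pipeline described for news classification (preprocessing, feature extraction, classification), here is a minimal sketch of a bag-of-words multinomial Naive Bayes classifier written from scratch. It is not the notebook's code: the `NaiveBayesClassifier` class, the training sentences, and the category labels are all assumptions made for this example.

```python
import math
import re
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


class NaiveBayesClassifier:
    """Multinomial Naive Bayes over bag-of-words counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        self.vocab = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocab.update(tokens)
        return self

    def predict(self, text):
        tokens = tokenize(text)
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokens:
                count = self.word_counts[label][token] + 1
                score += math.log(count / (total_words + len(self.vocab)))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Toy training data, invented for illustration only.
train_texts = [
    "the team won the football match",
    "the striker scored two goals",
    "the central bank raised interest rates",
    "stock markets fell after the inflation report",
]
train_labels = ["sports", "sports", "economy", "economy"]

model = NaiveBayesClassifier().fit(train_texts, train_labels)
print(model.predict("the bank cut rates"))
```

A real run would train on a labelled news corpus and would likely use a library implementation; the sketch only shows how the three stages fit together.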

Auto Correct
- Description: Implementation of an auto-correct system using NLP algorithms. The system corrects spelling mistakes in text input by suggesting the most probable corrections based on context and language models.
- Files:
  - `main.py`: Main script for the auto-correct system.
  - `model/`: Directory containing modules for edit distance calculation, Jaccard similarity, and pre-processing.
  - `words_en.csv`: English word dataset.
  - `words_fa.csv`: Persian (Farsi) word dataset.
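
The two measures named for the `model/` directory, edit distance and Jaccard similarity, can be combined to rank correction candidates. The sketch below is one possible arrangement, not the project's actual code: the function names, the use of character bigrams for Jaccard, and the small vocabulary are assumptions for illustration.

```python
def edit_distance(a, b):
    """Levenshtein distance computed with a rolling dynamic-programming row."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def bigrams(word):
    """Set of character bigrams; words shorter than two characters map to themselves."""
    return {word[i:i + 2] for i in range(len(word) - 1)} or {word}


def jaccard(a, b):
    """Jaccard similarity between the character-bigram sets of two words."""
    set_a, set_b = bigrams(a), bigrams(b)
    return len(set_a & set_b) / len(set_a | set_b)


def suggest(word, vocabulary, max_suggestions=3):
    """Rank vocabulary words by edit distance, breaking ties by Jaccard similarity."""
    ranked = sorted(vocabulary, key=lambda w: (edit_distance(word, w), -jaccard(word, w)))
    return ranked[:max_suggestions]


# Hypothetical vocabulary; a real system would load one from a word list.
vocabulary = ["language", "luggage", "message", "passage", "landscape"]
print(suggest("langauge", vocabulary, max_suggestions=1))
```

This ranking ignores context and word frequency, both of which the description mentions; it only shows the string-similarity part.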

Measure Similarity
- Description: A project to measure the similarity between two texts. It involves calculating various similarity metrics such as cosine similarity, Jaccard similarity, or edit distance.
- Files:
  - `main.ipynb`: Jupyter notebook containing code for measuring text similarity.
  - `data set/`: Directory containing sample text files for similarity measurement.
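
To make two of the listed metrics concrete, the following sketch computes cosine similarity over term-frequency vectors and Jaccard similarity over token sets. The function names and the word-level tokenization are assumptions; the notebook may define these metrics differently, for example over TF-IDF weights.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into word tokens."""
    return re.findall(r"\w+", text.lower())


def cosine_similarity(text_a, text_b):
    """Cosine of the angle between the term-frequency vectors of two texts."""
    counts_a, counts_b = Counter(tokenize(text_a)), Counter(tokenize(text_b))
    dot = sum(counts_a[t] * counts_b[t] for t in counts_a.keys() & counts_b.keys())
    norm_a = math.sqrt(sum(c * c for c in counts_a.values()))
    norm_b = math.sqrt(sum(c * c for c in counts_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0  # An empty text has no direction to compare.
    return dot / (norm_a * norm_b)


def jaccard_similarity(text_a, text_b):
    """Size of the shared vocabulary divided by the size of the combined vocabulary."""
    set_a, set_b = set(tokenize(text_a)), set(tokenize(text_b))
    if not set_a and not set_b:
        return 1.0  # Two empty texts are treated as identical.
    return len(set_a & set_b) / len(set_a | set_b)


print(cosine_similarity("the cat sat", "the cat ran"))
print(jaccard_similarity("the cat sat", "the cat ran"))
```

Cosine similarity takes word counts into account, while Jaccard only looks at which words occur, so the two can rank the same pair of texts differently.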

Text Summarization
- Description: Implementation of text summarization techniques using NLP. The project aims to generate concise summaries of large text documents or articles while preserving the key information.
- Files:
  - `main.ipynb`: Jupyter notebook with code for text summarization.
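
The description does not say which summarization technique the notebook uses. As one simple possibility, the sketch below shows frequency-based extractive summarization: sentences are scored by the average frequency of their non-stop-words, and the top-scoring ones are returned in their original order. The stop-word list, function names, and sample text are assumptions for illustration.

```python
import re
from collections import Counter

# Tiny illustrative stop-word list; a real project would use a fuller one.
STOP_WORDS = {"a", "an", "and", "are", "in", "is", "it", "of", "the", "to"}


def split_sentences(text):
    """Naive sentence splitter on '.', '!' and '?' followed by whitespace."""
    return [s.strip() for s in re.split(r"(?<=[.!?])\s+", text) if s.strip()]


def content_words(text):
    """Lowercased word tokens with stop words removed."""
    return [w for w in re.findall(r"\w+", text.lower()) if w not in STOP_WORDS]


def summarize(text, max_sentences=2):
    """Return the highest-scoring sentences, kept in their original order."""
    sentences = split_sentences(text)
    frequencies = Counter(content_words(text))

    def score(sentence):
        tokens = content_words(sentence)
        # Average frequency, so long sentences are not favoured automatically.
        return sum(frequencies[w] for w in tokens) / len(tokens) if tokens else 0.0

    ranked = sorted(range(len(sentences)), key=lambda i: score(sentences[i]), reverse=True)
    chosen = sorted(ranked[:max_sentences])
    return " ".join(sentences[i] for i in chosen)


sample = (
    "Cats are popular pets. Cats sleep most of the day. "
    "The weather was mild yesterday. Many pets enjoy sleep."
)
print(summarize(sample, max_sentences=2))
```

Extractive methods like this one only select existing sentences; generating new wording would require an abstractive model.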

Email Spam Detection
- Description: Detecting spam emails using various machine learning algorithms and NLP features. The project involves text preprocessing, feature extraction, model training, and evaluation.
- Files:
  - `src/`: Directory containing scripts for pre-processing, feature extraction, model training, and evaluation.
  - `README.md`: Details about the project and its implementation.
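
The feature extraction and evaluation stages can be sketched independently of any particular model. In the example below, a hand-written rule stands in for the trained classifier so that the block stays self-contained. The trigger words, feature names, threshold, and sample emails are all hypothetical and are not taken from the scripts in `src/`.

```python
import re

# Hypothetical trigger words chosen for illustration, not taken from the project.
TRIGGER_WORDS = {"free", "winner", "prize", "urgent", "click"}


def extract_features(email_text):
    """Turn raw email text into a small dictionary of numeric features."""
    tokens = re.findall(r"[A-Za-z']+", email_text)
    letters = [c for c in email_text if c.isalpha()]
    return {
        "trigger_words": sum(t.lower() in TRIGGER_WORDS for t in tokens),
        "exclamations": email_text.count("!"),
        "upper_ratio": sum(c.isupper() for c in letters) / len(letters) if letters else 0.0,
    }


def rule_based_predict(features):
    """Baseline stand-in for a trained model: flag spam when weak signals add up."""
    shouting = 2 if features["upper_ratio"] > 0.5 else 0
    score = features["trigger_words"] + features["exclamations"] + shouting
    return score >= 2


def evaluate(y_true, y_pred):
    """Precision, recall, and F1 for the positive (spam) class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t and p)
    fp = sum(1 for t, p in pairs if not t and p)
    fn = sum(1 for t, p in pairs if t and not p)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


# Invented examples: (text, is_spam).
emails = [
    ("URGENT! Click now to claim your FREE prize!", True),
    ("Meeting moved to 3 pm, see agenda attached.", False),
    ("You are a winner! Click here.", True),
    ("Lunch tomorrow?", False),
]
labels = [is_spam for _, is_spam in emails]
predictions = [rule_based_predict(extract_features(text)) for text, _ in emails]
print(evaluate(labels, predictions))
```

In a trained pipeline, `rule_based_predict` would be replaced by a fitted classifier, and the evaluation would run on held-out data instead of the same toy examples.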

Resume Classification
- Description: Classifying resumes into different categories based on their content. It involves extracting relevant information from resumes and using machine learning algorithms for classification.
- Files:
  - `resume_classification.ipynb`: Jupyter notebook for resume classification.
  - `resume_dataset.csv`: Dataset containing resume samples.
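
The information-extraction step might look roughly like the sketch below, which pulls an email address and a list of known skills out of raw resume text. The skill vocabulary, the email pattern, and the field names are assumptions for illustration; the notebook may extract different fields or skip this step in favour of plain text features.

```python
import re

# Hypothetical skill vocabulary; the project's real categories and features may differ.
SKILLS = {"python", "sql", "excel", "java", "tensorflow"}

# Simplified email pattern, adequate for a demo but not a full address validator.
EMAIL_PATTERN = re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")


def extract_resume_fields(resume_text):
    """Return the first email address found and the known skills mentioned."""
    emails = EMAIL_PATTERN.findall(resume_text)
    tokens = set(re.findall(r"[a-z+#]+", resume_text.lower()))
    return {
        "email": emails[0] if emails else None,
        "skills": sorted(SKILLS & tokens),
    }


# Invented resume snippet.
resume = "Jane Doe, jane.doe@example.com. Skills: Python, SQL, and Excel."
print(extract_resume_fields(resume))
```

Fields like these could then be fed to a classifier as features alongside the resume text itself.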

Knowledge Graph
- Description: Building a knowledge graph from text data. The project involves extracting entities and relationships from unstructured text and representing them in a structured graph format.
- Files:
  - `example/`: Directory containing example data and scripts for building the knowledge graph.
  - `src/`: Directory containing scripts for extracting details, processing data, and building the knowledge graph.
  - `README.md`: Information about the project and how to use it.
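
To show what "entities and relationships in a structured graph format" can mean in practice, the sketch below extracts (subject, relation, object) triples with a fixed pattern and stores them as an adjacency mapping. This is a deliberately simple stand-in: the relation phrases, function names, and sentences are assumptions, and the scripts in `src/` may rely on named-entity recognition or dependency parsing instead.

```python
import re
from collections import defaultdict

# Hypothetical relation phrases; longer phrases come first so they win over "is a".
RELATION_PATTERN = re.compile(
    r"^(?P<subject>.+?)\s+"
    r"(?P<relation>is located in|was founded by|works at|is a)\s+"
    r"(?P<object>.+?)\.?$"
)


def extract_triple(sentence):
    """Return (subject, relation, object) or None if no known relation is found."""
    match = RELATION_PATTERN.match(sentence.strip())
    if not match:
        return None
    return match.group("subject"), match.group("relation"), match.group("object")


def build_graph(sentences):
    """Map each subject to a list of (relation, object) edges."""
    graph = defaultdict(list)
    for sentence in sentences:
        triple = extract_triple(sentence)
        if triple:
            subject, relation, obj = triple
            graph[subject].append((relation, obj))
    return dict(graph)


# Invented sentences for illustration.
sentences = [
    "Tehran is located in Iran.",
    "Tehran is a city.",
    "Ada works at Acme",
    "This sentence has no known relation.",
]
print(build_graph(sentences))
```

A mapping like this can be loaded into a graph library or database for querying; sentences that match no pattern are simply skipped.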
Feel free to explore each project folder for more details and instructions on how to run the code.