HTTParser

Overview

HTTParser is an open-source Python library designed for parsing web content using various HTTP methods. It allows for both static and dynamic content extraction, making it a versatile tool for web scraping and data retrieval tasks.

This tool is valuable for anyone working with web scraping, API testing, or any application requiring advanced HTTP response handling and parsing. Its modular design allows for easy extension or modification to suit specific needs or handle various web content types.

Key Features

  • Supports GET and POST methods.
  • Handles multiple response formats: JSON, HTML, JavaScript.
  • Customizable request headers, parameters, and payload.
  • Option to parse dynamic content using Selenium WebDriver.
  • Simple and intuitive interface for making HTTP requests.
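The GET/POST dispatch described above can be sketched with the standard library alone. This is a simplified illustration of the idea, not HTTParser's actual implementation; the build_request helper is hypothetical:

```python
import json
import urllib.request

def build_request(url, method, headers=None, payload=None):
    # Hypothetical sketch: assemble a GET or POST request the way a
    # dispatcher like HTTParser's might, without sending it.
    data = None
    if method.lower() == "post" and payload is not None:
        # Serialize the payload as JSON for the request body.
        data = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        url,
        data=data,
        headers=headers or {},
        method=method.upper(),
    )

req = build_request("https://httpbin.org/anything", "post", payload={"key": "value"})
print(req.get_method())  # POST
```

Sending the prepared request (for example with urllib.request.urlopen) and decoding the body per response_format is the part the library handles for you.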

Prerequisites

  • Python 3.x

Dependencies

The following Python packages are required:

  • requests: For making HTTP requests.
  • beautifulsoup4: For parsing HTML responses.

The following Python packages are optional:

  • selenium: For rendering dynamic, JavaScript-driven content.

Installation

To install HTTParser, clone the repository and install dependencies:

git clone https://github.com/RMNCLDYO/HTTParser.git
cd HTTParser
pip install -r requirements.txt

Available Variables

  • url: URL of the page to be parsed. ( REQUIRED )
  • method: HTTP method, options: "get" or "post". ( REQUIRED )
  • response_format: Response format, options: "js", "json", or "html". ( REQUIRED )
  • headers: Custom HTTP headers, format: { "header_name": "header_value" }. ( OPTIONAL )
  • params: URL parameters, format: { "param_name": "param_value" }. ( OPTIONAL )
  • payload: Data payload for POST requests, format: { "payload_name": "payload_value" }. ( OPTIONAL )
  • browser_path: Path to the web browser, used for JavaScript rendering. ( OPTIONAL )
  • chromedriver_path: Path to ChromeDriver, used for JavaScript rendering. ( OPTIONAL )
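As a quick illustration of the required/optional split above, a minimal argument check might look like the following. The validate_args helper is hypothetical and not part of HTTParser's API; only the allowed values come from the list above:

```python
REQUIRED = {"url", "method", "response_format"}
VALID_METHODS = {"get", "post"}
VALID_FORMATS = {"js", "json", "html"}

def validate_args(**kwargs):
    # Hypothetical sketch: check the variables listed above before
    # issuing a request.
    missing = REQUIRED - kwargs.keys()
    if missing:
        raise ValueError(f"missing required arguments: {sorted(missing)}")
    if kwargs["method"] not in VALID_METHODS:
        raise ValueError(f"method must be one of {sorted(VALID_METHODS)}")
    if kwargs["response_format"] not in VALID_FORMATS:
        raise ValueError(f"response_format must be one of {sorted(VALID_FORMATS)}")

# Passes silently: all three required arguments present and valid.
validate_args(url="https://httpbin.org/json", method="get", response_format="json")
```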

Usage

HTML Usage

GET Method

from httparser import HTTParser

request = HTTParser(
    url="https://httpbin.org/html",
    method="get",
    response_format="html"
)

response = request.response()
print(response)

JSON Usage

GET Method

from httparser import HTTParser

request = HTTParser(
    url="https://httpbin.org/json",
    method="get",
    response_format="json"
)

response = request.response()
print(response)

POST Method

from httparser import HTTParser

request = HTTParser(
    url="https://httpbin.org/anything",
    method="post",
    response_format="json",
    payload={"HTTParser":"Example Payload"}
)

response = request.response()
print(response)

Dynamic (JS) Usage

GET Method

from httparser import HTTParser

request = HTTParser(
    url="https://httpbin.org/delay/3",
    method="get",
    response_format="js",
    browser_path="/path/to/browser",
    chromedriver_path="/path/to/chromedriver"
)

response = request.response()
print(response)

Dynamic Content Rendering with JavaScript (optional)

Installation

pip install selenium

Setting Up ChromeDriver and WebDrivers

To ensure HTTParser works effectively, especially for content that requires JavaScript rendering, you'll need to download ChromeDriver and have a compatible Chromium-based browser installed.

Choosing a Compatible Browser

While ChromeDriver is designed for Chrome, it also works with other Chromium-based browsers. Here are some options:

  • Google Chrome
  • Brave Browser
  • Opera Browser

Visit Supported WebDrivers to explore other Chromium-based browsers.

Downloading ChromeDriver

  1. Visit ChromeDriver Downloads to download the latest ChromeDriver.
  2. Choose the version that matches your browser's version. To check your browser version, navigate to 'Help > About' in your browser.
  3. Download the appropriate ChromeDriver for your operating system (Windows, Mac, or Linux).

Installing ChromeDriver

Follow the detailed instructions on the ChromeDriver Getting Started page for your specific operating system.

Error Handling

HTTParser logs errors in Error.log. Check this file for error details.
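The entry format of Error.log is not documented here, so the timestamped line below is an assumption; a generic helper for inspecting the newest entries might look like this (latest_errors is hypothetical, not part of HTTParser):

```python
import tempfile
from pathlib import Path

def latest_errors(log_path, n=5):
    # Hypothetical helper: return up to the last n lines of the error log,
    # oldest first, or an empty list if the log does not exist yet.
    path = Path(log_path)
    if not path.exists():
        return []
    return path.read_text(encoding="utf-8").splitlines()[-n:]

# Example with a throwaway log file mimicking a plausible entry format.
with tempfile.TemporaryDirectory() as tmp:
    log = Path(tmp) / "Error.log"
    log.write_text("2024-01-01 12:00:00 ERROR connection timed out\n", encoding="utf-8")
    print(latest_errors(log))  # ['2024-01-01 12:00:00 ERROR connection timed out']
```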

Contributing

Contributions are welcome!

Please refer to CONTRIBUTING.md for detailed guidelines on how to contribute to this project.

Reporting Issues

Encountered a bug? We'd love to hear about it. Please follow these steps to report any issues:

  1. Check if the issue has already been reported.
  2. Use the Bug Report template to create a detailed report.
  3. Submit the report here.

Your report will help us make the project better for everyone.

Feature Requests

Got an idea for a new feature? Feel free to suggest it. Here's how:

  1. Check if the feature has already been suggested or implemented.
  2. Use the Feature Request template to create a detailed request.
  3. Submit the request here.

Your suggestions for improvements are always welcome.

Versioning and Changelog

Stay up-to-date with the latest changes and improvements in each version:

  • CHANGELOG.md provides detailed descriptions of each release.

Security

Your security is important to us. If you discover a security vulnerability, please follow the responsible disclosure guidelines in SECURITY.md, and refrain from disclosing it publicly until it has been reported and addressed.

License

Licensed under the MIT License. See LICENSE for details.
