
This repository shows how Python can interact with a PDF file: how to extract sentences from the "PATTERN MATCHING IRREGULAR" PDF file, and how to build a new, properly formatted file from them with no irregularities.


AvirajBattan/Extract-Data


extract_data

task-1.py := this script extracts all the sentences, and the numerical data associated with them, from the a.pdf file. The sentences are stored in output.txt and the numerical data in number_data.txt.
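The separation step described above could look roughly like the sketch below. It is not the code from task-1.py: it assumes the text of a.pdf has already been pulled into a string by a PDF library, that each line mixes words with numbers, and that the numbers are plain integers or decimals. The names `NUMBER_RE` and `split_sentences_and_numbers` are hypothetical.

```python
import re

# Hypothetical pattern: optionally signed integers or decimals.
NUMBER_RE = re.compile(r"-?\d+(?:\.\d+)?")


def split_sentences_and_numbers(text):
    """Split each non-empty line into its words and its numbers.

    Returns two lists of equal length: the cleaned sentences and,
    for each sentence, its numbers joined by single spaces.
    """
    sentences, numbers = [], []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from extraction
        found = NUMBER_RE.findall(line)
        # Drop the numbers, then collapse the irregular spacing left behind.
        words_only = NUMBER_RE.sub(" ", line)
        sentences.append(" ".join(words_only.split()))
        numbers.append(" ".join(found))
    return sentences, numbers
```

Writing the first list to output.txt and the second to number_data.txt, one entry per line, would keep the two files aligned line by line for a later merge.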

task-2.py := this script merges the sentences and numerical data into a new file called new_a.txt, which has the same content as a.pdf but is better formatted.
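A minimal sketch of such a merge follows. It is not the code from task-2.py: it assumes that entry i of the sentences corresponds to entry i of the numerical data, and that "better formatted" means one sentence per line with its numbers in an aligned column. The function name `merge_lines` and the `width` parameter are hypothetical.

```python
def merge_lines(sentences, numbers, width=40):
    """Pair each sentence with its numbers in a fixed-width layout.

    `width` is the column at which the numbers start (an assumed
    layout choice, not something the repository specifies).
    """
    if len(sentences) != len(numbers):
        raise ValueError("sentences and numbers must have the same length")
    merged = []
    for sentence, nums in zip(sentences, numbers):
        # Left-justify the sentence, then append its numbers.
        merged.append(f"{sentence:<{width}}{nums}".rstrip())
    return "\n".join(merged) + "\n"
```

The returned string can be written to new_a.txt as is. Raising an error on a length mismatch is a deliberate choice here, since silently truncating with `zip` would hide a misaligned extraction.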

The output directory contains all the output from task-1.py and task-2.py.
