- Protobuffer:
- gRPC:
- OCR:
- Containers:
- Single Page Application:
- Tesseract OCR Library
- Ghostscript CGo library to convert PDF to PNG.
- Google Protocol Buffers
- Go gRPC Server
- Node HTTP Server
- React Single Page App
- Docker Container
- Build pipeline uses Linux Makefiles: just a list of targets, each with the shell commands to run.
Takes a document from any source (Web, PDF) and renders it into an image.
To rasterize PDFs, Ghostscript is required.
Clone the Ghostscript source into the parent folder:
git clone git://git.ghostscript.com/ghostpdl.git ghostpdl
Run `make` to build the source and then `make so` to build the libraries.
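As a quick way to check a Ghostscript build before wiring up the CGo bindings, the rasterization step can be sketched by shelling out to the `gs` command-line tool. This is only an illustration, not this project's CGo code: it assumes a `gs` binary is on the `PATH`, and the helper names (`gsArgs`, `rasterize`), the 300 DPI resolution, and the `page-%03d.png` output pattern are all hypothetical choices.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// gsArgs builds Ghostscript arguments that render every page of a PDF to a
// 24-bit PNG. outPattern is a printf-style pattern such as "page-%03d.png",
// which Ghostscript expands with the page number.
// (Hypothetical helper, not part of this repository.)
func gsArgs(pdfPath, outPattern string, dpi int) []string {
	return []string{
		"-dSAFER",         // restrict file access from within the document
		"-dBATCH",         // exit after the last file instead of prompting
		"-dNOPAUSE",       // do not pause between pages
		"-sDEVICE=png16m", // 24-bit RGB PNG output device
		fmt.Sprintf("-r%d", dpi),
		"-sOutputFile=" + outPattern,
		pdfPath,
	}
}

// rasterize runs the gs binary, which is assumed to be on PATH.
func rasterize(pdfPath, outPattern string, dpi int) error {
	cmd := exec.Command("gs", gsArgs(pdfPath, outPattern, dpi)...)
	cmd.Stderr = os.Stderr
	return cmd.Run()
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: rasterize <file.pdf>")
		os.Exit(2)
	}
	if err := rasterize(os.Args[1], "page-%03d.png", 300); err != nil {
		fmt.Fprintln(os.Stderr, "ghostscript failed:", err)
		os.Exit(1)
	}
}
```

Keeping argument construction separate from process execution makes the command easy to inspect without Ghostscript installed.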
Converts images to text.
### Tesseract OCR Engine
Clone the Tesseract source:
git clone https://github.com/tesseract-ocr/tesseract.git tesseract
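Once Tesseract is built, the image-to-text step can be tried from Go by calling the `tesseract` command-line tool. The sketch below is an assumption about usage, not this project's integration: it expects a `tesseract` binary on the `PATH` with the requested language data installed, and `tesseractArgs` and `recognize` are hypothetical helper names.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
)

// tesseractArgs builds the arguments for the tesseract CLI. Passing "stdout"
// as the output base makes tesseract print the recognized text instead of
// writing a .txt file. (Hypothetical helper, not part of this repository.)
func tesseractArgs(imagePath, lang string) []string {
	return []string{imagePath, "stdout", "-l", lang}
}

// recognize runs tesseract (assumed to be on PATH) and returns its text output.
func recognize(imagePath, lang string) (string, error) {
	out, err := exec.Command("tesseract", tesseractArgs(imagePath, lang)...).Output()
	if err != nil {
		return "", fmt.Errorf("tesseract: %w", err)
	}
	return string(out), nil
}

func main() {
	if len(os.Args) != 2 {
		fmt.Fprintln(os.Stderr, "usage: ocr <image.png>")
		os.Exit(2)
	}
	text, err := recognize(os.Args[1], "eng")
	if err != nil {
		fmt.Fprintln(os.Stderr, err)
		os.Exit(1)
	}
	fmt.Print(text)
}
```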
Partitions the recognized text into bounded regions. The result is a structured document.
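One way to picture such a structured result is a list of text blocks, each carrying its bounding box on the page. The types below are purely hypothetical (the source does not define `Box`, `Block`, or `Document`), and the top-to-bottom, left-to-right ordering is a simplifying assumption that ignores multi-column layouts.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// Box is a bounding rectangle in pixel coordinates (hypothetical type).
type Box struct{ X, Y, W, H int }

// Block is a region of recognized text together with its bounds.
type Block struct {
	Bounds Box
	Text   string
}

// Document is the structured result: an unordered set of blocks.
type Document struct {
	Blocks []Block
}

// ReadingOrder returns the block texts sorted top to bottom, then left to
// right, joined by newlines. The receiver's Blocks slice is left unmodified.
func (d Document) ReadingOrder() string {
	blocks := append([]Block(nil), d.Blocks...) // copy before sorting
	sort.SliceStable(blocks, func(i, j int) bool {
		if blocks[i].Bounds.Y != blocks[j].Bounds.Y {
			return blocks[i].Bounds.Y < blocks[j].Bounds.Y
		}
		return blocks[i].Bounds.X < blocks[j].Bounds.X
	})
	parts := make([]string, len(blocks))
	for i, b := range blocks {
		parts[i] = b.Text
	}
	return strings.Join(parts, "\n")
}

func main() {
	doc := Document{Blocks: []Block{
		{Box{0, 100, 200, 20}, "Body paragraph"},
		{Box{0, 0, 200, 30}, "Title"},
	}}
	fmt.Println(doc.ReadingOrder())
}
```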
Client/server model for sending documents to be OCR-ed.