Why did we create this project?
- In a Laravel project, we needed to extract text from large files. Existing packages do not work with files larger than 50 megabytes.
- Text extraction is an expensive operation. Running it on a separate server reduces the load on the main application.
- We needed to generate a cover image for the source file.
Install Docker and Docker Compose, then clone the repository and start the service:
git clone https://github.com/dotcode-moscow/pdf-api.git
cd pdf-api
docker-compose up -d pdf-api
Extracts text from a file. The URL of the file is passed as the url parameter.
Ping-pong method.
Image-to-PDF converter.
curl -d "url=https://trove.nla.gov.au/newspaper/rendition/nla.news-page29291123.pdf" "http://localhost:8080/api/extractText"
http://localhost:8080/api/extractText?url=https://trove.nla.gov.au/newspaper/rendition/nla.news-page29291123.pdf
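The same POST request can be sent from a script. The following is a minimal sketch using only the Python standard library. It assumes the service is reachable at localhost:8080 as in the examples above. The names API_BASE, build_extract_request, and extract_text are hypothetical helpers and not part of this project, and the 300-second timeout is an arbitrary choice for large files.

```python
import json
import urllib.parse
import urllib.request

# Assumption: the service started with docker-compose listens on localhost:8080.
API_BASE = "http://localhost:8080/api"


def build_extract_request(file_url, api_base=API_BASE):
    """Build a POST request equivalent to the curl example (form-encoded url)."""
    body = urllib.parse.urlencode({"url": file_url}).encode("ascii")
    return urllib.request.Request(
        f"{api_base}/extractText", data=body, method="POST"
    )


def extract_text(file_url, api_base=API_BASE, timeout=300):
    """Send the request and decode the JSON body of the response."""
    request = build_extract_request(file_url, api_base)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling extract_text with the URL of a PDF should return the parsed JSON object, provided the service is running.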
"Page number" (without sorting) and "extracted text".
"img" - jpeg base64 front page cover
{
  "1": "National Library of Australia...",
  "img": "data:image/jpeg;base64..."
}
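Because the page keys arrive unsorted, a client will usually want to order them before use. The sketch below shows one way to do that in Python and to decode the cover image. It assumes every key other than "img" is an integer page number and that "img" is a standard data URI with a comma before the base64 payload. The function name split_response is hypothetical.

```python
import base64


def split_response(payload):
    """Return (pages, cover_bytes) from a decoded extractText response."""
    # Assumption: every key except "img" is a page number stored as a string.
    # Sort numerically so that page 10 comes after page 2.
    pages = sorted(
        ((int(key), text) for key, text in payload.items() if key != "img"),
        key=lambda item: item[0],
    )
    cover = None
    data_uri = payload.get("img")
    if data_uri and "," in data_uri:
        # Assumed form: "data:image/jpeg;base64,<encoded bytes>".
        _, encoded = data_uri.split(",", 1)
        cover = base64.b64decode(encoded)
    return pages, cover
```

The returned pages list holds (page number, text) tuples in ascending order, and cover is either the raw JPEG bytes or None when no image is present.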
network_mode: "host"
Pull requests are welcome.