
Retrieving the PDF from a web link and parsing through the file to read and store the text within the PDF file.


now-youre-gittin-it/web-scraped-pdf-reader


web-scraped-pdf-reader

This project retrieves a PDF from a web link using the urllib package, then parses the file to read and store the text it contains. The text is read with the fitz library (the import name of PyMuPDF). No OCR is needed for this PDF because it already contains machine-readable text; the code will not work on files that require OCR to obtain readable text. Note: the PDF file from the link is solely the property of its original owners; this is a learning exercise.
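
The workflow described above can be sketched roughly as follows. This is a minimal illustration, not the repository's actual code: the function names (download_pdf, extract_text, read_pdf_from_url), the 30-second timeout, and the "%PDF-" header check are assumptions added for the example. It assumes PyMuPDF is installed (pip install pymupdf) and a version recent enough to provide page.get_text().

```python
from urllib.request import urlopen


def download_pdf(url: str, timeout: float = 30.0) -> bytes:
    """Fetch the raw bytes behind `url` and check that they look like a PDF.

    Hypothetical helper: the name, timeout, and header check are
    illustrative assumptions, not taken from the repository.
    """
    with urlopen(url, timeout=timeout) as response:
        data = response.read()
    # PDF files begin with the marker "%PDF-"; reject anything else early
    # (for example an HTML error page returned instead of the file).
    if not data.startswith(b"%PDF-"):
        raise ValueError("downloaded content does not look like a PDF")
    return data


def extract_text(pdf_bytes: bytes) -> list[str]:
    """Return the embedded text of each page, one string per page.

    No OCR is performed, so scanned (image-only) pages yield empty strings.
    """
    import fitz  # PyMuPDF; imported here so download_pdf works without it

    doc = fitz.open(stream=pdf_bytes, filetype="pdf")
    try:
        return [page.get_text() for page in doc]
    finally:
        doc.close()


def read_pdf_from_url(url: str) -> str:
    """Download a PDF and join the text of all its pages."""
    return "\n".join(extract_text(download_pdf(url)))
```

Calling read_pdf_from_url with the link to a text-based PDF returns its contents as a single string, which can then be stored or processed further. If every page comes back empty, the file most likely contains only scanned images and would need OCR, which this sketch does not attempt.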
