axflow/pdf-ts

pdf-ts

pdf-ts is a TypeScript library for extracting text from PDF files. It wraps Mozilla's PDF.js behind a simple API.

npm i pdf-ts

Examples

Extract text from a PDF.

import fs from 'node:fs/promises';
import {pdfToText} from 'pdf-ts';

// Read the PDF into a Buffer, then extract its text.
const pdf = await fs.readFile('./path/to/file.pdf');
const text = await pdfToText(pdf);
console.log(text);

Extract a list of pages from a PDF.

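import fs from 'node:fs/promises';
import {pdfToPages} from 'pdf-ts';

// Read the PDF into a Buffer, then extract the text page by page.
const pdf = await fs.readFile('./path/to/file.pdf');
const pages = await pdfToPages(pdf);
console.log(pages); // [{page: 1, text: '...'}, {page: 2, text: '...'}, ...]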
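
Because each entry carries its page number alongside its text, the per-page output lends itself to simple lookups. The following is a minimal sketch, not part of pdf-ts, that finds which pages mention a term. It assumes the entry shape shown in the comment above (`{page, text}`); the `PageText` type, the `findPagesContaining` helper, and the sample data are hypothetical names introduced only for illustration.

```typescript
// Shape of each entry as shown in the sample output above.
// PageText is a hypothetical name, not a type exported by pdf-ts.
type PageText = { page: number; text: string };

// Return the numbers of the pages whose text contains the term
// (case-insensitive substring match).
function findPagesContaining(pages: PageText[], term: string): number[] {
  const needle = term.toLowerCase();
  return pages
    .filter(({ text }) => text.toLowerCase().includes(needle))
    .map(({ page }) => page);
}

// Stand-in data; in real use this would come from `await pdfToPages(pdf)`.
const samplePages: PageText[] = [
  { page: 1, text: 'Introduction to invoices' },
  { page: 2, text: 'Payment terms' },
  { page: 3, text: 'Invoice totals' },
];

console.log(findPagesContaining(samplePages, 'invoice')); // [ 1, 3 ]
```

The helper works on plain objects only, so it can be tried without a PDF on disk. To use it on a real document, pass it the array returned by pdfToPages.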

License

MIT