Atomic Web Service (AWS, REST API) for converting DOC/DOCX files to plain text (text/plain), powered by catdoc, docx2txt and Node.js
Part of the "Personal Research Information System" atomic web services ecosystem.
The program runs on modern macOS and Linux distributions. To run it you will need:
For Ubuntu Server Linux distribution
$ git clone https://github.com/malakhovks/doc-docx-extract-api.git
Run the program in development mode (default port: 3001; log-mode: development). The Winston logging level is set to debug, and debug/info/warning logs are sent to the Console transport:
$ npm run start-development
You can set the port in ./config/development.json:
{
"port": 3001,
"log-mode": "development"
}
Run the program in production mode (default port: 3001; log-mode: production). The Winston logging level is set to error, and error logs are sent to the Console transport:
$ npm run start-production
You can set the port in ./config/production.json:
{
"port": 3001,
"log-mode": "production"
}
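The relationship between the two config files and the logging level can be sketched as follows. This is only an illustration, not the project's actual startup code: the helper names resolveSettings and loadSettings are hypothetical, and the mapping simply restates the behavior described above (development gives debug, production gives error).

```javascript
// Hypothetical sketch: read ./config/<mode>.json and derive the logging level.
// Function names and validation rules are assumptions, not the project's code.
const fs = require('fs');
const path = require('path');

// log-mode value -> Winston level name, as described in this README.
const LEVEL_BY_MODE = new Map([
  ['development', 'debug'],
  ['production', 'error'],
]);

// Validate a parsed config object and return the port plus the log level.
function resolveSettings(config) {
  const mode = config['log-mode'];
  if (!LEVEL_BY_MODE.has(mode)) {
    throw new Error(`Unknown log-mode: ${mode}`);
  }
  const port = Number(config.port);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`Invalid port: ${config.port}`);
  }
  return { port, level: LEVEL_BY_MODE.get(mode) };
}

// Load the config file for the given mode from the config directory.
function loadSettings(mode, configDir = './config') {
  const file = path.join(configDir, `${mode}.json`);
  return resolveSettings(JSON.parse(fs.readFileSync(file, 'utf8')));
}
```

For example, loadSettings('production') would read ./config/production.json and, with the default file shown above, yield port 3001 and level error.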
$ curl -X POST -F "doc=@document.doc" http://127.0.0.1:3001/api/doctotext
$ curl -X POST -F "docx=@document.docx" http://127.0.0.1:3001/api/docxtotext
HTTP/1.1 200 OK
Content-Type: text/plain
body: raw text
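The same requests can be made from Node.js. The following is a minimal client sketch, assuming Node.js 18 or newer (for the built-in fetch, FormData and Blob) and a service listening on the default configured port 3001. The names routeFor and extractText are hypothetical; only the form field names (doc, docx) and the two routes come from the curl examples above.

```javascript
// Hypothetical client sketch for the two endpoints shown above.
// Assumes Node.js 18+ (global fetch, FormData, Blob).
const fs = require('fs');
const path = require('path');

// File extension -> multipart field name and API route (from the curl examples).
const ROUTES = new Map([
  ['.doc', { field: 'doc', route: '/api/doctotext' }],
  ['.docx', { field: 'docx', route: '/api/docxtotext' }],
]);

// Pick the field name and route for a file based on its extension.
function routeFor(filePath) {
  const ext = path.extname(filePath).toLowerCase();
  const match = ROUTES.get(ext);
  if (!match) {
    throw new Error(`Unsupported file type: ${ext || '(none)'}`);
  }
  return match;
}

// Upload the file as multipart/form-data and return the extracted plain text.
// baseUrl is an assumption; adjust it to the port set in your config file.
async function extractText(filePath, baseUrl = 'http://127.0.0.1:3001') {
  const { field, route } = routeFor(filePath);
  const form = new FormData();
  form.append(field, new Blob([fs.readFileSync(filePath)]), path.basename(filePath));
  const res = await fetch(baseUrl + route, { method: 'POST', body: form });
  if (!res.ok) {
    throw new Error(`Request failed: HTTP ${res.status}`);
  }
  return res.text();
}
```

A call such as extractText('document.docx').then(console.log) would print the raw text returned by the service, provided the service is running.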