Add support for Microsoft Office documents #41

Open
snydman opened this issue Oct 5, 2017 · 6 comments

Comments

@snydman
snydman commented Oct 5, 2017

Similar to Box.com.

@tomcrane
tomcrane commented Nov 8, 2017

Is there a list of required formats? I imagine Word and Excel are mandatory, but what about the others?

https://en.wikipedia.org/wiki/Microsoft_Office#Desktop_apps

@anarchivist
Speaking for the Virtual Tribunals project, we'd need support for Microsoft Word.

@tomcrane
@snydman @anarchivist

The major attraction of Box as the mechanism for rendering various document formats in the UV is that it makes client development much simpler: the Box viewer (https://github.com/box/viewer.js) could be incorporated into a UV extension and would handle a huge number of formats. Without Box, each format would need its own independent solution, and some formats would be impossible.

However, the Box viewer client library doesn't deal with documents in their native formats; it renders documents that have already been transformed via the Box API into an intermediate form. You instantiate a viewer and point it at the Box version of the document (https://github.com/box/viewer.js#loading-a-simple-viewer). That requires an integration between the repository and Box, so that the Word document, spreadsheet, etc. is uploaded to Box and converted. That is, the simple Box implementation in the UV requires your assets to be on Box first.

I've just been taking a look at how Confluence (the wiki we use) renders uploaded office docs in the browser. It converts them to PDF on the server and then renders the PDF (in the same way the UV currently renders a PDF).

Other lines of attack:

Office Web Viewer in a browser (in an iframe, maybe?): https://blogs.office.com/en-us/2013/04/10/office-web-viewer-view-office-documents-in-a-browser/?eu=true

Google docs viewer: https://jsfiddle.net/7xr419yb/embedded/result/ (from https://stackoverflow.com/questions/27957766/how-do-i-render-a-word-document-doc-docx-in-the-browser-using-javascript)

Viewer.js (a different one) - works for PDFs and Open Document Format, but not MS Office docx etc: http://viewerjs.org/examples/
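
For the two hosted viewers in the list above, embedding mostly comes down to building a URL that wraps the document's own URL and using it as an iframe `src`. A minimal sketch, assuming the commonly used embed endpoints for each service; `hostedViewerUrl` and the `HostedViewer` type are hypothetical names, not part of the UV or either service. Both services fetch the document themselves, so it has to be publicly reachable, which matters for anything behind auth.

```typescript
// Hypothetical helper: builds an iframe src for a hosted document viewer.
// The endpoint URLs are the commonly used embed endpoints (an assumption
// worth checking against each service's current documentation).
type HostedViewer = "office" | "google";

function hostedViewerUrl(viewer: HostedViewer, documentUrl: string): string {
  // The document URL travels as a query parameter, so it must be encoded.
  const encoded = encodeURIComponent(documentUrl);
  if (viewer === "office") {
    return `https://view.officeapps.live.com/op/embed.aspx?src=${encoded}`;
  }
  return `https://docs.google.com/viewer?url=${encoded}&embedded=true`;
}

// Example document URL (made up for illustration).
const officeSrc = hostedViewerUrl("office", "https://example.org/files/report.docx");
const googleSrc = hostedViewerUrl("google", "https://example.org/files/report.docx");
```

A UV extension could set either string as the `src` of an iframe, but the rendering then depends entirely on a third-party service.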

@anarchivist
Hi @tomcrane - I think that's fine; my biggest concern is whether we'll need to transfer the files themselves to Box.com, rather than "just" hit an API with the files. Sending files to an API alone is not a major concern for my project, since all the assets are currently public; I defer to @snydman and others on whether this would be a concern for any resources that might need to be behind auth.

@snydman
Author

snydman commented Nov 15, 2017

Reading @tomcrane's post a second time, it seems that our docs would need to be on Box, which is a non-starter. I am leaning strongly towards the "convert Office docs to PDF" approach, with the native Office docs made available for download in the viewer. We would treat the PDF as a derivative format generated during pre-accessioning, like JP2 creation.
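
To make the pre-accessioning step concrete, here is a rough sketch of how a pipeline might decide which files get a PDF derivative and what command it would run. It assumes LibreOffice in headless mode as the converter, which nobody in this thread has proposed; the function names, the extension list, and the example paths are all hypothetical. The code only builds the command, it does not run it.

```typescript
// Assumption: these are the Office extensions we would convert.
const OFFICE_EXTENSIONS = new Set([".doc", ".docx", ".xls", ".xlsx", ".ppt", ".pptx"]);

interface ConversionCommand {
  command: string;
  args: string[];
}

// Returns the lower-cased extension of a path, including the dot, or "".
function extensionOf(path: string): string {
  const name = path.slice(path.lastIndexOf("/") + 1);
  const dot = name.lastIndexOf(".");
  return dot > 0 ? name.slice(dot).toLowerCase() : "";
}

// Builds a LibreOffice headless conversion command for Office files,
// or returns null when the file needs no PDF derivative.
function pdfConversionCommand(inputPath: string, outDir: string): ConversionCommand | null {
  if (!OFFICE_EXTENSIONS.has(extensionOf(inputPath))) {
    return null;
  }
  return {
    command: "soffice",
    args: ["--headless", "--convert-to", "pdf", "--outdir", outDir, inputPath],
  };
}

// Example paths (made up for illustration).
const wordJob = pdfConversionCommand("/data/Minutes.DOCX", "/data/derivatives");
const imageJob = pdfConversionCommand("/data/page.jp2", "/data/derivatives");
```

The native file stays untouched, so it can still be offered for download alongside the PDF derivative.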

@tomcrane
Getting the UV to render Office docs from Box would probably be quite easy but, as you say, a non-starter. I think that focusing effort on a really good user experience for PDFs, for which there is established web practice and JavaScript client libraries, would be a much better use of development time.

pinging @edsilv
