Not correctly recognizing all .docx files #103
Comments
In which program did you create the file? This check only recognizes files created in Microsoft Office. |
Looks like it works in
It fails on https://github.com/sindresorhus/file-type/blob/master/index.js#L126. @forivall do you have any idea why? |
The first file in the zip archive is to see for yourself, use zipinfo to see the difference between your docx and the fixture
I'm guessing that the UofT doc was created with an older version of OpenOffice or LibreOffice that doesn't replicate the MS Office zip file order, or that no version of OpenOffice or LibreOffice does, or that it came from some other generator. Using pandoc to generate a docx yields a properly identifiable file. I don't have OpenOffice or LibreOffice on my current computer to test whether that is the case. |
You are correct that I didn't create the file with a standard desktop version of Microsoft Word. It's been a while since the file was created. Unfortunately I don't remember the tool, but it was most likely some sort of OpenOffice. I have a number of other files that exhibit the same problem, though, so I don't think it's unique to this particular file. |
The main thing is that I'd like to learn how file actually detects it, since I'd like feature parity on that. Otherwise, we could add in alternate logic such that if the filename starts with However, we probably want to keep the limit of only needing 4100 bytes to detect. |
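To make the byte-budget constraint concrete, here is a minimal sketch of one way to collect the names of the leading zip entries without reading past the first 4100 bytes. The function name `leadingEntryNames` and the constant `MAX_BYTES` are hypothetical and not part of file-type; the offsets follow the standard zip local file header layout. It assumes the buffer starts at the first local file header and gives up when an entry defers its sizes to a data descriptor.

```javascript
// Sketch only: leadingEntryNames and MAX_BYTES are hypothetical names,
// not part of file-type's API.
const LFH_SIG = 0x04034b50; // local file header signature, "PK\x03\x04"
const MAX_BYTES = 4100; // byte budget mentioned in the thread

function leadingEntryNames(buf) {
  const limit = Math.min(buf.length, MAX_BYTES);
  const names = [];
  let offset = 0;
  // A local file header has 30 fixed bytes before the file name.
  while (offset + 30 <= limit && buf.readUInt32LE(offset) === LFH_SIG) {
    const flags = buf.readUInt16LE(offset + 6);
    const compressedSize = buf.readUInt32LE(offset + 18);
    const nameLen = buf.readUInt16LE(offset + 26);
    const extraLen = buf.readUInt16LE(offset + 28);
    const nameEnd = offset + 30 + nameLen;
    if (nameEnd > limit) break; // name would cross the byte budget
    names.push(buf.toString('utf8', offset + 30, nameEnd));
    // Bit 3 set: sizes live in a trailing data descriptor, so the
    // next header cannot be located from this header alone.
    if (flags & 0x08) break;
    offset = nameEnd + extraLen + compressedSize;
  }
  return names;
}
```

A caller could then check whether any returned name is `[Content_Types].xml` or starts with `word/`, instead of requiring a particular first entry. The trade-off is that entries located beyond the budget, or after a data-descriptor entry, are never seen.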
@forivall the magic doesn't do what the comments suggest.
The regex actually tests the entire file starting from offset
This file is matched because the string @sindresorhus is it necessary to replicate the file Magic logic, or would it make more sense to jump to the central directory for zip files? |
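For comparison, the central-directory idea raised above could look roughly like the following. This is a sketch under stated assumptions, not file-type's implementation: the function names are hypothetical, the whole file is assumed to be in memory, zip64 and multi-disk archives are ignored, and the docx check assumes the common layout in which the main part is stored as `word/document.xml`.

```javascript
// Sketch only: these helpers are hypothetical, not part of file-type.
const EOCD_SIG = 0x06054b50; // end of central directory, "PK\x05\x06"
const CDH_SIG = 0x02014b50; // central directory header, "PK\x01\x02"

function findEndOfCentralDirectory(buf) {
  // The record is 22 bytes plus an optional comment of up to 65535 bytes,
  // so search backwards from the end of the buffer.
  const min = Math.max(0, buf.length - 22 - 0xffff);
  for (let i = buf.length - 22; i >= min; i--) {
    if (buf.readUInt32LE(i) === EOCD_SIG) return i;
  }
  return -1;
}

function listCentralDirectoryNames(buf) {
  const eocd = findEndOfCentralDirectory(buf);
  if (eocd < 0) return [];
  const count = buf.readUInt16LE(eocd + 10); // total number of entries
  let offset = buf.readUInt32LE(eocd + 16); // start of central directory
  const names = [];
  for (let n = 0; n < count; n++) {
    if (offset + 46 > buf.length || buf.readUInt32LE(offset) !== CDH_SIG) break;
    const nameLen = buf.readUInt16LE(offset + 28);
    const extraLen = buf.readUInt16LE(offset + 30);
    const commentLen = buf.readUInt16LE(offset + 32);
    names.push(buf.toString('utf8', offset + 46, offset + 46 + nameLen));
    offset += 46 + nameLen + extraLen + commentLen;
  }
  return names;
}

function looksLikeDocx(buf) {
  // Entry order no longer matters; only the presence of the names does.
  const names = listCentralDirectoryNames(buf);
  return names.includes('[Content_Types].xml') && names.includes('word/document.xml');
}
```

This removes the dependence on entry order, at the cost of needing the end of the file rather than only its first few kilobytes.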
I see this issue with a docx exported from Google Docs. Version 8.1.0. |
@BradleyDHobbs, please open a new issue, and refer to this one. In that issue, can you zip and attach a small sample file? |
When I extract the file type for the sample .docx file in the unit test, it works fine for me. However, when I try extracting the file type for a test file I created on my Mac, the library thinks it's a zip file.
UofTCSCoop.docx