This repository has been archived by the owner on Jun 2, 2024. It is now read-only.

Zip file format not compatible with MS OFFICE xlsx #23

Closed
outersky opened this issue Jan 9, 2017 · 17 comments

Comments

@outersky

outersky commented Jan 9, 2017

If I unzip an .xlsx file and re-zip its contents into another .xlsx with zip-rs, MS Office shows an error dialog when it opens the result.

@mvdnes
Collaborator

mvdnes commented Jan 10, 2017

Could you please provide some details?

  • What is the error message?
  • What is the code you used to pack the zip?

@outersky
Author

Simple Excel File:
example.xlsx

Unzip and re-zip with zip-rs:

zip_rs.xlsx

Open with MS Excel:
[screenshot, 2017-01-30 21:54:30]

Code:

extern crate zip;

use std::fs::{self, DirEntry, File};
use std::io::{self, Read, Write};
use std::path::Path;
use zip::ZipWriter;

fn main() {
    let f = File::create("/tmp/zip_rs.xlsx").unwrap();
    let mut zip = ZipWriter::new(f);

    visit(&mut zip, "/tmp/excel_dir", &add_zip_file).unwrap();
    zip.finish().unwrap();
}

// Adds one file to the archive under its path relative to `prefix`.
fn add_zip_file(zw: &mut ZipWriter<File>, prefix: &str, de: &DirEntry) {
    let options = zip::write::FileOptions::default(); // .compression_method(zip::CompressionMethod::Stored);
    let path = de.path();
    let path2 = path.strip_prefix(prefix).unwrap().to_str().unwrap();
    let mut buffer = Vec::new();
    let mut f = File::open(&path).unwrap();
    println!("add file: {}", &path2);
    f.read_to_end(&mut buffer).unwrap();
    zw.start_file(path2, options).unwrap();
    zw.write_all(&buffer).unwrap();
}

fn visit(zw: &mut ZipWriter<File>, dir: &str, cb: &Fn(&mut ZipWriter<File>, &str, &DirEntry)) -> io::Result<()> {
    visit_dir(zw, dir, dir, cb)
}

// Walks `dir` recursively, adding a directory entry for each subdirectory
// and calling `cb` for each file.
fn visit_dir(zw: &mut ZipWriter<File>, dir: &str, prefix: &str, cb: &Fn(&mut ZipWriter<File>, &str, &DirEntry)) -> io::Result<()> {
    let dir = Path::new(dir);
    if dir.is_dir() {
        for entry in fs::read_dir(dir)? {
            let entry = entry?;
            let path = entry.path();
            let path2 = path.strip_prefix(prefix).unwrap().to_str().unwrap();
            if path.is_dir() {
                let options = zip::write::FileOptions::default().compression_method(zip::CompressionMethod::Stored);
                println!("add dir: {}/", &path2);
                zw.add_directory(format!("{}/", path2), options).unwrap();
                visit_dir(zw, path.to_str().unwrap(), prefix, cb)?;
            } else {
                cb(zw, prefix, &entry);
            }
        }
    }
    Ok(())
}

@mvdnes
Collaborator

mvdnes commented Jan 30, 2017

LibreOffice does not have a problem with your file, so I think it is mostly Excel for Mac being picky.

However, I did find some differences.
The original file has:

% unzip -v example.xlsx 
Archive:  example.xlsx
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
    1084  Defl:S      343  68% 1980-01-01 00:00 7afc5813  [Content_Types].xml
     733  Defl:S      263  64% 1980-01-01 00:00 9e54cc7d  _rels/.rels
     557  Defl:S      225  60% 1980-01-01 00:00 dd643fd0  xl/_rels/workbook.xml.rels
    1302  Defl:S      630  52% 1980-01-01 00:00 6c7e03cc  xl/workbook.xml
   13888  Stored    13888   0% 1980-01-01 00:00 d5615c9d  docProps/thumbnail.jpeg
    6788  Defl:S     1663  76% 1980-01-01 00:00 f3141bad  xl/theme/theme1.xml
    1278  Defl:S      584  54% 1980-01-01 00:00 90fa4bac  xl/styles.xml
     832  Defl:S      455  45% 1980-01-01 00:00 fa0966b3  xl/worksheets/sheet1.xml
     639  Defl:S      342  47% 1980-01-01 00:00 c4ddd90f  docProps/core.xml
     795  Defl:S      400  50% 1980-01-01 00:00 c146567d  docProps/app.xml
--------          -------  ---                            -------
   27896            18793  33%                            10 files

It does not have any entries for the directories.
Maybe if you omit zw.add_directory(format!("{}/", path2), options);?
The CRCs are all OK, so the files themselves should be intact.
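
For anyone who wants to double-check the CRC-32 column of `unzip -v` against the extracted files, here is a minimal sketch of the checksum ZIP uses (CRC-32 with the reflected polynomial 0xEDB88320). The function name `crc32` is just an illustrative choice and not part of zip-rs; it uses the slow bit-by-bit form to keep the example short.

```rust
/// Bitwise CRC-32 as used by ZIP (reflected polynomial 0xEDB88320).
/// Hypothetical helper for illustration; real code would use a table-driven crate.
fn crc32(data: &[u8]) -> u32 {
    let mut crc: u32 = 0xFFFF_FFFF;
    for &byte in data {
        crc ^= u32::from(byte);
        for _ in 0..8 {
            // mask is all ones when the low bit is set, otherwise zero.
            let mask = (crc & 1).wrapping_neg();
            crc = (crc >> 1) ^ (0xEDB8_8320 & mask);
        }
    }
    !crc
}
```

Reading a file into a `Vec<u8>` and printing `format!("{:08x}", crc32(&bytes))` should give a value that can be compared with the CRC-32 column above.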

@outersky
Author

outersky commented Jan 30, 2017

It does not work.

zip_rs.xlsx

MS Excel says the file needs to be repaired first, on both macOS and Windows. After the repair everything is OK, but the error dialog is annoying.

Now I get this:

mac-2:tmp tony$ unzip -v zip_rs.xlsx 
Archive:  zip_rs.xlsx
 Length   Method    Size  Cmpr    Date    Time   CRC-32   Name
--------  ------  ------- ---- ---------- ----- --------  ----
    1084  Defl:N      329  70% 01-31-2017 07:42 7afc5813  [Content_Types].xml
     733  Defl:N      252  66% 01-31-2017 07:42 9e54cc7d  _rels/.rels
     795  Defl:N      387  51% 01-31-2017 07:42 c146567d  docProps/app.xml
     639  Defl:N      328  49% 01-31-2017 07:42 c4ddd90f  docProps/core.xml
   13888  Stored    13888   0% 01-31-2017 07:42 d5615c9d  docProps/thumbnail.jpeg
     557  Defl:N      215  61% 01-31-2017 07:42 dd643fd0  xl/_rels/workbook.xml.rels
    1278  Defl:N      561  56% 01-31-2017 07:42 90fa4bac  xl/styles.xml
    6788  Defl:N     1491  78% 01-31-2017 07:42 f3141bad  xl/theme/theme1.xml
    1302  Defl:N      602  54% 01-31-2017 07:42 6c7e03cc  xl/workbook.xml
     832  Defl:N      440  47% 01-31-2017 07:42 fa0966b3  xl/worksheets/sheet1.xml
--------          -------  ---                            -------
   27896            18493  34%                            10 files

@mvdnes
Collaborator

mvdnes commented Feb 1, 2017

What does it say when you do not use compression on thumbnail.jpeg?

@outersky
Author

outersky commented Feb 2, 2017

I already did that.

Now I'm trying to find some clues in jszip, which is used by js-xlsx, but it takes time as I'm not familiar with the zip file format.

extern crate zip;

use std::io;
use std::fs::{self, File, DirEntry};
use std::path::Path;

use zip::ZipWriter;

use std::io::Read;
use std::io::Write;

fn main() {
    let f = File::create("/tmp/zip_rs.xlsx").unwrap();
    let mut zip = zip::ZipWriter::new(f);

    visit(&mut zip, "/tmp/excel_dir", &add_zip_file).unwrap();
    zip.finish().unwrap();
}

fn add_zip_file(zw: &mut ZipWriter<File>, prefix: &str, de: &DirEntry) {
    let path = de.path();
    let path2 = path.strip_prefix(prefix).unwrap().to_str().unwrap();
    // Store the thumbnail uncompressed, as in the original file; deflate everything else.
    let options = if path2 == "docProps/thumbnail.jpeg" {
        zip::write::FileOptions::default().compression_method(zip::CompressionMethod::Stored)
    } else {
        zip::write::FileOptions::default()
    };
    let mut buffer = Vec::new();
    let mut f = File::open(&path).unwrap();
    println!("add file: {}", &path2);
    f.read_to_end(&mut buffer).unwrap();
    zw.start_file(path2, options).unwrap();
    zw.write_all(&buffer).unwrap();
}

fn visit(zw: &mut ZipWriter<File>, dir: &str, cb: &Fn(&mut ZipWriter<File>, &str, &DirEntry)) -> io::Result<()> {
    visit_dir(zw, dir, dir, cb)
}

fn visit_dir(zw: &mut ZipWriter<File>, dir: &str, prefix: &str, cb: &Fn(&mut ZipWriter<File>, &str, &DirEntry)) -> io::Result<()> {
    let dir = Path::new(dir);
    if dir.is_dir() {
        for entry in fs::read_dir(dir)? {
            let entry = entry?;
            let path = entry.path();
            let path2 = path.strip_prefix(prefix).unwrap().to_str().unwrap();
            if path.is_dir() {
                /*
                let options = zip::write::FileOptions::default().compression_method(zip::CompressionMethod::Stored);
                println!("add dir: {}/", &path2);
                zw.add_directory(format!("{}/", path2), options);
                */
                visit_dir(zw, path.to_str().unwrap(), prefix, cb)?;
            } else {
                cb(zw, prefix, &entry);
            }
        }
    }
    Ok(())
}

@reviewher

@mvdnes is there a way to write ZIP files without the subdirectories? If you compare with vim, you'll see that the bad files have directory entries whereas the OPC says those should not be included in the file:

[screenshot, 2017-03-09 14:21:20]

Related discussion from jszip: Stuk/jszip#130 (comment) (got the vim check idea from @SheetJSDev's comment later in the thread)
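
To make the "files only, no directory entries" idea concrete, here is a rough std-only sketch that walks an unpacked xlsx directory and returns just the names that would become archive entries. The helper names `collect_file_entries` and `walk` are made up for this example and are not part of zip-rs; joining path components with `/` is my assumption about what the archive names should look like on every platform.

```rust
use std::fs;
use std::io;
use std::path::Path;

/// Collects archive entry names for regular files under `root`, sorted.
/// Directories are traversed but never recorded as entries themselves.
fn collect_file_entries(root: &Path) -> io::Result<Vec<String>> {
    let mut names = Vec::new();
    walk(root, root, &mut names)?;
    names.sort();
    Ok(names)
}

fn walk(root: &Path, dir: &Path, names: &mut Vec<String>) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let path = entry?.path();
        if path.is_dir() {
            // Recurse without emitting a "dir/" entry.
            walk(root, &path, names)?;
        } else {
            let rel = path.strip_prefix(root).expect("path is under root");
            // Join components with '/' regardless of the platform separator.
            let name = rel
                .components()
                .map(|c| c.as_os_str().to_string_lossy().into_owned())
                .collect::<Vec<_>>()
                .join("/");
            names.push(name);
        }
    }
    Ok(())
}
```

Each returned name could then be passed to `start_file`, with `add_directory` never being called.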

@LegNeato

LegNeato commented Mar 9, 2017

Is that what you are seeing? When I write using this library I don't see it adding subdirs unless I specifically add them:

    let file = File::create(&output).unwrap();
    let mut zip = zip::ZipWriter::new(file);

    let walker = WalkDir::new(root).into_iter();
    for entry in walker.filter_entry(|e| !should_ignore(e)) {
        let entry = entry.unwrap();
        let relative_path = entry.path().strip_prefix(root).unwrap();
        let relative_string = relative_path.to_string_lossy();

        // Ignore the entry to the rootdir itself.
        if relative_string == "" {
            continue;
        }
        println!("Adding: {}", relative_string);
        let metadata = entry.metadata().unwrap();
        if metadata.is_file() {
            let mut buffer = Vec::new();
            let mut f = File::open(entry.path())?;
            // Read raw bytes so that files that are not UTF-8 encoded also work.
            f.read_to_end(&mut buffer)?;
            let options = FileOptions::default()
              .compression_method(zip::CompressionMethod::Deflated)
              .unix_permissions(metadata.permissions().mode());
            zip.start_file(relative_string, options)?;
            zip.write_all(&buffer)?;
        } else if metadata.is_dir() {
            // Do nothing for directories.
            // If I uncomment the next line, I will have directory entries. Otherwise only files.
            // try!(zip.add_directory(format!("{}/", relative_string), FileOptions::default()));
        } else {
            writeln!(
                &mut std::io::stderr(),
                "Error: cannot determine file type: {}", relative_path.display(),
            ).expect("failed printing error to stderr");
            process::exit(1);
        }
    }
    zip.finish()?;

@reviewher

I actually haven't tried the code. I was looking for a semi-related issue in a different repo and this came up in the search. The displayed difference compares the files prepared by @outersky.

@GeorgeAverkin

The error occurs because the created zip file contains a UNIX file system, but Excel expects FAT.

@mvdnes
Collaborator

mvdnes commented May 21, 2017

@V-0-1-D: zip does not use a filesystem; it stores its files in its own format. What does happen is that the zip file contains a number which indicates which version it was made by.
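
To inspect that number, here is a small std-only sketch that reads the "version made by" field of every central directory record. In the ZIP format this is a two-byte field: the low byte is the spec version and the high byte identifies the host system (0 = MS-DOS/FAT, 3 = UNIX). The names `CentralEntry` and `central_entries` are invented for this example and are not zip-rs API; the sketch assumes a plain archive (no ZIP64) and that the signature bytes do not occur inside the archive comment.

```rust
const EOCD_SIG: &[u8] = b"PK\x05\x06"; // end of central directory record
const CDH_SIG: &[u8] = b"PK\x01\x02"; // central directory file header

/// Name and "version made by" of one central directory record (illustrative type).
#[derive(Debug, PartialEq)]
struct CentralEntry {
    name: String,
    host_system: u8,  // high byte: 0 = MS-DOS/FAT, 3 = UNIX
    spec_version: u8, // low byte: e.g. 20 means spec version 2.0
}

fn u16_at(b: &[u8], i: usize) -> Option<usize> {
    Some(u16::from_le_bytes([*b.get(i)?, *b.get(i + 1)?]) as usize)
}

fn u32_at(b: &[u8], i: usize) -> Option<usize> {
    let s = b.get(i..i + 4)?;
    Some(u32::from_le_bytes([s[0], s[1], s[2], s[3]]) as usize)
}

/// Lists the central directory of an archive held in memory, or None if malformed.
fn central_entries(zip: &[u8]) -> Option<Vec<CentralEntry>> {
    // The EOCD record is at least 22 bytes; scan backwards for its signature.
    let last = zip.len().checked_sub(22)?;
    let eocd = (0..=last).rev().find(|&i| &zip[i..i + 4] == EOCD_SIG)?;
    let count = u16_at(zip, eocd + 10)?; // total number of entries
    let mut pos = u32_at(zip, eocd + 16)?; // offset of the central directory
    let mut entries = Vec::with_capacity(count);
    for _ in 0..count {
        if zip.get(pos..pos + 4)? != CDH_SIG {
            return None;
        }
        let made_by = u16_at(zip, pos + 4)?;
        let name_len = u16_at(zip, pos + 28)?;
        let extra_len = u16_at(zip, pos + 30)?;
        let comment_len = u16_at(zip, pos + 32)?;
        let name = zip.get(pos + 46..pos + 46 + name_len)?;
        entries.push(CentralEntry {
            name: String::from_utf8_lossy(name).into_owned(),
            host_system: (made_by >> 8) as u8,
            spec_version: (made_by & 0xFF) as u8,
        });
        // The fixed part of the header is 46 bytes, followed by three variable-length fields.
        pos += 46 + name_len + extra_len + comment_len;
    }
    Some(entries)
}
```

Running this over both example.xlsx and zip_rs.xlsx (read with `std::fs::read`) would show whether the two archives differ in this field, and entries whose name ends in `/` are directory entries.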

@tafia

tafia commented Jan 13, 2018

On the calamine side, everything worked fine up to v0.2.8. v0.2.9 broke something, but I'm not sure what exactly.
Hope it helps.

laumann pushed a commit to laumann/calamine that referenced this issue Jan 13, 2018
This is to work around zip 0.2.9 turning "deflate" into a feature and
calamine disabling all features by default.

See also zip-rs/zip-old#23
@laumann

laumann commented Jan 13, 2018

I'm not sure that this is related, but I was able to make calamine's tests run with zip 0.2.9 by turning on the new deflate feature. In calamine's Cargo.toml all features for zip are turned off by default and AFAICT the major change between 0.2.8 and 0.2.9 of zip is that deflate became a feature (and thus got disabled).

Compiling the example program without features:

$ cargo build --example extract --release --no-default-features

and running:

$ target/release/examples/extract issues.xlsx

(issues.xlsx)

gives

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: UnsupportedArchive("Compression method not supported")', /checkout/src/libcore/result.rs:906:4
note: Run with `RUST_BACKTRACE=1` for a backtrace.

Full backtrace for completeness:

thread 'main' panicked at 'called `Result::unwrap()` on an `Err` value: UnsupportedArchive("Compression method not supported")', /checkout/src/libcore/result.rs:906:4
stack backtrace:
   0: std::sys::imp::backtrace::tracing::imp::unwind_backtrace
             at /checkout/src/libstd/sys/unix/backtrace/tracing/gcc_s.rs:49
   1: std::sys_common::backtrace::_print
             at /checkout/src/libstd/sys_common/backtrace.rs:68
   2: std::panicking::default_hook::{{closure}}
             at /checkout/src/libstd/sys_common/backtrace.rs:57
             at /checkout/src/libstd/panicking.rs:381
   3: std::panicking::default_hook
             at /checkout/src/libstd/panicking.rs:397
   4: std::panicking::rust_panic_with_hook
             at /checkout/src/libstd/panicking.rs:577
   5: std::panicking::begin_panic
             at /checkout/src/libstd/panicking.rs:538
   6: std::panicking::begin_panic_fmt
             at /checkout/src/libstd/panicking.rs:522
   7: rust_begin_unwind
             at /checkout/src/libstd/panicking.rs:498
   8: core::panicking::panic_fmt
             at /checkout/src/libcore/panicking.rs:71
   9: core::result::unwrap_failed
             at /checkout/src/libcore/macros.rs:23
  10: <core::result::Result<T, E>>::unwrap
             at /checkout/src/libcore/result.rs:772
  11: extract::real_main
             at examples/extract.rs:22
  12: extract::main
             at examples/extract.rs:7
  13: __rust_maybe_catch_panic
             at /checkout/src/libpanic_unwind/lib.rs:101
  14: std::rt::lang_start
             at /checkout/src/libstd/panicking.rs:459
             at /checkout/src/libstd/panic.rs:365
             at /checkout/src/libstd/rt.rs:58
  15: main
  16: __libc_start_main
  17: _start

@mvdnes
Collaborator

mvdnes commented Jan 13, 2018

Ah I am sorry, this new change was indeed incompatible for people running --no-default-features. However, I do not think this is related to the issue discussed here.

tafia pushed a commit to tafia/calamine that referenced this issue Jan 15, 2018
This is to work around zip 0.2.9 turning "deflate" into a feature and
calamine disabling all features by default.

See also zip-rs/zip-old#23
@tafia

tafia commented Jan 15, 2018

> However, I do not think this is related to the issue discussed here.

Sorry about that.

And thanks @laumann !

@mvdnes
Collaborator

mvdnes commented May 23, 2018

I believe the cause of this issue was the same as #72.
On Android, the zip now works. @outersky, can you confirm it now works on OSX?

@mvdnes
Collaborator

mvdnes commented Jun 22, 2018

Closing because I think it is fixed. Feel free to re-open if this is not actually the case.
