Unable to extract 7z archives with long paths #116

Closed
asrabon opened this issue Apr 18, 2020 · 11 comments
Labels
bug, for extraction, windows

asrabon commented Apr 18, 2020

If you create a 7z archive containing a file whose path is longer than 255 characters, the program fails to unpack it and raises a "_lzma.LZMAError". If the archive also contains files whose paths are short enough (not nested in many subdirectories), those are extracted fine. The failure only occurs when extraction reaches a file whose path inside the archive is longer than 255 characters.

Owner

miurahr commented Apr 18, 2020

@asrabon, could you upload a simple sample archive that includes a path over 255 characters and triggers the issue?

Author

asrabon commented Apr 18, 2020

Here is the file. If you need any additional samples, just let me know and I will be glad to provide some. The upload is a zip of the 7z file, since GitHub doesn't allow uploading 7z files.

7zip with short path and long path.zip

Owner

miurahr commented Apr 18, 2020

Running 7z x longpath.7z with p7zip on Linux produces the following error:

$ env LANG=C 7z x ~/projects/py7zr/tests/data/longpath.7z 

7-Zip [64] 16.02 : Copyright (c) 1999-2016 Igor Pavlov : 2016-05-21
p7zip Version 16.02 (locale=C,Utf16=off,HugeFiles=on,64 bits,4 CPUs Intel(R) Core(TM) i3-6100U CPU @ 2.30GHz (406E3),ASM,AES-NI)

Scanning the drive for archives:
1 file, 1079 bytes (2 KiB)                        

Extracting archive: /home/miurahr/projects/py7zr/tests/data/longpath.7z
--
Path = /home/miurahr/projects/py7zr/tests/data/longpath.7z
Type = 7z
Physical Size = 1079
Headers Size = 413
Method = LZMA2:12
Solid = -
Blocks = 2

ERROR: Can not open output file : File name too long : ./Users\AnthonyRabon\Downloads\CJ_WS_Spectre-v040920R1_2020-04-09_23-40-44 (1)\CJ_WS_Spectre-v040920R1_2020-04-09_23-40-44\Suspicious Files\Program Files\WindowsApps\AD2F1837.HPPrinterControl_110.1.671.0_x64__v10z8vjag6ke6\HP.Framework.Extensions.ScanCapture\Assets\Arrow.png\Arrow.png\Arrow.png
                                                                        
Sub items Errors: 1

Archives with Errors: 1

Sub items Errors: 1

Does it really extract correctly on your platform?

With py7zr, it also fails with an OSError.

$ env LANG=C py7zr x ~/projects/py7zr/tests/data/longpath.7z 
Exception in thread Thread-1:
Traceback (most recent call last):
  File "/usr/lib/python3.8/threading.py", line 932, in _bootstrap_inner
    self.run()
  File "/usr/lib/python3.8/threading.py", line 870, in run
    self._target(*self._args, **self._kwargs)
  File "/home/miurahr/.virtualenvs/py7zr38/lib/python3.8/site-packages/py7zr/compression.py", line 223, in extract_single
    with fileish.open(mode='wb') as ofp:
  File "/usr/lib/python3.8/pathlib.py", line 1213, in open
    return io.open(self, mode, buffering, encoding, errors, newline,
  File "/usr/lib/python3.8/pathlib.py", line 1069, in _opener
    return self._accessor.open(self, flags, mode)
OSError: [Errno 36] File name too long: 'Users\\AnthonyRabon\\Downloads\\CJ_WS_Spectre-v040920R1_2020-04-09_23-40-44 (1)\\CJ_WS_Spectre-v040920R1_2020-04-09_23-40-44\\Suspicious Files\\Program Files\\WindowsApps\\AD2F1837.HPPrinterControl_110.1.671.0_x64__v10z8vjag6ke6\\HP.Framework.Extensions.ScanCapture\\Assets\\Arrow.png\\Arrow.png\\Arrow.png'
Traceback (most recent call last):
  File "/home/miurahr/.virtualenvs/py7zr38/bin/py7zr", line 10, in <module>
    sys.exit(main())
  File "/home/miurahr/.virtualenvs/py7zr38/lib/python3.8/site-packages/py7zr/__init__.py", line 43, in main
    return cli.run()
  File "/home/miurahr/.virtualenvs/py7zr38/lib/python3.8/site-packages/py7zr/cli.py", line 39, in run
    return args.func(args)
  File "/home/miurahr/.virtualenvs/py7zr38/lib/python3.8/site-packages/py7zr/cli.py", line 209, in run_extract
    a.extractall()
  File "/home/miurahr/.virtualenvs/py7zr38/lib/python3.8/site-packages/py7zr/py7zr.py", line 710, in extractall
    return self.extract(path)
  File "/home/miurahr/.virtualenvs/py7zr38/lib/python3.8/site-packages/py7zr/py7zr.py", line 805, in extract
    self._set_file_property(o, p)
  File "/home/miurahr/.virtualenvs/py7zr38/lib/python3.8/site-packages/py7zr/py7zr.py", line 475, in _set_file_property
    os.utime(str(outfilename), times=(creationtime, creationtime))
OSError: [Errno 36] File name too long

py7zr should at least catch the OSError and show a better error message.
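As a rough illustration of what "a better error message" could look like, here is a minimal sketch. It is not py7zr's actual code: `explain_os_error` and `write_member` are hypothetical helper names, and the sketch only assumes that the failure surfaces as an OSError with errno ENAMETOOLONG, as in the traceback above.

```python
import errno
import os


def explain_os_error(exc: OSError, target: str) -> str:
    """Turn a low-level OSError into a message aimed at the user (hypothetical helper)."""
    if exc.errno == errno.ENAMETOOLONG:
        # Report the longest path component, since that is what the
        # file system usually rejects.
        longest = max(len(os.fsencode(part)) for part in target.split(os.sep))
        return (
            f"Cannot extract '{target[:40]}...': a file name component is "
            f"{longest} bytes long, which the file system rejects."
        )
    return f"Cannot extract '{target}': {exc.strerror}"


def write_member(target: str, data: bytes) -> None:
    """Write one extracted member, re-raising OS errors in readable form."""
    try:
        with open(target, "wb") as ofp:
            ofp.write(data)
    except OSError as exc:
        raise RuntimeError(explain_os_error(exc, target)) from exc
```

Chaining with `from exc` keeps the original OSError available for debugging while the user sees the shorter message.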

Owner

miurahr commented Apr 18, 2020

I suspect that the test data uses a backslash as the path separator, and that p7zip and py7zr treat it as part of the file name rather than as a path separator. They then try to create a single file whose name contains backslashes and is longer than 255 characters. That can cause an OS error, because many file systems limit file name length to 255 bytes.

On Linux, the limit for the whole path is 4096 bytes.
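The difference between the two limits can be sketched as follows. The constants are the typical Linux values mentioned above (assumed here, not queried from the OS), and `entry` is a made-up archive entry name, not the one from the test data.

```python
NAME_MAX = 255   # typical limit for one file name component, in bytes (assumed)
PATH_MAX = 4096  # typical limit for a whole path on Linux, in bytes (assumed)


def longest_component(path: str, sep: str) -> int:
    """Return the byte length of the longest component when splitting on sep."""
    return max(len(part.encode("utf-8")) for part in path.split(sep))


# Hypothetical archive entry stored with Windows-style separators.
entry = "\\".join(["directory_" + "x" * 40] * 8)

# Treated as a POSIX path, the backslashes are ordinary characters, so the
# whole entry is one file name and exceeds NAME_MAX.
as_single_name = longest_component(entry, "/")

# Split on backslashes, every component is short.
as_windows_path = longest_component(entry, "\\")

print(as_single_name > NAME_MAX, as_windows_path <= NAME_MAX)
```

So an entry that is well under the 4096-byte path limit can still fail if the separators are not recognized and it is created as one file name.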

Author

asrabon commented Apr 18, 2020

I have been using 7-Zip on Windows, and it extracts the files without issues.

When I iterate through the list of files in the archive and attempt to extract each one individually, it eventually raises the LZMAError I mentioned above: this happens when it reaches a file with a long path and attempts to decompress the data.

Owner

miurahr commented Apr 18, 2020

I cannot reproduce the LZMA error.
Could you post the output of the failing run?

miurahr added the bug, for extraction, needs more info, and windows labels Apr 18, 2020
Owner

miurahr commented May 8, 2020

@asrabon I just found out that Windows has a MAX_PATH limit of 260 characters.
How did you handle it?

see https://bugs.python.org/issue27731
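For context, one documented Windows mechanism for getting past MAX_PATH is the extended-length prefix `\\?\` (or `\\?\UNC\` for network shares) on an absolute path. The sketch below only shows the string transformation; `to_extended_length` is a hypothetical helper, and this is not necessarily how py7zr handles it.

```python
import ntpath


def to_extended_length(path: str) -> str:
    """Return the extended-length form of an absolute Windows path (hypothetical helper)."""
    if path.startswith("\\\\?\\"):
        return path  # already prefixed
    if not ntpath.isabs(path):
        raise ValueError("extended-length paths must be absolute")
    # The prefix disables normalization by Windows, so normalize first.
    path = ntpath.normpath(path)
    if path.startswith("\\\\"):
        # UNC path: \\server\share\... becomes \\?\UNC\server\share\...
        return "\\\\?\\UNC\\" + path[2:]
    return "\\\\?\\" + path


print(to_extended_length("C:\\data\\file.txt"))
```

`ntpath` is used instead of `os.path` so the helper behaves the same when run on a non-Windows machine.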

Owner

miurahr commented May 9, 2020

Update: I've added a registry entry and updated the py7zr code, and that seems to solve it.

New-ItemProperty -Path Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem -Name 'LongPathsEnabled' -Type DWord -Value 1
LongPathsEnabled : 1
PSPath           : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control\FileSystem
PSParentPath     : Microsoft.PowerShell.Core\Registry::HKEY_LOCAL_MACHINE\SYSTEM\CurrentControlSet\Control
PSChildName      : FileSystem
PSProvider       : Microsoft.PowerShell.Core\Registry
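To check the same registry value from Python before running a test, something like the following could work. It is a sketch using the standard `winreg` module; `long_paths_enabled` is a hypothetical helper name, and it returns None on non-Windows platforms where the module does not exist.

```python
import sys
from typing import Optional


def long_paths_enabled() -> Optional[bool]:
    """Report whether LongPathsEnabled is set to 1; None when not on Windows."""
    if sys.platform != "win32":
        return None
    import winreg  # only available on Windows

    key_path = r"SYSTEM\CurrentControlSet\Control\FileSystem"
    try:
        with winreg.OpenKey(winreg.HKEY_LOCAL_MACHINE, key_path) as key:
            value, _value_type = winreg.QueryValueEx(key, "LongPathsEnabled")
    except FileNotFoundError:
        # The key or value is missing, so long paths are not enabled.
        return False
    return value == 1


print(long_paths_enabled())
```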

Then the test case

def test_extract_longpath_file(tmp_path):
    with py7zr.SevenZipFile(testdata_path.joinpath('longpath.7z').open('rb')) as archive:
        archive.extractall(path=tmp_path)

passed.
https://ci.appveyor.com/project/miurahr/py7zr/build/job/m6m4xnjvur2d4r78

miurahr added this to the v0.7 milestone May 9, 2020
Owner

miurahr commented May 9, 2020

v0.7.0b2 has been released with an enhancement for this.
@asrabon, could you test the release and report whether it improves your situation?

Please update your machine's registry before testing, so that Windows 10 accepts paths longer than 260 characters.

miurahr removed the needs more info label May 9, 2020
Owner

miurahr commented May 13, 2020

I have not received any negative feedback, so I am closing this now. Please reopen it if the release does not fix your case.

miurahr closed this as completed May 13, 2020