Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add long path support for Windows [Approach #2] #2056

Closed
wants to merge 22 commits into from

Conversation

majaeger
Copy link

@majaeger majaeger commented Nov 23, 2021

Currently, ninja only works for paths less than MAX_PATH (255) on Windows. This change makes ninja long-path aware by manifesting the application as "longPathAware", and using the unicode version of windows file APIs.

See the below docs for more info:
https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation
https://docs.microsoft.com/en-us/dotnet/standard/io/file-path-formats#skip-normalization
https://docs.microsoft.com/en-us/windows/win32/fileio/maximum-file-path-limitation#enable-long-paths-in-windows-10-version-1607-and-later

This is a second attempt based off of #1939 , which attempts to address some of the shortcomings / difficulties with the previous approach.

Difference:
The main difference between this and the previous PR, is that previously all windows APIs were moved to the UNICODE version only. In this one, both UNICODE and non-UNICODE are supported based on the BUILD_UNICODE option set in CMakeLists.txt. This has a couple of advantages:

  • For non-unicode version, there are no extra copies incurred to convert paths to wstrings
  • Compiling for the non-UNICODE version executes the same code paths (mostly) as for non-windows platforms. One of the big difficulties I hit previously was getting the correct behavior between windows and non-windows platforms
  • No core logic changes. Given how deep the concept of paths as std::string is ingrained in the ninja code-base, trying to convert everything to use wstring (and still support std::string for non-windows) is a massive undertaking (And I found that I missed a ton of instances in the previous PR, especially around C file functions). By converting to wstring on-demand, there are no core logic changes with this PR. Just conversions to/from std::string/std::wstring when calling platform APIs.

Based on these performance results, I've enabled the BUILD_UNICODE option by default. If this is not desirable, I can switch it to OFF instead.

Test Name UNICODE ANSI
build_log_perftest min 13ms max 14ms avg 13.2ms min 14ms max 14ms avg 14.0ms
canon_perftest min 189ms max 204ms avg 196.8ms min 187ms max 207ms avg 192.2ms
clparser_perftest Parse 16384 times in 3019ms avg 184.3us Parse 8192 times in 2367ms avg 288.9us
depfile_parser_perftest (Unknown what args to use)
hash_collision_bench 0 collisions after 20000000 runs 0 collisions after 20000000 runs
manifest_parser_perftest (Failed to run python script)

@jonesmz

This comment was marked as abuse.

@majaeger
Copy link
Author

Could you provide a comparison / contrast against #1671 ?

Sure thing @jonesmz . I would say that the fundamental concept between the two is similar, but they have different goals.

The linked PR aims to fully support Unicode within ninja. This means it cares about string contents as well as string type, which results in changes that ripple through a large portion of the ninja code base. It also means that many code paths are different between windows and linux builds. The fact that long paths are supported with this approach is just a side-effect of using the Unicode version of windows C / Win32 APIs. It also fully removes the ability to build for windows with no UNICODE support.

In contrast, this PR enables UNICODE for the sole purpose of enabling long path support. That means that std:;string is still the main representation of a path within ninja, and std::string are converted to std::wstring on-demand only when being passed into win32 / CRT functions which take a file path. This means that no core-logic is actually changed between the unicode/non-unicode builds, at the cost of some minor overhead when copying a std::string to a std::wstring. It also maintains the ability to build without UNICODE defined, which greatly simplifies debugging issues between the windows build and linux build. The downside is that, while UNICODE is defined, this change doesn't actually make any attempt to ensure unicode string contents are handled differently than ascii (Same behavior as today)

@Milerius
Copy link

Milerius commented Dec 6, 2021

when this will be merged and a new release will be available?

@@ -0,0 +1,26 @@
{
Copy link
Contributor

@cristianadam cristianadam Dec 22, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The CMakeSettings.json file is not needed for this PR.

src/file_path.h Outdated Show resolved Hide resolved
@digit-google
Copy link
Contributor

Both this PR and #1671 seem to move in the right direction, but I would suggest creating a dedicated FilePath class that holds a file path in the "native" format, irrespective of "UNICODE", with methods to wrap fopen()/unlink()/stat()/whatever, plus some helpers like FilePath::AppendUTF8(). Of course only use them when actually accessing the file system, Ninja can keep using UTF-8 strings for its internal representation of paths in the build graph, for example.

This will make the code easier to maintain, and avoids introducing yet another build configuration setting / variant for the Win32 build.

And to be clear, I'd happily introduce a PR to do that after any one of these two is merged.

@stephanosio
Copy link

Just adding a reference to the issue that this PR fixes.

Closes #1900.

@majaeger
Copy link
Author

@digit-google , I 100% agree. That's the path I originally started down, however it ended up being a much broader change than I felt comfortable introducing. Especially since it impacted the linux build as well, and I do not have as much experience there.

src/util.h Outdated Show resolved Hide resolved
@digit-google
Copy link
Contributor

It looks like the code change should work on Mingw, only the CMakeList changes needs an update. I wonder if this needs an update in configure.py too? The latter is not used to generate release binaries for Windows, but I assume some people would wnat a Unicode-aware Windows binary generated with it too.

@jhasse
Copy link
Collaborator

jhasse commented Feb 22, 2022

I would prefer it if Microsoft lifts the 260 character limit for *A functions instead.

@jonesmz

This comment was marked as abuse.

@jhasse
Copy link
Collaborator

jhasse commented Feb 22, 2022

[...] and never going to happen for existing versions.

Microsoft has added UTF-8 support to Windows 10 via Windows Update so I don't see why this is never going to happen.

Is ninja planning to drop support for e.g. windows 7, 8, 8.1 and 10?

There is no "ninja" entity which "plans" anything.

@jonesmz

This comment was marked as abuse.

@majaeger
Copy link
Author

I agree it is unlikely that the *A functions are updated to change behavior due to back-compat requirements. While it would be more convenient if we could just use the A functions in this code base, under the hood the A functions also widen the string and forward the calls to the W version of the APIs as well (After performing additional legacy logic related to DOS paths), so there's not actually a big difference.

From an outside contributor's perspective, I think the most helpful thing for me would have been if c++17 could be used easily throughout ninja, which would enable use of std::filesystem and std::string_view APIs . I think that would get the code closer to the kind of change @digit-google described above.

@jonesmz

This comment was marked as abuse.

@jonesmz

This comment was marked as abuse.

@jonesmz

This comment was marked as abuse.

@majaeger
Copy link
Author

@jonesmz , the performance numbers for this change are in the initial PR description. It's basically a win in some tests and a slight loss in others. This is because the win32 API *A() functions themselves also widen the strings passed in, but do additional processing regarding legacy DOS paths. By doing the widening in the ninja code and using the *W() functions, we skip that additional processing, and the path gets passed directly to the filesystem driver.

I haven't looked through all of the changes you linked, but in general I expect performance to be about the same unless ninja uses wide strings everywhere, thus eliminating the need for either the ninja code or the win32 A* functions to widen the strings.

I did actually see your fork before I started work on this. While I personally like the direction you've taken a lot of the code, I'm currently restricted in my corporate build environment to the main project. If my free time opens up though, I'll keep an eye out for issues in your repo that I might be able to assist with :)

@jhasse
Copy link
Collaborator

jhasse commented Feb 23, 2022

If a new feature only works on newer Windows versions, that's not "dropping support for older versions". See UTF-8 or ANSI color output for example.

If a PR would result in ninja.exe not running anymore (or a regression) on Windows 7, I wouldn't merge it.

@emmenlau
Copy link

@majaeger I was trying to pull your changes into Ninja 1.10.2, but its not so easy. I think it may help if you rebased your work on Ninja master, rather than pulling Ninja master into your branch. Or maybe my git-foo is not sufficient to cherry-pick your commits.

@old-pigeon
Copy link

I compiled the latest Ninja git master with this patch in hope to resolve this issue: microsoft/vcpkg#23714

Sadly, it didn't help. Only this fixed the problem: microsoft/vcpkg#23714 (comment)

So it seems this patch is either not complete (catching all cases?) or is buggy.

@Neumann-A
Copy link

So it seems this patch is either not complete (catching all cases?) or is buggy.

This is because there are other tools being a road block.. In the qt case there are moc/uic/rcc which don't have long paths enabled (although they easily could) but the ultimativ roadblock will be the MSVC tools which heavily use MAX_PATH and are thus limited. So unless you use a full llvm toolchain which includes stuff like rc/mt you won't get full long path support but only a few extra characters.

@Ju1He1
Copy link

Ju1He1 commented Jun 21, 2022

So it seems this patch is either not complete (catching all cases?) or is buggy.

This is because there are other tools being a road block.. In the qt case there are moc/uic/rcc which don't have long paths enabled (although they easily could) but the ultimativ roadblock will be the MSVC tools which heavily use MAX_PATH and are thus limited. So unless you use a full llvm toolchain which includes stuff like rc/mt you won't get full long path support but only a few extra characters.

so there is no chance ninja + msvc will receive a patch to resolve this nasty issue?

@tntljc
Copy link

tntljc commented Oct 17, 2022

So it seems this patch is either not complete (catching all cases?) or is buggy.

This is because there are other tools being a road block.. In the qt case there are moc/uic/rcc which don't have long paths enabled (although they easily could) but the ultimativ roadblock will be the MSVC tools which heavily use MAX_PATH and are thus limited. So unless you use a full llvm toolchain which includes stuff like rc/mt you won't get full long path support but only a few extra characters.

Ninja supporting the long path is still valuable at least for my project. The long path won't be involved in any msvc compilation but it blocked CMake + ninja.

@emmenlau
Copy link

Ninja supporting the long path is still valuable at least for my project. The long path won't be involved in any msvc compilation but it blocked CMake + ninja.

Well, if you can build ninja from source, this PR here worked nicely for us for about a year now. If you can not build from source let me know, maybe I can send a pre-built Windows executable.

@old-pigeon
Copy link

When it worked for over a year, can this "just" be merged?

@abique
Copy link

abique commented Oct 17, 2022

This branch has conflicts, so the PR should be updated.

@emmenlau
Copy link

When it worked for over a year, can this "just" be merged?

Well, as was discussed before, there is still no consensus inside the ninja team if this PR is the right way forward. Kindly pinging @jhasse for suggestions where to go from here...

@abique
Copy link

abique commented Oct 17, 2022

Could the long path feature be turned off using a command line switch?

@emmenlau
Copy link

Could the long path feature be turned off using a command line switch?

Why would you want to turn it off?

@abique
Copy link

abique commented Oct 17, 2022

Could the long path feature be turned off using a command line switch?

Why would you want to turn it off?

It seems that the ninja team was against having long path for some reasons, to not break legacy build systems etc...
So having a switch to turn it on/off solves the issue with builds that requires long path disabled.

@emmenlau
Copy link

It seems that the ninja team was against having long path for some reasons, to not break legacy build systems etc...
So having a switch to turn it on/off solves the issue with builds that requires long path disabled.

I do not think that this is the main concern. I think the main concern is that this PR adds a significant change to the codebase, and there are other ways how these changes could be implemented. Depending on the implementation, there are different pro's and con's considering the complexity of the change-set and the effort of future maintenance.

This is just my understanding. But I would be quite surprised if anybody would explicitly disable long path support when its merged. The downsides I can see are in the code complexity and maintenance.

@Osyotr
Copy link

Osyotr commented Oct 17, 2022

The amount of changes in this PR can be reduced by removing #if _UNICODE logic and always using W-versions of WinAPI.
There is no point in supporting A-versions.

@jhasse
Copy link
Collaborator

jhasse commented Oct 17, 2022

I think the main concern is that this PR adds a significant change to the codebase, and there are other ways how these changes could be implemented.

Well kind of. If you mean by "other changes" changes by Microsoft and not in Ninja's code: Support for long paths in combination with UTF-8.

@Selrak
Copy link

Selrak commented Oct 27, 2022

How to use this commit (especially, activate the long path capability) ? (For me with CMake 3.20). Is there an option or something ? Or should my project be configured in some special way ?
I built it locally but I still run into issues with some files not generated by Ninja because their absolute path is too long ...
Alternatively, I'll take a proper release where this is the default behavior !

@cristianadam
Copy link
Contributor

At #2225 I have a very small PR that just adds the "longPathAware" manifest, which allows usage of long paths if the proper registry key is set.

It has 0 code changes and for the users that need long path support they can have it.

@jhasse
Copy link
Collaborator

jhasse commented Feb 7, 2023

closing in favor of #2225

@jhasse jhasse closed this Feb 7, 2023
@majaeger
Copy link
Author

majaeger commented Feb 7, 2023

@jhasse , I am very skeptical that the change in #2225 resolved all of the long path issues that this PR handles. I also can't continue to keep up this branch updated forever if it is not going to be considered though. If I perform testing on #2225 for additional cases and find it is still not sufficient, will this PR actually be considered?

@jhasse
Copy link
Collaborator

jhasse commented Feb 7, 2023

IMHO all additional cases not solved by #2225 should be reported to Microsoft. This really shouldn't be app developers responsibility.

@saschasc
Copy link

saschasc commented Feb 7, 2023

I tested this PR and it solved the long path issues of ninja on Windows. First I tried #2225 and it did not solve anything. I don't understand your reasoning why Microsoft should solve this?

@jhasse
Copy link
Collaborator

jhasse commented Feb 7, 2023

ninja isn't the only Windows program. There are thousands of others. It's less work if Microsoft puts in the (probably huge? No idea, my bet would be that their code sucks BIG TIME) work to fix this in Windows so that all the thousands apps only need #2225 and not #2056, than every program doing #2056.

@cristianadam
Copy link
Contributor

In the qt case there are moc/uic/rcc which don't have long paths enabled (although they easily could) but the ultimativ roadblock will be the MSVC tools which heavily use MAX_PATH and are thus limited. So unless you use a full llvm toolchain which includes stuff like rc/mt you won't get full long path support but only a few extra characters.

I took care of moc/uic/rcc in Qt 6.5 with the longPathAware manifest at https://codereview.qt-project.org/c/qt/qtbase/+/447113, similar to #2225 of ninja.

Soon you can on Windows use: clang-cl, Qt, and ninja to build applications for Windows using the MSVC ABI.

@levicki
Copy link

levicki commented Feb 14, 2024

@jhasse

ninja isn't the only Windows program. There are thousands of others.

But it is the one used to build say xformers from source code, and now that is failing thanks to ninja still not supporting long paths after 4 years of deliberation on how to proceed on the path issue(s) reported in the repo.

I don't see what is stopping the team from addressing this for so long?

  • Using *W APIs should have happened long ago
  • Why not use std::filesystem::path instead of rolling your own filepath class as suggested elsewhere? It supports all relevant conversions.
  • Manifest will be ignored on older Windows which don't support it.
  • ninja can check the registry key on startup and print a warning for the user to enable it if they have issues with long paths.
  • on older Windows versions ninja can prepend \\?\ to the path and override MAX_PATH limitation like pretty much all apps dealing with paths on Windows were doing since like forever.

It's 2024 for God sake, issues with long paths (and/or spaces in paths) should be a thing of the past.

@jhasse
Copy link
Collaborator

jhasse commented Feb 15, 2024

It was reported that long path support works with current master, please give it a try and comment there if it doesn't work for you: #1900 (comment)

@emmenlau
Copy link

It was reported that long path support works with current master, please give it a try and comment there if it doesn't work for you

The solution in master does not work for me. I'm sorry, I thought this was known? A number of people keep on having problems with the solution in master, me for one.

I've enabled the registry setting for long paths in Windows 11, then built ninja from latest master, and it does work in many cases, where it failed without the manifest fix/registry hack before.

However, there are still some cases where the build fails. I could not really figure out why, but for example, Qt fails to build for me (fully reproducible). In contrast, I've switched to the PR #2056 from Matt Jaeger successfully since a rather long time, and subsequently never had any long path issues ever since.

@levicki
Copy link

levicki commented Feb 15, 2024

It was reported that long path support works with current master, please give it a try and comment there if it doesn't work for you: #1900 (comment)

I can do that but I don't know whether putting the ninja.exe built from master into a folder which is in the PATH is enough for pip to use that version when building a wheel provided that no ninja package is installed in current venv or machine wide?

See, that's why we need a release. Install the package and Bob's your uncle instead of having to figure out a way to override each and every build system which is now using ninja.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.