Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Vendorize libiconv on macOS #11

Closed

Conversation

savetheclocktower
Copy link

@savetheclocktower savetheclocktower commented Jul 2, 2024

After a long struggle with lots of wrong turns, I think this is how we fix things.

The root problem

On macOS 13 (or 14; I've lost track now), the system version of libiconv switched from the GNU implementation to the FreeBSD implementation. There are API incompatibilities between the two; this presents a problem for superstring, which previously was happy to rely on the ambient libiconv.dylib already present on the default library load path.

When I first noticed this problem back in January, I didn't investigate the root cause; it wasn't affecting our binaries yet because we were pinned to older macOS versions in CI. I just knew that I could fix it locally (i.e., when running from source via ATOM_DEV_RESOURCE_PATH) by installing Homebrew's version of libiconv and making sure its paths were present in my LDFLAGS and CPPFLAGS environment variables.

So when CirrusCI automatically upgraded our macOS images and our Apple Silicon builds started failing with the same iconv errors, I figured the solution was the same. We modified the CI job to run brew install libiconv, modify LDFLAGS and CPPFLAGS, and proceed as usual. That was enough to get Pulsar to compile, so we thought the job was done.

The second problem

Except we hadn't solved anything. We'd gotten Pulsar to compile by telling it about a dynamic library that was present on the CI machine; but all that did was lead Pulsar to assume that that library would be present on everyone's machine. So we ended up producing a version of Pulsar that would refuse to run unless you had libiconv present at /opt/local/opt/libiconv. (That isn't even the standard brew --prefix, so a local Homebrew installation of libiconv would still probably fail this test.)

The fix

A day's worth of flailing ensued, interrupted by occasional research. The most fruitful results were this blog post (which summarized the dilemma rather well and hinted at the eventual solution), followed by this blog post (exposing me to this strange install_name_tool utility and its ability to resolve relative paths).

Then, the breakthrough: searching GitHub for ways that other libraries use binding.gyp and install_name_tool. (That search query isn't working right now, so I'm glad it was working last night.)

Here's the short version:

  • When you build a dylib file, it is tagged with an ID that also happens to be a path on disk. This is the install name of the library. This is how macOS disambiguates various versions/copies of the same dynamic library.
  • If you try to “borrow” a copy of a dylib from a known location, you will confuse the binary that relies on it. It will complain that it can't find the library.
  • You can mess with DYLD_LIBRARY_PATH to get around this, but the better way to do it is to use install_name_tool to point that binary to a different install name (and thus a different path on disk).
  • You can also use install_name_tool to tag a dylib with a different ID.
  • Years ago, it used to be the case that an install name had to be an absolute path on disk; this made it hard for applications to bundle their own dynamic libraries unless they mandated that they be run from the /Applications folder. Luckily, Apple added a few magic tokens that made it possible to define library locations relative to the things that request them.

So our first task is to vendorize libiconv. We were already doing this for Windows; now we're doing it for macOS as well.

Since the vendored files we need are platform-specific, we can copy the files we need from the user's Homebrew installation or from a path that the user provides via an environment variable. We do this as a pre-compile task on macOS only.

In our case, the thing that needs libiconv.dylib is the built superstring.node module. So as a post-compile task, we make sure that the module we just built knows where to find our vendorized copy of libiconv.dylib. The @loader_path magic token is what tells it to search relative to itself.

This should get superstring working on macOS again — not just compiling but also running in a self-contained manner. I've been testing this by temporarily moving the Homebrew version of libiconv to a different place on my local machine just to confirm it doesn't confuse superstring. And the effects of install_name_tool seem to survive a trip through electron-rebuild, so this isn't even something that the pulsar repo needs to be aware of.

How we proceed

I'm fine with landing this however we want to. We can land it on master and tag a new release; we can land it on a branch; doesn't matter much. In the short term, the goal is to be able to modify Pulsar’s package.json to point superstring to a tag/SHA on our repo instead of a version number on NPM. If we do that — and modify the resolutions so that text-buffer uses that same copy for its superstring dependency — then everything should work, and we'll be able to build Apple Silicon binaries again on macOS.

(The same problem would've eventually struck us on the Intel Mac side, too. This way we get out ahead of it.)

Testing

For sanity's sake, I'd love your help testing this no matter which platform you're on. If you're on Windows or Linux, it'd be great to know for sure that I haven't broken anything (though CI should help there).

Any macOS users can try the steps below. If you're on an Intel Mac… well, so am I, but it'd still be good to have another data point. If you're on Apple Silicon, I'm especially interested in your experience.

  • Check out this PR.
  • In the project root, run npm run build:node.
  • If you don't have a Homebrew installation of libiconv, the script should halt the build and tell you why.
  • Once you do have a Homebrew installation of libiconv (brew install libiconv), the script should copy its files into its vendor/libiconv directory and build against them.
  • When it's done:
    • cd build/Release
    • otool -L superstring.node
    • The output should contain the following line:
      @loader_path/../../vendor/libiconv/lib/libiconv.2.dylib (compatibility version 9.0.0, current version 9.1.0)
      
    • cd ../../vendor/libiconv/lib
    • ls -l
    • Verify that libiconv.2.dylib is present.

…to match the version that `superstring` expects.

This is copied directly from what Homebrew installs into `$(brew --prefix)/opt/libiconv`, except that the install name has been changed on `libiconv.2.dylib` to `libiconv.2.dylib`. (When it's installed from Homebrew, its macOS library “install name” matches its original path, so that's where macOS looks for it.

As part of the build, we rename the library with `install_name_tool` so that it can be resolved relative to the location of `superstring.node`.

This can probably be simplified, but first let's see if it'll work.
@savetheclocktower savetheclocktower changed the title Vendor libiconv on macOS Vendorize libiconv on macOS Jul 2, 2024
@savetheclocktower
Copy link
Author

It'd certainly be nice to get CI working on macOS, but that's a pre-existing issue and is probably less urgent than our need to get binaries building again.

@savetheclocktower
Copy link
Author

savetheclocktower commented Jul 2, 2024

Just realized that I've probably vendorized some things that would actually need to be built on a per-platform basis. We could get around this by continuing to rely on the presence of Homebrew libiconv, but running some sort of pre-build action that finds said Homebrew installation, copies the files to where we want them to be, then runs install_name_tool to rename the library file.

I'll look at the Homebrew recipe and try to understand it a bit better. We could vendorize libiconv in unbuilt form and then build it ourselves for the target platform, but then we're basically doing Homebrew's job, except worse.

…because we need a version built for the current machine's architecture.

(If our current approach doesn't work, we could vendorize two different copies — one for each architecture.)
…by grabbing it from Homebrew or from a path specified by the user in an environment variable.
@savetheclocktower
Copy link
Author

savetheclocktower commented Jul 3, 2024

I did a test PR on the pulsar repo with the superstring dependency changed to point to the head of this PR. I was able to produce seemingly working builds for both Intel Mac and Apple Silicon.

I tested the Intel Mac binary and it worked fine; I say “seemingly” here because CI produces an unsigned build for PRs. I have no reason to doubt that code signing would work.

@meadowsys is our guinea pig for Apple Silicon builds, so I'll update when I hear back from them. I did notice this in the CirrusCI build logs…

warning: changes being made to the file will invalidate the code signature in: /opt/homebrew/opt/libiconv/lib/libiconv.2.dylib

…and I believe it would've happened when we used install_name_tool to change that .dylib’s install name. If this ends up causing problems, I think I could manage to work around this and make that step unnecessary.

But then the code signing of the app itself appeared to work properly, so I'm hoping that's all we need. The post-build Playwright smoke tests also ran smoothly.

(Edit: and if we do need to re-sign the .dylib, it might be as simple as this.)

@DeeDeeG
Copy link
Member

DeeDeeG commented Jul 3, 2024

I can vouch that, on my macOS machine (macOS Catalina 10.15.7, intel x86_64 processor), the steps under the Testing section of the PR text above worked as the instructions said they should.

As for getting CI working, I've got a couple of commits here to make a bit of progress on that: https://github.com/DeeDeeG/superstring/commits/472c6db1c05879cad9600a357fda1a2082bdf056 (Gets Python setuptools installed again successfully.)

Would also need to modify the test matrix so Node 14 isn't tested on macos-latest runners... (which are implicitly macos-14 now, which for free tier always runs on ARM64/"Apple Silicon" hosts in GitHub Actions...) since Node 14 never got compatibility with "Apple Silicon" (ARM64 macOS), if I understand correctly. We can run Node 14 tests on macos-12 or macos-13 runners for now.

…by adding an alternate strategy that we can adopt if we ever need to.

Also copy over `libiconv`’s README and license file.
@savetheclocktower
Copy link
Author

I addressed some feedback on Discord by making sure we copy libiconv’s license file and README to the vendor/libiconv directory.

@meadowsys reported on Discord that the Apple Silicon binary works fine, which… was a bit surprising to me? I was pretty sure it was going to complain about the code signature on libiconv.2.dylib — so sure that I had started to adopt a different approach to avoid changing that library's install name. I won't mess with that we've got if it's working, but I will check that approach into the repo just in case we need it in the future.

@savetheclocktower
Copy link
Author

Anyway, here's how we could proceed:

  • Optionally wait for macOS CI jobs to start working (@DeeDeeG, that decision is up to you)
  • Land this PR
  • Tag a new version
  • Optionally publish this version on NPM as @pulsar-edit/superstring or something (don't think we've ever done this before, but it'd be nice to have our forked code be available on NPM for a handful of reasons)
  • Open a PR pointing Pulsar's package.json to the new version of superstring, either on NPM or on GitHub
  • Do some sanity testing on the binary artifacts of that new PR
  • Land that PR

At that point, it'd be great to use this whole ordeal as an excuse to revisit #9 and #10 — neither of which is crucial in the short-term, but both of which would be nice to address. I think it's a good idea to start inching closer to the parts of Pulsar that make us nervous: superstring, other native modules formerly owned by the Atom team, and finally text-buffer.

@savetheclocktower
Copy link
Author

Closing in favor of #15.

@savetheclocktower savetheclocktower deleted the install-name-fix branch July 6, 2024 01:30
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants