Cannot install tfjs-node on production server #7265

Closed
travisvadnais opened this issue Jan 11, 2023 · 16 comments

@travisvadnais

travisvadnais commented Jan 11, 2023

Hello -

I'm working on a project using danfojs-node@1.1.2 and @tensorflow/tfjs-node@3.20.0. I'm able to run this locally, but it fails when trying to deploy to a dev environment using Cloud Foundry.

I was initially using 3.21.0, but the request for its pre-built binary resulted in a 404 error.

Relevant portion of the logs:

OUT > node scripts/install.js
OUT * Building TensorFlow Node.js bindings
OUT CPU-linux-3.20.0.tar.gz
OUT node-pre-gyp install failed with error: Error: Command failed: node-pre-gyp install --fallback-to-build
OUT internal/modules/cjs/loader.js:905
OUT   throw err;
OUT   ^
OUT Error: Cannot find module '../lib/main'
OUT Require stack:
OUT - /tmp/app/node_modules/.bin/node-pre-gyp
OUT     at Function.Module._resolveFilename (internal/modules/cjs/loader.js:902:15)
OUT     at Function.Module._load (internal/modules/cjs/loader.js:746:27)
OUT     at Module.require (internal/modules/cjs/loader.js:974:19)
OUT     at require (internal/modules/cjs/helpers.js:101:18)
OUT     at Object.<anonymous> (/tmp/app/node_modules/.bin/node-pre-gyp:4:1)
OUT     at Module._compile (internal/modules/cjs/loader.js:1085:14)
OUT     at Object.Module._extensions..js (internal/modules/cjs/loader.js:1114:10)
OUT     at Module.load (internal/modules/cjs/loader.js:950:32)
OUT     at Function.Module._load (internal/modules/cjs/loader.js:790:12)
OUT     at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:75:12) {
OUT   code: 'MODULE_NOT_FOUND',
OUT   requireStack: [ '/tmp/app/node_modules/.bin/node-pre-gyp' ]
OUT }
OUT npm ERR! code ELIFECYCLE
OUT npm ERR! errno 1
OUT npm ERR! @tensorflow/tfjs-node@3.20.0 install: `node scripts/install.js`
OUT npm ERR! Exit status 1
OUT npm ERR! 
OUT npm ERR! Failed at the @tensorflow/tfjs-node@3.20.0 install script.
OUT npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
OUT npm ERR! A complete log of this run can be found in:
OUT npm ERR!     /home/vcap/.npm/_logs/2023-01-11T21_21_42_189Z-debug.log
OUT        **ERROR** Unable to build dependencies: exit status 1
OUT        **WARNING** A module may be missing from 'dependencies' in package.json
ERR Failed to compile droplet: Failed to run all supply scripts: exit status 14
OUT Exit status 223
OUT Cell 9dfd1aab-715e-4748-8944-33ac34cc84ad stopping instance c243f510-0919-41c0-82e3-90b9e35ee860
OUT Cell 9dfd1aab-715e-4748-8944-33ac34cc84ad destroying container for instance c243f510-0919-41c0-82e3-90b9e35ee860

And these are the relevant dependencies in package.json:

"dependencies": {
    "@lmig/health": "^5.0.0",
    "@tensorflow/tfjs-node": "3.20.0",
    "async": "^3.2.4",
    "bl": "^5.0.0",
    "body-parser": "^1.20.0",
    "browserify-fs": "^1.0.0",
    "cors": "^2.8.5",
    "danfojs-node": "^1.1.2",
    "dotenv": "^11.0.0",
    "express": "^4.17.1",
    "express-routemagic": "^2.0.6",
    "fast-csv": "^4.3.6",
    "got": "^12.1.0",
    "https": "^1.0.0",
    "json-2-csv": "^3.14.4",
    "morgan": "^1.10.0",
    "node-gyp": "^9.3.1" 

We've tried various combinations of node & tfjs-node versions, but can't seem to get past the package installs.

I can provide additional logging if needed, or any other info you may need.

TIA

@mattsoulanille
Member

Hi @travisvadnais, and thanks for the report. You mentioned you're running this on Cloud Foundry. Is it running in a Docker container, and if so, can you share which container you're using? Thanks!

@mattsoulanille mattsoulanille self-assigned this Jan 11, 2023
@travisvadnais
Author

Hi - thanks for picking this up! We deploy through a Bamboo pipeline and none of the tasks spin up a Docker container, but I'm not sure if that's something that happens 'behind the scenes' or not.

We did some additional searching this morning and tried installing @mapbox/node-pre-gyp. Same error. We also tried removing node-gyp and node-pre-gyp from package.json and it all resulted in the same error.

It seems extremely unusual that it ran locally with all of the above changes (with multiple versions of node, no less!), but errored out w/ the same error in the deploy regardless of which of the above we tried.

@mattsoulanille
Member

mattsoulanille commented Jan 12, 2023

This might be a symlink and node resolution issue. node_modules/.bin/node-pre-gyp is a symlink to ../@mapbox/node-pre-gyp/bin/node-pre-gyp.

$ file node_modules/.bin/node-pre-gyp 
node_modules/.bin/node-pre-gyp: symbolic link to `../@mapbox/node-pre-gyp/bin/node-pre-gyp`

However, this file uses a relative require.

$ cat node_modules/.bin/node-pre-gyp 
#!/usr/bin/env node
'use strict';

require('../lib/main');

../lib/main does not exist relative to node_modules/.bin/node-pre-gyp, but node should follow the symlink to ../@mapbox/node-pre-gyp/bin/node-pre-gyp, where that relative import works. However, this is not the case if your environment uses --preserve-symlinks or NODE_PRESERVE_SYMLINKS=1.

To see if this is the issue, you can copy the node-pre-gyp package to another directory, change it locally to import from @mapbox/node-pre-gyp/lib/main, and override the version tfjs-node uses with a file: dependency to your modified version. Alternatively, you can just edit the file locally and run yarn node-pre-gyp to see if the command can run, although you might still need to do the package override to build tfjs-node.

@travisvadnais
Author

Thank you Matt - I'm going to bring this back to the team and try to tackle this approach tomorrow. I'll post back here if that was the issue, as I'm sure it'll help someone else at some point!

@travisvadnais
Author

Hi @mattsoulanille -

We ended up completely rebuilding our application into a more modern Angular / Node app (hence the long response time); however, this issue is still occurring. We've validated that --preserve-symlinks is not used in our environment, and tried rebuilding the symlink with no success.

We've attempted running preinstall scripts for both node-pre-gyp and @tensorflow/tfjs-node individually, and we've tried running npm rebuild @tensorflow/tfjs-node --build-from-source after the initial npm install build task in our bamboo pipeline, but still no success.

Anything we've tried, we've tried by both modifying the package.json and adding custom build tasks in Bamboo.

This works locally, and interestingly enough the Bamboo build process completes successfully - it's just the deployment where this occurs.

We've also tried virtually every version of tfjs-node. Our deployment uses Linux, so I'm not sure if this matters, but it installs Python 3.10 as part of the process.

We're dying here, as we are planning to use this for a whole lot of initiatives over the next couple years. Please help!

P.S. Interestingly enough, we just tried v4.4.0 this morning, and it now fails to find the ../ module instead of ../lib/main. New errors are always nice! Relevant logs below:

simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT > node scripts/install.js
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT CPU-linux-4.4.0.tar.gz
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT npm ERR! errno 1
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT * Building TensorFlow Node.js bindings
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT node-pre-gyp install failed with error: Error: Command failed: node-pre-gyp install --fallback-to-build
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT internal/modules/cjs/loader.js:934
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT   throw err;
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT   ^
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT Error: Cannot find module '../'
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT Require stack:
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT - /tmp/app/node_modules/.bin/node-pre-gyp
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at Function.Module._resolveFilename (internal/modules/cjs/loader.js:931:15)
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at Function.Module._load (internal/modules/cjs/loader.js:774:27)
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at Module.require (internal/modules/cjs/loader.js:1003:19)
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at require (internal/modules/cjs/helpers.js:107:18)
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at Object.<anonymous> (/tmp/app/node_modules/.bin/node-pre-gyp:15:20)
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at Module._compile (internal/modules/cjs/loader.js:1114:14)
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at Object.Module._extensions..js (internal/modules/cjs/loader.js:1143:10)
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at Module.load (internal/modules/cjs/loader.js:979:32)
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at Function.Module._load (internal/modules/cjs/loader.js:819:12)
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT     at Function.executeUserEntryPoint [as runMain] (internal/modules/run_main.js:75:12) {
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT   code: 'MODULE_NOT_FOUND',
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT   requireStack: [ '/tmp/app/node_modules/.bin/node-pre-gyp' ]
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT }
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT npm ERR! code ELIFECYCLE
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.39-0400 [STG/0]      OUT Exit status 223
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT npm ERR! @tensorflow/tfjs-node@4.4.0 install: `node scripts/install.js`
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT npm ERR! Exit status 1
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT npm ERR! 
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT npm ERR! Failed at the @tensorflow/tfjs-node@4.4.0 install script.
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT npm ERR! A complete log of this run can be found in:
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT npm ERR!     /home/vcap/.npm/_logs/2023-04-20T14_18_38_868Z-debug.log
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT        **ERROR** Unable to build dependencies: exit status 1
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT        **WARNING** A module may be missing from 'dependencies' in package.json
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT        This module may be specified in 'devDependencies' instead of 'dependencies'
simple	20-Apr-2023 10:18:38	2023-04-20T10:18.38-0400 [STG/0]      OUT        See: https://devcenter.heroku.com/articles/nodejs-support#devdependencies

@travisvadnais
Author

Hi @mattsoulanille -

Some added context - our deployment pipeline uses the Amazon Linux 2 OS distribution. Maybe this requires additional dependencies to be installed?

@mattsoulanille
Member

The only thing I can think of right now is that there's still some issue with symlinks on Bamboo. Can you try changing this line from

cp.exec(`node-pre-gyp install ${buildOption}`, (err) => {

to

cp.exec(`node_modules/@mapbox/node-pre-gyp/bin/node-pre-gyp install ${buildOption}`, (err) => {

If that doesn't work, then you can also try

cp.exec(`node node_modules/@mapbox/node-pre-gyp/lib/main install ${buildOption}`, (err) => {

This should remove all the symlinks that the command was following.

@travisvadnais
Author

Hi @mattsoulanille -

Thanks for the follow up, but still no luck - although it's a somewhat different error, so maybe that's progress??

Here's the log after the first command you provided:

      OUT * Building TensorFlow Node.js bindings
      OUT node-pre-gyp install failed with error: Error: Command failed: node_modules/@mapbox/node-pre-gyp/bin/node-pre-gyp install --fallback-to-build
      OUT /bin/sh: 1: node_modules/@mapbox/node-pre-gyp/bin/node-pre-gyp: not found
      OUT npm ERR! code ELIFECYCLE
      OUT npm ERR! errno 1
      OUT npm ERR! @tensorflow/tfjs-node@4.4.0 install: `node scripts/install.js`
      OUT npm ERR! Exit status 1
      OUT npm ERR! 
      OUT npm ERR! Failed at the @tensorflow/tfjs-node@4.4.0 install script.
      OUT npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
      OUT npm ERR! A complete log of this run can be found in:
      OUT npm ERR!     /home/vcap/.npm/_logs/2023-04-25T20_01_26_849Z-debug.log
      OUT        **ERROR** Unable to build dependencies: exit status 1
      ERR Failed to compile droplet: Failed to run all supply scripts: exit status 14
      OUT Exit status 223 

And this is after the 2nd command:

      OUT }
      OUT npm ERR! code ELIFECYCLE
      OUT npm ERR! errno 1
      OUT npm ERR! @tensorflow/tfjs-node@4.4.0 install: `node scripts/install.js`
      OUT npm ERR! Exit status 1
      OUT npm ERR!
      OUT npm ERR! Failed at the @tensorflow/tfjs-node@4.4.0 install script.
      OUT npm ERR! This is probably not a problem with npm. There is likely additional logging output above.
      OUT Exit status 223
      OUT npm ERR! A complete log of this run can be found in:
      OUT npm ERR!     /home/vcap/.npm/_logs/2023-04-25T20_47_54_747Z-debug.log
      OUT        **ERROR** Unable to build dependencies: exit status 1
      OUT        **WARNING** A module may be missing from 'dependencies' in package.json
      OUT        This module may be specified in 'devDependencies' instead of 'dependencies'
      OUT        See: https://devcenter.heroku.com/articles/nodejs-support#devdependencies
      ERR Failed to compile droplet: Failed to run all supply scripts: exit status 14

For what it's worth, those are the deploy logs. The build passes, however I noticed this oddity in the log while working on the above:

error	I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
error.       To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

Does any of this give you any other ideas?

Thanks again!

@mattsoulanille
Member

OUT /bin/sh: 1: node_modules/@mapbox/node-pre-gyp/bin/node-pre-gyp: not found

This is interesting. It looks like the @mapbox/node-pre-gyp module is missing. Can you check that node_modules/@mapbox/node-pre-gyp exists when you run on Bamboo?

error	I tensorflow/core/platform/cpu_feature_guard.cc:193] This TensorFlow binary is optimized with oneAPI Deep Neural Network Library (oneDNN) to use the following CPU instructions in performance-critical operations:  AVX2 AVX512F FMA
error.       To enable them in other operations, rebuild TensorFlow with the appropriate compiler flags.

This is actually a good sign. It indicates that tfjs-node was able to load its tensorflow native binary.

@travisvadnais
Author

Thanks @mattsoulanille -

The good news is, when I add @mapbox/node-pre-gyp@1.0.9 to my package.json, I'm able to get past that error - so we're getting closer . . . but now I'm hitting this error block in my deploy. I'm not sure what's actually happening here, but I can't imagine whatever it's trying to install here is truly so massive that it would crash a container.

This is 11 minutes into the deploy too, so this has to be the last hurdle before a successful deploy . . .

OUT Uploading droplet, build artifacts   cache...
--
OUT Uploading build artifacts cache...
OUT Uploading droplet...
OUT Uploaded build artifacts cache (215B)
OUT Creating droplet for app with guid [REDACTED]
OUT Uploaded droplet (453.7M)
OUT Uploading complete
OUT Cell [REDACTED] stopping instance [REDACTED]
OUT Cell [REDACTED] destroying container for instance [REDACTED]
OUT Cell [REDACTED] successfully destroyed container for instance   [REDACTED]
OUT Cell [REDACTED] creating container for instance [REDACTED]
OUT Cell [REDACTED] successfully created container for instance   [REDACTED]
OUT Downloading droplet...
ERR Copying droplet into the container failed: stream-in: nstar: error   streaming in: exit status 2. Output: tar:   ./app/node_modules/@tensorflow/tfjs-node/deps/lib/libtensorflow.so: Wrote   only 3072 of 10240 bytes
ERR tar:   ./app/node_modules/@tensorflow/tfjs-node/deps/lib/libtensorflow.so.2: Cannot   open: Disk quota exceeded
ERR tar:   ./app/node_modules/@tensorflow/tfjs-node/deps/THIRD_PARTY_TF_C_LICENSES:   Cannot open: Disk quota exceeded
ERR tar: ./app/node_modules/@tensorflow/tfjs-node/deps/LICENSE: Cannot   open: Disk quota exceeded
ERR tar: ./app/node_modules/@tensorflow/tfjs-node/deps/include: Cannot   mkdir: Disk quota exceeded
ERR tar: ./app/node_modules/@tensorflow/tfjs-node/deps/include: Cannot   mkdir: Disk quota exceeded
ERR tar:   ./app/node_modules/@tensorflow/tfjs-node/deps/include/tensorflow: Cannot   mkdir: No such file or directory
ERR tar: ./app/node_modules/@tensorflow/tfjs-node/deps/include: Cannot   mkdir: Disk quota exceeded
OUT Cell [REDACTED] stopping instance [REDACTED]
OUT Cell [REDACTED] destroying container for instance [REDACTED]
OUT Process has crashed with type: "web"
OUT Cell [REDACTED] creating container for instance [REDACTED]
OUT Cell [REDACTED] successfully created container for instance   [REDACTED]
OUT Downloading droplet...
OUT Cell [REDACTED] successfully destroyed container for instance   [REDACTED]
ERR Copying droplet into the container failed: stream-in: nstar: error   streaming in: exit status 2. Output: tar:   ./app/node_modules/@tensorflow/tfjs-node/deps/lib/libtensorflow.so: Wrote   only 3072 of 10240 bytes
ERR tar:   ./app/node_modules/@tensorflow/tfjs-node/deps/lib/libtensorflow.so.2: Cannot   open: Disk quota exceeded
ERR tar:   ./app/node_modules/@tensorflow/tfjs-node/deps/THIRD_PARTY_TF_C_LICENSES:   Cannot open: Disk quota exceeded
ERR tar: ./app/node_modules/@tensorflow/tfjs-node/deps/LICENSE: Cannot   open: Disk quota exceeded
ERR tar: ./app/node_modules/@tensorflow/tfjs-node/deps/include: Cannot   mkdir: Disk quota exceeded
ERR tar: ./app/node_modules/@tensorflow/tfjs-node/deps/include: Cannot   mkdir: Disk quota exceeded
ERR tar:   ./app/node_modules/@tensorflow/tfjs-node/deps/include/tensorflow: Cannot   mkdir: No such file or directory
ERR tar: ./app/node_modules/@tensorflow/tfjs-node/deps/include: Cannot   mkdir: Disk quota exceeded
OUT Cell [REDACTED] stopping instance [REDACTED]
OUT Cell [REDACTED] destroying container for instance [REDACTED]
OUT Process has crashed with type: "web"

@pyu10055
Collaborator

@travisvadnais This error looks like you are running out of disk space; please check your storage quota.

@travisvadnais
Author

Thank you both @mattsoulanille & @pyu10055 . I was FINALLY able to get up and running in production!

For posterity, the solution to getting @tensorflow/tfjs-node@4.4.0 running on my Linux system was to npm install @mapbox/node-pre-gyp@1.0.9.

I think we're good to close this issue out, but any ideas why @mapbox/node-pre-gyp would not be installed as a dependency when running npm install?

@gaikwadrahul8
Contributor

Hi, @travisvadnais

Good to hear that your issue got resolved. If I'm not mistaken, these two issue threads, #162 and #61, may help explain why @mapbox/node-pre-gyp is not installed as a dependency when running npm install against your package.json.

Could you please confirm whether this issue is resolved for you? Please feel free to close the issue if it is. Thank you!

@travisvadnais
Author

Resolved. Thank you all!

@gaikwadrahul8
Contributor

Hi, @travisvadnais

You're welcome, and good to hear that your issue has been resolved. Please feel free to close this issue now. Thank you!

@travisvadnais
Author

Resolved
