This repository has been archived by the owner on Nov 22, 2022. It is now read-only.
Conversation
facebook-github-bot added the CLA Signed label on Sep 6, 2019
snisarg force-pushed the export-D17114398 branch from 76792e3 to f376133 on September 21, 2019 at 02:10
snisarg added a commit to snisarg/pytext that referenced this pull request on Sep 21, 2019
snisarg added a commit to snisarg/pytext that referenced this pull request on Sep 24, 2019
snisarg force-pushed the export-D17114398 branch from f376133 to ac54233 on September 24, 2019 at 02:06
snisarg added a commit to snisarg/pytext that referenced this pull request on Sep 24, 2019
snisarg force-pushed the export-D17114398 branch from ac54233 to 189fdf4 on September 24, 2019 at 02:37
snisarg added a commit to snisarg/pytext that referenced this pull request on Sep 24, 2019
snisarg force-pushed the export-D17114398 branch 2 times, most recently from 8a15419 to a5e9775 on September 26, 2019 at 20:36
snisarg added a commit to snisarg/pytext that referenced this pull request on Sep 26, 2019
snisarg added a commit to snisarg/pytext that referenced this pull request on Sep 26, 2019
snisarg force-pushed the export-D17114398 branch from a5e9775 to 324802c on September 26, 2019 at 20:37
Summary:
Pull Request resolved: facebookresearch#955

We have users who can't train models with extremely large embeddings because we try to allocate space for them on the GPU. With this diff, we add a training flag that users can set explicitly to keep the embedding layer on the CPU even when the model is being trained on GPUs. This is not the default because the user needs to know that there is a cost associated with moving the tensors on and off the GPU.

Note that this only applies during training. Also note that this does not work in a multi-GPU environment because of the way the weights are synced via NCCL.

Reviewed By: chenyangyu1988

Differential Revision: D17114398

fbshipit-source-id: 1d4c41940af0d69415b8e606899afcecc843b064
snisarg force-pushed the export-D17114398 branch from 324802c to 309248c on October 1, 2019 at 20:51
This pull request has been merged in 84adc39.
Summary:
We have users who can't train models with extremely large embeddings because we try to allocate space for them on the GPU.
With this diff, we add a training flag that users can set explicitly to keep the embedding layer on the CPU even when the model is being trained on GPUs. This is not the default because the user needs to know that there is a cost associated with moving the tensors on and off the GPU.
Note that this only applies during training.
Differential Revision: D17114398
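
For context, here is a minimal sketch of the idea in plain PyTorch. The module structure and the `keep_embedding_on_cpu` flag name are hypothetical illustrations, not the actual PyText API: the embedding table stays on the CPU, and only the looked-up vectors are copied to the GPU in the forward pass.

```python
import torch
import torch.nn as nn


class CPUEmbeddingModel(nn.Module):
    """Toy model whose (potentially huge) embedding table can stay on the CPU."""

    def __init__(self, vocab_size, embed_dim, num_labels, keep_embedding_on_cpu=True):
        super().__init__()
        self.keep_embedding_on_cpu = keep_embedding_on_cpu  # hypothetical flag
        self.embedding = nn.Embedding(vocab_size, embed_dim)
        self.classifier = nn.Linear(embed_dim, num_labels)

    def to(self, device):
        # Simplified override: move everything except the embedding table
        # when the flag is set, so GPU memory is never allocated for it.
        self.classifier = self.classifier.to(device)
        if not self.keep_embedding_on_cpu:
            self.embedding = self.embedding.to(device)
        return self

    def forward(self, token_ids):
        # The lookup runs on the CPU; only the (batch, seq_len, embed_dim)
        # result is copied to the GPU, not the whole embedding matrix.
        embedded = self.embedding(token_ids.cpu())
        device = next(self.classifier.parameters()).device
        return self.classifier(embedded.to(device))


if __name__ == "__main__" and torch.cuda.is_available():
    model = CPUEmbeddingModel(vocab_size=5_000_000, embed_dim=300, num_labels=4).to("cuda")
    logits = model(torch.randint(0, 5_000_000, (8, 16)))  # shape (8, 16, 4)
```

The trade-off, as the summary notes, is an extra host-to-device copy of the looked-up vectors on every forward pass (and a device-to-host copy of their gradients on the backward pass), which is why such a flag is opt-in rather than the default.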