RNN failure hackaround #944

Merged: maleadt merged 1 commit into master from rnn-fix on Nov 27, 2019

MikeInnes
Member

See #923.

bors try

bors bot added a commit that referenced this pull request Nov 26, 2019
@bors
Contributor

bors bot commented Nov 26, 2019

try

Build succeeded

@MikeInnes
Member Author

Although bors went through, the CI pipeline has failed; I'm guessing that means this didn't fix the issue for real.

bors try

bors bot added a commit that referenced this pull request Nov 26, 2019
@bors
Contributor

bors bot commented Nov 26, 2019

try

Build failed

@DhairyaLGandhi
Member

Still seems related to #923.

@MikeInnes
Member Author

Now using Tim's hackaround in JuliaGPU/CuArrays.jl#517.

bors try

bors bot added a commit that referenced this pull request Nov 26, 2019
@bors
Contributor

bors bot commented Nov 26, 2019

try

Build failed

@MikeInnes
Member Author

Test runner seems to be having trouble. Trying again.

bors r+

bors bot added a commit that referenced this pull request Nov 26, 2019
944: RNN failure hackaround r=MikeInnes a=MikeInnes

See #923.

bors try

Co-authored-by: Mike Innes <mike.j.innes@gmail.com>
@maleadt
Collaborator

maleadt commented Nov 26, 2019

One runner's GPU died:

$ nvidia-smi
Unable to determine the device handle for GPU 0000:41:00.0: Unknown Error

FWIW you can just log in on GitLab and restart the failing jobs.
I also wouldn't merge this just yet; as soon as the PR is merged in CuArrays the Manifest would break here.
bors r-

@bors
Contributor

bors bot commented Nov 26, 2019

Build failed

@maleadt
Collaborator

maleadt commented Nov 26, 2019

Looks good! I'll restart the job to try and flush out errors.

@MikeInnes
Member Author

So is CI up again?

bors try

bors bot added a commit that referenced this pull request Nov 27, 2019
@bors
Contributor

bors bot commented Nov 27, 2019

try

Build succeeded

@MikeInnes
Member Author

bors r+

bors bot added a commit that referenced this pull request Nov 27, 2019
944: RNN failure hackaround r=MikeInnes a=MikeInnes

See #923.

bors try

Co-authored-by: Mike Innes <mike.j.innes@gmail.com>
@maleadt
Collaborator

maleadt commented Nov 27, 2019

bors r+

Just FYI, that'll embed a reference to the tb/workspace branch in the Flux Manifest. That branch will disappear as soon as JuliaGPU/CuArrays.jl#517 is merged, so it might be better to wait until the change is part of master and use that hash.

@MikeInnes
Member Author

Yeah, if the branch is going to be deleted immediately then that's best to avoid.

bors r-

@bors
Contributor

bors bot commented Nov 27, 2019

Canceled

@maleadt
Collaborator

maleadt commented Nov 27, 2019

As soon as CI is green on CuArrays I'll merge so that we can move on here.

@maleadt maleadt merged commit ab45047 into master Nov 27, 2019
@CarloLucibello CarloLucibello deleted the rnn-fix branch April 7, 2022 07:04