-
Notifications
You must be signed in to change notification settings - Fork 17.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime: fix crash during VDSO calls on PowerPC #46767
Conversation
This patch reinstates a fix for PowerPC with regard to making VDSO calls while receiving a signal, and subsequently crashing. The crash happens because certain VDSO calls can modify the r30 register, which is where g is stored. This change was reverted for PowerPC because r30 is supposed to be a non-volatile register. This is true, but that only makes a guarantee across function calls, but not "within" a function call. This patch was seemingly fine before because the Linux kernel still had hand rolled assembly VDSO function calls, however with a recent change to C function calls it seems the compiler used can generate instructions which temporarily clobber r30. This means that when we receive a signal during one of these calls the value of r30 will not be the g as the runtime expects, causing a segfault. You can see from this assembly dump how the register is clobbered during the call: (the following is from a 5.13rc2 kernel) ``` Dump of assembler code for function __cvdso_clock_gettime_data: 0x00007ffff7ff0700 <+0>: cmplwi r4,15 0x00007ffff7ff0704 <+4>: bgt 0x7ffff7ff07f0 <__cvdso_clock_gettime_data+240> 0x00007ffff7ff0708 <+8>: li r9,1 0x00007ffff7ff070c <+12>: slw r9,r9,r4 0x00007ffff7ff0710 <+16>: andi. r10,r9,2179 0x00007ffff7ff0714 <+20>: beq 0x7ffff7ff0810 <__cvdso_clock_gettime_data+272> 0x00007ffff7ff0718 <+24>: rldicr r10,r4,4,59 0x00007ffff7ff071c <+28>: lis r9,32767 0x00007ffff7ff0720 <+32>: std r30,-16(r1) 0x00007ffff7ff0724 <+36>: std r31,-8(r1) 0x00007ffff7ff0728 <+40>: add r6,r3,r10 0x00007ffff7ff072c <+44>: ori r4,r9,65535 0x00007ffff7ff0730 <+48>: lwz r8,0(r3) 0x00007ffff7ff0734 <+52>: andi. r9,r8,1 0x00007ffff7ff0738 <+56>: bne 0x7ffff7ff07d0 <__cvdso_clock_gettime_data+208> 0x00007ffff7ff073c <+60>: lwsync 0x00007ffff7ff0740 <+64>: mftb r30 <---- RIGHT HERE => 0x00007ffff7ff0744 <+68>: ld r12,40(r6) ``` What I believe is happening is that the kernel changed the PowerPC VDSO calls to use standard C calls instead of using hand rolled assembly. The hand rolled assembly calls never touched r30, so this change was safe to roll back. That does not seem to be the case anymore as on the 5.13rc2 kernel the compiler *is* generating assembly which modifies r30, making this change again unsafe and causing a crash when the program receives a signal during these calls (which will happen often due to async preempt). I realize this was reverted due to unexplained hangs in PowerPC builders, but I think we should reinstate this change and investigate those issues separately: golang@f4ca3c1
This PR (HEAD: cfbdc1e) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/328110 to see it. Tip: You can toggle comments from me using the |
Message from Cherry Mui: Patch Set 1: (2 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Derek Parker: Patch Set 1: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
This PR (HEAD: b3217a8) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/328110 to see it. Tip: You can toggle comments from me using the |
Message from Derek Parker: Patch Set 2: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Lynn Boger: Patch Set 2: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Derek Parker: Patch Set 2: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Paul Murphy: Patch Set 2: (2 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Also fix up some formatting.
Message from Derek Parker: Patch Set 2: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
This PR (HEAD: 77da464) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/328110 to see it. Tip: You can toggle comments from me using the |
Message from Derek Parker: Patch Set 3: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Paul Murphy: Patch Set 3: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
This PR (HEAD: 4a90283) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/328110 to see it. Tip: You can toggle comments from me using the |
Message from Derek Parker: Patch Set 4: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Paul Murphy: Patch Set 4: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Paul Murphy: Patch Set 4: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Derek Parker: Patch Set 4: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Cherry Mui: Patch Set 4: Run-TryBot+1 (2 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Go Bot: Patch Set 4: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Go Bot: Patch Set 4: TryBot-Result+1 (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Lynn Boger: Patch Set 4: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
This PR (HEAD: c3002bc) has been imported to Gerrit for code review. Please visit https://go-review.googlesource.com/c/go/+/328110 to see it. Tip: You can toggle comments from me using the |
Message from Derek Parker: Patch Set 5: (5 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Lynn Boger: Patch Set 5: Run-TryBot+1 (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Go Bot: Patch Set 5: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Paul Murphy: Patch Set 5: Code-Review+1 (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Go Bot: Patch Set 5: (2 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Derek Parker: Patch Set 6: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Lynn Boger: Patch Set 6: Run-TryBot+1 (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Go Bot: Patch Set 6: (2 comments) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Go Bot: Patch Set 6: TryBot-Result+1 (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Cherry Mui: Patch Set 6: Code-Review+2 Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Lynn Boger: Patch Set 6: Trust+1 Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Lynn Boger: Patch Set 6: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Cherry Mui: Patch Set 6: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Lynn Boger: Patch Set 6: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
Message from Derek Parker: Patch Set 6: (1 comment) Please don’t reply on this GitHub thread. Visit golang.org/cl/328110. |
This PR is being closed because golang.org/cl/328110 has been merged. |
This patch reinstates a fix for PowerPC with regard to making VDSO calls while receiving a signal, and subsequently crashing. The crash happens because certain VDSO calls can modify the r30 register, which is where g is stored. This change was reverted for PowerPC because r30 is supposed to be a non-volatile register. This is true, but that only makes a guarantee across function calls, but not "within" a function call. This patch was seemingly fine before because the Linux kernel still had hand rolled assembly VDSO function calls, however with a recent change to C function calls it seems the compiler used can generate instructions which temporarily clobber r30. This means that when we receive a signal during one of these calls the value of r30 will not be the g as the runtime expects, causing a segfault. You can see from this assembly dump how the register is clobbered during the call: (the following is from a 5.13rc2 kernel) ``` Dump of assembler code for function __cvdso_clock_gettime_data: 0x00007ffff7ff0700 <+0>: cmplwi r4,15 0x00007ffff7ff0704 <+4>: bgt 0x7ffff7ff07f0 <__cvdso_clock_gettime_data+240> 0x00007ffff7ff0708 <+8>: li r9,1 0x00007ffff7ff070c <+12>: slw r9,r9,r4 0x00007ffff7ff0710 <+16>: andi. r10,r9,2179 0x00007ffff7ff0714 <+20>: beq 0x7ffff7ff0810 <__cvdso_clock_gettime_data+272> 0x00007ffff7ff0718 <+24>: rldicr r10,r4,4,59 0x00007ffff7ff071c <+28>: lis r9,32767 0x00007ffff7ff0720 <+32>: std r30,-16(r1) 0x00007ffff7ff0724 <+36>: std r31,-8(r1) 0x00007ffff7ff0728 <+40>: add r6,r3,r10 0x00007ffff7ff072c <+44>: ori r4,r9,65535 0x00007ffff7ff0730 <+48>: lwz r8,0(r3) 0x00007ffff7ff0734 <+52>: andi. r9,r8,1 0x00007ffff7ff0738 <+56>: bne 0x7ffff7ff07d0 <__cvdso_clock_gettime_data+208> 0x00007ffff7ff073c <+60>: lwsync 0x00007ffff7ff0740 <+64>: mftb r30 <---- RIGHT HERE => 0x00007ffff7ff0744 <+68>: ld r12,40(r6) ``` What I believe is happening is that the kernel changed the PowerPC VDSO calls to use standard C calls instead of using hand rolled assembly. The hand rolled assembly calls never touched r30, so this change was safe to roll back. That does not seem to be the case anymore as on the 5.13rc2 kernel the compiler *is* generating assembly which modifies r30, making this change again unsafe and causing a crash when the program receives a signal during these calls (which will happen often due to async preempt). This change happened here: https://lwn.net/ml/linux-kernel/235e5571959cfa89ced081d7e838ed5ff38447d2.1601365870.git.christophe.leroy@csgroup.eu/. I realize this was reverted due to unexplained hangs in PowerPC builders, but I think we should reinstate this change and investigate those issues separately: f4ca3c1 Fixes #46803 Change-Id: Ib18d7bbfc80a1a9cb558f0098878d41081324b52 GitHub-Last-Rev: c3002bc GitHub-Pull-Request: #46767 Reviewed-on: https://go-review.googlesource.com/c/go/+/328110 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Trust: Lynn Boger <laboger@linux.vnet.ibm.com>
This patch reinstates a fix for PowerPC with regard to making VDSO calls while receiving a signal, and subsequently crashing. The crash happens because certain VDSO calls can modify the r30 register, which is where g is stored. This change was reverted for PowerPC because r30 is supposed to be a non-volatile register. This is true, but that only makes a guarantee across function calls, but not "within" a function call. This patch was seemingly fine before because the Linux kernel still had hand rolled assembly VDSO function calls, however with a recent change to C function calls it seems the compiler used can generate instructions which temporarily clobber r30. This means that when we receive a signal during one of these calls the value of r30 will not be the g as the runtime expects, causing a segfault. You can see from this assembly dump how the register is clobbered during the call: (the following is from a 5.13rc2 kernel) ``` Dump of assembler code for function __cvdso_clock_gettime_data: 0x00007ffff7ff0700 <+0>: cmplwi r4,15 0x00007ffff7ff0704 <+4>: bgt 0x7ffff7ff07f0 <__cvdso_clock_gettime_data+240> 0x00007ffff7ff0708 <+8>: li r9,1 0x00007ffff7ff070c <+12>: slw r9,r9,r4 0x00007ffff7ff0710 <+16>: andi. r10,r9,2179 0x00007ffff7ff0714 <+20>: beq 0x7ffff7ff0810 <__cvdso_clock_gettime_data+272> 0x00007ffff7ff0718 <+24>: rldicr r10,r4,4,59 0x00007ffff7ff071c <+28>: lis r9,32767 0x00007ffff7ff0720 <+32>: std r30,-16(r1) 0x00007ffff7ff0724 <+36>: std r31,-8(r1) 0x00007ffff7ff0728 <+40>: add r6,r3,r10 0x00007ffff7ff072c <+44>: ori r4,r9,65535 0x00007ffff7ff0730 <+48>: lwz r8,0(r3) 0x00007ffff7ff0734 <+52>: andi. r9,r8,1 0x00007ffff7ff0738 <+56>: bne 0x7ffff7ff07d0 <__cvdso_clock_gettime_data+208> 0x00007ffff7ff073c <+60>: lwsync 0x00007ffff7ff0740 <+64>: mftb r30 <---- RIGHT HERE => 0x00007ffff7ff0744 <+68>: ld r12,40(r6) ``` What I believe is happening is that the kernel changed the PowerPC VDSO calls to use standard C calls instead of using hand rolled assembly. The hand rolled assembly calls never touched r30, so this change was safe to roll back. That does not seem to be the case anymore as on the 5.13rc2 kernel the compiler *is* generating assembly which modifies r30, making this change again unsafe and causing a crash when the program receives a signal during these calls (which will happen often due to async preempt). This change happened here: https://lwn.net/ml/linux-kernel/235e5571959cfa89ced081d7e838ed5ff38447d2.1601365870.git.christophe.leroy@csgroup.eu/. I realize this was reverted due to unexplained hangs in PowerPC builders, but I think we should reinstate this change and investigate those issues separately: f4ca3c1 Fixes #46857 Change-Id: Ib18d7bbfc80a1a9cb558f0098878d41081324b52 GitHub-Last-Rev: c3002bc GitHub-Pull-Request: #46767 Reviewed-on: https://go-review.googlesource.com/c/go/+/328110 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Trust: Lynn Boger <laboger@linux.vnet.ibm.com> (cherry picked from commit 16e82be) Reviewed-on: https://go-review.googlesource.com/c/go/+/334411 Run-TryBot: Carlos Amedee <carlos@golang.org>
This patch reinstates a fix for PowerPC with regard to making VDSO calls while receiving a signal, and subsequently crashing. The crash happens because certain VDSO calls can modify the r30 register, which is where g is stored. This change was reverted for PowerPC because r30 is supposed to be a non-volatile register. This is true, but that only makes a guarantee across function calls, but not "within" a function call. This patch was seemingly fine before because the Linux kernel still had hand rolled assembly VDSO function calls, however with a recent change to C function calls it seems the compiler used can generate instructions which temporarily clobber r30. This means that when we receive a signal during one of these calls the value of r30 will not be the g as the runtime expects, causing a segfault. You can see from this assembly dump how the register is clobbered during the call: (the following is from a 5.13rc2 kernel) ``` Dump of assembler code for function __cvdso_clock_gettime_data: 0x00007ffff7ff0700 <+0>: cmplwi r4,15 0x00007ffff7ff0704 <+4>: bgt 0x7ffff7ff07f0 <__cvdso_clock_gettime_data+240> 0x00007ffff7ff0708 <+8>: li r9,1 0x00007ffff7ff070c <+12>: slw r9,r9,r4 0x00007ffff7ff0710 <+16>: andi. r10,r9,2179 0x00007ffff7ff0714 <+20>: beq 0x7ffff7ff0810 <__cvdso_clock_gettime_data+272> 0x00007ffff7ff0718 <+24>: rldicr r10,r4,4,59 0x00007ffff7ff071c <+28>: lis r9,32767 0x00007ffff7ff0720 <+32>: std r30,-16(r1) 0x00007ffff7ff0724 <+36>: std r31,-8(r1) 0x00007ffff7ff0728 <+40>: add r6,r3,r10 0x00007ffff7ff072c <+44>: ori r4,r9,65535 0x00007ffff7ff0730 <+48>: lwz r8,0(r3) 0x00007ffff7ff0734 <+52>: andi. r9,r8,1 0x00007ffff7ff0738 <+56>: bne 0x7ffff7ff07d0 <__cvdso_clock_gettime_data+208> 0x00007ffff7ff073c <+60>: lwsync 0x00007ffff7ff0740 <+64>: mftb r30 <---- RIGHT HERE => 0x00007ffff7ff0744 <+68>: ld r12,40(r6) ``` What I believe is happening is that the kernel changed the PowerPC VDSO calls to use standard C calls instead of using hand rolled assembly. The hand rolled assembly calls never touched r30, so this change was safe to roll back. That does not seem to be the case anymore as on the 5.13rc2 kernel the compiler *is* generating assembly which modifies r30, making this change again unsafe and causing a crash when the program receives a signal during these calls (which will happen often due to async preempt). This change happened here: https://lwn.net/ml/linux-kernel/235e5571959cfa89ced081d7e838ed5ff38447d2.1601365870.git.christophe.leroy@csgroup.eu/. I realize this was reverted due to unexplained hangs in PowerPC builders, but I think we should reinstate this change and investigate those issues separately: f4ca3c1 Fixes #46858 Change-Id: Ib18d7bbfc80a1a9cb558f0098878d41081324b52 GitHub-Last-Rev: c3002bc GitHub-Pull-Request: #46767 Reviewed-on: https://go-review.googlesource.com/c/go/+/328110 Run-TryBot: Lynn Boger <laboger@linux.vnet.ibm.com> TryBot-Result: Go Bot <gobot@golang.org> Reviewed-by: Cherry Mui <cherryyz@google.com> Trust: Lynn Boger <laboger@linux.vnet.ibm.com> (cherry picked from commit 16e82be) Reviewed-on: https://go-review.googlesource.com/c/go/+/334410 Run-TryBot: Cherry Mui <cherryyz@google.com>
…ut of saved registers so that we need to store two values on the stack to restore them later
This patch reinstates a fix for PowerPC with regard to making VDSO calls
while receiving a signal, and subsequently crashing. The crash happens
because certain VDSO calls can modify the r30 register, which is where g
is stored. This change was reverted for PowerPC because r30 is supposed
to be a non-volatile register. This is true, but that only makes a
guarantee across function calls, but not "within" a function call. This
patch was seemingly fine before because the Linux kernel still had hand
rolled assembly VDSO function calls, however with a recent change to C
function calls it seems the compiler used can generate instructions
which temporarily clobber r30. This means that when we receive a signal
during one of these calls the value of r30 will not be the g as the
runtime expects, causing a segfault.
You can see from this assembly dump how the register is clobbered during
the call:
(the following is from a 5.13rc2 kernel)
What I believe is happening is that the kernel changed the PowerPC VDSO
calls to use standard C calls instead of using hand rolled assembly. The
hand rolled assembly calls never touched r30, so this change was safe to
roll back. That does not seem to be the case anymore as on the 5.13rc2
kernel the compiler is generating assembly which modifies r30, making
this change again unsafe and causing a crash when the program receives a
signal during these calls (which will happen often due to async
preempt). This change happened here:
https://lwn.net/ml/linux-kernel/235e5571959cfa89ced081d7e838ed5ff38447d2.1601365870.git.christophe.leroy@csgroup.eu/.
I realize this was reverted due to unexplained hangs in PowerPC
builders, but I think we should reinstate this change and investigate
those issues separately:
f4ca3c1
Fixes #46803