Segfault during restore: CoreCLR Product Build Linux_musl arm release #43826
Comments
Not sure what area to put this under, so I labeled it with infrastructure for now to ensure it's seen by the right people. Feel free to move it wherever appropriate.
cc: @trylek @jkoritzinsky
@janvorli, am I right to recall these are the runs you recently enabled? I guess we can either get to the dump and work from there to identify which component crashes (it looks like the NuGet downloader, but the log isn't detailed enough to be sure), or find out that there's no dump and primarily track this as the infra deficiency of not having a dump available.
Yes, I have recently added the linux-musl arm builds. It is a cross-build, so the crash happens on x64 Linux. As we don't seem to capture core dumps for build crashes, it doesn't seem actionable.
@dotnet/runtime-infrastructure - do we know how to enable dump collection on AzDO build machines and/or investigate why it doesn't work, in case it should already be enabled?
Dumps are configured on Helix test machines, not build machines, as far as I know. @MattGal please correct me if I am wrong here. https://github.com/dotnet/runtime/blob/master/docs/design/coreclr/botr/xplat-minidump-generation.md describes the environment variables needed if the crash is happening somewhere in managed code. I'd say that's unlikely given this is an x64 process, but the logs point at it happening during a restore. Otherwise, for native dumps we'd need to set
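For the managed-code case mentioned above, here is a minimal sketch of how a build script might switch on runtime minidumps for a child process. The variable names are the ones I recall from the linked minidump document (`COMPlus_DbgEnableMiniDump`, `COMPlus_DbgMiniDumpType`, `COMPlus_DbgMiniDumpName`); the helper name `minidump_env` and the dump directory are hypothetical, and this does not cover the native-dump settings.

```python
import os


def minidump_env(dump_dir, base_env=None):
    """Return a copy of the environment with runtime minidumps enabled.

    Assumption: variable names and values follow the xplat minidump
    document linked in the thread; `dump_dir` is a hypothetical location.
    """
    env = dict(os.environ if base_env is None else base_env)
    env["COMPlus_DbgEnableMiniDump"] = "1"  # enable dump generation on crash
    env["COMPlus_DbgMiniDumpType"] = "4"    # 4 = full dump, per the document
    # %d is replaced by the runtime with the crashing process id.
    env["COMPlus_DbgMiniDumpName"] = os.path.join(dump_dir, "coredump.%d")
    return env
```

The returned mapping could then be passed as `env=` to `subprocess.run` when launching the restore step, so that only that child process is affected.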
There isn't really a difference in how a Helix machine is provisioned. Specifically, we still create and upload dumps for anything that crashes, even on the build machines, and if you know your build's info and have Kusto access you can definitely find these dumps. This isn't likely to get any more helpful than it currently is, because the Helix work items in question come from an Azure DevOps pool provider and there's no built-in part of this interface to promote files to be "part" of a build outside of the build's execution. However, if you're keen to know how to get dumps (if they exist) off of a given AzDO build, send me the build in corpnet email and I'll walk you through how to find them.
Is there a way to get the dump? This was executed inside a docker container and the container is not preserved, right?
I tried, but didn't find any data on Kusto pointing to one (or even a failure on such a leg). I sent a message to Matt and will share any findings.
Ah. If the Azure DevOps agent drove a build inside a docker container, no, we definitely don't have any record of core dumps done inside that container.
If we wanted to enable capturing dumps in containers, we could possibly map a folder on the machine running the container into the container and let the dumps go there. And also pass
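A rough sketch of the folder-mapping idea, assuming the container is started with plain `docker run`. The helper name, image name, and paths are hypothetical; only the standard `-v` (bind mount) and `-e` (environment variable) options are used, and it does not include whatever extra option the comment above was about to name.

```python
def docker_run_args(image, command, host_dump_dir,
                    container_dump_dir="/dumps", env=None):
    """Build a `docker run` argument list that maps a host folder for dumps.

    Assumption: anything written to `container_dump_dir` inside the
    container lands in `host_dump_dir` and survives container removal.
    """
    args = ["docker", "run", "--rm",
            "-v", f"{host_dump_dir}:{container_dump_dir}"]
    # Optional environment variables, e.g. dump settings for the runtime.
    for name, value in sorted((env or {}).items()):
        args += ["-e", f"{name}={value}"]
    args.append(image)
    args.extend(command)
    return args
```

For example, `docker_run_args("build-image", ["./build.sh"], "/mnt/dumps")` yields a list that can be handed to `subprocess.run`; the dump-producing process inside the container would still need to be told to write into `/dumps`.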
Would it make sense to instead add a step that tries to upload dumps as artifacts, if any exist, when the build fails?
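One way such a step could look, as a sketch only: a script that runs when the build fails, searches a directory for files that look like dumps, and copies them into a staging folder that a later publish step could upload. The function name, the file patterns (`core.*`, `coredump.*`, `*.dmp`), and the staging layout are assumptions, not anything the pipeline currently does.

```python
import shutil
from pathlib import Path


def collect_dumps(search_dir, staging_dir,
                  patterns=("core.*", "coredump.*", "*.dmp")):
    """Copy dump-like files under search_dir into staging_dir.

    Returns the list of copied paths; returns [] and creates nothing
    when no dump is found. The patterns are guesses at common dump names.
    """
    search_dir = Path(search_dir)
    staging_dir = Path(staging_dir)
    found = []
    for pattern in patterns:
        for path in sorted(search_dir.rglob(pattern)):
            if path.is_file() and path not in found:
                found.append(path)
    if not found:
        return []
    staging_dir.mkdir(parents=True, exist_ok=True)
    copied = []
    for index, path in enumerate(found):
        # Prefix with an index so same-named dumps do not overwrite each other.
        target = staging_dir / f"{index}_{path.name}"
        shutil.copy2(path, target)
        copied.append(target)
    return copied
```

The build would only need to publish the staging folder when the returned list is non-empty, which keeps successful builds free of an empty artifact.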
Haven't seen this recently.
Should we close this issue and re-open if we see further occurrences? If not, should we at least remove the
Closing this as we haven't seen it for a while.
At the start of "Build native test components":
https://dev.azure.com/dnceng/public/_build/results?buildId=865863&view=logs&jobId=2796eae7-6bff-580e-7515-5bfa4409543c&j=2796eae7-6bff-580e-7515-5bfa4409543c&t=925eae2f-7374-55ef-fc58-6001c38b9348
Hit in #43798