XAudio2 bit flips via voltage sag when using AVX256 instructions at the same time as I destroy source voices? #20

Open
MaulingMonkey opened this issue Oct 8, 2022 · 6 comments
Labels
bug Something isn't working invalid This doesn't seem right
Milestone

Comments

@MaulingMonkey
Owner

MaulingMonkey commented Oct 8, 2022

Unsound if XAudio2 threads outlive main?
Had a once-off XAudio2 thread crash when tearing down my snake clone with a leaked IXAudio2 instance. Worth investigating.

@MaulingMonkey MaulingMonkey added the bug Something isn't working label Oct 8, 2022
@MaulingMonkey
Owner Author

I've since made several attempts to create a repro case, all of which have evaporated. I might just mark create as unsafe and call it a day, which might be worth doing anyway just to indicate "this is a large native API and it probably has at least some unsoundness (internal overflows / bad handling of OOMs) that is impossible to guard against."

@MaulingMonkey
Owner Author

I have indeed marked create unsafe and called it a day. That said, if anyone else encounters this and can find more details, please share!
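
For anyone unfamiliar with what "marking create unsafe" means in practice, here is a minimal sketch. `Engine`, `sample_rate`, and `make_engine` are hypothetical stand-ins invented for illustration; this is not the actual thindx_xaudio2 API, and no native code is called.

```rust
/// Hypothetical stand-in for a wrapper around a large native audio API.
pub struct Engine {
    sample_rate: u32,
}

impl Engine {
    /// Creates the engine.
    ///
    /// # Safety
    ///
    /// The wrapped native API is assumed to be large enough that it probably
    /// has some unsoundness (internal overflows, bad OOM handling) that the
    /// wrapper cannot guard against. Marking the constructor `unsafe` makes
    /// the caller acknowledge that risk explicitly.
    pub unsafe fn create(sample_rate: u32) -> Engine {
        // A real wrapper would call into the native library here.
        Engine { sample_rate }
    }

    /// Safe accessor: once the engine exists, ordinary use stays safe.
    pub fn sample_rate(&self) -> u32 {
        self.sample_rate
    }
}

/// Callers now need an `unsafe` block at the single point of creation.
pub fn make_engine() -> Engine {
    // SAFETY: this sketch performs no native calls, so nothing can go wrong.
    unsafe { Engine::create(48_000) }
}
```

The design choice is that the `unsafe` marker sits on the one entry point, so every later method can stay safe while the risk is still visible at the call site.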

@MaulingMonkey MaulingMonkey reopened this Oct 10, 2022
@MaulingMonkey
Owner Author

Managed a repro, collected symbols, and really dug in. I believe my CPU is mis-executing instructions, possibly due to a voltage-spike-induced bit flip from XAudio2 using fancy instructions in combination with other spikes in activity being put on the cores by process shutdown.

XAudio2 Thread

+	XAudio2_9.dll!OAPIPELINE::MatrixMixFromInt16DiagonalAvx(struct OAPIPELINE::SMatrixMixParameters const *)	Unknown
 	XAudio2_9.dll!LEAPFX::CAudioSRC::Process(unsigned int,struct XAPO_PROCESS_BUFFER_PARAMETERS const *,unsigned int,struct XAPO_PROCESS_BUFFER_PARAMETERS *,int)	Unknown
 	XAudio2_9.dll!LEAPCORE::CSWVoice::Process(unsigned int)	Unknown
 	XAudio2_9.dll!LEAPCORE::CGraphManager::ProcessVoiceChains(unsigned int)	Unknown
 	XAudio2_9.dll!LEAPCORE::CGraphManager::ProcessGraph(unsigned int)	Unknown
 	XAudio2_9.dll!LEAPCORE::CGraphManager::GraphThreadProc(unsigned __int64)	Unknown
 	XAudio2_9.dll!CThreadBase::StaticGraphThreadProc(void *)	Unknown
 	kernel32.dll!BaseThreadInitThunk()	Unknown
 	ntdll.dll!RtlUserThreadStart()	Unknown
Unhandled exception at 0x00007FFB5858A165 (XAudio2_9.dll) in Snake.exe: 0xC0000005:
Access violation reading location 0x00000228FDF510FC.
 00007FFB5858A15E  jmp         OAPIPELINE::MatrixMixFromInt16DiagonalAvx+565h (07FFB5858A165h)  
+00007FFB5858A160  vmovups     ymm2,ymmword ptr [rbp]  
 00007FFB5858A165  movsx       eax,word ptr [r8-14h]  
 00007FFB5858A16A  mov         dword ptr [rbp+0E0h],eax  
 00007FFB5858A170  movsx       eax,word ptr [r8-12h]  
 00007FFB5858A175  mov         dword ptr [rbp+0E4h],eax  
RAX = 3F8000003F800000 RBX = 0000000000000000 RCX = 0000000000000000 RDX = 00000228FDEC41A0
RSI = 00000228FDEC5870 RDI = 00000228FDF510FC R8  = 00000228FDF51110 R9  = 0000000000000010
R10 = 0000000000000372 R11 = 00000228FDEC4180 R12 = 0000000000000001 R13 = 0000000000000001
R14 = 00000000000001B9 R15 = 0000000000000000 RIP = 00007FFB5858A165 RSP = 0000003E547FF560
RBP = 0000003E547FF580 EFL = 00010200 

The flagged vmovups instruction dereferences RBP, but RBP points at a wildly different address than the read address that's supposedly exploding. However, RDI matches the faulting address exactly, and is one bit flip away from RBP in the register encoding:

https://wiki.osdev.org/X86-64_Instruction_Encoding#Registers

X.Reg ... 64-bit GP ...
... ... ... ...
0.101 (5) ... RBP ...
0.110 (6) ... RSI ...
0.111 (7) ... RDI ...
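
The one-bit claim can be checked mechanically. The sketch below uses the three-bit register numbers from the table and the values from the register dump. `ASSUMED_MODRM` is an assumption: 0x55 is what I would expect the ModRM byte of `vmovups ymm2, [rbp]` to be (mod=01, reg=010, r/m=101), but the raw instruction bytes were not captured in this thread.

```rust
/// Three-bit register numbers from the encoding table above.
pub const RBP_ENC: u8 = 0b101;
pub const RSI_ENC: u8 = 0b110;
pub const RDI_ENC: u8 = 0b111;

/// Values copied from the register dump and the exception message.
pub const RBP: u64 = 0x0000_003E_547F_F580;
pub const RDI: u64 = 0x0000_0228_FDF5_10FC;
pub const FAULT_ADDR: u64 = 0x0000_0228_FDF5_10FC;

/// ASSUMPTION: ModRM byte for `vmovups ymm2, [rbp]`; not taken from a dump.
pub const ASSUMED_MODRM: u8 = 0x55;

/// Number of bits that differ between two encodings.
pub fn bit_distance(a: u8, b: u8) -> u32 {
    (a ^ b).count_ones()
}

/// The r/m field is the low three bits of a ModRM byte.
pub fn modrm_rm(modrm: u8) -> u8 {
    modrm & 0b111
}

/// True if flipping exactly one bit of the r/m field turns RBP into RDI.
pub fn one_flip_turns_rbp_into_rdi() -> bool {
    bit_distance(RBP_ENC, RDI_ENC) == 1
        && modrm_rm(ASSUMED_MODRM) == RBP_ENC
        && modrm_rm(ASSUMED_MODRM ^ 0b010) == RDI_ENC
}
```

Under those assumptions, RBP and RDI differ in a single bit while RBP and RSI differ in two, and only RDI holds the faulting address.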

Main Thread

The main thread callstack shows the crash occurs while handling WM_CLOSE, well before destroying the IXAudio2 or trying to invoke atexit stuff / invoke ExitProcess. In fact, the main thread has only just begun to attempt to tear down the very first voice, Music:

 	ntdll.dll!NtWaitForAlertByThreadId()	Unknown
 	ntdll.dll!RtlpWaitOnAddressWithTimeout()	Unknown
 	ntdll.dll!RtlpWaitOnAddress()	Unknown
 	ntdll.dll!RtlpWaitOnCriticalSection()	Unknown
 	ntdll.dll!RtlpEnterCriticalSectionContended()	Unknown
 	ntdll.dll!RtlEnterCriticalSection()	Unknown
 	XAudio2_9.dll!XAUDIO2::CX2Voice::DestroyVoice(void)	Unknown
 	snakelib_rs.dll!thindx_xaudio2_sys::xaudio2_8::IXAudio2Voice::DestroyVoice() Line 28	Rust
 	snakelib_rs.dll!thindx_xaudio2::xaudio2_8::voices::impl$15::drop(thindx_xaudio2::xaudio2_8::voices::SourceVoiceUntyped * self) Line 64	Rust
 	snakelib_rs.dll!core::ptr::drop_in_place<thindx_xaudio2::xaudio2_8::voices::SourceVoiceUntyped>(thindx_xaudio2::xaudio2_8::voices::SourceVoiceUntyped *) Line 487	Rust
 	snakelib_rs.dll!core::ptr::drop_in_place<slice$<thindx_xaudio2::xaudio2_8::voices::SourceVoiceUntyped>>(ptr_mut$<slice$<thindx_xaudio2::xaudio2_8::voices::SourceVoiceUntyped>>) Line 487	Rust
 	snakelib_rs.dll!alloc::vec::impl$28::drop<thindx_xaudio2::xaudio2_8::voices::SourceVoiceUntyped,alloc::alloc::Global>(alloc::vec::Vec<thindx_xaudio2::xaudio2_8::voices::SourceVoiceUntyped,alloc::alloc::Global> * self) Line 2923	Rust
 	snakelib_rs.dll!core::ptr::drop_in_place<alloc::vec::Vec<thindx_xaudio2::xaudio2_8::voices::SourceVoiceUntyped,alloc::alloc::Global>>(alloc::vec::Vec<thindx_xaudio2::xaudio2_8::voices::SourceVoiceUntyped,alloc::alloc::Global> *) Line 487	Rust
 	snakelib_rs.dll!core::ptr::drop_in_place<snakelib_rs::audio::XAudio2Sound>(snakelib_rs::audio::XAudio2Sound *) Line 487	Rust
 	snakelib_rs.dll!core::ptr::drop_in_place<alloc::boxed::Box<snakelib_rs::audio::XAudio2Sound,alloc::alloc::Global>>(snakelib_rs::audio::XAudio2Sound * *) Line 487	Rust
 	snakelib_rs.dll!snakelib_rs::audio::xaudio2_sound_destroy(snakelib_rs::audio::XAudio2Sound * sound) Line 179	Rust
 	[Managed to Native Transition]	
 	SnakeLib.dll!Snake.XAudio2Sound.Dispose.AnonymousMethod__0() Line 75	C#
 	SnakeLib.dll!Snake.RustPanicException.Wrap.AnonymousMethod__0() Line 13	C#
 	SnakeLib.dll!Snake.RustPanicException.Wrap<int>(System.Func<int> f) Line 18	C#
 	SnakeLib.dll!Snake.RustPanicException.Wrap(System.Action a) Line 13	C#
 	SnakeLib.dll!Snake.XAudio2Sound.Dispose() Line 75	C#
 	Snake.exe!Snake.Sounds.DestroyAll() Line 31	C#
 	Snake.exe!Snake.SnakeForm.Dispose(bool disposing) Line 114	C#
 	System.Windows.Forms.dll!System.Windows.Forms.Form.WmClose(ref System.Windows.Forms.Message m)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DebuggableCallback(System.IntPtr hWnd, int msg, System.IntPtr wparam, System.IntPtr lparam)	Unknown
 	[Native to Managed Transition]	
 	user32.dll!UserCallWinProcCheckWow()	Unknown
 	user32.dll!DispatchClientMessage()	Unknown
 	user32.dll!__fnDWORD()	Unknown
 	ntdll.dll!KiUserCallbackDispatcherContinue()	Unknown
 	win32u.dll!NtUserMessageCall()	Unknown
 	user32.dll!RealDefWindowProcWorker()	Unknown
 	user32.dll!RealDefWindowProcW()	Unknown
 	uxtheme.dll!DoMsgDefault(struct _THEME_MSG const *)	Unknown
 	uxtheme.dll!OnDwpSysCommand()	Unknown
 	uxtheme.dll!_ThemeDefWindowProc()	Unknown
 	uxtheme.dll!ThemeDefWindowProcW()	Unknown
 	user32.dll!DefWindowProcW()	Unknown
 	user32.dll!UserCallWinProcCheckWow()	Unknown
 	user32.dll!CallWindowProcW()	Unknown
 	System.Windows.Forms.ni.dll!00007ffbe2a6351d()	Unknown
 	[Managed to Native Transition]	
 	System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DefWndProc(ref System.Windows.Forms.Message m)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.Form.DefWndProc(ref System.Windows.Forms.Message m)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.Control.WndProc(ref System.Windows.Forms.Message m)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.Form.WmSysCommand(ref System.Windows.Forms.Message m)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DebuggableCallback(System.IntPtr hWnd, int msg, System.IntPtr wparam, System.IntPtr lparam)	Unknown
 	[Native to Managed Transition]	
 	user32.dll!UserCallWinProcCheckWow()	Unknown
 	user32.dll!DispatchClientMessage()	Unknown
 	user32.dll!__fnDWORD()	Unknown
 	ntdll.dll!KiUserCallbackDispatcherContinue()	Unknown
 	win32u.dll!NtUserMessageCall()	Unknown
 	user32.dll!RealDefWindowProcWorker()	Unknown
 	user32.dll!RealDefWindowProcW()	Unknown
 	uxtheme.dll!DoMsgDefault(struct _THEME_MSG const *)	Unknown
 	uxtheme.dll!OnDwpNcLButtonDown()	Unknown
 	uxtheme.dll!_ThemeDefWindowProc()	Unknown
 	uxtheme.dll!ThemeDefWindowProcW()	Unknown
 	user32.dll!DefWindowProcW()	Unknown
 	user32.dll!UserCallWinProcCheckWow()	Unknown
 	user32.dll!CallWindowProcW()	Unknown
 	System.Windows.Forms.ni.dll!00007ffbe2a6351d()	Unknown
 	[Managed to Native Transition]	
 	System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DefWndProc(ref System.Windows.Forms.Message m)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.Form.DefWndProc(ref System.Windows.Forms.Message m)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.Control.WndProc(ref System.Windows.Forms.Message m)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.Form.WmNcButtonDown(ref System.Windows.Forms.Message m)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.NativeWindow.DebuggableCallback(System.IntPtr hWnd, int msg, System.IntPtr wparam, System.IntPtr lparam)	Unknown
 	[Native to Managed Transition]	
 	user32.dll!UserCallWinProcCheckWow()	Unknown
 	user32.dll!DispatchMessageWorker()	Unknown
 	System.Windows.Forms.ni.dll!00007ffbe2aeba99()	Unknown
 	[Managed to Native Transition]	
 	System.Windows.Forms.dll!System.Windows.Forms.Application.ComponentManager.System.Windows.Forms.UnsafeNativeMethods.IMsoComponentManager.FPushMessageLoop(System.IntPtr dwComponentID, int reason, int pvLoopData)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.Application.ThreadContext.RunMessageLoopInner(int reason, System.Windows.Forms.ApplicationContext context)	Unknown
 	System.Windows.Forms.dll!System.Windows.Forms.Application.ThreadContext.RunMessageLoop(int reason, System.Windows.Forms.ApplicationContext context)	Unknown
 	Snake.exe!Snake.Program.Main() Line 15	C#
 	[Native to Managed Transition]	
 	mscoreei.dll!_CorExeMain()	Unknown
 	mscoree.dll!_CorExeMain_Exported()	Unknown
 	kernel32.dll!BaseThreadInitThunk()	Unknown
 	ntdll.dll!RtlUserThreadStart()	Unknown

@MaulingMonkey MaulingMonkey added the invalid This doesn't seem right label Oct 10, 2022
@MaulingMonkey MaulingMonkey changed the title Unsound if XAudio2 threads outlive main? XAudio2 bit flips via voltage sag when using AVX256 instructions at the same time as I destroy source voices. Oct 10, 2022
@MaulingMonkey
Owner Author

MaulingMonkey commented Oct 10, 2022

Managed a 3rd repro, with the exact same XAudio2 + main thread callstacks. Also stepped all the way through to the same logic in a regular, non-misexecuting run of XAudio2 to verify that I see the same instruction stream, i.e. the disassembly is semi-trustworthy. Got a second machine out of cold storage and had it running through Windows updates in preparation for a repro attempt, only for it to die on me. Replacement power brick is on the way in case that's all that bit the dust. Clearly XAudio2 is cursed by gremlins.

@MaulingMonkey MaulingMonkey added this to the 2022-10-10 milestone Oct 10, 2022
@MaulingMonkey MaulingMonkey changed the title XAudio2 bit flips via voltage sag when using AVX256 instructions at the same time as I destroy source voices. XAudio2 bit flips via voltage sag when using AVX256 instructions at the same time as I destroy source voices Oct 10, 2022
@MaulingMonkey MaulingMonkey changed the title XAudio2 bit flips via voltage sag when using AVX256 instructions at the same time as I destroy source voices XAudio2 bit flips via voltage sag when using AVX256 instructions at the same time as I destroy source voices? Oct 10, 2022
@MaulingMonkey
Owner Author

MaulingMonkey commented Oct 12, 2022

Bacterius wants to blame a silicon/microcode bug. Which I guess is also possible?

The system in question is a ~4.5-year-old, heavily abused NUC that has compiled a lot of stuff and is fed through a piddly little 120W power brick.

@MaulingMonkey
Owner Author

jetp250 in Discord notes:

Not that I know anything about this but how do you know it's the vmovups line rather than the next one? The exception's address ends with 165, movups is on 160, and on the movsx line it accesses word ptr[r8-14h]. r8's value there is 0x00000228FDF51110, and subtracting 0x14 does give you 0x228FDF510FC i.e the same value as RDI holds, so this would do the same OOB access as the suspected bit flip

Have I perhaps been misled by trusting VS's debugger pointing out the vmovups a little too much? Probably!
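
jetp250's arithmetic is easy to double check. A small sketch using only values from the register dump, the disassembly, and the exception message above (the constant and function names are mine, chosen for illustration):

```rust
/// Values copied from the register dump and the exception message.
pub const RIP: u64 = 0x0000_7FFB_5858_A165;
pub const R8: u64 = 0x0000_0228_FDF5_1110;
pub const RDI: u64 = 0x0000_0228_FDF5_10FC;
pub const FAULT_ADDR: u64 = 0x0000_0228_FDF5_10FC;

/// Instruction addresses from the disassembly listing.
pub const VMOVUPS_ADDR: u64 = 0x0000_7FFB_5858_A160;
pub const MOVSX_ADDR: u64 = 0x0000_7FFB_5858_A165;

/// Effective address read by `movsx eax, word ptr [r8-14h]`.
pub fn movsx_effective_address(r8: u64) -> u64 {
    r8.wrapping_sub(0x14)
}

/// True if the fault is fully explained by the movsx at RIP, with no bit
/// flip needed: RIP sits on the movsx and [r8-14h] is the faulting address.
pub fn movsx_explains_fault() -> bool {
    RIP == MOVSX_ADDR && movsx_effective_address(R8) == FAULT_ADDR
}
```

With these numbers `r8 - 0x14` equals both the faulting address and RDI, and RIP is on the movsx rather than the vmovups, so the out-of-bounds read does not require a mis-executed instruction to explain.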
