Making Microphone to be "Voice Activated" instead push-to-talk #27

JuhaOjala · 2018-06-18T10:57:44Z

Hello! This is not actually an issue, but a request for help. I activated your plugin and added the TFAudioCapture component to the player character.

Then in BeginPlay I bound the delegate, converted the raw binary to a WAV file, and finally converted the WAV to a USoundWave and played it. This lets me play back the sound that was just recorded.

Here is the .cpp code for the WAV to USoundWave conversion, if anybody needs it:

USoundWave* AAudioCaptureSimpleCharacter::GetSoundWaveFromRawWav(TArray<uint8> Bytes)
{
	USoundWave* sw = NewObject<USoundWave>(USoundWave::StaticClass());

	if (!sw)
		return nullptr;

	// Parse the WAV header; bail out if the bytes are not a valid WAV file.
	FWaveModInfo WaveInfo;
	if (!WaveInfo.ReadWaveInfo(Bytes.GetData(), Bytes.Num()))
		return nullptr;

	sw->InvalidateCompressedData();

	// Copy the whole WAV file (header included) into the sound wave's raw data.
	sw->RawData.Lock(LOCK_READ_WRITE);
	void* LockedData = sw->RawData.Realloc(Bytes.Num());
	FMemory::Memcpy(LockedData, Bytes.GetData(), Bytes.Num());
	sw->RawData.Unlock();

	// Duration in seconds = data size in bits / (channels * bits per sample * sample rate).
	int32 DurationDiv = *WaveInfo.pChannels * *WaveInfo.pBitsPerSample * *WaveInfo.pSamplesPerSec;
	if (DurationDiv)
	{
		sw->Duration = *WaveInfo.pWaveDataSize * 8.0f / DurationDiv;
	}
	else
	{
		sw->Duration = 0.0f;
	}
	sw->SampleRate = *WaveInfo.pSamplesPerSec;
	sw->NumChannels = *WaveInfo.pChannels;
	sw->RawPCMDataSize = WaveInfo.SampleDataSize;
	sw->SoundGroup = ESoundGroup::SOUNDGROUP_Default;

	return sw;
}

Next, I tried to inspect the values of the bytes arriving in the OnAudioData array, but whenever I try to do something with them, the whole game freezes. I guess it's because the data arrives constantly, so any work done on it never finishes and the computer chokes.

Could you point out what the procedure would be for making the microphone listen constantly? You gave some advice on the subject in the Unreal forum, but I didn't quite grasp what you meant. Thank you for your time.

getnamo · 2018-06-24T14:27:58Z

It should be a matter of listening to OnAudioData. It calls back on the game thread with a copy of the buffer, but the buffer might get overwritten again before you finish working with it on the game thread; that would be a bug. An easy fix may be to uncomment these three lines: https://github.com/getnamo/tensorflow-ue4/blob/master/Source/TFAudioCapture/Private/FTFAudioCapture.cpp#L45, https://github.com/getnamo/tensorflow-ue4/blob/master/Source/TFAudioCapture/Private/FTFAudioCapture.cpp#L46, and https://github.com/getnamo/tensorflow-ue4/blob/master/Source/TFAudioCapture/Private/FTFAudioCapture.cpp#L57, which would make you receive those Blueprint calls on the sound thread instead. As long as you don't create or destroy UObjects in those callbacks, it will work fine.
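
One way to keep heavy work out of the audio callback is to copy each buffer into a thread-safe queue and process it elsewhere. The following is only a sketch in plain standard C++, not the plugin's code: `FAudioChunkQueue`, `Push`, and `Drain` are hypothetical names, and it assumes the callback can hand over a pointer and a size.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <deque>
#include <mutex>
#include <utility>
#include <vector>

// Hypothetical helper: hands audio buffers from the capture callback
// to another thread without sharing the original buffer.
class FAudioChunkQueue
{
public:
    // Call from the audio callback. Copies the data so that later
    // overwrites of the source buffer cannot affect queued chunks.
    void Push(const std::uint8_t* Data, std::size_t Size)
    {
        assert(Data != nullptr || Size == 0);
        std::vector<std::uint8_t> Copy(Data, Data + Size);
        std::lock_guard<std::mutex> Lock(Mutex);
        Chunks.push_back(std::move(Copy));
    }

    // Call from the consumer (for example once per game tick).
    // Takes every queued chunk in one short critical section.
    std::deque<std::vector<std::uint8_t>> Drain()
    {
        std::deque<std::vector<std::uint8_t>> Out;
        std::lock_guard<std::mutex> Lock(Mutex);
        Out.swap(Chunks);
        return Out;
    }

private:
    std::mutex Mutex;
    std::deque<std::vector<std::uint8_t>> Chunks;
};
```

The callback then only pays for one copy and a brief lock, and the consumer can take as long as it needs on the drained chunks.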

Regarding your earlier question about only triggering when you hear a voice: a simple way to detect that is to track the average volume of the audio and trigger when it is high enough. For example, average the absolute values of each byte across the whole array of data you receive in OnAudioData. If that average is above a certain threshold, send the bytes and start a timeout that resets each time the threshold is breached again. If the timeout runs out, stop streaming the bytes until the threshold is breached again.
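
The threshold-plus-timeout idea could be sketched roughly as follows in plain standard C++. This is an illustration under assumptions, not the plugin's API: `AverageAbsAmplitude` and `FVoiceGate` are hypothetical names, and the sketch assumes the bytes are 16-bit little-endian PCM samples, so it averages absolute sample values rather than individual bytes. The threshold and hold time are placeholders to tune by ear.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <vector>

// Mean absolute amplitude of a buffer, assuming 16-bit little-endian PCM.
double AverageAbsAmplitude(const std::vector<std::uint8_t>& Bytes)
{
    const std::size_t NumSamples = Bytes.size() / 2;
    if (NumSamples == 0)
    {
        return 0.0;
    }
    double Sum = 0.0;
    for (std::size_t i = 0; i < NumSamples; ++i)
    {
        // Reassemble each sample from its low and high byte.
        const std::uint16_t Raw =
            static_cast<std::uint16_t>(Bytes[2 * i] | (Bytes[2 * i + 1] << 8));
        const std::int16_t Sample = static_cast<std::int16_t>(Raw);
        Sum += std::abs(static_cast<int>(Sample));
    }
    return Sum / static_cast<double>(NumSamples);
}

// Hypothetical gate: opens when a buffer breaches the threshold and
// stays open for HoldSeconds of audio after the last breach.
class FVoiceGate
{
public:
    FVoiceGate(double InThreshold, double InHoldSeconds)
        : Threshold(InThreshold), HoldSeconds(InHoldSeconds), TimeLeft(0.0)
    {
        assert(Threshold >= 0.0 && HoldSeconds >= 0.0);
    }

    // Returns true if this buffer should be sent on.
    // BufferSeconds is the duration of audio the buffer covers.
    bool Process(const std::vector<std::uint8_t>& Bytes, double BufferSeconds)
    {
        if (AverageAbsAmplitude(Bytes) > Threshold)
        {
            TimeLeft = HoldSeconds; // breach: (re)start the timeout
            return true;
        }
        if (TimeLeft <= 0.0)
        {
            return false; // timeout expired: gate is closed
        }
        TimeLeft -= BufferSeconds; // quiet buffer inside the hold window
        return true;
    }

private:
    double Threshold;
    double HoldSeconds;
    double TimeLeft;
};
```

In use, each buffer received in OnAudioData would go through `Process`, and only buffers for which it returns true would be forwarded. The hold window keeps short pauses between words from cutting the stream.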

getnamo added the question label Jun 18, 2018

getnamo mentioned this issue Aug 28, 2019: Add audio conversion utility to BPLibrary getnamo/SocketIOClient-Unreal#147 (Closed)

eternalfusion mentioned this issue Sep 4, 2020: TFAudioCapture for recording audio #58 (Open)
