Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrect timing of queued register writes in ymf262 #1

Open
M-HT opened this issue May 1, 2024 · 7 comments
Open

Incorrect timing of queued register writes in ymf262 #1

M-HT opened this issue May 1, 2024 · 7 comments
Assignees

Comments

@M-HT
Copy link

M-HT commented May 1, 2024

When you queue some register writes (aymo_ymf262_enqueue_write) before reading the output (aymo_ymf262_generate_i16x2), there is a difference in timings between nuked opl and simd versions.

In nuked opl the register writes occur at the end of 3rd, 5th, 7th, 9th, ... tick.
In simd versions the register writes occur at the end of 1st, 4th, 7th, 10th, ... tick.

This results in different output with simd versions vs nuked opl.

Possible fix is to change the line chip->rq_delay = AYMO_(REG_QUEUE_LATENCY); in function aymo_(rq_update) to chip->rq_delay = AYMO_(REG_QUEUE_LATENCY) - 1; and add the line chip->rq_delay = AYMO_(REG_QUEUE_LATENCY); to the end of function aymo_(ctor).

A minor unrelated issue is that in file aymo_ym7128_common.h you're declaring variable aymo_ym7128_kernel_minèhase instead of aymo_ym7128_kernel_minphase.

@TexZK TexZK self-assigned this May 1, 2024
@TexZK TexZK added the help wanted Extra attention is needed label May 1, 2024
@TexZK TexZK removed their assignment May 1, 2024
@TexZK
Copy link
Owner

TexZK commented May 1, 2024

Hi! Thanks for spotting some misalignments!
I tried your proposed solution, but I still get some differences by replacing aymo_(write)()/OPL3_WriteReg() with aymo_(enqueue_write)()/OPL3_WriteRegBuffered() within app_run() of test_ymf262_compare_epilogue_inline.h.

I think that's because my implementation pre-calculates as much as possible within aymo_(write), while Nuked OPL3 performs some updates within the DSP routine, delaying the effect of some register write.
I don't know if it's mandatory to compute those "steady-state" values within the DSP routine, or if it's just some arbitrary legacy code.

Up to now this should be the only known difference between AYMO and Nuked OPL3.
I didn't give it much importance because that just delays the effect of a register write, which would be affected by CPU/interrupt/HW latencies in an actual OPL3 system.

@M-HT
Copy link
Author

M-HT commented May 2, 2024

I checked the failing tests and the register writes occur at the same tick in nuked opl and the simd versions (with my solution).

Just for testing I moved the the code block // Fix tremolo updates delayed w.r.t. AYMO from test_ymf262_compare_epilogue_inline.h to the end of function OPL3_WriteReg in opl3.c. With this change all tests were ok, so my solution is correct.

Because of this code block, your implementation sometimes generates different output than the original nuked opl. Wouldn't it be better to change your implementation to match the original nuked opl ?

Regardless of this, here is an example file where the tests fail (with aymo_(write)()/OPL3_WriteReg()):
example-difference.zip

I haven't checked why, maybe I should open a new issue ?

@TexZK
Copy link
Owner

TexZK commented May 2, 2024

Hopefully I fixed tremolo (as per Nuked OPL3) with commit 17913f2.

I tested the score file you provided, comparing the output of the two implementations.
There must still be some rare 1-sample delay somewhere on the right channel, perhaps linked to OPL_QUIRK_CHANNELSAMPLEDELAY (to be verified),
It looks like DooM E1M1 is affected by the same tiny issue as well.

@M-HT
Copy link
Author

M-HT commented May 2, 2024

You forgot to add eg_tremoloreq to file aymo_ymf262_arm_neon.h and to remove the code block // Fix tremolo updates delayed w.r.t. AYMO from file aymo_ymf262_none.c.

@TexZK
Copy link
Owner

TexZK commented May 2, 2024

Ah! I'm rather tired today 😅

Patched with commit 4374fe5, tested on RPi5

@TexZK
Copy link
Owner

TexZK commented May 2, 2024

Ok, commit f99285b corrected a few wrong slot delays related to OPL_QUIRK_CHANNELSAMPLEDELAY .

As per the open point about queue delays, I'd have to change a few scheduling policies to make them match Nuked's.
I'm not really too much into it, because I think what Nuked does is a little overkill, and the code a little obfuscated.
I mean, AFAIK Nuked schedules each writing w.r.t. the latest scheduled event (+2), as if it were a MIDI score.
I prefer to just enqueue the register into a FIFO, which is then dequeued as soon as possible (+2).
IMO this is what an actual PC would do, or at least what an emulation would; you don't schedule port outputs, you just make the CPU execute the out instruction.

@TexZK TexZK self-assigned this May 2, 2024
@TexZK TexZK removed the help wanted Extra attention is needed label May 2, 2024
@M-HT
Copy link
Author

M-HT commented May 3, 2024

I see three problems/differences with your queue implementation:

  1. You're not dequeuing as soon as possible +2, but as soon as possible +3. I fixed it by changing the line chip->rq_delay = AYMO_(REG_QUEUE_LATENCY); in function aymo_(rq_update) to chip->rq_delay = AYMO_(REG_QUEUE_LATENCY) - 1;.
  2. In nuked the earliest dequeue occurs in the third tick (first tick in your implementation). I fixed it by adding the line chip->rq_delay = AYMO_(REG_QUEUE_LATENCY); to the end of function aymo_(ctor).

In my understanding with these changes your implementation matches nuked (no test I did shows a difference) except the third problem.

  1. In your implementation you're not handling buffer overflow in rq_buffer. In nuked this is handled by writing the oldest queued value immediately (which frees a place in the queue). I don't have a real world example where buffer overflow occurs, but it should probably be handled, just in case.

I found another example where tests fail:
example-difference-2.zip

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants