Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Optimize handling of reset interrupt. #64

Merged
merged 8 commits into from
Mar 19, 2022
Merged

Conversation

mactcp
Copy link

@mactcp mactcp commented Mar 14, 2022

For your consideration, here is an optimization of reset handling.

Rather than polling m_isBusReset all over the place, I've changed the code to set up longjmp that can be used for a bus reset, then added the necessary code to call that from the interrupt handler. That does require a little bit of assembly.

I've gated the reset handling so it doesn't fire in the middle of a read, write, or flush of the image file. Flush will also be called now if there is a reset while a write was in progress.

I'm seeing about a 5% increase in read speed, and 10% increase in write speed on the computer I'm testing this on (Quadra 650, Apple drivers, Hyundai SD card, SCSI Director).

BTW, removed the PIN_MAP from attachInterrupt(). It was working OK because the value returned from PIN_MAP was the same, but if you look at the attachInterrupt function it is doing it's own PIN_MAP.

@erichelgeson
Copy link
Owner

Thanks again for the PR! I am seeing about the same on my LCIII+. I admit the asm is a bit over my knowledge level currently but I'll dig into the docs and read up on it. Again will test on a few different machines to ensure it doesn't affect compatibility and merge.

@mactcp
Copy link
Author

mactcp commented Mar 15, 2022

I've got more optimizations coming, probably this week. I wanted to submit this change first, as it is the most complex. Feel free to wait and queue them all up for a single test sweep.

I have quite a collection of vintage Macs, please let me know if you think there is anything I should test on myself.

@erichelgeson
Copy link
Owner

I will write up a more structured test/acceptance procedure in the wiki today or tomorrow for releases/PRs based on my process so far.

@mactcp
Copy link
Author

mactcp commented Mar 16, 2022

I have set up my scope to get a better look at what is going on performance wise. Without this optimization, the cycle time for REQ (in yellow) is longer. It is somewhere around 600ns before, and closer to 500ns with this optimization. This is for the read data transfer phase while running SCSI Director.

Read without reset optmization
Read with reset optimization

@mactcp
Copy link
Author

mactcp commented Mar 16, 2022

I just pushed some changes around GPIO setup.

  1. DB_MODE_OUT was set to 1, with a comment saying that 1 is OD. 1 is actually PUSH/PULL 10MHz. I have changed DB_MODE_OUT to 3 (PUSH/PULL 50MHz), and updated the comment with correct OD value of 7. I'm not expecting this to actually affect anything, as nothing on the DB pins is coming close to 10MHz speed.

  2. For DB_MODE_IN 8 (PU/PD), I've added code to make sure the PU resistors are enabled. Without this the weak PD resistors could be enabled on some of the data lines depending on what data was last written. For correctly terminated SCSI busses, this shouldn't affect anything, but it might help make things more robust for weak or missing termination.

  3. Removed a redundant SCSI_OUT(vBSY,inactive). The following gpio_mode(BSY, GPIO_INPUT_PU) sets the output pin to inactive to enable the PU resistor.

  4. Changed REQ to be PUSH/PULL when the target is active and DB_MODE_OUT is PUSH/PULL, and added support for it to be floating if the target is inactive and DB_MODE_IN is floating. Using PUSH/PULL for REQ is in line with what fast SCSI implementations use. I've attached a scope trace, and you can see the faster rise time and slightly higher level of the REQ waveform (yellow) looks more like the ACK waveform (blue) from the Mac (Quadra 650). I'm seeing a minuscule performance increase on my system, but this could help more significantly on systems with weak termination, like the PowerBook version of BlueSCSI.

REQ with PP

@erichelgeson
Copy link
Owner

Great analysis - and a good idea to validate those assumptions from the port over/translation. I've been stress testing this branch for a few days and the changes seem stable. I'm going to give it a test in my 6100 who was always very picky.

@mactcp
Copy link
Author

mactcp commented Mar 17, 2022

I happened to try this build with a 7100 yesterday, and it worked OK for me. However I don't have any termination resistors installed on the BlueSCSI, and I think a lot of Quadra and newer Macs have some sort of automatic termination on the motherboard.

One concern I have about the current termination resistor design on the BlueSCSI is disconnecting them. Disconnecting the power doesn't disconnect the resistors from the bus, so there can still be a significant amount of resistance connecting between active bus pins that could be interfering.

I have been thinking that for a future revision of the board it could be interesting to switch to using a pair of these $0.84 termination resistors and making them socketed:
https://www.mouser.com/ProductDetail/Bourns/4611X-104-221-331L?qs=JpXwipsyo8Dfi7JiCvdcWg%3D%3D

@erichelgeson
Copy link
Owner

I've actually been looking at the same ones for the next rev, and socketed makes sense - I have noticed that even with term off there is resistance on the line. Again I appreciate the detailed analysis and feedback.

My 6100 booted straight away and currently running through a stress test. Seems stable so far.

mactcp added 3 commits March 17, 2022 16:36
…op overhead in to one of the desired delays.
First pass at write speed optimization, now getting 825 KB/sec on LC III+.
Whitespace cleanup.
@mactcp
Copy link
Author

mactcp commented Mar 18, 2022

I was feeling like I was playing Whac-A-Mole with tuning the timing, until I pinned down the alignment of the writeDataLoop and readDataLoop functions. If the WAIT loops are not self contained within an 8 byte flash prefetch buffer, that seems to add 2 wait cycles per loop. Hopefully this might also be more robust against other code changes or different compiler versions. Hand tuned assembly for these two functions might be the most robust solution longer term.

This is looking good to me now, benchmarked at 1228 KB/sec read and 848 KB/sec write on an LC III+. PowerMac 7100/80 benchmarked at 1266 KB/sec read and 1253 KB/sec write.

I have a bunch of other optimization ideas to pursue, but I'll start a new PR for anything else.

@erichelgeson
Copy link
Owner

Nice work finding that 2 cycle issue - that likely explains the issues/differences between gcc versions. Currently pinning to toolchain-gccarmnoneeabi@1.60301.0 - I'll try out the latest and see if it makes a difference.

I have a few feature's I'll be PRing this week, and I should plan to cut a release sometime, but TBD.

Thanks again for contributing!

@erichelgeson erichelgeson merged commit 7eb1c0d into erichelgeson:main Mar 19, 2022
erichelgeson pushed a commit that referenced this pull request Nov 4, 2022
RP2040: Fix SCSI2 mode not getting enabled due to ATN handling
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants