Pixel line skew between top and bottom panel halves when horizontal scrolling - P3 #338

Closed
xemjeff opened this issue Oct 25, 2022 · 67 comments
Labels
not an issue with library This library works as expected, but something else is the root cause, such as AdaFruitGFX

Comments

@xemjeff

xemjeff commented Oct 25, 2022

When horizontal scrolling, the P3 (64x32) display shows an offset between panel top and bottom at the halfway mark.
I have 3 panels chained. The offset increases the faster the scroll. In the image below, I am writing out the lower case letter 'L' with a delay of 4ms between scroll shifts (one pixel to the left each cycle, blank the screen, redraw the text).

Seems like it's related to #133, but it does not happen with a static display - and it gets worse the faster the scroll.

Any thoughts on what to adjust or where to look?
[Image: pixelSkew-2]

The P3 panel has this chip:
CHIPONE ICN2037BP

Here's a link to the docs for the chip:
https://olympianled.com/wp-content/uploads/2021/05/ICN2037_datasheet_EN_2017_V2.0.pdf

I tried changing the driver to ICN2038S, but that does not help - same effect.

@DarrylStrong

This is due to the panels being two separate halves, top and bottom. As the eye follows the text across the screen, there is a difference in where the scans are. The top line will be in the same position as the top of the bottom half.

You can reduce the effect by getting the scan rate as high as possible (increase the clock frequency and reduce the colour depth) but it will still be there.

The other option for scrolling is to do bit manipulation on the DMA buffers themselves. I have modified the library to make the buffers public so I can rotate the colour bits without moving the row select bits, which allows scrolling without having to redraw the whole display.
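
A rough sketch of that buffer-rotation idea, not the library's actual code. The layout assumed here (one 16-bit word per pixel column, six colour bits R1 G1 B1 R2 G2 B2 in the low bits, control bits such as row select, latch and OE above them) and the names `kColourMask` and `rotateColourBitsLeft` are assumptions for illustration only:

```cpp
#include <array>
#include <cstddef>
#include <cstdint>

// Assumed layout: bits 0-5 carry R1 G1 B1 R2 G2 B2; the higher bits carry
// control signals (row select, latch, OE) that must stay where they are.
constexpr std::uint16_t kColourMask = 0x003F;

// Shift the colour bits of one row buffer one pixel to the left, wrapping the
// first pixel round to the end. The control bits of every word are untouched.
template <std::size_t N>
constexpr std::array<std::uint16_t, N> rotateColourBitsLeft(std::array<std::uint16_t, N> row) {
    if (N == 0) return row;
    const std::uint16_t first = row[0] & kColourMask;
    for (std::size_t x = 0; x < N; ++x) {
        // Colour of the right-hand neighbour (still unmodified at this point).
        const std::uint16_t next = (x + 1 < N) ? (row[x + 1] & kColourMask) : first;
        row[x] = static_cast<std::uint16_t>((row[x] & ~kColourMask) | next);
    }
    return row;
}
```

Because only the masked bits move, the row address and latch pattern stored in each word stay in step with the scan, which is presumably why this is cheaper than a full redraw on a wide display.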

@mrcodetastic
Owner

mrcodetastic commented Oct 25, 2022

Hi @xemjeff, I've noticed this as well.

What version of the library are you using? 2.0.7 or what's in the repository?

What ESP32 hardware variant?

Oh, and are you using double buffering?

@xemjeff
Author

xemjeff commented Oct 25, 2022

This is due to the panels being two separate halves, top and bottom. As the eye follows the text across the screen, there is a difference in where the scans are. The top line will be in the same position as the top of the bottom half.

You can reduce the effect by getting the scan rate as high as possible (increase the clock frequency and reduce the colour depth) but it will still be there.

The other option for scrolling is to do bit manipulation on the DMA buffers themselves. I have modified the library to make the buffers public so I can rotate the colour bits without moving the row select bits, which allows scrolling without having to redraw the whole display.

Thanks for your response @DarrylStrong. As you say the first rows (top half, bottom half) would be aligned. But that's not what we see here. Unless I misunderstand you, row 0 and row 16 are shifted in and latched at the same time.

@DarrylStrong

Yes, they are shifted at the same time, but row 15 isn't; it is one full scan behind.
Hence the step.

@xemjeff
Author

xemjeff commented Oct 25, 2022

Version (from library.json) is 2.0.6.
ESP32 VROOM 32D

I am using flipBuffer(), so I guess that's the GFX version of double buffering. Is there perhaps double buffering in the library as well?

@mrcodetastic
Owner

Can you post a test sketch?

@xemjeff
Author

xemjeff commented Oct 25, 2022

@DarrylStrong OK, I don't understand what you mean - forgive my ignorance.
Rows 0 to 15 are sync'd; rows 16 to 31 look like they are one pixel ahead.
But at slower speeds (40 ms between updates), it looks more like 1/2 pixel ahead. And statically, there is no shift.

@xemjeff
Author

xemjeff commented Oct 25, 2022

@mrfaptastic ok - I'll put something simple together to show the effect.

@mrcodetastic
Owner

This is due to the panels being two separate halves, top and bottom. As the eye follows the text across the screen, there is a difference in where the scans are. The top line will be in the same position as the top of the bottom half.

It's not human, there seems to be some weird delay between what is pumped out to the RGB1 vs RGB2 pins.

I don't know if there's some hidden byte ordering issue or DMA update internal silicon delay. Or perhaps the boards are slow to latch the bottom half vs. top.

@DarrylStrong

It is because your eye tracks across the LEDs as the text moves. That is what causes scrolling text on 7-row signs to tilt.

@DarrylStrong

DarrylStrong commented Oct 25, 2022

This is due to the panels being two separate halves, top and bottom. As the eye follows the text across the screen, there is a difference in where the scans are. The top line will be in the same position as the top of the bottom half.

It's not human, there seems to be some weird delay between what is pumped out to the RGB1 vs RGB2 pins.

I don't know if there's some hidden byte ordering issue or DMA update internal silicon delay. Or perhaps the boards are slow to latch the bottom half vs. top.

I have been working in the LED display industry for thirty years. Statically driven LED matrices smear when scrolled, row-scanned ones tilt, and column-scanned ones get shorter. This effect is something we had to work around on one of our 16-row boards, which was scanned 8:1. We had to artificially shift one set of 8 rows sideways to remove the step.

@board707
Contributor

board707 commented Oct 25, 2022

This is just a low scan rate. I have seen this many times.
To avoid it, you have to make the scan rate significantly faster than the scrolling. @xemjeff used 40ms as the scroll delay, so the scroll rate is 25 fps. To eliminate the effect, he needs to update the panel 200-300 times per second.

Addition
@xemjeff
I read above that you tried to scroll with a delay of 4ms - this is too fast. In order for the picture to remain without distortion, with such a scroll, it is necessary to update the image more than 1000 times per second.
I think you should reduce the scroll speed to 20-30ms inter-scroll delay
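
To put numbers on the scroll-rate versus scan-rate comparison above, here is a small sketch of the arithmetic. The helper names are made up for illustration, and using "refreshes per scroll step" as the figure of merit is an assumption rather than anything the library defines:

```cpp
// One-pixel scroll steps per second for a given delay between steps.
constexpr double scrollStepsPerSecond(double scrollDelayMs) {
    return 1000.0 / scrollDelayMs;
}

// How many full panel refreshes happen while the image sits at one position.
constexpr double refreshesPerScrollStep(double refreshHz, double scrollDelayMs) {
    return refreshHz * scrollDelayMs / 1000.0;
}
```

With a 40 ms delay this gives 25 steps per second, so a 250 Hz refresh shows each position 10 times. With a 4 ms delay there are 250 steps per second, and even a 1000 Hz refresh shows each position only 4 times.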

@xemjeff
Author

xemjeff commented Oct 25, 2022

Here's a test sketch. The skew only happens at the mid panel. I can run 6,8 ms and there is no skew if I display on the top half or the bottom half - as long as it does not cross the middle row.

@board707 the skew still happens at 20-40 ms.

@mrfaptastic this is using double buffering from the library (set to "true")

Thanks to everyone for suggestions. Hopefully the test sketch will help.
PixelSkew.zip

@mrcodetastic
Owner

This is just a low scan rate. I have seen this many times.
To avoid it, you have to make the scan rate significantly faster than the scrolling. @xemjeff used 40ms as the scroll delay, so the scroll rate is 25 fps. To eliminate the effect, he needs to update the panel 200-300 times per second.

Right, so basically the DMA buffer is being output to the panels more slowly than the CPU is updating it, essentially causing tearing. Hmmmm.

@DarrylStrong

So this is why I don't see it as badly then as I move the dma buffers for scrolling rather than redrawing the whole screen.

@xemjeff
Author

xemjeff commented Oct 25, 2022

Why is the tearing always at the mid-line? The data are shift loaded in parallel, the OE and LATCH are used for both.
From what I see, the bottom is updated before the top and the skew distance changes with scrolling speed.

If the CPU is updating the buffer at varying rates based on scroll rate, wouldn't the tear change row location?

@xemjeff
Author

xemjeff commented Oct 25, 2022

I think I understand what's going on:
The lower and upper panels are drawn at the same time. But the line is moving.
It takes a certain amount of time (t_k) to move through the 16 rows. What we "see" are two slanted lines, and of course they don't connect. (see figure below). The faster the perceived movement, the greater the perceived slant.
I took slow-motion video of the display and it's fine. The pixels on top and bottom line up.

[Image: SkewMidLine]

If this is correct, then it's not tearing. I'm not sure if increasing the update rate would help.
Let me know if you concur.
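
The model described in this comment can be sketched numerically. This is only an illustration under stated assumptions: a 1/16 scan in which rows y and y + 16 are lit together in the order 0..15, an eye that tracks the scroll perfectly, and a refresh rate that is a free parameter (the thread does not state the actual one). All names are hypothetical:

```cpp
constexpr int kRowsPerHalf = 16;  // assumed 1/16 scan on a 32-row panel

// Time (seconds, within one refresh) at which physical row y is lit,
// assuming rows y and y + 16 are driven together in the order 0..15.
constexpr double rowLitTime(int y, double refreshHz) {
    return (y % kRowsPerHalf) / (refreshHz * kRowsPerHalf);
}

// Apparent horizontal displacement (pixels) of row y for an eye that follows
// the scroll: the eye has moved this far by the time the row lights up.
constexpr double perceivedShiftPx(int y, double scrollPxPerSec, double refreshHz) {
    return scrollPxPerSec * rowLitTime(y, refreshHz);
}

// Apparent step between the last row of the top half (15) and the first row
// of the bottom half (16).
constexpr double midlineStepPx(double scrollPxPerSec, double refreshHz) {
    return perceivedShiftPx(kRowsPerHalf - 1, scrollPxPerSec, refreshHz)
         - perceivedShiftPx(kRowsPerHalf, scrollPxPerSec, refreshHz);
}
```

Under these assumptions, a 4 ms scroll delay (250 px/s) with an assumed 250 Hz refresh gives a step of about 0.94 pixels, while a 40 ms delay (25 px/s) gives about 0.09 pixels. In this model the step scales with scroll speed divided by refresh rate.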

@DarrylStrong

That's exactly what I was trying to explain.

Hence why we had to offset the bottom half of our display by one pixel.

@xemjeff
Author

xemjeff commented Oct 25, 2022

@DarrylStrong Yes - I understand now - it just took me a while :-)

I tried offsetting the bottom panel by one pixel. That works at the fast speeds.
Then of course, as we slow down the scroll, we get the reverse problem - skewing in the other direction.
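
A minimal sketch of that bottom-half offset as a coordinate remap. The function name and parameters are hypothetical. The sign follows the reports in this thread that the bottom half appears ahead in the direction of travel, so flip it if a panel behaves the other way:

```cpp
constexpr int kHalfHeight = 16;  // rows per half on a 64x32 panel

// X coordinate to draw at so that the bottom half lines up while scrolling.
// direction: -1 for right-to-left scrolling, +1 for left-to-right, 0 if static.
// offsetPx:  compensation in pixels (1 in the experiment described above).
constexpr int compensatedX(int x, int y, int direction, int offsetPx) {
    // Only rows in the bottom half are moved, and they are moved backwards
    // relative to the direction of travel.
    return (y >= kHalfHeight) ? x - direction * offsetPx : x;
}
```

As noted above, a fixed offset overcorrects once the scroll slows down, so `offsetPx` would have to depend on the scroll speed to work across speeds.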

I'm curious why manipulating the DMA buffer would make any difference. This would still result in the slant effect, no?

Added:
We could, I suppose, update the rows top to bottom for the top half, and bottom to top for the bottom half. They would then meet in the middle at the same time. But that might look a bit weird, and I'm not sure the display supports that.

@DarrylStrong

I didn't explain myself very well 😊

It is just faster manipulating the buffers, I am using the library for 384x64 displays so the redraw takes a while 😉

@DarrylStrong

I didn't explain myself very well 😊

It is just faster manipulating the buffers, I am using the library for 384x64 displays so the redraw takes a while 😉

It also helped, when we did this pixel shift, that the scroll speed was synchronised with the scan rate.

@board707
Contributor

The lower and upper panels are drawn at the same time. But the line is moving.
It takes a certain amount of time (t_k) to move through the 16 rows. What we "see" are two slanted lines, and of course they don't connect. (see figure below). The faster the perceived movement, the greater the perceived slant.
I took slow-motion video of the display and it's fine. The pixels on top and bottom line up.

But I am wondering why double buffering does not fix these artefacts. As far as I understand it, with double buffering the whole picture should be updated at one time.

@DarrylStrong

It is written at the same time but your eyes follow the image across so see the columns in different positions as the text scrolls. Of course there are only two rows on at a time... Refer to the fantastic diagram above 😊

@xemjeff
Author

xemjeff commented Oct 25, 2022

expanding on @DarrylStrong's explanation:

Your eye is following the line as it moves. The pixels are displayed one row at a time (in my case, one pixel per row).
From one row to the next, your eye has moved, following the line. This means that the next row is shifted on your retina by some small amount. The same goes for the next pixel. This results in the perception of a slanted line.

@mrcodetastic
Owner

mrcodetastic commented Oct 25, 2022

Version (from library.json) is 2.0.6. ESP32 VROOM 32D

I am using flipBuffer(), so I guess that GFX version of double buffering. Is there perhaps double buffering also in the library?

That's your problem. You need to use 2.0.7

If you are using double buffering, and you give a few milliseconds from when you stop drawing to when you call 'flipDMABuffer', then you shouldn't have this issue... but only if you're using 2.0.7

We had this problem previously and I found a way to fix it in 2.0.7

Don't use the latest git version either as it's probably broken again.

@board707
Contributor

Your eye is following the line as it moves...

If it was only an optical effect, we would not see a shift in the photo, as in the first message.
I just tested scrolling on a chain of two 64x32 RGB matrices with a scroll delay of 5ms and a picture update rate of 200fps - I don't see anything similar to your picture. The scrolling letters are absolutely straight vertically.

@xemjeff
Author

xemjeff commented Oct 25, 2022

@mrfaptastic I tried 2.0.7, and introduced 2ms and 4ms delay before flipping the buffer. (dbuff = true). Same shift effect.

@board707 :
did you test with the test sketch I provided, or some other implementation? If different, could you post that?
what version of the lib? 2.0.6, 2.0.7, etc?
where do you set 200FPS - is that the min_refresh_rate? I tried 60, 85, 200 and saw no difference.

Added: I just tested with 2 chained boards instead of 3 (64x32). Same problem.

@board707
Contributor

did you test with the test sketch I provided, or some other implementation?

Sorry, I tested just to make sure it's not an optical effect. I used a completely different environment - 2x 64x32 RGB panels with an RP2040 board and a specific library.
(This is how the letters look when scrolling from left to right at a speed of 5ms/pixel.)
P_20221026_010240_small

@xemjeff
Author

xemjeff commented Oct 25, 2022

@board707 has a good point. Why would it show up skewed in the slo-mo video?

I changed the code to draw a 2 pixel vertical line across the mid row.
When I shoot that video in slow motion, I see the two pixels skewed.

twoPixelVLine

@xemjeff
Author

xemjeff commented Oct 26, 2022

I ran another test:

Two pixel, as above, but from left to right rather than right to left.
The skew is now reversed (the 2nd pixel now to the right of the first pixel).

My conclusions.

  • The skew is because the 2nd pixel is drawn right after the first, but in the next column as the scan begins again.

  • I don't think the library is the cause.

  • It's not the P3 panels. I ran the same test on P6 panels (32x32 x 3) with the same effect.

  • The effect arises because the bottom row of the top sub-panel at time t_k is followed immediately by the top row of the bottom sub-panel at time t_k+1.

I am curious as to why @board707 is not seeing this effect.

@mrcodetastic
Owner

Please try again with the latest git version. The skew with double buffering still exists, as this is an optical illusion.

When turning off double buffering you don't see the skew, as the whole panel is skewed because the drawing by the CPU is occurring as you see it, from top to bottom.

@xemjeff
Author

xemjeff commented Mar 14, 2023

I just built from master branch.
Same thing. No skew, shift or even apparent slant with double buffering off.

In the source for the ESP32-VirtualMatrixPanel-I2S-DMA.h file, I noticed several attempts (commented out) dealing with the timing of the buffer flip

    inline void flipDMABuffer()
    {
        if ( !m_cfg.double_buff) { return; }

        // while (active_gfx_writes) { } // wait a bit ?
        // initialized = false;
        dma_bus.flip_dma_output_buffer( back_buffer_id );
        // initialized = true;

        /*
        i2s_parallel_set_previous_buffer_not_free();
        // Wait before we allow any writing to the buffer. Stop flicker.
        while(i2s_parallel_is_previous_buffer_free() == false) { }

        i2s_parallel_flip_to_buffer(ESP32_I2S_DEVICE, back_buffer_id);
        // Flip to other buffer as the backbuffer.
        // i.e. Graphic changes happen to this buffer, but aren't displayed until flipDMABuffer() is called again.
        back_buffer_id ^= 1;

        i2s_parallel_set_previous_buffer_not_free();
        // Wait before we allow any writing to the buffer. Stop flicker.
        while(i2s_parallel_is_previous_buffer_free() == false) { }
        */
    }

@mrcodetastic
Owner

mrcodetastic commented Mar 14, 2023

One thing we aren't testing in the example is the vertical line going from left to right and then right to left.

If the pixel offset is the same both directions then perhaps it is a data sync issue.

If the pixel offset changes the other way around, then that proves it's an optical illusion to do with the row scanning. If that's the case, then there's no fix other than to offset all fast coords in the bottom half of a panel.

@xemjeff
Author

xemjeff commented Mar 14, 2023

@mrfaptastic : I've tested in both directions. The shift follows the direction. But I'm not convinced this is an optical illusion.
If it were, then the first image in this thread (a photo still) would not capture the shift.

First, I'd like to understand more about how to time the buffer flip. In my app, I'm flipping buffers every frame, each time I redraw the display, shifting the column offset by 1. Should that be synced with the DMA transfer?

Second, we could change the order in which the A,B,C,D pattern is output with matching row data.
The sequence now is row = [0,1,2,3 ..15] with RGB1 and the matching row at row+16 using RGB2.
If we move this to start in the middle of the top and bottom, we would use [8,7,9,6,10,5,11,4,12,3,13,2,14,1,15,0] for RGB1 and again row+16 for RGB2. That would remove the time difference between rows 1 and 16 - they would be adjacent in the output sequence.
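
For reference, the middle-out sequence proposed here can be generated rather than hard-coded. This is only a sketch of the ordering itself; the names are hypothetical and nothing here touches how the library actually builds its DMA descriptors:

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t kRowsPerHalf = 16;

// i-th row address in the middle-out order 8, 7, 9, 6, ..., 15, 0.
// Even slots walk down the panel from row 8, odd slots walk up from row 7.
constexpr int middleOutRow(std::size_t i) {
    return (i % 2 == 0) ? static_cast<int>(kRowsPerHalf / 2 + i / 2)
                        : static_cast<int>(kRowsPerHalf / 2 - 1 - i / 2);
}

// Full scan order for the top half; the bottom half would use row + 16.
constexpr std::array<int, kRowsPerHalf> middleOutOrder() {
    std::array<int, kRowsPerHalf> order{};
    for (std::size_t i = 0; i < kRowsPerHalf; ++i) {
        order[i] = middleOutRow(i);
    }
    return order;
}
```

In this order, row 15 and row 0 (and with it row 16, which is driven together with row 0) occupy the last two slots of the scan.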

Let me know which direction (or both) I could follow. I'm happy to do the work with your guidance in a forked repo and issue a PR when it's working.

/Jeff

@board707
Contributor

Should that be synced with the DMA transfer?

No, it's the responsibility of the library.

The shift follows the direction.

Is the shift still the same on whole height of the line? Or, are the both lines ( above and below middle-level) straight vertical?

@xemjeff
Author

xemjeff commented Mar 14, 2023

@board707 This is hard to tell - but I see the lines are always vertical. I captured video with an iPhone at 240fps (slo-mo), and then extracted frames using ffmpeg. Here is a sequence.
ShiftSequence-240fps.zip

@mrfaptastic : Maybe we could capture this with a logic analyzer and provide a buffer switch GPIO output - high for buffer1, low for buffer2. Then look at the outputs of A,B,C,D and ensure that buffers only switch at transition from 1111 to 0000. If that were verified, then we would know the buffer switch is not the cause. Let me know what you think.
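
The check proposed here could be run offline over an exported capture. The sketch below assumes the trace has already been decoded into samples of the row address (from A, B, C, D) plus the level of the hypothetical buffer-indicator GPIO; the `Sample` type and the function name are invented for illustration:

```cpp
#include <cstddef>
#include <cstdint>

// One decoded logic-analyzer sample.
struct Sample {
    std::uint8_t rowAddr;  // 0..15, decoded from the A, B, C, D lines
    bool bufferPin;        // hypothetical debug GPIO: high = buffer 1, low = buffer 2
};

// Returns true if every change of the buffer pin coincides with the row
// address wrapping from 15 (1111) to 0 (0000), i.e. at a frame boundary.
constexpr bool flipsOnlyAtFrameBoundary(const Sample* samples, std::size_t count) {
    for (std::size_t i = 1; i < count; ++i) {
        const bool flipped = samples[i].bufferPin != samples[i - 1].bufferPin;
        const bool wrapped = samples[i - 1].rowAddr == 15 && samples[i].rowAddr == 0;
        if (flipped && !wrapped) {
            return false;  // buffer switched in the middle of a scan
        }
    }
    return true;
}
```

A `false` result would point at the buffer switch landing mid-scan. A `true` result would only rule that one cause out.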

@mrcodetastic
Owner

@xemjeff That would be useful to double-confirm it is an issue.

From a code perspective, one option is to start from scratch using this example that Espressif's Sprite_TM created (which was actually the genesis code for the creation of this library).

Hack it right back to the bare bones just to draw a line and see if the same issue happens as well:

https://www.esp32.com/viewtopic.php?f=17&t=3188

If I find time I will try to look at this, as an intellectual curiosity more than anything. A time sink this will be. Curious to see if we have this same issue on the ESP S2 and S3 devices now.

@mrcodetastic
Owner

mrcodetastic commented Mar 15, 2023

Actually, your buffer switch GPIO suggestion I'll have to incorporate into any new test case based on my comment above. So don't hack the library and bother doing this yourself.

@mrcodetastic
Owner

mrcodetastic commented Mar 16, 2023

Spent hours on this and created a basic example that only barely works. Seems I don't get the skew issue for double buffering now.... hmmm...

Example uses default ESP32 pin connections.

ESP32_HUB75_DoubleBufferTearingTest_2.zip

@xemjeff
Author

xemjeff commented Mar 16, 2023

@mrfaptastic Thanks for the zip file and for putting the time into this. I know it's a nagging problem and a time sink.
This example looks like the anim ESP-IDF 64x32 LED example you mentioned in an earlier post.
I'll download and try it on my board (different GPIO outputs) in the morning.

Also, I've ordered a USB logic analyzer with 16 channels and a 400 MHz sampling rate to get a look at the signals. This would allow me to compare this working model with the output from the shifted version. Still, I will need to add a GPIO for a buffer swap indicator.

@mrcodetastic
Owner

@mrfaptastic Thanks for the zip file and for putting the time into this. I know it's a nagging problem and a time sink. This example looks like the anim ESP-IDF 64x32 LED example you mentioned in an earlier post. I'll download and try it on my board (different GPIO outputs) in the morning.

@xemjeff - Feel free to run it, but all it does is confirm that your persistence has paid off. There is an issue somewhere, and I'm sure it's a simple fix; the problem is finding the root cause.

Over the coming days I will look to see if it's an issue that has been inadvertently introduced into this library over time, e.g. by testing the line scrolling example with version 1.1.0 of this library - the problem is that that version of the library doesn't compile anymore, as Espressif have changed the IDF so much! Will work on it over the coming days.

Also, I've ordered a USB logic analyzer with 16 channels and a 400 MHz sampling rate to get a look at the signals. This would allow me to compare this working model with the output from the shifted version. Still, I will need to add a GPIO for a buffer swap indicator.

Wow. You mean business!

@xemjeff
Author

xemjeff commented Mar 16, 2023

@mrfaptastic The logic analyzers are fairly inexpensive. My $12 version (24 MHz) is not up to the task, though I find it very handy for protocol debugging. I really appreciate your persistence and patience with this issue.

@mrcodetastic
Owner

mrcodetastic commented Mar 16, 2023

So I used the very earliest version of this library I ever made, and it has this same issue.

It shares a lot of code with that example in the zip file I provided, which doesn't have this offset issue.

The bug hunt continues.

@mrcodetastic
Owner

mrcodetastic commented Mar 16, 2023

OK. Here's PlatformIO Project (Arduino based) which I would like you to test when you get your logic analyzer. As far as I can see I have fixed the skew. That work OK?

ESP32_HUB75_ArduinoDoubleBufferTearingTest.zip

@xemjeff
Author

xemjeff commented Mar 16, 2023

got it - thanks. I'll test with my board, and also check it out with the logic analyzer when that arrives.

@mrcodetastic
Owner

got it - thanks. I'll test with my board, and also check it out with the logic analyzer when that arrives.

No problems. In any case I've decided to work on a new fork of this library that rewrites the core bitplane buffer alloc, DMA linked list creation and ordering (to try implement your suggestion as well about the row output ordering) etc. This should be transparent to most people, but fix this issue hopefully.

Will see what this brings, hopefully not new issues 🤣

@xemjeff
Author

xemjeff commented Mar 17, 2023

@mrfaptastic Hate to bug you with this, but I can't seem to get your test example ESP32_HUB75_ArduinoDoubleBufferTearingTest.zip to run properly. I've built it both with PlatformIO (Arduino) and with the plain Arduino IDE. In both cases, I'm getting all GPIO signals except the CLK. Not sure why. When I re-flash the same boards with the old code, the CLK works. Same pin. Here's the logic analyzer capture at 16MHz (below). Notice that A, B, C, D, LAT and OE work as expected, but there is no CLK. Is there something special needed to activate the CLK generation through DMA that I'm missing?

[Image: logic analyzer capture]

I verified the CLK pin and speed with serial debug output:
Using pin 15 for the CLK_PIN
dma clock speed: 10000000

mrcodetastic added a commit that referenced this issue Mar 18, 2023
mrcodetastic added a commit that referenced this issue Mar 18, 2023
Randomise rows.
@mrcodetastic
Owner

Have implemented various hacks. The rewrite didn't solve the issue, but I learnt what one of the causes of the problem was - it was some weird bug.

Have added randomisation. It helps a bit, but it'll never be perfect for fast moving stuff.

mrcodetastic added a commit that referenced this issue Mar 19, 2023
mrcodetastic added a commit that referenced this issue Mar 19, 2023
New compile time option: ROW_SCAN_SHUFFLE

Don't update rows in sequential order.
@mrcodetastic
Owner

mrcodetastic commented Mar 19, 2023

Try compiling with a global define called 'ROW_SCAN_SHUFFLE' defined and try without this being defined.

@xemjeff
Author

xemjeff commented Mar 19, 2023

I downloaded the master branch with your recent changes. This is what I've observed:

a) The single line scroll is improved with ROW_SCAN_SHUFFLE not defined, in double buffer mode. The skew is still there, but less so.

b) In my main application with ROW_SCAN_SHUFFLE undefined, I am seeing a lot of flickering. This might be due to changes from 2.0.6 to 3.0 so I need to test that next. In my app, I am scrolling two lines at different speeds (top/bottom halves separately). That worked perfectly in 2.0.6.

c) In my main application with ROW_SCAN_SHUFFLE defined, I am seeing ghosting on the rows. Faint images of the text showing above or below the text line display.

I can capture videos and share if that will help. (flicker, ghosting, etc)

@mrcodetastic
Owner

mrcodetastic commented Mar 19, 2023

Ok so a) is it then.

The shuffle idea doesn't look like it works. Perhaps the panels don't like being driven in random orders electrically, and some residual capacitance issue comes out of the woodwork, causing ghosting. You can fiddle with the order in the code.

I don't see the other issues however, so I'll leave it to the signal diagnosis to see if there are any other issues beyond the optical illusion caused by row scanning.

@xemjeff
Author

xemjeff commented Mar 20, 2023

I got the Platform.io version working ESP32_HUB75_ArduinoDoubleBufferTearingTest.zip that you posted 3 days ago.

That version has no skew with a 10ms delay between columns shifts, so I'll build upon that. My application is quite simple - 3 horizontal panels.

Thanks for your patience and perseverance through this issue.

@mrcodetastic
Owner

mrcodetastic commented Mar 20, 2023

I'm not sure how you get no skew; I still see it (but nowhere near as bad). I think it comes down to how quickly the app refreshes and the timing - so it's not consistent. Glad it worked, however.

@mrcodetastic
Owner

@xemjeff - Uploaded the .zip code as a separate repo @ https://github.com/mrfaptastic/ESP32-HUB75-MatrixPanel-DMA-Lite

Did some minor optimisations etc.

@xemjeff
Author

xemjeff commented Mar 21, 2023

@mrfaptastic Thanks for that. I posted a minor issue there regarding initialization of the gpio_matrix for the clock line. Verified using my newly arrived logic analyzer.
