Bad jpeg format returned from esp_camera_fb_get() #162

Closed
jameszah opened this issue Jul 21, 2020 · 6 comments

Comments

@jameszah

Hi, I have a program that makes an MJPEG avi on an ESP32-CAM board, and I have been trying to resolve the issue of too much data for the jpeg. This happens if you have a high quality setting like 6 or 8 and then expose the camera to bright sun: you only get 3/4 of the image. My solution was to configure the camera with a high quality (a low number, like 6), then switch to a lower quality (like 10 or 12) when I am recording. I think the initial config allocates the buffers, and the lower quality photos then do not exceed those buffers. This works okay, but whenever a single jpeg exceeded its allowable size, it ruined an entire .avi with 36,000 pictures or more, as the avi player on the computer cannot understand the format and will try to rebuild the index, ...

So I wrote this bit of code to select only good jpegs:

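   do {
          fb = esp_camera_fb_get();
          if (!fb || fb->len < 2) {            // capture failed, or frame too short to inspect
            if (fb) esp_camera_fb_return(fb);
            continue;                          // try again
          }
          int x = fb->len;
          if (fb->buf[x-1] != 0xD9){           // last byte is not the D9 of an FF D9 marker
            bad_jpg++;
            Serial.print("Bad jpg, frame # = "); Serial.print(frames_so_far); 
            Serial.print(", len= ");
            Serial.print(x); Serial.print(" "); 
            Serial.print(fb->buf[x-2],HEX); Serial.print(":");   // final two bytes, in hex
            Serial.println(fb->buf[x-1],HEX);
            esp_camera_fb_return(fb);          // give the buffer back and grab another frame
          } else {
            break;                             // frame ends in D9, keep it
          }
        } while (1);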

The code checks that the last byte of the jpeg is 0xD9, the second byte of the 0xFF, 0xD9 end-of-image code, and for each frame that fails it prints the frame length and the final two bytes.

What I discovered is that about 5% of images do not have the end-of-image code at the end of the image. This recording is a vga or svga video of my ceiling in a medium dark room at quality 10 or 12, so it is not full of light and bright colors, just ceiling tiles. In some cases there is a D9 one byte off; in other cases there is random data. But the frame sizes are usually the same.

I am surprised that this many jpegs are wrong, as I have only seen rare problems with bright sunlight when using too high a quality (high quality = low number). The jpeg and avi viewer software usually handles these problems without complaining, ... until a real problem occurs and 1/4 of the frame is gone, etc.

So my question is: does anybody know the jpeg creation code, and whether there are problems with it?

In these 3 examples, the camera is initially configured for the following, then framesize and quality adjusted to record a particular video.

  config.pixel_format = PIXFORMAT_JPEG;
  config.frame_size = FRAMESIZE_UXGA;          
  config.jpeg_quality = 6;  
  config.fb_count = 7;

svga, quality 8

11:38:22.129 -> Bad jpg, frame= 367, len= 18400 98:A0
11:38:22.533 -> Bad jpg, frame= 371, len= 18400 73:43
11:38:23.851 -> Bad jpg, frame= 384, len= 17701 D9:0
11:38:27.925 -> Bad jpg, frame= 424, len= 18400 1:86
11:38:29.906 -> Bad jpg, frame= 444, len= 18400 60:14
11:38:33.818 -> Bad jpg, frame= 482, len= 18400 50:3
11:38:36.424 -> Bad jpg, frame= 508, len= 18400 3D:29
11:38:36.941 -> Bad jpg, frame= 513, len= 18400 9A:8E
11:38:40.113 -> Bad jpg, frame= 543, len= 17600 78:A4
11:38:41.195 -> Bad jpg, frame= 554, len= 18400 9A:6D
11:38:43.119 -> Bad jpg, frame= 573, len= 18400 3B:52
11:38:43.795 -> Bad jpg, frame= 580, len= 17600 94:64
11:38:46.428 -> Bad jpg, frame= 592, len= 17600 9E:B4
11:38:51.024 -> Bad jpg, frame= 638, len= 18400 2E:DA
11:38:51.600 -> Bad jpg, frame= 644, len= 18400 BB:69
11:38:53.421 -> Bad jpg, frame= 662, len= 17801 D9:0

svga, quality 12

13:52:45.975 -> Bad jpg, frame= 1132, len= 14400 2D:0
13:52:48.294 -> Bad jpg, frame= 1155, len= 14400 40:5
13:52:51.802 -> Bad jpg, frame= 1189, len= 14400 51:2D
13:52:52.073 -> Bad jpg, frame= 1192, len= 14400 FA:29
13:52:52.139 -> Bad jpg, frame= 1192, len= 14400 40:85
13:52:53.122 -> Bad jpg, frame= 1202, len= 14400 52:D0
13:52:53.190 -> Bad jpg, frame= 1202, len= 14400 14:84
13:52:53.258 -> Bad jpg, frame= 1202, len= 13701 D9:0
13:52:53.730 -> Bad jpg, frame= 1207, len= 14400 43:48
13:52:54.135 -> Bad jpg, frame= 1211, len= 14400 85:14
13:52:55.864 -> Bad jpg, frame= 1228, len= 14400 90:B
13:52:56.239 -> Bad jpg, frame= 1232, len= 14400 5A:4A
13:52:56.475 -> Bad jpg, frame= 1234, len= 13701 D9:0
13:52:57.265 -> Bad jpg, frame= 1242, len= 14400 0:A5
13:52:58.140 -> Bad jpg, frame= 1251, len= 14400 84:14
13:52:58.949 -> Bad jpg, frame= 1259, len= 14400 82:69
13:52:59.751 -> Bad jpg, frame= 1267, len= 14400 9:9A
13:53:00.156 -> Bad jpg, frame= 1271, len= 14400 68:1
13:53:03.051 -> Bad jpg, frame= 1298, len= 13600 1D:49
13:53:05.040 -> Bad jpg, frame= 1318, len= 13600 0:2D
13:53:06.116 -> Bad jpg, frame= 1327, len= 13600 9:45
13:53:06.584 -> Bad jpg, frame= 1332, len= 13600 A:28

vga, quality 10

14:38:17.335 -> Bad jpg, frame= 209, len= 9801 D9:0
14:38:21.904 -> Bad jpg, frame= 253, len= 10400 A5:14
14:38:22.212 -> Bad jpg, frame= 256, len= 10400 40:5
14:38:22.281 -> Bad jpg, frame= 256, len= 10400 69:C
14:38:23.977 -> Bad jpg, frame= 273, len= 9801 D9:0
14:38:24.688 -> Bad jpg, frame= 280, len= 10400 FA:95
14:38:25.263 -> Bad jpg, frame= 285, len= 10400 24:14
14:38:25.600 -> Bad jpg, frame= 288, len= 10400 C0:5A
14:38:25.668 -> Bad jpg, frame= 288, len= 10400 B0:A
14:38:27.844 -> Bad jpg, frame= 308, len= 10400 0:99
14:38:28.869 -> Bad jpg, frame= 316, len= 10400 30:A0
14:38:29.072 -> Bad jpg, frame= 317, len= 9801 D9:0
14:38:33.037 -> Bad jpg, frame= 349, len= 10400 40:84
14:38:36.583 -> Bad jpg, frame= 377, len= 10400 0:48
14:38:39.278 -> Bad jpg, frame= 398, len= 10400 40:F
14:38:41.888 -> Bad jpg, frame= 418, len= 10400 85:20
14:38:44.974 -> Bad jpg, frame= 440, len= 10400 5A:0
14:38:46.839 -> Bad jpg, frame= 457, len= 10400 52:58
14:38:47.839 -> Bad jpg, frame= 467, len= 10400 13:2D
14:38:50.308 -> Bad jpg, frame= 492, len= 10400 29:81
14:38:51.454 -> Bad jpg, frame= 503, len= 10400 28:0
14:38:54.171 -> Bad jpg, frame= 530, len= 11200 69:D4
14:38:54.273 -> Bad jpg, frame= 531, len= 10400 C0:28
14:38:55.259 -> Bad jpg, frame= 541, len= 10400 81:13
14:38:55.765 -> Bad jpg, frame= 546, len= 10400 40:5
14:38:56.168 -> Bad jpg, frame= 550, len= 10400 9D:2A

@jameszah
Author

jameszah commented Jul 23, 2020

It turns out that these jpegs are not bad, but have junk appended at the end after the FFD9 end-of-image code.

Whenever the FFD9 ends on the last byte, or the last byte but one, of a 16 byte block, there will be a bunch of data after the FFD9. It must be some unused memory allocation (giving those round numbers 10400, 14400, ...) that is sent along with the image rather than specifying exactly where the image ends.

If the FFD9 ends the group of 16, then there will be 96, 144, 160 more bytes ... a multiple of 16.
And if the FFD9 is last but 1 of a 16 group, then there will be 512+1 bytes after it, in the one example I have seen (must be all the 9801, 13701 examples above).

I cannot seem to find a length of the start-of-scan to end-of-image, so I guess the jpeg reader just takes all data available after the start-of-scan, and processes it, and when the data runs out before the end-of-image arrives, then you get the strange pictures. But that would mean you would have to scan the jpeg from the back looking for the end-of-image code to confirm that it is there, and the image isn't corrupted or truncated. Or maybe just scan back 512 bytes or so. It will be at the very end 95% ish of the time, and you could spend the cpu time to scan back 500 or 1000 bytes rather than discard the image and get a fresh one.

@me-no-dev
Member

you are all correct :) the code does go backwards to find the end of the image. It actually scans for FFD9 and zero after it.

@me-no-dev
Member

@jameszah
Author

Interesting ... I wonder why it checks for the two 00 bytes after the FF D9. In my examples where there is more data after the FFD9, they do not have two 00 bytes -- which must break that test to find the end of image, and it just gives the length of entire buffer. I'll have to look backwards at the code that generates that.

I re-did my example to look for these truncated frames, and searched backwards for the FFD9 by 1025 bytes to see if I could find the end-of-image. Using vga and quality 10 pointing at a bland ceiling, the camera gives about 5% frames with extra bytes after the FFD9, but no truncated frames - missing the FFD9. I'll have to take it outside to the bright sun, where I can usually find these problems.

But there is quite a bit of extra data tagging along if you save/transmit the entire jpeg including the bytes after the FFD9. This is what I got with my bland ceiling for vga quality 10 (running at 10 frames per second):

19:13:04.252 -> Len = 11200, Extra Bytes = 384
19:13:07.476 -> Len = 11200, Extra Bytes = 417
19:13:09.029 -> Len = 11200, Extra Bytes = 384
19:13:14.118 -> Len = 14400, Extra Bytes = 305
19:13:26.101 -> Len = 10400, Extra Bytes = 449
19:13:31.405 -> Len = 11200, Extra Bytes = 144
19:13:41.479 -> Len = 10400, Extra Bytes = 96
19:13:41.748 -> Len = 10400, Extra Bytes = 161
19:13:42.450 -> Len = 10400, Extra Bytes = 337
19:13:46.591 -> Len = 10400, Extra Bytes = 352
19:13:48.277 -> Len = 10400, Extra Bytes = 577
19:13:55.487 -> Len = 10400, Extra Bytes = 673
19:13:55.992 -> Len = 10400, Extra Bytes = 641
19:13:56.195 -> Len = 10400, Extra Bytes = 672
19:13:56.597 -> Len = 10400, Extra Bytes = 673

I think those extra-byte calculations are correct, but they do not match my rule of 16.
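
One way to test the extra-byte counts against the 16 byte block idea is to reduce each count modulo 16. The sketch below does that for the fifteen values in the log above. It runs on a PC, not on the camera, and `extra_bytes`, `count_residue` and `print_residues` are names made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Extra-byte counts copied from the log above (vga, quality 10). */
const int extra_bytes[] = {384, 417, 384, 305, 449, 144, 96, 161,
                           337, 352, 577, 673, 641, 672, 673};
const size_t n_extra = sizeof extra_bytes / sizeof extra_bytes[0];

/* Count how many of the n values leave remainder r when divided by 16. */
size_t count_residue(const int *values, size_t n, int r)
{
    assert(values != NULL);
    size_t count = 0;
    for (size_t i = 0; i < n; i++) {
        if (values[i] % 16 == r) {
            count++;
        }
    }
    return count;
}

/* Print every logged count next to its remainder modulo 16. */
void print_residues(void)
{
    for (size_t i = 0; i < n_extra; i++) {
        printf("extra = %d, extra %% 16 = %d\n",
               extra_bytes[i], extra_bytes[i] % 16);
    }
}
```

For these fifteen values every remainder is 0 or 1, so the counts look like a multiple of 16, or a multiple of 16 plus 1. That may mean the block idea still holds if the amount of junk is allowed to vary instead of being fixed at 512+1, but it is only a check on this one log.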

@jameszah
Author

I think the check for those latter two zero bytes can be dropped, since I think the FFD9 will not appear in the image data. The check misses the end of image if the FFD9 ends on the 15th or 16th byte of a 16 byte block: when it ends on the 15th, there is only one zero after it, and when it ends on the 16th, there is no zero.

Still haven't located my nemesis - the truncated image.

if(dptr[0] == 0xFF && dptr[1] == 0xD9 && dptr[2] == 0x00 && dptr[3] == 0x00){
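
To make the effect of the two zero bytes concrete, here is a small host-side sketch. It is not the driver's code: `find_eoi` and its `need_zeros` flag are hypothetical names, and it is meant to be run on synthetic buffers. It scans backwards for FF D9 and can optionally require two 00 bytes after the marker, the way the condition above does.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Scan backwards for the FF D9 end-of-image marker.
 * If need_zeros is non-zero, a match also needs two 00 bytes after the marker.
 * Returns the offset of the FF byte, or -1 if nothing matches.
 */
long find_eoi(const uint8_t *buf, size_t len, int need_zeros)
{
    assert(buf != NULL);
    size_t span = need_zeros ? 4 : 2;   /* number of bytes one test reads */
    if (len < span) {
        return -1;
    }
    /* i runs from len - span down to 0, so buf[i + span - 1] stays in range */
    for (size_t i = len - span + 1; i-- > 0; ) {
        if (buf[i] != 0xFF || buf[i + 1] != 0xD9) {
            continue;
        }
        if (need_zeros && (buf[i + 2] != 0x00 || buf[i + 3] != 0x00)) {
            continue;
        }
        return (long)i;
    }
    return -1;
}
```

On a buffer where junk follows the FF D9 directly, or after a single 00, the strict test finds nothing while the plain test still finds the marker. That would fit the frames above whose reported length is the whole buffer.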


@jameszah
Author

jameszah commented Aug 3, 2020

This is the code I ended up using.

It searches for the FFD9 in the last 1025 bytes of the frame. If there is no FFD9 it discards the frame, and if there is an FFD9, it just uses the frame along with the extra bytes. You could trim the extra bytes, but that is a nuisance in my frame queuing system.

       do {
          fb = esp_camera_fb_get();
          if (!fb) continue;                   // capture failed, try again
          int x = fb->len;
          int foundffd9 = 0;

          // search backwards from the end of the frame for FF D9;
          // j < x keeps fb->buf[x - j - 1] inside the buffer on short frames
          for (int j = 1; j <= 1025 && j < x; j++) {

            if (fb->buf[x - j] == 0xD9) {
              if (fb->buf[x - j - 1] == 0xFF ) {

                //Serial.print("Found the FFD9, junk is "); Serial.println(j);
                if (j == 1) {
                  normal_jpg++;                // FFD9 is the last two bytes
                } else {
                  extend_jpg++;                // j - 1 extra bytes follow the FFD9
                  //Serial.print(", Len = "); Serial.print(x);
                  //Serial.print(", Correct Len = "); Serial.print(x - j + 1);
                  //Serial.print(", Extra Bytes = "); Serial.println( j - 1);
                }
                foundffd9 = 1;
                break;               // break out of for() - we found a ffd9
              }
            }
          }

          if (!foundffd9) {
            bad_jpg++;
            //Serial.print("Bad jpeg, Len = "); Serial.println(x);
            esp_camera_fb_return(fb);

          } else {
            break;                 // break out of while(1) - we have a good jpeg 
          }

        } while (1);
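
If trimming the extra bytes is wanted, the correct length can be worked out without copying anything. The following is a minimal host-side sketch, assuming the frame is available as a plain byte buffer. `jpeg_trimmed_len` and `max_scan` are names invented here and are not part of the camera library.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Look for FF D9 within the last max_scan bytes of the buffer.
 * Returns the length up to and including the marker,
 * or 0 if no marker is found there (a frame to discard).
 */
size_t jpeg_trimmed_len(const uint8_t *buf, size_t len, size_t max_scan)
{
    assert(buf != NULL);
    /* j < len keeps buf[len - j - 1] inside the buffer */
    for (size_t j = 1; j <= max_scan && j < len; j++) {
        /* buf[len - j] is the candidate D9, buf[len - j - 1] the candidate FF */
        if (buf[len - j] == 0xD9 && buf[len - j - 1] == 0xFF) {
            return len - j + 1;
        }
    }
    return 0;
}
```

With a `max_scan` of 1025 this mirrors the search window used above. On the ESP32 the result could be used as the number of bytes to write out in place of `fb->len`, treating 0 as "discard the frame".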
