Rework image format to support 1bpp and 4bpp formats #1

drwonky · 2020-03-01T01:54:03Z

When VGALIB was designed, the screen was the primary buffer and mode 13h was the principal target. Having an 8bpp linear framebuffer makes programming very simple and updating video memory quite fast. You can use 32-bit instructions to read and write data to video memory. If the underlying bus is 8-bit, waits are inserted while the data is shuffled across a byte at a time; if the bus is 16-bit, it takes 2 clocks; and if you have a local bus like VLB, PCI, EISA, or something else, the write can be done in one shot.

The video card in my 486dx4-120 retro PC is a VLB ET4000/W32p and memory accesses are really fast: on topbench the MemoryTest is 55us while the VidramTest is 61us. The identical motherboard is used in my testbench PC, which has an Everex 8-bit EGA card. There, with a 486dx2-66, the VidramTest is 1600us (1.6ms). The conventional approach to writing nibbles to EGA memory involves a mask register on the EGA card and a read to populate the main 32-bit register. I'm seeing about 1-2fps update with this method on EGA, while the CGA version of the program runs at 5-10fps (I haven't added benchmarks to the demo programs yet).

Games like Commander Keen used a tile-based approach: a packed bitmap was written to EGA memory and the EGA hardware was used to pan the memory around. Sprites were updated directly, but when you've got maybe 10 sprites on the screen, the amount of data written is pretty small.

So EGA write mode 2 isn't going to be very fast, and I have serious doubts about the performance of double buffering from an offscreen buffer to an EGA screen buffer. It's possible to have 2 screen pages in the EGA address space in 320x200x16 mode.
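The page arithmetic behind that last point can be sketched as follows. This is a minimal sketch, not VGALIB code: the constant and function names are made up for illustration, and it assumes the usual planar layout (1 bit per pixel per plane, 40 bytes per row) seen through a 64 KiB window at segment A000h. How many pages are actually usable also depends on the memory installed on the card, which this sketch does not model.

```cpp
#include <cassert>
#include <cstdint>

// Assumed geometry for the 320x200x16 planar mode (names are illustrative).
constexpr std::uint32_t kWidth = 320;
constexpr std::uint32_t kHeight = 200;
constexpr std::uint32_t kBytesPerRow = kWidth / 8;                 // 1 bit per pixel per plane
constexpr std::uint32_t kBytesPerPlane = kBytesPerRow * kHeight;   // one page, one plane
constexpr std::uint32_t kWindowBytes = 0x10000;                    // 64 KiB window at A000:0000

// Byte offset of a page inside the window; all four planes share this offset.
constexpr std::uint32_t page_offset(std::uint32_t page) {
    return page * kBytesPerPlane;
}

// True if the whole page lies inside the 64 KiB window.
constexpr bool page_fits(std::uint32_t page) {
    return page_offset(page) + kBytesPerPlane <= kWindowBytes;
}

static_assert(kBytesPerPlane == 8000, "one page is 8000 bytes per plane");
static_assert(page_fits(0) && page_fits(1), "two pages fit in the window");
```

Because the four planes overlay the same addresses, a page costs only 8000 bytes of address space even though it holds 32000 bytes of pixel data.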

This issue really is about adding 1bpp and 4bpp support to the image type internally, so here are some thoughts:

Image doesn't really implement much of an interface. Originally most of the canvas functionality was in Image, but I moved that into Canvas because I imagined display adapters could be subclasses of Canvas, each implementing the Canvas API calls where needed. This may still be a viable approach with double-buffered EGA and it may be the fastest solution. The downside to this approach is that as you optimize for speed, each implementation becomes more specialized. Pushing the access down to Image or Adapter classes can improve things, but at the cost of bulk operations. Copying images around is the underlying design approach of VGALIB: use offscreen buffers to blit images and composite a final image, then copy the offscreen buffer to video memory.
Image could be redesigned to have access methods for each of the 1bpp, 4bpp, and 8bpp color depths. If the array operator [] was overloaded, Canvas could treat the Image as a linear buffer and Image could translate the access to the appropriate memory underneath. The display adapters would expect the image buffer to be the same depth as the video memory, so 1bpp modes would expect 1bpp buffers, 4bpp modes would expect 4bpp buffers, etc. I expect the fastest way to update EGA is to write an entire plane at once, bypassing the internal registers and switching planes just 4 times. Writing to I/O ports is slow, and selecting the bit mask register for every pixel is a performance killer. To make EGA and VGA mode 12h fast, the Image buffer really needs to be in a planar format, with R, G, B, and I each stored as a contiguous region of 1bpp data. Updating the screen then simply involves copying each of the bitmaps. This format would also cover the 1bpp modes, since EGA and VGA really just implement them the same way. Doing contiguous 1bpp allocations would also solve the hires problem, since no single color plane would be bigger than 1 segment and far pointers would still work (fast). A 1bpp image could just implement 1 plane; 4bpp just requires 4 times the work of a 1bpp mode.
Implementing a planar image format in the Image class would mean pushing the memory copy routines down to Image, so it knows how to deal with source Images of 8bpp, 1bpp, and 4bpp. This only applies to Image-to-Image copying, which is the bread and butter of the underlying design. The display drivers will need to have the implementation-specific code for writing to video memory.
The 1bpp and 4bpp planar Image formats should allocate each plane through its own pointer so there are no segment overruns.
Canvas implements setpixel; this needs to be pushed down to Image.
Canvas implements image-to-image copy with the assumption that images are 8bpp linear; the image-to-image copy needs to be pushed down to Image.
Canvas implements transparency; this needs to be implemented in the pushed-down Image methods.
Image should be able to handle arbitrary source and destination color depths. No palette munging will be done on image copy; the assumption is that a lower color depth will be a subset of a higher color depth. Image palettes can be specified when an image is instantiated, and the PNG loader obeys the target palette and does its best to remap. This means 4 color CGA images will remap correctly into a 16 color or 256 color Image target as long as the palettes are the same. The convention is to set the default canvas palette to match the display's palette. If you do this and load a 4 color CGA image into a 16 color Image buffer, the colors will be remapped to the palette entries of the 16 color Image. If you then draw this 16 color image onto a 256 color image, the colors will be compatible as long as the 16 color image palette is a subset of the 256 color palette.
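The planar ideas in the list above can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the VGALIB implementation: `PlanarImage`, `plane`, and `copy_from` are hypothetical names, each plane is its own allocation, plane p holds bit p of the color index, and the leftmost pixel sits in the most significant bit of each byte. The 8bpp linear format and the display drivers are left out.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical planar image: 'depth' bit planes, each a separate 1bpp buffer.
class PlanarImage {
public:
    PlanarImage(int width, int height, int depth)
        : width_(width), height_(height), stride_((width + 7) / 8),
          planes_(static_cast<std::size_t>(depth),
                  std::vector<std::uint8_t>(
                      static_cast<std::size_t>(stride_) * height, 0)) {}

    int width() const { return width_; }
    int height() const { return height_; }
    int depth() const { return static_cast<int>(planes_.size()); }

    // Raw 1bpp data of one plane, e.g. for a driver to copy in one pass.
    const std::vector<std::uint8_t>& plane(int p) const { return planes_[p]; }

    // Gather one bit from every plane into a color index.
    std::uint8_t getpixel(int x, int y) const {
        const std::size_t off = offset(x, y);
        const std::uint8_t mask = static_cast<std::uint8_t>(0x80 >> (x & 7));
        std::uint8_t color = 0;
        for (int p = 0; p < depth(); ++p)
            if (planes_[p][off] & mask)
                color = static_cast<std::uint8_t>(color | (1u << p));
        return color;
    }

    // Scatter the color index across the planes; bits above depth() are dropped.
    void setpixel(int x, int y, std::uint8_t color) {
        const std::size_t off = offset(x, y);
        const std::uint8_t mask = static_cast<std::uint8_t>(0x80 >> (x & 7));
        for (int p = 0; p < depth(); ++p) {
            if ((color >> p) & 1)
                planes_[p][off] |= mask;
            else
                planes_[p][off] &= static_cast<std::uint8_t>(~mask);
        }
    }

    // Copy src onto this image at (dx, dy) with clipping and no palette remap.
    // Pixels equal to 'transparent' are skipped; pass -1 for an opaque copy.
    void copy_from(const PlanarImage& src, int dx, int dy, int transparent = -1) {
        for (int y = 0; y < src.height(); ++y) {
            for (int x = 0; x < src.width(); ++x) {
                const int tx = dx + x, ty = dy + y;
                if (tx < 0 || ty < 0 || tx >= width_ || ty >= height_) continue;
                const std::uint8_t c = src.getpixel(x, y);
                if (c == transparent) continue;
                setpixel(tx, ty, c);
            }
        }
    }

private:
    std::size_t offset(int x, int y) const {
        return static_cast<std::size_t>(y) * stride_ + (x >> 3);
    }

    int width_, height_, stride_;
    std::vector<std::vector<std::uint8_t>> planes_;
};
```

The per-pixel copy is the slow, general path that works across any pair of depths. When source and destination have the same depth and the copy is byte-aligned, a faster path could copy whole rows plane by plane, which is the case the planar layout is meant to make cheap.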

drwonky self-assigned this Mar 1, 2020

drwonky added the enhancement New feature or request label Jun 27, 2021
