Quoted from AlanJ: I’m happy to have a look at these, I need to grab some arduino/esp32s code that displays those file formats to a tft display.
I'm not entirely sure what the other suggestion was getting at. PNG and GIF are both compressed as well. They are just "lossless" formats, whereas JPEG is "lossy". With PNG and GIF you get exactly the same pixels back after compressing and decompressing (for GIF, provided the image already fits within its 256-color palette), while JPEG will almost always introduce subtle differences (typically not noticeable to the human eye). With any of those formats, there will be a decompression step.
What kind of profiling tools are available for software dev on this platform? I've barely used Arduino myself, and don't have any experience with the ESP32 stuff. But if this were a regular PC platform, I would be using a debugger with a profiler to track down what's actually taking time.
That said, the rendered numerals work fine (if I understood your notes correctly), which suggests that the issue is either I/O from the SD card, which can vary considerably depending on what kind of media controller you're using, or decompression. It seems you've determined it's the latter; hopefully by some means you consider reliable.
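If it helps, here is a rough sketch of how I'd separate the two costs without a real profiler: time the SD read and the decode step independently and compare. It is plain C++ using std::chrono so it runs on a PC; on the ESP32 I'd expect the Arduino micros() call to do the same job. The names time_us, StageTimes and profile_image_load are made up for illustration, and the two callables stand in for whatever your read and decode routines actually are.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Runs fn once and returns the elapsed time in microseconds.
// Assumption: on an Arduino/ESP32 build, sampling micros() before and
// after the call would replace the std::chrono calls used here.
template <typename Fn>
std::int64_t time_us(Fn&& fn) {
    const auto start = std::chrono::steady_clock::now();
    fn();
    const auto stop = std::chrono::steady_clock::now();
    assert(stop >= start);  // steady_clock is monotonic, so this should hold
    return std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
}

// Hypothetical result type: one timing per stage of loading an image.
struct StageTimes {
    std::int64_t read_us;    // time spent pulling the file off the SD card
    std::int64_t decode_us;  // time spent decompressing it
};

// Times the two stages separately so they can be compared directly.
// read_file and decode are placeholders for the project's own routines.
template <typename ReadFn, typename DecodeFn>
StageTimes profile_image_load(ReadFn&& read_file, DecodeFn&& decode) {
    StageTimes t{};
    t.read_us = time_us(read_file);
    t.decode_us = time_us(decode);
    return t;
}
```

Calling profile_image_load with your real read and decode functions for a handful of images, and printing the two numbers, should make it fairly obvious which stage dominates.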
What is the RAM budget for the project? Is there even any separate RAM? I looked at what I think are the specs for the ESP32 kit you're using, and all I see are figures for flash memory and 520 KiB of SRAM; I don't see a separate RAM component on your materials list. Can you increase the RAM considerably while keeping the whole project cost within your goals? Because it seems to me that if the rendered fonts are fine, the most obvious approach would be to just cache the uncompressed images in RAM.
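To put rough numbers on the caching idea, here is a small back-of-the-envelope sketch. It assumes 16-bit RGB565 pixels (common for TFT displays, but I don't know what yours uses), and the function names are my own. Keep in mind that not all of the 520 KiB will actually be free for image data, so treat the result as an upper bound.

```cpp
#include <cassert>
#include <cstddef>

// Assumption: the display takes 16-bit RGB565 pixels, i.e. 2 bytes each.
const std::size_t kBytesPerPixelRgb565 = 2;

// Size of one uncompressed image in bytes.
inline std::size_t uncompressed_bytes(std::size_t width, std::size_t height,
                                      std::size_t bytes_per_pixel = kBytesPerPixelRgb565) {
    assert(bytes_per_pixel > 0);
    return width * height * bytes_per_pixel;
}

// How many whole images of a given size fit in a RAM budget.
inline std::size_t images_that_fit(std::size_t budget_bytes, std::size_t image_bytes) {
    assert(image_bytes > 0);
    return budget_bytes / image_bytes;
}
```

As a purely hypothetical example, a 135 x 240 image at 2 bytes per pixel is 64,800 bytes, so at most 8 of them would fit in 520 KiB (532,480 bytes) even if every byte of SRAM were available. Plug in your real image dimensions to see whether caching is feasible without extra RAM.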