More efficient ASTC decoding

Oleksandr Popov
2 min read · Jun 10, 2021
Image source: ARM

ASTC is a very efficient texture compression format: it combines decent image quality with high compression ratios, which helps save a lot of memory bandwidth on modern mobile GPUs.

But can we crank it up to 11 and make it run even faster? It turns out we can, in certain scenarios and on supported hardware.

ARM Mali GPUs support the GL_EXT_texture_compression_astc_decode_mode extension. According to the ASTC specification, even LDR textures are decoded into 16-bit floating-point values. This extension lets you switch the hardware ASTC decoder into a faster mode that decodes textures into lower-precision normalized 8-bit unsigned integers. This is good enough for most real-life applications, since source textures are usually 24-bit RGB or 32-bit RGBA bitmaps.

Using the extension is as simple as adding one line of code:
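A minimal sketch of that call, assuming a Java/Android setup. The one line that matters is the `glTexParameteri` call with the `GL_TEXTURE_ASTC_DECODE_PRECISION_EXT` token from the extension and `GL_RGBA8` as the value. The class name `AstcDecodeModeSketch` and the `TexParameteri` interface are hypothetical, introduced only so the sketch compiles without the Android SDK. The constants are spelled out as hex values because, as far as I know, the Android GLES classes do not expose the extension token.

```java
final class AstcDecodeModeSketch {
    // Standard OpenGL ES enum values.
    public static final int GL_TEXTURE_2D = 0x0DE1;
    public static final int GL_RGBA8 = 0x8058;
    // Token defined by GL_EXT_texture_compression_astc_decode_mode.
    public static final int GL_TEXTURE_ASTC_DECODE_PRECISION_EXT = 0x8F69;

    /**
     * Hypothetical stand-in with the same shape as GLES20.glTexParameteri,
     * so this sketch has no Android dependency.
     */
    public interface TexParameteri {
        void apply(int target, int pname, int param);
    }

    /** Call while the ASTC texture is bound to GL_TEXTURE_2D. */
    public static void useRgba8Decode(TexParameteri glTexParameteri) {
        // Ask the hardware decoder for normalized 8-bit output instead of FP16.
        glTexParameteri.apply(GL_TEXTURE_2D, GL_TEXTURE_ASTC_DECODE_PRECISION_EXT, GL_RGBA8);
    }
}
```

On Android this could be invoked as `AstcDecodeModeSketch.useRgba8Decode(GLES20::glTexParameteri)` after binding the texture, or the wrapper can be skipped and the same three arguments passed to `GLES20.glTexParameteri` directly.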


Visually, there was no image quality degradation, and performance stayed at the same steady 60 fps. However, certain metrics in the GPU profiler show reduced load on the compressed texture decoder and improved texture cache access.

Here are measurements from the ARM Streamline profiler for our 3D Buddha Live Wallpaper, taken on a mid-range Galaxy A21s phone:

As you can see, this simple trick has noticeably improved texture lookups. As a result, it reduced memory bandwidth usage and device power consumption. So don’t be lazy: if this extension is detected, use it. It’s a minor code change that gives a virtually free performance boost on Mali GPUs.
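Detecting the extension can be sketched as a check against the space-separated extension string reported by the driver. This is only an illustration, not necessarily how the app does it: the class name `AstcDecodeModeDetection` and the method `isSupported` are hypothetical. The check matches whole tokens rather than substrings, because a related extension, GL_EXT_texture_compression_astc_decode_mode_rgb9e5, shares the same prefix.

```java
import java.util.Arrays;

final class AstcDecodeModeDetection {
    public static final String EXTENSION = "GL_EXT_texture_compression_astc_decode_mode";

    /**
     * Returns true if the space-separated extension list contains the
     * decode mode extension as a whole token.
     */
    public static boolean isSupported(String extensions) {
        if (extensions == null) {
            return false; // no current GL context, or the query failed
        }
        // Split on whitespace so the "_rgb9e5" variant does not match by prefix.
        return Arrays.asList(extensions.trim().split("\\s+")).contains(EXTENSION);
    }
}
```

On Android, the string would typically come from `GLES20.glGetString(GLES20.GL_EXTENSIONS)`, queried on the GL thread once a context is current.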

This optimization was suggested in the 3-part ARM webcast “Optimizing Android Graphics”; I highly recommend watching it.



Oleksandr Popov

Front-end developer making 3D live wallpaper apps for Android.