WebGL Experiments: Texture Compression

Lilli Thompson from Google asked us how we were doing texture decompression in the pixel shaders and what algorithm we were using. We thought we would share our answer…

Texture compression was a bit of journey – as no one at Illyriad had ever implemented anything in 3d before; to us texture compression was mostly a tick box on a graphics card.

It started when we found out our 90MB of jpegs expanded to 2GB of on-board memory and we were worried we’d made a terrible mistake, as this was certainly beyond the possibilities of low-end hardware! Half of it this was due to Three.js keeping a reference to the image and the image also being copied to the GPU process – so essentially the required texture memory doubled.

Dropping the reference Three.js held after the texture was bound to WebGL resolved this. I’m not sure how this will play out with context lost events – as I assume we will have lost the texture at that point – but local caching in the file system and reloading may help with recreating them at speed.

With 1GB of memory remaining we were faced with three choices – either deciding what were were trying to do wasn’t possible; reducing the texture sizes and losing fidelity or trying to implement texture de-compression in the shader. Naturally we opted for the last.

We were originally planning to use 24bit S3TC/DX1; however this proved overly complex in the time we had available as the pixel shaders have no integer masking or bitshifts and everything needs to be worked in floats. The wonders we could unleash with binary operators and type casting (not conversion) – but I digress…

In the end we compromised on 256 colour pallettized textures (using AMD’s The Compressonator to generate P8 .DDS textures). This reduced the texture to one byte per pixel – not as small or high colour as DX1 – but already 4 times smaller than our original uncompressed RGBA textures.

It took a while to divine the file format; which we load via XMLHttpRequest into an arraybuffer. The files have 128 bytes of header which we ignore, followed by the 256×4 byte palette which we load into a lookup table texture RGBA. The rest we load into a Luminance texture. Both textures need to use NearestFilter sampling and not use mipmapping to be interpreted sensibly.

We have created our own compressed texture loaders – the colour texture loader looks a little like this:

[code lang=”javascript”]
Illyriad.TextureCompColorLoader = function (path, width, height, uniforms) {
var texture = new THREE.DataTexture(0, 1, 1, THREE.LuminanceFormat,
(new THREE.UVMapping()), THREE.RepeatWrapping, THREE.RepeatWrapping,
THREE.NearestFilter, THREE.NearestFilter);

var request = new XMLHttpRequest();
request.open("GET", path, true);
request.responseType = "arraybuffer";

// Decode asynchronously
request.onload = function () {
if (request.status == 200) {
var imageDataLength = request.response.byteLength – width * height;
uniforms.tColorLUT.texture = new THREE.DataTexture(
new Uint8Array(request.response, 128, 256 * 4),
256, 1, THREE.RGBAFormat, (new THREE.UVMapping()),
THREE.ClampToEdgeWrapping, THREE.ClampToEdgeWrapping,
THREE.NearestFilter, THREE.NearestFilter);
uniforms.tColorLUT.texture.needsUpdate = true;
texture.image = { data: new Uint8Array(request.response, imageDataLength),
width: width, height: height };
texture.needsUpdate = true;
}
}
request.send();
return texture;
}
[/code]

When we first did the decompression in the pixel shader, it was very blocky as we had turned off filtering to read the correct values from the texture. To get around this we had to add our own bilinearSample function to do the blending for us. In this function it uses the diffuse texture with the colour look up table and using the texture size and texture pixel interval samples the surrounding pixels. The other gotcha is that the lookup texture is in BGRA format so the colours need to be swizzeled. This makes that portion of the shader look like this:

[code]
uniform sampler2D tDiffuse;
uniform sampler2D tColorLUT;

uniform float uTextInterval;
uniform float uTextSize;

vec3 bilinearSample(vec2 uv, sampler2D indexT, sampler2D LUT)
{
vec2 tlLUT = texture2D(indexT, uv ).xx;
vec2 trLUT = texture2D(indexT, uv + vec2(uTextInterval, 0)).xx ;
vec2 blLUT = texture2D(indexT, uv + vec2(0, uTextInterval)).xx;
vec2 brLUT = texture2D(indexT, uv + vec2(uTextInterval , uTextInterval)).xx;

vec2 f = fract( uv.xy * uTextSize );
vec4 tl = texture2D(LUT, tlLUT).zyxw;
vec4 tr = texture2D(LUT, trLUT).zyxw;
vec4 bl = texture2D(LUT, blLUT).zyxw;
vec4 br = texture2D(LUT, brLUT).zyxw;
vec4 tA = mix( tl, tr, f.x );
vec4 tB = mix( bl, br, f.x );
return mix( tA, tB, f.y ).xyz;
}

void main()
{
vec4 colour = vec4(bilinearSample(vUv,tDiffuse,tColorLUT),1.0);

[/code]

This performs fairly well; and certainly better than when your computer feels some virtual memory is required because you are using too much! However, I’m sure on-board graphics card decompression should be swifter and hopefully open up the more complex S3TC/DX1-5 compression formats.

There is a major downside however with decompressing this way in the pixel shader. You have to turn off mipmapping! Not only does turning off mipmapping cause a performance hit as you always have to read the full-size textures – but more importantly it doesn’t look good. In fact in the demo – we had to use full-size textures for the grass so we could apply mipmapping as otherwise in the distance it was a wall of static!

Unfortunately, as far as I’m aware, WebGL while you can create mipmaps with generateMipmap – you can’t supply your own. Again, real compressed textures should help here.

EDIT: Benoit Jacob has pointed out this is possible by passing a non-zero ‘level’ parameter to texImage2D – one to look into.

Some caveats on the demo:

  • Obviously even 90MB of jpeg textures is far too much – the production version will be substantially smaller, as we are being a bit smarter on how we will be using them.
  • This has been a learning process both for us and Quantic Arts (who are used to boxed set games).
  • This was a tester to see the upper limits of what we can do in WebGL, so we haven’t been focusing on optimization yet.
  • We will be reworking the obj models to reduce their download size substantially.
  • The way the game works is that no one player will need all the textures at once (the time between queuing a building and it’s actual appearance in the game allows us to download the models/texture)

So the actual game requirements will be much much lower.

JavaScript: Don’t spam your server, Debounce and Throttle

If you make server call-backs or update the browser GUI based on user interaction you can end up doing far more updates than necessary…

In fact on one occasion, during Illyriad-beta testing, we started tripping our server’s dynamic IP request rate throttling and banning our own IP address – D’oh!

Common situations where over responding to events are mousemove, keypress, resize and scroll; but there are a whole host of situations where this applies.

We update our pages dynamically via ajax, based on user clicks around the user interface. In this situation if the user makes several page navigation requests in quick succession – the only important one to be made is the last, as for all the previous ones the data will be thrown away. So why make the requests in the first place? It’s just unnecessary server load.

Equally, if a user is moving an mouse around a map and you are updating UI coordinates for every pixel change the mouse makes, it potentially a huge number of DOM updates far beyond the perception of a user; for no real benefit, but it will cause a degradation in performance.

Initially, we controlled the situation using complex setups of setTimeout clearTimeout and setInterval. However, as the various places we made use of this became more varied, with different sets of needs, the code became more and more complex and difficult to maintain.

Luckily, it was around this time we discovered Ben Alman’s jQuery plugins debounce and throttle; which are extremely powerful – yet simple to use of and cover a whole host of permutations.

Throttling

Using his  jQuery $.throttle, you can pass a delay and function to $.throttle to get a new function, that when called repetitively, executes the original function no more than once every delay milliseconds.

Throttling can be especially useful for rate limiting execution of handlers on events like resize and scroll. Check out the example for jQuery throttling.

Debouncing

Using his jQuery $.debounce, you can pass a delay and function to $.debounce to get a new function, that when called repetitively, executes the original function just once per “bunch” of calls, effectively coalescing multiple sequential calls into a single execution at either the beginning or end.

Debouncing can be especially useful for rate limiting execution of handlers on events that will trigger AJAX requests. Check out the example for jQuery debouncing.

In fact its well worth checking out all Ben Alman’s jQuery Projects as they are full of little wonders, that you soon are sure how you did without them!

(Fix) Memory Leaks: Ajax page replacement

Resolving memory issues with HTML replacement and Ajax

In Illyriad we do very large numbers of ajax requests using jQuery over a players session. Some of these are pure data requests, but many of them are navigational HTML page replacements.

A simple replacement seems to leak memory in various browsers:

[code lang=”js”]
$(‘#ElementToReplaceContents’).html(htmlToReplaceWith);
[/code]

Even calling the jQuery function .empty() before the replacement doesn’t seem to help in all cases. To fix this we have created a clean up function which is a common or garden “overkill” variety which seems to do the trick. Usage below:

[code lang=”js”]
function ChangeLocationSucess(data, textStatus, XHR) {
if (XHR.status == 200) {
var div = $(‘#ElementToReplaceContents’);
DoUnload(); // More on this further down
div.RemoveChildrenFromDom(); // Clean up call
div.html(data); // HTML replacement
}

[/code]

Clearing the memory from the existing HTML is done with the clean up code below:

[code lang=”js” title=”Added: RemoveChildrenFromDom jQuery plugin”]
(function( $ ){
$.fn.RemoveChildrenFromDom = function (i) {
if (!this) return;
this.find(‘input[type="submit"]’).unbind(); // Unwire submit buttons
this.children()
.empty() // jQuery empty of children
.each(function (index, domEle) {
try { domEle.innerHTML = ""; } catch (e) {} // HTML child element clear
});
this.empty(); // jQuery Empty
try { this.get().innerHTML = ""; } catch (e) {} // HTML element clear
};
})( jQuery );
[/code]

Some pages have extra resources cached and extra elements with wired up events. For these we have introduced a DoUnload function which also gets called before the replacement. This runs through the list of on page clean up functions and calls them one by one:

[code lang=”js” title=”Extra clean up system”]
var unloadFuncs = [];
function DoUnload() {
while (unloadFuncs.length > 0) {
var f = unloadFuncs.pop();
f();
f = null;
}
}
[/code]

We create these unload functions on the page that has any extra clean up to do by using the following code; which adds clean up functions to the unloadFuncs array above. This code looks like the code below:

[code lang=”js” title=”On page clean up registration”]
var unload = function() {
OnPageCachedResources.clear(); // Clear array of cached resources
OnPageCachedResources = null;
$("#OnPageElementWithWiredUpEvents")
.unbind()
.undelegate(); // Remove attached events
}
unloadFuncs.push(unload);
[/code]

Obviously this varies from page to page and most of our pages don’t need it – at which point the DoUnload function just returns with out doing work.

This coupled with the Fix for IE and JQuery 1.4 Ajax deal with most of the regular unexpected leaks.

(As an aside: When replacing HTML with Ajax, do not wire up events using onclick=”” etc in the HTML or this will also leak. Use event binding from script – a third party library like jQuery, Prototype or Dojo will make this very straight forward)

(Fix) Memory Leaks: IE and JQuery 1.4 Ajax

While testing Illyriad we found that over time the browsers IE7 and IE8 leak memory on Ajax calls using JQuery 1.4.

If you make a lot of ajax calls without changing page this can build up quite quickly and start causing issues. This is caused by the onreadystatechange event not being detached and IE not garbage-collecting the associated memory.

To fix this, locate these lines in the source JQuery file; already kindly annotated with “Stop memory leaks”:

[code lang=”js” firstline=”6019″ title=”Original jquery-1.4.4.js”]
// Stop memory leaks
if ( s.async ) {
xhr = null;
}
[/code]

Add in these extra lines to change them to the following (we detach the abort event also for good measure):

[code lang=”js” firstline=”6019″ title=”Modified jquery-1.4.4.js”]
// Stop memory leaks
if ( s.async ) {
try {
xhr.onreadystatechange = null;
xhr.abort = null;
} catch (ex) { };
xhr = null;
}
[/code]

Don’t forget to minify your jquery file before including it. [We prefer using UglifyJS]

Also rename it slightly so users with cached versions pick up the new file

e.g. jquery-1.4.4a-min.js