Reading list on Z-buffer precision

Posted on 2015-07-22 by Julien Guertault

Nathan Reed recently published a blog article plotting his numerical findings of Z-buffer precision under different uses. On the way he references a couple of previous articles, that also reference other resources; I think it’s a good opportunity to list some of them. They all tell a part of the story and I recommend reading all of them to get the complete picture.

A couple of notes about Z, 2009, by Emil Persson, on why Z-buffer is non linear and how to counter the loss of precision.
Attack of the depth buffer, 2010, by Matt Pettineo, on visualizing depth buffer error when using different buffer formats.
Maximizing Depth Buffer Range and Precision, 2012, by Brano Kemen, with a significant part on how OpenGL and DirectX are differently affected by the precision issue (spoiler: OpenGL’s coordinate convention hurts pretty badly in this case; fortunately as Nathan Reed’s article points, an extension now fixes this).
Depth Precision Visualized, 2015, by Nathan Reed, on visualizing depth buffer precision as a function of distance, when using different buffer formats and Z representations.

GDC 2015 presentations

Posted on 2015-03-09 by Julien Guertault

The Game Developers Conference took place last week in San Francisco. As I am starting to see more speakers publish their slides, I am creating this post to keep track of some them (this list is not meant to be exhaustive).

Advanced VR Rendering, by Alex Vlachos (Valve).
Code Clinic 2015: How to Write Code the Compiler Can Actually Optimize, by Mike Acton (Insomniac Games).
Doing Math with RGB (and A) Correctly, by Jim Van Verth (Google).
Great Management of Technical Leads, by Mike Acton (Insomniac Games).
Parallelizing the Naughty Dog engine using fibers, by Christian Gyrling (Naughty Dog).
Physics for Game Programmers : Numerical Methods, by Erin Catto (Blizzard Entertainment).
Rendering the World of Far Cry 4, by Steve McAulley (Ubisoft Montréal).
VR Direct: How NVIDIA Technology is Improving the VR Experience, by Nathan Reed (NVidia).

For a more extensive list, Cédric Guillemet has been garnering links to GDC 2015 papers on his blog.

Bret Victor – Seeing Space

Posted on 2014-06-13 by Julien Guertault

Following on his previous talks on data visualization and programming interfaces, Bret Victor presents the idea of what he calls a “seeing space”, meant to improve understanding of problems in the context of collaborative engineering.

Seeing Spaces from Bret Victor on Vimeo.

Guest post: Incremental reshaping

Posted on 2013-07-07 by Julien Guertault

A couple of dear friends and I are (too) often having discussions by email, on topics that range from video games to politics, and mostly software development and design in general. A while back, one of those discussions started after sharing an AltDevBlogADay article on optimization. As we were confronting our experiences on optimization or lack thereof, one of them, Rubix as we call him, made some insightful points by explaining an approach he would call “incremental reshaping”.

When I found myself coming back to his mails several times, I suggested he turned them into an article. With his permission, here it is.

Incremental reshaping

The most common case at my work is to stumble upon slow spaghetti & copypasta code, full of lists of hashtables of lists, all within a big class with 45 methods hitting them directly. So I start off by pulling the data out and into new classes, to encapsulate it. Then I clean up the code that uses them, by factorizing all similar operations into methods of these new classes, as well as trying to come up with good names for them (I know this part is done when the original data structure ends up being private in the class).

This way the operations eventually make sense, and I can at last understand what the code is trying to do on a higher level. Finally I can simplify the calculations and rework the (now hidden and loosely-coupled) data structures to best match the most critical uses. All of this has to be done incrementally. Do one change at a time, and then pause to look at the new code to search for the next simple refactoring. Otherwise, your changes will not be optimal.

Once everything is clean, if it’s still too slow, *then* optimize, even if it means to add complexity by removing good abstractions. Code that is complex in a particular way because that’s what is time-critical, will be both more maintainable and faster than code that gradually, without global monitoring, evolved into something complex everywhere, uniformly slow for many reasons.

Encapsulation rarely affects performance in my case. I work in environments where the cost of a function call is negligible, and functions are the most common abstraction tool.

Now, an example.

Yesterday, I saw a four parameters function I didn’t like. I first created a tiny class to contain just that function (all call sites became “new Foo().method(a1, a2, a3, a4)”), then I moved two of these arguments, that were often the same, into the class (call sites would become “new Foo(a1, a2).method(a3, a4)”). At some places in the calling code, I was then able to cache and reuse the instance of “Foo” into a local variable, because “a1” and “a2” were the same for several calls to “method”.

From there, I found that the calling code was more or less always doing the same kind of things before calling the method, so I moved that stuff to a second method of Foo (it turned out to work so well that the first method became private). Then I noticed a loop that was calling Foo’s method repeatedly, so I wrote a new method that took a list as a parameter. (I also ended up finding an appropriate name for Foo!)

Now I ‘could’ also have done it the hard way: enter the zone, stare at the code for a while, think a lot, decide that I wanted it to eventually be like this, then put my headphones and implement it directly instead of going through any of the sub-steps, which are unnecessary when you can fit all of it in your head.

But incremental reshaping has several advantages:

No need to know in advance what to do from begin to end, ideas come as you go.
The possibility of changing your mind more easily; you had an idea but you may find a better one later.
You can take advantage of the compiler & IDE at each step, to always know what state your code is in, and make sure nothing was forgotten, thus avoiding bugs and going faster. At a given time, it’s always the same message from the compiler, so it’s easier to read.
It keeps your code compiling and correct between every change. Also, for every little change, if you know your tools well you should be able to convince yourself that it breaks nothing.
It diminishes greatly the cost (and the pain) of being interrupted while focused on your changes.

Triangle order optimization

Posted on 2012-11-27 by Julien Guertault

This article by Adrian Stone discusses the impacts of triangle ordering in meshes and compares some algorithms.

Readings on vector class optimization

Posted on 2012-04-24 by Julien Guertault

Now that Revision has passed, we feel tempted to grab the ax and happily chop into parts of our code base we wanted to change but couldn’t really since we had other priorities. One tempting part is the linear algebra one: vector, quaternion and matrix data structures. Lets say vector for a start. Not that it’s really necessary, but the transformations are the most time consuming parts after the rendering itself, and the problem itself is somewhat interesting.

After a little googling, I basically found three approaches to this problem:

Every here and there, people seem to think of SSE instructions as a silver bullet and propose various examples of code, snippets or full implementations. The idea being to use dedicated processor instructions to apply operations on four components at a time instead of one after another.

Quite on the opposite, Fabian Giesen argued some years ago that it was not such a good idea. A quick look at the recently publicly released Farbrausch codebase shows they indeed used purely conventional C++ code for it.

At last this quite dated article (with regards to hardware evolution) by Tomas Arce takes a completely orthogonal approach, consisting of using C++ templates to evaluate a full expression component after component, thus avoiding wasting time moving and copying things around.

I am curious to implement and compare them on nowadays hardware.

Update: this is 2016 and the topic was brought back recently when someone wrote the article How to write a math library in 2016.

The point of the article is that the old advice to not bother with SSE and stick with floats doesn’t apply anymore, and it goes on to show results and sample code. This sparked a few discussions on Twitter, with opinions voiced to put it mildly.

@nothings No, that’s bullshit. Let the compiler do it, and if it can’t, don’t worry. At all. Cost >> benefit.

— Tom Forsyth (@tom_forsyth) March 16, 2016

It seemed the consensus was still against the use of SSE for the following reasons:

Implementation is tedious.
For 3 dimensional vector, which is the most common case, there is a 25% waste.
For 4 dimensional vectors, like homogeneous coordinates and RGBA, it doesn’t work so well either since the fourth component is treated differently than the other ones.
Even if the implementation detail is hidden behind a nice interface, the alignment requirements will leak and become constraints to the rest of the code.
Compilers like clang are smart enough to generate SSE code from usual float operations.

@kenpex @Zavie "against" list: It's SSE only. NEON, and other SIMDs might have deeper pipeline, it makes no sense with wider SIMD like AVX.

— Branimir Karadžić (@bkaradzic) April 9, 2016

Light is beautiful

Thoughts of a graphics programmer, demoscener and spare time photographer

Tag Archives: optimization