Tips for writing high-performance shaders

The GPUs on iOS devices have fully supported pixel and vertex shaders since the iPhone 3GS. However, their performance is well below that of a desktop machine, so you should not expect desktop shaders to port to iOS unchanged. Typically, shaders need to be hand-optimized to reduce calculations and texture reads in order to get good performance.

Complex mathematical operations

Transcendental mathematical functions (such as pow, exp, log, cos, sin, tan, etc.) are expensive on the GPU, so a good rule of thumb is to have no more than one such operation per fragment. Consider using lookup textures as an alternative where applicable.
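
As a rough illustration of the lookup-texture idea, here is a minimal CPU-side sketch in Python. It precomputes a function into a 256-entry table and reads it back with linear interpolation, which approximates what a filtered 1D texture read does. The names build_lut and sample_lut, the table size, and the pow(x, 16) example are all assumptions made for illustration; they are not part of Unity or of any shader API.

```python
import math

def build_lut(func, size=256):
    """Sample func at `size` evenly spaced points on [0, 1], like a 1D texture."""
    return [func(i / (size - 1)) for i in range(size)]

def sample_lut(lut, x):
    """Read the table with linear interpolation, similar to filtered texture sampling."""
    x = min(max(x, 0.0), 1.0)          # clamp, like a clamped texture address mode
    pos = x * (len(lut) - 1)
    i = min(int(pos), len(lut) - 2)    # index of the left-hand sample
    t = pos - i                        # blend factor between the two samples
    return lut[i] * (1.0 - t) + lut[i + 1] * t

# Hypothetical use: a specular-style falloff pow(x, 16), precomputed once.
specular_lut = build_lut(lambda x: math.pow(x, 16.0))

# Compare the table against the exact function at 1001 test points.
worst = max(abs(sample_lut(specular_lut, i / 1000) - (i / 1000) ** 16)
            for i in range(1001))
print("worst absolute error with 256 entries:", worst)
```

In this sketch the worst-case error stays below 1/256, so for a value that ends up in a low-precision color the table is a plausible stand-in for the pow call. Whether the extra texture read is actually cheaper than the math depends on the shader and the device, so measure before committing to it.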

However, it is not advisable to write your own versions of the normalize, dot and inversesqrt operations. If you use the built-in ones, the driver will generate much better code for you.

Bear in mind also that the discard operation will make your fragments slower.

Floating point operations

You should always specify the precision of floating point variables when writing custom shaders. It is critical to pick the smallest possible floating point format in order to get the best performance.

If the shader is written in GLSL ES then the floating point precision is specified as follows:

  • highp – full 32-bit floating point format, suitable for vertex transformations but with the slowest performance.
  • mediump – reduced 16-bit floating point format, suitable for texture UV coordinates and roughly twice as fast as highp.
  • lowp – 10-bit fixed point format, suitable for colors, lighting calculations and other high-performance operations, and roughly four times faster than highp.

If the shader is written in Cg or it is a surface shader then precision is specified as follows:

  • float – analogous to highp in GLSL ES, slowest
  • half – analogous to mediump in GLSL ES, roughly twice as fast as float
  • fixed – analogous to lowp in GLSL ES, roughly four times faster than float
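
To get a feel for what the smaller formats can and cannot represent, the following Python sketch rounds a few values the way a 16-bit float and a low-precision fixed-point format would. It is only a CPU-side approximation: to_half uses the IEEE 754 half-precision format from Python's struct module as a stand-in for half / mediump, and to_fixed assumes a range of [-2, 2) with steps of 1/256 as a stand-in for fixed / lowp. Both function names and the sample values are hypothetical, and real GPUs may differ in the details.

```python
import struct

def to_half(x):
    """Round x to the nearest IEEE 754 16-bit float (stand-in for half / mediump)."""
    return struct.unpack("<e", struct.pack("<e", x))[0]

def to_fixed(x, frac_bits=8, limit=2.0):
    """Quantize x to signed fixed point (stand-in for fixed / lowp).

    Assumption: range [-2, 2) with 8 fractional bits, i.e. steps of 1/256.
    """
    step = 1.0 / (1 << frac_bits)
    x = min(max(x, -limit), limit - step)   # out-of-range values saturate
    return round(x / step) * step

samples = [("color channel", 0.5), ("UV coordinate", 0.7231), ("world position", 1000.3)]
for label, value in samples:
    print(label, "| float:", value, "| half:", to_half(value), "| fixed:", to_fixed(value))
```

Under these assumptions the color and UV values survive with small rounding errors, while the world position 1000.3 becomes 1000.5 as a half and saturates just below 2 as a fixed-point value. This is consistent with the advice above to keep vertex position calculations in the full-precision format.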

For further details about shader performance, please read the Shader Performance page.

Hardware documentation

Take the time to study Apple's documentation on hardware and best practices for writing shaders. Note, however, that we suggest being more aggressive with floating point precision hints.

Bake Lighting into Lightmaps

Bake your scene's static lighting into textures using Unity's built-in Lightmapper. Generating a lightmapped environment takes only a little longer than just placing a light in the scene in Unity, but:

  • It will run a lot faster (2-3 times faster for, e.g., 2 pixel lights).
  • It will look a lot better, since you can bake global illumination and the lightmapper can smooth the results.

Share Materials

If a number of objects being rendered by the same camera use the same material, then Unity iOS will be able to employ a large variety of internal optimizations, such as:

  • Avoiding setting various render states in OpenGL ES.
  • Avoiding calculation of the different parameters required to set up vertex and pixel processing.
  • Batching small moving objects to reduce draw calls.
  • Batching both big and small objects that have the “static” property enabled to reduce draw calls.

All these optimizations will save you precious CPU cycles. Therefore, the extra work of combining textures into a single atlas and making a number of objects use the same material will always pay off. Do it!
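
When several textures are combined into one atlas, each mesh's UV coordinates have to be remapped into the region its texture now occupies. The sketch below shows that arithmetic for the simplest possible case. It assumes a square grid of equally sized cells with no padding, numbered row by row; the function name atlas_uv and its parameters are hypothetical and do not correspond to a Unity API.

```python
def atlas_uv(u, v, tile, tiles_per_row):
    """Map an original (u, v) in [0, 1] into one cell of a square grid atlas.

    Assumptions: equally sized cells, no padding, cells numbered row by row.
    """
    scale = 1.0 / tiles_per_row       # width and height of one cell in atlas UV space
    col = tile % tiles_per_row        # column of the cell
    row = tile // tiles_per_row       # row of the cell
    return ((col + u) * scale, (row + v) * scale)

# The centre of cell 3 in a 2x2 atlas lands at (0.75, 0.75).
print(atlas_uv(0.5, 0.5, tile=3, tiles_per_row=2))
```

Note that this simple remapping only works for UVs that stay inside the 0 to 1 range. Textures that rely on repeating (tiling) cannot be placed in an atlas this way.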

Simple Checklist to make Your Game Faster

  • Keep vertex count below:
    • 40K per frame when targeting iPhone 3GS and newer devices (with SGX GPU)
    • 10K per frame when targeting older devices (with MBX GPU)
  • If you’re using built-in shaders, pick ones from the Mobile category. Keep in mind that Mobile/VertexLit is currently the fastest shader.
  • Keep the number of different materials per scene low – share as many materials between different objects as possible.
  • Set the Static property on non-moving objects to allow internal optimizations.
  • Use PVRTC formats for textures when possible; otherwise choose 16-bit textures over 32-bit.
  • Use combiners or pixel shaders to mix several textures per fragment instead of a multi-pass approach.
  • If writing custom shaders, always use the smallest possible floating point format:
    • fixed / lowp — perfect for color, lighting information and normals,
    • half / mediump — for texture UV coordinates,
    • float / highp — avoid in pixel shaders, fine to use in vertex shader for vertex position calculations.
  • Minimize use of complex mathematical operations such as pow, sin, cos, etc. in pixel shaders.
  • Do not use pixel lights when it is not necessary; prefer a single (preferably directional) pixel light affecting your geometry.
  • Do not use dynamic lights when it is not necessary; bake lighting instead.
  • Use fewer textures per fragment.
  • Avoid alpha-testing; choose alpha-blending instead.
  • Do not use fog when it is not necessary.
  • Learn the benefits of occlusion culling and use it to reduce the amount of visible geometry and the number of draw calls in complex static scenes with lots of occlusion. Plan your levels to benefit from occlusion culling.
  • Use skyboxes to “fake” distant geometry.
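
To put the texture format item in the checklist into numbers, the sketch below estimates the memory taken by a single texture in each format. It assumes the common 4 and 2 bits-per-pixel PVRTC modes, ignores mipmaps and any minimum block size, and uses a hypothetical helper name (texture_bytes); treat the figures as rough estimates.

```python
# Bits per pixel for each format (PVRTC is commonly used at 4 or 2 bits per pixel).
BITS_PER_PIXEL = {
    "32-bit RGBA": 32,
    "16-bit RGBA": 16,
    "PVRTC 4bpp": 4,
    "PVRTC 2bpp": 2,
}

def texture_bytes(width, height, bits_per_pixel):
    """Approximate size of the top mip level in bytes (mipmaps not included)."""
    return width * height * bits_per_pixel // 8

for name, bpp in BITS_PER_PIXEL.items():
    kib = texture_bytes(1024, 1024, bpp) // 1024
    print(name, "1024x1024:", kib, "KiB")
```

For a 1024x1024 texture this works out to 4096 KiB at 32 bits, 2048 KiB at 16 bits, 512 KiB for PVRTC 4bpp and 256 KiB for PVRTC 2bpp.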
