Performance Showdown: Starling vs ND2D vs Genome2D vs Haxe NME

[UPDATE] These numbers are old. New numbers have been posted here: http://esdot.ca/site/2012/runnermark-scores-july-18-2012

In this post we’ll use my RunnerMark lib to benchmark the rendering performance of the various Stage3D 2D engines, and compare them to renderMode=GPU and Haxe NME.

Big thanks to Philippe for the Haxe NME port! Now we can compare all the Stage3D based implementations against Native performance and see exactly where AIR is standing (hint: pretty damn good!)

Without further ado…

Test Devices

  • Nexus One – Android 2.3.3
  • Galaxy Nexus – Android 4.0.3
  • iPhone 4 – iOS 5
  • iPad 2 – iOS 5

Results

[Result charts not loaded]

Conclusion

I am actually extremely impressed by both Starling and Genome2D in their latest incarnations. In many cases they are within 20% of NME’s performance, and finally we’re seeing Stage3D begin to consistently out-perform renderMode=GPU. As for ND2D, well, it obviously has quite a bit of work to do to reach the same performance class as the others.

The one area where Haxe NME truly blew away AIR was on the iPad 2, where we saw a 2x slap-down. I purposefully added some CPU load to each enemy’s AI, and I’m guessing this is a situation where the AS3 overhead just becomes too much with the number of objects… but that’s just a guess. It’s worth noting, though, that Genome2D, with a score of 1210, is pushing over 600 animated sprites @ 58fps, and even GPU Mode is able to push several hundred animations on iPad 2. At that level of performance I’m pretty sure you’re going to be ok ;)

Till next time!

15 Comments to “Performance Showdown: Starling vs ND2D vs Genome2D vs Haxe NME”

  1. focusfocus says:

    Hey, Shawn! Great post, thanks. It would be nice to see the engine versions, though.

  2. sHTiF says:

    The Nexus One results are strange. I get the exact opposite numbers for GPU vs Starling, with GPU at 450 and Starling at 400.

    As for the iPad 2, I got better results than Haxe NME with Genome2D when purely blitting sprites to the screen: over 9000 32×32 sprites at 60FPS, where Haxe NME was around 6000. But as you said, AS3 is probably the overhead when you do anything other than drawing stuff.

  3. Shawn says:

    Ya, in trying to create a nice game scenario, I was pretty generous with the CPU overhead, as most games have tons of processing going on each frame as well. I figured the CPU overhead would be consistent between all tests, so it would be fine, but then NME came along ;)

  4. Philippe says:

    Yes, it’s quite disappointing how slow AS3 is, especially when you see how much overhead and room for optimization there is in Haxe NME…

    • Emiliano says:

      Are those 2388 points on the iPad 2 from Philippe’s RunnerMark? I got 1547 testing on my iPad 2 with iOS 5.

      • shawn says:

        Ya they are based on a build he sent me. I was surprised as well, as I’d already seen your score, so I re-ran the test 2 times and it continued to throw down those insane #’s.

        Also with iOS 5 btw…

  5. Considering how RunnerMark assigns scores, I think these differences can also be particularly relevant to game design.

    For example, NME scored 930 on the Galaxy Nexus, while AIR was able to score 728. That is a 28% score improvement, but the first 580 points of each score represent the ability to render the basic scene at 60 FPS. This means that NME had 350 additional enemy instances without dropping the frame rate, while AIR had 148, which is about 2.36x as many (a 136% improvement) :)

    • shawn says:

      Ya, that’s true, but it’s not really so cut and dry. The baseline load of the scene is pretty heavy, as it really pushes the fillRate of older devices.

      I tried to design this benchmark so that it could be used across all devices: it should give meaningful results for older devices but have headroom for newer ones, which is why the first bit is FPS based and the second bit is based on adding additional enemy characters. It seems to be working fairly well; the results look fairly in line with the hardware across the various devices.
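The score arithmetic discussed in this thread can be sketched in a few lines. This is only an illustration, not RunnerMark's actual scoring code: the 580-point baseline and the reading of each point above it as one additional enemy are taken from the comment above, and the names `BASELINE_SCORE` and `extra_enemies` are made up for the example.

```python
# Assumption (from the comment thread, not from RunnerMark's source):
# the first 580 points of a score reflect rendering the base scene at
# full frame rate, and every point above that is one additional enemy.
BASELINE_SCORE = 580


def extra_enemies(score: int, baseline: int = BASELINE_SCORE) -> int:
    """Points above the baseline, read as additional enemies on screen."""
    return max(score - baseline, 0)


# Galaxy Nexus scores quoted in the thread.
nme_score, air_score = 930, 728

# Gain when comparing the raw scores directly.
raw_gain = nme_score / air_score - 1

# Ratio when comparing only the enemies added on top of the base scene.
headroom_ratio = extra_enemies(nme_score) / extra_enemies(air_score)

print(f"raw score gain: {raw_gain:.0%}")
print(
    f"extra enemies: {extra_enemies(nme_score)} vs "
    f"{extra_enemies(air_score)} ({headroom_ratio:.2f}x)"
)
```

Under those assumptions, the same 202-point gap looks modest on the raw scores but comes out at roughly 2.36 times the enemy headroom, which is the commenter's point. The same subtraction applied to Genome2D's 1210 on the iPad 2 gives 630, in line with the "over 600 animated sprites" mentioned in the post.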

  6. Matt Lockyer says:

    The CPU will make a big difference here in the device tested.

    If I understand correctly, your test doesn’t push a bunch of geometry through a single GPU call, but rather makes several calls to the GPU.

    Find out more here: http://www.bytearray.org/?p=4074

    While this test is a good test of the implementation of these engines and how they organize and minimize actionscript overhead such as function calls, it puts the emphasis on the CPU when compared with HaxeNME.

    Therefore I don’t feel it’s a straight up throwdown between Stage3D, GPU Render and Haxe NME.

    My two cents…

    • shawn says:

      If you check out the Starling implementation, everything is running off a single TextureAtlas. In the other tests, while they’re not using a single draw batch, I do try to keep it down to just a few different batches…

      Unfortunately G2D doesn’t really offer you a good way to dynamically combine movieclips, and I kinda want to avoid using TexturePacker or some other 3rd party tool. I’m going to do a good optimization pass on ND2D once Lars finishes his performance upgrades.

  7. Daniel Dourado says:

    It would be nice to see the other Stage3d library, Axel:
    http://axgl.org/

  8. Emiliano says:

    Hi Shawn,
    Just wanted to tell you that the latest score I got on my iPad 2 (iOS 5.1.1) is 3121 for the Haxe NME version.

    • Shawn says:

      Haha WOW, insane performance! Looking forward to a new benchmark test soon with improved versions of Starling and ND2D as well.

  9. David Barlia says:

    This is a very interesting comparison.
    It would be really helpful if you could include iPad1 in the mix.

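  10. [...] In my earlier article, Performance Showdown, we compared the performance of Starling, ND2D, Genome2D and other Stage3D-based 2D frameworks, with GPU render mode and Haxe as points of comparison. A few months have passed, Starling and ND2D have both been upgraded, and I have also updated Haxe from the Git repo. [...]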

  11. eco_bach says:

    Considering the extra work in implementing Starling, I really can’t see a case where using it would be justified. Perhaps someone can post a link to an example, but I’ll stick with GPU.
