Farseer performance compared to native Box2D (used in libGDX)

Feb 17, 2014 at 7:14 PM
Edited Feb 18, 2014 at 7:15 AM
Hello,

I would really love to program my game in C# and use Farseer Physics as my physics engine, but I can't because I have big issues with Farseer's performance. Maybe I'm doing something wrong (which I probably am), but I hope you can help me resolve the problems. I have an application that I ported from C# + Farseer + XNA to Java + Box2D (native libGDX wrapper) + libGDX. It is essentially the game "Pong", but I'm testing the performance by trying to spawn 4000 dynamic bodies. In Farseer this works "OK" up to about 100 dynamic bodies moving simultaneously, and it "breaks" at about 400 dynamic bodies on my configuration (World.Step takes well over 40 ms). Even when all bodies are asleep, the step time stays at about 8 ms. In my libGDX test application (which is essentially the same but has some added features) I can have up to 4000 (four THOUSAND, this is not a typo) dynamic bodies, all moving at the same time, with only ~50 ms per step, and when they all sleep it stays under 1 ms.
As advertised and tested, Farseer should have similar or even better performance than native Box2D, so I'm quite confused. I have 3 computers (1. Win XP, 2. Win Vista, 3. Win 7) and tested the apps on all of them, and it's always the same: I can have 10 times more moving dynamic bodies in Java than in C#, and the C# version has unreasonably high step times even with 10 times fewer bodies. So what should I expect of Farseer now? I'm expecting similar performance.
You can reproduce my "problem" with the Farseer XNA Sample app. Modify simple sample 4 so that the pyramid consists of 140 instead of 14 blocks, and in the pyramid prefab change the size of the rectangles to 0.125f instead of 0.5f (so it fits into the scene). This is a lagfest on my computer but would probably be a piece of cake in my libGDX native Box2D app.
So what do I want from you? Please could you verify that when you change simple sample 4 to something between 100 and 300 rectangles you also get a LAGFEST, or tell me how many bodies you could spawn. Then, if you can spawn more than 2000, please help me change my implementation (which is identical to the original Box2D one except for slight differences such as using the factories).
Also, please don't ask why I want 4000 dynamic bodies (this is just for testing purposes). That is not the question. The question is why Farseer is 10 times slower than native Box2D, or what could be the reason that I am having these problems.

All running .NET 4 (Farseer 3.5) or Java 7 x64 with the server compiler (I don't know the Box2D version; newest libGDX).
Computers:
  1. Win XP
    http://en.wikipedia.org/wiki/Samsung_NC10
  2. Win Vista
    Core i7 920 (first-generation i7)
    6 GB RAM
    NVIDIA GTX 560
  3. Win 7
    Intel® Core™2 Quad Processor Q6600
    4 GB RAM
    NVIDIA GTX 560
Code:

// Physics2D is just a wrapper class that wraps the body with some additional information
public Physics2D GetPhysics(Vector2 origin, Shape worldScaledShape)
{
    origin = origin * WorldScale;

    Physics2D physics = this.GetComponent();
    physics.Body = BodyFactory.CreateBody(World, physics);
    physics.Body.LinearDamping = 0.1f;
    physics.Body.AngularDamping = 0.1f;
    physics.Body.SleepingAllowed = true;
    physics.Body.IsStatic = false;
    physics.Body.BodyType = BodyType.Dynamic;
    physics.Body.Enabled = true;
    physics.Body.Position = origin;
    physics.Body.IsBullet = false;
    var fix = physics.Body.CreateFixture(worldScaledShape, physics);
    fix.Restitution = 1;
    fix.Friction = 0.1f;
    physics.AABBSize = GetAABBSize(worldScaledShape);
    // because AABBSize is in Box2D scale we need to normalize it back
    // it's only used for drawing
    return physics;
}

// stepping method, usual business
public override void Process(Microsoft.Xna.Framework.GameTime gameTime)
{
    World.Step((float)gameTime.ElapsedGameTime.TotalSeconds);
}

// factory method that spawns balls; the GetPhysics method is called here
public Entity GetCircularBall(float originX, float originY, float radius)
{
    Entity e = new Entity();
    var tex = engine.Content.Load<Texture2D>(GlobalValues.BallTexture);
    var drawable = renderer.GetActorDrawable(tex, null, Color.OrangeRed, 1);
    var phys = physics.GetPhysics(new Vector2(originX, originY), physics.CreateCircular(radius));
    var actor = actors.GetActor(phys, drawable);
    e.AddComponent(drawable);
    e.AddComponent(phys);
    e.AddComponent(actor);
    return e;
}
The world is scaled by a factor so that the balls' size lies between 0.1 and 10.

I would also like to say that I have been programming for about 12 years and that I usually know what I am doing. Also, my libGDX native Box2D version runs like a charm. So either I am just overlooking something, or the C# JIT doesn't compile to well-optimized native code, or maybe I found a performance bug in Farseer. I'm leaning towards number 2, as my other tests show that C# is just unreasonably slow compared to C++ or the Java -server compiler.

Edit: Besides all that, I also converted my Java app to a .NET app with IKVM, and it then has the same speed as the Java app. This could also mean that Farseer isn't as fast as native Box2D: libGDX essentially uses JNI all over the place to call Box2D, and I would assume that IKVM uses P/Invoke to do the same, so the problem really would be that native code runs fast while managed C# code is a heck of a lot slower.
Feb 18, 2014 at 1:53 PM
To make it clear: this is just a kind of "sanity" check. Maybe someone can test whether he/she can spawn something like 2000 dynamic bodies (with one circle fixture each), move them all simultaneously, and tell me the time in ms that the World.Step method needed.
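One way to collect the number asked for above is to wrap the step call in a Stopwatch and average over many frames. This is only a sketch: StepTimer and AverageStepMs are hypothetical names, not part of Farseer, and the callback stands in for whatever stepping call the app makes (for example World.Step, as in the Process method earlier in this thread).

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper (not part of Farseer): times any step callback.
public static class StepTimer
{
    // Calls step(dt) `frames` times and returns the average wall-clock
    // time per call in milliseconds.
    public static double AverageStepMs(Action<float> step, int frames, float dt)
    {
        if (step == null) throw new ArgumentNullException("step");
        if (frames <= 0) throw new ArgumentOutOfRangeException("frames");

        Stopwatch watch = Stopwatch.StartNew();
        for (int i = 0; i < frames; i++)
        {
            step(dt);
        }
        watch.Stop();

        return watch.Elapsed.TotalMilliseconds / frames;
    }
}
```

Assuming a Farseer World instance named world, a call could look like `double ms = StepTimer.AverageStepMs(dt => world.Step(dt), 600, 1f / 60f);`. An average hides short spikes, so logging the slowest single step as well may be worthwhile.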
Feb 19, 2014 at 11:32 AM
I ran a few tests on the stacked objects sample with different numbers of bodies. I switched off the IsFixedTimeStep flag to get proper timings.
I also enabled the debug performance panel in the LoadContent() of PhysicsGameScreen:
EnableOrDisableFlag(DebugViewFlags.DebugPanel); and
EnableOrDisableFlag(DebugViewFlags.PerformanceGraph);

With 2000 bodies, it took 1000 to 4000 ms in the first moments when the pyramid collapsed. As a first guess, I found that the total time World.Step takes is roughly proportional to the number of contacts, at about 0.01 ms per contact. That means that to stay below the already unstable 16 ms per frame, you shouldn't exceed 1600 contacts at once. This was reached with 100 bodies, which is in line with Nogiax's numbers.
I can't quite believe that Box2D can handle 4000 bodies in 50 ms with the same number of contacts. Can you check the contacts there and report back?
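The budget estimate above is plain arithmetic and can be sketched as follows. This assumes the rough linear model from this post (step time proportional to the contact count, at about 0.01 ms = 10 µs per contact). ContactBudget and its methods are hypothetical names used only for illustration, and the per-contact cost is one machine's observation, not a Farseer guarantee.

```csharp
using System;

// Hypothetical helper illustrating the linear cost model from the post:
// step time ~ contacts * cost per contact. Integer microseconds are used
// to avoid floating-point rounding in the division.
public static class ContactBudget
{
    // Estimated step time in microseconds for a given number of contacts.
    public static long EstimatedStepMicros(long contacts, long microsPerContact)
    {
        return contacts * microsPerContact;
    }

    // Largest contact count that still fits into the frame budget.
    public static long MaxContacts(long frameBudgetMicros, long microsPerContact)
    {
        if (microsPerContact <= 0) throw new ArgumentOutOfRangeException("microsPerContact");
        return frameBudgetMicros / microsPerContact; // integer division rounds down
    }
}
```

With a 16 ms frame budget, `ContactBudget.MaxContacts(16000, 10)` gives 1600, the limit quoted above. The same model can be fitted to another engine by measuring its step time and contact count and dividing one by the other.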

I can think of some reasons why there may be such a performance gap.
1: There could be major differences in some of the important algorithms. Maybe one of the developers can judge whether this is the case.
2: One might be able to tweak the settings of FPE in a way that suits a massive multi-body scenario. I played around with some of the settings (VelocityIterations, PositionIterations, ContinuousPhysics, MaxSubSteps), but they had only little impact on performance and I lost 'proper' (visual) physical behavior.
3: C# is to blame, as Nogiax already mentioned.
4: Some combination of the last points plus issues Nogiax and I overlooked.

I would really appreciate some opinions on all that.
Greets,
streikbrecher
Feb 19, 2014 at 2:11 PM
Just a quick reply: I'm almost certain that these problems are due to the "bad" JIT compiler of C#, as I have done several so-called "micro benchmarks" between C++/Java/C#. I would be fine with that explanation, except that Farseer is "advertised" as "as fast or faster than Box2D", and that brings me to the thought that maybe I am doing something wrong.
I'll check back later and report on native Box2D and contacts. I already did some simple checking in my tests, and contacts are reported normally in my application. I am willing to share the executables of both apps (the Java jar and the C# exe) so you can see those "unbelievable" results.
I don't have time now, but I will check back later with my results. Thanks for helping me/us :-D
Developer
Feb 19, 2014 at 3:16 PM
I'm mostly responsible for the samples and haven't done any "low level" stuff on Farseer, so Ian would be the right person to talk to. As far as I know, though, all those benchmarks are from 3.3. Farseer 3.5 has been in development for a long time, and in the end Ian cut quite a few features and pushed them to the next release (stuff like fluid simulations etc.). To my knowledge, Farseer 3.5 is nowhere near as optimized as 3.3, as the main goal was to update to a newer Box2D version first and optimize again later. Last time I heard from Ian he was very busy but still committed to working on Farseer. That said, it may very well take a long time until a new version is released. For any specific answers you'd have to wait and see if Ian shows up here at some point ;)

Personally, I still kinda dig C# and am not a big fan of Java, but with XNA development being kinda dead I switched over to libGDX a while ago anyway. I know there is still MonoGame, but it also appears to be a dead end, at least to me. They would need to stop being "just" an XNA port and start adding new features soon, because they have quite some catching up to do feature-wise compared to similar frameworks, and they still carry broken legacy stuff like the XNA content pipeline.

So if I were you I guess I'd just go for libGDX... just my two cents ;)

Best regards,
Helge
Feb 20, 2014 at 1:31 PM
Edited Feb 20, 2014 at 1:42 PM
I made a quick test with 4000 moving dynamic bodies and had about 12k contacts ( System.out.println(world.getContactCount()); ) with 40-45 ms measured per step in native Box2D. I kept all the balls moving with a rotating "pong" paddle that constantly hit them around :-). I also get the same performance when I run the Java app with IKVM. It looks like native Box2D can take about 10 times more dynamic bodies than Farseer at roughly the same step time.
Feb 23, 2014 at 2:49 PM
Hi Nogiax,

Could you upload the examples you mentioned? I'd love to compare the two. I've made a couple of changes to the engine in one of my projects, and I'm just curious to see how they affect your tests.

Cheers,

HAL
Jun 4, 2014 at 5:14 AM
Did you run these tests in debug or in release? Often there's quite a big difference between the two.

/mattias
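For anyone re-running the tests, a quick way to record how a measurement was taken is to print the build configuration and whether a debugger is attached. A minimal sketch, assuming the standard DEBUG compilation symbol that Visual Studio defines for Debug builds; BuildInfo is a hypothetical name:

```csharp
using System;
using System.Diagnostics;

// Hypothetical helper: describes how the current process was built and started.
public static class BuildInfo
{
    public static string Describe()
    {
#if DEBUG
        const string configuration = "Debug";
#else
        const string configuration = "Release";
#endif
        // Debugger.IsAttached is true when the process runs under a managed debugger.
        string debugger = Debugger.IsAttached ? "debugger attached" : "no debugger";
        return configuration + ", " + debugger;
    }
}
```

Starting the app from Visual Studio with the debugger attached typically suppresses JIT optimizations even for a Release build, so timings are best taken from a Release build started without the debugger.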