Garbage Issues (Proposed Fixes)

Jul 7, 2009 at 8:26 PM
Edited Jul 7, 2009 at 8:47 PM

Hello,

I've found some garbage issues with Farseer Physics 2.1.1 and fixed them; the main one is in Vertices.cs, in the GetCentroid function. The GetCentroid function should be:

 

    public Vector2 GetCentroid(float area)
{
//calc centroid on counter clockwise verts. -- this generates lots of garbage...
//Vertices verts = new Vertices(this);
//verts.ForceCounterClockWiseOrder();

Vertices verts = this;

float cx = 0, cy = 0;
int i;

float signedarea = GetSignedArea();
float factor;

if(signedarea > 0) //if it's meant to be reversed, go through the vertices backwards
{
for (i = Count - 1; i >= 0; i--)
{
int j = (i - 1) % Count;

if(j < 0) { j += Count; }

factor = -(verts[i].X * verts[j].Y - verts[j].X * verts[i].Y);
cx += (verts[i].X + verts[j].X) * factor;
cy += (verts[i].Y + verts[j].Y) * factor;
}

}
else
{
for (i = 0; i < Count; i++)
{
int j = (i + 1) % Count;

factor = -(verts[i].X * verts[j].Y - verts[j].X * verts[i].Y);
cx += (verts[i].X + verts[j].X) * factor;
cy += (verts[i].Y + verts[j].Y) * factor;
}
}

area *= 6.0f;
factor = 1 / area;
cx *= factor;
cy *= factor;
_res.X = cx;
_res.Y = cy;

return _res;
}

 

This will make it generate no garbage. Also, Arbiters generate a bit of garbage too; Fixing these is a bit more complex. The problem is that they allocate new contact lists in their construct method: to get around this, the contact lists should be allocated when the Pool<Arbiter> is allocated beforehand, otherwise every created arbiter generates a bit of garbage rather than using a prechaced contact list. This also means the Pool needs to be made enumerable so you can call your InitContactLists function for each Arbiter in the pool before they are constructed - there was an older topic about this that explained it with a code sample but it must've not been integrated into Farseer.

Another source of garbage is the default list allocations: most of the lists are created without supplying a capacity value. There's very little memory footprint (we're talking a couple of hundred bytes per list) in specifying a default allocation of about 64 for every list, and this prevents garbage being generated each time the list goes over its capacity. This is also true for the ArbiterList and GenericList classes, which should implement a constructor where you can supply a default capacity (since they inherit from List<T> this is easy: just do base.Capacity = capacity in a constructor accepting an int capacity parameter).

EDIT: Also, the AABB class would make more sense as a struct - especially in the case of the sweep and prunce collider, which has some garbage generated from allocating AABBs for intersection tests.

Coordinator
Jul 7, 2009 at 10:26 PM

roonda reported the same issues here. We have also discussed the initial capacity on lists in another topic.

Thanks for taking the time to profile Farseer Physics and improve on its garbage generation. We are currently focusing on getting a good foundation for 3.0 so all of this will be taken into consideration. Up until Farseer Physics 2.1 we mainly focused on an easy to learn physics engine with rapid development. We did not focus too much on performance since it usually came back to bite us in the end. Farseer Physics 3.0 will be different tho. We have another structure that should be more flexible and give us increase performance compared to older version.

If people have trouble with the garbage generated by 2.1.1 they should try and implement the fixes suggested by both roonda and those listed in this thread. I will see if I can squeeze another release out some time before 3.0 comes out.

Jul 17, 2009 at 4:21 PM

I've just uploaded a patch that contains:

  1. The Vertices.GetCentroid() change that Ultima made above.
  2. Arbiter contact list initialisation. Most of the changes I previously suggested are now redundant due to changes for FP2.1.1, so the changes can be isolated to Arbiter.cs now.
  3. Minor tweaks to SpatialHashCollider. I haven't tackled the other colliders yet because at first look this one seemed to be the easiest to get garbage to minimal level. If anyone knows of pitfalls of using this collider over other colliders, I'd be interested to know.

 

Jul 17, 2009 at 8:31 PM

Thank you for the patch!
I have tried it with my RollerCoaster360 game(http://www.northwindent.com/?p=109). It seems to me that the game is running smoother now.

-Stefan

Coordinator
Jul 17, 2009 at 9:12 PM
This discussion has been copied to a work item. Click here to go to the work item and continue the discussion.
Coordinator
Jul 20, 2009 at 11:23 PM

What kind of test did you use to see the garbage generated by arbiters?

I've run the Pyramid sample (it creates a lot of arbiters) for a while and I can't seem to find anything.

Also, I guess you ran some sample with SAT enabled to see the garbage generated by GetCentroid() right?

Coordinator
Jul 21, 2009 at 12:03 AM

I've had a look at your patch roonda, the centroid stuff would probably work but I'm not sure about the arbiter stuff.

You are actually creating more garbage by initializing the lists in the default constructor. The initializations are never used since ConstructArbiter() is always called when an arbiter is needed.
Not all garbage is bad - this is a case of "lesser than two evils" since clearing the list is slower than simply allocating a new one and let the garbage collector get it. At least that is what I'm told, I still need to see some data on the subject.

That means that the change you made to the spatialhashcollider will actually generate less garbage, but might be slower than the one generating garbage (with the garbage collection time taken into account).

Jul 21, 2009 at 2:07 AM

Hey genbox i just thought i would menchin that garbage on 360 can cause alot more problems.

For a few reasons detailed here (for anyone that wants to know)

The garbage collector on the 360 differs from Windows in a lot of ways, but the most important are:
1) the 360 is not generational. On Windows, memory is divided into 3 generations, and garbage collection can occur separately on each one. Of these 3, two of them can GC on a separate thread, so you rarely see performance impacts. On the 360, when GC runs, every thread stops and waits for it.
2) the 360 is deterministic. The garbage collector runs every time 1mb total of memory is allocated, whether or not you need or want it to. So, if you're allocating, say, a 100x100 array of ints every frame, then you can go no more than 25 frames between each garbage collection. That's perhaps not bad if that's the only thing you're doing, but add a few dozen of that same array to your game, and you're garbage collecting one or more times per frame, which gets really painful.

#1 gets you in trouble because you can develop for quite a long time on Windows without even having to think about how often the GC is running. It might be running almost constantly, but if your game is mostly single-threaded, you won't see much impact from it. Move that same code to the 360, and the GC will suddenly hammer your performance.
#2 gets you in trouble because it's very easy to allocate huge amounts of memory without realizing it.
#3 Some frequent offenders that can really make the garbage collector go crazy are 
     A: Alocating variables within tight loops.
     B: Using string concatenation.
     C: Using foreach.
     D: Any boxing and unboxing. 

Anyway i just thought that mabey his fix would be better if it was just used when running on 360?
I also found that my adverage framerate when up in one of my old games when i used that patch. (Only by 5-10%)

Anyway just thought i would menchin this.

Jul 21, 2009 at 3:32 AM

All of these changes were made due to garbage hunting in my game. I do have SAT enabled, so that is why the Vertices centriod fix was important. NOTE: I haven't actually tested it with polygons of both winding orders, because I don't use them. I think there might even be code that forces the correct winding order, but I'm not sure if it's always called.

I think the reason you don't see garbage on the pyramid sample is because the maximum number of arbiters are created/initialised right near the start when the initial stacking occurs. In a more realistic game scenario, you'd probably have a gradual increase in the number of arbiters being used at any one time as the action heats up. Before my change, the arbiters wouldn't initialise their contact lists until the first time they get used. This goes against the mantra of "initialise everything up front", which is so much more important on the 360. I think that the extra memory which might be wasted due to the early initialisation of arbiters that aren't ultimately used, is outweighed by the fact that any garbage is extremely bad on the 360. Danthekilla's point 2) describes why this is so.

I guess the argument is the same for the spatial hash collider. Admittedly I haven't profiled it for a speed comparison. I just made the changes since I could see they would reduce garbage. Obviously the call is yours genbox as to what goes into the engine, but I've made my argument. Others are free to use the patch even if it doesn't go in.

Coordinator
Jul 21, 2009 at 4:26 AM

@danthekilla: Are you also running SAT?

Do you have some numbers from Remote Performance Monitor on the Xbox? My tests with CLRprofiler shows about 121kb of arbiters allocated (2,43% of total - the rest has nothing to do with the physics engine) over 60 seconds. That is next to no garbage.

As for the garbage collector - It is needed and it will have to run. If it does not run, you are not allocating anything ;)
If you suddenly allocate a lot it will take the GC a lot of time to remove it all (when it no longer have a reference) and compact memory.

When you look at the numbers from the garbage collector, you need to take the number of GC and the time for each GC into account. I need to see some numbers from the Xbox to see where we can improve garbage generation.

@roonda:

I agree that the vertices centroid generates a lot of garbage and that is bad. The GetCentroid() method was never designed to run all the time, only once per geometry and not 100 times on each collision. So that will be fixed.

As for the arbiter fix; it should not fix anything because the pre-loaded lists are not used at all. You call InitContactLists inside the default constructor so that each arbiter (Default is 100 of them) will have initialized collections. But once you get an arbiter out of the pool the ConstructArbiter() method is called on the arbiter and the contact lists are initialized again. So the pre-loaded list are never used and have actually become garbage themselves.

There are only two solutions since arbiters are destroyed and created all the time (the pool is there to make sure the arbiter itself does not get removed/become garbage)

1. Initialize the lists on each run.
This would obviously create some garbage since new memory is allocated each run. It is very little garbage since each collection only holds a few items.

2. Clear the lists on each run.
This would make the lists stay in memory but clearing the list will take a long time since you will have to remove all items in the list.

 

And as for the pyramid sample, it does create arbiters. In fact it creates a lot of arbiters. Even tho they are standing on top of each other they are jittering, but I didn't just let them stand there, I ravaged the whole pyramid for 60 seconds and I did not even get one garbage collection. Just wondering what test you made to see any garbage.

I'm more than happy for your patch and I'm not trying to counter-argument anything. I'm just saying that the arbiters fix does nothing and I observe no garbage generation in the arbiters besides what is expected.

Jul 21, 2009 at 6:25 AM
Edited Jul 21, 2009 at 8:54 AM

I have to dispute this: "As for the arbiter fix; it should not fix anything because the pre-loaded lists are not used at all. You call InitContactLists inside the default constructor so that each arbiter (Default is 100 of them) will have initialized collections. But once you get an arbiter out of the pool the ConstructArbiter() method is called on the arbiter and the contact lists are initialized again. So the pre-loaded list are never used and have actually become garbage themselves."

Inside ConstructArbiter(), this code is called:

            //Initialize the contact lists
if (_contactList == null)
{
_contactList = new ContactList(PhysicsSimulator.MaxContactsToDetect);
}
if (_newContactList == null)
{
_newContactList = new ContactList(PhysicsSimulator.MaxContactsToDetect);
}
if (_mergedContactList == null)
{
_mergedContactList = new ContactList(PhysicsSimulator.MaxContactsToDetect);
}

So the contact lists are only created if they were null to begin with. My change simply moves this code to a private function
that gets called in ConstructArbiters() AND in the default constructor. The result should be that the only time these lists are
newed is when the global Pool<Arbiter> becomes empty and new arbiters are created.

Jul 21, 2009 at 8:14 AM
Edited Jul 21, 2009 at 8:15 AM

What roonda says about the arbiters is correct.

As for GetCentroid, the only reason that generates garbage is because it depends on the winding order of the polygons; at the beginning of the function it creates a new copy of the vertices, then forces those into the correct winding order; my change to this function skips this so as to not generate any garbage, and to make the function work correctly it steps through the vertices backwards if they're in the wrong winding order (thereby fixing the problem) - this will probably be a tad faster too since it doesn't allocate a whole new vertices object.

There's no difference between garbage on windows and garbage on the xbox, only the garbage collection time. With a library like Farseer it's critically important that it's possible to use it without generating ANY garbage while your game is running in some way, because the garbage colelction time depends on the game. Obviously garbage collections between levels are ok, but stuff like garbage collections in the broad/narrow phase colliders, arbiters etc that may be allocated at any time is a problem.

To give you an idea, my game is reasonably simple at the moment, but garbage collections take about 3-4 seconds (on the xbox - on PC they take no time at all, which might be why you don't see it as such a big problem :P) since I have a lot of objects on the heap. If this is triggered mid gameplay it's a HUGE problem... game breaking even. So I've made modifications to farseer to make sure it doesn't allocate anything while my game is running.

As for clearing lists vs allocating new ones; you always want to go with clearing them (at least for the xbox). Clearing the lists takes a predictable amount of time in case by case scenarios, whereas memory allocation may trigger a garbage collection and in a lot of games (on the xbox, maybe occasionally on windows) this could cause a few seconds of lag, or at least a big stutter every second or so.

I'm going to london today but if I can get on a PC at home I'll try and get you some solid garbage collection data. I'm really keen to nip this in the bud because Farseer is a fantastic physics engine, and I think it's important that it can run on the xbox without garbage! :)

 

NOTE: I'm also running SAT, but I've tried the Distance Grid too.

Jul 21, 2009 at 9:07 AM

Thanks for speaking up Ultima, I couldn't have said it better.

Maybe we all need to donate $20 so the Farseer team can get a 360. Seems like a worthy cause to me.

Coordinator
Jul 21, 2009 at 4:05 PM

@roonda, @Ultima2876, @danthekilla: First I would like to say sorry, I totally skipped the fact that they were only created if they were not allocated beforehand. I wrote my last reply at 6:26 in the morning and at that time I've been up for nearly 30 hours - that does not excuse my wrongful reply and lack of details. Thanks for sticking by what you know and proving me wrong.

I'm working the whole day so I don't have a lot of time for Farseer Physics, and that is why I'm usually working on the in the night (In Denmark, GMT +1).

 

Back to the topic; The patch seems 100% valid and I'm going for the clearing of lists instead of allocating new ones. I'm also going to put default capacity on the lists even tho I waited until 3.0 to do stuff like that.

@roonda: Thanks for suggesting donations. Matthew already has an Xbox360 but it is limited what he can test for me. A little off topic, but I have actually tried to win an Xbox on several occasions for this project. I've been entering the Ziggyware article contest and other article contests to win an Xbox for this project (I don't care about the rest of the prizes, Matthew could have it if he wanted it).
We currently have $50 on our donation account as seen here.

I will work on lowering the garbage today. Last night I reworked the whole samples structure to make it easier to maintain, so I might even be able to reduce garbage generated there too.

Jul 21, 2009 at 7:53 PM

Hey genbox, that's totally cool - everyone makes mistakes!

I'm up for donating $20.

Coordinator
Jul 21, 2009 at 8:16 PM

Thanks Ultima2876.

I've written a note on the frontpage. All donations will go directly to the Xbox 360. Do you guys have any preferences or tips when it comes to buying a Xbox360?

 

Coordinator
Jul 22, 2009 at 2:51 AM
Edited Jul 22, 2009 at 2:58 AM

I've added the patch uploaded by roonda and I've implemented some of the other stuff mentioned here and in the other thread.

I skipped the whole init contact lists and simply initialized the lists in the constructor of the arbiter. Whenever an arbiter is created, so are the contact lists. Every time the arbiter is needed, the lists are cleared.

I've also reduced the memory copying by introducing the ref keyword to vectors in the internals. There were only a few places, but it should improve performance a little.

It would be great if you could test the latest changeset on Xbox and see what kind of garbage is generated.

Edit: Note that the number of cached arbiters have increased to 128 for testing purposes. The garbage collector will run when you create a physics simulator object because of this. I guess it is better to preallocate a lot of arbiters and have the garbage collector run when the physics simulator is created instead of having it run inside the game.