Xbox360 Slowdown

Topics: User Forum
Jun 7, 2009 at 3:02 PM

Hey guys,

I need a bit of help figuring out things to try to improve performance on the Xbox.

The game I'm working on typically has 200-300 objects in each level. While this seemed to be completely fine on the PC, it brought the Xbox to it's knees. I've attempted to optimize my level design a little (by pooling repetitive objects, etc), but the game still comes to a standstill when multiple collisions arise. According to a snapshot of the GC heap, it would appear that a lot of Vector2 garbage is created by the Geom and Vertices classes.

Does anyone have any suggestions/tips on what I can do to improve performance?

Thanks :)

Jun 7, 2009 at 3:12 PM

It would be really great if you could track down the excact lines where the garbage happens on Xbox. I borrowed an Xbox 360 for a day, and it seemed like it does a lot less optimizations than the PC JIT compiler.

Anyway, if you find a line or pattern where the garbage is coming from, I'll be more than happy to tweak it and get it out.

Jun 7, 2009 at 5:48 PM
Edited Jun 7, 2009 at 9:37 PM

This is my first time working with profiling tools, so I'm not entirely sure how to determine the source of the garbage.

I downloaded the trial version of dotTrace, and It indicates that about 90% of the game runtime is spent in physics with about half that time performing the BroadPhaseCollision. It appears that the comparison and sorting calls for List<> are consuming the largest amounts of time.

I have taken two screenshots of the breakdown of DoBroadPhaseCollision. Let me know if this was helpful, or if there is something specific I can do/use to help you out.

I'm still learning how to use profiling effectively, so any instruction would be appreciated :)

EDIT: oops, forgot the screenshots


Jun 7, 2009 at 8:54 PM

Sorry that I reply so late - I've spend all day tracking down a bug that was really not a bug, but a feature. :)

Anyway, here it goes:

I've always known that the contact comparer generated garbage. I've moved it once before to be static and out of scope to make sure it was only created once. From my profilings I could still see it on the list of allocations, but it was much less. I profiled FP 2.1 again today and noticed that the comparer actually was about the only thing that generated garbage, so I took the liberty of creating a simple insertion sort and use that instead. Here is the good stuff first:

1. No more garbage from it.
2. Better performance since we use less complex code and insertion sort works great on small lists.

Not so good stuff:

1. The simulation ran differently and things that fell in a unique way before started to fall in a new unique way.

The bug I mentioned before is the not so good stuff - but after some testing my mind finally had a "oh snap"-moment: ... That's right, .net List.Sort uses an unstable quicksort algorithm!
Anyway - because it is unstable and the insertion sort is stable, the simulation behaved a little different. The simulation has not become more inaccurate or accurate. There is no way of telling. So I just call it a feature :)


As for the results of the garbage removal. here is the improvement:

Test was (excatly) 200 seconds long in both cases.

Before change:
5 gen1 garbage collections

After change:
3 gen1 garbage collections

The 3 gen1 garbage collections consists of Vector data from the drawing of geometries and strings for updating the FPS counter in the game.

Jun 7, 2009 at 9:02 PM

@sterlingware: In my experiance List<> is much slower then using arrays. Here is one example of the potential speed differences As you can see using arrays can provide a huge speed up. I also bet that the Xbox360 there is a huge difference. I very much doubt that me or genbox get time to try and rewrite anything to use arrays instead of lists. I will assure you the next version will avoid all the pitfalls this version has.

Jun 7, 2009 at 9:46 PM

@genbox: thanks for looking into this :) I appreciate the help.

So, just to clarify, the garbage issue is something that will be resolved in 2.1?

@mattbettcher: I am not at all familiar with how the engine works, so I think it's worth asking before moving forward; do you think it would be possible (for me) to replace the List<> instances with arrays?

Jun 7, 2009 at 10:02 PM

@sterlingware: Well, 5 garbage collections (small in size) over 200 seconds is not really an issue. It is expected and within acceptable bounds. But removing the garbage created by List.Sort() is always nice. This means even less garbage collections.

I uncommented the FPS counter i the demos and ran the version where I changed to another sort implementation, and there were absolutly no garbage collections (0 gen1, gen2 or gen3) - there were no garbage build-up at all actually.

Xbox360 can't perform as good as a PC (not using only one core at least) so you will need to do some clever programming to make sure you have as little in the simulation as possible. Example: Only activate geometries when your player is near by.

You should also test performance when you have compiled Farseer Physics in release mode (and without the debugger). If the JIT compiler knows it can do release mode and no debugger is detected, it will optimize the code and you will gain 2x the performance.

@mattbettcher: Let the never ending micro-optimization war begin!

No, just kidding. :)

Since array is a datastructure and List is an implementation that uses that datastructure, it is bound to be slower. List does a lot of the normal operations that we would have to implement anyway; such as resizing when dataitems exceed the array size. However, list is not needed everywhere (we could send around arrays internally and only expose lists) and we could perhaps gain maybe 5% speedup (my guess) by changing to arrays.

I would rather have the flexibility of lists or build an implementation myself than to switch everything to array. I would like to see how much of a gain we would have tho. :)

Jun 7, 2009 at 10:15 PM
Edited Jun 8, 2009 at 2:29 AM

@genbox: silly me, it never occurred to me to compile it in release mode, I will try that now. Is the version with the alternative sort implementation you mentioned above 2.1 or 2.0?

EDIT: nvm, I used my eyes properly and read,

"I profiled FP 2.1 again today and noticed that the comparer actually was about the only thing that generated garbage, so I took the liberty of creating a simple insertion sort and use that instead"

from your post above

I will also work on activating geometries when the player is close to them. Thanks for the suggestions :)

Jun 8, 2009 at 1:44 PM


Before you go ahead and implement something that you might regret, I wanted to quick-comment on the following line: "I will also work on activating geometries when the player is close to them."

Because we want Farseer Physics to be really easy to work with, we have omitted some complex stuff that would prevent our users from doing bad design. The most common case I see is using 2 foreach loops inside Update() that checks each and every geometry against the next. This is really computational expensive and really bad design too since the engine does that anyway and it does it way better using complex algorithms.

Anyway, that is not the point here. The point is that when you write "activating geometries when the player is close to them" I can already see the code to check each and every geometry against the player. There are way better methods of doing that. One way is to load everything on the level and activate the inactivity controller. It needs a lot of tweaking, but it can improve performance a lot in platform games.

Another method is to segment the level (still thinking platform games here) into parts and only enable the parts when the player enters the area. This can be done using AABBs that you check against the player. This is a lot cheaper than checking the distance between all geometries.
Oh, and another note: make sure to make the AABB big. Crysis uses this technique to only start animations/physics/CD/AI when the player enters the area, but sometimes you can actually enter an area in such a way that it does not detect that the player entered it. I had a lot of fun with that.

Jun 9, 2009 at 12:29 PM

I wrote a little test application on my laptop that made a list and an array.

The list started off at size 5000
the array at size 10,000

then I filled the array with 5000 real objects (the same as the list)

then I just did some stuff on them both

stuff like
               - changing the data in them using some for loops
               - adding new objects to the list and the array
               - removing objects from them both

then I looked at the results

there were some things I didn’t expect (these were all running in release)

iterating though a list and changing one value in each object was 1.3 times slower than array (I didn’t expect this, I would of thought they would have been the same)
adding 1000 objects to a list took about 3.3 times longer than just finding and changing null's in the array to these objects
removing objects from the list took 17 times longer than setting something to null in an array

now most of this stuff I would have thought would be slower
but I didn’t think it would be that much slower

the other thing is I tested it on a 360 at work too and lists in general were about 3-5 times slower on top of what they were on pc.

I have also noticed that most of the 360's slowdown is in the broad phase and not the narrow phase (in the pyramid demo for example)
where as on the pc the broad phase is usually less than the narrow

I haven’t really ever seen farseer generate garbage though I must say
if it does I never really noticed as it was less than the little bits that some of my other programs make. :)

Anyway I just thought I would post my random thoughts.

Jun 9, 2009 at 12:34 PM

Thanks for running those tests, its always cool to see

Makes sense to me tho,

ive always thought i should be using arrays whenever i can,

it was just all to convenient with list,

so easy to use, of course there has to be something bad about them.

What would be cool for a different test is instead of removing the objects from the list

call the objects by index and set them to null like you were for the array.

Jun 9, 2009 at 12:37 PM

Thats a good idea i never thought of that

i am used to using lists like lists i suppose
i generally just remove stuff off of them

that might get rid of the really bad remove time

any idea why going though a list with an index is slower than an array?
its about 4-5 times slower on 360

Jun 9, 2009 at 1:27 PM
Edited Jun 9, 2009 at 1:30 PM

Have you already tested it? That was fast

Unless your asking why that might happen, in that case

I would assume there shouldnt be a difference when modifying through index

I can imagine that because the way lists were made and were intended to be used would cause a difference,

List are meant to work in a dynamic manner taking advantage of certain things that are perhaps to improve ease of use or accomadate certain features but arent computationally fast or performance friendly,

an array is rather static, you provide it ahead of time with a capacity and fill it with "blank" objects and it is not intended to be extended or shortened only modified although that is allowed.

However this is just my theory, that is why it is great you are doing tests.

edit: Also an array is given almost all the information it needs at initialization and may improve its organization, which may make iterating through it easier on the comp or xbox.

Jun 9, 2009 at 1:41 PM
Edited Jun 9, 2009 at 1:45 PM

I just tested it now

its still about 1.3 times slower than the array even though im just setting them too null
and on 360 its about 3-4 times slower (the 360's cpu or way it gets optimized must suck in comparison to pc lol)

Edit: I suppose they have had a lot longer to optimize the pc compared to the 360 and i think the 360 technically uses the pocket pc .NET Compact Framework not the same one as the pc perhaps thats why its so slow (although you would think the pocket pc version would be lighter and faster..)

Jun 9, 2009 at 1:52 PM

Well atleast now we all know a way to squeeze out some more optimization

I have recently been using only arrays anywho but its cool to actually know it makes a difference

Jun 9, 2009 at 4:27 PM
Edited Jun 9, 2009 at 4:29 PM

@danthekilla: The one that surprised me was the 17 times slower remove.

To give a hint why List is slower than Array, here are the technical details about List<>:

.Add() ensures that the size of the array is enough and if it is not, it will create a new array with double the size.
Example: If you have a list of 4 items, your array size will be 4, if you add one item more, it will create a new array of size 8. - The default capacity of a list is 4.

Using the constructor that takes a capacity will minimize this overhead. I did not bother using the constructor because FP was simple and easy to develop :) I will focus on stuff like that for 3.0 to make sure that 1) it does not create garbage and 2) to gain some performance (even tho it is very small)

.Remove() simply finds the index of the item (Using IndexOf() ) and removes the item (using RemoveAt() ) - Internally RemoveAt() uses the Array.Copy() function.

.Sort() sorts the List using Array.Sort() that uses an unstable QuickSort algorithm. For anyone that know QuickSort, it is not that quick for small lists :) - It's very efficient for large lists tho. It would be great if they implemented a switching algorithm that used a faster sort on small lists. I can imagine that would create some chaos tho.


Those are the methods used in Farseer Physics. (since 2.1 we have removed the use of .Sort()). Have in mind that List<> has several reasons why it is slow, one of them is because it uses exceptions everywhere. Catching exceptions is expensive - Don't get me wrong, exceptions are needed but should only be in runonce-code and not in iterative code (Code that runs in a continues loop). That is the way we implemented Farseer Physics.

At some point Matthew or I will probably create our own List<> implementation that uses an array directly and skip all the functionality. We will have a look at this in 3.0.

EDIT: This information comes from the C# ECMA standard specification and the Microsoft Developer Network

Jun 9, 2009 at 8:01 PM

That part about multiples of 4 sounds familiar,

but yeah it seems like it is just a class that is meant to do alot of the work for you even though they could be achieved with a normal array.

Two Quick questions though whats a linkedlist for? and if a list is like a shell around an array, why cant there be multidimensional lists? Or can there?!

Jun 9, 2009 at 10:22 PM

@tlegion: LinkedList is a datastructure like array is. It is a sequence of data fields where one of the fields contain a reference to the next field in the list. This makes it possible to access the next field very easily. You might wonder why use a LinkedList instead of array, well, the order that the data is stored is not really important when using linked lists. The first data record has a pointer to the second, the second has a pointer to the third... and so on.

In arrays, the data needs to be laid out sequential (that is also why it is so fast). There is also a data structure called double linked list. Each data record simply consist of two pointers and a data field. One pointer to the previous field and a pointer to the next field. This way you can iterate the data both from the beginning and from the end - without worring about how the data is laid out in memory.

As for the multidimensional lists, that is indeed possible. You simply do something like this:

List<int>[,] list = new List<int>[2, 2];
list[0, 0] = new List<int>();
list[0, 0].Add(42);

//should write 42
Console.WriteLine(list[0, 0][0]);

You can also take it a step further and create dynamic-dimensional lists by using generics in C# 2.0 - I consider it a bad practice tho :)


Jun 10, 2009 at 8:34 AM

yeah haha, i just thought it be interesting to see if it were possible,

in a tight squeeze I do tend to use lists for stuff like a random number of integers i will need to gather for a method,

although checking the capacity and increasing it if it cant hold a new item, would make an array just as functional, but its hard to believe thats how they were intended to be used so I am not sure if it would perform any better.

That would be another good test!

Add items to a list the normal way, and do the same with an array increasing its capacity when neccesary.

Perhaps to keep it fair increase its capacity by 4 in the same manner a list does

Jun 27, 2009 at 5:34 PM

@genbox: I'd like to grab the changes you made to remove the list sorting garbage, but I'm a little hesitant to integrate the latest checkin. Can you please point me to the change set number and file(s) concerned? (assuming it didn't make it into 2.1). Thanks a million. I noticed that there seemed to be a huge reduction in garbage from 2.0.1 to 2.1 so nice work.

Jun 27, 2009 at 5:42 PM

The changes are in 2.1. They can be seen in Arbiter.cs from line 389.

Jun 28, 2009 at 1:24 PM

Ah. I've done some further investigation. It seems the garbage I'm seeing is generated by SAT.PolygonCollision(), because it calls Vertices.GetCentroid() which does a new Vertices().

Jun 28, 2009 at 1:42 PM

I removed the call to new Vertices(). It was only there to ensure CCW ordering. This reduced garbage hugely. Perhaps another performance gain could be made by caching the centroid value in Vertices (until a new Vector2 is added). Anyway, back to the garbage issue. Most of the remaining garbage looks be due to the new ContactList() calls in Arbiter. What do you think about making a pool for ContactList so arbiter doesn't have to new them?

Jun 28, 2009 at 3:30 PM

The vertices class could gain a lot by caching, but one thing is that it is not called a lot in the simulation itself, but another is that it is a tradeoff between CPU time and memory. I don't think it is too big of an issue, but imagine that we cached all distancegrids created (huge improvement to speed) but the engine would take 200mb ram to run. Just to put things into perspective.

As for the contactlists, I believe that Jeff once profiled that part and found that his old implementation where he cleared the list instead of newing it, actually were a lot slower than just newing it and letting the garbage collector do its job. I will take a look at it and see what can be done.

Anyway, thanks for looking at the garbage generated. I will put down more caching on my TODO list for 3.0.

Jun 29, 2009 at 12:37 PM

Glad to help. I'm happy to report that I've further reduce the garbage by another large chunk. I can see almost zero Farseer generated garbage now.

It turns out that the the ContactLists were not actually being newed for each new collision, but only the first time a new arbiter was fetched from the Pool<Arbiter>. I fiddled around with it and did two things:

1) Made the Pool<Arbiter> to have an initial size of 1000 instead of 100

2) Initalised the ContactLists right after the pool was created. To do this I had to add an "InitContactLists" function to arbiter as below and also make Pool<T> inherit from IEnumerable<T> so I could foreach through and call the init function:

        internal void InitContactLists(PhysicsSimulator physicsSimulator)
            //Initialize the contact lists
            if (_contactList == null)
                _contactList = new ContactList(physicsSimulator.MaxContactsToDetect);
            if (_newContactList == null)
                _newContactList = new ContactList(physicsSimulator.MaxContactsToDetect);
            if (_mergedContactList == null)
                _mergedContactList = new ContactList(physicsSimulator.MaxContactsToDetect);

And in ConstructPhysicsSimulator():

            //Poolsize of 1000, will grow as needed.
            arbiterPool = new Pool<Arbiter>(1000);
            foreach (Arbiter a in arbiterPool)


Aug 13, 2009 at 7:11 PM

I figured i should post in this topic since i am currently also trying to get my code to run smoothly on the xbox. My first question, however, has little to do with the topic at hand.

@genbox - I havnt updated my Farseer since back when Jay and I were asking about whether it plays nice with FRB back around January. Will i run into any issues and will i find any performance benefits by switching to the latest realease?

Also, can someone point me to or provide me with info on the inactivity controller and how to properly use it?


Aug 13, 2009 at 8:57 PM

See the 2nd post here

Aug 13, 2009 at 9:48 PM


I'm editing so many source trees and different versions that I can't remember if we had any huge performance improvements. The best way of finding out is to simply try it out. The API should not have changed too much, so only minor edits are needed to get the latest version working.

If you use copies of geometries and the distance grid narrow phase collider (it is default), you will get lower loading times since copying of geometries was broken before 2.1.1.

Aug 14, 2009 at 1:08 AM


The inactivityController has two parts:

the controller attached to the simulator:

            mInactivityController = new InactivityController(mPhysicsSimulator);
            mInactivityController.Enabled = true;
            mInactivityController.ActivationDistance = someDistanceFloat; 
            mInactivityController.MaxIdleTime = maximumTimeThatObjectCanBeIdleBeforeBecomingInactive;

then the properties on each Body object that you want to be affected by the InactivityController:

someBody.IsAutoIdle = true;
someBody.MinimumVelocity = 1f;

if the Body Velocity falls below MinimumVelocity, the idle timer starts counting up to MaxIdleTime.

In addition to the activityController, I was thinking of keeping 2 different lists within my FlatRedBall level ("screen"):

//LevelDirection determines the sort property of level objects
        //e.g. if leveldirection = right then objects are sorted by position.X+
        private enumLevelDirection mLevelDirection = enumLevelDirection.None;

        //Lists contain objects outside of screen bounds, sorted by position X or Y
        private List<PhysicalObject> LInactiveWorldObjects = null;

        //Lists contain objects within screen bounds
        private List<PhysicalObject> LActiveWorldObjects = null;

Based on the current enumLevelDirection, the objects in each list would be sorted by their Position on screen.  So in your game since the action is going from left to right, the object would be sorted by Position.X incrementing.

By keeping track of the visible screen coordinates:

//find screen edges at current Z
            float leftEdge = SpriteManager.Camera.X - SpriteManager.Camera.RelativeXEdgeAt(SpriteManager.Camera.Z * -1);
            float rightEdge = SpriteManager.Camera.X + SpriteManager.Camera.RelativeXEdgeAt(SpriteManager.Camera.Z * -1);
            float bottomEdge = SpriteManager.Camera.Y - SpriteManager.Camera.RelativeYEdgeAt(SpriteManager.Camera.Z * -1);
            float topEdge = SpriteManager.Camera.Y + SpriteManager.Camera.RelativeYEdgeAt(SpriteManager.Camera.Z * -1);

you should be able to move things from the inactive list to the active and from active to the inactive... without going through all the contents of each list.



Aug 17, 2009 at 2:41 PM

I tried updating to the farseer 2.1 and ran into an issue. The issue is that it appears a geometry cannot be removed or made to not collide if it is currently colliding with another geometry. I believe we ran into this problem in 2.0 and genbox helped megamanzz with a fix. Any idea what might be causing this problem for us?


Aug 17, 2009 at 4:30 PM

It might be the same issue again. The fix caused some major problems with stability. The real fix is to remove all contacts related to a geometry once it gets removed from the physics simulator.

The contacts should get moved if you dispose the geometry (geom.Dispose) or disable it (geom.Enabled = false).

Aug 21, 2009 at 4:22 PM
danthekilla wrote:

I have also noticed that most of the 360's slowdown is in the broad phase and not the narrow phase (in the pyramid demo for example)
where as on the pc the broad phase is usually less than the narrow


If this is the case then what would be the best way to balance optimize the level for collision detection (i.e. size of geoms). Currently my levels are made up of relatively small blocks each just a little bigger than the character itself. I did this initially to cut down on the narrow phase collision detection, however reading this makes me think im doing more harm than good.

Aug 21, 2009 at 5:20 PM

It is a tradeoff between number of geometries and size of geometries. Profiling before and after is always a good idea to see if you actually gained some performance from your optimization. You can also try changing the broadphase to a different algorithm and see if that increase your performance.

By the way, the default broad phase does a lot of operations and it might give better performance switching to a simpler broad phase like the spatial hash (on the Xbox). Only way to know is to measure it.

Aug 21, 2009 at 11:47 PM

Thanks for the tip, I gave the Spatial Hash a try and it definitely was an improvement now I will mess around with the size of the geoms. Is there a tool that lets you profile code that is running on the xbox? I have been profiling on my PC but it would be more helpful if the profiling took into account the differences and limitations in platform.


Aug 22, 2009 at 11:50 AM

The performance panel in the debugview will show you the areas to concentrate on. For everything else you can use the Remote Performance Monitor (you might already have it installed).