2013-04-06

Performance Fixes

To try to improve the stability and responsiveness of the site (availability is still hovering at about 99.6%, which is okay but not great) I grabbed a trial copy of the ANTS Performance Profiler and did some CPU and Memory profiling. Good news - there was some low-hanging fruit that may explain some of the issues. All of it came down to non-obvious behavior in the PDFsharp library I use as an intermediary for rendering to bitmaps and PDFs.
  • CPU: Each transform applied to the graphics state resulted in two matrix multiplies. In many cases my code was stacking up to 5 transforms before rendering anything (scale, rotate, translate, rotate again, scale again...). This showed up as a CPU hot-spot. This was greatly reduced by composing the matrices locally then sending the final matrix into the graphics context.
  • Memory: Every text operation (DrawString, MeasureString) resulted in the allocation of several temporary StringFormat objects. Just a few seconds of panning around the map would generate 3 MB of such objects - even more than temporary strings and at least an order of magnitude more than any other memory churn. They would all be GC'd eventually, but under load this could have caused the site to blow past the application memory pool limit, resulting in a reset. This was fixed by changing PDFsharp to re-use the default objects, and I reported the issue to the maintainers.
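
As a rough illustration of the matrix fix in the CPU item above (a Python sketch, not the site's actual code and not PDFsharp's API): the helper names and the CountingContext class are hypothetical stand-ins. The idea is to multiply the transforms together locally and hand the graphics context one final matrix, so any per-call overhead inside the context is paid once instead of five times.

```python
import math

# A 2D affine transform stored as (a, b, c, d, e, f), mapping
# (x, y) -> (a*x + c*y + e, b*x + d*y + f).
IDENTITY = (1.0, 0.0, 0.0, 1.0, 0.0, 0.0)


def multiply(m, n):
    """Return the transform that applies m first, then n."""
    a1, b1, c1, d1, e1, f1 = m
    a2, b2, c2, d2, e2, f2 = n
    return (
        a1 * a2 + b1 * c2,
        a1 * b2 + b1 * d2,
        c1 * a2 + d1 * c2,
        c1 * b2 + d1 * d2,
        e1 * a2 + f1 * c2 + e2,
        e1 * b2 + f1 * d2 + f2,
    )


def scale(sx, sy):
    return (sx, 0.0, 0.0, sy, 0.0, 0.0)


def translate(dx, dy):
    return (1.0, 0.0, 0.0, 1.0, dx, dy)


def rotate(degrees):
    r = math.radians(degrees)
    return (math.cos(r), math.sin(r), -math.sin(r), math.cos(r), 0.0, 0.0)


def compose(*transforms):
    """Fold a sequence of transforms into a single matrix, left to right."""
    result = IDENTITY
    for t in transforms:
        result = multiply(result, t)
    return result


class CountingContext:
    """Hypothetical graphics context that counts how often it is asked to transform."""

    def __init__(self):
        self.matrix = IDENTITY
        self.calls = 0

    def multiply_transform(self, m):
        self.calls += 1
        self.matrix = multiply(self.matrix, m)


steps = [scale(2, 2), rotate(90), translate(10, 5), rotate(-90), scale(0.5, 0.5)]

# Before: every step goes through the context.
stacked = CountingContext()
for t in steps:
    stacked.multiply_transform(t)

# After: compose locally, then make one call.
composed = CountingContext()
composed.multiply_transform(compose(*steps))

print(stacked.calls, composed.calls)  # prints "5 1"
```

Both contexts end up with the same matrix; only the number of trips through the context changes. The local multiplies still cost something, but they are plain arithmetic with no graphics-state bookkeeping attached.
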
The remaining major CPU hot spot is, unsurprisingly, in clip path intersections for borders and sector boundaries, which is probably the most subtle and complex part of the map rendering process. Unfortunately, it will be hard to optimize directly but I can probably reduce the need to do it in the first place, e.g. more aggressively pruning which borders to render for a tile, and only doing sector edge clipping if there's a border that goes outside the sector in the first place.

I'm hoping that the above fixes will result in higher site availability - I'll be watching my monitoring service (Pingdom) to see if they've made a material difference.

By the way, the ANTS profiling tools are quite nice. There's one for Memory and one for CPU. Like most profilers they work by attaching to your process and recording what functions are taking time (for CPU) or what objects are being allocated/freed (Memory). The ANTS tools integrate very nicely with ASP.NET - I was able to just point them at my development directory and say "go" and they launched a dedicated web server and browser so I could generate some traffic. The results are also really easy to understand - there's the traditional drill-in display that shows you inclusive time for each function, but also a source display and interactive call/allocation graph. Very effective!

UPDATE (2013-04-07):

Just to show how helpful profiling tools can be: I was investigating the border clip path logic and found a flaw in the "selector" code which helps determine which sectors are "in view" when rendering a tile. It was under-computing the number of sectors overlapped by a tile. Fortunately, there's already a 1-sector "slop factor" that's necessary so that e.g. routes or labels that cut across a sector's boundary are rendered, so this bug never manifested as missing content. Unfortunately, the correct code meant that even more time was spent on border path clipping (c/o the ANTS CPU profiler).
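
The exact flaw isn't described here, so the following Python sketch only illustrates how a selector of this kind can under-count and how a slop factor can hide it. Everything in it is an assumption for illustration: the sector dimensions, the function names, and the "naive" version, which is one hypothetical way to get the count wrong and not necessarily the bug that was found.

```python
import math

# Hypothetical sector size in map units.
SECTOR_WIDTH = 32
SECTOR_HEIGHT = 40


def naive_span(start, length, size):
    """Under-counts: sizes the span from the tile length alone,
    ignoring where the tile starts within a sector."""
    first = math.floor(start / size)
    count = math.ceil(length / size)
    return range(first, first + count)


def correct_span(start, length, size):
    """Indices of every grid cell overlapped by [start, start + length)."""
    first = math.floor(start / size)
    last = math.ceil((start + length) / size) - 1
    return range(first, last + 1)


def sectors_in_view(x, y, width, height, slop=1, span=correct_span):
    """Return the set of (sx, sy) sector indices to consider for a tile,
    padded by `slop` sectors on every side."""
    xs = span(x, width, SECTOR_WIDTH)
    ys = span(y, height, SECTOR_HEIGHT)
    return {
        (sx, sy)
        for sx in range(xs.start - slop, xs.stop + slop)
        for sy in range(ys.start - slop, ys.stop + slop)
    }


# A small tile that straddles the boundary between sectors 0 and 1.
truly_overlapped = sectors_in_view(30, 0, 10, 10, slop=0)
naive_with_slop = sectors_in_view(30, 0, 10, 10, slop=1, span=naive_span)
correct_with_slop = sectors_in_view(30, 0, 10, 10, slop=1)

print(sorted(truly_overlapped))
print(len(naive_with_slop), len(correct_with_slop))
```

In this toy case the naive span misses sector 1, but the 1-sector slop pulls it back in, so nothing goes missing on screen. The corrected version selects more sectors overall, which is consistent with more time going into per-sector work afterwards.
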

Borders are a special case in that they are always clipped to the sector bounds anyway, though, so rather than relying on the expensive clipping operations to do the right thing I added a simple bounds-intersection test. This cut border rendering from nearly 55% of the time of a typical tile rendering pass to only about 15%, below the overhead for drawing worlds (about 22%). This should also help with site stability.
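
A bounds-intersection test of this kind can be sketched in a few lines of Python. This is a minimal sketch under assumptions, not the site's code: the Rect type, the function names, and the idea of keying borders by sector name are all hypothetical. The point is that a cheap rectangle comparison can reject a sector's borders before any clip path is built.

```python
from typing import NamedTuple


class Rect(NamedTuple):
    """Axis-aligned rectangle; right and bottom edges are exclusive."""
    left: float
    top: float
    right: float
    bottom: float


def intersects(a, b):
    """True if the two rectangles share any area."""
    return (a.left < b.right and b.left < a.right
            and a.top < b.bottom and b.top < a.bottom)


def sectors_needing_borders(tile, sector_bounds):
    """Keep only the sectors whose bounds overlap the tile.

    `sector_bounds` maps a (hypothetical) sector name to its Rect.
    Borders for every other sector can be skipped without clipping.
    """
    return [name for name, bounds in sector_bounds.items()
            if intersects(tile, bounds)]


tile = Rect(30, 0, 40, 10)
sector_bounds = {
    "west": Rect(0, 0, 32, 40),
    "east": Rect(32, 0, 64, 40),
    "far east": Rect(64, 0, 96, 40),
}

print(sectors_needing_borders(tile, sector_bounds))  # prints "['west', 'east']"
```

The test costs four comparisons per sector, so it is cheap enough to run for every candidate sector, including the ones pulled in by the slop factor.
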

Ω