Functional Benchmarking Accessibility

by Ross Patterson last modified 2009-01-08T07:55:00+02:00
More progress on load test benchmarking

Yesterday was another great day at the Plone Performance Sprint in Bristol, UK.

I continued working with the load test benchmarking team yesterday. One of the more enjoyable aspects of our team's work has been how natural and effective the division of labor is. Tom and I worked on the FunkLoad buildout and the read-only load tests for Plone core. Andrew built on the read-only tests to produce a write-heavy load test. Ed and Russ have been working, at least in part, on different content profiles against which to run the different test scenarios.

Toward the end of the day, Tom and I moved on to making the FunkLoad buildout more generally usable to the wider Plone ecosystem. One of the first things I did once I had a buildout that could run read-only load test benchmarks was to install and enable CacheFu without a cache proxy. Then I ran the benchmarks again and had FunkLoad plot some pretty benchmark diff graphs. Tom started packaging this extension of the buildout as a sample so that add-on maintainers and integrators can see how to do the same for other add-ons. They can then easily compare how their add-on affects base Plone performance using FunkLoad benchmark diffs. FunkLoad rocks! Meanwhile, I began work on making the FunkLoad script invocations simpler and more familiar to those of us in the zope.testing world.
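The before/after workflow above can be sketched with FunkLoad's command-line tools. The test module, class and method names, and report directory names below are illustrative placeholders, not the sprint's actual files; `fl-run-bench` and `fl-build-report` are FunkLoad's own CLIs:

```shell
# Baseline bench run against the stock Plone site; results land in an
# XML file named per the test's .conf (test_anonymous.py is a placeholder).
fl-run-bench test_anonymous.py Anonymous.test_browse
fl-build-report --html anonymous-bench.xml

# ...install and enable CacheFu, restart the instance, then re-run...
fl-run-bench test_anonymous.py Anonymous.test_browse
fl-build-report --html anonymous-bench.xml

# Diff the two generated report directories (their real names are
# timestamped; these are stand-ins) to get the comparison graphs.
fl-build-report --diff test_browse-20090107 test_browse-20090108
```

The diff report is what makes the "how does this add-on affect base Plone performance?" question easy to answer at a glance.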

One goal here is to make load test benchmarking more accessible in general. Ideally, an integrator savvy enough to work with buildout can use the collective.loadtesting buildout, or extend it, and record a new FunkLoad test using the recorder proxy. Then they can post the resulting test module and configuration with their problem report or question. Part of me shudders at the thought of encouraging broader access to benchmarking, especially since it's so easy to create unrepresentative benchmarks. I think, however, that drawing back the curtains on Plone performance to expose both the positive and the negative, even if messy, is best in the end.
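The recording step might look roughly like this with FunkLoad's `fl-record` tool; the scenario name is made up, and the exact generated file names and default proxy port are from my recollection of FunkLoad's docs, so treat them as assumptions:

```shell
# Start the recorder proxy (it listens on localhost:8090 by default),
# then point the browser's HTTP proxy there and click through the
# scenario being reported; stop the recorder with Ctrl-C when done.
fl-record my_scenario
# fl-record then writes a test module and matching configuration
# (something like test_MyScenario.py and MyScenario.conf) that can be
# edited and attached to a problem report or question.
```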

Meanwhile, Andrew's write-heavy load tests reproduced the write-concurrency ZODB conflict bug that has recently been discussed on the lists. This is actually my big yin, so I'm totally stoked to see some light being shed on it. The test scenario registers a new user, logs them in, goes to their member folder, adds a folder, adds a page with lipsum field values to the new folder, and logs out. The problems began to show themselves pretty heavily at about 5 concurrent users hitting one instance. After brainstorming with Lawrence, Andrew began generating load test diffs while experimenting with changes to try to isolate the write-concurrency bug. First, Andrew looked into whether the response was being rendered before hitting a conflict error, thus being forced to render again on retry; the idea was that this could extend the duration of the transaction long enough to significantly increase conflicts. Archetypes, however, already does a redirect after a successful edit. We have many more ideas to test out, and now we have real measurements. The day ended with Andrew factoring out the member registration part of the test scenario to try to isolate the problem further.
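The scenario above can be sketched as a FunkLoad test case along these lines. This is a rough illustration, not the sprint's actual code: the class name, form URLs, and field names are guesses at a stock Plone site's forms, and the `Lipsum` helper methods are used from memory of FunkLoad's API:

```python
from funkload.FunkLoadTestCase import FunkLoadTestCase
from funkload.Lipsum import Lipsum


class WriteHeavy(FunkLoadTestCase):
    """Register a member, create content in their folder, log out."""

    def setUp(self):
        # The target site URL comes from the .conf file next to this module.
        self.server_url = self.conf_get('main', 'url')
        self.lipsum = Lipsum()

    def test_write_heavy(self):
        server = self.server_url
        user = 'load_%s' % self.lipsum.getUniqWord()
        # Register a new member and log them in (form paths and field
        # names below are assumptions about a default Plone site).
        self.post('%s/join_form' % server,
                  params=[('username', user), ('password', 'secret')],
                  description='Register a new member')
        self.post('%s/login_form' % server,
                  params=[('__ac_name', user), ('__ac_password', 'secret')],
                  description='Log in')
        # Visit the member folder, add a folder, then add a page with
        # lipsum field values inside it.
        self.get('%s/Members/%s' % (server, user),
                 description='Visit the member folder')
        self.post('%s/Members/%s/createObject?type_name=Folder'
                  % (server, user),
                  description='Add a folder')
        self.post('%s/Members/%s/createObject?type_name=Document'
                  % (server, user),
                  params=[('title', self.lipsum.getSubject()),
                          ('text', self.lipsum.getParagraph())],
                  description='Add a page with lipsum values')
        self.get('%s/logout' % server, description='Log out')
```

Run under `fl-run-bench` with rising concurrency cycles, a scenario like this is what surfaced the conflict errors at around 5 concurrent users.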

Today Tom and I will likely focus on further polishing and documenting the buildout for release to the community. We'll probably also work on the buildbot configuration that Andreas provided, packaging it to run against Plone core development on a regular basis. It would be great to have a set of pages showing the performance diff for the last day of changes, the last week, the last month, and so on.