
Cloud Busting

by Ross Patterson last modified Oct 23, 2008 07:58 AM
What does it mean to be a good cloud citizen?

I know I'm not alone, but I've been thinking a lot recently about what the cloud means for rich web applications, CMSs such as Plone in particular.

Firstly, I find Google App Engine (GAE) much more inspiring than Amazon's Elastic Compute Cloud (EC2) when it comes to thinking about cloud computing.  Kinda funny considering that EC2 actually has "cloud" in its name while GAE doesn't.

It seems to me that EC2 isn't really a cloud.  For my money, EC2 is just a really big virtual hosting provider that has done a great job of limiting hosting costs by bringing efficiency to client relationships and expectations.  They do this by simultaneously limiting their responsibility and clearly defining their remaining responsibility in accessible APIs.  They exploit a very interesting and relatively new technology sweet spot to do this, virtualization.  This has the potential to become the *foundation* for cloud computing, but is not cloud computing in itself.  The strategy seems to be to provide the foundation, open up the technology race, and wait and see who succeeds in building good clouds on that foundation.  I kinda like this strategy, but I do see some problems with it.  Of course, mostly, I just don't like thinking about hardware, virtualized or no.  :)

For one, computing in the cloud is definitely going to require changes to the code that operates in the cloud.  EC2 enables application builders to construct large distributed web applications without ever making their code into good cloud citizens.  That may not be a good thing.  Similarly, various parties are building true cloud frameworks on top of EC2.  It may be that an inferior cloud framework places fewer requirements on the hosted code, thus gaining more adopters, thus starving better cloud frameworks.

GAE seems to come at the cloud from the other direction.  Rather than providing all options on something that isn't yet a cloud as EC2 does, GAE provides a true cloud but with very limited options.  GAE too exploits a technology sweet spot, but a very mature and well established one, Python.  The strategy there would seem to be to open up the technology race in the other direction, inviting framework/application developers and GAE to come together over time.  That is, framework/application developers will make their code better cloud citizens, guided by the restrictions of a true cloud, and GAE will become richer and less restrictive as Google sees where to put its resources in terms of extensions or flexibility.  I've heard GAE called a platform for toy applications.  It seems to me it's not so much a toy platform as a very serious platform in its adolescence.

Now, what about large, feature-rich applications like CMSs such as Plone?  Even the ideal large, feature-rich application still has a lot of code, the code that implements and integrates all those features.  As such, I'm going to set aside the question of whether or not a given large, feature-rich application is *too* large, just for the moment.  Such waste definitely becomes even more painful in the cloud, and good cloud citizens should continually evaluate their code waste all the more vigilantly, but it's just not what I want to consider right now.

I think specialization is going to be one of the most important aspects of cloud computing.  In order for a cloud to be flexible and transparent while simultaneously achieving responsiveness and efficiency, applications are going to have to cooperate with the cloud so that requests can be handled by portions of the cloud that are already more or less familiar with handling those requests.  Good cloud citizens will have to factor themselves in such a way as to support this.  Once code running on the cloud has factored itself into specializable requests, the cloud can be left to do the specialization as it sees fit.  The cloud provider is the party most interested in squeezing the most possible efficiency out of the cloud, so the implementation of specialization should really be left to them.  Of course, once again, mostly I just don't want to think about specialization, too close to the dreaded hardware.  :)

AJAX to the rescue!  It seems like AJAX first burst onto the scene to improve client-side performance.  If you replaced a portion of the DOM with the results of an XMLHttpRequest, you could avoid the page-reload time cost incurred by the *browser* and the user-experience disjoint incurred when the browser renders a new page.  Using AJAX to compose a page from multiple requests is, however, potentially *more* valuable as a way to specialize things on the *server* side.  The win here is not so much the AJAX on the client side as the capacity to break a page up into multiple specializable requests for the server side.
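A minimal sketch of that idea, with entirely hypothetical names (`FRAGMENTS`, the `/fragment/` URL scheme), not any real Plone API: the main page ships only placeholders, and each fragment is a separately addressable request that the cloud could route to a specialized node.

```python
# Hypothetical fragment registry: each entry could be served (and
# cached, and specialized) independently of the others.
FRAGMENTS = {
    "navigation": lambda: "<ul><li>Home</li></ul>",
    "news": lambda: "<ol><li>Latest post</li></ol>",
}

def render_page():
    # The main page contains only placeholders; the client fills them
    # in with one XMLHttpRequest per fragment.
    slots = "\n".join(
        '<div id="%s" data-src="/fragment/%s"></div>' % (name, name)
        for name in sorted(FRAGMENTS)
    )
    return "<html><body>%s</body></html>" % slots

def render_fragment(name):
    # Each fragment is its own request, routable on its own.
    return FRAGMENTS[name]()
```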

Specialization becomes most important with large, feature-rich applications like CMSs.  Without specialization, the application eventually requires too much in the way of resources to handle any given request and becomes a very bad cloud citizen.

Effective specialization also requires that applications be frugal about how much of their code needs to be loaded to serve a given request.  This cuts down on memory consumption but, perhaps more importantly, it cuts down on startup time when a portion of the cloud that is not yet specialized is called upon to serve a new request.  It seems like any large application worth its salt uses some sort of modular registry (plugins, components, whatever).  I think it's vital that any such registry that hopes to be a good cloud citizen shouldn't load the code behind any given lookup until the lookup first occurs.
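Such a registry can be sketched in a few lines: register dotted names as strings, and import only on first lookup.  This is an illustration of the principle, not any existing Plone or Zope registry API.

```python
import importlib

class LazyRegistry:
    """A plugin registry that stores dotted names and defers the
    import behind each entry until the first lookup (a sketch)."""

    def __init__(self):
        self._paths = {}   # name -> "package.module:attr"
        self._loaded = {}  # name -> resolved object

    def register(self, name, dotted_path):
        # Registration records only a string; nothing is imported yet.
        self._paths[name] = dotted_path

    def lookup(self, name):
        # The module behind the entry is imported on first use only.
        if name not in self._loaded:
            module_name, attr = self._paths[name].split(":")
            module = importlib.import_module(module_name)
            self._loaded[name] = getattr(module, attr)
        return self._loaded[name]
```

Until `lookup` is first called for a name, the registered module contributes nothing to the process's startup time or memory.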

So let's talk about Plone.  At the moment, Plone has a lot of work to do before it becomes a good cloud citizen, but it seems like it will be possible to get there.  I will be blogging about this in more detail in the near future, but here's a sketch.

I wrote a small test that ran against an *existing* Plone trunk site and simply loaded the front page in a testbrowser.  I confirmed that zope.testing coverage reporting does not report on modules that aren't imported as a part of the test run.  Then I ran that test with a coverage report to get a sense of how much of the code that is imported for Plone startup is actually used to serve that page. Finally, I removed the coverage numbers for all Python standard modules and any third party dependencies in order to get a sense of the import waste in the Zope/CMF/Plone stack:
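The shape of that measurement can be sketched with the stdlib `trace` module standing in for zope.testing's coverage support, and a stub function standing in for loading the front page in a testbrowser; the numbers below came from the real run against a live Plone site.

```python
import trace

def serve_front_page():
    # Stand-in for loading the Plone front page in a testbrowser.
    parts = ["<html>", "front page", "</html>"]
    return "".join(parts)

# count=True records per-line execution counts; trace=False keeps the
# run quiet.
tracer = trace.Trace(count=True, trace=False)
html = tracer.runfunc(serve_front_page)

# counts maps (filename, lineno) -> hit count for lines that actually
# ran; comparing its size against the total lines imported at startup
# gives the "import waste" figure discussed below.
executed_lines = len(tracer.results().counts)
```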

lines      cov%     lines uncovered   lines covered
167,752    40.00%   100,643.21        67,108.79

Firstly, 67k lines of code to serve up the front page seems like too much, but we're aware of that.  The limbo between the more monolithic Zope 2 and the much more elegant successor Zope 3 is probably a significant part of that.  I'm going to do some more detailed examination of the coverage report, but here is the CSV of the simple coverage numbers for those interested.

The thing I find most interesting is how much of the imported code isn't used.  I've been thinking about this for a while.  While some of this is certainly import waste from unused portions of the more monolithic Zope 2 and Archetypes stacks, I think much of it may also come from ZCML and the Zope Component Architecture.

The ZCA gives us a wonderful registry for a very modular architecture, but using the ZCA requires importing all the code for everything registered.  Some significant advantage might be had from a ZCA registry that didn't import the code until the first time a given lookup is performed.
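One way to get that effect without changing the registry itself would be to register a deferred factory in place of the real one.  A sketch, assuming the component can be named by a dotted path; `deferred_factory` and the example path are made up, and this is not how the ZCA actually behaves today:

```python
import importlib

def deferred_factory(dotted_path):
    """Return a factory that imports the real component class only
    the first time it is called (hypothetical, not real ZCA)."""
    state = {}

    def factory(*args, **kw):
        if "cls" not in state:
            # The import happens here, on first lookup, not at
            # registration time.
            module_name, attr = dotted_path.split(":")
            module = importlib.import_module(module_name)
            state["cls"] = getattr(module, attr)
        return state["cls"](*args, **kw)

    return factory

# Registering deferred_factory("myapp.adapters:FooAdapter") -- a made-up
# path -- would keep myapp.adapters un-imported until the first lookup
# that actually needs it.
```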

There is an interesting exception, local site managers.  Since local site managers store their registry in pickles, it may be that the code behind the component isn't loaded until the lookup unpickles the registration.  I haven't confirmed this, but even if this isn't the case and the whole registry is unpickled at once, it might not be too hard to make each registration its own pickle.  Marius mentioned an effort to reduce ZCML load time by pickling the configuration actions.  Maybe it's possible to direct the ZCML actions at a local site manager instead, which could then be persisted and used as the global site manager.  Then loading the ZCML at startup would be unnecessary.  In essence, this would mean *compiling* ZCML into a site manager pickle before deploying the application to the cloud.  This notion of compiling configuration for deployment, brushing shoulders with pickles, has inspired some other interesting notions I hope to blog more about soon.
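The pickle behavior this leans on can be demonstrated directly: classes pickle *by reference* (module path plus name), so unpickling a registration is what triggers the import of the module behind it.  A self-contained sketch, writing a throwaway module to a temp directory so the import really is fresh:

```python
import os
import pickle
import sys
import tempfile
import textwrap

# Create a module on disk that hasn't been imported yet.
tmp = tempfile.mkdtemp()
with open(os.path.join(tmp, "lazy_component.py"), "w") as f:
    f.write(textwrap.dedent("""
        class Widget:
            pass
    """))
sys.path.insert(0, tmp)

import lazy_component

# A class pickles as a reference (module path + name), not as code.
registration = pickle.dumps(lazy_component.Widget)

# Forget the module, as a freshly started process would have it.
del sys.modules["lazy_component"]

# Unpickling the one registration re-imports only that module -- which
# is why one-pickle-per-registration would keep startup imports minimal.
widget_class = pickle.loads(registration)
```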

As for AJAX, viewlets seem like an ideal framework to exploit for AJAX-composed request specialization.  It might be feasible to replace viewlet renderers with ones that insert JavaScript into the containing page which will load the viewlets in separate, specializable requests.  Someone's already tried it.
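The replacement renderer amounts to very little code.  A hedged sketch of the idea only: `AjaxViewlet` and the `/@@viewlet/` URL scheme are invented for illustration and are not a real Plone viewlet API.

```python
class AjaxViewlet:
    """Hypothetical viewlet whose render emits a stub instead of the
    real content; client-side script fills it in via its own request."""

    def __init__(self, name):
        self.name = name

    def render(self):
        # Instead of rendering the viewlet inline, emit a placeholder
        # the browser resolves with a separate, specializable request.
        return (
            '<div class="viewlet-stub" data-url="/@@viewlet/%s"></div>'
            % self.name
        )
```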

As for Plone on GAE, there are other, possibly larger hurdles, not the least of which is Zope's use of C extensions.  I know work has already been done to lessen dependency on Acquisition wrappers, but I don't know how close that brings us to eliminating the C extensions there.  As for persistent, I don't know whether what is done in persistent.cPersistence can be done in pure Python, but I have heard noises that the effort to port ZODB to Jython may yield fruit here.  It also seems like it might be possible to use some of the ORM libraries out there to implement a Python-only version of BTrees that uses tables on the backend.  At any rate, all of this really is pie in the sky at this point, but I have a 10% Pie in the Sky Manifesto, and an awful lot of good things have started with such thinking.  I'll be blogging on this more later.

At this point I'm more interested in what computing in the cloud means in the more general senses detailed above but I'd love to hear any thoughts on any of this.
