Google App Engine


I just tried converting our Sword of Fargoal news server to Google App Engine because I had heard good things about it. And indeed, using its Python framework is quite nice. Instead of a database they provide a data storage facility which is similar to a database, just using GQL instead of SQL. Guess what the G stands for. Basically the code for our server remains identical, which is great. Usually when trying a framework like this I have to use lots of useless classes and XML stuff which is supposedly easier than just doing things by hand. With Google App Engine apparently not so: there was not a single XML file to edit. Big plus points for that.

Also, deployment is extremely simple. Under Windows you get an application to start up, with big New and Browse buttons. Simply click New to create an example web app and then Browse to view it in your browser. After that you can just modify the web app and refresh in the browser, and it will automatically use the changed version. There's also a Deploy button. When I pressed it, the application identifier was enough to tell Google to deploy it to the right location. It only asked for my Google user name and password. When I got home later, under Linux, I could simply download the complete web app again and update it just as easily (with a command line tool). So again a big plus point for Google: web app frameworks I used in the past required anywhere from several hours to several days just to get a stable dev environment set up.

Now, what about performance? In my system, to request news all the client sends is this:
X
That is, just my application data (so far consisting of only the version). There's no need to send anything else. On the server side, the server listens on a socket, receives the client data, and puts it into a database along with the IP and date. Then it returns the news:
Y
It's pretty minimal. And it's all that's needed. We use almost no bandwidth and almost no CPU. Now, what happens with the App Engine version? First, I need to create an HTTP GET request. That's much better than if it were a SOAP request with several KB worth of XML headers, but still:
GET /news?version=X HTTP/1.0
Host: newsserver.fargoal.com
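To see how many bytes such a request actually puts on the wire, it can be assembled by hand. This is only a rough sketch: `build_request` is a hypothetical helper, and `1.0.0-beta` is a made-up 10-byte version string standing in for X.

```python
def build_request(version: str, host: str = "newsserver.fargoal.com") -> bytes:
    """Assemble the raw HTTP/1.0 GET request for a given version string."""
    lines = [
        f"GET /news?version={version} HTTP/1.0",
        f"Host: {host}",
        "",  # the empty line that ends the header block
        "",  # joining adds the final CRLF after the empty line
    ]
    # HTTP line endings are CRLF, so every newline costs 2 bytes.
    return "\r\n".join(lines).encode("ascii")


version = "1.0.0-beta"  # made-up placeholder, exactly 10 bytes long
request = build_request(version)
overhead = len(request) - len(version)  # bytes added purely by HTTP
print(len(version), len(request), overhead)
```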
If we assume that X has 10 bytes, then instead of 10 bytes we now send 71 bytes (37 for the request line, 28 for the Host line, and 2 for each of the three line endings, since there's an empty line at the end). Now for the reply:
HTTP/1.0 200 OK
Content-Type: text/html; charset=utf-8
Cache-Control: no-cache
Expires: Fri, 01 Jan 1990 00:00:00 GMT
Date: Tue, 25 Jan 2011 18:50:54 GMT
Server: Google Frontend

Hello from Google
Another News line
And even more news
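Splitting a reply like the one above into headers and data takes only a few lines on the client. The following is a minimal sketch, assuming the header lines end in CRLF and the news text uses single-byte newlines; `split_reply` is a made-up name, not part of any library.

```python
from typing import Tuple

# The reply shown above, written out byte for byte (line endings are assumed).
RAW_REPLY = (
    b"HTTP/1.0 200 OK\r\n"
    b"Content-Type: text/html; charset=utf-8\r\n"
    b"Cache-Control: no-cache\r\n"
    b"Expires: Fri, 01 Jan 1990 00:00:00 GMT\r\n"
    b"Date: Tue, 25 Jan 2011 18:50:54 GMT\r\n"
    b"Server: Google Frontend\r\n"
    b"\r\n"
    b"Hello from Google\nAnother News line\nAnd even more news\n"
)


def split_reply(raw: bytes) -> Tuple[bytes, bytes]:
    """Return (headers, body), splitting at the first empty line."""
    headers, separator, body = raw.partition(b"\r\n\r\n")
    if not separator:
        raise ValueError("no empty line found; the reply is incomplete")
    return headers, body


headers, body = split_reply(RAW_REPLY)
news_lines = body.decode("utf-8").splitlines()
print(len(RAW_REPLY), len(body), news_lines)
```

Everything before the empty line is simply thrown away here; a stricter client would at least check the status line first.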
Well, we receive about 240 bytes. The Y from before is just the news text, which is 55 bytes. So we send roughly 6 to 7 times as much data as before and receive more than 4 times as much (about 436%). Now, considering that news fetching is a very low-traffic operation and the byte counts are ridiculously small either way, it doesn't matter too much. We simply get the HTTP overhead added to all our data.

Now, what about CPU? The important part for us is the client. It now has to parse everything Google sends back. But with HTTP/1.0 that's not too hard: just ignore everything until there's an empty line and treat the rest as data.

On the server side, my standalone server simply listens on a socket for incoming connections. When one comes in, the news is sent back and the IP is logged to the database. With Google App Engine, the web server sees the request, then decides which application to route it to, creates an application instance and executes the application. The application writes out the news (just HTTP encoded this time) and adds the IP to the Google storage. In principle that's not too much more work; how much slower it is depends on how well all those steps can be optimized. Still, it's now running on Google's hardware, so it shouldn't matter too much. How well it performs we will only know once it's out there and we see whether it takes forever for news to appear. In theory it could perform just as well as now, as long as the available bandwidth and CPU are never maxed out, even if the usage of both is quite a bit higher than now.
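For comparison, a standalone server of the kind described above might look roughly like this. It is only an illustration, not the actual Fargoal news server: `sqlite3` stands in for whatever database is really used, and the table layout, the `NEWS` placeholder and the names `log_request`, `NewsHandler` and `make_server` are all assumptions.

```python
import socketserver
import sqlite3
from datetime import datetime, timezone

# Placeholder news text; a real server would load the current news instead.
NEWS = b"Hello from Google\nAnother News line\nAnd even more news\n"


def log_request(db_path: str, ip: str, version: str) -> None:
    """Store the client's version together with its IP and the current date."""
    conn = sqlite3.connect(db_path)
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "CREATE TABLE IF NOT EXISTS requests (ip TEXT, version TEXT, date TEXT)"
            )
            conn.execute(
                "INSERT INTO requests VALUES (?, ?, ?)",
                (ip, version, datetime.now(timezone.utc).isoformat()),
            )
    finally:
        conn.close()


class NewsHandler(socketserver.BaseRequestHandler):
    def handle(self) -> None:
        # The client sends nothing but its version string (assumed to be short).
        version = self.request.recv(64).decode("utf-8", "replace").strip()
        log_request(self.server.db_path, self.client_address[0], version)
        # Reply with the bare news text; closing the socket ends the reply.
        self.request.sendall(NEWS)


def make_server(host: str = "127.0.0.1", port: int = 0,
                db_path: str = "news.db") -> socketserver.TCPServer:
    """Create the server; port 0 lets the OS pick a free port."""
    server = socketserver.TCPServer((host, port), NewsHandler)
    server.db_path = db_path  # made available to the handler
    return server

# To run it: make_server(port=8000).serve_forever()
```

There is no request parsing and no header generation at all, which is where the difference in bytes and CPU to the HTTP version comes from.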