This post demonstrates how to get SuRF running on Google App Engine.
I’m quite fond of App Engine. It’s easy to learn: with some Python web programming background you can pick up the basics in one evening. It does a good job of abstracting away many of the complexities associated with building and running web apps. It scales if used properly. And it’s free!
There are some drawbacks too. The App Engine environment is pretty locked down: you don’t get shell access or the ability to use binary libraries apart from those provided by Google.
So how do you work with RDF data on App Engine? Can you use SuRF on it? My preliminary experiments show that yes, you can! Uncompressed Python libraries can simply be included in the application. The only notable obstacle is the binary bit in RDFLib 2.4.x. Luckily, RDFLib 2.5 is pure Python and I got it working with SuRF.
Another thing to keep in mind is that App Engine has a file count limit of 3000. If you use several libraries that pack lots of files, you can realistically hit that limit. And it’s not cool to shuffle around hundreds or even thousands of files. It would be nice if App Engine supported .egg files (ZIP-compressed libraries) or something similar. And yes, it does have something similar: there is no support for egg files as such, but you can use the zipimport module to import modules from ZIP archives.
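To make the zipimport idea concrete, here is a minimal sketch, not tied to App Engine or SuRF. It writes a tiny one-module archive and imports from it just by putting the archive path on sys.path. The names `demo_lib.zip`, `demo_zipped_module`, `make_zip_library` and `import_from_zip` are made up for the illustration, and the code is written for a current Python rather than the old App Engine runtime.

```python
import os
import sys
import tempfile
import zipfile


def make_zip_library(zip_path, module_name, source):
    """Write a one-module ZIP archive (hypothetical helper for this demo)."""
    with zipfile.ZipFile(zip_path, "w") as archive:
        archive.writestr(module_name + ".py", source)


def import_from_zip(zip_path, module_name):
    """Put the archive on sys.path and import a module from it.

    Python's import machinery notices that the sys.path entry is a ZIP
    file and uses zipimport behind the scenes.
    """
    if zip_path not in sys.path:
        sys.path.insert(0, zip_path)
    return __import__(module_name)


# Build a throwaway archive in a temporary directory and import from it.
workdir = tempfile.mkdtemp()
zip_path = os.path.join(workdir, "demo_lib.zip")
make_zip_library(zip_path, "demo_zipped_module", "GREETING = 'hello from a zip'\n")
demo = import_from_zip(zip_path, "demo_zipped_module")
```

After this runs, `demo.GREETING` holds the string defined inside the archive and `demo.__file__` points into `demo_lib.zip`, so one uploaded file can stand in for a whole library’s worth of source files.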
So I decided that I wanted to have SuRF with the rdflib and sparql_protocol plugins in my App Engine-powered app. Here’s the dependency hierarchy:
- SuRF (SVN version)
- sparql_protocol plugin for SuRF (SVN version)
- rdflib plugin for SuRF (SVN version)
Stripping extra whitespace in source files
There was one minor inconvenience with zipimport: it seems that it doesn’t like source files that contain trailing whitespace at the end of lines. A quick Google search turned up a shell one-liner that fixes that issue. I had to run it on SuRF and rdflib; the other packages apparently didn’t have any trailing whitespace.
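The same cleanup can also be sketched in Python. This is not the shell one-liner mentioned above, just a minimal stand-in that assumes the goal is to remove spaces and tabs from the end of every line in every .py file under a folder. The function names are made up for the illustration. Note that it works on raw bytes, so it doesn’t care about file encodings, and that it also turns Windows line endings into plain newlines.

```python
import os


def strip_trailing_whitespace(data):
    """Return the given bytes with trailing spaces, tabs and CRs removed per line."""
    lines = data.split(b"\n")
    return b"\n".join(line.rstrip(b" \t\r") for line in lines)


def clean_tree(root):
    """Rewrite every .py file under root in place; return the paths that changed."""
    changed = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in sorted(filenames):
            if not name.endswith(".py"):
                continue
            path = os.path.join(dirpath, name)
            with open(path, "rb") as handle:
                original = handle.read()
            cleaned = strip_trailing_whitespace(original)
            if cleaned != original:
                with open(path, "wb") as handle:
                    handle.write(cleaned)
                changed.append(path)
    return changed
```

Something like `clean_tree("surf")` would then be run on each unpacked library before zipping it. One caveat: the sketch is blind to Python syntax, so trailing whitespace inside multi-line string literals gets stripped too.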
The decompressing, whitespace stripping and archiving steps add up, and they are not overly enjoyable. However, if there’s enough interest, a SuRF & App Engine “starter kit” could be put together where these things are already integrated.
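For the archiving step, the detail that matters is that the package folder itself sits at the top level of the ZIP, so that an archive on sys.path exposes it as an importable package. A rough sketch of that step, with a made-up helper name and the assumption that compiled .pyc files can be left out:

```python
import os
import zipfile


def zip_package(package_dir, zip_path):
    """Archive package_dir so that its folder name is the top-level ZIP entry.

    Returns the list of archive member names that were written.
    """
    package_dir = os.path.abspath(package_dir)
    parent = os.path.dirname(package_dir)
    names = []
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for dirpath, _dirnames, filenames in os.walk(package_dir):
            for name in sorted(filenames):
                if name.endswith(".pyc"):
                    continue  # assumption: compiled files are not needed
                full_path = os.path.join(dirpath, name)
                # Member names are relative to the parent folder and use
                # forward slashes, e.g. "surf/__init__.py".
                arcname = os.path.relpath(full_path, parent).replace(os.sep, "/")
                archive.write(full_path, arcname)
                names.append(arcname)
    return names
```

With an unpacked `surf` folder in the current directory, `zip_package("surf", "surf.zip")` would produce an archive whose entries start with `surf/`, which is the layout needed for `import surf` to work once `surf.zip` is on sys.path.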
Putting it all together
Here’s how my App Engine project folder looks (
run.sh is just for running
Here’s the contents of the script:
    import sys

    # Put .zip files in sys.path, Python will use zipimport
    # to import these.
    for package in ["simplejson", "rdflib", "SPARQLWrapper", "surf", "surf_rdflib"]:
        sys.path.insert(0, "%s.zip" % package)

    import surf
    import surf.store

    # We don't have setuptools so cannot load plugins the
    # normal way, this is a workaround that manually
    # imports rdflib plugin and injects it into surf.store
    # so the store knows about it.
    from surf_rdflib.reader import ReaderPlugin as RdflibReaderPlugin
    from surf_rdflib.writer import WriterPlugin as RdflibWriterPlugin
    surf.store.__readers__["rdflib"] = RdflibReaderPlugin
    surf.store.__writers__["rdflib"] = RdflibWriterPlugin

    store = surf.Store(reader = "rdflib", writer = "rdflib")
    session = surf.Session(store)
    store.load_triples(source = "http://www.w3.org/People/Berners-Lee/card.rdf")
    store.load_triples(source = "http://monkeyseemonkeydo.lv/foaf.rdf")

    print 'Content-Type: text/plain'
    print ''
    print "Some persons:\n"

    Person = session.get_class(surf.ns.FOAF.Person)
    all_persons = Person.all()
    for person in all_persons:
        print person.foaf_name.first
And here’s the script in action: http://cuu508-test.appspot.com/.
The example above is very basic, of course, and doing more sophisticated things is likely to reveal new problems. Anyway, it’s a start!