Extending SuRF resource classes

One cool feature that SuRF has had for a long time, but that I didn’t know about, is the ability to extend resource classes depending on their rdf:type. By default, all resources in SuRF are instances of surf.Resource, so they share common methods like save() and dynamic attributes in “prefix_predicate” form. What if you wanted some logic that operates on just one kind of resource? The object-oriented way would be to store this logic inside the class. That is, instead of calling a standalone function, something like do_stuff(resource), you would rather call a method, something like resource.do_stuff().

It turns out you can do that with SuRF. Write your own class that implements a do_stuff method (or adds some extra attributes or properties), then put it in session.mapping:

session.mapping[surf.ns.EXAMPLE_NAMESPACE.some_type] = MyClass

From then on, all resources of type example_namespace:some_type will also be instances of MyClass. Here’s a complete example that extends foaf:Person resources with a get_friends_count method:

import surf

class MyPerson(object):
    """ Some custom logic for foaf:Person resources. """
    def get_friends_count(self):
        return len(self.foaf_knows)

session = surf.Session(surf.Store(reader="rdflib", writer="rdflib"))
session.mapping[surf.ns.FOAF.Person] = MyPerson

# Now let's test the mapping
john = session.get_resource("http://example/john", surf.ns.FOAF.Person)

# Is `john` an instance of surf.Resource?
print(isinstance(john, surf.Resource))
# outputs: True

# Is `john` an instance of MyPerson?
print(isinstance(john, MyPerson))
# outputs: True

# Try the custom `get_friends_count` method:
print(john.get_friends_count())
# outputs: 0
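
How might such a mapping work under the hood? The following is only a sketch of the general technique (building a class on the fly from a base class plus a mixin), not SuRF's actual implementation. The names BaseResource, PersonMixin, MiniSession and the knows attribute are all made up for illustration, and the sketch needs nothing but the standard library:

```python
class BaseResource(object):
    """Hypothetical stand-in for surf.Resource."""
    def __init__(self, subject):
        self.subject = subject
        self.knows = []  # stand-in for the foaf_knows attribute


class PersonMixin(object):
    """Custom logic, playing the same role as MyPerson above."""
    def get_friends_count(self):
        return len(self.knows)


class MiniSession(object):
    """Hypothetical session that maps type names to mixin classes."""
    def __init__(self):
        self.mapping = {}

    def get_resource(self, subject, rdf_type):
        bases = (BaseResource,)
        if rdf_type in self.mapping:
            bases += (self.mapping[rdf_type],)
        # Build a new class that inherits from the base and the mixin.
        cls = type("Resource_" + rdf_type.replace(":", "_"), bases, {})
        return cls(subject)


mini_session = MiniSession()
mini_session.mapping["foaf:Person"] = PersonMixin

john = mini_session.get_resource("http://example/john", "foaf:Person")
john.knows.append("http://example/jane")

print(isinstance(john, BaseResource))  # True
print(isinstance(john, PersonMixin))   # True
print(john.get_friends_count())        # 1
```

The point of the design is that the custom class never has to inherit from the resource base class itself; the two are only combined when a resource of the mapped type is requested.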

Attribute aliases with properties and descriptors

Being able to customize SuRF resource classes allows for many nifty things. For example, you can add a short-named property to your class that works as an alias for some longer but frequently used attribute. If you expect the attribute to always have just one value you can encapsulate that in the property code as well:

class MyPerson(object):

    @property
    def name(self):
        return self.foaf_name.first

If you define several such properties, you’ll start to see duplicated code. Following the DRY principle, we can replace this code with descriptors (which is, by the way, the standard way things are done in RDFAlchemy).

First, a fragment from surf/util.py:

# URIRef comes from rdflib; rdf2attr is defined elsewhere in surf/util.py.
class single(object):
    """ Descriptor for easy access to attributes with single value. """
    def __init__(self, attr):
        if isinstance(attr, URIRef):
            attr = rdf2attr(attr, True)
        self.attr = attr

    def __get__(self, obj, type=None):
        return getattr(obj, self.attr).first

    def __set__(self, obj, value):
        setattr(obj, self.attr, value)

    def __delete__(self, obj):
        setattr(obj, self.attr, [])

In addition to read access, this descriptor also supports setting and deleting the value. Here’s how it would be used:

from surf.util import single

class MyPerson(object):
    name = single("foaf_name")

More complex example: implementing rdf:Bag

There was a question on the SuRF mailing list recently: does SuRF support RDF containers such as Bag and Seq? Currently the answer is, unfortunately, no, or at least not very well. But customizable classes can help us here.

Bag and Seq are basically conventions for modelling unordered and ordered lists, respectively, in RDF. Here’s an example of a Bag in RDF/XML representation from W3Schools:

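<?xml version="1.0"?>

<rdf:RDF
    xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"
    xmlns:cd="http://www.recshop.fake/cd#">

  <rdf:Description rdf:about="http://www.recshop.fake/cd/Beatles">
    <cd:artist>
      <rdf:Bag>
        <rdf:li>John</rdf:li>
        <rdf:li>Paul</rdf:li>
        <rdf:li>George</rdf:li>
        <rdf:li>Ringo</rdf:li>
      </rdf:Bag>
    </cd:artist>
  </rdf:Description>

</rdf:RDF>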

Here’s the same data in N-Triples notation:

_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://www.w3.org/1999/02/22-rdf-syntax-ns#Bag> .
_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#_1> "John" .
_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#_2> "Paul" .
_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#_3> "George" .
_:genid1 <http://www.w3.org/1999/02/22-rdf-syntax-ns#_4> "Ringo" .
<http://www.recshop.fake/cd/Beatles> <http://www.recshop.fake/cd#artist> _:genid1 .
<http://dog> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://xmlns.com/foaf/0.1/Person> .

So we see that there is a resource (a blank node in this case) of type rdf:Bag and with predicates rdf:_1, rdf:_2, … containing bag items. Let’s write a class that extends resources of type rdf:Bag and provides an iterator over these predicates:

class MyBag(object):
    def iterator(self):
        # First, load everything that's known about this resource.
        self.load()
        # Now bag items are available in attributes
        # "rdf__1", "rdf__2", ...
        # We can either generate attribute names
        # and use getattr function, or we can generate
        # predicate URIs and look into self.rdf_direct
        # dictionary directly. I chose the latter.
        i = 0
        while True:
            i += 1
            predicate_uri = surf.ns.RDF["_%d" % i]
            if predicate_uri not in self.rdf_direct:
                # Returning ends the generator; raising StopIteration
                # inside a generator is an error since Python 3.7.
                return
            yield self.rdf_direct[predicate_uri]

    def __iter__(self):
        return self.iterator()

session.mapping[surf.ns.RDF.Bag] = MyBag

And here’s the iterator in action:

cd = session.get_resource("http://www.recshop.fake/cd/Beatles", surf.ns.OWL.Thing)
artists = cd.cd_artist.first

for name in artists:
    print(name)
# prints: 
# [rdflib.Literal(u'John')] 
# [rdflib.Literal(u'Paul')] 
# [rdflib.Literal(u'George')] 
# [rdflib.Literal(u'Ringo')] 

This is, of course, just proof-of-concept code, and some essential functionality is missing. For example, it would be nice for bags and sequences to support modification and element access by index, not just iteration. Still, we can see that the ability to customize resource classes provides a nice and non-intrusive way to implement such features.
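
As a rough idea of what index access could look like, here’s a sketch that works on a plain dictionary instead of a real resource. RDF_NS, BagView and the direct dictionary are assumptions made for illustration; direct mimics the shape of rdf_direct (predicate URI mapped to a list of values), with plain strings standing in for rdflib terms:

```python
RDF_NS = "http://www.w3.org/1999/02/22-rdf-syntax-ns#"


class BagView(object):
    """Hypothetical read-only view over rdf:_1, rdf:_2, ... predicates."""
    def __init__(self, direct):
        self.direct = direct

    def _predicate(self, position):
        # Container membership predicates are numbered from 1.
        return "%s_%d" % (RDF_NS, position)

    def __len__(self):
        count = 0
        while self._predicate(count + 1) in self.direct:
            count += 1
        return count

    def __getitem__(self, index):
        if index < 0:
            index += len(self)
        predicate = self._predicate(index + 1)
        if index < 0 or predicate not in self.direct:
            raise IndexError("bag index out of range")
        return self.direct[predicate]


direct = {
    RDF_NS + "_1": ["John"],
    RDF_NS + "_2": ["Paul"],
    RDF_NS + "_3": ["George"],
    RDF_NS + "_4": ["Ringo"],
}
bag = BagView(direct)
print(len(bag))   # 4
print(bag[0])     # ['John']
print(bag[-1])    # ['Ringo']
print(list(bag))  # iteration falls back to __getitem__
```

In a real mapped class the same two methods could read from self.rdf_direct, next to the iterator shown earlier.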

