What’s been happening with SuRF

There was a SuRF 1.1.0 release on January 20, 2010, and it brought the following:

  • Class mapping based on rdf:type of resources and “single” descriptor. I’ve already blogged about it here: Extending SuRF resource classes.
  • MIN, MAX, AVG functions and UNION groups in the query builder. When the querying capabilities of SuRF resources are not enough, you’ll sometimes resort to writing SPARQL/SPARUL queries by hand and interpreting their results yourself. SuRF can help a bit here: instantiate a Query object, call its methods to add things like FROM and WHERE clauses, and SuRF will translate it to its string representation upon execution. Building queries this way can result in cleaner code than lots of string concatenation. The query builder doesn’t yet support all SPARQL/SPARUL syntax features; we’re extending it as we go along. Starting from v1.1.0 it can build queries containing aggregate functions, and the SVN version can do unions:
    import surf
    from surf.query import a, select
    from surf.query.translator.sparql import SparqlTranslator
    
    # get session here...
    
    query = select("min(?price)", "max(?price)")
    query.union(("?s", surf.ns.SURF.price, "?price"),
                ("?s", surf.ns.SURF.discount_price, "?price"))
    
    # Now either execute the query
    result = session.default_store.execute(query)
    
    # ... or translate it to string and look at it:
    print(SparqlTranslator(query).translate())
    # prints: SELECT  min(?price) max(?price)  WHERE { {  ?s <http://code.google.com/p/surfrdf/price> ?price  } UNION {  ?s <http://code.google.com/p/surfrdf/discount_price> ?price  } }
    
  • get_by() accepts resource instances as values. To demonstrate:
    Person = session.get_class(surf.ns.FOAF.Person)
    mary = Person("http://mary.example.com/me")
    
    # Using URIRef as argument value--
    persons_who_know_mary = Person.get_by(foaf_knows = mary.subject)
    
    # And now, use resource 'mary' in get_by() directly--
    persons_who_know_mary = Person.get_by(foaf_knows = mary)
    
    # BTW, a similar effect can also be achieved with
    # inverse attributes:
    persons_who_know_mary = mary.is_foaf_knows_of
    
    
  • ResourceValue supports the “in” keyword. ResourceValue is the class that represents resource attributes. It tries to mimic Python lists, and it got better at this by supporting the “if john in mary.foaf_knows: ...” syntax.
  • Fixed multiple bugs in the Sesame2 plugin. Due to a lack of manpower in SuRF development, there are a few dusty corners in the codebase, like the Sesame2 plugin. Well, it got a bit better!
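
The “in” support mentioned in the list above comes down to Python’s containment protocol. Here is a minimal, self-contained sketch of the idea. It is not SuRF’s actual implementation: the Resource and ValueList classes are hypothetical stand-ins, and comparing values by subject URI is an assumption made for illustration.

```python
class Resource(object):
    """Hypothetical stand-in for a SuRF resource; it only carries a subject URI."""

    def __init__(self, subject):
        self.subject = subject


class ValueList(list):
    """Hypothetical list-like attribute value, loosely modelled on ResourceValue."""

    @staticmethod
    def _key(value):
        # Assumption: a resource and its bare subject URI count as the same value.
        return getattr(value, "subject", value)

    def __contains__(self, item):
        # Called by Python for "item in value_list".
        wanted = self._key(item)
        return any(self._key(value) == wanted for value in self)


john = Resource("http://john.example.com/me")
mary_knows = ValueList([john.subject])  # stored as a plain URI string

if john in mary_knows:
    print("mary knows john")
```

With a plain list, the check would fail here because the list holds a URI string rather than the resource object; overriding __contains__ is what lets both forms match.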

And here are more recent developments in SVN trunk, not yet available in a released version:

  • HTTP 1.1 keep-alive support in the sparql_protocol plugin. The keep-alive feature lets you reuse a single connection for several requests. This is especially important on Windows systems when making many requests in a short period of time. Without keep-alive, the client creates a connection for each request, and each connection occupies a port on the client machine for 120 seconds or so. Due to the default port configuration on Windows, after a few thousand requests the system runs out of free ports and requests start to fail. Keep-alive solves this. To use it, you’ll need to upgrade SPARQLWrapper to version 1.4.1 (released today!) and supply the argument use_keepalive = True when creating the SuRF store.
  • Fixed a memory leak when eager-loading resources (using the .full() modifier). All eager-loaded resources were incorrectly marked as dirty/unsaved and added to the pool of dirty resources. In intensive or long-running processes this pool slowly grew, memory consumption grew with it, and all SuRF operations gradually became slower because Python had to manipulate a monster-sized set of dirty resources.
  • Significantly increased the efficiency of updates/deletes in the sparql_protocol plugin:
    • store.update(), store.save() and store.remove() methods now accept multiple resources, and in the case of sparql_protocol, these updates are performed in one or two queries. Previously, a separate SPARUL query would be issued for each resource.
    • The OpenLink Virtuoso SPARQL endpoint supports multiple SPARUL queries in one request. SuRF can be instructed to use this feature by initializing the store with the “combine_queries = True” parameter.
    • Queries that delete resources now specify the graph in the WHERE clause. This greatly speeds up deletes on OpenLink Virtuoso with default indexes.
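
To make the keep-alive item in the list above more concrete, here is a small sketch that uses only the Python 3 standard library. It does not touch SuRF or SPARQLWrapper; the distinct_client_ports helper and the throwaway local server are hypothetical and exist only for this illustration. The function counts how many client-side ports a batch of requests consumes, with and without connection reuse.

```python
import http.client
import threading
from http.server import BaseHTTPRequestHandler, HTTPServer


def distinct_client_ports(reuse_connection, requests=5):
    """Send GET requests to a local test server and count the client ports used."""
    ports = set()

    class Handler(BaseHTTPRequestHandler):
        protocol_version = "HTTP/1.1"  # allows persistent (keep-alive) connections

        def do_GET(self):
            ports.add(self.client_address[1])  # remember the client's source port
            body = b"ok"
            self.send_response(200)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, format, *args):
            pass  # keep the example quiet

    server = HTTPServer(("127.0.0.1", 0), Handler)  # port 0: let the OS pick one
    thread = threading.Thread(target=server.serve_forever)
    thread.daemon = True
    thread.start()
    host, port = server.server_address
    conn = None
    try:
        for _ in range(requests):
            if conn is None:
                conn = http.client.HTTPConnection(host, port)
            conn.request("GET", "/")
            conn.getresponse().read()  # drain the body before the next request
            if not reuse_connection:
                conn.close()
                conn = None
    finally:
        if conn is not None:
            conn.close()  # lets the server-side handler finish
        server.shutdown()
        server.server_close()
    return len(ports)


# With reuse, all requests travel over one connection, hence one client port.
print(distinct_client_ports(reuse_connection=True))
# Without reuse, each request opens a new connection from a fresh port.
print(distinct_client_ports(reuse_connection=False))
```

Every port counted in the second case is one that the operating system may keep reserved for a while after the connection closes, which is how a burst of a few thousand requests can exhaust the available range.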

Locating and eliminating performance bottlenecks is fun! More fixes and improvements to come!

3 thoughts on “What’s been happening with SuRF”

  1. Just a little note: I think the release date was in 2010?
    I don’t mean to be picky, but an article talking about current development from one year ago might cause people not to use the library…

    You don’t need to keep this comment!

    bye
