A follow up, more dict to SqlAlchemy fun

Just a quick follow up to my last post on adding the ability to add some tools to help serialize SqlAlchemy instances. I needed to do the reverse. I want to take the POST'd values from a form submission and tack them onto one of my models. So I now also add a fromdict() method onto base that looks like. [sourcecode lang="python"] def fromdict(self, values): """Merge in items in the values dict into our object if it's one of our columns

""" for c in self.__table__.columns: if c.name in values: setattr(self, c.name, values[c.name])

Base.fromdict = fromdict [/sourcecode]

So in my controllers I can start doing [sourcecode lang="python"] def controller(self): obj = SomeObj() obj.fromdict(request.POST) Session.add(obj) [/sourcecode]

Hacking the SqlAlchemy Base class

I'm not a month into my new job. I've started working for a market research company here locally. Definitely new since I don't know I've ever found myself really reading or talking about 'market research' before. In a way it's a lot like my last job in advertising. You get in and there's a whole world of new terms, processes, etc you need to get your head around. The great thing about my new position is that it's a Python web development gig. I'm finally able to concentrate on learning the ins and outs of the things I've been trying to learn and gain experience with in my spare time.

So hopefully as I figure things out I'll be posting updates to put it down to 'paper' so I can think it through one last time.

So started with some more SqlAlchemy hacking. At my new place we use Pylons, SqlAlchemy (SA), and Mako as the standard web stack. I've started working on my own first 'ground up' project and I've been trying to make SqlAlchemy work and get into the way I like doing things.

So first up, I like the instances of my models to be serializable. I like to have json output available for most of my controllers. We all want pretty ajax functionality right? But the json library can't serialize and SqlAlchemy model by default. And if you just try to iterate over sa_obj.__dict__ it won't work since you've got all kinds of SA specific fields and related objects in there.

So what's a guy to do? Run to the mapper. I've not spent my time I pouring over the details of SA parts and the mapper is something I need to read up on more.

Side Notes: all these examples are from code using declarative base style SA definitions.

The mapper does this great magic of tying a Python object and a SA table definition. So in the end you get a nice combo you do all your work with. In the declarative syntax case you normally have all your models extend the declarative base. So the trick is to add a method of serializing SA objects to the declarative base and boom, magic happens.

The model has a __table__ instance in it that contains the list of columns. Those are the hard columns of data in my table. These are the things I want to pull out into a serialized version of my object.

My first try at this looked something like

[sourcecode lang="python"] def todict(self): d = {} for c in self.__table__.columns: value = getattr(self, c.name) d[c.name] = value

return d [/sourcecode]

This is great and all but I ran into a problem. The first object I ran into had a DateTime column in it that choked since the json library was trying to serialize a DateTime instance. So a quick hack to check if the column was DateTime and if so put it to string got me up and running again.

[sourcecode lang="python"] if isinstance(c.type, sqlalchemy.DateTime): value = getattr(self, c.name).strftime("%Y-%m-%d %H:%M:%S") [/sourcecode]

This was great and all. I attached this to the SA Base class and I was in business. Any model now had a todict() function I could call.

[sourcecode lang="python"] Base = declarative_base(bind=engine) metadata = Base.metadata Base.todict = todict [/sourcecode]

This is great for my needs, but it does miss a few things. This just skips over any relations that are tied to this instance. It's pretty basic. I'll also run into more fields that need to be converted. I figure that whole part will need a refactor in the future.

Finally I got thinking, "You know, I can often do a log.debug(dict(some_obj)) and get a nice output of that object and its properties." I wanted that as well. It seems more pythonic to do

[sourcecode lang="python"] dict(sa_instance_obj) # vs sa_instance_obj.todict() [/sourcecode]

After hitting up my Python reference book I found that the key to being able to cast something to a dict is to have it implement the iterable protocol. To do this you need to implement a __iter__ method that returns something that implements a next() method.

What does this mean? It means my todict() method needs to return something I can iterate over. Then I can just return it from my object. So I turned todict into a generator that returns the list of columns, values needed to iterate through.

[sourcecode lang="python"] def todict(self): def convert_datetime(value): return value.strftime("%Y-%m-%d %H:%M:%S")

d = {} for c in self.__table__.columns: if isinstance(c.type, sa.DateTime): value = convert_datetime(getattr(self, c.name)) else: value = getattr(self, c.name)

yield(c.name, value)

def iterfunc(self): """Returns an iterable that supports .next() so we can do dict(sa_instance)

""" return self.todict()

Base = declarative_base(bind=engine) metadata = Base.metadata Base.todict = todict Base.__iter__ = iterfunc [/sourcecode]

Now in my controllers I can do cool stuff like

[sourcecode lang="python"] @jsonify def controller(self, id): obj = Session.query(something).first()

return dict(obj) [/sourcecode]

Auto Logging to SqlAlchemy and Turbogears 2

I've been playing with Turbogear2 (TG2) for some personal tools that help me get work done. One of the things I've run into is an important missing feature that my work code has that isn't in my TG2 application yet. In my PHP5 app for work, I use the Doctrine ORM and I have post insert, update, delete hooks that will actually go in and log changes to the system. It works great and I can build up a history of an object over time to see who changes which fields and such.

With my TG2 app doing inserts and updates I initiall just faked the log events by manually saving Log() objects from within my controllers as I do the work that needs to be done.

This sucks though since the point is that I don't have to think about things. Anytime code changes something it's logged. So I had to start searching the SqlAlchemy (SA) docs to figure out how to duplicate this in TG2. I wanted something that's pretty invisible. In my PHP5 code I have a custom method I can put onto my Models in case I want to override the default logged messages and such.

I found part of what I'm looking for in the SA MapperExtension. This blog post got me looking in the right direction. The MapperExtension providers a set of methods to hook a function into. The hooks I'm interested in are the 'after_delete', 'after_insert', 'after_update' method. These are passed in the instance of the object and a connection object so I can generate an SQL query to manually save the log entry for the object.

So I have something that looks a little bit like this:

[sourcecode lang="python"] from sqlalchemy.orm.interfaces import MapperExtension

class LogChanges(MapperExtension):

def after_insert(self, mapper, connection, instance): query = "INSERT INTO log ( username, \ type, \ client_ip, \ application ) VALUES( '%s', %d, '%s', '%s')" % ( u"rick", 4, u'', u'my_app')

connection.execute(query) [/sourcecode]

Then I pass that into my declarative model as:

[sourcecode lang="python"] __mapper_args__ = {'extension': LogChanges()} [/sourcecode]

This is very cool and all, but it's not all the way where I want to head. First, the manual SQL logging query kind of sucks. I have an AppLog() model that I just want to pass in some values to to create a log entry. I'm thinking what I really should do is find a different way to do the logging itself. I'm debating between actually doing a separate logging application that I would call with the objects details.

The problem with this is that one of the things I do in my current app is store the old values of the object. This way I can loop through them and see which values actually changed and generate that in the log message. This is pretty darn useful.

The other downside is that I don't have a good way to have a custom logging message generator is I just call a Logging app API.

So I think I might try out the double db connection methods that SA and TG2 support. This way I could actually try to use the second db instance with a Logging() object to write out the changes without messing up the current session/unit of work.

The missing part here is that I'm still not really sure how to get the 'old' object values in order to generate a list of fields that have been changed. Guess I have some more hacking to do.