Hacking the SqlAlchemy Base class

I'm not a month into my new job. I've started working for a market research company here locally. Definitely new territory, since I don't know that I've ever found myself reading or talking about 'market research' before. In a way it's a lot like my last job in advertising: you get in and there's a whole world of new terms, processes, etc. you need to get your head around. The great thing about my new position is that it's a Python web development gig. I'm finally able to concentrate on learning the ins and outs of the things I've been trying to learn and gain experience with in my spare time.

So hopefully as I figure things out I'll be posting updates to put it down to 'paper' so I can think it through one last time.

So I started with some more SqlAlchemy hacking. At my new place we use Pylons, SqlAlchemy (SA), and Mako as the standard web stack. I've started working on my first 'ground up' project and I've been trying to bend SqlAlchemy to the way I like doing things.

So first up, I like the instances of my models to be serializable. I like to have json output available for most of my controllers. We all want pretty ajax functionality, right? But the json library can't serialize a SqlAlchemy model by default. And if you just try to iterate over sa_obj.__dict__ it won't work, since you've got all kinds of SA-specific fields and related objects in there.
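To see what I mean, here's a quick illustration of the core problem, no SA required. The FakeModel class is just a made-up stand-in for a model instance; json.dumps() refuses any object it doesn't know how to encode:

[sourcecode lang="python"]
import json

class FakeModel(object):
    """Hypothetical stand-in for a SA model instance."""
    pass

try:
    json.dumps(FakeModel())
except TypeError as exc:
    # json has no idea how to encode an arbitrary object
    print("json can't handle it: %s" % exc)
[/sourcecode]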

So what's a guy to do? Run to the mapper. I've not spent much time poring over the details of SA's internals, and the mapper is something I need to read up on more.

Side note: all these examples are from code using declarative base style SA definitions.

The mapper does this great magic of tying a Python object to a SA table definition. So in the end you get a nice combo you do all your work with. In the declarative syntax case you normally have all your models extend the declarative base. So the trick is to add a method for serializing SA objects to the declarative base and boom, magic happens.
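The "add it to the base and every model gets it" trick works because of plain Python attribute lookup, nothing SA-specific. Here's a toy sketch (the Base/User names are made up, not the real declarative base):

[sourcecode lang="python"]
class Base(object):
    """Hypothetical stand-in for the declarative base."""
    pass

class User(Base):
    def __init__(self):
        self.name = 'bob'

def todict(self):
    # naive version: just copy the instance's attribute dict
    return dict(self.__dict__)

# bolt the method onto the shared base; every subclass picks it up
Base.todict = todict

print(User().todict())  # {'name': 'bob'}
[/sourcecode]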

The model has a __table__ attribute on it that contains the list of columns. Those are the actual columns of data in my table, and they're the things I want to pull out into a serialized version of my object.

My first try at this looked something like

[sourcecode lang="python"]
def todict(self):
    d = {}
    for c in self.__table__.columns:
        value = getattr(self, c.name)
        d[c.name] = value

    return d
[/sourcecode]

This is great and all, but I ran into a problem. The first object I tried had a DateTime column in it, which choked when the json library tried to serialize a DateTime instance. So a quick hack to check if the column was a DateTime and, if so, convert it to a string got me up and running again.

[sourcecode lang="python"]
if isinstance(c.type, sqlalchemy.DateTime):
    value = getattr(self, c.name).strftime("%Y-%m-%d %H:%M:%S")
[/sourcecode]
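In isolation, the problem and the fix look like this (plain datetime, no SA involved):

[sourcecode lang="python"]
import json
from datetime import datetime

stamp = datetime(2009, 6, 1, 12, 30, 0)

# json.dumps(stamp) would raise TypeError, so stringify it first
print(json.dumps({'created': stamp.strftime("%Y-%m-%d %H:%M:%S")}))
# {"created": "2009-06-01 12:30:00"}
[/sourcecode]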

With that in place I attached the method to the SA Base class and I was in business. Any model now had a todict() function I could call.

[sourcecode lang="python"]
Base = declarative_base(bind=engine)
metadata = Base.metadata
Base.todict = todict
[/sourcecode]

This is great for my needs, but it does miss a few things. It just skips over any relations that are tied to the instance, so it's pretty basic. I'll also run into more column types that need to be converted. I figure that whole part will need a refactor in the future.
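One direction that refactor might take, sketched with plain Python types rather than SA column types (the CONVERTERS name and shape are my own invention, not anything from SA):

[sourcecode lang="python"]
from datetime import date, datetime

# hypothetical lookup table: value type -> serializer function
CONVERTERS = {
    datetime: lambda v: v.strftime("%Y-%m-%d %H:%M:%S"),
    date: lambda v: v.strftime("%Y-%m-%d"),
}

def convert(value):
    func = CONVERTERS.get(type(value))
    # types without an entry pass straight through untouched
    return func(value) if func is not None else value

print(convert(datetime(2009, 6, 1, 12, 0, 0)))  # '2009-06-01 12:00:00'
print(convert(42))  # 42
[/sourcecode]

Adding support for a new problem type then becomes a one-line addition to the table instead of another elif branch.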

Finally I got to thinking, "You know, I can often do a log.debug(dict(some_obj)) and get a nice output of that object and its properties." I wanted that as well. It seems more pythonic to do

[sourcecode lang="python"]
dict(sa_instance_obj)
# vs
sa_instance_obj.todict()
[/sourcecode]

After hitting up my Python reference book I found that the key to being able to cast something to a dict is to have it implement the iterable protocol. To do this you need to implement an __iter__ method that returns an iterator, something with a next() method, and for dict() to work that iterator has to yield (key, value) pairs.
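A tiny demo of that protocol in action, completely independent of SA:

[sourcecode lang="python"]
class Pairs(object):
    """Toy example: any object whose __iter__ yields
    (key, value) tuples can be handed to dict()."""

    def __iter__(self):
        yield ('a', 1)
        yield ('b', 2)

print(dict(Pairs()))  # {'a': 1, 'b': 2}
[/sourcecode]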

What does this mean? It means my todict() method needs to return something I can iterate over, and then I can just return it from __iter__. So I turned todict() into a generator that yields the (column name, value) pairs to iterate through.

[sourcecode lang="python"]
import sqlalchemy as sa
from sqlalchemy.ext.declarative import declarative_base

def todict(self):
    def convert_datetime(value):
        return value.strftime("%Y-%m-%d %H:%M:%S")

    for c in self.__table__.columns:
        if isinstance(c.type, sa.DateTime):
            value = convert_datetime(getattr(self, c.name))
        else:
            value = getattr(self, c.name)

        yield c.name, value

def iterfunc(self):
    """Returns an iterable that supports .next()
    so we can do dict(sa_instance)

    """
    return self.todict()

Base = declarative_base(bind=engine)
metadata = Base.metadata
Base.todict = todict
Base.__iter__ = iterfunc
[/sourcecode]

Now in my controllers I can do cool stuff like

[sourcecode lang="python"]
@jsonify
def controller(self, id):
    obj = Session.query(something).first()

    return dict(obj)
[/sourcecode]