
Bookie Weekly Status Report Returns! - April 15 2012

Ok, I'm overdue for a 'weekly' status report. I'm going to try to kick this back into gear as it helps you out there track things and me feel like I'm moving forward by writing down all the little things I've done over the last bit.

Trello board to keep up to date: https://trello.com/board/bookie/4f18c1ac96c79ec27105f228

New Projects

In an effort to add some features to Bookie I've ended up starting two new repos of code meant to interact with Bookie.

  1. Bookie Parser

This is meant to start taking over the work of reading page content and parsing the important, readable content out of it. It was a chance to play with Tornado and Heroku. This also means that in the future I'll be able to scale out the readable processing separately from the main Bookie website and host. It's pretty bare bones right now and doesn't directly talk to Bookie, but I'll look at adding that integration as soon as the API stabilizes and I get more tests going in it.

So far the Heroku bit has been pretty awesome. I have to deal with the fact that the app gets shut down and has to restart on first request, but hopefully that gets better as traffic and use picks up. You can tinker with it at http://readable.bmark.us
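
To make the idea concrete, here's a minimal sketch of the sort of service this is, written against Tornado's async HTTP client. The /parse route and the extract_readable() helper are placeholders I made up for illustration, not Bookie Parser's actual API:

[sourcecode lang="python"]
import tornado.ioloop
import tornado.web
from tornado.httpclient import AsyncHTTPClient


def extract_readable(html):
    # Stand-in for the real readable-parsing step; a real version would
    # strip the boilerplate and return just the page's main content.
    return html


class ReadableHandler(tornado.web.RequestHandler):

    @tornado.web.asynchronous
    def get(self):
        # Fetch the requested url without blocking the server.
        url = self.get_argument('url')
        AsyncHTTPClient().fetch(url, callback=self.on_fetch)

    def on_fetch(self, response):
        # Writing a dict gets us a JSON response back to the caller.
        self.write({'readable': extract_readable(response.body)})
        self.finish()


application = tornado.web.Application([(r'/parse', ReadableHandler)])

if __name__ == '__main__':
    application.listen(8888)
    tornado.ioloop.IOLoop.instance().start()
[/sourcecode]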

  2. Bookie Api

I've been wanting to start up a command line client for some of the Bookie work. The big thing is that I need tools to help manage invites and such, so it's currently very admin centric, but eventually I'd like to grow this into a cool ncurses command line interface to pull up recent bookmarks and even do some quick searches via the API. Aren't APIs cool? This will also contain the reference Python API implementation, so we'll have two implementations soon: one in JS and one in Python.

I've got a beta version (which is really an alpha) up on PyPI so you can:

[code]
$ pip install bookie_api
$ bookie ping
[/code]

Build baby build

I spent some quality time with http://build.bmark.us to get the JS tests running via grover and phantomjs, and that's awesome. I also added the new projects into the builder as well. So, while I don't have all the tests I need, at least now the ones I do have run consistently.

Other little tweaks

  • Prettied up the new user invite email and landing page
  • Fixed a bug with dupe tags in the tagcontroller
  • Added more icons from the Font Awesome set to pretty up the ui, especially the account page.
  • Lots of changes to the make/build steps for JS and CSS including actually doing the pyscss transition.
  • Everything is now on the final stable release of YUI 3.5. It's been a good ride through the development releases.

Upcoming events

I'll be giving a talk at Penguicon on using YUI for JS app development. If you're in the area stop by. This is Friday April 27th, at 6pm. Then on Saturday I've got a Bookie mini-sprint going on. I'll probably be hacking most of the weekend. Feel free to stop by and check things out.

A few ideas, quick ways to get on the Bookie contrib list

Quick ideas for improving Bookie

Well, with all the great stuff going on with Bookie, I've gotten a bit buried in some big changes. The background processing and importing updates are going to take a bit to get right.

This means there's a great chance for others to hack up the little tweaks that we need to really add some polish to Bookie. So below I've listed a few ideas that should be pretty simple to add, but that will have a really positive and visible effect on the site.

  • Add notification that user has invites

    Now that invites are there, we should highlight a user's account navigation link to let them know they have invites available. I'll periodically add them to the system, and we don't want users to have to go to their account page each time to see they've got invites. I think simply adding one of the envelope/message icons from our font-icon set, with some sort of hover message, would be perfect to start. We might also want to highlight the block in their account page so it stands out that the invites are available.

  • Flash message system.

    We want to be able to let users know things have happened successfully after doing something that redirected them. Imports are going to be doing this, as will saving/updating bookmarks, etc. It'd be nice to have a consistent ui to drop flash messages into and have them show after a redirect.

  • Show new user message if self bookmarks page has no results

    When a new user starts up and logs in, they default to their own page of bookmarks...which is going to be empty. So we should detect this in the JS code that fetches the results and display a set of default content with links to things like importing instructions, where to get the Chrome extension, and other handy new user tips.

    Some of this might also be nice to use for the email that a new user gets when they've been invited to Bookie.

  • Add firefox bookmark importer

    Ok, so this one is a bit more involved, but really, it's a single class and a couple of Python methods. The hard part is reading in and figuring out how to match bookmarks to tags in Firefox's JSON dump of bookmarks. Once we get the Firefox extension rolling, it'll be great to have a good import system for the browser as well.

Well, here are four things I'd love to see happen in the near future to help make the experience that much nicer for everyone. If you're interested at all or have any questions, ping me in #bookie on irc or shoot me a comment below. I'd be happy to help walk anyone through these or any other ideas you might have.

PyCon 2012: What a ride!

Phew, tiring trip to PyCon this year. This was my second year after hitting up my first last year. The conference definitely felt larger than last year as they crossed 2,200 attendees. It's unbelievable to see how large the Python community has gotten. I can't stress enough what a great job the people who put this together did.

Last year I hardly knew anyone. This year, however, I got to put faces to people I've interacted with over the last year, welcome back those I met last year, and get some face to face time with new co-workers from Canonical. The social aspect was a larger chunk of my time this year for sure.

Side note, I listen to The Changelog podcast from time to time, and I love their question on who you'd love to pair up/hack with as a programming hero type question. I got to meet and greet mine at this PyCon by meeting up with Mike Bayer. He's behind some great tools like SqlAlchemy and Mako. What I love is that, not only does he rock the code part, but the community part as well. I'm always amazed to see the time he puts into his responses to questions and support avenues. Highlight of my PyCon for sure.

I'll post a separate blog post on my sprint notes. I feel that if you're going to go, you might as well stay for the sprints. I get as much out of those as I do out of the conference itself. I think I made some good progress on things for Bookie this year. The big thing is that an invite system is in place, so if you'd like an account on Bmark.us let me know and I'll toss an invite your way.

Notes

  • Introduction to Metaclasses
    • Basic but reminded me how the bits worked and had some good examples. I like this because I often write 'the code I want to be writing' and then write my modules/etc to fit and metaclasses help with this sometimes.
  • Fast Test, Slow Test
    • Just a reminder that fast tests are true unit tests and run during dev, which helps make things easier/faster as you go vs the whole 'mad code' binge and then waiting for feedback on how wrong you are.
  • Practical Machine Learning in Python
    • mloss.org - check out for lots of notes/etc on ML in OSS
    • ml-class.org - teach me some ML please
    • sluggerml - app he built as a ML demo
    • scikit-learn : lots of potential, very active right now
  • Introduction to PDB
    • whoa...where have you been all my life 'until' command?
    • use 'where' more to move up stack vs adding more debug lines
  • Flexing SQLAlchemy's Relational Power
  • Hand Coded Applications with SQLAlchemy
    • <3 SqlAlchemy. Some really good examples of writing less code by automating the boilerplate with conventions.
  • Web Server Bottlenecks And Performance Tuning
    • lesson: if you think it's apache's fault think again. You're probably doing it wrong.
  • Advanced Celery
    • check out cyme https://github.com/celery/cyme, possible way to more easily run/distribute celery work?
    • cool to see implementations of map/reduce using celery
    • chords and groups are good, check them out more
  • Building A Python-Based Search Engine
    • Good talk for an intro to the terms and concepts of fulltext search
  • Lighting talks of note
    • py3 porting docs: http://docs.python.org/howto/pyporting
    • bpython rewind feature is full of win over ipython
    • 'new virtualenv' trying to get into stdlib for py3.3, cool!
    • asyncdynamo cool example of async boto requests for high performance working with AWS api (uses tornado)
    • I WANT the concurrent.futures library...but it's Python 3 :(

Been a good summer, fitness, woodworking, and new job coming soon...

I'm very late on a bunch of weekly status reports, but I've got good reason I promise. This summer has been a bit of a "summer of change" for me. Perhaps more renovation than change.

First, I decided I was sick of looking and feeling like crap and got into a weight loss program. I've finished that program today with some really good results. I'm down 37lbs, with another 19 to go until I hit my goal. Here's hoping I can get there before the year is out. This has really helped me feel a ton better. I did a lot of biking this summer, including a 48-mile monster that beat me up, but man was that fun.

Next up, I finally managed to clean the garage out. That means the woodworking tools are open for business. I've only started working on putting some finish on some new oak closet doors, but that's a huge move forward. I've not done any woodworking since the boy was born and now that the space is cleared out I'm looking forward to getting back to making some shavings.

Finally, the big news. I've put in my two weeks notice with Morpace and have accepted an offer to work for the Launchpad team with Canonical.

I can't express how excited I am to get this chance to work with a team of smart people that are really working hard to build something awesome. In many ways this is the culmination of a goal I set years ago to get paid to work on open source software, and not only that, but to get paid to work with Python as well.

Morpace was really good to me and I feel sad to leave them. They gave me my first chance to prove that I could be a productive Python developer. I've grown though and the opportunity to work with all of the great people at Canonical is just too much to pass up. I can't wait to jump in and see what I can help with.

It's been a good summer. I can only hope that each year continues to be as fulfilling.

Bookie Status Report: Jun 22nd 2011

Phew, it's been a good week with Bookie. First, we have some new ways to contact and follow Bookie. There's now a Google Group for better long form discussions and assistance. You have a feature idea, question, or just feedback? Then go ahead and send it to the list. Don't forget to follow BookieBmarks on Twitter. I want to start using these things to bounce ideas off of people as we move forward.

We managed to get user authentication working and added support across the site for routes for each user in the system. This was great because it means that http://bmark.us is a live running instance with a few different real users. So far we're running seven different users on the system. Part of that was getting all the routes updated, the queries, the templates to only show links for import/export if you're logged in, all of that stuff. The extension needed a little bit of updates, but the only big thing is changing your API URL to have your username in it.

We moved the documentation to http://docs.bmark.us and I tried to update the links in the docs as well. Much as Identi.ca is a running instance of Status.net, Bmark.us is a running instance of Bookie. I just can't get a cool url for Bookie such as http://bookie.net.

With the new site I purchased an SSL certificate. So you'll notice that the site is now all behind https. Part of the other sysadmin items were to setup twice daily pgdumps of the data which is sync'd off to S3, munin to monitor the server resources, including plugins for Nginx and Postgresql, and some Nginx config tweaks which I still need to document in the docs.

All of the code updates are available in the develop branch on Github. If you've got any questions feel free to hop into #bookie on irc and let me know.

With all this going on I'm looking for some early alpha testers. If you're interested in Bookie, but couldn't, or didn't want to, setup your own instance, let me know. I'd love to get another 6-10 people on this instance. The feedback of current users has already paid dividends. You can thank them for a pair of new features in the Chrome extension, including a new keyboard shortcut (ctrl-alt-d) and support for helping auto complete recently used tags.

Coming up, we need to work on moving the app forward by adding a user account page, creating add/edit abilities for authenticated users, and a bookmarklet that uses the add ability for mobile devices and other browsers.

If you care to help or follow along make sure to follow the project on github: http://github.com/mitechie/Bookie

  • Current Chrome Extension: 0.3.3
  • Most recent code updates: develop branch
  • Current working branch: feature/auth

Bookie Status Report: Jun 15th 2011

I just finished up reading Start Small, Stay Small and there were some good points in there. One is that writing about your progress on a project each week helps keep it moving forward. There is something about putting down what you've accomplished and what you plan into the public that helps keep the motivation motor running.

In an effort to keep Bookie from stagnating, I think that's a good thing to start doing. Count this as the first of a series of weekly progress reports I'm going to be doing. I also like that it helps show, beyond links to commit logs, that Bookie is moving forward and getting updates.

This past week has been a bit crazy. There hasn't been a ton of time to put into things, but I've managed to move a few big things forward:

First, work on making Bookie work via user accounts and logins is moving forward. Basically all of the urls in the application needed to be updated. Currently there are two sets: if you leave out a username from the url, you get overall, full-site info.

In this way, a url of /recent will pull the 50 most recent bookmarks from all users on the site. However, /rick/recent will only pull the 50 most recent bookmarks of the user rick. The API urls needed to be updated as well. There was a ton of work in getting this going, but it's a major step of progress toward allowing me to host a version of Bookie that other users can sign up for. Since that's really the big goal that I've set myself by the end of the year, I'm feeling good on this one.
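
To give a rough idea of the two sets of urls, here's a sketch of how they can be laid out in Pyramid (this is illustrative, not Bookie's actual route config):

[sourcecode lang="python"]
from pyramid.config import Configurator

config = Configurator()

# Site-wide: /recent pulls the most recent bookmarks across all users.
config.add_route('recent', '/recent')

# Per-user: /rick/recent pulls only rick's most recent bookmarks. The view
# code reads the username out of request.matchdict['username'].
config.add_route('user_recent', '/{username}/recent')
[/sourcecode]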

The idea of multiple users has me realizing that my little readability.py script, which fetches url content for bookmarks and stores the clean, readable parsed html for each page, needs some work. It'll never scale that way. So I've split the work into a couple of parts.

One part is a node.js script that takes a list of urls and asynchronously goes out and fetches the html content. It then shoves the bookmark id and the content into a beanstalkd queue for processing. The queue is polled by a python script that then calls a new Bookie API call with the content and the id. Bookie then runs the parsing code against the content and stores it in the database. The async code on node.js can fetch the html content in a hurry. In testing with my SSD hard drive and sqlite, I'm able to pull, process, and store more than one url per second. This is with 1 node.js producer and two running instances of the python consumer.
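
For the curious, the python consumer side can be sketched roughly like this, assuming the beanstalkc client library; the tube name, payload format, and API url here are stand-ins I made up, not the real ones:

[sourcecode lang="python"]
import json
import urllib2

import beanstalkc

beanstalk = beanstalkc.Connection(host='localhost', port=11300)
beanstalk.watch('fetched_pages')

while True:
    # Block until the node.js producer pushes a fetched page onto the queue.
    job = beanstalk.reserve()
    data = json.loads(job.body)

    # Hand the bookmark id and raw html over to Bookie's API, which runs
    # the readable parsing code and stores the result.
    req = urllib2.Request(
        'http://localhost:6543/api/v1/bmark/%s/content' % data['bid'],
        data=json.dumps({'content': data['content']}),
        headers={'Content-Type': 'application/json'})
    urllib2.urlopen(req)

    job.delete()
[/sourcecode]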

I'm definitely looking forward to ramping this up on a real server with Postgresql running. I'd love to be able to pull down and parse content at some decent rates to be able to cope with new users signing up to the service.

So that's this week's report. Next up is more work on the multi user setup. The tag urls still need work and all of the unit tests that I had need to be updated to test the new urls. This also means some duplicate tests to check both with/without usernames in the urls. Work is never done!

If you care to help or follow along make sure to follow the project on github: http://github.com/mitechie/Bookie

Introducing Bookie: only for developers


Delicious has been pretty stagnant over the years. I love using Instapaper and kept waiting for Delicious to add that kind of feature for the pages I send into it, but it never arrived. I also kept waiting for link checkers, a decent mobile view, and maybe even some support for full text search of actual web page content. Now that Yahoo has leaked out that Delicious is going to be "Sunset", I've finally gotten the push to start up my own bookmark storing application where I can add all the great features I want to have.

So without further ado, I give you bookie. It's built in Python using the Pyramid web framework. It's my first Pyramid app, but I do Pylons for the day job so it's not too far off.

This is the 0.1 release I'm calling "only for developers". It's not an easy one-click install, and right now it'll want you to have the -dev packages for the db engines to work right, but it does work. It can store, retrieve, and search bookmarks stored. It supports Delicious and Google bookmarks import, and will do an export. There's a fully functioning Google Chrome extension you can use to store and edit bookmarks.

This is currently a self-hosted application. In time, I hope to provide a hosted version, perhaps. I also hope to get a Firefox extension going, as well as get started on the fancy features I noted above. So this is a LONG way from complete, but I feel like I've hit at least a 0.1-worthy milestone and want to put a stake in the ground. Here's the start of something, and hopefully it'll keep improving in future releases.

I want to thank my local guys for helping, testing, and adding great bits of code to bookie already. It's a lot of fun seeing other people get interested in things and contributing as well.

I'm trying to keep the todo list and some idea of the next milestones in the github wiki. Feel free to note bugs, add requests, etc. You can also chat with us in #bookie on freenode.

Testing out Ruby and comparing to the great Python

A long while ago I checked out a Ruby on Rails class that a guy was giving. I was doing PHP work and this Rails thing had quite the hubbub going on about it. So like any good techie I bought the Rails book, the Ruby book, and went to work. A week later I shelved the whole thing and decided it was entirely too much Perl-ish magic for my tastes. Lesson: never try to learn a language and a framework at the same time. You're going to get lost. It's not all bad; after that the same guy had a Turbogears weekend class and I fell in love with Python and the rest is history. Of course, that Ruby experience always felt like a bit of a failure on my part, so when I was sick this past week I decided to check out Ruby itself. It's used in some interesting projects that I've been wanting to check out: Chef, Puppet, and Capistrano.

So I spent the weekend crash coursing through the books Everyday Scripting with Ruby and The Ruby Programming Language, and I figured I'd share my initial impressions on the things that look kind of cool. I'm looking at this from the standpoint of possibly doing some Ruby for quick scripts, but definitely not heading back into the Rails area or doing web development. I love me some Python for that stuff.

First up, perl-ish regex in the string object is going to be handy. When doing quick scripts, one of the things that makes Perl such a powerhouse is how ingrained regexes are. The ability to do complex matching, replacing, etc in a hurry while processing a file is just good stuff. So I'm definitely looking forward to using some:

[sourcecode lang="ruby"] matches = /Ruby/.match("Checking out some Ruby") [/sourcecode]

The next thing is a pretty small one, but I love the idea of ending methods with a ? to denote it returns a bool. Code like myobj.new? looks so much nicer than myobj.is_new(). I know, small one, but small things can count.

I'm also liking symbols. It's another small thing, but it's an extra character I don't have to hit (the closing quote) and it just reads nicer in the editor since hash keys are no longer strings, but their own syntax. It also makes a lot of sense from the optimization end since there aren't multiple instances of the same string taking up space. I can't imagine it's huge, so for me it's mainly the prettier look in the editor.

[sourcecode lang="ruby"] myhash = { :first => "one thing", :second => "second thing" } [/sourcecode]

I also love the idea of running quick shell commands with just some backticks. With Python I need to import subprocess, perform a function call, set up stderr/stdout captures, etc. With Ruby I could just do:

[sourcecode lang="ruby"] file_list = `ls $home` [/sourcecode]

and that would definitely be handy in doing some quick shell scripts.
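
For comparison, here's roughly what that one-liner costs in Python; check_output() (added in Python 2.7) keeps it shorter than the full Popen dance, but it's still more ceremony than backticks:

[sourcecode lang="python"]
import os
import subprocess

# Roughly the equivalent of Ruby's: file_list = `ls $home`
file_list = subprocess.check_output(['ls', os.environ['HOME']])
[/sourcecode]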

What I'm not so sure about are things like the string variable substitutions. You basically do things like:

[sourcecode lang="ruby"] one_var = 'testing'

puts "I'm #{one_var} ruby out" [/sourcecode]

So you have to predefine the variable and it's got to be in the namespace of the string you're doing the replacement in. You can also use % to do some replacement, but I'm missing something like Python's "".format() in Ruby. I'm sure there are probably libraries that help with it, but I've not run into them yet.
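
For reference, this is the Python behavior I keep reaching for; format() takes the values explicitly instead of needing them predefined in the enclosing namespace:

[sourcecode lang="python"]
# Positional and named substitutions, no predefined variables required.
print "I'm {0} ruby out".format('testing')
print "{name} wrote {lang}".format(name='Matz', lang='Ruby')
[/sourcecode]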

I'm also not sold on the repeated use of self. You can have self at the module level as well as the class level. And the fact that it's not a reference to an instance, but to the class level, makes it something that just rubs my brain the wrong way. Of course, since self is all over Python classes and refers to the current instance, I admit that it's probably more my Python usage getting in my way.

Finally, I'm not sold on the gems yet. I know it's pretty much just like eggs from the Cheeseshop, but it just seems things are much more immature and harder to find. It might still be inexperience, and I'll pick things up as I go along, but initially it's a barrier trying to find libraries to help with things.

Anyway, I do like some parts of it and I look forward to trying it out for some scripts. I think it might be just the right mix of Python and Perl for that type of work.

Working on OSS @ Work: Dozer profiling Pylons

It's been a fun little bit working on helping speed up a Pylons app at work. Performance needed improvement, and while we knew a couple of big places to look, I also wanted to get some profiling in place. Fortunately I recently ran across Marius's post on profiling Pylons applications with Dozer. Now Dozer was originally meant for viewing app logs and memory checking, but it seemed that the in-development work added some profiling ability. So away I went checking out his code and seeing if I could get it to run. Once I realized that you had to set up the middleware for each type of activity you wanted to perform (cProfile, memory monitor, log view), I got things running. Very cool!

Right off the bat I realized I might be able to help a bit. The links were dark blue on black, and there seemed to be some issues with the ui on the profiling detail screen. Since we were using this for work, I took it as a chance to help improve things upstream. I moved the css images to get the arrow indicators working right, cycled the background rows vs hard coding them, and did some small ui tweaks I liked. I also coded up a set of "show/hide all" links to quickly expand things out and back.

Of course, it's not all roses. The show all pretty much times out in anything but Chrome. There's still more ui bits I think that could be improved.

Now that I had tinkered, though, it was time to add in some features we could use. First up, log view for json requests. We have some timer code in our SqlAlchemy logs and I want to be able to view those in the browser, including for json requests. So I tweaked it up so that on json requests it adds a _dozer_logview object to the response. This then shows the various log entries along with the time diff that the html view shows.

Once that was going we decided this would be great to put on a staging version of our web application that the client has access to. The nice thing is that staging is as close to production as possible, so some profiling information there would be very helpful. We don't want the html logview information visible to the client though; it detracts from the ui. To help with this I added an ipfilter config option for the LogView feature. In this way we can limit it to a couple of testing dev boxes and not worry about the client getting access.

I've pushed the code up into a bitbucket fork of Marius's repository. Hopefully this is useful, and it's awesome that I got to spend some work time on code that I can hopefully give back. I love this stuff.

Pylons controller with both html and json output

We're all using ajax these days, and one of the simple uses of ajax is to update or add some HTML content to a page. Often the same data is also displayed on a url/page of its own. So you might have a url /job/list, and then you might want to pull a list of jobs onto a page via ajax. So my goal is to be able to reuse controllers to provide details for ajax calls, calls from automated scripts, and whole pages. The trouble with this is that the @jsonify decorator in Pylons is pretty basic. It just sets the content type header to application/json and takes whatever you return and tries to jsonify it for you.

That's great, but I can't reuse that controller to send HTML output any more. So I set out to figure out how the decorator works and create one that works more like I wish.

The first thing in setting this up was to look at how to structure any ajax output. I can't stand the urls you hit via ajax that just dump out some content or string and make you look up every controller in order to figure out just what you're getting back.

I prefer to get a structured format back. So what parts do we need? Really, just a few things. Your ajax library will tell you if there's an error such as a timeout, a 404, etc. It won't tell you if you made a call to a controller and don't have permission, or if the controller couldn't complete the requested action. So the first thing we need is some indication of success in our response.

The second component is feedback on why that success value came back. If the controller returns a lack of success, we'll want to know why. Or maybe it is successful, but we need some note about the process along the way. Either way, we need a standard message we can send back.

Finally, we might want to return some sort of data back. This could be anything from a json dump of the object requested to actual html output we want to use.

So that leaves us with a definition like:

[sourcecode lang="javascript"]
{'success': true,
 'message': 'Yay, we did it',
 'payload': {'id': 10, 'name': 'Bob'}}
[/sourcecode]

I want to enforce that any ajax controller will output something in this format. It makes it much easier to write effective DRY javascript that can handle it, and it really leaves us open to handle about anything we need.

So my json decorator is going to have to make sure that if the user requests a json response, that it gets all this info. If the user requests an html response, it'll just return the generated template html.

By copying parts of the @jsonify and the @validate decorators I came up with something that adds a self.json to the controller method. In here we setup our json response parts.

Finally, we catch whether this is a json request. If so, we return our dumped self.json instance. Otherwise, we return the html the controller sends back. If the controller is returning rendered html and it is a json request, then we stick that html into the payload as payload.html.

So take a peek at my decorator code and the JSONResponse object that it uses. Let me know what you think and any suggestions. It's my first venture into the world of Python decorators.

@myjson decorator Gist
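
If you don't want to click through, here's a rough sketch of the shape of it, reconstructed from the description above; the Accept-header check, the JSONResponse fields, and the wrapper details are my guesses at the gist's contents, not the actual code:

[sourcecode lang="python"]
import json

from pylons import request, response


class JSONResponse(object):
    """The enforced structure: a success flag, a message, and a payload."""

    def __init__(self):
        self.success = False
        self.message = ''
        self.payload = {}


def myjson():
    def wrapper(func):
        def inner(self, *args, **kwargs):
            # Give the controller method a self.json to fill in as it runs.
            self.json = JSONResponse()
            html = func(self, *args, **kwargs)

            if 'application/json' in request.headers.get('Accept', ''):
                # JSON was requested: any rendered html rides along in the
                # payload as payload.html.
                if html:
                    self.json.payload['html'] = html
                response.content_type = 'application/json'
                return json.dumps({'success': self.json.success,
                                   'message': self.json.message,
                                   'payload': self.json.payload})

            # Otherwise just hand back the generated template html.
            return html
        return inner
    return wrapper
[/sourcecode]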

Sample Controller:

[sourcecode lang="python"]
@myjson()
def pause(self, id):
    result = SomeObj.pause()
    message = 'Paused' if result else 'Failed'

    if self.accepts_json():
        self.json.success = bool(result)
        self.json.message = message
        self.json.payload['job_id'] = id

    return '<h1>Result was: %s</h1>' % message

# Response:
# {'success': true,
#  'message': 'Paused',
#  'payload': {'html': '<h1>Result was: Paused</h1>'}}
[/sourcecode]

SqlAlchemy Migrations for mysql and sqlite and testing

I really want to be a unit tester, I really do. Unfortunately it's one of those things I can't seem to get going. I start and end up falling short before I get over the initial setup hurdle. Or I get a couple of tests working, but then, as I have a hard time trying to test parts of things, I fade. So here goes my latest attempt. It's for a web app I'm working on at work and I REALLY want to have this under at least basic unit tests. Since it's a database-talking web application, my first step is to get a test db up and running to run my tests against. At least with that up I can start some actual web tests that add and alter objects via some ajax api calls.

In order to get a test db I first had to figure out how to setup a database for the tests. For speed and ease purposes I'd rather be able to use sqlite. This way I don't need to setup/reset a mysql db on each host I end up trying to run tests on.

Of course this is complicated because I'm using sqlalchemy-migrate for my application. This means part of the testing should be to init a new sqlite db and then bring it up to the latest version. In order to do this I had to convert my existing migrations to work in both MySQL and Sqlite.

Step 1: I need a way to tell the migrations code to use the sqlite db vs the live mysql db. I've set up a manage.py script in my project root, so I hacked it up to check for a --sqlite flag. Not that great, but it works.

[sourcecode lang="python"]
"""In order to support unit testing with sqlite we need to add a flag for
specifying that db:

    python manage.py version_control --sqlite
    python manage.py upgrade --sqlite

Otherwise it will default to using the mysql connection string.

"""
import sys

from migrate.versioning.shell import main

if '--sqlite' in sys.argv:
    main(url='sqlite:///apptesting.db', repository='app/migrations')
else:
    main(url='mysql://connection_string', repository='app/migrations')
[/sourcecode]

Step 2: Not all of my existing migrations were sqlite friendly. I had cheated and added some columns with straight sql, like:

[sourcecode lang="python"]
from sqlalchemy import *
from migrate import *

def upgrade():
    sql = "ALTER TABLE jobs ADD COLUMN created TIMESTAMP DEFAULT CURRENT_TIMESTAMP;"
    migrate_engine.execute(sql)

def downgrade():
    sql = "ALTER TABLE jobs DROP created;"
    migrate_engine.execute(sql)
[/sourcecode]

This worked great with MySQL, but sqlite didn't like it. In order to get things to work both ways I moved to using the changeset tools to make these more SA happy.

[sourcecode lang="python"]
from datetime import datetime

from sqlalchemy import *
from migrate import *
from migrate.changeset import *

meta = MetaData(migrate_engine)
jobs_table = Table('jobs', meta)

def upgrade():
    col = Column('created', DateTime, default=datetime.now)
    col.create(jobs_table)

def downgrade():
    sql = "ALTER TABLE jobs DROP created;"
    migrate_engine.execute(sql)
[/sourcecode]

A couple of notes. This abstracted the column creation so that both sqlite and mysql would take it. Notice I did NOT update the drop command. Sqlite won't drop columns, and I honestly didn't care, because the goal is for my unit tests to be able to bring up a database instance for testing; I'm not going to run the downgrade commands against the sqlite database.

Step 3: So with all of the migrations moved to do everything via SA-abstracted code vs SQL strings, I was in business. My final problem was one migration in particular. I had changed a field from a varchar to an int field. Sqlite won't let you do a simple 'ALTER TABLE...', and even when I had the command turned into SA-based changeset code, my db upgrade failed due to sqlite tossing an exception.

What did I do? I cheated. First, I updated the migration with the original column definition to an Integer field. I mean, any new installs could walk that migration up just fine. I happen to know that the two deployments right now have already made the change from varchar to int. So for them, the change won't break anything for upgrade/downgrade.

I then kept the migration with the change, but tossed it in a try/except block so that I could trap the failure nicely and just output a message: "If this fails, it's probably sqlite choking." It's hackish, but it works for all the use cases I need.
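
A sketch of what that hack looks like; the column name here is made up, and the alter() call is sqlalchemy-migrate's changeset API, so treat this as illustrative rather than my exact migration:

[sourcecode lang="python"]
from sqlalchemy import *
from migrate import *
from migrate.changeset import *

meta = MetaData(migrate_engine)
jobs_table = Table('jobs', meta, autoload=True)

def upgrade():
    try:
        # Fine on MySQL; sqlite refuses to alter the column type.
        jobs_table.c.status.alter(type=Integer)
    except Exception:
        print "If this fails, it's probably sqlite choking."
[/sourcecode]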

Now I can create a new test database with the commands:

[code]
python manage.py version_control --sqlite
python manage.py upgrade --sqlite
[/code]

Now I can start building my test suite to use this as the bootstrap to create a test db. I'll have to then remove the file on teardown so that I don't get any errors, but that'll be part of the testing setup. Not in memory, but oh well...it works.
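
The bootstrap itself can be as dumb as shelling out to manage.py, something like this (a sketch assuming unittest-style tests and the apptesting.db filename from above):

[sourcecode lang="python"]
import os
import subprocess
import unittest


class DbTestCase(unittest.TestCase):

    def setUp(self):
        # Build a fresh sqlite db by walking the migrations up.
        subprocess.call(['python', 'manage.py', 'version_control', '--sqlite'])
        subprocess.call(['python', 'manage.py', 'upgrade', '--sqlite'])

    def tearDown(self):
        # Remove the file so the next run starts clean.
        os.remove('apptesting.db')
[/sourcecode]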

A follow up, more dict to SqlAlchemy fun

Just a quick follow up to my last post on adding some tools to help serialize SqlAlchemy instances. I needed to do the reverse: I want to take the POST'd values from a form submission and tack them onto one of my models. So I now also add a fromdict() method onto Base that looks like:

[sourcecode lang="python"]
def fromdict(self, values):
    """Merge items from the values dict into our object if the key is one
    of our columns.

    """
    for c in self.__table__.columns:
        if c.name in values:
            setattr(self, c.name, values[c.name])

Base.fromdict = fromdict
[/sourcecode]

So in my controllers I can start doing:

[sourcecode lang="python"]
def controller(self):
    obj = SomeObj()
    obj.fromdict(request.POST)
    Session.add(obj)
[/sourcecode]

Hacking the SqlAlchemy Base class

I'm now a month into my new job. I've started working for a market research company here locally. It's definitely new territory, since I don't know that I've ever found myself really reading or talking about 'market research' before. In a way it's a lot like my last job in advertising: you get in and there's a whole world of new terms, processes, etc. you need to get your head around. The great thing about my new position is that it's a Python web development gig. I'm finally able to concentrate on learning the ins and outs of the things I've been trying to learn and gain experience with in my spare time.

So hopefully as I figure things out I'll be posting updates to put it down to 'paper' so I can think it through one last time.

So I started with some more SqlAlchemy hacking. At my new place we use Pylons, SqlAlchemy (SA), and Mako as the standard web stack. I've started working on my first 'ground up' project, and I've been trying to make SqlAlchemy fit into the way I like doing things.

So first up, I like the instances of my models to be serializable. I like to have json output available for most of my controllers. We all want pretty ajax functionality, right? But the json library can't serialize a SqlAlchemy model by default. And if you just try to iterate over sa_obj.__dict__ it won't work, since you've got all kinds of SA-specific fields and related objects in there.

So what's a guy to do? Run to the mapper. I've not spent much time poring over the details of SA's internals, and the mapper is something I need to read up on more.

Side Notes: all these examples are from code using declarative base style SA definitions.

The mapper does this great magic of tying a Python object and a SA table definition. So in the end you get a nice combo you do all your work with. In the declarative syntax case you normally have all your models extend the declarative base. So the trick is to add a method of serializing SA objects to the declarative base and boom, magic happens.

The model has a __table__ instance in it that contains the list of columns. Those are the hard columns of data in my table. These are the things I want to pull out into a serialized version of my object.

My first try at this looked something like:

[sourcecode lang="python"]
def todict(self):
    d = {}
    for c in self.__table__.columns:
        value = getattr(self, c.name)
        d[c.name] = value

    return d
[/sourcecode]

This is great and all, but I ran into a problem. The first object I tried had a DateTime column in it that choked things, since the json library can't serialize a DateTime instance. So a quick hack to check if the column was a DateTime, and if so convert it to a string, got me up and running again.

[sourcecode lang="python"]
if isinstance(c.type, sqlalchemy.DateTime):
    value = getattr(self, c.name).strftime("%Y-%m-%d %H:%M:%S")
[/sourcecode]

This was great and all. I attached this to the SA Base class and I was in business. Any model now had a todict() function I could call.

[sourcecode lang="python"]
Base = declarative_base(bind=engine)
metadata = Base.metadata
Base.todict = todict
[/sourcecode]

This is great for my needs, but it does miss a few things. This just skips over any relations that are tied to this instance. It's pretty basic. I'll also run into more fields that need to be converted. I figure that whole part will need a refactor in the future.

Finally I got thinking, "You know, I can often do a log.debug(dict(some_obj)) and get a nice output of that object and its properties." I wanted that as well. It seems more pythonic to do

[sourcecode lang="python"]
dict(sa_instance_obj)
# vs
sa_instance_obj.todict()
[/sourcecode]

After hitting up my Python reference book I found that the key to being able to cast something to a dict is to have it implement the iterable protocol. To do this you need to implement a __iter__ method that returns something that implements a next() method.
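
Stepping away from SA for a second, the general trick looks like this: dict() will consume any iterable of (key, value) pairs, so implementing __iter__ as a generator of pairs is all it takes (a toy example, not SA code):

[sourcecode lang="python"]
class Point(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __iter__(self):
        # Yielding (key, value) pairs is enough for dict() to consume us.
        yield ('x', self.x)
        yield ('y', self.y)

point = dict(Point(1, 2))
print point['x'], point['y']  # 1 2
[/sourcecode]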

What does this mean? It means my todict() method needs to return something I can iterate over. Then I can just return it from my object. So I turned todict into a generator that yields the (column name, value) pairs to iterate through.

[sourcecode lang="python"]
def todict(self):
    def convert_datetime(value):
        return value.strftime("%Y-%m-%d %H:%M:%S")

    for c in self.__table__.columns:
        if isinstance(c.type, sa.DateTime):
            value = convert_datetime(getattr(self, c.name))
        else:
            value = getattr(self, c.name)

        yield (c.name, value)


def iterfunc(self):
    """Returns an iterable that supports .next() so we can do
    dict(sa_instance).

    """
    return self.todict()


Base = declarative_base(bind=engine)
metadata = Base.metadata
Base.todict = todict
Base.__iter__ = iterfunc
[/sourcecode]

Now in my controllers I can do cool stuff like

[sourcecode lang="python"]
@jsonify
def controller(self, id):
    obj = Session.query(something).first()

    return dict(obj)
[/sourcecode]

Auto Logging to SqlAlchemy and Turbogears 2

I've been playing with TurboGears 2 (TG2) for some personal tools that help me get work done. One of the things I've run into is an important feature my work code has that isn't in my TG2 application yet. In my PHP5 app at work, I use the Doctrine ORM, and I have post insert, update, and delete hooks that go in and log changes to the system. It works great, and I can build up a history of an object over time to see who changed which fields and such.

With my TG2 app doing inserts and updates, I initially just faked the log events by manually saving Log() objects from within my controllers as I do the work that needs to be done.

This sucks though since the point is that I don't have to think about things. Anytime code changes something it's logged. So I had to start searching the SqlAlchemy (SA) docs to figure out how to duplicate this in TG2. I wanted something that's pretty invisible. In my PHP5 code I have a custom method I can put onto my Models in case I want to override the default logged messages and such.

I found part of what I'm looking for in the SA MapperExtension. This blog post got me looking in the right direction. The MapperExtension provides a set of methods to hook a function into. The hooks I'm interested in are the 'after_delete', 'after_insert', and 'after_update' methods. These are passed the instance of the object and a connection object, so I can generate an SQL query to manually save the log entry for the object.

So I have something that looks a little bit like this:

[sourcecode lang="python"]
from sqlalchemy.orm.interfaces import MapperExtension


class LogChanges(MapperExtension):

    def after_insert(self, mapper, connection, instance):
        query = "INSERT INTO log (username, type, client_ip, application) \
                 VALUES ('%s', %d, '%s', '%s')" % (
            u"rick", 4, u'127.0.0.1', u'my_app')

        connection.execute(query)
[/sourcecode]

Then I pass that into my declarative model as:

[sourcecode lang="python"]
__mapper_args__ = {'extension': LogChanges()}
[/sourcecode]

This is very cool and all, but it's not all the way where I want to head. First, the manual SQL logging query kind of sucks. I have an AppLog() model that I just want to pass some values to in order to create a log entry. I'm thinking what I really should do is find a different way to do the logging itself. I'm debating actually doing a separate logging application that I would call with the object's details.

The problem with this is that one of the things I do in my current app is store the old values of the object. This way I can loop through them and see which values actually changed and generate that in the log message. This is pretty darn useful.

The other downside is that I don't have a good way to have a custom logging message generator if I just call a Logging app API.

So I think I might try out the double db connection methods that SA and TG2 support. This way I could actually try to use the second db instance with a Logging() object to write out the changes without messing up the current session/unit of work.

The missing part here is that I'm still not really sure how to get the 'old' object values in order to generate a list of fields that have been changed. Guess I have some more hacking to do.
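
One avenue I still need to try is SA's attribute history support. From my reading of the docs (so this is untested scratch code, not something I've run), something like this in a before_update hook might get at the old values:

[sourcecode lang="python"]
from sqlalchemy.orm import EXT_CONTINUE
from sqlalchemy.orm.attributes import get_history
from sqlalchemy.orm.interfaces import MapperExtension


class LogChanges(MapperExtension):

    def before_update(self, mapper, connection, instance):
        changed = {}
        for col in mapper.columns:
            # get_history returns (added, unchanged, deleted) for the
            # attribute; 'deleted' holds the old value when it changed.
            hist = get_history(instance, col.name)
            if hist.has_changes():
                old = hist.deleted[0] if hist.deleted else None
                new = hist.added[0] if hist.added else None
                changed[col.name] = (old, new)

        # changed now maps field name -> (old, new) for the log message.
        return EXT_CONTINUE
[/sourcecode]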