Thoughts filed in: Python

Prev 1 2

Thread progress monitoring in Python

My recent hobby hack-together, my photo cleanup tool FotoJazz, required me getting my hands dirty with threads for the first time (in Python or otherwise). Threads allow you to run a task in the background, and to continue doing whatever else you want your program to do, while you wait for the (usually long-running) task to finish (that's one definition / use of threads, anyway — a much less complex one than usual, I dare say).

However, if your program hasn't got much else to do in the meantime (as was the case for me), threads are still very useful, because they allow you to report on the progress of a long-running task at the UI level, which is better than your task simply blocking execution, leaving the UI hanging, and providing no feedback.

As part of coding up FotoJazz, I developed a re-usable architecture for running batch processing tasks in a thread, and for reporting on the thread's progress in both a web-based (AJAX-based) UI, and in a shell UI. This article is a tour of what I've developed, in the hope that it helps others with their thread progress monitoring needs in Python or in other languages.


FotoJazz: A tool for cleaning up your photo collection

I am at times quite a prolific photographer. Particularly when I'm travelling, I tend to accumulate quite a quantity of digital snaps (although am still working on the quality of said snaps). I'm also a reasonably organised and systematic person: as such, I've developed a workflow for fixing up, naming and archiving my soft-copy photos; and I've also come to depend on a variety of scripts and little apps, that perform various steps of the workflow for me.

Sadly, my system has had some disadvantages. Most importantly, there are too many separate scripts / apps involved, and with too many different interfaces (mix of manual point-and-click, drap-and-drop, and command-line). Ideally, I'd like all the functionality unified in one app, with one streamlined graphical interface (and also everything with equivalent shell access). Also, my various tools are platform-dependent, with most of them being Windows-based, and one being *nix-based. I'd like everything to be platform-independent, and in particular, I'd like everything to run best on Linux — as I'm trying to do as much as possible on Ubuntu these days.

Plus, I felt in the mood for getting my hands dirty coding up the photo-management app of my dreams. Hence, it is with pleasure that I present FotoJazz, a browser-based (plus shell-accessible) tool built with Python and Flask.


Jimmy Page: site-wide Django page caching made simple

For some time, I've been using the per-site cache feature that comes included with Django. This site's caching needs are very modest: small personal site, updated infrequently, with two simple blog-like sections and a handful of static pages. Plus, it runs fast enough even without any caching. A simple "brute force" solution like Django's per-site cache is more than adequate.

However, I grew tired of the fact that whenever I published new content, nothing was invalidated in the cache. I began to develop a routine of first writing and publishing the content in the Django admin, and then SSHing in to my box and restarting memcached. Not a good regime! But then again, I also couldn't bring myself to make the effort of writing custom invalidation routines for my cached pages. Considering my modest needs, it just wasn't worth it. What I needed was a solution that takes the same "brute force" page caching approach that Django's per-site cache already provided for me, but that also includes a similarly "brute force" approach to invalidation. Enter Jimmy Page.


An inline image Django template filter

Adding image fields to a Django model is easy, thanks to the built-in ImageField class. Auto-resizing uploaded images is also a breeze, courtesy of sorl-thumbnail and its forks/variants. But what about embedding resized images inline within text content? This is a very common use case for bloggers, and it's a final step that seems to be missing in Django at the moment.

Having recently migrated this site over from Drupal, my old blog posts had inline images embedded using image assist. Images could be inserted into an arbitrary spot within a text field by entering a token, with a syntax of [img_assist nid=123 ... ]. I wanted to be able to continue embedding images in roughly the same fashion, using a syntax as closely matching the old one as possible.

So, I've written a simple template filter that parses a text block for tokens with a syntax of [thumbnail image-identifier], and that replaces every such token with the image matching the given identifier, resized according to a pre-determined width and height (by sorl-thumbnail), and formatted as an image tag with a caption underneath. The code for the filter is below.


An autop Django template filter

autop is a script that was first written for WordPress by Matt Mullenweg (the WordPress founder). All WordPress blog posts are filtered using wpautop() (unless you install an additional plug-in to disable the filter). The function was also ported to Drupal, and it's enabled by default when entering body text into Drupal nodes. As far as I'm aware, autop has never been ported to a language other than PHP. Until now.

In the process of migrating this site from Drupal to Django, I was surprised to discover that not only Django, but also Python in general, lacks any linebreak filtering function (official or otherwise) that's anywhere near as intelligent as autop. The built-in Django linebreaks filter converts all single newlines to <br /> tags, and all double newlines to <p> tags, completely irrespective of HTML block elements such as <code> and <script>. This was a fairly major problem for me, as I was migrating a lot of old content over from Drupal, and that content was all formatted in autop style. Plus, I'm used to writing content in that way, and I'd like to continue writing content in that way, whether I'm in a PHP environment or not.

Therefore, I've ported Drupal's _filter_autop() function to Python, and implemented it as a Django template filter. From the limited testing I've done, the function appears to be working just as well in Django as it does in Drupal. You can find the function below.

Prev 1 2