Improve Django ORM performance on Foreign Keys
If you use foreign keys on a Django model, you might not notice the performance cost until it hits you. Navigating ForeignKey relationships in your code or templates is very easy, but each related object that has not been loaded yet costs a separate database query when you access it.
Let's look at this problem with a simple model:
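from django.contrib.auth.models import User
from django.db import models
class Category(models.Model):
    name = models.CharField(max_length=255)
class Article(models.Model):
    title = models.CharField(max_length=255)
    content = models.TextField()
    # on_delete is required since Django 2.0; CASCADE matches the old implicit default
    category = models.ForeignKey(Category, on_delete=models.CASCADE)
    created_by = models.ForeignKey(User, related_name='+', on_delete=models.CASCADE)
    modified_by = models.ForeignKey(User, related_name='+', blank=True, null=True, on_delete=models.CASCADE)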
For a list of articles you could write something like this:
for article in Article.objects.all():
    # every related object accessed below is fetched with its own query
    print("%s by %s in %s" % (article.title, article.modified_by or article.created_by, article.category.name))
Run the code and look at the generated queries, either in the Django development server output or, a little prettier, in the SQL tab of the Django Debug Toolbar. With 3 Article objects, this sends 7 SQL queries to your database:
- One for the list of article objects
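- One for the modified_by field on each object
- One for the created_by field on each object
- One for the category field on each object
Because of the "modified_by or created_by" expression, only one of the two users is actually loaded per article: an empty modified_by costs no query, and a set one short-circuits the "or". Each article therefore adds two queries, which gives 1 + 3 × 2 = 7.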
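To work around this issue, call select_related() on the manager of the Article class, for example Article.objects.select_related('category', 'created_by', 'modified_by'). This fetches the referenced objects in the same query using SQL joins, which is usually a lot faster! Note that select_related() without arguments does not follow foreign keys with null=True (modified_by here), so name the fields explicitly. Have a look at the documentation for a list of parameters.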
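To make the difference concrete without a Django project, here is a rough sketch using only the sqlite3 module from the standard library. The tables demo_user, demo_category and demo_article and the sample rows are made up for illustration, and the SQL is only an approximation of the pattern, not the statements Django actually generates. It counts the queries for the lazy loop and for a single joined query:

```python
import sqlite3

# Hypothetical tables standing in for the User, Category and Article models.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE demo_user (id INTEGER PRIMARY KEY, username TEXT);
    CREATE TABLE demo_category (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE demo_article (
        id INTEGER PRIMARY KEY,
        title TEXT,
        category_id INTEGER,
        created_by_id INTEGER,
        modified_by_id INTEGER
    );
    INSERT INTO demo_user VALUES (1, 'alice');
    INSERT INTO demo_user VALUES (2, 'bob');
    INSERT INTO demo_category VALUES (1, 'News');
    INSERT INTO demo_article VALUES (1, 'First', 1, 1, NULL);
    INSERT INTO demo_article VALUES (2, 'Second', 1, 1, 2);
    INSERT INTO demo_article VALUES (3, 'Third', 1, 2, NULL);
""")

queries = []  # every SELECT we send, in order


def run(sql, params=()):
    """Execute one SELECT and record it, like a debug toolbar would."""
    queries.append(sql)
    return conn.execute(sql, params).fetchall()


def list_lazy():
    """One query for the articles, then two more per article."""
    lines = []
    articles = run("SELECT title, category_id, created_by_id, modified_by_id "
                   "FROM demo_article ORDER BY id")
    for title, category_id, created_by_id, modified_by_id in articles:
        # "modified_by or created_by": only one of the two users is looked up
        user_id = modified_by_id if modified_by_id is not None else created_by_id
        username = run("SELECT username FROM demo_user WHERE id = ?", (user_id,))[0][0]
        category = run("SELECT name FROM demo_category WHERE id = ?", (category_id,))[0][0]
        lines.append("%s by %s in %s" % (title, username, category))
    return lines


def list_joined():
    """Everything in a single query; LEFT JOIN because modified_by may be NULL."""
    rows = run("""
        SELECT a.title, COALESCE(m.username, c.username), cat.name
        FROM demo_article AS a
        JOIN demo_category AS cat ON cat.id = a.category_id
        JOIN demo_user AS c ON c.id = a.created_by_id
        LEFT JOIN demo_user AS m ON m.id = a.modified_by_id
        ORDER BY a.id
    """)
    return ["%s by %s in %s" % row for row in rows]


queries.clear()
lazy = list_lazy()
lazy_count = len(queries)

queries.clear()
joined = list_joined()
joined_count = len(queries)

print(lazy_count, joined_count)  # prints: 7 1
```

Both functions produce the same three lines of output; only the number of round trips to the database differs, and that gap grows with every additional article in the list.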
How to parse a syslog logfile in python
Thanks to the incredible pyparsing module it is really easy to parse arbitrary files without the hassle of regular expressions.
The following code parses a standard syslog-ng logfile:
import string
from pyparsing import Word, alphas, Suppress, Combine, nums, Optional, Regex
# three-letter month abbreviation, e.g. "Jan"
month = Word(string.ascii_uppercase, string.ascii_lowercase, exact=3)
integer = Word(nums)
# timestamp such as "Jan 12 10:15:01", combined into a single token
serverDateTime = Combine(month + " " + integer + " " + integer + ":" + integer + ":" + integer)
hostname = Word(alphas + nums + "_" + "-")
# daemon name with an optional PID in brackets, followed by a colon
daemon = Word(alphas + "/" + "-" + "_") + Optional(Suppress("[") + integer + Suppress("]")) + Suppress(":")
# the rest of the line is the message
message = Regex(".*")
bnf = serverDateTime + hostname + daemon + message
with open('/path/to/logfile') as syslogFile:
    for line in syslogFile:
        fields = bnf.parseString(line)
        print(fields)
It's not over yet
Inspired by the Python hacking and the possibilities the Yahoo! Developer Network offers, I promptly crossed Simon Willison's world heatmap script from his talk at the StackOverflow DevDays with the Yahoo Developer Network. The result is this map of the world showing the countries of origin of the banned IPs from the current wave (red is bad):

And while I was at it, I also made one for the most frequently rejected mails on the mail server:

Eclipse SVN auto keywords
In ~/.subversion/config, set
[miscellany]
enable-auto-props = yes
and add the following entries:
[auto-props]
*.java = svn:eol-style=native;svn:keywords=Id
*.php = svn:eol-style=native;svn:keywords=Id
From then on, every svn add also sets these properties on matching files (here *.java and *.php).
