Archive for the ‘Python’ Category

Pretty colors: Syntax Highlighting in Mercurial

Sunday, November 18th, 2007

I recently switched my personal revision control system from subversion to mercurial. One of the great things about mercurial is the built-in web interface, but I missed the syntax highlighting that’s available in interfaces such as ViewVC.

I’ve written a mercurial extension that applies pygments syntax highlighting to the file view. This extension is now available in the main mercurial repo.

To enable it, install pygments and add the following entry to hgrc:

[extensions]
hgext.highlight =

An example of the output.

Thanks to micha who wrote an initial patch.

Python λ Shorthand

Saturday, November 10th, 2007

In Python a lambda expression (anonymous function) is created with the lambda keyword:

map(lambda x: x+1, [1,2,3])

Some Scheme interpreters such as Dr. Scheme allow the λ symbol (U+03BB) as a shorthand for lambda. I started wondering what this would look like in Python. For example:

map(λ x: x+1, [1,2,3])

Or for a slightly more gratuitously complex example, a recursive factorial with the Y combinator:

Y = (λ X:
         (λ p:
              X(λ arg: p(p)(arg)))
     ((λ p:
           X(λ arg: p(p)(arg)))))

F = (λ f: (λ x: (x*f(x-1) if x > 0 else 1)))

fact = Y(F)
fact(5) => 120
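For anyone without a patched interpreter, here is the same construction spelled with the ordinary lambda keyword. It is only a sketch to check that the combinator behaves as described; the names Y, F and fact are taken from the example above.

```python
# Strict (applicative-order) Y combinator, written with the stock keyword.
# The inner "lambda arg: p(p)(arg)" delays the self-application so that
# Python's eager evaluation does not recurse forever.
Y = (lambda X:
         (lambda p: X(lambda arg: p(p)(arg)))
         (lambda p: X(lambda arg: p(p)(arg))))

# F receives "the rest of the recursion" as f and returns one factorial step.
F = lambda f: (lambda x: x * f(x - 1) if x > 0 else 1)

fact = Y(F)
print(fact(5))  # 120
```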

The interpreter changes needed to make the λ shorthand work are quite simple. All it requires is a small change to the grammar, an update to the syntax checker, and a hack in the parser generator to treat characters with the high bit set as though they were in the alpha class.
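If patching the interpreter is not an option, a rough approximation is to rewrite the source text before handing it to Python. This is a different technique from the grammar change described above, and the helper name expand_lambda is made up for illustration. It is deliberately naive: it would also rewrite a λ that appears inside a string literal or comment.

```python
def expand_lambda(source):
    """Replace every λ with the lambda keyword (naive textual substitution)."""
    return source.replace("λ", "lambda")


# The translated text is ordinary Python and can be evaluated as usual.
increment = eval(expand_lambda("λ x: x + 1"))
print(list(map(increment, [1, 2, 3])))  # [2, 3, 4]
```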

There are a few cases where the shorthand is beneficial. For example, compare:

[i for i in lst if i > 42]

to:

filter(λ x: x > 42, lst)

While list comprehensions are the standard Python way to accomplish this, I do like how the predicate comes first in the latter version.
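A quick check that the two forms select the same items, written with the stock lambda keyword. The sample list is invented. Note that in Python 3 filter() returns a lazy iterator, so it is wrapped in list() here.

```python
lst = [7, 99, 42, 43, 100]  # made-up sample data

by_comprehension = [i for i in lst if i > 42]
by_filter = list(filter(lambda x: x > 42, lst))  # filter() is lazy in Python 3

print(by_comprehension == by_filter)  # True
```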

Also, pretty much all of the functions in the itertools module could work nicely with the shorthand form:

itertools.groupby(lst, λ x: x.somevalue)
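Here is a runnable version of the groupby call using the ordinary keyword. The Item record and its values are hypothetical stand-ins for whatever objects carry a somevalue attribute. One thing to keep in mind: groupby only merges adjacent items, so the input is sorted by the same key first.

```python
import itertools
from collections import namedtuple

Item = namedtuple("Item", ["name", "somevalue"])  # hypothetical record type
lst = [Item("a", 1), Item("b", 2), Item("c", 1)]

key = lambda x: x.somevalue

# Sort by the key first; groupby would otherwise yield two separate groups for 1.
groups = {k: [item.name for item in group]
          for k, group in itertools.groupby(sorted(lst, key=key), key)}
print(groups)  # {1: ['a', 'c'], 2: ['b']}
```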

Even so, this probably isn’t something that would be worth adding to the language. Lambda expressions are rarely useful in Python, so a shorthand form is not going to add much benefit compared to the added language complexity. Not to mention the need for a UTF-8 aware editor and terminal.

Using It

Since λ probably doesn’t appear on your keyboard a certain amount of editor configuration is required. The easiest way in emacs is to define a key binding that inserts the character:

(global-set-key "\C-c\C-l" (lambda () (interactive) (insert "λ")) )

Alternatively, abbrev-mode can insert this character whenever the word lambda is typed:

(define-abbrev-table 'global-abbrev-table
  '(("lambda" "λ" nil 0)))

To build an interpreter with this enabled you’ll need this patch applied to a recentish checkout of python 3k. I don’t know if it would work on earlier versions.

Update: On reddit, gwern pointed out that it’s better to translate from lambda -> λ in the editor display rather than embed this symbol in the text. There is (of course) elisp to do this.

Burkhard-Keller (BK) Trees In Python

Sunday, October 28th, 2007

I recently came across a description of an interesting data structure called a Burkhard-Keller tree (BK tree). BK trees provide efficient lookup of the set of words that lie within a certain distance of a query word. For example, this could be used to suggest corrections in a spell checker.

To play around with this I’ve written up a Python implementation of the tree. For example:

>>> tree = BKTree(levenshtein, dictionary_words)
>>> tree.query("ricohet", 2)
[(1, 'ricochet'), (2, 'richer'), (2, 'riches'), 
 (2, 'richest'), (2, 'ricochets')]
>>>

In the above example ‘levenshtein’ is a function that implements the Levenshtein Distance. This is commonly used to determine the distance between a word and its misspellings.

How much faster is a BK tree than the brute-force approach? Here are a few example queries compared with the brute-force time:

word, distance    BK Tree    Brute Force
amphibious, 2     3s         15s
ricochet, 2       3s         11s
the, 2            0.2s       6s

You can see the BK tree is much faster for small distances. As the query distance increases the BK tree query time approaches that of the brute force method.

Of course, there’s no free lunch. Creating this tree of 57,024 words takes 94s on my system.
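To make the trade-off concrete, here is a minimal sketch of how such a tree can work. It is not the module linked below. The BKTree(distance, words) constructor and query(word, n) mirror the usage shown earlier, but the internals, the calls counter, the brute_force helper and the tiny word list are all assumptions made for illustration.

```python
def levenshtein(a, b):
    """Edit distance between strings a and b (two-row dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        current = [i]
        for j, cb in enumerate(b, start=1):
            current.append(min(previous[j] + 1,                # delete ca
                               current[j - 1] + 1,             # insert cb
                               previous[j - 1] + (ca != cb)))  # substitute
        previous = current
    return previous[-1]


class BKTree:
    """Each node is (word, children); children maps a distance to a subtree."""

    def __init__(self, distance, words):
        self.distance = distance
        self.calls = 0  # distance evaluations made by the most recent query
        words = iter(words)
        self.root = (next(words), {})
        for word in words:
            self._add(word)

    def _add(self, word):
        stored, children = self.root
        while True:
            d = self.distance(word, stored)
            if d == 0:
                return  # word is already in the tree
            if d not in children:
                children[d] = (word, {})
                return
            stored, children = children[d]

    def query(self, word, n):
        """Return sorted (distance, word) pairs within distance n of word."""
        self.calls = 0
        matches, stack = [], [self.root]
        while stack:
            stored, children = stack.pop()
            d = self.distance(word, stored)
            self.calls += 1
            if d <= n:
                matches.append((d, stored))
            # Triangle inequality: a match can only sit in a subtree whose
            # edge label lies in the range [d - n, d + n].
            for edge, child in children.items():
                if d - n <= edge <= d + n:
                    stack.append(child)
        return sorted(matches)


def brute_force(distance, words, word, n):
    """Reference answer: measure the distance to every word."""
    scored = ((distance(word, w), w) for w in words)
    return sorted(pair for pair in scored if pair[0] <= n)


words = ["ricochet", "richer", "riches", "richest", "ricochets",
         "amphibious", "the", "then", "them"]
tree = BKTree(levenshtein, words)
print(tree.query("ricohet", 2))
print(tree.calls, "distance calls, versus", len(words), "for brute force")
```

Building the tree costs one distance computation per level for every inserted word, which is where the construction time goes. As n grows, the range [d - n, d + n] covers more edges, fewer subtrees get pruned, and the query drifts toward the brute-force cost.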

The source for this module is available here. Enjoy!

etframes: Applying the ideas of Edward Tufte to matplotlib

Monday, September 3rd, 2007

Edward Tufte is a professor and author known for his excellent (and beautiful!) books on the visual display of statistical information. Last year I had the opportunity to attend one of his courses and was inspired to apply his ideas to my favorite plotting library, matplotlib.

The result is etframes, a python module that operates on matplotlib plots. So far I’ve implemented two graph types described in The Visual Display of Quantitative Information (VDQI): the dash-dot-plot and range frames.

Dash Dot Plot

A dash-dot-plot places a tick mark on the axis for each value in a scatter plot. When there are many values in the graph this can be a more effective way to understand their distribution than looking at the raw data. For example:

Example of a dash-dot-plot

See demo_ddp.py for a working example.
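The idea can be approximated with matplotlib's own tick machinery. The sketch below is not the etframes API. The function name add_dash_dot_ticks and the sample data are made up, and it assumes a matplotlib Axes object is passed in.

```python
def add_dash_dot_ticks(ax, xs, ys):
    """Scatter the points and put one unlabeled minor tick per data value."""
    ax.scatter(xs, ys)
    # Minor ticks carry no labels by default, so they act as plain dashes.
    ax.set_xticks(sorted(set(xs)), minor=True)
    ax.set_yticks(sorted(set(ys)), minor=True)
    ax.tick_params(which="minor", length=6)


def demo():
    """Show the effect on invented data (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily; only the demo needs it
    fig, ax = plt.subplots()
    add_dash_dot_ticks(ax, [1, 2, 2.5, 4], [3, 1, 4, 1.5])
    plt.show()
```

Calling demo() opens a window with the scatter plot; the dashes along each axis show where the values cluster.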

Range Frames

The range frame re-uses the frame (edge) of a graph to display useful information. Instead of drawing a full frame around the graph the frame is only drawn from the minimum to the maximum value along that axis. For example:

Example of a range frame

See demo_range.py for a working example.
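A similar effect can be sketched with matplotlib's spines, which are the lines that make up the frame. Again, this is not how etframes does it. The function name apply_range_frame and the sample data are hypothetical, and the code assumes a matplotlib version whose Axes exposes spines with set_bounds.

```python
def apply_range_frame(ax, xs, ys):
    """Draw the bottom and left frame lines only across the data range."""
    for side in ("top", "right"):
        ax.spines[side].set_visible(False)  # drop the uninformative edges
    ax.spines["bottom"].set_bounds(min(xs), max(xs))
    ax.spines["left"].set_bounds(min(ys), max(ys))


def demo():
    """Show the effect on invented data (requires matplotlib)."""
    import matplotlib.pyplot as plt  # imported lazily; only the demo needs it
    xs, ys = [1, 2, 2.5, 4], [3, 1, 4, 1.5]
    fig, ax = plt.subplots()
    ax.scatter(xs, ys)
    apply_range_frame(ax, xs, ys)
    plt.show()
```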

Other Work

There are several other graph types described in VDQI that would be nice to implement, particularly the extension of range frames that turns them into a box plot.

A related project is sparkplot which uses the matplotlib library to create sparklines.

A Few Things About Python and Emacs 22

Tuesday, May 1st, 2007

This week I upgraded to the not-quite-released emacs 22. One of the big changes (for me at least) is the new default implementation of the python major mode, python.el. Some observations:

  1. python.el does not seem to have the same kinds of problems with syntax highlighting of triple-quoted strings that python-mode did. This alone is worth the upgrade in my mind.

  2. In python-mode.el the RET key was bound to py-newline-and-indent. When you typed ‘if condition:RET’ this would automatically indent the next line within the ‘if’ block. In python.el this functionality is bound to “C-j” instead of RET. You can get the old behavior (which I prefer) by putting this in your .emacs:

    
    (add-hook 'python-mode-hook
              (lambda ()
                (define-key python-mode-map "\C-m" 'newline-and-indent)))
    

    Thanks to johan for the suggestion to do this in a hook.

  3. Turn on transient-mark-mode. This is necessary for certain features like (un)comment region (M-;) to work. In emacs 21 transient-mark-mode didn’t play well with python-mode (e.g. py-mark-block didn’t work), so I turned it off. That doesn’t seem to be a problem anymore.

I’m sure there are many more changes, but those are what I’ve noticed after a bit of use. Overall I’m happy with the new implementation.

My next task will be to get Guido’s xreload to work from emacs.