etframes: Applying the ideas of Edward Tufte to matplotlib

Edward Tufte is a professor and author known for his excellent (and beautiful!) books on the visual display of statistical information. Last year I had the opportunity to attend one of his courses and was inspired to apply his ideas to my favorite plotting library, matplotlib.

The result is etframes, a Python module that operates on matplotlib plots. So far I’ve implemented two graph types described in The Visual Display of Quantitative Information (VDQI): the dash-dot-plot and range frames.

Dash Dot Plot

A dash-dot-plot places a tick mark on the axis for each value in a scatter plot. When there are many values in the graph this can be a more effective way to understand their distribution than looking at the raw data. For example:

Example of a dash-dot-plot

See demo_ddp.py for a working example.
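
For readers who want to see the idea without etframes, here is a minimal sketch using plain matplotlib. It is not how etframes is implemented; the function names tick_positions and dot_dash_plot are made up for this illustration, and it assumes a matplotlib version that provides pyplot.subplots. The dashes are drawn as minor ticks, one per data value, while the major tick marks are shrunk to zero length so only their numbers remain.

```python
def tick_positions(values):
    """Return the sorted, de-duplicated positions at which to draw dashes."""
    return sorted(set(values))


def dot_dash_plot(xs, ys):
    """Scatter xs against ys and mark every data value with a dash on its axis.

    Hypothetical helper for illustration only; not part of etframes.
    """
    # Imported lazily so tick_positions() can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.scatter(xs, ys)

    # One minor tick per data value along each axis.
    ax.set_xticks(tick_positions(xs), minor=True)
    ax.set_yticks(tick_positions(ys), minor=True)
    ax.tick_params(which="minor", length=6)

    # Hide the major tick marks but keep their numeric labels.
    ax.tick_params(which="major", length=0)
    return fig, ax
```

Calling dot_dash_plot(xs, ys) with two equal-length lists and then matplotlib.pyplot.show() should display the scatter plot with a dash on each axis for every point.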

Range Frames

The range frame re-uses the frame (edge) of a graph to display useful information. Instead of drawing a full frame around the graph the frame is only drawn from the minimum to the maximum value along that axis. For example:

Example of a range frame

See demo_range.py for a working example.
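
The same effect can be approximated in plain matplotlib by trimming the axis spines. This is a sketch under assumptions, not the etframes implementation: data_bounds and range_frame_plot are hypothetical names, and it relies on Spine.set_bounds being available in the installed matplotlib.

```python
def data_bounds(values):
    """Return the (minimum, maximum) of the data along one axis."""
    return min(values), max(values)


def range_frame_plot(xs, ys):
    """Plot ys against xs with a frame that only spans the data range.

    Hypothetical helper for illustration only; not part of etframes.
    """
    # Imported lazily so data_bounds() can be used without matplotlib.
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    ax.plot(xs, ys, "o")

    # Remove the top and right edges of the frame entirely.
    ax.spines["top"].set_visible(False)
    ax.spines["right"].set_visible(False)

    # Draw the remaining edges only from the minimum to the maximum value.
    ax.spines["bottom"].set_bounds(*data_bounds(xs))
    ax.spines["left"].set_bounds(*data_bounds(ys))
    return fig, ax
```

For example, range_frame_plot([1, 2, 3], [100, 200, 300]) draws a bottom edge running from 1 to 3 and a left edge running from 100 to 300.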

Other Work

There are several other graph types described in VDQI that would be nice to implement, particularly the extension of range frames that turns them into a box plot.
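
As a starting point for that extension, the sketch below computes the five numbers a box-plot-style frame would need for one axis. It only covers the arithmetic, not the drawing; the function name five_number_summary is made up here, and the choice of the "inclusive" quartile method is an assumption rather than anything prescribed by VDQI. It requires Python 3.8 or later for statistics.quantiles.

```python
import statistics


def five_number_summary(values):
    """Return (minimum, lower quartile, median, upper quartile, maximum).

    Hypothetical helper for illustration only; not part of etframes.
    """
    # n=4 splits the data into quartiles, giving three cut points.
    q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
    return min(values), q1, median, q3, max(values)
```

A frame edge could then run from the minimum to the maximum, as in a plain range frame, with the span between the two quartiles and the position of the median marked along it.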

A related project is sparkplot, which uses matplotlib to create sparklines.

8 Responses to “etframes: Applying the ideas of Edward Tufte to matplotlib”

  1. seanstickle » Blog Archive » A python module that operates on matplotlib plots Says:

    […] From Adam Hupp. […]

  2. a11en Says:

    Hi Adam, just registered- I was very excited by this, enough to finally try out an installation of python/scipy/matplotlib etc. I’m very new at all this, but I’m giving it a go, just to try it out and learn more about scipy etc.

    I’ve tried out the demos, but am having some problems… demo_range.py works just fine (I’m manually typing this into iPython right now)- but, I can’t seem to make the demo_ddp.py work. It’s telling me that it doesn’t know what random() means.

    I suspect a problem in my install, but I’ve been able to run all the numpy tests without issues… so I’m a bit stumped. I believe that numpy is importing without problems, but even if I import numpy as n, then do an n.random() I’m not getting anywhere.

    Any thoughts are greatly appreciated! Like the plots!
    -Allen

  3. a11en Says:

    Umm… replace all instances of random() with normal() my mistake! [mistake is in the comment writing- not in the nature of the comment- I still can’t run ddp.py]

  4. a11en Says:

    Ok, darnit, I hate when I do this… I figured it out. :( Sorry. It was a problem with calling the routines. For me, it doesn’t seem to be able to guess which module has the call. So, your script for the ddp demo looks like this for me:

    from numpy import random
    import pylab
    import etframes

    ys = [random.normal() for _ in range(100)]
    xs = [random.normal() for _ in range(100)]

    pylab.scatter(xs,ys)

    etframes.add_dot_dash_plot(pylab.gca(), ys=ys, xs=xs)

    pylab.show()

    Essentially, all calls for me need to reference the imports… I don’t know if import numpy as * would work… perhaps. On some SciPy sites, I’ve seen the preference of having import numpy as n, import etframes as e, etc… then your calls are like: p.gca() instead of gca().

    In any case, very very interesting work!! :) Feel free to edit the comments as you wish- I just wanted some of this info out there for anyone trying to make your demos work with a new/newbie install of SciPy and all the goodies. :)

    Cheers!!
    -Allen

    ps- If I use any of these in publications for any reason, you will get acknowledgment and I’ll be in touch to let you know. :)

  5. a11en Says:

    Ok, ok, last post for today, I promise- I noticed that the ticks in the dash regions of the dot-dash plot appear to still be there? They can obscure the data a bit, adding a sense that data is there that isn’t really in the distribution… was it a compromise due to the way of plotting, or is it possible to remove them?

    Thanks for all your thoughts and this great solution to the dot-dash and the range-plot!
    -Allen

  6. adam Says:

    Thank you for the comments. I’d love to hear about it if this was used in any kind of publication.

    Good point about the major tick marks. You can turn them off by putting this into the script after the imports:

    rcParams['xtick.major.size'] = 0
    rcParams['ytick.major.size'] = 0

    I don’t see any cleaner way of doing this. If you turn off the major ticks entirely it also turns off the numbers below.

  7. Norbert Klamann Says:

    Hello Adam,
    I am new to matplotlib and maybe that is the deeper reason for my question but here we go.

    I try to use etframes integrated in a wxpython application and use a program like this, which goes into recursion and crashes.

    I presume that I use etframes in a wrong way but can’t figure it out.

    Source follows (hopefully the indentation keeps in order)

    #!/usr/bin/env python

    import wx
    import wx.grid
    from numpy import *
    import matplotlib
    matplotlib.use('WXAgg')
    from matplotlib.backends.backend_wxagg import FigureCanvasWxAgg
    from matplotlib.figure import Figure
    import etframes

    def minmax(data):
        return min(data), max(data)

    class ActionFrame(wx.Frame):

        def __init__(self, *args, **kwargs):
            wx.Frame.__init__(self, *args, **kwargs)
            self.fig = Figure((5,4), 75)
            self.canvas = FigureCanvasWxAgg(self, -1, self.fig)
            sizer = wx.BoxSizer(wx.VERTICAL)
            # This way of adding to sizer allows resizing
            sizer.Add(self.canvas, 1, wx.LEFT|wx.TOP|wx.GROW)
            self.SetSizer(sizer)
            self.Fit()
            data = [100, 200, 300,]
            index = [1,2,3]
            self.axes = self.fig.add_subplot(111)
            etframes.add_range_frame(self.axes,minmax(data),minmax(index))
            self.axes.plot(index,data,'go')

    class App(wx.App):
        def OnInit(self):
            self.frm = ActionFrame(None,title="Action Frame")
            self.frm.Show()
            self.SetTopWindow(self.frm)
            return True

    if __name__ == '__main__':
        app = App(False)
        app.MainLoop()

  8. Norbert Klamann Says:

    The same problem occurs when I use ‘scatter’ instead of ‘plot’.

    This is the first part of the traceback:

    Traceback (most recent call last):
      File "C:\Python25\Lib\site-packages\matplotlib\backends\backend_wx.py", line 1081, in _onPaint
        self.draw(repaint=False)
      File "C:\Python25\Lib\site-packages\matplotlib\backends\backend_wxagg.py", line 61, in draw
        FigureCanvasAgg.draw(self)
      File "C:\Python25\Lib\site-packages\matplotlib\backends\backend_agg.py", line 358, in draw
        self.figure.draw(self.renderer)
      File "C:\Python25\Lib\site-packages\matplotlib\figure.py", line 624, in draw
        for a in self.axes: a.draw(renderer)
      File "C:\Python25\Lib\site-packages\matplotlib\axes.py", line 1345, in draw
        a.draw(renderer)
      File "C:projektelm2.worksrcviewyagutetframes.py", line 67, in draw
        rf = self.make_range_frame()
      File "C:projektelm2.worksrcviewyagutetframes.py", line 85, in make_range_frame
        colors=[self.color])
      File "C:\Python25\Lib\site-packages\matplotlib\collections.py", line 678, in __init__
        self._colors = _colors.colorConverter.to_rgba_list(colors)
      File "C:\Python25\Lib\site-packages\matplotlib\colors.py", line 327, in to_rgba_list
        c[i] = self.to_rgba(cc, alpha) # change in place
      File "C:\Python25\Lib\site-packages\matplotlib\colors.py", line 309, in to_rgba
        raise ValueError('to_rgba: Invalid rgba arg "%s"\n%s' % (str(arg), exc))
    ValueError: to_rgba: Invalid rgba arg "(100, 300)"
