etframes: Applying the ideas of Edward Tufte to matplotlib

Edward Tufte is a professor and author known for his excellent (and beautiful!) books on the visual display of statistical information. Last year I had the opportunity to attend one of his courses and was inspired to apply his ideas to my favorite plotting library, matplotlib.

The result is etframes, a Python module that operates on matplotlib plots. So far I’ve implemented two graph types described in The Visual Display of Quantitative Information (VDQI): the dash-dot-plot and range frames.

Dash Dot Plot

A dash-dot-plot places a tick mark on the axis for each value in a scatter plot. When there are many values in the graph this can be a more effective way to understand their distribution than looking at the raw data. For example:

Example of a dash-dot-plot

See demo_ddp.py for a working example.
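
The idea can also be approximated with plain matplotlib. The following is a minimal sketch, not the etframes implementation: it assumes that minor ticks placed at every data value are an acceptable stand-in for the dashes, and the helper name dot_dash_sketch is made up for illustration.

```python
import random

import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt
from matplotlib.ticker import FixedLocator


def dot_dash_sketch(xs, ys):
    """Scatter plot with one minor tick per data value on each axis.

    Hypothetical helper for illustration; not part of etframes.
    """
    fig, ax = plt.subplots()
    ax.scatter(xs, ys, s=10)
    # One minor tick per observation, so the axis shows the marginal distribution.
    ax.xaxis.set_minor_locator(FixedLocator(sorted(xs)))
    ax.yaxis.set_minor_locator(FixedLocator(sorted(ys)))
    ax.tick_params(which="minor", length=6, width=0.5)
    return fig, ax


rng = random.Random(0)  # fixed seed so the figure is reproducible
xs = [rng.gauss(0, 1) for _ in range(100)]
ys = [rng.gauss(0, 1) for _ in range(100)]
fig, ax = dot_dash_sketch(xs, ys)
```

Calling fig.savefig("dot_dash_sketch.png") afterwards writes the figure to disk. The major ticks and their labels are left in place here.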

Range Frames

The range frame re-uses the frame (edge) of a graph to display useful information. Instead of drawing a full frame around the graph the frame is only drawn from the minimum to the maximum value along that axis. For example:

Example of a range frame

See demo_range.py for a working example.
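
For comparison, here is a rough sketch of the same effect using only matplotlib's spine objects. It is not how etframes draws its frame, and it assumes a matplotlib version that provides Spine.set_bounds; the data values are invented for the example.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up data for illustration.
xs = [1.0, 2.5, 3.0, 4.2, 6.0]
ys = [2.0, 3.5, 1.5, 4.0, 3.0]

fig, ax = plt.subplots()
ax.scatter(xs, ys)

# Drop the parts of the frame that carry no information.
for side in ("top", "right"):
    ax.spines[side].set_visible(False)

# Draw the remaining frame lines only from the minimum to the maximum value.
ax.spines["bottom"].set_bounds(min(xs), max(xs))
ax.spines["left"].set_bounds(min(ys), max(ys))
```

The ends of each frame line now mark the extremes of the data, so the frame itself reports the range along that axis.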

Other Work

There are several other graph types described in VDQI that would be nice to implement, particularly the extension of range frames that turns them into a box plot.

A related project is sparkplot which uses the matplotlib library to create sparklines.


8 Responses to etframes: Applying the ideas of Edward Tufte to matplotlib

  1. Pingback: seanstickle » Blog Archive » A python module that operates on matplotlib plots

  2. a11en says:

    Hi Adam, just registered- I was very excited by this, enough to finally try out an installation of python/scipy/matplotlib etc. I’m very new at all this, but I’m giving it a go, just to try it out and learn more about scipy etc.

    I’ve tried out the demos, but am having some problems… demo_range.py works just fine (I’m manually typing this into iPython right now)- but, I can’t seem to make the demo_ddp.py work. It’s telling me that it doesn’t know what random() means.

    I suspect a problem in my install, but I’ve been able to run all the numpy tests without issues… so I’m a bit stumped. I believe that numpy is importing without problems, but even if I import numpy as n, then do an n.random() I’m not getting anywhere.

    Any thoughts are greatly appreciated! Like the plots! -Allen

  3. a11en says:

    Umm… replace all instances of random() with normal() my mistake! [mistake is in the comment writing- not in the nature of the comment- I still can't run ddp.py]

  4. a11en says:

    Ok, darnit, I hate when I do this… I figured it out. :( Sorry. It was a problem with calling the routines. For me, it doesn’t seem to be able to guess which module has the call. So, your script for the ddp demo looks like this for me:

    from numpy import random
    import pylab
    import etframes

    ys = [random.normal() for _ in range(100)]
    xs = [random.normal() for _ in range(100)]

    pylab.scatter(xs,ys)

    etframes.add_dot_dash_plot(pylab.gca(), ys=ys, xs=xs)

    pylab.show()

    Essentially, all calls for me need to reference the imports… I don’t know if from numpy import * would work… perhaps. On some SciPy sites, I’ve seen the preference of having import numpy as n, import etframes as e, etc… then your calls are like: p.gca() instead of gca().

    In any case, very very interesting work!! :) Feel free to edit the comments as you wish- I just wanted some of this info out there for anyone trying to make your demos work with a new/newbie install of SciPy and all the goodies. :)

    Cheers!! -Allen

    ps- If I use any of these in publications for any reason, you will get acknowledgment and I’ll be in touch to let you know. :)

  5. a11en says:

    Ok, ok, last post for today, I promise- I noticed that the ticks in the dash regions of the dot-dash plot appear to still be there? They can obscure the data a bit, adding a sense that data is there that isn’t really in the distribution… was it a compromise due to the way of plotting, or is it possible to remove them?

    Thanks for all your thoughts and this great solution to the dot-dash and the range-plot! -Allen

  6. adam says:

    Thank you for the comments. I’d love to hear about it if this was used in any kind of publication.

    Good point about the major tick marks. You can turn them off by putting this into the script after the imports:

    rcParams['xtick.major.size'] = 0
    rcParams['ytick.major.size'] = 0

    I don’t see any cleaner way of doing this. If you turn off the major ticks entirely it also turns off the numbers below.
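
    On matplotlib versions that provide Axes.tick_params (an assumption; it may not exist in the release this thread was written against), the same effect can be limited to a single axes instead of changing the global rcParams. A small sketch with made-up data:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs without a display
import matplotlib.pyplot as plt

fig, ax = plt.subplots()
ax.scatter([1, 2, 3], [2, 1, 3])  # made-up data

# Shrink the major tick marks to zero length on this axes only.
# The tick labels (the numbers) stay visible.
ax.tick_params(axis="both", which="major", length=0)
```

    Because nothing global is touched, other figures in the same script keep their normal tick marks.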

  7. Norbert Klamann says:

    Hello Adam, I am new to matplotlib and maybe that is the deeper reason for my question but here we go.

    I try to use etframes integrated in a wxpython application and use a program like this, which goes into recursion and crashes.

    I presume that I use etframes in a wrong way but can’t figure it out.

    Source follows (hopefully the indentation keeps in order)

    #!/usr/bin/env python

    import wx
    import wx.grid
    from numpy import *
    import matplotlib
    matplotlib.use('WXAgg')
    from matplotlib.backends.backend_wxagg import FigureCanvasWxAgg
    from matplotlib.figure import Figure
    import etframes

    def minmax(data):
        return min(data), max(data)

    class ActionFrame(wx.Frame):

        def __init__(self, *args, **kwargs):
            wx.Frame.__init__(self, *args, **kwargs)
            self.fig = Figure((5,4), 75)
            self.canvas = FigureCanvasWxAgg(self, -1, self.fig)
            sizer = wx.BoxSizer(wx.VERTICAL)
            # This way of adding to sizer allows resizing
            sizer.Add(self.canvas, 1, wx.LEFT|wx.TOP|wx.GROW)
            self.SetSizer(sizer)
            self.Fit()
            data = [100, 200, 300,]
            index = [1,2,3]
            self.axes = self.fig.add_subplot(111)
            etframes.add_range_frame(self.axes,minmax(data),minmax(index))
            self.axes.plot(index,data,'go')

    class App(wx.App):
        def OnInit(self):
            self.frm = ActionFrame(None,title="Action Frame")
            self.frm.Show()
            self.SetTopWindow(self.frm)
            return True

    if __name__ == '__main__':
        app = App(False)
        app.MainLoop()

  8. Norbert Klamann says:

    Same problem occurs when I use ‘scatter’ instead of ‘plot’.

    This is the first part of the traceback:

    Traceback (most recent call last):
      File "C:\Python25\Lib\site-packages\matplotlib\backends\backend_wx.py", line 1081, in _onPaint
        self.draw(repaint=False)
      File "C:\Python25\Lib\site-packages\matplotlib\backends\backend_wxagg.py", line 61, in draw
        FigureCanvasAgg.draw(self)
      File "C:\Python25\Lib\site-packages\matplotlib\backends\backend_agg.py", line 358, in draw
        self.figure.draw(self.renderer)
      File "C:\Python25\Lib\site-packages\matplotlib\figure.py", line 624, in draw
        for a in self.axes: a.draw(renderer)
      File "C:\Python25\Lib\site-packages\matplotlib\axes.py", line 1345, in draw
        a.draw(renderer)
      File "C:projektelm2.worksrcviewyagutetframes.py", line 67, in draw
        rf = self.make_range_frame()
      File "C:projektelm2.worksrcviewyagutetframes.py", line 85, in make_range_frame
        colors=[self.color])
      File "C:\Python25\Lib\site-packages\matplotlib\collections.py", line 678, in __init__
        self._colors = _colors.colorConverter.to_rgba_list(colors)
      File "C:\Python25\Lib\site-packages\matplotlib\colors.py", line 327, in to_rgba_list
        c[i] = self.to_rgba(cc, alpha) # change in place
      File "C:\Python25\Lib\site-packages\matplotlib\colors.py", line 309, in to_rgba
        raise ValueError('to_rgba: Invalid rgba arg "%s"\n%s' % (str(arg), exc))
    ValueError: to_rgba: Invalid rgba arg "(100, 300)"
