Several people ask of the WPA graphing question, why not use a log scale? Commenter Stinky (no, I don’t know who s/he really is) kindly supplies a graph showing just this. For my money, it speaks for itself—which is to say, it screams, “don’t use me!”
We want to accomplish two things: (1) show how very outsized a chunk of money went to highways and (2) show also meaningful distinctions among lesser expenditures.
The log scale permits (2) while pretty much wiping out (1), unless you know how log scales work. I don’t think the likely consumers of such a graph do really know. But I’m wrong, Stinky says.
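To put rough numbers on that trade-off, here is a minimal sketch comparing bar lengths under the two scales. The two dollar figures are placeholders invented for illustration, not the WPA data, and starting the log axis at 1 is likewise an assumption.

```python
import math

# Hypothetical expenditures (NOT the actual WPA figures), in arbitrary units.
big, small = 1000.0, 10.0

# On a linear axis, bar length is proportional to the value itself.
linear_ratio = big / small

# On a log axis that starts at 1, bar length is proportional to log10(value).
log_ratio = math.log10(big) / math.log10(small)

print(f"linear: the big bar is {linear_ratio:.0f}x as long as the small one")
print(f"log:    the big bar is {log_ratio:.0f}x as long as the small one")
```

With these made-up values, a hundredfold difference in dollars shrinks to a threefold difference in bar length, which is the sense in which the log scale serves (2) at the expense of (1).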
20 comments
January 9, 2009 at 2:23 pm
ben
Do you really need, for internet purposes, to worry about the horizontal length of the graph? You can accomplish (1) and (2) by keeping a constant scale that shows meaningful distinctions among the lesser expenditures throughout; the graph will just be extremely long.
January 9, 2009 at 2:23 pm
Vance
Perhaps you could use the length of each bar to represent the desired salience of the datum it stands for, rather than a redundant re-encoding of the numerical value.
January 9, 2009 at 2:30 pm
ben
Then there would be different graphs of equal accuracy with completely different bar lengths, depending on whose salience-desires you’re taking into account.
January 9, 2009 at 2:31 pm
ben
The length of the graph, the pain of scrolling all the way to the right, under the Wolfson Scheme would, in fact, highlight in the user’s subjective experience just how outsize was the largest chunk.
January 9, 2009 at 3:19 pm
eric
ben, I’ve got other-than-Internet uses in mind. E.g., on-screen in classroom presentations, say.
January 9, 2009 at 4:08 pm
jazzbumpa
Since you’re trying to do two things, why not use two graphs, instead of force-fitting a 1-size-fits-all solution?
Graph 1, as originally presented, when the topic was first broached, showing the outsized highway expenditures. Graph 2, perhaps with that line eliminated, or just running off the chart, can then display the relationships among all the other stuff.
I think this fits well into the classroom use scenario.
January 9, 2009 at 4:14 pm
Barry
eric: “We want to accomplish two things: (1) show how very outsized a chunk of money went to highways and (2) show also meaningful distinctions among lesser expenditures.”
I strongly disagree. (1) is the major purpose. (2) is very secondary, and is only of interest to people going beyond the first cut.
January 9, 2009 at 7:11 pm
Prof Burgos
Both good graphs but for totally different applications. The logarithmic would be great in a scholarly article, but the one you received earlier, with the bins that increased by orders of magnitude, did a nicer job of capturing the differences.
Of course, if you really wanted to go crazy and this was for in-class projection on a screen, you could go the way they’ve been doing it in _Foreign Policy_ for the past few years, and have dollar bills scaled to varying sizes to represent the different amounts. (They’ve been doing a lot of this kind of caricature work lately: skyscrapers, loaves of bread, etc.) So students would see a gigantic dollar bill for highways, smaller ones as the amounts decrease. (Or pennies or dimes, etc.)
January 9, 2009 at 7:25 pm
stinky
“unless you know how log scales work. I don’t think the likely consumers of such a graph do really know.”
To accomplish (1) and (2) in the same page-sized diagram, you need to transform or elide the data. Using a log scale seems like the simplest transformation that isn’t subject to criticism as being dishonest or whatnot.
That said, the diagram above kind of sucks. The numbers on the edge of the bars are meaningless (they’re just relative quantities, should be dollars if present at all). And the log scale isn’t made explicit in the diagram itself, because the spreadsheet I used doesn’t do that. If you send me the data I’ll redo it with a proper linear-loggy grid (like this: http://www.data.scec.org/Module/Pics/s2a9logr.gif).
To the extent that students are familiar with linear-log plots, the logified axis is usually the y-axis (like in high school physics class). Using the y-axis kind of clears this up, but at the expense of requiring rotated category names or some kind of legend.
One thing I’ve tried (in a different context) is to use color to reinforce that the loggy axis isn’t linear. Transparency doesn’t capture this too well; saturation works better. I think the best I could do is a “temperature” fill where the bars fill from cool blue to red-hot using a log scale; I don’t know how to do this in the spreadsheet either.
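A rough sketch of what such a “temperature” mapping could look like outside a spreadsheet, assuming a plain blue-to-red blend in RGB. The function name `log_heat_color` and the choice of blend are illustrative assumptions, not anything from the spreadsheet.

```python
import math

def log_heat_color(value, vmin, vmax):
    """Return an (r, g, b) tuple in [0, 1]: blue at vmin, red at vmax.

    The blend position follows log10(value), so each factor of ten
    moves the color by the same amount. (Hypothetical helper.)
    """
    if not (0 < vmin < vmax) or value <= 0:
        raise ValueError("need 0 < vmin < vmax and value > 0")
    # Position along the log axis, 0.0 at vmin and 1.0 at vmax.
    t = (math.log10(value) - math.log10(vmin)) / (
        math.log10(vmax) - math.log10(vmin)
    )
    t = min(1.0, max(0.0, t))  # clamp values outside [vmin, vmax]
    return (t, 0.0, 1.0 - t)

print(log_heat_color(1.0, 1.0, 1000.0))     # coolest end of the scale
print(log_heat_color(1000.0, 1.0, 1000.0))  # hottest end of the scale
```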
One thing that might work with this data to generate a nice graph is to keep the horizontal log scale, but vary the height so that the area of each bar is proportional to the dollar value. You’d satisfy (1) as the areas would show outsized highway expenditures and (2) because of the “resolution” within each of the “bands”. I’m not sure how people intuitively compare areas, though.
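Roughly, the area idea works out as follows. This is only a sketch: the function name `area_bar`, the axis minimum of 1, and the two sample values are made up for illustration and are not the WPA figures.

```python
import math

def area_bar(value, axis_min=1.0):
    """Return (width, height) for a bar on a horizontal log10 axis.

    width  = the bar's length on the log axis, measured from axis_min
    height = chosen so that width * height equals value
    (Hypothetical helper; values must exceed axis_min to have width.)
    """
    if axis_min <= 0 or value <= axis_min:
        raise ValueError("need 0 < axis_min < value")
    width = math.log10(value / axis_min)
    height = value / width
    return width, height

# Made-up amounts in arbitrary units, NOT the actual expenditures.
for name, amount in [("highways", 1000.0), ("something smaller", 10.0)]:
    w, h = area_bar(amount)
    print(f"{name}: width={w:.2f}, height={h:.2f}, area={w * h:.1f}")
```

The widths keep the log scale’s resolution among small items, while the heights grow fast enough that the areas stay proportional to dollars.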
For web-based presentation, there are innumerable ways to reconcile (1) and (2), but on the printed page I think linear-log is the way to go to reconcile orders of magnitude within the same category without being subject to claims of obfuscation. My attempt, though, leaves a lot to be desired.
January 9, 2009 at 11:26 pm
Duncan Agnew
What does the length of the bar mean, on a log scale? Surely if you are going to use logs (and I agree that they obscure the point about highway spending) you need to use dots.
Having followed all the discussion here and at Crooked Timber, I think the most easily comprehended solution is two graphs, one being a blowup of the bottom of the other, with an outline box on the first graph and an arrow to make the connection. My version of this was more compact but perhaps too clever.
January 9, 2009 at 11:42 pm
Vance
I think you put your finger on the problem with the log scale, Duncan. A certain difference in length indicates a certain ratio in value — but can you tell, looking at the chart above, which pair of bars stands in a ratio of 2:1?
January 10, 2009 at 12:07 am
Vance
That is, which pair of values. And actually it’s not that hard. The gap between the vertical markings represents a ratio of 10, so 2 is a bit less than a third of the way.
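The arithmetic behind “a bit less than a third” can be checked in a couple of lines. This is a minimal sketch, and the function name `fraction_of_decade` is made up for illustration.

```python
import math

def fraction_of_decade(ratio):
    """Fraction of the gap between adjacent power-of-ten gridlines
    that corresponds to the given ratio on a log10 axis."""
    return math.log10(ratio)

print(round(fraction_of_decade(2), 3))  # 0.301, just under one third
```

So two bars whose ends sit about 0.3 of a gridline gap apart stand in a ratio of roughly 2:1.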
January 10, 2009 at 3:38 am
sharon
I’ve been thinking you should jettison the bars/dots/lines approach altogether and think about bubble charts of some kind. And I needed an excuse to try out this data visualisation site I found recently. So.
January 10, 2009 at 4:29 pm
sdh
Nix the log graph — I don’t think log scales work well with a general audience.
The bubble chart is eye-popping and intuitively obvious. Great idea.
January 10, 2009 at 5:39 pm
Cosma
I told you the log scale wouldn’t work. The bubble chart, on the other hand, is nice.
January 10, 2009 at 11:38 pm
rja
The bubble chart is very good–it’s easy to read and it looks good, too.
Also, with only slight modifications, it could simultaneously be used to test for red-green colorblindness.
January 11, 2009 at 3:51 pm
eric
“I told you the log scale wouldn’t work.”
I knew that, but there is a certain kind of social scientist who thinks you aren’t doing social science unless you’ve logged all your variables.
January 11, 2009 at 6:32 pm
Western Dave
My 10th graders would totally get the bubble chart. They wouldn’t have a prayer with the log graph.
January 12, 2009 at 6:33 am
David
Eric,
What do you think of this New Deal graph? Does it really summarize the New Deal debate?
http://macromarketmusings.blogspot.com/2009/01/great-depression-debate-in-one-picture.html
January 12, 2009 at 10:37 am
Nutshell. « The Edge of the American West
[…] to commenter, uh, David Beckworth, for pointing this […]