
Here’s an odd discovery. If you search Google’s geolocator for “New York”, it will give you the following coordinates: Latitude: 40.7143528, Longitude: -74.0059731. If you search for “Boston”, it responds with: Latitude: 42.3584308, Longitude: -71.0597732. Chicago: Latitude: 41.8781136, Longitude: -87.6297982. These locations look a little something like this.

Not bad, right? I mean, Google pretty much nailed it. Those are indeed New York, Boston and Chicago.

But let’s say you want to know the street address of these locations. I’ll admit it, I’m curious enough to wonder about these types of things. For years now, we’ve known that, for whatever reason, Coffeyville, Kansas is located in the exact center of the United States on Google Maps. Just load Google Maps and zoom in: Coffeyville. Let’s take this a step further. Say you want to verify that the coordinates Google has just given you are indeed located within the city you have specified. Say you want to know the places Google considers to be the exact center of New York, Boston & Chicago. Google allows you to do this via the “reverse” option in their Geolocation API.

Before I go any further, I’ll give a little background to provide some context (why would I even have stumbled upon this, anyway?). About a year ago, I was mid-semester in a cartography seminar here at UW-Madison. Our goal: create a series of animated maps showing global Twitter trends. It was—in some ways—a shameless attempt to go viral via two extremely hot forms of media. I mean, let’s face it, in the realm of place-based data derived from social media, what’s hotter than Twitter? And in cartography—regardless of data quality or clarity of message—what gets more attention than animated maps? Maps we tend to see most tweeted are things like Alexander Chen’s Conductor: MTA.ME and the OpenStreetMap 2008: A Year of Edits.

Our seminar was a mild success, creating three animations that more or less went unnoticed on the blogosphere (continue not noticing them here). Ironically, the only thing to go viral that had anything to do with our seminar was a static map. Daniel Huffman, with the aid of Jeremy White’s Twitter Hitter application (created for the seminar), received a bit of press for his map which was used as cover art for Cartographic Perspectives.

But I digress. The point of this post is to bring attention to something bizarre I just noticed about the Google Maps Geolocation API. One of my tasks during this seminar was to investigate the feasibility of geolocating non-geolocated tweets. Gosh, that sounds like gibberish. What I mean is this: some folks have fancy phones which attach coordinates to their tweets; others do not. The vast majority of tweets (nowadays, anyway) do not have coordinates attached to them. They do, however, have user-specified place names (Oella, Maryland, for example). So, my task was to see if it was possible to geolocate (get coordinates for) tweets that did not have user-specified coordinates. To be frank, the whole thing was a debacle. My findings: geolocating tweets via a non-coordinate-based system was not possible (or at least not advisable). Feel free to read about it here and here.

Nevertheless, some decently useful tools were born of my geolocation frustrations. Specifically, I wrote a handy little Python script that did the following:

  1. Read a spreadsheet for place names
  2. Sent place names to Google geolocator
  3. Recorded coordinates from Google geolocator back to original spreadsheet
  4. For error-checking, the Google coordinates were sent back to Google’s reverse geolocator
  5. New place names from the reverse geolocator were added to spreadsheet to check against original place names (and to catch ambiguous place names, like “Portland” or “Springfield”).
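For the curious, here is a minimal sketch of that round trip. It is not the original script. It assumes the spreadsheet is a CSV with a `place` column (a hypothetical layout), and it builds URLs for Google's Geocoding web service as it exists today (the `address`, `latlng` and `key` parameters), which may differ from what was available when the script was written. The lookups themselves are passed in as functions, and the stand-in reverse results are placeholders, so the sketch runs without a network connection or an API key.

```python
import csv
import io
from urllib.parse import urlencode

# Endpoint of Google's Geocoding web service as it exists today; the
# original script may well have called something else.
GEOCODE_URL = "https://maps.googleapis.com/maps/api/geocode/json"


def forward_url(place, key):
    """URL asking for the coordinates of a place name."""
    return GEOCODE_URL + "?" + urlencode({"address": place, "key": key})


def reverse_url(lat, lng, key):
    """URL asking what sits at a pair of coordinates."""
    return GEOCODE_URL + "?" + urlencode({"latlng": f"{lat},{lng}", "key": key})


def round_trip(csv_text, geocode, reverse_geocode):
    """Steps 1-5: read place names, geocode, reverse geocode, compare.

    geocode(place) -> (lat, lng) and reverse_geocode(lat, lng) -> str are
    passed in so the sketch can run offline with stand-in lookups.
    """
    rows = []
    for record in csv.DictReader(io.StringIO(csv_text)):
        place = record["place"]  # "place" column name is an assumption
        lat, lng = geocode(place)
        found = reverse_geocode(lat, lng)
        rows.append({
            "place": place,
            "lat": lat,
            "lng": lng,
            "reverse": found,
            # Crude check: does the original name appear in the answer?
            "match": place.lower() in found.lower(),
        })
    return rows


# Stand-in lookups using the coordinates quoted at the top of the post.
# The reverse results are placeholders invented for illustration only.
FAKE_COORDS = {
    "New York": (40.7143528, -74.0059731),
    "Boston": (42.3584308, -71.0597732),
}
FAKE_REVERSE = {
    (40.7143528, -74.0059731): "PLACEHOLDER STREET, New York, NY, USA",
    (42.3584308, -71.0597732): "PLACEHOLDER BUSINESS, Elsewhere, USA",
}

rows = round_trip(
    "place\nNew York\nBoston\n",
    geocode=FAKE_COORDS.__getitem__,
    reverse_geocode=lambda lat, lng: FAKE_REVERSE[(lat, lng)],
)
for row in rows:
    print(row["place"], row["lat"], row["lng"], row["match"])
```

Swapping the stand-ins for functions that fetch `forward_url(...)` and `reverse_url(...)` and parse the JSON would turn this into a live check. The substring test is deliberately crude: it flags any row whose reverse result no longer mentions the place you asked for.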

A year ago, when I wrote this script, the search terms “New York”, “Boston” and “Chicago” would yield the same coordinates as above. But, when I dug out this script yesterday, blew the dust off of it and took it for a spin, I noticed that something fundamental had changed about the reverse geolocator. Last year, it would yield a little something like this:

But now, these search terms yield:

What!? So… the reverse geolocator does not give you an address any more? It gives you a business? Is the reverse geolocator telling us, “If you are at Lat 40.714 and Long -74.005 and you have a tummy ache, give Dr. Suneeta a call!”? Hmm… I wonder if the parameters are all wonky in my script. I mean, surely not all places are centered around a business, right? Think of how quaint and cute it is that little Coffeyville, Kansas is the center of the United States. Uh-oh, hold the presses…

D’oh. That’s a business… and it’s not even in Kansas!

Pictures are great. People understand them. When someone sees a picture of a house, they are likely to recognize it as just that: a house. Similarly, when someone sees a picture of a small house next to a large house, they will immediately determine some kind of proportional relationship between the two houses. This is one reason why we see pictorial charts in books, magazines and newspapers (oh, right, and on the internet too). Robert Harris, in Information Graphics, points out a number of other benefits of pictorial charts:

  • To make the document more interesting and appealing.
  • To make the material more understandable to a greater number of people [they are not language-dependent].
  • To improve communication in situations where the appearance of an item is better known than the name.
  • To facilitate easier reading of a chart or graph by including information to orient the reader that otherwise might have been shown in a legend or note.

I can get behind these points (though, it should be said that if a chart is not needed, it probably should not be added simply to “make the document more interesting and appealing”). Pictures may be understood faster than bar charts, where the reader must trace the horizontal and vertical axes to their respective labels in order to determine what (and how much) is being represented.

That doesn’t mean we should always use pictures though. In fact, we should be careful with them. A picture can be too specific or too general, inadvertently leaving out some part of the dataset in the mind of the reader. (Who is this guy, anyway? I don’t look like that guy!). Pictures and icons can also be tricky because of different accepted traditions and conventions. A good rule of thumb might be: if you aren’t sure the picture clearly represents your data set, don’t use it.

One particularly popular—yet often misleading—form of pictorial chart is the proportional chart or diagram. Here are some examples… and thoughts on how we could do better.

Oh, my. What do we have here? From Arkin’s Graphs: How to Make and Use Them (1936), these are… well, oranges. The orange on the left represents “44,319 carloads of oranges”; the orange on the right represents “84,944 carloads of oranges”. I wonder if the US Department of Commerce retained a single car for 10 years explicitly to facilitate this orangey comparison. That seems likely enough.

But in all seriousness, this is a pictorial proportional diagram that isn’t entirely necessary (except as an example of what not to do, as the author has used it). Generally, readers cannot estimate volume based on a diagram nearly as well as they can other visual variables. What about you? When you first saw this, did you look at it and think, “Golly, they sold 1.9 times more carloads of oranges in 1931!”? Or did you simply think, “Hmm… more oranges 10 years later.”

These ladies make the orange chart look like a work of information visualization genius. Arkin stands behind this one, suggesting that it is a stronger graphic since the volume of the figures is used to make the comparison and the pictures are representative of the data they depict (that is, the appearance of the women varies based on age bracket). Sexism of the trans-generational shopping bag aside, charts of this type have another problem to contend with. If volume of a simple shape is difficult to ascertain, deducing it from a complex or irregular shape is nearly impossible. Schmid (Handbook of Graphic Presentation guy) lists a number of common shapes which are particularly difficult to read in this context: humans (d’oh!), ships, automobiles, houses (whoops!) and domestic animals.

Our old buddy Brinton weighs in on this topic too. He explains the problem with a specific method: “Charts of this kind with men represented in different sizes are usually so drawn that the data are represented by the height of the man. Such charts are misleading because the area of the pictured man increases more rapidly than his height. Considering the years 1696-1700, the pictured minister has about two and one half times the height of the man representing public service. The minister looks over important because he has an area of more than six times that of the man drawn to represent public service. This kind of graphic work has little real value.” Right on, Brinton.
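Brinton's arithmetic is easy to check, and the same reasoning covers the oranges. Here is a small sketch, assuming the picture is scaled uniformly (width and depth grow along with height). The function names are my own invention for illustration.

```python
def perceived_ratio(height_ratio, dimensions):
    """How much bigger a picture looks when only its height tracks the data.

    dimensions=2 for a flat figure (area), 3 for a solid (volume).
    """
    return height_ratio ** dimensions


def linear_scale(ratio, dimensions):
    """Factor to stretch the height so area or volume grows by `ratio`."""
    return ratio ** (1 / dimensions)


# Brinton's minister: about 2.5 times the height of the other figure.
minister_area = perceived_ratio(2.5, 2)
print(minister_area)  # 6.25, i.e. "more than six times"

# Arkin's oranges: 84,944 carloads against 44,319.
orange_ratio = 84944 / 44319
orange_diameter = linear_scale(orange_ratio, 3)
print(round(orange_ratio, 2), round(orange_diameter, 2))  # 1.92 1.24
```

In other words, an orange that honestly encodes 1.9 times the volume is only about a quarter wider than its neighbor, which goes some way toward explaining why nobody reads the ratio off the picture.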

A recurring theme in all of the pieces I’ve read on these charts is that: a. they are really easy to read, but b. they are often accidentally misleading, so c. the shapes used should be as uncomplicated as possible. It makes me think… you know what shape is really uncomplicated? A bar… you know, as in “bar chart”. Ha. Nothing is perfect.

That said, some intriguing alternatives to these somewhat awkward proportional charts have gained popularity of late. Some of you may remember one viral graphic from 2009 showing what $1 trillion looks like. While I’m sure there are arguments against using 2.5D space in this graphic (oblique pie charts are the worst, right!?), I actually find that it emphasizes the point. It gives the reader the ability to say, “Gosh! That money is as far as the eye can see!” Perhaps more to the point, if the author attempted creating this graphic in 2D, there would be some major readability issues due to scale. Imagine all of those pallets stacked vertically in one row. Yikes.

After all, the pictorial chart is not unlike most charts, graphs and maps. Compromises need to be made. As long as those compromises are managed well and the readers are aware of them, they are generally quite effective.

ps. I do enjoy this planet and star size comparison video.

pps. Odd timing: immediately after posting this, I saw that information aesthetics had a moderately related post today entitled Social Compare: Visually Compare the Size of Objects and Concepts. Check it out.

ppps. Bonus chart, from Manual of Charting (1924). Hogs, Rubber and Houses. Veritable peas in a roaring 20s pod.

A map of the Lower 48 states à la Cy Twombly.

On a card.

A map of the Lower 48 states à la Jackson Pollock.

On a card or poster.

In addition to the clock graph, another form of graphic that seems to have been quite popular in the early 1900s is the Ranking (or Rating) Chart. Here, I have plucked two examples. The first is from Calvin Schmid’s Handbook of Graphic Presentation (1954) and the second from our old friend Brinton’s Graphic Methods for Presenting Facts (1914). In a ranking chart, items or categories are placed in order (generally in a vertical fashion) based on frequency or magnitude. In the case of our first chart, marital grievances are ranked in descending order according to intensity for both husbands and wives. To see if there is any relation, a line is drawn that connects identical grievances from each list. The further the line deviates from horizontal, the less agreement the two parties have on that particular grievance. (Perhaps they should have added another line: “way partner ranked grievances”.)
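The "deviation from horizontal" is really just the difference between an item's position on the two lists. Here is a rough sketch of that idea, using placeholder items (A through D) rather than the grievances on the actual chart; `rank_gaps` is a name made up for this example.

```python
def rank_gaps(left, right):
    """For each item on both lists, return (item, left rank, right rank, gap).

    Ranks start at 1. A gap of 0 would be drawn as a horizontal line;
    the larger the gap, the steeper the connecting line.
    """
    right_rank = {item: i for i, item in enumerate(right, start=1)}
    gaps = []
    for i, item in enumerate(left, start=1):
        if item in right_rank:
            gaps.append((item, i, right_rank[item], abs(i - right_rank[item])))
    return gaps


# Placeholder lists, not data from Schmid's chart.
husbands = ["A", "B", "C", "D"]
wives = ["B", "D", "C", "A"]

gaps = rank_gaps(husbands, wives)
for item, h, w, gap in gaps:
    print(f"{item}: husbands #{h}, wives #{w}, gap {gap}")
```

Sorting the result by gap would pull the biggest disagreements to the top, which is the thing the eye is hunting for among all those slanted lines.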

I quite like this graphic… at least partly because it appears as if not a whole lot has changed in terms of Marital Happiness since 1938. We still hear these types of complaints about our friends’ partners (no, I’m not talking about anyone in particular. gosh!). Modern-day additions might be “uses smart phone at dinner table” or “hasn’t changed relationship status on Facebook”. Otherwise, these grievances seem, as David Byrne might say, “same as they ever were”. One question here though: what has the author encoded in the line type that is connecting the two lists? Level of agreement? It’s tough to say.

Here is a time series ranking chart showing state and territory population in the US from 1860 to 1900. Some trends can be seen in this chart: steady high population in NY, PA, OH and IL, while some Midwestern and Western states have growing population. I have to wonder, though: is this data better suited for a set of maps in small multiples? I find this web of tangled lines fairly difficult to read. Nevertheless, unlike some forms of charts and graphs, I believe an effective ranking chart may not necessarily have an immediate impact on the reader. Instead, further inspection may be required. With a simple glance at the Marital Happiness chart, for example, a reader may not be able to deduce much of anything. But with a closer look, he or she might start to notice that lines appear to be straighter at the top and bottom of the chart, with the middle looking more criss-crossy. Without knowing the exact grievances, the reader is able to deduce that at least husbands and wives can more or less agree on the BIG and small stuff… right? Right?

Bogus Hans Hofmann Map

A map of the Lower 48 states à la Hans Hofmann.

Bogus Eva Hesse Map

A map of the Lower 48 states à la Eva Hesse.
