Presenting Data and Information Seminar
A one-day seminar by Edward Tufte
26 August 2004, Indianapolis, IN

This one-day seminar was fascinating, insightful, amusing, and enlightening. It covered the subject matter mostly at a higher level, providing concepts and guidelines more often than specific specialized solutions. The remainder of this page is essentially a reprint of the trip report I produced for my employer (a requirement, since my attendance was paid for from the corporate training budget).

Note

Since this report was written, Mr. Tufte has published a fourth book on information design entitled Beautiful Evidence. I have not yet had a chance to look at that book; however, I fully expect it to be as outstanding as the first three books highlighted below.

Business Week named Beautiful Evidence one of the Best Innovation and Design Books for 2006, calling it “a brilliant masterpiece, the Galileo of graphics has done it again.”

ZDNET called Beautiful Evidence one of the Best Business and Technology Books of 2006, saying, “Tufte will get you thinking about the meaning of words and images, not to mention your ability to tell the truth. A beautiful book.”



The Seminar and the Speaker

from the course flyer…

A one-day course given by Edward R. Tufte (pronounced TUFF-tee), author and publisher of three books on analytical design:

  • The Visual Display of Quantitative Information --- the classic book on statistical charts, graphs, and tables. “A visual Strunk and White” (The Boston Globe). One of the “Best 100 books of the 20th century” (amazon.com).
  • Envisioning Information --- maps of data and evidence. Design strategies for complex information. High resolution displays. Escaping flatland, mapping, narrative.
  • Visual Explanations --- depicting evidence relevant to cause and effect, decision making. Design of presentations. Interface design. Scientific visualization.

Topics covered in this one-day course include:

  • fundamental strategies of information design
  • color and information
  • statistical data: tables, graphics, and semi-graphics
  • business, scientific, research, and financial presentations
  • complexity and clarity
  • effective presentations: on paper and in person
  • use of video, overheads, computers, and handouts
  • multi-media, internet, and websites
  • courtroom exhibits
  • design of information displays in public spaces
  • animation and scientific visualizations
  • design of computer interfaces and manuals
Edward Tufte

Edward Tufte is Professor Emeritus at Yale University, where he taught courses in statistical evidence, information design, and interface design. He has written seven books, including Visual Explanations, Envisioning Information, The Visual Display of Quantitative Information, and Data Analysis for Politics and Policy. He wrote, designed, and self-published the three books on information design, which have won 40 awards for content and design. The New York Times described him as “The Leonardo da Vinci of data”.



Course Notes and Commentary

one person’s perspective of the course…

Introduction and Overview

Though some of the time seemed merely the rantings of an artist, Mr. Tufte did cover many excellent and relevant concepts. Overall, the seminar was strong on doctrine but somewhat weak on practical, day-to-day application. Historical presentations of data took up a large slice of the day, while only a smattering of new concepts for data presentation could be found. The theme may have been that, for the most part, the geniuses of the past (like Euclid, Newton, Galileo, Copernicus) knew how to get their data-laden points across using efficient and effective presentation techniques, while we today have settled for the inane, data-sparse eye candy of PowerPoint graphics.

The course could be broken out into five major sections: the grand principles of thoughtful (data presentation) design, the display of financial information, user interfaces (desktop and web), the cognitive style of PowerPoint, and effective presentations. As you will understand when I get to the PowerPoint section, Mr. Tufte did not use presentation software during his course. The handout he did provide (an 11 by 17 sheet folded in half) contains: an approximate schedule for the day, a list of potential topics for the day and the relevant sections within the three books (which were included in the course materials) for those topics, Mr. Tufte’s list of some 20th century classics of information architecture, a full page devoted to a keyphrase listing of the contents of www.edwardtufte.com, and suggestions for further reading.

He talked quickly, refused to allow questions until the very end of the day, and moved about the room constantly. His lack of “crutches” (his term for slides, notes, or outline) certainly bolstered the appearance of a strong command of the topics covered. Perhaps the lack of those “crutches” also forced the listener to frantically take notes (thus involving auditory, visual, and kinesthetic modes of learning) or risk losing the pearls so freely cast about amid the pomp and fluff.

A brief summary of Mr. Tufte's three books:

  • Visual Display of Quantitative Information (VDQI) discusses pictures of numbers;
  • Envisioning Information (EI) discusses pictures of nouns; and
  • Visual Explanations (VE) discusses pictures of verbs.

The following is a compendium of those frantically taken notes, images and discussion from the books, the Edward Tufte website, and various online commentators.

The Grand Principles of Thoughtful Data Presentation Design

There are two profound issues in data presentation. First, nearly everything interesting is multi-variate, yet it must be communicated via two dimensions. Second, the range of information resolution required --- bits of information per unit area or unit time --- has grown exponentially from approximately 11 orders of magnitude before Galileo’s time to more than 40 orders of magnitude today.

In light of these two issues, Mr. Tufte presented the following grand principles for the thoughtful design of data presentation. Where did these design principles come from? The purpose of analytical thinking is to provide insight. Good design is “clear thinking in action,” while bad design is simply “stupidity in action.” So these principles, which are indifferent to language, gender, culture, and history, encapsulate processes required for insight.

Show comparisons. For example, in the March to Moscow graphic, the size of the army through space and time is easily compared.

Show causality. This includes mechanisms, dynamics, and structure. Again in March to Moscow, the effect of temperature on attrition can be clearly drawn, and the catastrophe at the crossing of the Berezina River is quite obvious.

Show multi-variate data. Although the medium is limited to two dimensions, the information should not be. In March to Moscow, six dimensions are shown: size of army, geographic location (2 dimensions), direction of movement, time, and temperature.

Integrate word, number, and image. It is all evidence and must be integrated. The text should not be separated from the data and neither should be separated from the visuals. For example, legends distract the viewer and cause a level of indirection which limits comprehension; one should directly label the graph or chart rather than coding it and presenting a separate legend.

Document everything. This includes where the information came from, and scales of measurement used in the visual or dataset. This is essential for believability.

Deep knowledge and caring for the content. Preserve and promote the quality, relevance, and integrity of the content. A key rule: “do no harm” to the content with your presentation.

Adjacent in space rather than stacked in time. The human mind is better able to make comparisons over time when the data can be seen all at once, rather than stacked in time (i.e. one page after another).

[Figure: sunspot drawings] Small Multiples --- these sunspot images, as printed by a contemporary of Galileo, are effectively an animation on the printed page.

Use small multiples. One approach to satisfying the preceding principle is to use small multiples. A small multiple is a collection of charts or graphs, each with the same size and scale but showing the data at a different time (or along any other additional dimension). The figure above, from EI, p. 19, shows how Christopher Scheiner (a contemporary of Galileo) used small multiples when showing his sunspot data. This presentation method is easy on viewers, and demonstrates credibility to the reader. Another type of small multiple is the comparison chart (such as on EI p. 31).

[Figure: Sunspot Graph]

Put everything on the universal grid. Provide a scale of measurement and be consistent. For example, see NASA Image of Venus Deceptive. The human mind picks out variations better in graphs with mostly 45 degree slopes; graphs that are too flat or too spiky simply hide information (as shown in the picture to the right from VE p. 25 --- the bottom graph reveals more information, such as the asymmetry of each peak).
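
The 45-degree observation can be turned into a rough rule for choosing the shape of a plot. The sketch below is an illustration under stated assumptions, not something shown in the seminar: it picks the height-to-width ratio that puts the median absolute slope of the line segments at 45 degrees on the page, a heuristic generally attributed to William Cleveland and often called "banking to 45 degrees". The function name `banked_aspect_ratio` and the sample data are hypothetical.

```python
from statistics import median


def banked_aspect_ratio(xs, ys):
    """Return height/width so the median absolute segment slope plots at 45 degrees."""
    x_range = max(xs) - min(xs)
    y_range = max(ys) - min(ys)
    if x_range == 0 or y_range == 0:
        raise ValueError("data must vary in both x and y")

    # Absolute slopes of consecutive segments, in data units.
    # Vertical segments (no change in x) are skipped.
    points = list(zip(xs, ys))
    slopes = [
        abs((y1 - y0) / (x1 - x0))
        for (x0, y0), (x1, y1) in zip(points, points[1:])
        if x1 != x0
    ]
    typical_slope = median(slopes)
    if typical_slope == 0:
        raise ValueError("the typical segment is flat; no aspect ratio helps")

    # On the page: slope = data slope * (x_range / y_range) * (height / width).
    # Setting the median page slope to 1 and solving for height / width gives:
    return y_range / (x_range * typical_slope)


# Hypothetical spiky series: a zig-zag between 0 and 1 over four steps.
ratio = banked_aspect_ratio([0, 1, 2, 3, 4], [0, 1, 0, 1, 0])
print(f"plot height should be about {ratio:.2f} times the width")
```

A series with steep, frequent swings comes out wide and short, while a slowly drifting series comes out closer to square or tall; either way the typical segment ends up near the slope at which changes are easiest to judge.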

As discussed in VE chapter 3, you need to negate the rules of magic (which is simply disinformation) in order to better understand the rule of presentation (information). The magician rarely tells (fully) what they are about to do. The presenter of information, then, should always tell what they are about to do (or have done). Presentation is teaching.

Small but effective; clear but no more. Show the smallest effective difference, minimize contrast. In information presentation, 1 + 1 = 3; that is, two lines create three spaces. Those two lines have activated the negative space. See Tufte Clarifies the Ear. Another example is the org chart: the boxes do not add any information and should be removed or made as unobtrusive as possible.


This is one of many times during the day when the concepts of Data Ink and the Data Ink Ratio were discussed. Data Ink is defined as the non-erasable core of the graphic, and the Data Ink Ratio is the ratio of Data Ink to total ink used in printing the graphic. Thus 1.0 minus the data-ink ratio is the fraction of the total ink that can be removed without harming the information (removing it usually enhances or highlights the information). This brings us to Tufte's 5 Laws of Data Ink:

Above all else, show the data. This is the most important of the five maxims because the data ink is undefined until one has first developed a purpose for the graphic. Write a topic sentence for each graph before you begin to compose it; if you have no topic, then perhaps the graph is mere eye candy or filler fluff.

Maximize the data ink ratio. The remaining three laws provide a method for doing this.

Erase non-data ink. Remove or deemphasize whatever does not highlight the information of your topic sentence. Legends are distractions; label directly wherever possible. Your data should stand out: make non-data lines thinner or lighter, or erase them altogether if they do not provide additional context. Sometimes a simple table is all that is needed… erase the graphic completely!

Erase redundant data ink. Bar charts are typically overflowing with redundant data ink, with wide bars and internal shading or crosshatching. Simple thick lines, connected at the bottom to provide grouping, can be used just as effectively.

Revise and edit. Revision and editing are as important for scientific visualization as for writing. The intellectual content is not changed by editing, only the clarity. In an era of “Short Attention Span Theater,” a graph is pointless unless it is clear. Emphasizing the data is purely an artistic touch because the cognitive content of the graph is not altered. However, design touches do matter because scientists and engineers always have too many papers to read and too little time. A paper with clear, easy-to-decode graphs will make a much more lasting impression than one with confusing illustrations that require a lot of concentrated attention.
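
The data-ink ratio defined at the start of this discussion is simple arithmetic, and a small sketch makes the "erasable fraction" concrete. The function names and the pixel counts below are hypothetical; the report does not say how ink should be measured, so this assumes any consistent unit (for example, counts of non-background pixels).

```python
def data_ink_ratio(data_ink, total_ink):
    """Ratio of ink that carries data to all ink in the graphic (0.0 to 1.0)."""
    if total_ink <= 0:
        raise ValueError("total_ink must be positive")
    if not 0 <= data_ink <= total_ink:
        raise ValueError("data_ink must lie between 0 and total_ink")
    return data_ink / total_ink


def erasable_fraction(data_ink, total_ink):
    """Fraction of the ink that could be removed: 1.0 minus the data-ink ratio."""
    return 1.0 - data_ink_ratio(data_ink, total_ink)


# Hypothetical bar chart: 1200 inked pixels, of which 300 encode the data.
print(data_ink_ratio(300, 1200))     # share of ink that is data
print(erasable_fraction(300, 1200))  # share that grid lines, frames, etc. take up
```

In this made-up case three quarters of the ink is a candidate for erasing, which is the kind of number the third and fourth laws are meant to drive down.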


Another concept mentioned several times throughout the day and relevant here is that of data density. Mr. Tufte defines data density as the number of data values represented in a graphic divided by the area of the graphic. The goal, according to Mr. Tufte, is to maximize data density without sacrificing clarity or interfering with the cognitive task the graphic is meant to invoke. A corollary to this maximization of data density is that, according to Mr. Tufte, most graphics can be shrunk significantly. Bar charts and other Pravda graphics typically present fewer than 5 numbers per square inch (and Pravda graphics typically present them with distortion and copious distracting non-data ink). The best scientific journals, such as Nature and Science, average more than 20 numbers per square inch, with extremes approaching 1,000 numbers per square inch. Maps typically provide the greatest density; for example, a 27 square inch map of the boundaries of the 30,000 communes in France depicts nearly 9,000 numbers per square inch. The record is currently held by the recent map of galaxy density (as measured by the Hubble Space Telescope team) which depicts the number of galaxies in each of 2,275,328 pixels (thus 3 numbers per pixel) on a 61 square inch map, yielding a data density of 110,000 numbers per square inch.
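
The data density figures quoted above can be checked with a few lines of arithmetic. This is only a sketch: the function name `data_density` is hypothetical, and the inputs are the numbers as recorded in these notes rather than values taken from the books.

```python
def data_density(number_count, area_sq_in):
    """Numbers shown in a graphic divided by its area, in numbers per square inch."""
    if area_sq_in <= 0:
        raise ValueError("area must be positive")
    return number_count / area_sq_in


# Galaxy map: 2,275,328 pixels at 3 numbers per pixel on 61 square inches.
galaxy_density = data_density(3 * 2_275_328, 61)
print(f"galaxy map: {galaxy_density:,.0f} numbers per square inch")

# Commune map, worked backwards: about 9,000 numbers per square inch on
# 27 square inches implies this many numbers in total.
commune_numbers = 9_000 * 27
print(f"commune map: about {commune_numbers:,} numbers in total")
```

The galaxy map works out to roughly 111,900 numbers per square inch, consistent with the rounded 110,000 quoted above, and the commune map implies about 243,000 numbers, or roughly 8 per commune.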

You must watch over your content… no one else will. Your goal in presenting information is to invoke a content response, not a presentation response. And always keep in mind, “what is the cognitive task this display should facilitate?”

The Display of Financial (Time Varying) Information

Enable assessment of change. Show the data, not empty space. Show the context (see VDQI pp. 74-75). To assess change one needs to know the context over time and the context with similar information.

Enable assessment of average / variation. We must understand the story before we can decide whether to believe it. VDQI p. 30 shows an excellent presentation of annual weather data for a specific station. One can not only quickly see the overall trend of temperature throughout the year but also quickly pick out the daily variation and the extremes. This is an excellent example of providing both DETAIL and OVERVIEW. Graphical design should be simple; it is the content that should be rich and complex.

Provide standardization. When showing money vs. time over a multi-year span, you should somehow show (or remove) the effect of inflation. For example, a chart of the Dow Jones Industrial Average for any time period greater than five years would be more accurate if presented in terms of constant dollars (i.e., the value of a dollar in a chosen year).
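
Converting a money series to constant dollars is a one-line calculation once a price index is available. The sketch below assumes a hypothetical function name, `to_constant_dollars`, and uses made-up index and price values purely for illustration; they are not real CPI or Dow Jones figures.

```python
def to_constant_dollars(nominal, price_index, base_year):
    """Restate each year's nominal value in dollars of base_year.

    nominal:     {year: value in that year's dollars}
    price_index: {year: price level}, any consistent index such as a CPI
    """
    base_level = price_index[base_year]
    return {
        year: value * base_level / price_index[year]
        for year, value in nominal.items()
    }


# Made-up numbers: the nominal series doubles, but so does the price level.
nominal_values = {1990: 1000, 1995: 1500, 2000: 2000}
price_levels = {1990: 100, 1995: 150, 2000: 200}

for year, real in to_constant_dollars(nominal_values, price_levels, 1990).items():
    print(year, round(real, 2))
```

In this invented example the series that appears to double in nominal terms is flat in constant 1990 dollars, which is exactly the kind of distortion an unadjusted multi-year chart hides.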

Footnotes are essential. Never trust a financial display without footnotes.

Annotate. Nearly all financial data is descriptive; it doesn’t address causality. Annotation brings causality into the display.

Don’t worry about being original, just get it right. Follow established examples of data visualization. Sports tables and financial graphs are well understood by their frequent users. Talent imitates but genius steals.


Sparklines
[Figure: Galileo's Saturn]

At this point the discussion turned to the concept of sparklines, one of the few truly novel visualization concepts presented. A sparkline is an inline graphic — Mr. Tufte describes them as “intense, word-sized graphics.” Well, “truly novel” may be inaccurate since the first observed use of the sparkline concept was by Galileo in his 1613 report entitled, Istoria e dimostrazioni intorno alle macchie solari. Galileo presents Saturn as a visual noun, along with a comparison of clear and unclear views, as shown in the scan of the original print (right). A full extract of the chapter on sparklines in Mr. Tufte’s forthcoming book, Beautiful Evidence, is available in the “Ask E.T. forum” on edwardtufte.com.

[Figure: Sparkline Axes]

The sparkline concept can also be used in two-dimensional graphic visualizations. This example (see left) shows how the sparkline concept can be used to add information to a scatterplot, by enabling visualization of the distribution of the data along each of the axes. Sparklines can be used in tables and in complicated visuals.

Unfortunately, the current-day computer approach typically segregates word from graphic from number, since any given program or application usually handles only one of those well, contradicting the spirit of the sparkline. The sparkline exists to remove the distinctions between words, numbers, and graphics. As Mr. Tufte concludes, “It is all evidence, after all.”

Sparklines have caught on quite rapidly. Yahoo! has been seen using them (though they are encased in a distracting dominating frame), and there is a SourceForge project to create sparklines for use in web pages (sparkline.sourceforge.net, the Sparkline PHP Graphing Library).
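
To give a feel for how little is needed to put a word-sized graphic inline with text, here is a minimal sketch that approximates a sparkline with Unicode block characters. It is an assumption-heavy stand-in: it is not the Sparkline PHP Graphing Library mentioned above, it has only eight vertical levels, and the function name `sparkline` and the sample values are hypothetical.

```python
# Eight block characters of increasing height (Unicode "Block Elements").
BARS = "▁▂▃▄▅▆▇█"


def sparkline(values):
    """Return a one-line, word-sized bar graph of the given numbers."""
    values = list(values)
    if not values:
        return ""
    lowest, highest = min(values), max(values)
    span = highest - lowest
    if span == 0:
        # A constant series has no variation to show.
        return BARS[0] * len(values)
    top = len(BARS) - 1
    # Scale each value onto 0..top and round to the nearest level.
    return "".join(
        BARS[int((v - lowest) / span * top + 0.5)] for v in values
    )


# Made-up daily readings placed inline with the sentence that discusses them.
readings = [3, 5, 4, 8, 9, 6, 2]
print(f"Readings this week {sparkline(readings)} peaked on day 5.")
```

Because the result is an ordinary string, it can sit inside a sentence or a table cell, which is the integration of word, number, and image that the sparkline is meant to achieve.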

User Interface Designs on the Desktop and the Web

The original Graphical User Interface, developed by Xerox, simply depicted printer(s), a trash can, and documents. There was no concept of “application” or “operating system”. There was no “market experience”. It concentrated on content and user activity, as any well-designed user interface should. Today’s GUIs (Windows, Mac OS X, etc.) force the attention on applications and distract the user with marketing. User interface design today represents bureaucratic hierarchy, not content and activity.

Good user interface design emphasizes content, and the interface design should drive the application design (not vice versa, as most programmers would prefer). Software engineering is one of only two industries that describe their customers as “users”. Both industries seem to focus on what is best for the provider, with little regard for the effect on the consumer.

For example, consider Mr. Tufte’s user interface for the kiosks at the National Gallery in Washington, D.C. (discussed in VE pp. 146-148). Only the top 10% of the main interface is devoted to “computer administration” --- a box indicating that this is a touch-sensitive screen and a selection of languages for the interface. The remaining 90% is devoted to listing (text-only) the permanent facilities of the gallery and (graphics + text) the current special exhibitions. Once an item is selected, the resulting display again devotes only the top 10% to computer administration --- displaying the currently selected item and two possible user activities. The remainder of the display shows a live video image of the visitor at the kiosk within the context of the main lobby, a three-dimensional guide map with a marked route showing the visitor how to reach the facility they selected, along with textual step-by-step directions. One of the two actions is to print out this display, so the visitor can have a keepsake (their picture in the National Gallery lobby) and directions they can take with them to navigate the many buildings comprising the National Gallery.

The Cognitive Style of PowerPoint

Included in the course handouts was a 28 page (8½ by 11) essay entitled The Cognitive Style of PowerPoint. That essay begins with this story of Louis Gerstner’s first days as president of IBM:

One of the first meetings I asked for was a briefing on the state of the [mainframe computer] business… At that time, the standard format of any important IBM meeting was a presentation using overhead projectors and graphics that IBMers called “foils” [projected transparencies]. Nick [Donofrio, then running the System/390 business] was on his second foil when I stepped to the table and, as politely as I could in front of his team, switched off the projector. After a long moment of awkward silence, I simply said, “Let’s just talk about your business.”

Gerstner’s simple statement indicated a desire for an exchange of information, an interplay between speaker and audience. PowerPoint is entirely presenter-oriented, not content-oriented or audience-oriented. After all, even the PowerPoint marketing is directed solely at presenters: “A cure for the presentation jitters.” “Get yourself organized.” “Use the AutoContent Wizard to figure out what you want to say.” Slideware helps speakers outline their talks, retrieve and show diverse visual materials, and communicate slides in talks, printed materials, and the internet. It also usually replaces serious analysis with chartjunk, over-produced layouts, cheerleader logotypes and branding, and corny clip art. In other words, “PowerPoint Phluff”!

The cognitive style of the standard PowerPoint presentation is characterized by: foreshortening of evidence and thought, low spatial resolution, a deeply hierarchical single-path structure as the model for organizing every type of content, breaking up narrative and data into slides and minimal fragments, rapid temporal sequencing of thin information rather than focused spatial analysis, conspicuous decoration and Phluff, a preoccupation with format not content, and an attitude of commercialism that turns everything into a sales pitch.

The statistical graphics generated by the PowerPoint ready-made templates are thin, nearly content-free. In 28 PowerPoint instructional books Mr. Tufte surveyed, the 217 graphics shown depicted an average of 12 numbers each. That is extremely low data density. Mr. Tufte surveyed various worldwide publications (most issued in 2003) and found that data density ranged from more than 1000 numbers per graphic (for Science) to typically 25-50 numbers per graphic for major news magazines (e.g. Time, Le Monde, Financial Times). The only publication Mr. Tufte surveyed which had a lower density than the PowerPoint graphics was Pravda from 1982, when that newspaper was the chief propaganda instrument of the Soviet communist party. As Mr. Tufte says, “Doing a bit better than Pravda is not good enough.”

Bullet lists may create the appearance of hard-headed organized thought, but in reality, as determined by a study reported in the Harvard Business Review, they represent “generic, superficial, simplistic thinking.” To be borderline sardonic… here are the three primary conclusions of that report:

  • Bullet lists are typically too generic.
  • Bullets leave critical relationships unspecified.
  • Bullets leave critical assumptions unstated.

A major case study related in Mr. Tufte’s essay concerns the Columbia space shuttle flight which culminated in disaster on 1 February 2003. During the spaceflight in January, Boeing Corporation engineers prepared three quick reports assessing possible damage to the left wing during liftoff (due to chunks of insulating foam breaking away from the fuel tank), and the potential consequences of that damage. The Columbia Accident Investigation Board (CAIB) found that the reports provided an over-optimistic assessment of the danger facing the Columbia on reentry. All three of the Boeing reports suffered from typical PowerPoint problems: elaborate bullet outlines, segregation of words and numbers (12 of 14 slides with quantitative data have no accompanying analysis), atrocious typography, data imprisoned in tables by thick nets of spreadsheet grids, and only 10 to 20 short lines of text per slide.

In these reports, every text-slide uses 4 to 6 levels of bullet hierarchy, each slide starting the hierarchy anew. This rigid hierarchy, indifferent to content, sliced and diced the evidence into arbitrary compartments, resulting in “an anti-narrative” with choppy continuity. To compound matters, this hierarchy of information was filtered as it rose through the hierarchy of NASA. The CAIB observed:

The Mission Management Team Chair’s position in the hierarchy governed what information she would or would not receive. Information was lost as it traveled up the hierarchy. A demoralized Debris Assessment Team did not include a slide about the need for better imagery in their presentation to the Mission Evaluation Room. Their presentation included the Crater analysis, which they reported as incomplete and uncertain. However, the Mission Evaluation Room manager perceived the Boeing analysis as rigorous and quantitative. The choice of headings, arrangement of information, and size of bullets on the key chart served to highlight what management already believed. The uncertainties and assumptions that signaled danger dropped out of the information chain when the Mission Evaluation Room manager condensed the Debris Assessment Team’s formal presentation to an informal verbal brief at the Mission Management Team meeting.

At the same time these three Boeing reports were being analyzed, low-level NASA engineers were discussing the possible danger to the Columbia in several hundred emails. Of those emails, 90% used paragraphs and sentences; only 10% used bullet lists, and then only with 2 or 3 levels. Engineers were able to reason about the issues without employing the multi-level hierarchical outlines of the original PowerPoint pitches.

15 years earlier, as a member of the commission that investigated the Challenger shuttle accident in 1986, Dr. Richard Feynman had this to say about the bullet-outline format style of NASA:

Then we learned about “bullets” --- little black circles in front of phrases that were supposed to summarize things. There was one after another of these little goddamn bullets in our briefing books and on slides.

The CAIB finally noted:

As information gets passed up an organization hierarchy, from people who do analysis to mid-level managers to high-level leadership, key explanations and supporting information is filtered out. In this context, it is easy to understand how a senior manager might read this PowerPoint slide and not realize that it addresses a life-threatening situation.

At many points during its investigation, the Board was surprised to receive similar presentation slides from NASA officials in place of technical reports. The Board views the endemic use of PowerPoint briefing slides instead of technical papers as an illustration of the problematic methods of technical communications at NASA.

Yet, in a survey of PowerPoint slides, Mr. Tufte found that the Boeing reports were well above average in information density. The Boeing reports averaged 97 words per slide; typical PowerPoint presentations posted on the internet and top-ranked by Google (March 2003) averaged 40 words per slide (on 1460 text-only slides); and the 654 slides shown in the 28 PowerPoint textbooks averaged 15 words per slide.

The cognitive style of PowerPoint is the style of the software corporation that develops and markets it. Microsoft is a big bureaucracy engaged in computer programming (deeply hierarchical, nested, highly structured, relentlessly sequential, one-short-line-at-a-time) and in marketing (fast pace, misdirection, advocacy not analysis, slogan thinking, branding, exaggerated claims, marketplace ethics).

The solution? Use PowerPoint as a projector for showing low-resolution color images, graphics, and videos that cannot be reproduced as printed handouts at a presentation. The average person can speak 100-200 words per minute, yet that average person can read 400-500 words per minute. A single 11 by 17 sheet of paper printed on both sides and folded in half (to make a 4 page 8½ by 11 brochure) can contain the content of 50 to 150 PowerPoint slides.

Guidelines for Effective Presentations

Finally, in a fast-paced delivery as time slipped away, Mr. Tufte provided these guidelines for effective presentations:

  • Show up early
  • Problem / Relevance / Solution should be covered at the beginning --- never apologize; see how long you can go without using the 1st person
  • Particular / General / Particular emphasis when presenting a complex chart (or concept) --- first point out a detail, then discuss the big picture, and end by pointing out a different detail
  • Have handouts --- the Wall Street Journal is the highest resolution newspaper in the world, and the 2nd best selling… people don’t become stupid just by coming to your presentation, they can read
  • (I missed this point)
  • (I missed this point)
  • Audiences deserve our respect --- again, do not underestimate their intelligence
  • Humor must be relevant --- humor does make your presentation memorable, but be careful
  • Avoid gender-specific pronouns --- use 3rd person plural: rather than "the user, he", use "the user, they"
  • Get out from behind the podium
  • Be believable
  • Be convinced of your presentation
  • Finish early
  • Before the meeting: PRACTICE, PRACTICE, PRACTICE! --- preferably in front of a video camera, review your presentation and be aware of any distracting habits
  • Before the meeting, ensure the quality, relevance, and merit of your content

Conclusion

Congratulations if you made it here! This is a lot to wade through, and yet I left out many examples and much detail. There was so much more presented, with such interesting visual aids, that I can’t possibly reproduce it all here. For example, Mr. Tufte owns several antique books, including early editions of Euclid, Galileo, and Newton, which he brought to the presentation and had his assistant carry around the room (each one as the concept it illustrated was being discussed).

Overall, in my opinion (despite my early comments above that may have led you to think otherwise), this course was time and money well spent.

If there were any hope of pulling corporate America out of the depths of PowerPoint and bulletized thinking, and I had the authority to do it, I would make The Cognitive Style of PowerPoint required reading for everyone, especially anyone to whom others report.

Images on this page were scanned from Edward Tufte’s books or obtained from his website. These images may be copyrighted by Edward Tufte or their original creators. It is believed inclusion on this page falls under the Fair Use clause; but do not use them in your own works without checking with Edward Tufte.


loaded 2017-12-15 16:57:02 • last modified 2013-10-05 03:22:08
• all content (unless otherwise noted) © 2017 A W Colley