From the course: Data Visualization: A Lesson and Listen Series

Listen: Alberto Cairo

(contemporary music)
- Alright, so it's time for the listen part of our episode once again today. I'm very excited to have Alberto Cairo with us today. Alberto is the Knight Chair in Visual Journalism at the School of Communication at the University of Miami. He also teaches courses on infographics and data visualization, and he's the director of the Visualization program at UM Center for Computational Science. Alberto does consulting with clients. He's written two really great books about data visualization and he's actually working on a third one right now, which relates to today's theme. He's a key figure in today's data visualization community. I'm very excited to have him here today. Alberto, welcome and thank you for joining me.
- Thank you so much for having me.
- So as I told you before we talked today, our theme for today's episode is truth in data visualization. And I chose you, I think, for pretty obvious reasons: you were going on a really interesting speaking tour over the past year, I think you're still doing it now, right, titled Visual Trumpery. So if you could tell us a little bit about that speaking tour and what it was about and what sparked you going in that direction?
- So the Visual Trumpery talk could be subtitled how charts lie, or how data visualization lies, or even better, how we lie to ourselves using charts.
And it basically comes from an interest that I have had for a long time. Particularly after I wrote my second book, The Truthful Art, I started getting interested not only in how to design better data visualizations, but in how people, and when I say people I mean non-specialists, interpret or misinterpret the data visualizations that we create. And we have tons of sources pointing out that we all misinterpret visualizations on a regular basis, sometimes because we don't treat them with enough attention, sometimes because we project what we want to believe onto the data visualizations that we see, just because we love to have our own beliefs and ideological biases confirmed by the information that we receive, sometimes because visualizations are not well designed, and so on and so forth. So I decided that it could be a great idea to put together a talk pointing out or listing systematically the many ways in which we can be misled by different kinds of graphics.
But you know, I was looking for a title, and the original title of the talk was going to be Graphicacy, which is visual literacy or graphical literacy. But then I realized afterwards that graphicacy may sound a little too academic or a little bit too boring, and someone on my Twitter stream, right after the 2016 presidential election, tweeted the meaning of the English word trumpery. A trumpery is something that deceives or something that lies, particularly something that lies visually; it's like a visual display that lies for some reason. So I thought that this was the perfect title for the talk because it would make the talk highly controversial, and it would attract bigger audiences, who would be, by the way, misled by the title of the talk, because the talk is highly political but it is not partisan. So I teach how graphics lie, and the examples that I use come from sources from all over the ideological spectrum. So it's basically a talk about how charts lie, just in summary.
- I went to the talk in Boston, it was really a great talk, and I love that you're trying to be controversial, you knew it would stoke a little bit of controversy, and yet it was very fair and even-handed; it was clearly examples from all sides of various debates. So the second book, The Truthful Art, led to the talk, and now you're also doing a book that's on a similar subject. Tell us a little bit more about the book and how it differs maybe from the talk.
- Yes, I actually finished writing the first draft of this new book a couple of days ago. So now I need to get into all the editing process et cetera, and it will take a few months to get it out; it will be published probably around 2019. Basically it's going to be my first book for the general public. Both my previous books, The Functional Art and The Truthful Art, I wrote mainly for journalists, graphic designers, scientists, statisticians, et cetera, but the new book is for the general public.
It's the first trade book basically for the general public, it's going to be a hardcover book and so on and so forth. And the talk itself is like the trailer for the book. So the talk shows you the structure that the book follows. Obviously in the book I have tons more examples than the ones that I can present in one hour, so it's much more extensive, but the principles that I write about in the book are similar to the ones that I outline in the talk.
- Yeah. So let's talk about some of those principles. This is a lesson and listen series 'cause I always give a lesson and then we talk to somebody like you on that same theme. And when I was giving my lesson on this topic I covered a few different things. I talked about the idea that when you're doing data visualization, and if you want to be truthful, which you should, you have to sort of think of yourself almost as a data fiduciary. I talked about things like scale, I talked about things like correlation doesn't equal causation, et cetera. What are some of the most important areas of focus that you think people should think about in the interest of being truthful and honest with their data visualizations?
- So both in the talk and in the new book I give a list of things that we can pay attention to when we read a chart. The first rule, by the way, and I write this very explicitly in the book, the first rule of a good reader of charts is to pay attention, and that's the first bias that we need to overcome. There's a very common bias, particularly in people who are not trained in quantitative methods, statistics, design, et cetera, and this bias, by the way, also exists among people in these areas, to believe that a chart is sort of an illustration. It's something that can be understood at a glance: you take a look at the chart and you immediately understand it.
And in the book and in the talk I point out that the first rule in becoming a better chart reader is to pay attention and not assume that you understand the chart just because you have seen tons of charts in the past. So pay attention to the scales, pay attention to the legend, pay attention to the footnotes, and so on and so forth. And right after you have paid attention, I list the things that you need to pay attention to. The first one is obviously the source of the data, so take a look at what the source is, whether it looks reliable or not, and we're going to get into that, I provide several examples of how a non-specialist can do that. And then there are matters that are related to how the chart is built. So you mentioned before, for example, scales, and I show tons of examples in the book in which people fudged the scales of graphs, for example, or they manipulated the color scales in choropleth maps in order to provide a message that pushes some sort of agenda. So how has the chart been built? One of the chapters of the book basically talks about the grammar and the symbology of graphics, how to read the graphic, but right after that I have a chapter on what happens when you break certain principles in chart making, what the consequences could be. Another principle is don't attribute to malice what you can attribute to sloppiness. This is a very important principle, and most of the charts that I include both in the book and in the talk, I don't think that they were designed to mislead on purpose. I mean, sometimes we end up designing graphics that mislead just because we are careless or we are in a rush. And this has happened to me as it has happened to anybody else, right. Particularly having been a journalist who worked on deadline for many years: sometimes you need to rush to publish and you end up publishing something that is not that great, right. You mentioned correlation is not causation, or correlation doesn't equal causation.
This is a principle that everybody keeps in mind but nobody respects. By the way, there is a corollary to that principle, which is that correlation is usually one of the first clues to causation; that's also important to remember. But in any case, even if we keep the original principle in mind, and there's a whole chapter about this in the book, we keep breaking that principle over and over and over again. We keep seeing, for example, two events that happen in sequence and we immediately infer that one event caused the second event, or the second phenomenon. So I talk quite a lot about that in the book, but there's another one: when you want to make inferences, you need to analyze the data at the appropriate level of aggregation. So for example in the book I show a case, which by the way I borrowed from my friend, statistician Heather Krause, in which I show that I can basically prove, with quotation marks around prove, that smoking cigarettes increases life expectancy. Because if I do, for example, a correlational chart, a scatter plot with the number of cigarettes consumed per year and per person on the horizontal axis and life expectancy on the vertical axis, you will see that the more cigarettes people consume, the longer people live. And obviously that's the wrong level of aggregation, 'cause if you want to study what happens in your own body when you smoke cigarettes you need to analyze person by person, not country by country, because the correlation becomes inverse when you go down to the individual level. We know that smoking cigarettes shortens the life expectancy of people, of persons.
What happens is that when you aggregate the data there are other variables that come in, such as, for example, wealth or access to good healthcare. So basically what you get by doing a correlation chart, a scatter plot of cigarette smoking and life expectancy, is a chart that hides a lot of variables that are lurking behind that relationship, again such as wealth or healthcare and so on and so forth. So using the right level of aggregation; and I have other examples, not only this one, some of them related to politics also. I was writing about that today, so that would be another one. And the overall principle is, as I mentioned before, that one of the main reasons, not the main reason, but one of the main reasons that charts lie so often is that we love to lie to ourselves. We love to project what we want to see onto the charts that we consume every day. And we tend to read too much into charts, and this is another very important principle that I explicitly write in the book and mention in the talk, I believe: a chart shows only what it shows and nothing else. So for example the cigarette consumption versus life expectancy chart just shows you that at the national level there's a positive correlation. But you cannot infer from that that smoking more cigarettes will lead to longer lives; that's a completely undeserved extrapolation from what the data is showing. So charts show only what they show. And another thing is that people need to pay attention to what the charts that we consume show, but also to what the same chart could be hiding. Because a chart can only show so much, we cannot put everything in a chart, so we should always think about what has been left out and why.
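Editor's illustration: the level-of-aggregation point can be reproduced with a few lines of Python. This is only a minimal sketch on synthetic data, not the data or the chart from the book. The three countries, their baseline lifespans, their smoking levels, and the `YEARS_LOST_PER_1000` effect are all made-up assumptions chosen so that wealthier hypothetical countries both smoke more and live longer.

```python
from statistics import mean


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = mean(xs), mean(ys)
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical countries (made-up numbers, not real data):
# name -> (baseline life expectancy in years, typical cigarettes per person per year).
# Wealthier countries are assumed to have both better healthcare and more smoking.
COUNTRIES = {"A": (60.0, 200), "B": (70.0, 800), "C": (80.0, 1500)}

# Assumed individual-level effect: years of life lost per 1,000 cigarettes a year.
YEARS_LOST_PER_1000 = 4.0


def make_people(baseline, typical):
    """Return (cigarettes, lifespan) pairs for four synthetic residents."""
    people = []
    for offset in (-150, -50, 50, 150):
        cigarettes = typical + offset
        lifespan = baseline - YEARS_LOST_PER_1000 * cigarettes / 1000
        people.append((cigarettes, lifespan))
    return people


people_by_country = {name: make_people(*spec) for name, spec in COUNTRIES.items()}

# Individual level: correlation between smoking and lifespan inside each country.
within_r = {
    name: pearson([cigs for cigs, _ in people], [life for _, life in people])
    for name, people in people_by_country.items()
}

# Aggregated level: one point per country, using country averages.
avg_cigarettes = [mean([cigs for cigs, _ in p]) for p in people_by_country.values()]
avg_lifespan = [mean([life for _, life in p]) for p in people_by_country.values()]
country_r = pearson(avg_cigarettes, avg_lifespan)

print("within each country:", {k: round(v, 2) for k, v in within_r.items()})
print("across country averages:", round(country_r, 2))
```

With these assumed numbers the correlation is negative inside every country and positive across the country averages, because the baseline lifespan (standing in for wealth and healthcare) differs between countries. The country-level chart is not wrong; it simply cannot answer a question about individuals.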
- I think it's really interesting because your book, as you said, is for the general public; it's aimed at data literacy more so than at how to create better visualizations that don't lie, but clearly I think our audience can read this book and learn lessons about how to do a better job at what they're doing. And the thing to remember, I always use the phrase you have to have a zen beginner's mind when you do this work, because you have to remember what it's like to not know everything. I'm actually reading a book right now, now I can't remember the name of it, the Heath brothers, and one of the phrases they refer to in there is the curse of knowledge.
- The curse of knowledge, absolutely.
- So you have to release that, and in order to remember how not to be misleading, even by mistake, you have to just have an appreciation of your audience and understand what they don't know. And the fact is that everyone looks at a chart, as you sort of alluded to earlier, and thinks it speaks truth, thinks that it is complete, and it never is.
- Yeah, rather than scrutinize the chart, we tend to look at charts and take them at face value.
And we need to apply to charts the same critical thinking that we apply to text or to spoken words.
- Yeah. So you were a journalist, as you said, and I do find this world, the world of data visualization, data communications, it's really an interesting blend between journalism, design, and technology. And it's a very rare and interesting mix of skills, and I guess I would ask you, given the state of the world today and how we all question everything, which is a good thing, but of course the integrity of some of the institutions is a bit in doubt right now, what role do data visualization and journalistic thinking play in making the world a better place?
- So the book that I'm writing right now may have either the title or subtitle how charts lie, that would be the first part, but the second one is how they make us smarter. So when a chart is badly designed or we don't interpret it correctly, it makes us dumber, right, it makes us dumber.
But if the chart is well designed and we interpret it correctly, that chart makes us smarter. So that's a very important principle that we need to keep in mind, because a chart, and when I say chart I refer to any kind of data visualization, lets us see beyond what we can normally see. It lets us see behind the complexity of the data, or beyond the complexity of the data. So it extends our perception and our cognition. So that's a very important principle to remember. As to principles of journalism that can be applied to chart making or to visualization design, well, there are many. One of them would be, for instance, the use of narrative structures whenever we are going to present information to a particular audience.
So learning, for example, how to chunk information and how to sequence the information in a way that makes sense: rather than presenting everything at once, presenting things little by little, and how to chain one step to the next step, or connect them.
- Wait a second.
- So they make sense.
- Alberto, are you saying, and I know you would never use this word 'cause I know you're not a fan, you're talking about data storytelling. (laughter) At least that's my definition of storytelling. No, that's great, I agree. It's sequence, flow, chunking, absolutely. Keep going, sorry.
- So yeah, you were about to ask about my dislike for the word storytelling, right. That's sort of an internet or Twitter controversy that I got involved in a while ago, saying that I dislike that word, I don't like the word storytelling that much. Or better said, I like it when it is applied explicitly to what a story is.
And I use a very narrow definition of a story: a story is a kind of narrative that has some sort of, so to speak, redeeming arc or something like that. You begin a narrative and there is some sort of conflict, there are some characters involved perhaps that need to solve that conflict, and then you provide a resolution to that conflict so the story will be well rounded. Sometimes you can apply that structure to presentations, and if you can, great, go ahead, but I don't do that. But in the world of journalism, and I believe that this is something that is extending to other realms, we are using story or storytelling in a much looser way. We are calling, for example, interactive data visualizations stories, or some graphics that are not really stories, we call them stories. And I say that perhaps we need to use words or language a little bit more accurately or precisely and talk about, for example, insight.
So this graphic provides insight, it doesn't provide stories. Or narrative, for example: narrative structures are not always stories because there may not be a resolution at the end, you may keep things open-ended or you may present two or three or four different kinds of explanations of the phenomena that you are talking about, and I wouldn't call that a story.
- No, that's fair. And I appreciate it, I wasn't expecting us to go into that territory, but it's fun. And you know, it actually sort of brings us right back to the theme, right, because truth telling, being careful and accurate with our words, careful and accurate with our data and our visuals, it all aligns pretty nicely. So you know, unfortunately we are actually running out of time and I guess I just wanted to thank you very much for joining me here today.
It was a good talk and I wish we had a lot more time to talk about it, but I guess I would just ask you in conclusion to remind us when the book is coming out, and if you have any last words of advice and any sort of future telling about the state of data literacy and data visualization in the world in the next one, three, five years?
- Sure. So I don't know yet when the book is going to come out, sometime during 2019. The editing process and the publishing process is a little bit of a mystery to me. So right now, as I said before, I finished writing the first draft, now I need to go through copy editing and proofreading and verification et cetera, and I don't know how many months that will take, but sometime during 2019, probably in the first half, but I don't know yet. Anyway, where visualization is going I have no idea, but I see some promising trends that I would like to help push forward.
And one of them, and by the way this connects to the talk and also to the book, is that I believe that visualization has the potential to become a language or a means of discovery and expression that anybody can take advantage of. So I do believe that it can become a universal language, and I think that it's a powerful language; it's something, again, that can help us see beyond what we can normally see. And it should not be just the realm of groups of specialists: scientists, statisticians, graphic designers, journalists. I believe that any citizen can take advantage of charts to discover features of the places where that person lives or of his or her own life, or to basically conduct a better and more informed life.
- That's great. Alberto, thank you very much, really appreciate you joining me, and I know our audience will appreciate your insights. So thank you again.
- Thank you, thank you so much again for having me.
