From the course: Data Visualization: A Lesson and Listen Series

Listen: Elijah Meeks

- Thank you, Bill, thanks for inviting me. I'm happy to talk about data visualization anytime. - Excellent, so the focus, the theme for today's talk is really talking about big data, which, of course, is a big term [...] - And then there's this very practical and also social layer to it. So, the first one I'd like to focus in on is how I build custom data visualizations. So, a lot of folks use tools to drive data visualization, and those tools can span from sort of GUI-driven, off-the-shelf tools to a wide variety of libraries. And those are typically optimized for smaller datasets. And just like big data, small data has changed in its meaning over time. But tens of thousands of rows max, typically. So, oftentimes, I'm presented with a custom data visualization ask that isn't based at all on whether or not they need some kind of fancy, weird chart that you could only do in custom data viz, or some kind of very involved design process; but rather, they're coming to me and saying, "Hey, this data, we've got this dashboard in Tableau. We've built it with x dataset, and it's great. But now we need to drive this dashboard off of a much larger dataset, and we can't build a Tableau extract based off of that." And all they want from me is to build the same Tableau dashboard, but with custom data visualization, because then they can get access to some of these big data back ends, like Snowflake or Druid. We use Druid a lot for our big data. And so, it's very interesting to me that custom data visualization, especially when I started at Netflix, but even still, years later, is oftentimes framed as a data access solution, that if we could, we'd just, if there was a working Druid connector in Tableau, we would never have even hired you, right? - Right.
- And so, I use that opportunity to build for them what they want, which is this very interesting situation because they have in their minds not just a view into the data, but a very specific visual representation of that view that they expect me to produce. So, if their Tableau dashboard had a pie chart and a bar chart and a line chart on it, and I came back to them with a cool Sankey diagram and a, I don't know, table full of sparklines, then typically they'd feel a certain sense of betrayal, right? Because implicit in their ask about visualizing this big data repository is that it would take the form of the visualization of whatever the sample or smaller subset of that data did. - Yeah, I actually had a question I was going to ask you, and I was originally going to leave it out because of time, but I think we've sort of walked right into it. As someone who works with big data, do you need to be a, if not a data scientist, at least a data analyst who knows how to do this advanced querying, so that you can get those sort of smaller datasets that can actually be visualized? And I thought that would be a bit of a conversation, but it sounds really like you kind of really need to do that. And in the work that I do, I'm not really a data analyst. A lot of times my clients give me the data, it is summarized, it is in very light, small form by the time I'm visualizing it. And it seems to me like what you're saying is that in order to really work with big data, you really need to be a data analyst or at least an engineer who can do some of that work yourself before you can then get to the stage of visualizing, or partner with somebody who can do that with you and for you. - I think, to a certain degree, yes. I mean, definitely once you factor in the sort of partnering with somebody, to a certain degree. The value that I provide, so let me be very clear here. When somebody comes to me and says, "We have this Tableau dashboard, we love it.
Make it, but with a big data back end." They would be, they would check off the box of me having successfully fulfilled their objectives if I came back and it was this very rudimentary dashboard of the kind that an engineer might have produced using Tableau. I don't think that I would still have a job at Netflix if that's all that I did. Instead, what happens is I take that opportunity to sort of step in and try to enable them. My value add is to say, "Oh, well, think about these other ways that you might represent this data." And while I'm not a data scientist, I don't have a strong statistical background, or engineering background, frankly, but as long as I have a certain level of familiarity with the processes that go into data engineering and manipulation and pivoting things, I can help them to see that there are different views into this data that might be more useful or provide context to the primary views that they want. So, I do think you have to have a certain level of understanding of that, but it doesn't by any means have to be sort of a technical sophistication necessary to get a job as an actual data engineer or data scientist. - Yeah, that makes sense. You know, our audience for this show is very, very broad. We may have C-suite executives, we may have HR people, we may have data visualization practitioners, it really is across the board roles, demographics of all kinds. So, I imagine that for this particular interview, it's going to skew a little bit more towards the technical and maybe practitioner. So, for those folks, what would you say maybe you have done over the years, big, hairy, complex challenge where you've banged your head against some walls and sort of key learnings you could share with them of things that you might be able to recommend that they look for first when dealing with a challenge like this?
Or not even necessarily first, but you're going to hit this problem, this problem, this problem, and here are some solutions to those issues. - I think, I think one of, for anybody who sort of knows anything about me, I'm a huge fan of network data visualization. - I was going to talk about that specifically. But, yeah, go ahead (laughs). - Yeah, and I mean, LinkedIn had that, had your InMaps. I think network visualization is extremely powerful. I think it is incredibly interesting to audiences. And that, to me, is always an important sign. But at the same time, the complexity of it, the abstract nature of it, and frankly the lack of literacy in knowing how to read it, means that in most situations, network data visualization is too obscure for somebody to understand. And so, there have been a couple of times in my professional career here at Netflix, and earlier, at Stanford, where I have put in the effort to make a really good network visualization, which takes a lot of effort, and not technical effort, but I mean design effort to really try to infuse it with enough meaning and readability that somebody can navigate through it, only to find out that even that is too intimidating for an audience. And what I always end up learning, and I feel like I've relearned it, I'd like to hope that I'm not relearning it every time, but I'm getting better at it, is that to really give people an effective way to engage with network visualization, you need to surround that network visualization with a lot of contextual, more traditional data visualization, top-level metrics, lots of explanation for what you mean by this or that network effect. So, we have a network visualization that relies on reciprocated ties, and that's well known in social network analysis as a good sign of strong connections between people, but people don't know what that means. 
And we have to tell them, and tell them over and over again in different ways via helper text, but also via contextual data visualization that's smuggling in explanations. And only by doing that can you really give them a chance to read and really sort of entice them into taking the time necessary to learn to read something as complex as a network visualization. And that's something that I've always struggled with. I have a deep love of all of these sort of flow diagrams, like Sankey diagrams or these dagre diagrams, or traditional network charts, like force-directed networks and matrices and the hierarchical network stuff. I mean, I love all of it, and I have dealt with it over and over again. And every time, I have to still remember that I need to give people a soft landing around it, so that they can feel like the data visualization is welcoming to them, but also that there is important information here that is worth the effort to read. - Yeah, yeah, I mean, it's complex, big data, so a bar chart's not going to do it. Network diagrams are where we go 'cause of the complexity. It's all about sort of decoding that complexity. I want to ask you a very quick question, and then our final question. The quick question is, off the top of your head, great example, or examples, or really just one, of a network diagram that just works, whether it's one of yours or somebody else's, a network diagram that maybe is a good starter diagram that people will look at and say, "Okay, now I get how these things work at all." 'Cause some people do really struggle with that. - Yeah, so, I can think of two. One is, the New York Times did a wonderful diagram of Oscar winners and how they worked together with each other. And it was small and very, very designed, right? But very effective. Actually, there was another New York Times example that I think would be good that dealt with how political action committees were interrelated. - I remember that one.
- And I'll have to dig that one up. It uses parallel edges to great effect. It showed how certain staff members worked at different organizations. And I think it was for the Clinton campaign, showing how they were related to the various political action committees in the earlier Clinton campaigns. And then the other example isn't a single example, but rather a field of examples. And that is anything that deals with transportation networks, so Minard's old transportation maps from France, modern subway maps, and also all of these transportation density maps that they're creating all the time now, where you show how long it takes to get from the city center to somewhere else. Those are very legible to people because they're rooted in geographic reality, and our geospatial information visualization literacy is always higher than our other forms of information visualization literacy. So, anything that can be anchored in a geospatial representation actually will help with people trying to understand how to read something. - Yeah, those are great. Yeah, I remember the political example you mentioned from the New York Times. Shocking that the New York Times would come up twice (laughs), right? 'Cause, of course, they do the best work of many. All right, so, I do have one last question for you. I've heard you speak, and I know you've written about what you call the third wave of data visualization. So, we're sort of going off topic from big data a tiny bit here. So, you describe the three waves as being in the early years of the field, the definition of best practices and focusing on clarity; the second wave was developing systems and encoding of best practices; and you've said that now we're essentially going into the third wave, which is all about convergence. And you also, the second wave is also about developing tools.
So, convergence is where we at, or where we're at now. Can you just describe what you mean by that, and what it means for what's coming next in data visualization? - So, when I was growing up, doing data visualization 10 years ago, the reason why one would choose to learn D3 or encode things was because you literally couldn't get certain data visualization forms outside of specialized tools. And so, the only way you could get access to those was to learn these low-level geometric libraries that were all based off of the grammar of graphics or a similar systematic approach to encoding data channels with graphical attributes. That's no longer the case. You can do incredible things in Tableau. Just take a look at all the Tableau Public examples. There are amazing capabilities in a lot of different packages. Library-wise, you can do amazing, animated, graphically rich data visualization in R now in a way that you could only do in D3.
Likewise, whether it's a BI tool or a custom application or in the notebook environment, all of the different modes that we're in are getting closer and closer to each other. So, there's no longer sort of this discrete thing called a dashboard, and it's very different from something called a report, and it's very different from something called a notebook. Instead, they all share all the same capabilities and the same expectations among their audiences. And so, I think right now we're all dealing with this in a very reactive kind of way. We're responding to these changes intuitively. But we haven't sort of actively and explicitly called it out and theorized about what it means for us that all of these things, all of these tools now have very similar capabilities. All of these modes have very similar users and audience expectations.
And all of the ways that we're presenting things share in common a lot of the same forms, so that you have interactive elements in data-driven storytelling in journalism, and you have a lot of journalistic elements and aesthetic pop in internal business development applications. And so, I don't have necessarily the most concrete answers for what to do now, other than a few things that I touch on in a couple of my talks; for instance, saying that even if you're developing business applications in industry, you should acknowledge that it's an attention economy and you still have to draw your users in, you still have to use techniques to engage your users and draw them into your data visualization because the dashboard or report you make is one of many that they're going to have in front of them. And so, you have to make yours stand out in the same way that the New York Times has to make their stories stand out.
But otherwise, I think it's more an acknowledgement of this shift and really struggling with that and ideating around that to try to figure out what that means for the future of data visualization when it's no longer technically challenging to produce things, and instead, now, it's the question of what tool you make or what profession you're in, or what audience you're building for no longer constrains your choices. - Yeah, yeah, that's great. I mean, what I always say is that as the things get more and more commoditized and simplified and the technical solutions go away, it all comes back to the idea. It all comes back to that - That's right. - communication strategy. What am I tryin' to say? Who am I talkin' to? What do they need to hear? And so, it's always there first. And if you come up with those ideas first, then the technical solutions melt away, and more so now, as you said, than ever before.
- Absolutely, and I think that's why we need this emphasis on design and an emphasis on storytelling, and really, especially for the sort of C-suite audience for things like this, an emphasis on why data visualization and what data visualization is most impactful, so that they can know how to invest and how to evaluate it to know whether or not their organization is well equipped to enter a world where data visualization is everywhere and key to the success of an organization. - Absolutely. Well, that's unfortunately all the time that we have. Elijah, I want to thank you very much for joining me today. We dipped our toes in the shallow end of the very deep pool of big data. There's so much more to talk about, but we'll have to tackle that another day. But thank you very much for joining me and giving us your time. - Bill, can I make one pitch for the Data Visualization Society? - Sure, go for it.
- Yeah, I'd love for the listeners, if you're a data visualization practitioner, and that is broadly construed, and then I would like them to take a look at DataVisualizationSociety.com, which is a new and extremely active professional organization focused on holistic data visualization, so not focused on a particular tool or a particular mode of data visualization. And we have an excellent publication and a growing member list and an extremely active Slack. If you would like to join, it's free. And we've got a lot of opportunities for uplifting voices and improving your profile in the field. - Yeah, and I have to say, I'm a member of it, and I am on the Slack community, and it is a very vibrant community. So, absolutely, great job. - Thank you, Bill.
