Data warehousing requires extracting, transforming, and loading (ETL) disparate datasets, and this video shows how to make a diagram to express this process.
- [Instructor] Now, we are going to start discussing flow diagrams. Let's begin by talking about warehouse data flow diagrams. Remember the slide from chapter one. I was pointing out that flow diagrams can focus on many things. In this course, I'm going to show you examples of data flow diagrams, user flow diagrams and workflow diagrams. This video focuses on a data flow diagram where we are mapping the flow of data from one system or structure to another. Let me give you an example scenario. Okay, this is a true story, but the names have been changed to protect the innocent. Just kidding. So, my best friend is a transfusionist, and one day he asked if I'd give a consultation to a study group he was working on. See, when you donate blood, it's a really good thing to do because blood is a scarce commodity. So blood centers that do transfusions have to be very careful about not overusing their blood. Like, they try different alternatives to see ways they can conserve precious blood. But they were all wondering, how much is too much to use? To be honest with you, I told my friend that I don't really like blood. Like it freaks me out when I bleed, and I don't really know how to figure out how much is too much blood. He said he figured that out already. They would benchmark blood use by getting data from different hospitals and comparing the data. The hospital representatives were the other people on the team. But he said the group hadn't really worked out, exactly, all the technical issues. So he wanted my take on it. (clears throat deeply) I prescribed to my good friend, the doctor, a warehouse data flow diagram. So, what do I mean by that? I guess I use the term data warehouse loosely. What I mean by a data warehouse is an environment where the data are not live production data. They are instead static data that are copies of data, usually data that was production data before it was copied.
Typically, a warehouse environment, in my mind, is one where they take in data from different places, then do an extract, transform, load protocol or ETL protocol for those of you cool cats, and store it, so you can analyze it, usually with a fancy front-end for analyzing warehouse data. So I figured my friend was in need of a warehouse flow diagram. Therefore, I had to decide a few things. First, what software was I going to use to make the diagram? Then, I had to decide, am I going to use the official flow diagram shapes or am I going to make up my own? And of course I had to figure out what to put in the diagram. I had a basic idea from our conference call, but it was still a little hazy. Well, lucky for you, I chose PowerPoint as my software, so we can go look at what I did. Okay, here we are in PowerPoint. This file is in your exercise files for this video. It's called blood use data flow. Okay, right away you see, I chose not to use the traditional flow chart shapes. Why? Well, mainly because this was more of a conceptual flow than a literal one. And also, you'll see I chose PowerPoint for the software. I did that for two main reasons. First, I wasn't sure about the diagram, and I wanted to be able to easily show it to the group and edit it. Everyone in the group knows PowerPoint. So that's a good choice. Next, if we ever wanted to put this image in a publication, it's easy to export out as a JPEG. Just go to File, Save As, decide where you want to put it by clicking Browse. Here you can name it. But the important thing is, here you can change it to JPEG. Then it will ask you if you only want to save the current slide, or all the slides as JPEGs and you choose, and then you have a JPEG you can put in an article or on another slide or meta. All right, let's cancel out of here. We'll go back to the slide. So what do we have here? We have some buildings. We have three hospitals, and we have a data hosting facility.
And we have the headquarters of my esteemed consulting firm, Deathwench Professional Services or DPS for short. Also, we see these dreamy clouds of VPN. And we have data management terminals. Wait, where's the actual data? Oh yeah, here's the data in the garbage can shape. I used to have a coworker who said that. As you can see, there are a bunch of arrows indicating a data flow. And there is a bunch of wording to explain things. So, this is more of a conceptual, high-level management diagram. Obviously, just one of those arrows, like the one from Hospital A to the VPN cloud, probably involves a ton of technical steps that could themselves have several flow diagrams. But even though this is high level, after I made the diagram, the group got a lot more done, because now we could visualize exactly what we were trying to do. We had a map for the path that data would take if it flowed through our benchmarking system. Okay, that was a fun one, with buildings and clouds and garbage cans of data. Now, let's get down with some serious flow shapes in the next video, which is about analytic data flow diagrams.
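If it helps to see the path the data takes written out as code, here is a minimal Python sketch of the extract, transform, load idea behind the diagram. It is only an illustration under stated assumptions: the three hospital record formats, the field names, and every number are made up for this example, the VPN and hosting steps are not modeled at all, and this is not the actual process the study group used.

```python
# Hypothetical extracts: each hospital reports blood use in its own format.
# All field names and numbers here are invented for illustration only.
HOSPITAL_A = [{"patient_id": "A1", "units": 2}, {"patient_id": "A2", "units": 0}]
HOSPITAL_B = [{"pid": "B1", "rbc_units": "3"}, {"pid": "B2", "rbc_units": "1"}]
HOSPITAL_C = [("C1", 4), ("C2", 2), ("C3", 0)]


def extract():
    """Take a static copy of each hospital's records (not the live data)."""
    return {"A": list(HOSPITAL_A), "B": list(HOSPITAL_B), "C": list(HOSPITAL_C)}


def transform(raw):
    """Map every source format onto one shared schema."""
    rows = []
    for rec in raw["A"]:
        rows.append({"hospital": "A", "patient": rec["patient_id"],
                     "units": int(rec["units"])})
    for rec in raw["B"]:
        rows.append({"hospital": "B", "patient": rec["pid"],
                     "units": int(rec["rbc_units"])})
    for patient, units in raw["C"]:
        rows.append({"hospital": "C", "patient": patient, "units": int(units)})
    return rows


def load(rows, warehouse):
    """Append the cleaned rows to the warehouse (here, just a list)."""
    warehouse.extend(rows)
    return warehouse


def benchmark(warehouse):
    """Average blood units per patient for each hospital."""
    totals = {}
    for row in warehouse:
        units, patients = totals.get(row["hospital"], (0, 0))
        totals[row["hospital"]] = (units + row["units"], patients + 1)
    return {h: units / patients for h, (units, patients) in totals.items()}


warehouse = load(transform(extract()), [])
print(benchmark(warehouse))  # {'A': 1.0, 'B': 2.0, 'C': 2.0}
```

Each function lines up loosely with a part of the picture: `extract` is the arrows leaving the hospitals, `transform` and `load` are what happens at the data hosting facility, and `benchmark` is the kind of comparison the group wanted to run once the data were in one place.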
- How curation files function as part of data management
- Back-end curation
- Front-end curation
- Steps for dashboard design
- Designing surveys
- Creating warehouse, analytic, and application flow diagrams
- Text-based curation files