So what is data visualization? Well, data is stored on a computer, but we want to work with that data, so we need to get it out of the computer and into our heads somehow, so that we can think about it, understand it, and gain insight into what it represents. The best way we've found to do that is visual communication. Data visualization is a high-bandwidth connection between the computer and the human brain that transmits the data from the computer into our brains as efficiently as possible. Here's a typical visualization system that I've got diagrammed here. We have a scanner that's going to acquire data. In this case it uses a laser beam to scan the shape of this graduate student's head. That data gets processed by the computer into geometry and color, and then it gets displayed on this display screen. At this point we set up the data visualization channel that takes the data from the display screen and inserts it into my brain through this perceptual channel. So we get this visualization operation where I'm looking at the data, it's going through my perceptual system and into my memory, and then I can start to think about it. In this case, there are some strange things happening with the laser scanning here at the top of the student's head, and I want to understand why that's happening. So data visualization isn't just about seeing data. It's about understanding data and being able to make decisions based on the data.
Another example of how data visualization has been useful is something I discovered just in my work as a university professor. Students apply for graduate school in Computer Science at the University of Illinois, and we have a webpage that they use to select faculty they might want to work with.
We say, pick out the three faculty members you'd like to work with, and we have drop-down menus listing all the faculty. Students pick the faculty they want to work with, and once they've made that choice, the information is entered into a database. For each student applicant, we can see the faculty they might be interested in working with, and for each faculty member, we can start to get an idea of which students they would want to pick from. I wanted to see how uniform this was across the faculty, so I used a data visualization tool and came up with a standard bar chart. Horizontally, I've got the different faculty members, and vertically I'm plotting the number of students that want to work with each faculty member. Some faculty members are in our department and some are affiliated with our department, and some get more students interested in working with them and some get fewer. So we've got some faculty members that are very popular and some that aren't, and again, some of the ones that aren't very popular are faculty affiliated with other departments, and students usually apply to work with them through those other departments. I've removed the faculty names here, just to avoid any embarrassment. Then I color-code the bars. This is a stacked bar chart: blue if the faculty member was a student's first choice, orange if the faculty member was the student's second choice, and green if the faculty member was the student's third choice. And you might see something here. I certainly saw it when I first plotted this data, and that was that students were picking people in the first half of the alphabet. The chart is in alphabetical order.
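The tally behind a stacked bar chart like this one can be sketched in a few lines of Python. This is only an illustration: the names and applications below are made up (the real data isn't shown in the lecture), and the helper names `tally_choices` and `text_stacked_bars` are hypothetical. Each faculty member gets three counts, one per choice rank. The text rendering uses the digits 1, 2, and 3 in place of the blue, orange, and green segments, and draws the bars horizontally rather than vertically.

```python
# Hypothetical applications: each tuple is (first, second, third) choice.
# The names are placeholders, not real faculty, and the data is invented.
applications = [
    ("Adams", "Chen", "Young"),
    ("Baker", "Lopez", "Zhang"),
    ("Adams", "Baker", "Lopez"),
    ("Chen", "Young", "Zhang"),
    ("Baker", "Chen", "Young"),
]


def tally_choices(apps):
    """Return {faculty: [first_count, second_count, third_count]}."""
    counts = {}
    for choices in apps:
        for rank, name in enumerate(choices):
            # rank 0 = first choice, 1 = second choice, 2 = third choice
            counts.setdefault(name, [0, 0, 0])[rank] += 1
    return counts


def text_stacked_bars(counts):
    """Render one stacked bar per faculty member, in alphabetical order.

    The characters '1', '2', '3' stand in for the colored segments.
    """
    lines = []
    for name in sorted(counts):
        first, second, third = counts[name]
        bar = "1" * first + "2" * second + "3" * third
        lines.append(f"{name:<6} {bar}")
    return lines


counts = tally_choices(applications)
for line in text_stacked_bars(counts):
    print(line)
```

With real application data, the same three counts per faculty member are what a plotting tool would stack into the colored segments of each bar.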
Students are picking faculty from the first part of the alphabet first, and they're picking people from the second half of the alphabet mostly last. And they're picking their second choices fairly uniformly across the alphabet. What I discovered from this visualization was that students were picking faculty in alphabetical order. Having made that discovery, I was able to work back into the system and figure out why. The reason was that we had three pull-down lists with all the faculty. The students would go down the list, find the first faculty member they were interested in, and click there. They'd go down the list and select the second faculty member they were interested in, and then they'd find the third and click that, and all of that would go into the database. But when we were looking at the database in order to evaluate which students might want to work with us, we were assuming that the students' first choice was the faculty member they most wanted to work with, that their second choice was second, and that their third choice was third, when in fact they were just finding three faculty members and picking them as they appeared in the list. So in this case data visualization showed us something that we wouldn't have been aware of otherwise. That's a hallmark of data visualization: it provides insight into the data and lets you use that insight to make further decisions and form further hypotheses to explore. That's the kind of insight, and the kind of tools, that I want to provide to students in this course.
We'll work through four weeks of material on data visualization. The first week, we're just going to understand the platform for data visualization.
Basically, how a computer displays data, and then how the user perceives and processes it. So we'll spend most of the first week just studying the computer and studying the human. The second week, we'll introduce a few simple charts, bar charts and so on. More importantly, we'll look at the kinds of data that we want to visualize, and we'll make sure that we're using the proper chart for the proper kinds of data, and that the data are being mapped to the correct elements of that chart. We'll also talk about how to design charts effectively to convey that data. In week three, we'll look at how to lay out and display other data: non-numerical data, relational data. When your data isn't a number that gives you a height in a bar chart, but is a relation that says item A is related to item B, how do we display that? How do we visualize that? How do we gain insight into that kind of data? So we'll spend week three looking at abstract data as opposed to numerical data. Then in week four, we'll talk about visualizing databases, how to visualize the results of data mining, and how to analyze the way we work so that the visualization can give us the insight we need to make a decision. We'll talk about how to make a visualization dashboard that gives you the information you need in order to make an effective and informed decision.
I think it's also useful to talk about what we won't be able to cover in these four weeks. We're going to talk about abstract data visualization. I'm not going to be able to talk in much detail about specific tools used for visualization, or visualization tools for scientific visualization, medical visualization, business data, business intelligence systems, and so on. There are a number of tutorials and manuals for using specific packages, and we just won't have time to go into any one of them, but I'll try to provide examples across those different applications. I'll be able to provide a few design principles, but I won't be able to cover a full course on visual design for presentation. There are good courses on that that you might take a look at, but we'll just be looking at some very minimal design principles that are most effective for data visualization. This will be an introduction to the topics of data visualization. Data visualization can be a very deep subject, and I won't be able to get into the details. We'll stay at sort of a high level across the different areas of visualization, but focus on the topics that are important for making any visualization successful. I'll try to provide pointers to tools, but again, not tutorials on how to use those tools. There'll be a couple of assignments in the course. Programming can be used to complete those assignments, and I encourage you to program them if you know how, but it's not required. I'll provide existing tools that you can use, without writing a program, to visualize your data, both to complete these assignments and in general.
And finally, I'd like to thank a few people who have been very helpful with putting this course together. Eric Shaffer is a lecturer here at the University of Illinois. He taught one of the most recent scientific visualization classes we had, and he's provided some materials from that for this class. I'm very grateful for his help on that.
Karrie Karahalios is also a professor of Computer Science here at the University of Illinois. She's taught social visualization classes, especially on visualizing relational data in social situations. She's a world-renowned expert in that area and has provided me with a lot of resources and website links for tools to help with a variety of visualization methods. Mike Gleicher is a professor and a friend of mine at the University of Wisconsin. We just completed a university-wide data visualization course, and there was some nice material there that he provided me and that I've also been able to use in this course. Donna Cox is a good friend of mine from the NCSA who taught me a lot of what I know about data visualization. Many of the presentation visualization examples I'll be showing are her work from the National Center for Supercomputing Applications here at the University of Illinois. And finally, Pat Hanrahan, a professor at Stanford, has been very helpful and has provided a lot of good advice and resources for data visualization, and I owe him my gratitude as well.