Again, I'm Evan Jones, one of the course designers for Data to Insights, and I've been teaching data analysis for over ten years. My life at Google before developing courses like this one was in Google Finance, where we built pretty fun machine learning models to predict and optimize expenses here at Google. And I'm thrilled that Google has made its internal petabyte-scale data analysis tools available to the world through the Google Cloud Platform. It's that platform that we're going to be using to explore and derive insights using its big data tools.
Let's take a quick look at the agenda of topics we're going to cover. First, we'll start with the basics of Google Cloud Platform, highlighting how letting the cloud handle your compute and storage needs enables massive scalability. After the fundamentals of cloud, we'll go into the big data tools available to you as a data analyst. We're going to focus on BigQuery, Google Data Studio, and Cloud Dataprep to start. Third is where we'll start coding in SQL, the Structured Query Language. Fourth, we'll explore the BigQuery pricing model for query processing and data storage. Next up is a discussion on dirty data and how we can clean it up with SQL or a new UI tool. Sixth and seventh on this list are how you can create and store your own datasets in BigQuery from your queries or from external data sources. We'll close here with an introduction to visualization and how to create reports from your data within Data Studio.
Moving on to some of the more advanced topics we're going to cover. You're going to look at joining and unioning your datasets together in BigQuery, as well as some of the more advanced statistical functions and user-defined functions you may not have seen before. Afterwards, it's one of my favorite sections, on how repeated fields and arrays work within BigQuery's nested data structures. Again here, we'll close with some more advanced visualization tips within Data Studio.
In these last sections, we'll walk through one of the most popular topics, which is troubleshooting query and dataset performance. Lastly, before wrapping up, we'll close the specialization with the critical topic of data security and access controls.
This class is targeted primarily at data analysts who query their business datasets using SQL and create insightful reports and dashboards. So, first and foremost, we're going to take a look at the challenges that are faced by data analysts. Let's just jump right into those.
So, if you've run any queries in your life, this may sound familiar. When I was learning database processing in school, my instructors and teachers would say, "Hey, run this one query, and then you can go to the bathroom or do whatever you need to do while your query is running," right? So, in the upper left, you see queries that are taking too long, which could potentially stall your analysis. Or what if I wanted to combine 15 data sources and query all of them, and do that within a reasonable amount of time? A lot of times that was hard to do.
And in the middle, say it wasn't a querying problem, but it was actually an infrastructure problem. I'm a data analyst or a data scientist; I am not a hardware purchasing department. I don't know about buying servers and storing multiple redundant hard drives in case a hard drive platter fails. And I have to maintain the network for all of my data as it relates to processing my queries and accessing the data where it's stored. I don't want to deal with any of that kind of infrastructure, right? But I have to, as a necessary evil, if I want to be a big data shop, right?
Or if you're using Hadoop on your clusters, you're managing your clusters, and you've had this amazing capital outlay to get this awesome processing cluster, but now you're punished by your own success: your clusters can't scale, and your organization says, you did such an amazing job.
Now we have ten times the data; can your clusters handle it, or do you need to buy more and keep expanding your ever-growing infrastructure empire? And again, it's a question of how much of the business of building infrastructure you want to be in, weighing the opportunity cost of managing infrastructure against writing those amazing queries or those machine learning models to get those insights.
Lastly is a pretty apparent one, which is just cost. So, maybe you have a ton of data, a torrent of data, but you literally just can't afford to process all of it, either because performance-wise it's prohibitive on your machines and you can only create a few columns, or because of the monetary cost: processing that much data and storing that much data is just prohibitive. And last but not least, if you have no central place where you can just dump all this data, like a staging area or analytics warehouse, that could be a problem as well.
And these are a lot of the same exact problems that Google had growing up, right? Google was faced with a torrent of search indexing data and ads volume data. These are the problems that Google, as a big data organization, had to solve, and we'll see exactly how they did that, and how the benefits of technology and time have evolved to create a lot of these cool Google Cloud Platform tools, these big data tools like BigQuery.