In the previous video, we described each of the steps of the scientific method and the role data plays in that process. In this video, we'll go into a little more depth and specificity on the technical skills necessary to complete each of the steps in the data science workflow. If you're not familiar with all of the skills and techniques we outline in this video, that's perfectly okay. One of the purposes of this course is to develop some of these skills. The key thing here is that you understand the importance of developing these skills in order to perform data science well.

Let's begin with question development. Remember that quality questions must be relevant and answerable in an objectively measurable manner. But how do we know if that's the case? Well, it requires some fundamental data science skills. To understand if a question is relevant, we need data literacy to read and analyze previous research on a topic. This allows us to develop our own knowledge and ask meaningful questions whose answers and solutions can make a real impact. To determine what's objectively measurable, we need an understanding of bias and quantitative methods. Relatedly, practitioners need to be able to pose questions in a way that makes them answerable using data science methods. These skills save us time and energy at this stage by letting us frame our questions so that they can be easily answered using data. Together, these technical skills allow us to ask questions that will enable successful science.

Next, let's talk about hypothesis development. When constructing hypotheses, remember that we want to construct a testable hypothesis set: one null hypothesis and one alternative hypothesis. To do that successfully, the scientific practitioner needs a fundamental understanding of probability distributions, relevant hypothesis tests, and the framing of real-world problems for data science.
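To make the idea of a testable hypothesis set concrete, consider a hypothetical experiment comparing average session length (in minutes) under two page designs. The null hypothesis says the designs produce the same mean; the alternative says they differ. The scenario, names, and numbers below are illustrative, and the test shown is one simple choice (a large-sample two-sided z-test) built from the Python standard library; in practice you would pick whichever hypothesis test fits your data and assumptions.

```python
import math
import statistics

def two_sample_z_test(sample_a, sample_b):
    """Large-sample two-sided z-test.

    H0: the two groups share the same mean.
    H1: the group means differ.
    Returns (z_statistic, p_value).
    """
    mean_a, mean_b = statistics.fmean(sample_a), statistics.fmean(sample_b)
    var_a, var_b = statistics.variance(sample_a), statistics.variance(sample_b)
    # Standard error of the difference in means.
    se = math.sqrt(var_a / len(sample_a) + var_b / len(sample_b))
    z = (mean_a - mean_b) / se
    # Two-sided p-value from the standard normal CDF.
    p = 2 * (1 - 0.5 * (1 + math.erf(abs(z) / math.sqrt(2))))
    return z, p

# Hypothetical session lengths (minutes) for two page designs.
control = [5.1, 4.8, 5.3, 5.0, 4.9, 5.2, 5.1, 4.7]
variant = [5.6, 5.9, 5.5, 5.8, 5.7, 6.0, 5.4, 5.6]
z, p = two_sample_z_test(control, variant)
# A small p-value would lead us to reject H0 at a chosen significance level.
```

Framing the hypothesis this way up front means that, by the time we reach the analysis step, we already know which quantity to measure and which test to run.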
While these statistical techniques will not be directly used in this step, they can inform the development of hypotheses so that they're easily assessed later on in the data science process. This helps us scale out data science solutions so that they can succeed with limited manual intervention.

The next step in the scientific process is carrying out an experiment, and there are a lot of technical skills necessary to do this effectively. First, some type of experiment or analysis must be designed. This requires skills in experimental design and research design, like mitigating bias by sampling data in a responsible way. Then the experiment or analysis needs to actually be performed. In data science, this usually involves some type of data pipeline, which requires an ability to write data engineering code to import, clean, and manipulate data to make it useful. These tasks can be completed with varying degrees of quality, and it's not uncommon for them to be handled by dedicated data engineers working alongside data scientists on a data team. Data scientists can also create data in this step using statistical or machine learning modeling techniques, a common practice in modern data science.

After an experiment has been conducted, the results need to be analyzed and interpreted. To analyze the results, data scientists need the ability to apply hypothesis testing methods and machine learning model evaluation techniques. Then, based on these results, data scientists need data literacy to interpret the outcome and arrive at a conclusion. These skills are vitally important to a data scientist because this is how the value of data science work is assessed. In other words, this is how we determine whether data science projects are making an impact, or whether they might need to be reworked in some way.
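Here is a minimal sketch of the import-clean-manipulate pattern described above, assuming a hypothetical CSV export from an experiment. It uses only the Python standard library; a real pipeline would typically read from a database or file store and use a library like pandas, but the three stages are the same.

```python
import csv
import io
import statistics
from collections import defaultdict

# Hypothetical raw export from an experiment; the blank metric field is the
# kind of data-quality problem the cleaning stage must handle.
raw_csv = """group,metric
control,5.1
control,4.8
treatment,5.9
treatment,
treatment,5.6
control,5.0
"""

def run_pipeline(raw_text):
    """Import -> clean -> manipulate: return the mean metric per group."""
    rows = csv.DictReader(io.StringIO(raw_text))      # import the raw data
    clean = [r for r in rows if r["metric"].strip()]  # clean: drop missing values
    grouped = defaultdict(list)
    for r in clean:                                   # manipulate: reshape by group
        grouped[r["group"]].append(float(r["metric"]))
    return {group: statistics.fmean(values) for group, values in grouped.items()}

summary = run_pipeline(raw_csv)
# summary now holds one average per experimental group, ready for analysis.
```

The output of a pipeline like this is exactly what the analysis step consumes, which is why pipeline quality directly affects how trustworthy the later conclusions are.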
Data science projects can be ineffective if the results aren't communicated, shared, or delivered, and technology can help us complete these tasks. To deliver results to stakeholders as a presentation, a data scientist must be able to develop data visualizations that are informative and easy to understand. This skill, along with the ability to engineer data pipelines, is also important in the development of real-time dashboards, a common request from key project stakeholders. If the result that needs to be communicated is a model or prediction, a live REST API, an interface used by computers to get results on request, might need to be developed to support that communication.

We hope that these examples of concrete skills demonstrate the inseparability of the scientific process and performing data science at scale. Next, we'll work to make more sense of these skills by organizing them into common fields associated with data science.
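As one last concrete illustration, the REST API mentioned above can be sketched with nothing but the Python standard library. The "model" here is a hypothetical hand-written scoring rule standing in for a trained model loaded from disk; a production service would use a web framework, but the request/response shape is the same.

```python
import json
from http.server import BaseHTTPRequestHandler, HTTPServer

def predict(features):
    """Hypothetical model: a hand-written linear scoring rule standing in
    for a real trained model."""
    return round(0.5 * features.get("x1", 0.0) + 1.2 * features.get("x2", 0.0), 4)

class PredictionHandler(BaseHTTPRequestHandler):
    """Minimal REST endpoint: POST a JSON feature payload, get a prediction back."""

    def do_POST(self):
        length = int(self.headers.get("Content-Length", 0))
        features = json.loads(self.rfile.read(length))
        body = json.dumps({"prediction": predict(features)}).encode()
        self.send_response(200)
        self.send_header("Content-Type", "application/json")
        self.end_headers()
        self.wfile.write(body)

if __name__ == "__main__":
    # Serve on localhost:8000; a client would POST a body like
    # {"x1": 2.0, "x2": 1.0} and receive {"prediction": ...}.
    HTTPServer(("localhost", 8000), PredictionHandler).serve_forever()
```

A service like this lets other programs request predictions on demand, which is what makes a model a deliverable rather than a one-off analysis.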