TensorFlow is an open-source library for numerical computation using data flow graphs. It allows us to express machine learning and deep learning algorithms, and it brings along an execution engine which allows these algorithms to run at scale on multiple nodes in a cluster backed by CPUs, GPUs, TPUs, and mobile devices. So every numerical computation is a graph: the nodes are the computations, and the edges are the tensors flowing between them, hence the name TensorFlow. So let's start with a little coding example within the IBM Data Science Experience. We create a new notebook. We give it a name and select the correct Spark service, although we do not need it at the moment. And then we click on Create. Make sure that Python is selected as the programming language. First of all, we download some data. TensorFlow has a couple of built-in datasets, and we use the MNIST handwritten digit classification set. Now the data gets downloaded and we obtain a complex Python object called MNIST. Let's import TensorFlow first and then examine the data we've just downloaded. We use the Jupyter magic command %matplotlib inline to display the plots directly in the notebook. Then we access the first image out of the 55,000 in the MNIST training set using a built-in iterator, which is quite useful for later usage, but for now we just access one image at a time. Images in the MNIST dataset are 28 by 28 pixels, but we only obtain a vector of length 784. Therefore, we have to reshape it accordingly. Then we tell pyplot to plot in grayscale, and finally we plot the image. So this is a three, and if we run it again, we see a one. Note that every time we call next_batch, we obtain a new image. Sometimes we are not really sure which number is meant, therefore we can also print the label. This is a one-hot encoded vector, which means that at the index of the actual corresponding number we see a one, and all the other elements are zero.
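To make the two data representations above concrete, here is a minimal plain-Python sketch of one-hot labels and of reshaping a flat 784-element vector into 28 rows of 28 pixels. It does not use the MNIST object or TensorFlow from the notebook; the helper names one_hot, decode_one_hot, and reshape_to_image are hypothetical and chosen only for illustration.

```python
def one_hot(digit, num_classes=10):
    """Return a one-hot vector: 1.0 at the index of the digit, 0.0 elsewhere."""
    vec = [0.0] * num_classes
    vec[digit] = 1.0
    return vec


def decode_one_hot(vec):
    """Recover the digit by finding the index of the largest entry."""
    return max(range(len(vec)), key=lambda i: vec[i])


def reshape_to_image(flat, width=28):
    """Split a flat pixel vector into rows of `width` pixels each."""
    if len(flat) % width != 0:
        raise ValueError("vector length must be a multiple of the row width")
    return [flat[start:start + width] for start in range(0, len(flat), width)]


# Example: a label for the digit 3 and a dummy 784-pixel image.
label = one_hot(3)
image = reshape_to_image([0.0] * 784)
print(label, decode_one_hot(label), len(image), len(image[0]))
```

The reshaped list of rows corresponds to what a plotting call would need in order to draw the digit as a 28 by 28 picture.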
Now let's start with TensorFlow coding by expressing the static computation graph using Python. We start with a so-called placeholder. This is a tensor in TensorFlow into which data is fed at execution time. So basically, this is used to add data during training, which takes place after the computation graph is constructed. Those placeholders are typed, and we can use either single or double precision. So this placeholder will take our training vectors representing the images, with 784 elements each. Now we create a TensorFlow variable. A variable is something TensorFlow tweaks during training, whereas the placeholder is meant to hold training data. In addition, a variable can be saved to disk during and after training for checkpointing and model transfer. So we create our weight matrix W with 784 weights on one axis, just one for each element of x, and we do it 10 times, since we are basically running 10 softmax regression models in parallel, one for each possible digit. Finally, we add a bias b, one for each softmax regression model. Now we create the actual model. Please be aware that no computation is happening at this stage. We are basically just hooking up the nodes together to form a computation graph. Softmax as well as matmul expect TensorFlow variables as parameters. So that's the reason TensorFlow code is so hard to read. We are not computing anything, we are just expressing the computation graph, which is executed by the TensorFlow engine in the background. TensorFlow does not have its own domain-specific language for doing so, but relies on language bindings in different programming languages, like Python for example. SystemML, on the other hand, which is introduced in another module, is a bit better here, since SystemML has its own domain-specific language with R or Python syntax, which looks far more natural.
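The model described above multiplies the input by the weight matrix, adds the bias, and applies softmax. The following is a minimal plain-Python sketch of that arithmetic only. It computes values eagerly, so it is not how TensorFlow builds a deferred graph, and the function names softmax and predict are hypothetical.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities that sum to 1."""
    largest = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(v - largest) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict(x, W, b):
    """Compute softmax(x W + b) for one input vector.

    x: list of 784 pixel values
    W: 784 rows of 10 weights each (one column per digit)
    b: list of 10 biases (one per digit)
    """
    num_classes = len(b)
    logits = [
        sum(x[i] * W[i][k] for i in range(len(x))) + b[k]
        for k in range(num_classes)
    ]
    return softmax(logits)


# With all-zero weights and biases, every digit is equally likely.
W = [[0.0] * 10 for _ in range(784)]
b = [0.0] * 10
probabilities = predict([1.0] * 784, W, b)
print(probabilities)
```

Each of the 10 columns of W together with its bias plays the role of one of the 10 regression models mentioned above, and softmax normalizes their scores into one probability per digit.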
Anyway, let's continue with creating another placeholder for the training labels, y. Its dimension is 10, because y is the one-hot encoded vector labeling the image. Now we define the cost function as cross-entropy. Let me first walk you through the formula, and then we will see how to implement it in TensorFlow. So we take the desired value y, multiply it by the log of the predicted value y hat, sum those values up, and negate the sum. So we start with the reduce_mean function of TensorFlow, because we are calculating individual cross-entropy values and want their average. Then we use reduce_sum to calculate the sum of the individual values of a tensor. And this tensor is the product of the desired value and the log of the actual prediction. The reduction_indices parameter defines the dimension of the tensor along which the aggregation should take place. Since y is a matrix of 10 columns and N rows, where N stands for the number of training examples, we sum across the 10 columns of each row, one term per digit, which gives the cross-entropy of each training example. This result is now passed as an argument to reduce_mean, so that the overall prediction error is calculated as the mean of all the individual prediction errors. Now we use TensorFlow's GradientDescentOptimizer with a learning rate of 0.5 to tweak W and b with respect to the cross-entropy function. So TensorFlow will take care of calculating the backpropagation and gradients for this task automatically. A feature called automatic differentiation does the job for us. Now we create a TensorFlow session. Since we are in an interactive context within a Jupyter notebook, we use the InteractiveSession. A session is the way to deploy a TensorFlow execution graph onto a specific execution context like a CPU or GPU. Then we initialize all global variables, since this hasn't been done yet. Remember, so far we have only expressed the computation graph. Now it's time to bring it to life.
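As a rough illustration of the cost function, here is a plain-Python sketch of cross-entropy for one-hot labels. It assumes the usual convention of summing over the 10 classes for each example and then averaging over the examples; the reduction axis actually used in the notebook is not shown in this transcript, so treat that as an assumption. The function names and the small eps guard against log(0) are hypothetical additions.

```python
import math


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Cross-entropy of one example: -sum(y * log(y_hat)) over the classes."""
    return -sum(t * math.log(p + eps) for t, p in zip(y_true, y_pred))


def mean_cross_entropy(labels, predictions):
    """Average the per-example cross-entropy over a batch."""
    losses = [cross_entropy(t, p) for t, p in zip(labels, predictions)]
    return sum(losses) / len(losses)


# Two examples with three classes each, to keep the numbers readable.
labels = [[0.0, 1.0, 0.0], [1.0, 0.0, 0.0]]
predictions = [[0.1, 0.8, 0.1], [0.6, 0.3, 0.1]]
print(mean_cross_entropy(labels, predictions))
```

Because the labels are one-hot, only the predicted probability of the correct class contributes to each example's loss, so a confident correct prediction yields a value close to zero.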
After the variables have been initialized, it's time to create our gradient descent loop. So this is mini-batch gradient descent, since on each iteration we grab a hundred randomly selected examples from the training set and, using the session object, execute gradient descent for those hundred examples. Note that we pass the training examples as parameters to this function call in order to assign them to the previously defined placeholders. So this runs very fast. Now let's evaluate our classification performance using the test set from MNIST. So argmax returns the index of the maximum value in a tensor, in this case a vector. This way we can transform back from the one-hot encoding scheme to a scalar. We use reduce_mean in order to determine the fraction of correctly predicted values, but since correct_prediction is a Boolean vector, we have to cast it to float in order to calculate the mean over this vector. And again, accuracy is a node in the computation graph. Therefore, we need to use the TensorFlow session in order to execute it. Now the placeholders come in handy, because now we assign the test dataset to the graph. And as expected, this simple regression model gives us 92 percent accuracy. So as you've seen, training of neural networks can be quite complicated. But it's okay if you didn't understand it all. In the next module, we will cover TensorBoard, which is a way of debugging your neural network.
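The evaluation step can be sketched in plain Python as follows: take the argmax of each prediction and each one-hot label, compare them to get a Boolean list, cast the Booleans to floats, and take the mean. This mirrors the description above but does not use the TensorFlow session; the function names argmax and accuracy are hypothetical.

```python
def argmax(vec):
    """Return the index of the largest value in a vector."""
    return max(range(len(vec)), key=lambda i: vec[i])


def accuracy(predictions, labels):
    """Fraction of examples whose predicted digit matches the label."""
    correct = [argmax(p) == argmax(t) for p, t in zip(predictions, labels)]
    as_float = [1.0 if c else 0.0 for c in correct]  # cast Booleans to floats
    return sum(as_float) / len(as_float)


# Three examples with two classes each; the third prediction is wrong.
predictions = [[0.1, 0.9], [0.8, 0.2], [0.3, 0.7]]
labels = [[0.0, 1.0], [1.0, 0.0], [1.0, 0.0]]
print(accuracy(predictions, labels))
```

The cast matters because a mean over true and false values is only defined once they are treated as ones and zeros.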