We will build a neural network in TensorFlow consisting of four layers that takes in a handwritten digit and tells us which digit it is. We will use the MNIST handwritten digits data set, which contains 60,000 images for training and 10,000 images for testing our model, and it works pretty well. TensorFlow is an open-source machine learning library developed by Google. It is used for both research and production, and it offers APIs for beginners and experts to develop applications for desktop, mobile, web, and the cloud. I will use a Jupyter notebook inside a virtual environment. We first create a new virtual environment with python3 -m venv venv, and to activate it we use source venv/bin/activate. We run pip3 install jupyter tensorflow numpy matplotlib to install all the necessary dependencies for our project. We start Jupyter with jupyter notebook, and with New > Python 3 we create a new notebook. First, I add all the necessary imports to the top of the notebook so I don't need to scroll around much during the video. The images we'll work with have a size of 28 by 28 pixels and are grayscale, meaning the pixel values range from 0 for black up to 255 for white. To ensure our network learns efficiently, we want normalized values, ranging from 0 to 1, so we will need to do a little bit of pre-processing as well. First we need to download the data set. Fortunately, it is a very common data set to train on, so there are helper methods available for downloading it; we use datasets.mnist.load_data to download it. To normalize the data, we cast the integer values to floating-point values and divide them by 255.
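The loading and normalization steps above can be sketched like this (a minimal sketch using tf.keras.datasets; the variable names are my own):

```python
import tensorflow as tf

# Download MNIST: 60,000 training and 10,000 test images, 28x28 grayscale
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize: cast the 0-255 integer pixel values to floats in [0, 1]
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

print(x_train.shape, x_test.shape)  # (60000, 28, 28) (10000, 28, 28)
```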
Let's briefly examine the data set: our training set has 60,000 images, there are 10,000 images in the test set, and each image is 28 by 28 pixels. Let's see what they look like. I use matplotlib to show the first 10 images of the training set. Here is exactly what we want: 10 handwritten digits, with the strongest pixel values tracing out the digit itself. Let's create a simple feed-forward network with 784 input neurons, a hidden layer with 128 neurons, another hidden layer with 64 neurons, and an output layer with 10 neurons, one output neuron for each possible digit. The input and hidden layers will use the ReLU activation function to communicate their activation to the next layer, and the output uses softmax to turn the neuron activations into a probability. This way, we get the probability of which digit is seen in the input image, ranging from 0 to 1. I use the Sequential class to define our model; this class feeds the input sequentially through the defined layers to generate the output. The input to our neural network is a two-dimensional image of 28 by 28 pixels. Because our input layer requires 784 input values, we flatten the input to be a vector of length 784. Next, we feed these 784 values into a densely connected layer, which outputs 128 activations and uses the ReLU activation function to shape them. These are then fed into another densely connected layer outputting 64 activations, shaped by ReLU again. Lastly, we use another densely connected layer to output 10 activations, but this time we use the softmax activation function to shape them and turn them into a probability.
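Put together, the model described above might look like this in Keras (a sketch; the layer arguments follow the description in the text):

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),                   # 2D 28x28 grayscale image
    tf.keras.layers.Flatten(),                        # -> vector of 784 values
    tf.keras.layers.Dense(128, activation="relu"),    # first hidden layer
    tf.keras.layers.Dense(64, activation="relu"),     # second hidden layer
    tf.keras.layers.Dense(10, activation="softmax"),  # one probability per digit
])

model.summary()
```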
Now it's time to train our model on the training set. Training means we are feeding the entire training set, so 60,000 images and their respective correct labels, a couple of times into the network and checking whether the result of the network is actually correct. Every time something goes wrong, we calculate which neurons made a bad decision and tweak them a little. If everything works, over time our network should get better and learn to recognize the digits. This training algorithm is called backpropagation, and there are different implementations. Explaining the math behind it goes beyond the scope of this video, but there are a lot of really good resources out there if you want to know more. To know whether our neural network performs the task correctly, we need a definition of what "correct" means; that's what a loss function is for. For a classification task like ours, the negative log likelihood loss function is great. To give you an intuition of what is happening, imagine the following: given the number three, our model spits out these probabilities. As we can see, the model is only 40% sure that the digit is a three. That needs to be corrected; our loss function should spit out a high value to indicate something is wrong, and a low value if everything is fine. We take the 0.4 and calculate the logarithm of it, which gives us -0.92. If it were more sure, like 90%, we would get -0.1, and if it wasn't sure at all, with 1%, this would yield -4.6. So to get a high value for every wrong prediction, all we have to do is negate the log, and voilà, we get a high loss value for wrong predictions.
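The log values quoted above are easy to verify; here is the arithmetic with NumPy (using the natural log, as is conventional for this loss):

```python
import numpy as np

# Probability the model assigns to the true class, and the resulting loss
for p in (0.4, 0.9, 0.01):
    print(f"p={p}: log={np.log(p):.2f}, negative log likelihood={-np.log(p):.2f}")
# p=0.4 -> loss 0.92, p=0.9 -> loss 0.11, p=0.01 -> loss 4.61
```

The surer the model is about the correct class, the closer the loss gets to zero; a confidently wrong model is punished hard.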
Keep in mind, though, that the loss function really depends on the task you're trying to perform, so you basically cannot simply copy this code to make predictions on, say, crypto prices. Before we can train our model, we need to compile it. Here we set the optimizer; Adam is one of the gradient-based optimizers used to implement backpropagation training. We also set our loss function here. It gets a little bit complex, so to avoid delving too deep into naming and other loss functions: sparse categorical cross-entropy uses the negative log likelihood when predicting over multiple classes and the target label is the index of the category. We also track the accuracy during training, which is simply the number
of correct predictions divided by the total number of images in the training set. Now we can simply call the fit method with our training images and the associated labels, and let it run for five epochs, which means five full run-throughs over all 60,000 images. Let's train our model. This didn't take very long, so let's see how we're doing. Let's start with one digit. For that I'll use a function I found in a Medium article, linked in the description below. This function takes the image of the digit and the probabilities output by the model.
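Compiling and training as described might look like this (a sketch; the model and data setup are repeated so the snippet runs on its own):

```python
import tensorflow as tf

# Load and normalize the training data
(x_train, y_train), _ = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])

# Adam optimizer, sparse categorical cross-entropy loss, accuracy metric
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

# Five full passes over all 60,000 training images
history = model.fit(x_train, y_train, epochs=5)
```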
We use matplotlib to display the image and the probabilities side by side: first we show the image, then we draw a bar chart of all the probabilities for each digit. We retrieve one image from our test set and input it into our model, let the model make a prediction, and obtain the probabilities. Then we can easily pass the image and the probabilities to our view_classify function. Here's what it looks like: our model is quite confident that this is a seven. Good job!
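A helper like that could look roughly like this (my own sketch of such a function, not the exact code from the article; the name view_classify and the layout are assumptions):

```python
import numpy as np
import matplotlib
matplotlib.use("Agg")  # headless backend so the sketch runs anywhere
import matplotlib.pyplot as plt

def view_classify(image, probabilities):
    """Show a 28x28 digit image next to a bar chart of the 10 class probabilities."""
    fig, (ax1, ax2) = plt.subplots(ncols=2, figsize=(6, 3))
    ax1.imshow(np.asarray(image).reshape(28, 28), cmap="gray")
    ax1.axis("off")
    ax2.barh(np.arange(10), probabilities)
    ax2.set_yticks(np.arange(10))
    ax2.set_title("Class probability")
    ax2.set_xlim(0, 1.0)
    fig.tight_layout()
    return fig

# Demo with a dummy image and made-up probabilities (digit 7 most likely)
demo_image = np.random.rand(28, 28)
demo_probs = np.full(10, 0.02)
demo_probs[7] = 0.82
fig = view_classify(demo_image, demo_probs)
```

In the video, the image comes from x_test and the probabilities from model.predict on that single image.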
Now it's time to evaluate the whole test set. To evaluate our model, we simply call the evaluate method with all test images and test labels, and receive the total loss and accuracy back. Let's see... wow, 97.93% accuracy! Want to know more about the inner workings of neural networks and how to predict your favorite whiskey with them? The next video tells you all about it. Until then, have a lot of fun!
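End to end, the evaluation step might look like this (a sketch that repeats the full pipeline so it runs standalone; the exact accuracy varies slightly from run to run, typically landing around 97-98%):

```python
import tensorflow as tf

# Load and normalize training and test data
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

model = tf.keras.Sequential([
    tf.keras.Input(shape=(28, 28)),
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dense(128, activation="relu"),
    tf.keras.layers.Dense(64, activation="relu"),
    tf.keras.layers.Dense(10, activation="softmax"),
])
model.compile(optimizer="adam",
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])
model.fit(x_train, y_train, epochs=5, verbose=0)

# Evaluate on the held-out test set
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print(f"test accuracy: {test_acc:.4f}")
```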