# Single Layer NN using TensorFlow

The following code classifies the MNIST dataset using a single-layer neural network with a softmax activation function, a cross-entropy loss function, and the mini-batch technique.

## Softmax Function
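
Softmax turns a vector of raw scores into probabilities: each score is exponentiated and divided by the sum of all the exponentials, so the outputs are positive and sum to 1. Here is a minimal plain-Python sketch, separate from the TensorFlow code used in this notebook. The function name `softmax` and the example scores are illustrative assumptions.

```python
import math

def softmax(scores):
    """Turn a list of raw scores into probabilities that sum to 1."""
    # Subtracting the maximum leaves the result unchanged but avoids overflow in exp().
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

# Illustrative scores for three classes; the largest score gets the largest probability.
probs = softmax([2.0, 1.0, 0.1])
print(probs)
```

The ordering of the scores is preserved, which is why the most probable class can later be read off by taking the index of the largest entry.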

## Cross Entropy
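
Cross-entropy compares the predicted probabilities with the one-hot label: it is the negative sum of `label * log(prediction)` over the classes, which for a one-hot label reduces to minus the log of the probability assigned to the correct class. A small plain-Python sketch follows. The helper name `cross_entropy` and the example numbers are assumptions for illustration, not part of the notebook.

```python
import math

def cross_entropy(one_hot_label, predicted_probs):
    """Return -sum(label * log(prediction)) for a single example."""
    # Terms where the label is 0 contribute nothing, so they are skipped.
    return -sum(t * math.log(p)
                for t, p in zip(one_hot_label, predicted_probs) if t > 0)

label = [0.0, 1.0, 0.0]                                  # correct class is index 1
confident = cross_entropy(label, [0.05, 0.90, 0.05])     # model is fairly sure and right
unsure = cross_entropy(label, [0.40, 0.30, 0.30])        # model favours the wrong class
print(confident, unsure)
```

The loss is small when the correct class gets a high probability and grows as that probability shrinks.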

## Mini-batch Technique

We take a batch of 100 images in a single iteration. There are two reasons to use it:

- Updating on a single image at a time results in a noisy, zigzagging descent. Looking at 100 images at once gives a more reliable estimate of the gradient.
- Processing a batch turns the work into large matrix multiplications, which GPUs are optimised for.
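
To make the batching idea concrete, here is a rough sketch of cutting a dataset into mini-batches. `iterate_minibatches` is a hypothetical helper written for this illustration, and the dataset is a list of dummy integers rather than real images. The notebook itself relies on `mnist.train.next_batch(100)` instead.

```python
def iterate_minibatches(samples, batch_size):
    """Yield consecutive slices of `samples`, each holding at most `batch_size` items."""
    for start in range(0, len(samples), batch_size):
        yield samples[start:start + batch_size]

# Stand-in for a dataset: 250 dummy items instead of real images.
dataset = list(range(250))
batches = list(iterate_minibatches(dataset, batch_size=100))
print([len(b) for b in batches])  # prints [100, 100, 50]
```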

```
import os
os.environ['TF_CPP_MIN_LOG_LEVEL']='2'
import tensorflow as tf
import matplotlib.pyplot as plt
import matplotlib.patches as mpatches
%matplotlib inline
```

## Load Data

MNIST is a dataset of handwritten digits. We download it using the TensorFlow tutorial examples module.

Make sure to change the path to suit your setup.

```
from tensorflow.examples.tutorials.mnist import input_data
mnist = input_data.read_data_sets("tutorials/data/MNIST/", one_hot=True)
print("Extraction of images is complete.")
```

## TensorFlow Placeholders

TensorFlow placeholders are like variables waiting for input. They are the entry points of the computational graph, through which we feed data into it. The weights W and the bias b, by contrast, are variables that TensorFlow updates during training.

```
X = tf.placeholder(tf.float32, [None, 784])
W = tf.Variable(tf.zeros([784, 10]))
b = tf.Variable(tf.zeros([10]))
init = tf.global_variables_initializer()
```

## Model

Our model is a softmax function applied to the weighted sum of the image pixels plus a bias: Y = softmax(X · W + b)

Where,

X = flattened image vector of shape (1, 784)

W = weights matrix of shape (784, 10) -> one row for each pixel and one column for each class [0-9]

b = bias vector with 10 entries, one for each class
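
As a rough illustration of the model for a single image, the sketch below computes the weighted sum plus bias and passes it through softmax in plain Python. The toy sizes (4 "pixels", 3 classes) and the helper name `predict` are assumptions chosen for readability; the notebook uses 784 pixels, 10 classes, and TensorFlow ops.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    shift = max(scores)
    exps = [math.exp(s - shift) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict(x, W, b):
    """Compute softmax(x . W + b) for one flattened image x."""
    n_classes = len(b)
    scores = [sum(x[i] * W[i][j] for i in range(len(x))) + b[j]
              for j in range(n_classes)]
    return softmax(scores)

# Toy sizes (assumed): 4 "pixels" and 3 classes instead of 784 and 10.
x = [0.0, 0.5, 1.0, 0.0]
W = [[0.0] * 3 for _ in range(4)]   # all-zero weights, like tf.zeros in the notebook
b = [0.0] * 3
probs = predict(x, W, b)
print(probs)
```

With all-zero weights and biases every class receives the same probability, so before training the model has no preference for any class.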

## Role of Mini-batch

Since we are using the mini-batch technique, X becomes a matrix of shape (100, 784), with each row holding one flattened image.

```
Y = tf.nn.softmax(tf.matmul(X, W) + b)
```

## Placeholder for correct Label

We need a placeholder to hold the correct labels, which lets us compute the accuracy and cross-entropy of our model.

```
Y_ = tf.placeholder(tf.float32, [None, 10])
```

## Loss Function

As discussed above, we use the cross-entropy loss function, summed over all images in the batch.

```
cross_entropy = -tf.reduce_sum(Y_ * tf.log(Y))
```

## % of correct answers found in batch

Graph nodes to compute the accuracy of our model:

```
is_correct = tf.equal(tf.argmax(Y_, 1), tf.argmax(Y, 1))
accuracy = tf.reduce_mean(tf.cast(is_correct, tf.float32))
```
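
The same accuracy computation can be sketched in plain Python: take the index of the largest entry in each label and each prediction, compare them, and average the matches. The helper names and the four example rows below are made up for illustration.

```python
def argmax(values):
    """Index of the largest entry in a list."""
    return max(range(len(values)), key=lambda i: values[i])

def batch_accuracy(one_hot_labels, predicted_probs):
    """Fraction of rows whose most probable class matches the label."""
    hits = [argmax(t) == argmax(p)
            for t, p in zip(one_hot_labels, predicted_probs)]
    return sum(hits) / len(hits)

labels = [[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]]
preds = [[0.7, 0.2, 0.1], [0.1, 0.8, 0.1], [0.5, 0.3, 0.2], [0.2, 0.6, 0.2]]
acc = batch_accuracy(labels, preds)
print(acc)  # prints 0.75: the third row is misclassified
```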

## Optimizer

We use plain gradient descent as the optimizer, with a learning rate of 0.003. This means that on every step we subtract 0.003 times the gradient (0.3% of it) from the weights and biases.

## Why Gradient?

The gradient of the loss points in the direction in which the loss increases fastest. Moving the weights a small step in the opposite direction therefore lowers the loss, and repeating this moves the weights towards a minimum of the loss function.

```
optimizer = tf.train.GradientDescentOptimizer(0.003)
train_step = optimizer.minimize(cross_entropy)
```
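
To see the update rule in isolation, here is a small sketch of gradient descent on a one-dimensional toy loss, f(w) = (w - 3)^2, whose minimum is at w = 3. The toy loss, the starting point, and the learning rate of 0.1 are assumptions picked so the example converges quickly. They are not the values used for the network above.

```python
def gradient(w):
    """Derivative of the toy loss f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)

learning_rate = 0.1   # larger than the notebook's 0.003 so the toy example converges fast
w = 0.0               # arbitrary starting point
for _ in range(100):
    # The gradient points uphill, so we step in the opposite direction.
    w -= learning_rate * gradient(w)
print(w)
```

Each step shrinks the distance to the minimum by a constant factor, so `w` ends up very close to 3.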

## Session

According to the definition in the TensorFlow documentation:

A Session object encapsulates the environment in which Operation objects are executed, and Tensor objects are evaluated.

We can think of it as a kind of main function for our computational graph. Nothing in the graph is computed until we ask the session to run a node, at which point it evaluates that node and everything it depends on.

```
sess = tf.Session()
sess.run(init)
```

## Training

Time to train the system. We run 2000 iterations of the train step and compute the accuracy and cross-entropy on the current batch at each step. Alongside, we compute the accuracy and cross-entropy on the test set at each iteration, to see the improvement on unseen data.

```
# no. of iterations
n_iter = 2000
# test set
test_data = {X: mnist.test.images, Y_: mnist.test.labels}
# lists to hold train accuracy and cross-entropy
acc_train_li = []
cross_train_li = []
# lists to hold test accuracy and cross-entropy
acc_test_li = []
cross_test_li = []
for i in range(n_iter):
    # load batch of images and correct answers
    batch_X, batch_Y = mnist.train.next_batch(100)
    train_data = {X: batch_X, Y_: batch_Y}
    # train
    sess.run(train_step, feed_dict=train_data)
    # find accuracy and cross entropy on current data
    a, c = sess.run([accuracy, cross_entropy], feed_dict=train_data)
    acc_train_li.append(a)
    cross_train_li.append(c)
    # find accuracy and cross entropy on test data
    a, c = sess.run([accuracy, cross_entropy], feed_dict=test_data)
    acc_test_li.append(a)
    cross_test_li.append(c)
```

## Plot the graph

We plot 2 graphs:

- Accuracy graph – To display the accuracy on the train data and test data
- Cross-entropy graph – To display the cross-entropy loss on train and test data

```
x = list(range(n_iter))
blue_patch = mpatches.Patch(color='blue', label='Train Data')
red_patch = mpatches.Patch(color='red', label='Test Data')
plt.figure(0, figsize=(10, 12))
plt.subplot(211)
plt.title("Accuracy")
plt.legend(handles=[blue_patch, red_patch])
plt.plot(x, acc_train_li, color='blue')
plt.plot(x, acc_test_li, color='red')
plt.subplot(212)
plt.legend(handles=[blue_patch, red_patch])
plt.title("Cross-Entropy Loss")
plt.plot(x, cross_train_li, color='blue')
plt.plot(x, cross_test_li, color='red')
plt.show()
```

## Final Loss and Accuracy

Let’s have a peek at the final loss and accuracy on the training and test sets.

```
print('Train Set Accuracy: {} \t Train Set cross-entropy Loss: {}'.format(acc_train_li[-1], cross_train_li[-1]))
print('Test Set Accuracy: {} \t Test Set cross-entropy Loss: {}'.format(acc_test_li[-1], cross_test_li[-1]))
```

## Conclusion

Using a single-layer neural network resulted in an accuracy of approximately 92%. Using this system in a post office to read handwritten digits could be devastating.

Why? Because, according to our findings, it would misread about 8 out of every 100 images (92% accuracy).
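
A back-of-the-envelope calculation shows how quickly per-digit errors add up. The sketch assumes a 5-digit postal code and independent errors on each digit; both are assumptions made for illustration, not measurements from this notebook.

```python
per_digit_accuracy = 0.92   # approximate test accuracy reported above
digits_per_code = 5         # assumed length of a postal code

# Probability that every digit in the code is read correctly,
# assuming errors on different digits are independent.
p_all_correct = per_digit_accuracy ** digits_per_code
print(round(p_all_correct, 3))  # prints 0.659
```

Under these assumptions roughly one code in three would contain at least one misread digit.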

Code available at this repository
