Evaluating Models - Tensorflow Machine Learning Cookbook

We have learned how to train a regression and classification algorithm in TensorFlow. After this is accomplished, we must be able to evaluate the model's predictions to determine how well it did.

Getting ready

Evaluating models is very important, and every subsequent model will include some form of model evaluation. In TensorFlow, we must build the evaluation into the computational graph and call it during and/or after training.

Evaluating models during training gives us insight into the algorithm and may give us hints to debug it, improve it, or change models entirely. While evaluation during training isn't always necessary, we will show how to do this with both regression and classification.

After training, we need to quantify how the model performs on the data. Ideally, we have a separate training and test set (and even a validation set) on which we can evaluate the model.

When we want to evaluate a model, we will want to do so on a large batch of data points.

If we have implemented batch training, we can reuse our model to make a prediction on such a batch. If we have implemented stochastic training, we may have to create a separate evaluator that can process data in batches.
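As a rough illustration of such a separate evaluator, the following NumPy sketch computes the mean squared error over a whole dataset one batch at a time. The names batched_mse and predict_demo, and the constant 9.5, are hypothetical and only stand in for a trained model; this is not the TensorFlow graph code used later in this recipe.

```python
import numpy as np

def batched_mse(predict_fn, x, y, batch_size=25):
    """Mean squared error over all points, accumulated batch by batch."""
    total_sq_err = 0.0
    for start in range(0, len(x), batch_size):
        x_batch = x[start:start + batch_size]
        y_batch = y[start:start + batch_size]
        total_sq_err += np.sum((predict_fn(x_batch) - y_batch) ** 2)
    # Divide the summed error by the full count so that a short final
    # batch is not over-weighted.
    return total_sq_err / len(x)

# Hypothetical stand-in for a trained model: multiply the input by a constant.
predict_demo = lambda x: 9.5 * x

x_demo = np.linspace(0.8, 1.2, 103)   # 103 points, so the last batch is short
y_demo = np.repeat(10.0, 103)

mse_batched = batched_mse(predict_demo, x_demo, y_demo)
mse_full = np.mean((predict_demo(x_demo) - y_demo) ** 2)
print(mse_batched, mse_full)
```

Summing the squared errors and dividing once at the end gives the same value as evaluating on the full set in a single pass, even when the data size is not a multiple of the batch size.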

If we included a transformation on our model output in the loss function, for example, sigmoid_cross_entropy_with_logits(), we must take that into account when computing predictions for accuracy calculations. Don't forget to include this in our evaluation of the model.
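To make this concrete, here is a small NumPy sketch (not TensorFlow code) of turning raw model outputs, or logits, into class predictions. The arrays logits and targets are made-up values for illustration only.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

logits = np.array([-2.0, -0.1, 0.3, 4.0])   # made-up raw model outputs
targets = np.array([0.0, 1.0, 1.0, 1.0])    # made-up true classes

# Apply the sigmoid that the loss function applied internally, then round.
probs = sigmoid(logits)
preds_from_probs = np.round(probs)

# Thresholding the logits at 0 gives the same classes as thresholding
# the probabilities at 0.5.
preds_from_logits = (logits > 0).astype(float)

# Rounding the raw logits directly does not produce valid class labels.
naive_preds = np.round(logits)

accuracy_demo = np.mean(preds_from_probs == targets)
print(preds_from_probs, preds_from_logits, naive_preds, accuracy_demo)
```

Skipping the sigmoid and rounding the raw outputs is the mistake this paragraph warns about: the rounded logits are not restricted to 0 and 1, so comparing them with the targets gives a meaningless accuracy.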

How to do it…

Regression models attempt to predict a continuous number. The target is not a category, but a desired number. To evaluate these regression predictions against the actual targets, we need an aggregate measure of the distance between the two. Most of the time, a meaningful loss function will satisfy these criteria. Here we show how to change the simple regression algorithm from earlier so that it prints the loss in the training loop and evaluates the loss at the end. As an example, we will revisit and rewrite our regression example from the prior Implementing Back Propagation recipe in this chapter.
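One common aggregate distance is the mean squared error (MSE), which is also the loss used for the regression model in this recipe. A minimal NumPy sketch, with a hypothetical helper name and made-up numbers, looks like this:

```python
import numpy as np

def mean_squared_error(predictions, targets):
    """Average of the squared differences between predictions and targets."""
    predictions = np.asarray(predictions, dtype=float)
    targets = np.asarray(targets, dtype=float)
    return np.mean((predictions - targets) ** 2)

# Three made-up predictions for a target of 10.
mse_example = mean_squared_error([9.0, 10.5, 11.0], [10.0, 10.0, 10.0])

# The square root puts the error back in the units of the target.
rmse_example = np.sqrt(mse_example)
print(mse_example, rmse_example)
```

The squared errors here are 1, 0.25, and 1, so the MSE is 0.75.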

Classification models predict a category based on numerical inputs. The actual targets are a sequence of 1s and 0s and we must have a measure of how close we are to the truth from our predictions. The loss function for classification models usually isn't that helpful in interpreting how well our model is doing. Usually, we want some sort of classification accuracy, which is commonly the percentage of correctly predicted categories. For this example, we will use the classification example from the prior Implementing Back Propagation recipe in this chapter.

How it works…

First we will show how to evaluate the simple regression model that simply fits a constant multiplication to the target of 10, as follows:

1. First we start by loading the libraries, creating the graph, data, variables, and placeholders. There is an additional part to this section that is very important. After we create the data, we will split the data into training and testing datasets randomly.

This is important because we should always test whether our models predict well on data they were not trained on. Evaluating the model on both the training data and the test data also lets us see whether the model is overfitting:

import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf
sess = tf.Session()
x_vals = np.random.normal(1, 0.1, 100)
y_vals = np.repeat(10., 100)
x_data = tf.placeholder(shape=[None, 1], dtype=tf.float32)
y_target = tf.placeholder(shape=[None, 1], dtype=tf.float32)
batch_size = 25
# Randomly assign 80% of the points to training and the rest to testing
train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]
A = tf.Variable(tf.random_normal(shape=[1,1]))
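The same 80/20 split can also be written as a small reusable helper. The sketch below is one possible variant, not the recipe's own code: the function name train_test_split_indices and the seed argument are assumptions, and it shuffles with a permutation instead of taking a set difference.

```python
import numpy as np

def train_test_split_indices(n_samples, train_fraction=0.8, seed=0):
    """Return disjoint, shuffled index arrays for training and testing."""
    rng = np.random.default_rng(seed)
    shuffled = rng.permutation(n_samples)
    n_train = int(round(n_samples * train_fraction))
    return shuffled[:n_train], shuffled[n_train:]

train_idx, test_idx = train_test_split_indices(100)

# Every index lands in exactly one of the two sets.
overlap = set(train_idx) & set(test_idx)
print(len(train_idx), len(test_idx), len(overlap))
```

Fixing the seed makes the split repeatable, which helps when comparing evaluation results between runs.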

2. Now we declare our model, loss function, and optimization algorithm. We will also initialize the model variable A. Use the following code:

my_output = tf.matmul(x_data, A)
loss = tf.reduce_mean(tf.square(my_output - y_target))
init = tf.initialize_all_variables()
sess.run(init)
my_opt = tf.train.GradientDescentOptimizer(0.02)
train_step = my_opt.minimize(loss)

3. We run the training loop just as we would before, as follows:

for i in range(100):
    # Sample a random batch from the training set
    rand_index = np.random.choice(len(x_vals_train), size=batch_size)
    rand_x = np.transpose([x_vals_train[rand_index]])
    rand_y = np.transpose([y_vals_train[rand_index]])
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
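To see what evaluating during training can look like, here is a plain NumPy sketch of the same mini-batch gradient descent that reports the batch loss every 25 steps. It is a stand-in written for illustration, not the TensorFlow graph above: the names A_np and loss_history, the fixed seed, and the reporting interval are all assumptions.

```python
import numpy as np

rng = np.random.default_rng(0)
x_train = rng.normal(1.0, 0.1, 80)
y_train = np.repeat(10.0, 80)

A_np = rng.normal()        # scalar stand-in for the model variable A
learning_rate = 0.02
batch_size = 25
loss_history = []

for i in range(100):
    idx = rng.choice(len(x_train), size=batch_size)
    x_b, y_b = x_train[idx], y_train[idx]
    # Gradient of mean((A*x - y)^2) with respect to A
    grad = np.mean(2.0 * (A_np * x_b - y_b) * x_b)
    A_np -= learning_rate * grad
    if (i + 1) % 25 == 0:
        # Loss on the current batch, measured after the update
        batch_loss = np.mean((A_np * x_b - y_b) ** 2)
        loss_history.append(batch_loss)
        print('Step #' + str(i + 1) + ' A = ' + str(np.round(A_np, 3))
              + ' Loss = ' + str(np.round(batch_loss, 3)))
```

Watching these values while training shows whether the loss is still falling and whether A is settling near 10, which is the kind of hint that helps when debugging a model.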

4. Now, to evaluate the model, we will output the MSE (loss function) on the training and test sets, as follows:

mse_test = sess.run(loss, feed_dict={x_data: np.transpose([x_vals_test]), y_target: np.transpose([y_vals_test])})
mse_train = sess.run(loss, feed_dict={x_data: np.transpose([x_vals_train]), y_target: np.transpose([y_vals_train])})
print('MSE on test:' + str(np.round(mse_test, 2)))
print('MSE on train:' + str(np.round(mse_train, 2)))

MSE on test:1.35
MSE on train:0.88

5. For the classification example, we will do something very similar. This time, we will need to create our own accuracy function that we can call at the end. One reason is that our loss function has the sigmoid built in, so we need to call the sigmoid separately and test whether the predicted classes are correct.

6. In the same script, we can just reset the graph and recreate our data, variables, and placeholders. Remember that we will also need to separate the data and targets into training and testing sets. Use the following code:

from tensorflow.python.framework import ops
ops.reset_default_graph()
sess = tf.Session()
batch_size = 25
x_vals = np.concatenate((np.random.normal(-1, 1, 50), np.random.normal(2, 1, 50)))
y_vals = np.concatenate((np.repeat(0., 50), np.repeat(1., 50)))
x_data = tf.placeholder(shape=[1, None], dtype=tf.float32)
y_target = tf.placeholder(shape=[1, None], dtype=tf.float32)
train_indices = np.random.choice(len(x_vals), round(len(x_vals)*0.8), replace=False)
test_indices = np.array(list(set(range(len(x_vals))) - set(train_indices)))
# Split the data and targets, as in the regression example
x_vals_train = x_vals[train_indices]
x_vals_test = x_vals[test_indices]
y_vals_train = y_vals[train_indices]
y_vals_test = y_vals[test_indices]

7. We will now add the model and the loss function to the graph, initialize variables, and create the optimization procedure, as follows:

my_output = tf.add(x_data, A)
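The loss for this kind of model applies the sigmoid internally, as with sigmoid_cross_entropy_with_logits(). The following NumPy sketch shows what such a loss computes for each data point, using the numerically stable form max(x, 0) - x*z + log(1 + exp(-|x|)) for a logit x and a label z. The function name and the sample arrays are hypothetical, and this is an illustration rather than TensorFlow's implementation.

```python
import numpy as np

def sigmoid_cross_entropy_from_logits(logits, labels):
    """Element-wise cross entropy between sigmoid(logits) and labels."""
    logits = np.asarray(logits, dtype=float)
    labels = np.asarray(labels, dtype=float)
    # Stable rewrite of -z*log(sigmoid(x)) - (1 - z)*log(1 - sigmoid(x))
    return (np.maximum(logits, 0) - logits * labels
            + np.log1p(np.exp(-np.abs(logits))))

logits_demo = np.array([-3.0, 0.0, 2.5])   # made-up model outputs
labels_demo = np.array([0.0, 1.0, 1.0])    # made-up targets

stable = sigmoid_cross_entropy_from_logits(logits_demo, labels_demo)

# Direct formula for comparison; fine for moderate logits like these.
p = 1.0 / (1.0 + np.exp(-logits_demo))
naive = -labels_demo * np.log(p) - (1 - labels_demo) * np.log(1 - p)
print(stable, naive)
```

Because the sigmoid lives inside the loss, the model output itself stays a raw logit, which is why the prediction step later has to apply the sigmoid again.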

8. Now we run our training loop, as follows:

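for i in range(1800):
    rand_index = np.random.choice(len(x_vals_train), size=batch_size)
    rand_x = [x_vals_train[rand_index]]
    rand_y = [y_vals_train[rand_index]]
    sess.run(train_step, feed_dict={x_data: rand_x, y_target: rand_y})
    # Report the current value of A and the batch loss every 200 steps
    if (i+1)%200==0:
        print('Step #' + str(i+1) + ' A = ' + str(sess.run(A)))
        print('Loss = ' + str(sess.run(loss, feed_dict={x_data: rand_x, y_target: rand_y})))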

Step #800 A = [-0.20045301]

Loss = 0.241349

Step #1000 A = [-0.33634067]

Loss = 0.376786

Step #1200 A = [-0.36866501]

Loss = 0.271654

Step #1400 A = [-0.3727718]

Loss = 0.294866

Step #1600 A = [-0.39153299]

Loss = 0.202275

Step #1800 A = [-0.36630616]

Loss = 0.358463

9. To evaluate the model, we will create our own prediction operation. We wrap the prediction operation in a squeeze function so that the predictions and targets have compatible shapes. Then we test for equality with the equal function. This leaves us with a tensor of true and false values, which we cast to float32 and average to obtain an accuracy value. We will evaluate this function for both the training and testing sets, as follows:

y_prediction = tf.squeeze(tf.round(tf.nn.sigmoid(tf.add(x_data, A))))
correct_prediction = tf.equal(y_prediction, y_target)
accuracy = tf.reduce_mean(tf.cast(correct_prediction, tf.float32))
acc_value_test = sess.run(accuracy, feed_dict={x_data: [x_vals_test], y_target: [y_vals_test]})
acc_value_train = sess.run(accuracy, feed_dict={x_data: [x_vals_train], y_target: [y_vals_train]})
print('Accuracy on train set: ' + str(acc_value_train))
print('Accuracy on test set: ' + str(acc_value_test))

Accuracy on train set: 0.925
Accuracy on test set: 0.95
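Shapes matter in this comparison because an equality test broadcasts its inputs. The NumPy sketch below, with made-up arrays, shows how a row of predictions compared against a column of targets silently becomes a full matrix comparison and yields a misleading accuracy. NumPy is used here only as an illustration of the broadcasting behavior.

```python
import numpy as np

preds = np.array([[0.0, 1.0, 1.0, 0.0]])        # shape (1, 4)
targets_flat = np.array([0.0, 1.0, 0.0, 0.0])   # shape (4,)
targets_col = targets_flat.reshape(-1, 1)       # shape (4, 1)

# A (1, 4) row against a (4, 1) column broadcasts to a (4, 4) matrix,
# so every prediction is compared with every target.
wrong_matrix = preds == targets_col
wrong_accuracy = np.mean(wrong_matrix)

# Squeezing the predictions to shape (4,) compares element by element.
right_accuracy = np.mean(np.squeeze(preds) == targets_flat)
print(wrong_matrix.shape, wrong_accuracy, right_accuracy)
```

Three of the four predictions match their targets, so the element-wise accuracy is 0.75, while the broadcast comparison reports 0.5.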

10. Many times, seeing the model results (accuracy, MSE, and so on) will help us to evaluate the model. We can easily graph the model and data here because it is one-dimensional. Here is how to visualize the model and data with two separate histograms using matplotlib:

A_result = sess.run(A)
bins = np.linspace(-5, 5, 50)
plt.hist(x_vals[0:50], bins, alpha=0.5, label='N(-1,1)', color='white')
plt.hist(x_vals[50:100], bins[0:50], alpha=0.5, label='N(2,1)', color='red')
plt.plot((A_result, A_result), (0, 8), 'k--', linewidth=3, label='A = '+ str(np.round(A_result, 2)))
plt.legend(loc='upper right')
# Report the test-set accuracy in the title
plt.title('Binary Classifier, Accuracy=' + str(np.round(acc_value_test, 2)))
plt.show()

Figure 8: Visualization of data and the end model, A. The two normal distributions are centered at -1 and 2, making the theoretical best split at 0.5. The model predicts class 1 when x + A > 0, so it splits at x = -A, which here is close to that number.
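The value of 0.5 follows from the two distributions having the same spread and the same number of points, so the best split is halfway between their means. The sketch below checks this with a hand-written normal density and compares it with the split implied by the last value of A printed during training. The helper normal_pdf and the variable names are assumptions made for illustration.

```python
import numpy as np

def normal_pdf(x, mean, std=1.0):
    """Density of a normal distribution with the given mean and std."""
    return np.exp(-0.5 * ((x - mean) / std) ** 2) / (std * np.sqrt(2 * np.pi))

mean_class0, mean_class1 = -1.0, 2.0

# With equal spread and equal class sizes, the best split is the midpoint.
best_split = (mean_class0 + mean_class1) / 2.0

# At the best split, a point is equally likely under either class.
density0 = normal_pdf(best_split, mean_class0)
density1 = normal_pdf(best_split, mean_class1)

# sigmoid(x + A) crosses 0.5 where x + A = 0, that is, at x = -A.
A_value = -0.36630616          # A printed at step 1800 in the training output
model_split = -A_value
print(best_split, density0, density1, model_split)
```

The learned split of about 0.37 is not exactly 0.5, which is expected for a model trained on only 80 noisy points with stochastic batches.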
