Goals
- Work on online machine learning.
- Work on neural network models using TensorFlow.
- Finish the recommender system and write the project report.
Scoring
Every exercise has an associated difficulty level. Easy and medium exercises help you understand the fundamentals and give you ideas for the difficult ones. It is highly recommended that you finish the easy and medium exercises to get a good score. The difficulty scale below is marked on every exercise:
- ★: Easy
- ★★: Medium
- ★★★: Difficult
Guidelines
- To get complete guidance from the mentors, it is highly recommended that you work on today's practical session and not on the preceding ones.
- Make sure that you name your submissions correctly. Double-check your submissions.
- Please check the references.
- There are several ways to achieve a task, so many solutions are possible. However, try to make maximum use of the libraries suggested for the exercises.
Installation
Please refer to the installation page.
Exercise 4.1 ★
During practical session 3, we split our data into training data and test data to build prediction models, and we fed the complete training data to our classifier in one go. In real life, however, new training data may arrive over time. Study the following code, which uses a perceptron, and compare it with the code of exercise 3.3:
from sklearn import datasets, metrics
from sklearn.linear_model import Perceptron
import numpy as np
import matplotlib.pyplot as plot
digits = datasets.load_digits()
training_size = int(digits.images.shape[0]/2)
training_images = digits.images[0:training_size]
training_images = training_images.reshape((training_images.shape[0], -1))
training_target = digits.target[0:training_size]
classifier = Perceptron(max_iter=1000)
# training: feed the classifier one sample at a time
for i in range(training_size):
    training_data = np.array(training_images[i])
    training_data = training_data.reshape(1, -1)
    classifier.partial_fit(training_data, [training_target[i]], classes=np.unique(digits.target))
#prediction
# start at training_size so that no sample is skipped between training and test data
predict_images = digits.images[training_size:]
actual_labels = digits.target[training_size:]
predicted_labels = classifier.predict(predict_images.reshape((predict_images.shape[0], -1)))
#classification report
print(metrics.classification_report(actual_labels,predicted_labels))
This approach is called online machine learning, or incremental learning (in French, algorithme d'apprentissage incrémental). Did you get good precision?
Next, modify the above program to test online training with MLPClassifier.
Try modifying (reducing and increasing) the training data size. What are your observations?
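To make the effect of the training data size easier to observe, here is a minimal sketch that trains the perceptron incrementally on the first n samples and reports the accuracy on the second half of the dataset. The helper name online_accuracy, the batch size of 50, the sample counts and random_state=0 are assumptions chosen for illustration, not part of the exercise.

```python
from sklearn import datasets, metrics
from sklearn.linear_model import Perceptron
import numpy as np

digits = datasets.load_digits()
# Flatten each 8x8 image into a 64-value feature vector.
images = digits.images.reshape((digits.images.shape[0], -1))
labels = digits.target
all_classes = np.unique(labels)

# Keep the second half of the data for testing, as in the program above.
test_start = images.shape[0] // 2
test_images, test_labels = images[test_start:], labels[test_start:]

def online_accuracy(n_samples, batch_size=50):
    """Train incrementally on the first n_samples images; return the test accuracy."""
    classifier = Perceptron(random_state=0)  # fixed seed (assumption) for repeatable runs
    for start in range(0, n_samples, batch_size):
        stop = min(start + batch_size, n_samples)
        # partial_fit updates the model with this batch only.
        classifier.partial_fit(images[start:stop], labels[start:stop], classes=all_classes)
    predicted = classifier.predict(test_images)
    return metrics.accuracy_score(test_labels, predicted)

# Illustrative training sizes; try your own values.
for n in (100, 300, 600, test_start):
    print(n, round(online_accuracy(n), 3))
```

Feeding small batches instead of single samples keeps the idea of online training while reducing the number of calls to partial_fit. The same loop structure can be reused with any scikit-learn estimator that provides partial_fit.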
Exercise 4.2 ★★
Your final exercise is to use TensorFlow. We will use a Deep Neural Network (DNN) classifier with two hidden layers. Before starting, please refer to the installation page to install TensorFlow. Note that the code below relies on the tf.contrib.learn API, which is available only in TensorFlow 1.x (tf.contrib was removed in TensorFlow 2.0).
Recall that we have already used the multilayer perceptron (MLP), a type of deep neural network, in the preceding practical session. We will first predict the digit shown in a single image using DNNClassifier.
import tensorflow as tf
from sklearn import datasets
import matplotlib.pyplot as plot
digits = datasets.load_digits()
training_size = int(digits.images.shape[0]/2)
training_images = digits.images[0:training_size]
training_images = training_images.reshape((training_images.shape[0], -1))
training_target = digits.target[0:training_size]
classifier = tf.contrib.learn.DNNClassifier(
    feature_columns=[tf.contrib.layers.real_valued_column("", dtype=tf.float64)],
    # 2 hidden layers of 50 nodes each
    hidden_units=[50, 50],
    # 10 classes: 0, 1, 2...9
    n_classes=10)
#training
classifier.fit(training_images, training_target, steps=100)
#prediction
# start at training_size so that no sample is skipped after the training data
predict_images = digits.images[training_size:]
predict = classifier.predict(predict_images[16].reshape(1,-1))
print(list(predict))
plot.imshow(predict_images[16], cmap=plot.cm.gray_r)
plot.show()
Did it work? Let's now try to get the accuracy of our model. Will it work for our entire test data?
import tensorflow as tf
from sklearn import datasets
import matplotlib.pyplot as plot
digits = datasets.load_digits()
training_size = int(digits.images.shape[0]/2)
training_images = digits.images[0:training_size]
training_images = training_images.reshape((training_images.shape[0], -1))
training_target = digits.target[0:training_size]
classifier = tf.contrib.learn.DNNClassifier(
    feature_columns=[tf.contrib.layers.real_valued_column("", dtype=tf.float64)],
    # 2 hidden layers of 50 nodes each
    hidden_units=[50, 50],
    # 10 classes: 0, 1, 2...9
    n_classes=10)
#training
classifier.fit(training_images, training_target, steps=100)
#prediction
# start at training_size so that no sample is skipped between training and test data
predict_images = digits.images[training_size:]
actual_labels = digits.target[training_size:]
evaluation = classifier.evaluate(x=predict_images.reshape((predict_images.shape[0], -1)), y=actual_labels)
print(evaluation['accuracy'])
What accuracy did you get? Now change the number of neurons in each layer (currently 50 each). Also try increasing the number of hidden layers. Did your accuracy improve?
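If you are working with TensorFlow 2.x, where tf.contrib no longer exists, the same experiment can be sketched with tf.keras. This is only one possible equivalent, not the DNNClassifier used above: the helper name build_model, the ReLU activations, the Adam optimizer, the 20 epochs and the list of layer configurations are all assumptions made for illustration.

```python
import tensorflow as tf
from sklearn import datasets

digits = datasets.load_digits()
# Flatten each 8x8 image into a 64-value feature vector.
images = digits.images.reshape((digits.images.shape[0], -1)).astype("float32")
labels = digits.target
training_size = images.shape[0] // 2

def build_model(hidden_units):
    """Build a dense classifier with one hidden layer per entry in hidden_units."""
    model = tf.keras.Sequential()
    model.add(tf.keras.Input(shape=(images.shape[1],)))
    for units in hidden_units:
        model.add(tf.keras.layers.Dense(units, activation="relu"))
    # 10 output classes: 0, 1, 2...9
    model.add(tf.keras.layers.Dense(10, activation="softmax"))
    model.compile(optimizer="adam",
                  loss="sparse_categorical_crossentropy",
                  metrics=["accuracy"])
    return model

# Compare a few layer configurations (illustrative values only).
for hidden_units in ([50, 50], [100, 100], [50, 50, 50]):
    model = build_model(hidden_units)
    model.fit(images[:training_size], labels[:training_size], epochs=20, verbose=0)
    loss, accuracy = model.evaluate(images[training_size:], labels[training_size:], verbose=0)
    print(hidden_units, round(accuracy, 3))
```

Because the weights are initialized randomly, the accuracies vary from run to run. Repeat each configuration a few times before concluding that one is better than another.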
Exercise 4.3 ★★★
Project: Image recommender system: 3 practical sessions
Recall that the goal of this project is to recommend images based on the color preferences of the user.
If required, you can refer to the example Python code on the references page to resize images and to read and write JSON files.
Please prepare a 3-page project report (in French or English) detailing the following:
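As a reminder of how per-image information can be written to and read from a JSON file, here is a minimal sketch using only the Python standard library. The file names, the field names (file, width, height, dominant_colors) and the helper names save_records and load_records are hypothetical; choose the information that suits your own project.

```python
import json
import os
import tempfile

# Hypothetical metadata for two images; the fields are illustrative only.
image_records = [
    {"file": "image_001.jpg", "width": 640, "height": 480,
     "dominant_colors": ["#1e90ff", "#ffffff"]},
    {"file": "image_002.jpg", "width": 800, "height": 600,
     "dominant_colors": ["#228b22", "#8b4513"]},
]

def save_records(records, path):
    """Write the list of image records to a JSON file."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(records, handle, indent=2)

def load_records(path):
    """Read the list of image records back from a JSON file."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)

# Store the file in the system temporary directory for this demonstration.
path = os.path.join(tempfile.gettempdir(), "image_records.json")
save_records(image_records, path)
print(load_records(path) == image_records)
```

Keeping one record per image makes it easy to add new fields later, such as user labels, without changing the file layout.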
- Goal of your project
- Data sources of your training images and their licences. Did you use labeled data sources? Did you ask your user to label images?
- Machine learning models that you tested and used as well as their precision.
- Size of your training data and test data.
- Did you use online machine learning?
- Information that you decided to store for each image.
- Information concerning user preferences
- Self-evaluation of your work.
- Remarks concerning the practical sessions, exercises and scope for improvement.
- Conclusion
Note: Please do not add any program (or code) in this report.
Submission
- Rename your notebook as Name1_Name2_[Name3].ipynb, where Name1, Name2 (and Name3, if applicable) are your names.
- Rename your project report as Name1_Name2_[Name3].pdf, where Name1, Name2 (and Name3, if applicable) are your names.
- Submit your notebook and Project report online.
- Please do not submit your images, JSON, TSV and CSV files.