image-classification-note

Problems:

  1. Semantic gap: There is a huge gap between the semantic idea of a cat and the raw pixel values the computer actually sees.
  2. Viewpoint variation: All pixels change when the camera moves.
  3. Illumination: Lighting conditions in the scene can vary drastically.
  4. Deformation: Cats can assume many different poses and positions.
  5. Occlusion: You might only see part of a cat.
  6. Background clutter: The foreground object (the cat) can look very similar in appearance to the background.
  7. Intraclass variation: Cats come in different shapes, sizes, colors, and ages.

An image classifier

```python
def classify_image(image):
    # Some magic here?
    return class_label
```

no obvious way to hard-code the algorithm for recognizing a cat, or other classes.

Data-Driven Approach

  1. Collect a dataset of images and labels
  2. Use Machine Learning to train a classifier
  3. Evaluate the classifier on new images
```python
def train(images, labels):
    # Machine learning!
    return model

def predict(model, test_images):
    # Use model to predict labels
    return test_labels
```

Rather than a single function that inputs an image and recognizes a cat, we have two functions: one called train, which inputs images and labels and outputs a model, and another called predict, which inputs the model and makes predictions for new images.

#Nearest Neighbor classifier

```python
import numpy as np

class NearestNeighbor:
    def __init__(self):
        pass

    def train(self, X, y):
        """ X is N x D where each row is an example. y is 1-dimensional of size N """
        # the nearest neighbor classifier simply remembers all the training data
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        """ X is N x D where each row is an example we wish to predict the label for """
        num_test = X.shape[0]
        # let's make sure that the output type matches the input type
        Ypred = np.zeros(num_test, dtype=self.ytr.dtype)

        # loop over all test rows
        for i in range(num_test):
            # find the nearest training image to the i'th test image
            # using the L1 distance (sum of absolute value differences)
            distances = np.sum(np.abs(self.Xtr - X[i, :]), axis=1)
            min_index = np.argmin(distances)  # get the index with smallest distance
            Ypred[i] = self.ytr[min_index]    # predict the label of the nearest example
        return Ypred
```
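As a sanity check, the same L1 nearest-neighbor lookup can be run inline on a made-up toy dataset (the numbers below are illustrative, not real image data):

```python
import numpy as np

# Toy training set: five 4-pixel "images" in two classes (made-up data)
Xtr = np.array([[0, 0, 0, 0],
                [1, 1, 1, 1],
                [0, 0, 1, 1],
                [9, 9, 9, 9],
                [8, 8, 9, 9]])
ytr = np.array([0, 0, 0, 1, 1])

# One test image: the nearest training row under L1 distance decides its label
x_test = np.array([0, 1, 0, 1])
distances = np.sum(np.abs(Xtr - x_test), axis=1)  # L1 distance to every training row
pred = ytr[np.argmin(distances)]
print(pred)  # -> 0
```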

Q: With N examples, how fast are training and prediction?

A: Train O(1), predict O(N)

This is bad: we want classifiers that are fast at prediction; slow training is OK.

k-Nearest Neighbors

Instead of copying the label from the nearest neighbor, take a majority vote from the K closest points.
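A minimal sketch of that voting step, assuming L1 distance and made-up toy data (the name `knn_predict` and the choice `k=3` are illustrative, not from the lecture code):

```python
import numpy as np

def knn_predict(Xtr, ytr, x, k=3):
    """Predict a label for x by majority vote among its k nearest training rows (L1 distance)."""
    distances = np.sum(np.abs(Xtr - x), axis=1)
    nearest = np.argsort(distances)[:k]       # indices of the k closest examples
    votes = ytr[nearest]
    labels, counts = np.unique(votes, return_counts=True)
    return labels[np.argmax(counts)]          # most common label wins

# Toy data: two clusters of 1-D "images"
Xtr = np.array([[0.0], [0.5], [1.0], [9.0], [9.5], [10.0]])
ytr = np.array([0, 0, 0, 1, 1, 1])
print(knn_predict(Xtr, ytr, np.array([0.7])))  # -> 0
print(knn_predict(Xtr, ytr, np.array([9.2])))  # -> 1
```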

###Hyperparameters

  • What is the best value of k to use?
  • What is the best distance to use?

These are hyperparameters: choices about the algorithm that we set rather than learn.

_Very problem-dependent._

_Must try them all out and see what works best._

Setting Hyperparameters

  • Split data into train, val, and test; choose hyperparameters on val and evaluate on test.
  • Cross-validation: Split the training data into folds, try each fold as validation, and average the results. Useful for small datasets, but not used too frequently in deep learning.
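The cross-validation recipe can be sketched in plain NumPy. The fold-splitting logic is generic; the 1-NN evaluation function and toy data below are placeholder assumptions for illustration:

```python
import numpy as np

def cross_validate(X, y, train_and_eval, num_folds=5):
    """Average validation accuracy over num_folds splits of (X, y)."""
    X_folds = np.array_split(X, num_folds)
    y_folds = np.array_split(y, num_folds)
    accuracies = []
    for i in range(num_folds):
        # fold i is validation; the remaining folds are training
        X_val, y_val = X_folds[i], y_folds[i]
        X_train = np.concatenate(X_folds[:i] + X_folds[i+1:])
        y_train = np.concatenate(y_folds[:i] + y_folds[i+1:])
        accuracies.append(train_and_eval(X_train, y_train, X_val, y_val))
    return np.mean(accuracies)

def eval_1nn(X_train, y_train, X_val, y_val):
    # placeholder evaluator: 1-nearest-neighbor accuracy under L1 distance
    preds = [y_train[np.argmin(np.sum(np.abs(X_train - x), axis=1))] for x in X_val]
    return np.mean(np.array(preds) == y_val)

# Toy data: 20 one-pixel "images", label 1 for the upper half
X = np.arange(20).reshape(20, 1).astype(float)
y = (X[:, 0] >= 10).astype(int)
print(cross_validate(X, y, eval_1nn, num_folds=5))  # -> 1.0
```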

k-Nearest Neighbor on images is never used:

  • Very slow at test time
  • Distance metrics on pixels are not informative
  • Curse of dimensionality

k-Nearest Neighbors: Summary

  • In Image classification we start with a training set of images and labels, and must predict labels on the test set
  • The K-Nearest Neighbors classifier predicts labels based on the nearest training examples
  • Distance metric and K are hyperparameters
  • Choose hyperparameters using the validation set; only run on the test set once at the very end!

#Linear Classification

Deep neural networks are kind of like Legos, and the linear classifier is the most basic building block of these giant networks.

f(x, W) = Wx + b
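In code, the linear classifier is a single matrix-vector multiply plus a bias. The shapes below follow the CIFAR-10 convention (10 classes, 32x32x3 = 3072 pixels); the random weights and image are placeholders, not a trained model:

```python
import numpy as np

num_classes, num_pixels = 10, 3072  # e.g. CIFAR-10: 10 classes, 32*32*3 pixels

rng = np.random.default_rng(0)
W = rng.standard_normal((num_classes, num_pixels)) * 0.001  # weights: 10 x 3072
b = np.zeros(num_classes)                                   # bias: one per class
x = rng.random(num_pixels)                                  # a flattened "image"

scores = W @ x + b        # f(x, W) = Wx + b -> 10 class scores
pred = np.argmax(scores)  # the highest-scoring class is the prediction
print(scores.shape, pred)
```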