Problems:
- Semantic Gap: There’s a huge gap between the semantic idea of a cat, and these pixel values that the computer is actually seeing.
- Viewpoint variation: All pixels change when the camera moves
- Illumination: Lighting conditions in the scene can vary drastically.
- Deformation: Cats can assume a lot of different, varied poses and positions.
- Occlusion: You might only see a part of a cat.
- Background Clutter: The cat in the foreground can look very similar in appearance to the background.
- Intraclass variation: Cats can come in different shapes and sizes and colors and ages
An image classifier
|
|
No obvious way to hard-code an algorithm for recognizing a cat, or other classes.
Data-Driven Approach
- Collect a dataset of images and labels
- Use Machine Learning to train a classifier
- Evaluate the classifier on new images
|
|
Rather than a single function that inputs an image and recognizes a cat, we have two functions. One called train, which inputs images and labels and outputs a model; another called predict, which inputs the model and makes predictions for new images.
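A minimal sketch of this two-function interface. The "learner" here is a deliberately trivial majority-class baseline (an assumption for illustration, not a real classifier): the model simply remembers the most common training label.

```python
from collections import Counter

def train(images, labels):
    # The "model" is whatever the learning algorithm produces from data.
    # Toy baseline: just remember the most common training label.
    most_common = Counter(labels).most_common(1)[0][0]
    return {"majority_label": most_common}

def predict(model, test_images):
    # Use the model to produce a label for each test image.
    # The baseline predicts the same label for everything.
    return [model["majority_label"] for _ in test_images]
```

Any real classifier (nearest neighbor, linear, neural network) fits this same train/predict shape; only what is stored in the model changes.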
#Nearest Neighbor classifier
|
|
Q: With N examples, how fast are training and prediction?
A: Train O(1), predict O(N)
This is bad: we want classifiers that are fast at prediction; slow training is OK.
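A sketch of the Nearest Neighbor classifier on raw pixels, assuming images are flattened into rows of a NumPy array and using L1 (Manhattan) distance. It makes the complexity asymmetry concrete: train is O(1) memorization, predict is O(N) per test image.

```python
import numpy as np

class NearestNeighbor:
    def train(self, X, y):
        # O(1): simply memorize all the training data.
        self.Xtr = X
        self.ytr = y

    def predict(self, X):
        # O(N) per test image: compare against every training example.
        preds = np.empty(X.shape[0], dtype=self.ytr.dtype)
        for i in range(X.shape[0]):
            # L1 distance between this test image and all training images.
            dists = np.abs(self.Xtr - X[i]).sum(axis=1)
            # Copy the label of the closest training example.
            preds[i] = self.ytr[np.argmin(dists)]
        return preds
```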
k-Nearest Neighbors
Instead of copying the label from the single nearest neighbor, take a majority vote from the K closest points.
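The majority-vote step can be sketched as follows, again assuming flattened image rows and L1 distance:

```python
import numpy as np
from collections import Counter

def knn_predict(Xtr, ytr, x, k=3):
    # L1 distances from the query x to every training point.
    dists = np.abs(Xtr - x).sum(axis=1)
    # Labels of the k closest training points.
    nearest = ytr[np.argsort(dists)[:k]]
    # Majority vote among the k neighbors.
    return Counter(nearest.tolist()).most_common(1)[0][0]
```

With k = 1 this reduces to the plain Nearest Neighbor classifier; larger k smooths the decision boundaries and makes predictions more robust to noisy labels.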
###Hyperparameters
- What is the best value of k to use?
- What is the best distance to use?
These are hyperparameters: choices about the algorithm that we set rather than learn
_Very problem-dependent._
_Must try them all out and see what works best._
Setting Hyperparameters
- Split data into train, val, and test; choose hyperparameters on val and evaluate on test
- Cross-Validation: Split data into folds, try each fold as validation and average the results. Useful for small datasets, but not used too frequently in deep learning.
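A sketch of k-fold cross-validation for choosing k. The `evaluate` callback is an assumption here: it stands in for "train a k-NN classifier on the training folds and return accuracy on the validation fold".

```python
import numpy as np

def cross_validate(X, y, k_values, num_folds=5, evaluate=None):
    # Split the data into num_folds roughly equal folds.
    folds_X = np.array_split(X, num_folds)
    folds_y = np.array_split(y, num_folds)
    results = {}
    for k in k_values:
        accs = []
        for i in range(num_folds):
            # Fold i is validation; the rest are training.
            X_val, y_val = folds_X[i], folds_y[i]
            X_tr = np.concatenate(folds_X[:i] + folds_X[i + 1:])
            y_tr = np.concatenate(folds_y[:i] + folds_y[i + 1:])
            accs.append(evaluate(X_tr, y_tr, X_val, y_val, k))
        # Average validation accuracy across folds for this k.
        results[k] = float(np.mean(accs))
    return results
```

Pick the k with the best averaged validation accuracy, then run on the test set exactly once.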
k-Nearest Neighbor on images is never used in practice:
- Very slow at test time
- Distance metrics on pixels are not informative
- Curse of dimensionality
k-Nearest Neighbors: Summary
- In Image classification we start with a training set of images and labels, and must predict labels on the test set
- The K-Nearest Neighbors classifier predicts labels based on the nearest training examples
- Distance metric and K are hyperparameters
- Choose hyperparameters using the validation set; only run on the test set once at the very end!
#Linear Classification
Deep neural networks are kind of like Legos, and the linear classifier is the most basic building block of these giant networks.
f(x, W) = Wx + b
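The score function in NumPy, on toy sizes chosen for illustration (a 4-pixel flattened "image" and 3 classes; the weight values here are arbitrary, not learned):

```python
import numpy as np

x = np.array([-15., 22., -44., 56.])   # flattened image pixels (4,)
W = np.array([[0.01, -0.05, 0.10, 0.05],
              [0.70,  0.20, 0.05, 0.16],
              [0.00, -0.45, -0.20, 0.03]])  # weights, shape (3, 4)
b = np.array([0.0, 0.2, -0.3])         # per-class bias (3,)

# f(x, W) = Wx + b: one score per class; the highest score wins.
scores = W.dot(x) + b
```

Each row of W acts as a template for one class, so the matrix-vector product evaluates all class scores in a single operation. For CIFAR-10, x would be 3072-dimensional (32×32×3) and W would be 10×3072.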




