[cs231n] Lecture 1 | Introduction to Convolutional Neural Networks for Visual Recognition

Lecture 1 | Introduction to Convolutional Neural Networks for Visual Recognition (Fei-Fei Li)

Prerequisites

  • Proficiency in Python, high-level familiarity with C/C++
    All class assignments will be in Python (using numpy), and we provide a tutorial for those who aren't as familiar with Python. Some of the deep learning libraries we may look at later in the class are written in C++. If you have a lot of programming experience but in a different language (e.g. C/C++/Matlab/Javascript) you will probably be fine.
  • College Calculus, Linear Algebra (e.g. MATH 19 or 41, MATH 51)
    You should be comfortable taking derivatives and understanding matrix-vector operations and notation.
  • Basic Probability and Statistics (e.g. CS 109 or other stats course)
    You should know the basics of probabilities, Gaussian distributions, mean, standard deviation, etc.
  • Equivalent knowledge of CS229 (Machine Learning)
    We will be formulating cost functions, taking derivatives and performing optimization with gradient descent.
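
As a rough illustration of what the last prerequisite means in practice, here is a minimal sketch of gradient descent on a mean squared error cost for a one-dimensional linear fit. The function names, the toy data, the learning rate and the step count are all assumptions chosen for this example. They do not come from the lecture, and plain Python is used instead of numpy to keep the sketch dependency-free.

```python
def mse_cost(w, b, xs, ys):
    """Cost function: mean squared error of the line y = w*x + b."""
    n = len(xs)
    return sum((w * x + b - y) ** 2 for x, y in zip(xs, ys)) / n


def mse_gradient(w, b, xs, ys):
    """Partial derivatives of the cost with respect to w and b."""
    n = len(xs)
    dw = sum(2 * (w * x + b - y) * x for x, y in zip(xs, ys)) / n
    db = sum(2 * (w * x + b - y) for x, y in zip(xs, ys)) / n
    return dw, db


def gradient_descent(xs, ys, learning_rate=0.05, steps=2000):
    """Repeatedly step against the gradient, starting from w = b = 0."""
    w, b = 0.0, 0.0
    for _ in range(steps):
        dw, db = mse_gradient(w, b, xs, ys)
        w -= learning_rate * dw
        b -= learning_rate * db
    return w, b


# Toy data (hypothetical) generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]
w, b = gradient_descent(xs, ys)
print(w, b, mse_cost(w, b, xs, ys))
```

With this toy data the fitted parameters should approach w = 2 and b = 1. A learning rate that is too large would make the updates diverge, so the value used here is specific to this data.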

     

1. A brief history of computer vision

1.1 Evolution's Big Bang: the origin of vision

 

1.2 The history of vision

 

1.3 MIT Summer Vision Project (1966) - the first vision project

 

1.4 Stanford - Every object can be represented by generalized cylinders

 

1.5 Image segmentation - graph theory

 

1.6 2001 - Face detection; 2006 - Real-time face detection

 

1.7 1999 - Object recognition 

 

1.8 Spatial pyramid matching

 

1.9 First Benchmark Image Dataset - PASCAL Visual Object Challenge

 

1.10 2009 - Princeton & Stanford: ImageNet

Images are high dimensional, so machine learning models overfit easily unless they are trained on a large amount of data.

 

1.11 ImageNet Large Scale Visual Recognition Challenge: ILSVRC

The algorithm needs to output all the objects in the image.

 

1.12 ILSVRC competition

2012 AlexNet (CNN)

2014 VGGNet

2015 ResNet

 

2. cs231n overview

2.1 Focus: an important visual recognition problem - image classification

object detection

action classification

image captioning

 

2.2 CNN for object detection

152 layer ResNet

CNN breakthrough since 2012

 

2.3 CNNs were not invented overnight

1998 LeNet

2012 AlexNet

 

What changed between 1998 and 2012: GPUs (much more computation) and much more data

 

learning the image