Monday, May 8, 2017

Convolutional neural network (CNN)

A CNN is an MLP that typically uses three kinds of hidden layer: a convolution layer (similar to a sliding window, here called a filter, whose output is called a feature map), a pooling layer (for non-linear down-sampling, e.g. max pooling, which keeps the maximum-value pixel in each region), and a fully connected layer.
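The following is a minimal NumPy sketch of the two operations just described: a filter sliding over an image to produce a feature map, and 2x2 max pooling for down-sampling. The image size, filter values, and function names are illustrative assumptions, not part of any particular library.

```python
import numpy as np

def convolve2d(image, kernel):
    """Slide `kernel` over `image` (valid padding, stride 1) to get a feature map."""
    kh, kw = kernel.shape
    oh = image.shape[0] - kh + 1
    ow = image.shape[1] - kw + 1
    feature_map = np.zeros((oh, ow))
    for i in range(oh):
        for j in range(ow):
            # one "window" position: elementwise multiply and sum
            feature_map[i, j] = np.sum(image[i:i+kh, j:j+kw] * kernel)
    return feature_map

def max_pool2d(feature_map, size=2):
    """Non-linear down-sampling: keep the max value in each size x size block."""
    h, w = feature_map.shape
    h, w = h - h % size, w - w % size          # drop any ragged border
    blocks = feature_map[:h, :w].reshape(h // size, size, w // size, size)
    return blocks.max(axis=(1, 3))

image = np.random.rand(8, 8)                   # a single-channel 8x8 "image" (assumed)
kernel = np.array([[1., 0., -1.],
                   [1., 0., -1.],
                   [1., 0., -1.]])             # a simple vertical-edge filter (assumed)
fmap = convolve2d(image, kernel)               # 6x6 feature map
pooled = max_pool2d(fmap)                      # 3x3 after 2x2 max pooling
print(fmap.shape, pooled.shape)                # (6, 6) (3, 3)
```

In a real CNN the pooled feature maps are flattened and fed into the fully connected layer mentioned above.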
Traditional MLPs do not scale well to higher-resolution images. For example, in CIFAR-10, images are only of size 32x32x3 (32 wide, 32 high, 3 color channels), so a single fully connected neuron in the first hidden layer of a regular neural network would have 32*32*3 = 3,072 weights. A 200x200 image, however, would lead to neurons that each have 200*200*3 = 120,000 weights.
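A quick check of the arithmetic quoted above (each fully connected neuron sees every input value, so its weight count is width * height * channels):

```python
# Weights per fully connected neuron, as in the two examples above.
cifar10_weights = 32 * 32 * 3      # CIFAR-10 input: 3,072 weights per neuron
larger_weights = 200 * 200 * 3     # 200x200 RGB input: 120,000 weights per neuron
print(cifar10_weights, larger_weights)   # 3072 120000
```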
Also, such a network architecture does not take the spatial structure of the data into account, treating input pixels that are far apart the same as pixels that are close together. Thus, full connectivity of neurons is wasteful for the purpose of image recognition.
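To make the "wasteful" point concrete, here is an illustrative parameter comparison under assumed sizes (100 dense hidden neurons; 32 filters of size 5x5): a fully connected layer grows with the image, while a convolution layer reuses the same small filter at every position.

```python
# Assumed sizes for illustration only.
h, w, c = 200, 200, 3                       # 200x200 RGB image
hidden_neurons = 100                        # assumed dense hidden layer size
dense_params = (h * w * c) * hidden_neurons + hidden_neurons   # 12,000,100

num_filters, k = 32, 5                      # assumed: 32 filters of size 5x5
conv_params = num_filters * (k * k * c + 1)                    # 2,432
# The convolution layer's parameter count does not depend on the image size,
# because the same filter weights are shared across all spatial positions.
print(dense_params, conv_params)
```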