The book spends a good deal of time on image analysis. The trick with image analysis is that making every pixel an input to a neural node is counterproductive. (If you have a 256x256 image, this boils down to 256x256x3 = 196,608 inputs; the 3 is for the RGB components of each color pixel. Each component is usually a numeric value between 0 and 255: <0,0,0> is black, <255,255,255> is white.) If you make every pixel an input to a neural node, this does not encode "adjacency." In other words, the way a node works is that it multiplies each input by its weight and then sums all of these weighted inputs as the input to its activation function. A neural node has a really difficult time telling which input pixels are near which other input pixels. If you're trying to identify a blade of grass or a face, you really need to know which pixels are adjacent.

The standard way to handle this is to break the image up into pieces and assign a neural node to each one of the pieces...and the pieces can overlap, too. That way you know the pixels a node is looking at are somewhere in the vicinity of each other.

"Adversarial inputs" can be generated specifically so that a neural network will give the wrong answer for them.

Some display resolutions for comparison:
- Data General Dasher D200: 560x264, grayscale (black, dim, white)
- Classic Mac: 512x342, black and white
- My 4K Roku TV: 3840x2160
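The break-the-image-into-overlapping-pieces idea described above is essentially what a convolutional layer does. Here is a minimal sketch in Python with NumPy, assuming a single-channel (grayscale) image for simplicity; the function name and the example kernel are my own illustrations, not anything from the book.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide a small kernel over the image so that each output value
    depends only on a neighborhood of adjacent pixels."""
    kh, kw = kernel.shape
    h, w = image.shape
    out = np.zeros((h - kh + 1, w - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            # An overlapping piece of the image, just like the
            # overlapping pieces described in the text.
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)
    return out

# Flattening a 256x256 RGB image gives 256*256*3 = 196,608 inputs to
# every node; a 3x3 kernel instead looks at just 9 adjacent pixels.
image = np.arange(36, dtype=float).reshape(6, 6)
edge_kernel = np.array([[-1.0, 0.0, 1.0]] * 3)  # crude vertical-edge detector
result = conv2d_valid(image, edge_kernel)
print(result.shape)  # → (4, 4)
```

Because the kernel only ever sees pixels that sit next to each other, adjacency is baked into the computation instead of being something the node has to discover from a flat list of 196,608 numbers.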