Date: Fri, 24 Mar 2017 15:00:00 +0000
<h2>Feed Forward Neural Networks</h2> <p>In a feed forward neural network, neurons cannot form a cycle. In this episode, we explore how such a network would be able to represent three common logical operators: OR, AND, and XOR. The XOR operation is the interesting case.</p> <p>Below are the truth tables that describe each of these functions.</p> <h3>AND Truth Table</h3> <table> <thead> <tr> <th align="center">Input 1</th> <th align="center">Input 2</th> <th align="center">Output</th> </tr> </thead> <tbody> <tr> <td align="center">0</td> <td align="center">0</td> <td align="center">0</td> </tr> <tr> <td align="center">0</td> <td align="center">1</td> <td align="center">0</td> </tr> <tr> <td align="center">1</td> <td align="center">0</td> <td align="center">0</td> </tr> <tr> <td align="center">1</td> <td align="center">1</td> <td align="center">1</td> </tr> </tbody> </table> <h3>OR Truth Table</h3> <table> <thead> <tr> <th align="center">Input 1</th> <th align="center">Input 2</th> <th align="center">Output</th> </tr> </thead> <tbody> <tr> <td align="center">0</td> <td align="center">0</td> <td align="center">0</td> </tr> <tr> <td align="center">0</td> <td align="center">1</td> <td align="center">1</td> </tr> <tr> <td align="center">1</td> <td align="center">0</td> <td align="center">1</td> </tr> <tr> <td align="center">1</td> <td align="center">1</td> <td align="center">1</td> </tr> </tbody> </table> <h3>XOR Truth Table</h3> <table> <thead> <tr> <th align="center">Input 1</th> <th align="center">Input 2</th> <th align="center">Output</th> </tr> </thead> <tbody> <tr> <td align="center">0</td> <td align="center">0</td> <td align="center">0</td> </tr> <tr> <td align="center">0</td> <td align="center">1</td> <td align="center">1</td> </tr> <tr> <td align="center">1</td> <td align="center">0</td> <td align="center">1</td> </tr> <tr> <td align="center">1</td> <td align="center">1</td> <td align="center">0</td> </tr> </tbody> </table> <p>The AND and OR functions should 
seem very intuitive. Exclusive or (XOR) is true if and only if exactly one input is 1. Could a neural network learn these mathematical functions?</p> <p>Let's consider the perceptron described below. First we see the visual representation, then the activation function <img alt="A" src="http://s3.amazonaws.com/dataskeptic.com/latex/25e26f99186ea1ba5b80cec1e7917d8896843465.svg" />, followed by the formula for calculating the output.</p> <p> </p> <center> <p> </p> <p><img alt="" src="https://s3.amazonaws.com/dataskeptic-static/img/2017/img1.svg" /></p> <p><img alt="" src="https://s3.amazonaws.com/dataskeptic-static/img/2017/activation_func.svg" /></p> <p><img alt="Output = A(w_0 \cdot Bias + w_1 \cdot Input_1 + w_2 \cdot Input_2)" src="http://s3.amazonaws.com/dev.dataskeptic.com/latex/10136c2573d6998d27536bd440338884c46265f2.svg" /></p> <p> </p> </center> <p> </p> <p>Can this perceptron learn the AND function?</p> <p>Sure. Let <img alt="w_0 = -0.6" src="http://s3.amazonaws.com/dataskeptic.com/latex/848c86ee7e8564a3b9a3ea0dd7861275779496bc.svg" /> and <img alt="w_1 = w_2 = 0.5" src="http://s3.amazonaws.com/dataskeptic.com/latex/b2a197856413cf77c51f6f44871688724812132e.svg" />.</p> <p>What about OR?</p> <p>Yup. Let <img alt="w_0 = 0" src="http://s3.amazonaws.com/dataskeptic.com/latex/4c4aaedffae22cf314c59d1a0202653693901987.svg" /> and <img alt="w_1 = w_2 = 0.5" src="http://s3.amazonaws.com/dataskeptic.com/latex/b2a197856413cf77c51f6f44871688724812132e.svg" />.</p> <p>An infinite number of possible solutions exist; I just picked values that hopefully seem intuitive. This is also a good example of why the bias term is important. Without it, the AND function could not be represented.</p> <p>How about XOR?</p> <p>No. It is not possible to represent XOR with a single layer. It requires two layers. 
The image below shows how it could be done with two layers.</p> <p> </p> <center><img alt="" src="https://dataskeptic.com/blog/episodes/2017/src-feed-forward-neural-networks/XOR_perceptron_net.png" /></center> <p> </p> <p>In the above example, the weights computed for the middle hidden node capture the essence of why this works. This node activates when receiving two positive inputs, thus contributing a heavy negative penalty to the sum computed by the output node. If only one input is 1, this node will not activate.</p> <p>The universal approximation theorem tells us that any continuous function can be approximated arbitrarily well by a neural network with only a single hidden layer and a finite number of neurons. With this in mind, a feed forward neural network should be adequate for many applications. However, in practice, other network architectures and the allowance of more hidden layers are empirically motivated.</p> <p>Other types of neural networks have less strict structural definitions. The various ways one might relax the acyclic constraint generate other classes of neural networks that often have interesting properties. We'll get into some of these in future mini-episodes.</p> <p> </p> <p><a href="https://www.periscopedata.com/skeptics"><img alt="Periscope Data" src="https://dataskeptic.com/blog/episodes/2017/src-data-provenance-and-reproducibility-with-pachyderm/periscope-data.jpg" /></a></p> <p>Check out our recent blog post on how we're using <a href="https://dataskeptic.com/blog/sponsored/2017/periscope-data-cohort-charts"> Periscope Data cohort charts</a>.</p> <p>Thanks to Periscope Data for sponsoring this episode. More about them at <a href="https://www.periscopedata.com/skeptics">periscopedata.com/skeptics</a></p>
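<p>The perceptron weights given above for AND and OR can be checked with a short Python sketch. It assumes the activation function A is a step function that outputs 1 when its argument is positive and 0 otherwise, with the bias input fixed at 1; the exact activation shown in the images above may differ.</p>

```python
def perceptron(w0, w1, w2, input1, input2):
    """Single perceptron: A(w0*Bias + w1*Input1 + w2*Input2).

    Assumes a step activation A(x) = 1 if x > 0 else 0, bias input = 1.
    """
    total = w0 * 1 + w1 * input1 + w2 * input2
    return 1 if total > 0 else 0

inputs = [(0, 0), (0, 1), (1, 0), (1, 1)]

# AND with w0 = -0.6, w1 = w2 = 0.5: fires only when both inputs are 1.
print([perceptron(-0.6, 0.5, 0.5, i1, i2) for i1, i2 in inputs])

# OR with w0 = 0, w1 = w2 = 0.5: fires when at least one input is 1.
print([perceptron(0, 0.5, 0.5, i1, i2) for i1, i2 in inputs])
```

<p>Running the sketch reproduces the AND and OR truth table outputs for all four input pairs.</p>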
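<p>The two-layer XOR network in the image above can be sketched the same way. The weights below are my own illustrative choice, not necessarily those in the image: one hidden node acts like OR, the other acts like AND and contributes a heavy penalty at the output, exactly the mechanism described above.</p>

```python
def step(x):
    """Step activation: 1 if the weighted sum is positive, else 0."""
    return 1 if x > 0 else 0

def xor_network(input1, input2):
    # Hidden node 1: OR-like, fires when at least one input is 1.
    h1 = step(0.5 * input1 + 0.5 * input2 - 0.25)
    # Hidden node 2: AND-like, fires only when both inputs are 1.
    h2 = step(0.5 * input1 + 0.5 * input2 - 0.75)
    # Output node: the AND node's weight of -2 penalizes the (1, 1) case,
    # cancelling the OR node's contribution.
    return step(1.0 * h1 - 2.0 * h2 - 0.5)

print([xor_network(i1, i2) for i1, i2 in [(0, 0), (0, 1), (1, 0), (1, 1)]])
```

<p>With both inputs set to 1, the AND-like node fires and its -2 weight drives the output sum negative, which is why the network outputs 0 there and 1 only when exactly one input is 1.</p>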