ANN: Intro
What is an artificial neural network?
At this point, a great deal of us have an idea of what an artificial neural network is, or have at least heard about it. It's a model (loosely modeled after our brain's neural network) made up of nodes (organized in layers) and edges (each with an associated weight), with an input layer and an output layer, that finds patterns in our data to help us solve a problem or make a prediction.
ANNs learn from feedback. The model starts with no prior knowledge; instead, it is rewarded or penalized based on how well it does (similar in spirit to reinforcement learning, although most ANNs are trained in a supervised way, like the perceptron below). Think about when you first learned how to ride a bike. You got on the bike and fell - there is a penalty associated with that. You now have more information the next time you get on the bike. Good behavior is reinforced (i.e. you don't die if you can ride a bike well) and bad behavior is penalized (i.e. you have a cut on your leg for the rest of your life from falling off the bike, so you'd better learn faster!!). The model also needs inputs and a goal. The inputs are our variables, and the goal is to minimize the error (or learn how to ride the damn bike), for example.
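To make the "inputs and a goal" idea concrete, here is a minimal, made-up sketch of learning by error feedback: one weight is repeatedly nudged in the direction that shrinks the error of its predictions (the data, the learning rate, and the squared-error goal are all invented purely for illustration):

import numpy as np

# Toy data: inputs (our variables) and the targets we want to predict.
# These numbers are made up purely for illustration.
X = np.array([1.0, 2.0, 3.0, 4.0])
y = np.array([2.0, 4.0, 6.0, 8.0])  # true relationship: y = 2 * x

w = 0.0              # start with no prior knowledge
learning_rate = 0.05 # how big a correction we make after each "fall"

for step in range(100):
    predictions = w * X
    error = predictions - y            # how badly we "fell"
    gradient = 2 * np.mean(error * X)  # direction that reduces the squared error
    w -= learning_rate * gradient      # the penalty nudges w toward better behavior

print(round(w, 3))  # approaches 2.0 as the error shrinks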
At a high level: the deeper the ANN, the more complex the patterns it can recognize. The first layer picks up simple patterns, and each additional layer combines those into increasingly complex ones. A network with many layers is referred to as a "deep" network.
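As a rough sketch of what "adding layers" means computationally, here is a toy stacked network in NumPy (the layer sizes, random weights, and ReLU nonlinearity are arbitrary choices for illustration, not part of any particular architecture):

import numpy as np

rng = np.random.default_rng(0)

def layer(inputs, weights, bias):
    # One layer: weighted sum followed by a nonlinearity (ReLU here).
    return np.maximum(0, inputs @ weights + bias)

x = rng.normal(size=(1, 8))  # a single example with 8 input features

# A "deep" network is just this transformation applied repeatedly:
# each layer builds on the patterns picked out by the previous one.
h1 = layer(x,  rng.normal(size=(8, 16)), np.zeros(16))   # simple patterns
h2 = layer(h1, rng.normal(size=(16, 16)), np.zeros(16))  # combinations of those patterns
out = h2 @ rng.normal(size=(16, 1))                      # final output layer

print(out.shape)  # (1, 1)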
Artificial Neural Networks vs Biological Neural Networks
Our brain contains tens of billions of neurons and tens of trillions of connections, called synapses. Each neuron receives "input" through its dendrites. The input travels through the cell body, which instructs the neuron to "fire" or not, and then produces output signals along its axon, whose terminals connect to the dendrites of other neurons, becoming input for the next neuron.
Biological neural networks also learn from feedback. Our brain strengthens or weakens these connections based on the outputs that certain inputs produce, and in turn "learns".
See below a comparison of the structures of artificial and biological neural networks.
What is a perceptron?
A perceptron is the building block and the most basic form of a neural network (a neural network is sometimes referred to as a multi-layer perceptron). The perceptron algorithm is a supervised learning algorithm for binary classifiers. It can be used for two-class classification as well as for multi-class classification (by combining multiple perceptrons). To understand how a neural network operates, it's important to understand how a perceptron works.
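As a rough preview, here is a minimal sketch of the classic perceptron learning rule on a tiny made-up two-class dataset (the data and learning rate are invented purely for illustration; the pieces it uses, the weighted sum and the step activation, are unpacked below):

import numpy as np

# Tiny made-up, linearly separable dataset: two features per example,
# with binary labels 0 or 1.
X = np.array([[2.0, 1.0], [1.0, 3.0], [-1.0, -2.0], [-2.0, 1.0]])
y = np.array([1, 1, 0, 0])

w = np.zeros(2)   # one weight per input feature, no prior knowledge
b = 0.0           # bias
lr = 0.1          # learning rate

for epoch in range(20):
    for xi, target in zip(X, y):
        prediction = 1 if xi @ w + b > 0 else 0   # weighted sum + step activation
        update = lr * (target - prediction)       # classic perceptron update rule
        w += update * xi                          # only changes on a misclassification
        b += update

print(w, b)  # a separating line for this toy data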
A perceptron is made up of nodes, edges, a summation function and an activation function.
This is what it looks like:
The inputs x, together with a constant, feed into the network. Each one is weighted (the bias is the weight on the constant). The weighted sum is then calculated (this is also referred to as the transfer function) and put through an activation function, which transforms it into the desired output. In the example above, the activation function is a step function, which returns 1 for positive values and 0 for negative values (more on activation functions later).
The first weight (weight 0) in the diagram above is the bias, the weight on the constant input; it lets you shift the output of the activation function.
The rest of the weights correspond to the inputs in the input matrix X.
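Putting the pieces together, here is a minimal sketch of that forward pass in NumPy, with the constant input prepended so that weight 0 plays the role of the bias (the weights and inputs are made-up numbers for illustration):

import numpy as np

def step(value):
    # Step activation: 1 for positive values, 0 otherwise.
    return 1 if value > 0 else 0

def perceptron_output(x, weights):
    # Prepend the constant input 1, so weights[0] acts as the bias.
    inputs = np.concatenate(([1.0], x))
    weighted_sum = inputs @ weights   # the transfer function
    return step(weighted_sum)

# Made-up weights: bias (weight 0) first, then one weight per input.
weights = np.array([-0.5, 0.8, 0.2])

print(perceptron_output(np.array([1.0, 1.0]), weights))  # 1: weighted sum is 0.5
print(perceptron_output(np.array([0.0, 0.0]), weights))  # 0: weighted sum is -0.5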
Next post: more on perceptrons and examples using classification problems.