What is the definition of machine learning? The Wikipedia definition of the term is as follows:
Machine learning, a branch of artificial intelligence, concerns the construction and study of systems that can learn from data.
What does that mean? How does it work? This article is going to go over both of these fundamental questions and talk specifically about Artificial Neural Networks (ANNs) and why they are such a powerful tool in the hands of the savvy developer.
Imagine you have a “smart key” lock on your door. These locking mechanisms have a special feature that allows them to reset their tumblers and form-fit to the teeth of a new key. That new key is now the only key that will open the lock: the lock “knows” which key is valid and which isn’t, because it was taught to acknowledge the current key as the only valid one, and a key with even a slight variation in its teeth would be considered invalid.
What I just described is a layman’s view of what a neural network does, and how machine learning works. A basic Artificial Neural Network functions by taking a set of input parameters (any variables that could even remotely impact the output value), passing them through a series of plastic (malleable) coefficients, and plugging the result into an activation function, which produces what the network thinks the output value should be.
Y = A(1·C1 + A(1·C2 + I1·C3 + I2·C4 + … + In·Cn+2))

(where A is the activation function, I1…In are the input parameters, C1…Cn+2 are the coefficients, and the leading 1s are bias inputs)
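The formula above can be sketched in a few lines of code. This is a minimal illustration, not a production implementation: the function names are hypothetical, and the sigmoid is just one common choice of activation function A.

```python
import math

def activation(x):
    # Sigmoid activation: squashes any real input into the range (0, 1)
    return 1.0 / (1.0 + math.exp(-x))

def forward(inputs, coeffs):
    # coeffs[0] is the outer bias weight (C1), coeffs[1] the inner bias
    # weight (C2), and coeffs[2:] hold one weight per input (C3..Cn+2),
    # mirroring the nested formula above.
    inner = coeffs[1] + sum(i * c for i, c in zip(inputs, coeffs[2:]))
    return activation(coeffs[0] + activation(inner))

# Two inputs, four coefficients; the result always lands between 0 and 1.
y = forward([0.5, -1.2], [0.1, 0.3, 0.8, -0.4])
```

Changing any coefficient changes the output, which is exactly the “plasticity” the training process exploits.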
The power of this system comes from the randomization of the weight choices. Weights are initially randomized and then refined, by any of a multitude of potential methods, to produce less and less error per iteration. (The specific reduction methods will be discussed in further detail in another talk.)
As the system reduces error iteratively, it will eventually reach a point where the expected result from the dataset and the value the network actually produces are nearly indistinguishable; however, the network is not yet verified.
The next step is to plug the weight values into a verification network (i.e., the weights are fixed, and the input and output values iterate through the verification data). If the network was sufficiently trained, it will pass the verification stage, and the designer will be able to acknowledge the strengths and limitations of this particular net and use that to forecast with reasonable certainty about its results.
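The verification stage can be sketched as follows. This is an illustrative example under assumed values: the trained weight, the held-out data, and the acceptance threshold are all hypothetical; the point is that the weights stay frozen while the model is scored on data it never saw during training.

```python
def predict(x, w):
    # Hypothetical trained model with its weight now fixed
    return w * x

trained_w = 1.98                               # assume this came from training
verification_data = [(6, 12.0), (7, 14.0), (8, 16.0)]  # held-out examples

# Score the frozen model against the verification set
mean_sq_error = sum((predict(x, trained_w) - y) ** 2
                    for x, y in verification_data) / len(verification_data)

acceptable = mean_sq_error < 0.1               # designer-chosen threshold
```

A low verification error tells the designer the network generalized rather than memorized; a high one exposes the net’s limitations before it is trusted to forecast anything.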
A network like the one just described could be as varied as Google’s house-number optical character recognition software (http://phys.org/news/2014-01-google-team-neural-network-approach.html), a “suspicious person” face-recognition system for law enforcement and transit authorities, or a smart commodities and stock trader that is able to see past speculation, find the true value of a particular commodity, and communicate with other smart traders to determine the direction of the global economy.
Now there are drawbacks, of course: the computational power required to do any of these individual tasks is significant, and the training dataset must be sufficiently large, varied, and transparent for the system to learn the intricacies and fundamentals of how each system obeys underlying natural laws.
However, with the advent of “big data,” obtaining such information is becoming quite straightforward, albeit sometimes expensive, and with more powerful computing methods such as GPGPU (general-purpose graphics processing unit) computing via OpenCL, the power of neural networks and smart agents is just starting to be realized.