Tuesday, February 28, 2017

Is deep learning about what you train on?

In simple terms, deep learning is about using artificial neural networks with multiple layers to form successively more abstract representations of data. The math behind it is a bit more involved, but it more or less comes down to adjusting the weights and biases linking the neurons in the network so that, for a given input, the computed result matches the desired output.

Usually, we start with random initial values for the weights and biases. Through training (basically, feeding in many, many inputs, comparing the network's outputs with the expected outputs, and adjusting the weights and biases whenever the two differ), we arrive at a set of weights and biases that more or less gives us the expected output when we feed the network an input.
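
To make the last two paragraphs concrete, here is a rough sketch of that loop in plain NumPy. Everything in it is made up for illustration (a toy "sum the inputs" problem, one hidden layer, a fixed learning rate); it just shows the shape of the idea: a forward pass through weights and biases, a comparison against the expected outputs, and a small adjustment in the right direction.

    import numpy as np

    rng = np.random.default_rng(0)

    # Toy problem (made up): learn y = sum(x) from random 3-dimensional inputs.
    X = rng.normal(size=(200, 3))
    y = X.sum(axis=1, keepdims=True)

    # Random initial values for the weights and biases.
    W1, b1 = rng.normal(scale=0.1, size=(3, 16)), np.zeros((1, 16))
    W2, b2 = rng.normal(scale=0.1, size=(16, 1)), np.zeros((1, 1))

    lr = 0.05                      # how big an adjustment we allow each step
    for epoch in range(500):
        # Forward pass: what the network currently outputs for every input.
        h = np.tanh(X @ W1 + b1)
        pred = h @ W2 + b2

        # Compare with the expected outputs (mean squared error).
        loss = np.mean((pred - y) ** 2)

        # Backward pass: how much each weight and bias contributed to the error.
        d_pred = 2 * (pred - y) / len(X)
        dW2 = h.T @ d_pred
        db2 = d_pred.sum(axis=0, keepdims=True)
        d_h = d_pred @ W2.T * (1 - h ** 2)     # derivative of tanh
        dW1 = X.T @ d_h
        db1 = d_h.sum(axis=0, keepdims=True)

        # Adjust the weights and biases a little in the right direction.
        W1 -= lr * dW1; b1 -= lr * db1
        W2 -= lr * dW2; b2 -= lr * db2

    print("final loss:", loss)

Real frameworks work out those gradients automatically, but the loop is essentially the same.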

The training data is usually a very large set of inputs paired with their corresponding outputs (i.e., the correct answers). And most of the time, there is no particular order to this data; it is presented to the network as one huge, shuffled collection.

But if you think about it: these artificial neural networks are trying to simulate our brains, with each layer learning an abstraction of the data, so maybe we should be training them in a similar way to how we train our own brains.

What I am suggesting is this: start off with a set of simple data (simple pictures that you find in children's books, simple sentences or words, etc.) and allow the network to make big adjustments to the weights and biases as it trains on this set of data. This is like how children learn basic concepts with easily influenced minds.

Next, slowly increase the complexity of the training data while reducing the size of the changes (the learning rate) that can be made to the weights and biases. Eventually, we should be training the network on real-world data. The whole idea is to train the network as we would train our own brains, in increasing levels of complexity.
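
Here is a rough sketch of the kind of schedule I have in mind, again in plain NumPy. The "complexity levels" and learning rates are invented (I use higher-frequency sine targets as a stand-in for harder data); with real data you would substitute actual easy-to-hard subsets, but the structure is the same: big adjustments on simple data first, then smaller and smaller adjustments as the data gets harder.

    import numpy as np

    rng = np.random.default_rng(1)

    def make_stage(n, complexity):
        # Made-up stand-in for "data of a given complexity": the target picks
        # up higher-frequency structure as complexity grows.
        X = rng.uniform(-1, 1, size=(n, 1))
        return X, np.sin(complexity * np.pi * X)

    # Random initial weights and biases, shared across all stages.
    params = {
        "W1": rng.normal(scale=0.5, size=(1, 32)), "b1": np.zeros((1, 32)),
        "W2": rng.normal(scale=0.5, size=(32, 1)), "b2": np.zeros((1, 1)),
    }

    def train(X, y, lr, epochs):
        # Same forward/backward/update loop as before, just parameterised by
        # the learning rate so later stages make smaller adjustments.
        for _ in range(epochs):
            h = np.tanh(X @ params["W1"] + params["b1"])
            pred = h @ params["W2"] + params["b2"]
            d_pred = 2 * (pred - y) / len(X)
            d_h = d_pred @ params["W2"].T * (1 - h ** 2)
            params["W2"] -= lr * (h.T @ d_pred)
            params["b2"] -= lr * d_pred.sum(axis=0, keepdims=True)
            params["W1"] -= lr * (X.T @ d_h)
            params["b1"] -= lr * d_h.sum(axis=0, keepdims=True)
        return np.mean((pred - y) ** 2)

    # Simple data with a large learning rate first, then harder data with
    # smaller and smaller adjustments allowed.
    schedule = [(1, 0.3), (2, 0.1), (3, 0.03)]   # (complexity, learning rate)
    for complexity, lr in schedule:
        X, y = make_stage(500, complexity)
        print("complexity", complexity, "lr", lr, "loss", train(X, y, lr, 1000))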

The tedious part here is segregating existing training data into differing levels of complexity. It may be easier to just create new datasets, but datasets are usually huge and therefore time-consuming to generate from scratch. With Amazon Mechanical Turk, though, it may not be that difficult after all. Maybe some researcher somewhere can pick up on this idea. Hopefully, I will be able to work on it myself and test it out (but first, I need to get a better computer, hopefully one with a TITAN X).
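
As a very rough illustration of what that segregation could look like, here is a sketch that sorts existing examples by a made-up difficulty score (word count, in this hypothetical case) and cuts them into levels. The hard part in practice is choosing a score that actually reflects complexity, which is where something like Mechanical Turk labelling would come in.

    import numpy as np

    def split_by_complexity(examples, difficulty, n_levels=3):
        # Sort existing training examples by a difficulty score and cut the
        # sorted list into n_levels buckets, easiest first. The score itself
        # is whatever proxy for complexity you can compute or collect.
        order = np.argsort(difficulty)
        return np.array_split([examples[i] for i in order], n_levels)

    # Hypothetical usage: treat shorter sentences as "simpler".
    sentences = [
        "a cat",
        "the cat sat on the mat",
        "the striped cat slept on the warm mat by the kitchen door",
        "dogs bark",
    ]
    levels = split_by_complexity(sentences, [len(s.split()) for s in sentences])
    for i, level in enumerate(levels):
        print("level", i, ":", list(level))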
