[<< | Prev | Index | Next | >>]

Saturday, October 08, 2016

Welcome to AI

The big shift in AI that's happening right now under the rubric of Deep Learning isn't really about the deep part, imo, but rather the move to composable design of non-homogeneous architectures.

On the wetware side, for instance, cortex was typically viewed in the past as a powerful generic learning circuit: if you just had enough of it, great things would happen. But this is like looking at a CPU, seeing that it's made entirely of transistors, and deciding that if you can just reverse engineer/replicate transistors and put lots of them down in a sheet you'll have a CPU. Similarly, back-prop networks were made of qualitatively similar layers, with maybe a different activation function on the final output layer; structural variation from that, or "designed" architecture, was generally minimal.

The reality, of course, is that cortex has a rather complicated, genetically determined wiring diagram (the connectome), much like the modular layout of a CPU. It's unclear at this point at what level this diagram goes from point-to-point specified (e.g., module A sends output to module X) to something more stochastic (e.g., neurons of type B in a type J module connect randomly, with some density function based on proximity, to neurons of type K), and to what degree connections are pre-determined vs. learned (there are indications that some axons or dendrites actually grow long distances along chemical gradients, which are a form of contextually derived learning signal). But in the big picture, even the relatively homogeneous cortex is chunked up into regions or modules which are wired to each other in very specific ways, and more interestingly to non-cortical regions (thalamus, etc.), such that the end result--particularly once you include all the lower/inner brain modules--is definitively not a big fully-interconnected blob of magic, but rather more like a familiar circuit or block diagram which implements a particular machine or program. (A side note and interesting question: how much genetic variation is there in the brain's connectome? Is each of us born our own unique circuit diagram, with associated tradeoffs in how we are able to learn and conceptualize the world?)

And we're finally there: modular, composable AI design is the new programming.

(Random example)
Figure 3. The proposed competitive multi-scale convolution (a) and competitive inception (b) networks, together with the reference inception style network (c). In these three models, we ensure that the output of each layer has the same number of units. Also note that: the inception style model uses ReLU [15] after all convolutional layers, the number of filters per convolutional node is represented by the number in brackets, and these models assume a 10-class classification problem. 
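The quoted caption describes the basic inception-style move: several convolutional branches run in parallel on the same input and their outputs are concatenated, channel-wise, into one layer output. Here's a minimal NumPy sketch of that compose-and-concat pattern--not the paper's actual model; the branch widths are made-up numbers, and each branch is reduced to a 1x1 convolution (a per-pixel linear mix over channels) for brevity.

```python
import numpy as np

def conv1x1(x, out_channels, rng):
    # x has shape (channels, height, width); a 1x1 convolution is just
    # a linear map applied independently at every spatial position.
    w = rng.standard_normal((out_channels, x.shape[0]))
    return np.einsum('oc,chw->ohw', w, x)

def inception_block(x, branch_widths, rng):
    # Run each parallel branch on the same input, then concatenate the
    # branch outputs along the channel axis (axis 0 here).
    branches = [conv1x1(x, width, rng) for width in branch_widths]
    return np.concatenate(branches, axis=0)

rng = np.random.default_rng(0)
x = rng.standard_normal((3, 8, 8))          # a 3-channel 8x8 "image"
y = inception_block(x, [16, 32, 16], rng)   # three parallel branches
print(y.shape)                              # (64, 8, 8): widths sum
```

The key structural point survives even in this toy form: the block's output width is the sum of its branch widths, which is exactly the bookkeeping the caption's "same number of units per layer" constraint is managing.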

Right now most of the focus is on end-to-end supervised back-prop, but I think that will change toward more unsupervised methods (including forward prediction of unlabeled temporal data, which technically can still be done with back-prop), and toward a more heterogeneous mix of module types--not just "concat" vs. "max pool", but also a variety of learning types (e.g., clustering, factoring, variational auto-encoding, whatever) as well as temporal/control behaviors ("neurally" controllable or adaptive versions of multiplexers, stacks, heaps, queues, and so on). And these will be combined and boxed up into modules, which will be combined and boxed up into bigger modules, and so on.
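The "boxed up into modules" idea can be sketched in plain Python: a shared interface lets qualitatively different pieces--here a supervised linear map and an unsupervised nearest-centroid quantizer--compose into a pipeline that is itself just another module, so designs nest. All class and function names here are hypothetical, purely for illustration; this is the composition pattern, not any particular framework's API.

```python
import numpy as np

class Module:
    def __call__(self, x):
        raise NotImplementedError

class Linear(Module):
    # a supervised-learning-style building block: a linear map
    def __init__(self, w):
        self.w = w
    def __call__(self, x):
        return self.w @ x

class KMeansQuantize(Module):
    # an unsupervised-style building block: snap the input to the
    # nearest of a set of (presumably learned) centroids
    def __init__(self, centroids):
        self.centroids = centroids
    def __call__(self, x):
        dists = np.linalg.norm(self.centroids - x, axis=1)
        return self.centroids[np.argmin(dists)]

class Sequential(Module):
    # a "box" of modules is itself a module, so boxes nest arbitrarily
    def __init__(self, *mods):
        self.mods = mods
    def __call__(self, x):
        for m in self.mods:
            x = m(x)
        return x

centroids = np.array([[0.0, 0.0], [1.0, 1.0]])
net = Sequential(Linear(np.eye(2)), KMeansQuantize(centroids))
print(net(np.array([0.9, 1.2])))  # snaps to nearest centroid: [1. 1.]
```

The point is not the toy math but the shape of the design: because everything speaks the same interface, a clustering stage, a linear stage, and (in principle) a controllable stack or queue can be swapped and nested freely.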

This is not to discount the use and power of black-box temporal learning modules like LSTM or the like--those are precisely the sorts of building blocks that will be used. And, just as with the history of software (and the evolution of the brain, for that matter), there will first be relatively complicated designs that do the job, which will later be replaced by more elegant and general modules that can do many jobs, and so on. At first the less elegant designs will have an efficiency advantage, because they employ stronger priors and so start closer to their end goal out of the gate; but as computational power continues to increase, that advantage will give way to the generality of the more elegant solutions. (This being why computers don't feel any faster today than thirty years ago, when they were orders of magnitude slower under the hood... But I digress.)

Fun times ahead.


Simon Funk / simonfunk@gmail.com