[<< | Prev | Index | Next | >>]

Thursday, October 06, 2016

Video Learning

Google recently released a video database for machine learning in the form of pre-parsed frames. I'm sure there will be lots of nifty results, but I would like to see someone apply a hierarchical RNN to predict (as in paint) the next frame of raw video.

I suspect it would be a great way to get a full hierarchy of visual features completely unsupervised. (Not to mention potentially awesome compression, albeit likely not in real time on today's hardware.)

One might think it would just learn to copy pixels with some regional motion compensation (something like MPEG). But to really do a good job it needs to segment along edges and such, which means recognizing and representing edges. It could get even more interesting in the higher levels of the hierarchy, which should probably work over progressively slower, longer timescales.
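
To make the timescale idea a bit more concrete, here is a minimal sketch of the forward pass only, assuming flattened frames and plain tanh recurrent layers. Everything in it is hypothetical illustration rather than a tested design: the names (`init_params`, `predict_next_frames`, `stride`), the random untrained weights, and the rule that level k updates once every `stride**k` frames. Training (minimizing the error between each prediction and the actual next frame) is left out.

```python
import numpy as np

def init_params(frame_dim, hidden_dims, seed=0):
    """Random, untrained weights for a stack of recurrent levels (illustrative only)."""
    rng = np.random.default_rng(seed)
    params = []
    below = frame_dim
    for i, h in enumerate(hidden_dims):
        # Size of the level above; 0 for the top level, which has no top-down input.
        above = hidden_dims[i + 1] if i + 1 < len(hidden_dims) else 0
        params.append({
            "W_in": rng.normal(0, 0.1, (h, below)),   # bottom-up input
            "W_rec": rng.normal(0, 0.1, (h, h)),      # recurrence within the level
            "W_top": rng.normal(0, 0.1, (h, above)),  # top-down context
        })
        below = h
    W_out = rng.normal(0, 0.1, (frame_dim, hidden_dims[0]))
    return params, W_out

def predict_next_frames(frames, params, W_out, stride=4):
    """frames: (T, frame_dim) array. Returns a next-frame guess for each step,
    plus how many times each level updated."""
    T = frames.shape[0]
    states = [np.zeros(p["W_rec"].shape[0]) for p in params]
    update_counts = [0] * len(params)
    preds = np.zeros_like(frames)
    for t in range(T):
        bottom_up = frames[t]
        for level, p in enumerate(params):
            # Assumed schedule: level k only ticks every stride**k frames,
            # so higher levels see progressively slower timescales.
            if t % (stride ** level) != 0:
                break  # this level and everything above it keep their old state
            top = states[level + 1] if level + 1 < len(states) else np.zeros(0)
            states[level] = np.tanh(
                p["W_in"] @ bottom_up + p["W_rec"] @ states[level] + p["W_top"] @ top
            )
            update_counts[level] += 1
            bottom_up = states[level]
        # The prediction for frame t+1 is read out from the fastest level.
        preds[t] = W_out @ states[0]
    return preds, update_counts
```

With `stride=4` and three levels, 16 frames update the bottom level 16 times, the middle level 4 times, and the top level once. A real attempt would presumably want convolutional layers and a better readout than a single linear map, but the update schedule is the part this sketch is meant to show.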

Simon Funk / simonfunk@gmail.com