Warning: This post contains a mixture of excitement, frustration and rants!
Today, Machine Learning/Deep Learning people have been sharing their excitement over Ali Rahimi’s talk at NIPS (from min 57 onwards). Undoubtedly, it’s a great talk and you should check it out if you care about fundamental issues and the lost rigor in Deep Learning results. His talk resonated a lot with me as well, for reasons I’ll try to explain in my own way, while these tweets sum up the hype part well.
While there’s no doubt Deep Learning has been an incredible enabler, AI hype is real and you can feel its bittersweet taste. It is too naive to think that at this stage Deep Learning will bring us Artificial General Intelligence, and portraying ML/DL as a Terminator-style machine coming to exterminate the human race is irresponsible and idiotic. At this stage Deep Learning is a bundle of techniques, and since it is led by the (empirical part of the) Computer Science community, “working code” and empirical results somehow count as the proof. Moreover, apparently in order to get into ML/DL, you only need to know calculus and coding to be the revolution and change the world! ¯\_(ツ)_/¯
My pessimistic side says that Deep Learning is growing exponentially fast, with a big army of enthusiasts ready to code something up quickly, generate results, publish papers and attract a lot of attention. These results feed into systems in sensitive domains such as health care, and we cannot give Deep Learning more power there until it rests on a solid, real foundation. One such serious issue is reflected in this tweet.
What baffles me is that many faculty members seem happy about this situation, are not addressing the real problems, and are somehow becoming enemies of science. Their ignorance is mind-blowing!
My optimistic side points me towards the efforts and initiatives addressing these issues. Some examples: Google’s Colaboratory project now makes reproducing results much easier than ever before; the well-known distill.pub initiative aims at clearing the air of current DL research; and there is the Stanford DAWNBench competition. On the research side, Bayesian Neural Networks are getting more attention, and an abundance of probabilistic programming frameworks such as PyMC3, Edward, ZhuSuan, Pyro and ProbTorch helps a lot. Naftali Tishby’s talk on the information theory of DL at the Yandex school addresses some of the fundamental issues, as does Hinton’s talk on what’s wrong with convolutional neural nets, which led to the very recent Capsule paper. Moreover, there is his other recent paper on Distilling a Neural Network into a Soft Decision Tree.
To finish off, I think another important direction lies in optimization algorithms, more along the lines of the “Unit Tests for Stochastic Optimization” type of research.