Over the past few years, we have seen some advanced technologies in artificial intelligence shaping human life. Deep learning (DL) has become the main driving force in bringing new innovations in almost every industry. We are sure to continue to see DL everywhere. Most of the companies including startups are already integrating deep learning into their own day-to-day process. Deep learning techniques and algorithms have made building advanced neural networks practically feasible, thanks to high-level open source libraries such as TensorFlow, Keras, PyTorch and more.
We recently interviewed Valentino Zocca, a deep learning expert and the author of the book, Python Deep Learning. Valentino explains why deep learning is getting so much hype, and what's the roadmap ahead in terms of new technologies and libraries. He will also talks about how major vendors and tech-savvy startups adopt deep learning within their organization. Being a consultant and an active developer he is expecting a better approach than back-propagation for carrying out various deep learning tasks.
Author’s Bio
Valentino Zocca graduated with a Ph.D. in mathematics from the University of Maryland, USA, with a dissertation in symplectic geometry, after having graduated with a laurel in mathematics from the University of Rome. He spent a semester at the University of Warwick.
After a post-doc in Paris, Valentino started working on high-tech projects in the Washington, D.C. area and played a central role in the design, development, and realization of an advanced stereo 3D Earth visualization software with head tracking at Autometric, a company later bought by Boeing.
At Boeing, he developed many mathematical algorithms and predictive models, and using Hadoop, he has also automated several satellite-imagery visualization programs. He has since become an expert on machine learning and deep learning and has worked at the U.S. Census Bureau and as an independent consultant both in the US and in Italy. He has also held seminars on the subject of machine learning and deep learning in Milan and New York.
Currently, Valentino lives in New York and works as an independent consultant to a large financial company, where he develops econometric models and uses machine learning and deep learning to create predictive models. But he often travels back to Rome and Milan to visit his family and friends.
Key Takeaways
Deep learning is one of the most adopted techniques used in image and speech recognition and anomaly detection research and development areas.
Deep learning is not the optimum solution for every problem faced. Based on the complexity of the challenge, the neural network building can be tricky.
Open-source tools will continue to be in the race when compared to enterprise software. More and more features are expected to improve on providing efficient and powerful deep learning solutions.
Deep learning is used as a tool rather than a solution across organizations. The tool usage can differ based on the problem faced.
Emerging specialized chips expected to bring more developments in deep learning to mobile, IoT and security domain.
Valentino Zocca states We have a quantity vs. quality problem. We will be requiring better paradigms and approaches in the future which can be improved through research driven innovative solutions instead of relying on hardware solutions. We can make faster machines, but our goal is really to make more intelligent machines for performing accelerated deep learning and distributed training.
Full Interview
Deep learning is as much infamous as it is famous in the machine learning community with camps supporting and opposing the use of DL passionately. Where do you fall on this spectrum? If you were given a chance to convince the rival camp with 5-10 points on your stand about DL, what would your pitch be like?
The reality is that Deep Learning techniques have their own advantages and disadvantages. The areas where Deep Learning clearly outperforms most other machine learning techniques are in image and speech recognition and anomaly detection.
One of the reasons why Deep Learning does so much better is that these problems can be decomposed into a hierarchical set of increasingly complex structures, and, in multi-layer neural nets, each layer learns these structures at different levels of complexity. For example, an image recognition, the first layers will learn about the lines and edges in the image. The subsequent layers will learn how these lines and edges get together to form more complex shapes, like the eyes of an animal, and finally the last layers will learn how these more complex shapes form the final image.
However, not every problem can suitably be decomposed using this hierarchical approach. Another issue with Deep Learning is that it is not yet completely understood how it works, and some areas, for example, banking, that are heavily regulated, may not be able to easily justify their predictions. Finally, many neural nets may require a heavier computational load than other classical machine learning techniques.
Therefore, the reality is that one still needs a proficient machine learning expert who deeply understands the functioning of each approach and can make the best decision depending on each problem. Deep Learning is not, at the moment, a complete solution to any problem, and, in general, there can be no definite side to pick, and it really depends on the problem at hand.
Deep learning can conquer tough challenges, no doubt. However, there are many common myths and realities around deep learning. Would you like to give your supporting reasoning on whether the following statements are myth or fact?
You need to be a machine learning expert or a math geek to build deep learning models
We need powerful hardware resources to use deep learning
Deep learning models are always learning, they improve with new data automagically
Deep learning is a black box, so we should avoid using it in production environments or in real-world applications.
Deep learning is doomed to fail. It will be replaced eventually by data sparse, resource economic learning methods like meta-learning or reinforcement learning.
Deep learning is going to be central to the progress of AGI (artificial general intelligence) research
Deep Learning has become almost a buzzword, therefore a lot of people are talking about it, sometimes misunderstanding how it works. People hear the word DL together with "it beats the best player at go", "it can recognize things better than humans" etc., and people think that deep learning is a mature technology that can solve any problem. In actuality, deep learning is a mature technology only for some specific problems, you do not solve everything with deep learning and yet at times, whatever the problem, I hear people asking me "can't you use deep learning for it?" The truth is that we have lots of libraries ready to use for deep learning. For example, you don’t need to be a machine learning expert or a math geek to build simple deep learning models for run-of-the-mill problems, but in order to solve for some of the challenges that less common issues may present, a good understanding of how a neural network works may indeed be very helpful. Like everything, you can find a grain of truth in each of those statements, but they should not be taken at face value.
With MLaaS being provided by many vendors from Google to AWS to Microsoft, deep learning is gaining widespread adoption not just within large organizations but also by data-savvy startups. How do you view this trend? More specifically, is deep learning being used differently by these two types of organizations? If so, what could be some key reasons?
Deep Learning is not a monolithic approach. We have different types of networks, ANNs, CNNs, LSTMs, RNNs, etc. Honestly, it makes little sense to ask if DL is being used differently by different organizations. Deep Learning is a tool, not a solution, and like all tools it should be used differently depending on the problem at hand, not depending on who is using it.
There are many open source tools and enterprise software (especially the ones which claim you don't need to code much) in the race. Do you think this can be the future where more and more people will opt for ready-to-use (MLaaS) enterprise backed cognitive tools like IBM Watson rather than open-source tools?
This holds true for everything. At the beginning of the internet, people would write their own HTML code for their web pages, now we use tools who do most of the work for us. But if we want something to stand-out we need a professional designer. The more a technology matures, the more ready-to-use tools will be available, but that does not mean that we will never need professional experts to improve on those tools and provide specialized solutions.
Deep learning is now making inroads to mobile, IoT and security domain as well. What makes DL great for these areas? What are some challenges you see while applying DL in these new domains?
I do not have much experience with DL in mobiles, but that is clearly a direction that is becoming increasingly important. I believe we can address these new domains by building specialized chips.
Deep learning is a deeply researched topic within machine learning and AI communities. Every year brings us new techniques from neural nets to GANs, to capsule networks that then get widely adopted both in research and in real-world applications. What are some cutting-edge techniques you foresee getting public attention in deep learning in 2018 and in the near future? And why?
I am not sure we will see anything new in 2018, but I am a big supporter of the idea that we need a better paradigm that can excel more at inductive reasoning rather than just deductive reasoning. At the end of last year, even DL pioneer Geoff Hinton admitted that we need a better approach than back-propagation, however, I doubt we will see anything new coming out this year, it will take some time.
We keep hearing noteworthy developments in AI and deep learning by DeepMind and OpenAI. Do you think they have the required armory to revolutionize how deep learning is performed? What are some key challenges for such deep learning innovators?
As I mentioned before, we need a better paradigm, but what this paradigm is, nobody knows. Gary Marcus is a strong proponent of introducing more structure in our networks, and I do concur with him, however, it is not easy to define what that should be. Many people want to use the brain as a model, but computers are not biological structures, and if we had tried to build airplanes by mimicking how a bird flies we would not have gone very far. I think we need a clean break and a new approach, I do not think we can go very far by simply refining and improving what we have.
Improvement in processing capabilities and the availability of custom hardware have propelled deep learning into production-ready environments in recent years. Can we expect more chips and other hardware improvements in the coming years for GPU accelerated deep learning and distributed training? What other supporting factors will facilitate the growth of deep learning?
Once again, foreseeing the future is not easy, however, as these questions are related, I think only so much can be gained by improving chips and GPUs. We have a quantity vs. quality problem. We can improve quantity (of speed, memory, etc.) through hardware improvements, but the real problem is that we need a real quality improvement, better paradigms, and approaches, that needs to be achieved through research and not with hardware solutions. We can make faster machines, but our goal is really to make more intelligent machines. A child can learn by seeing just a few examples, we should be able to create an approach that allows a machine to also learn from few examples, not by cramming millions of examples in a short time.
Would you like to add anything more to our readers?
Deep Learning is a fascinating discipline, and I would encourage anyone who wanted to learn more about it to approach it as a research project, without underestimating his or her own creativity and intuition. We need new ideas.
If you found this interview to be interesting, make sure you check out other insightful interviews on a range of topics:
Blockchain can solve tech’s trust issues – Imran Bashir
“Tableau is the most powerful and secure end-to-end analytics platform”: An interview with Joshua Milligan
“Pandas is an effective tool to explore and analyze data”: An interview with Theodore Petrou
Read more