AI Atlas #5
Deep Learning
🧠 What Is Deep Learning?
Deep learning emerged in academia in the early 2000s, with wider industry adoption starting around 2010. It is a sub-field of machine learning where models are trained for various tasks by presenting them with examples.
Deep learning is applied to a particular type of model called an artificial neural network (or neural net), which consists of layers of interconnected simple computing nodes called neurons; the "deep" refers to stacking many such layers. Each neuron processes information passed to it by other neurons and then passes its results on to neurons in subsequent layers. The parameters of the neural net (i.e. the values learned and updated during training) are adjusted using the examples presented to the model. The power of deep learning is that the trained model can then make predictions or classifications on new, previously unseen data.
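To make the layered structure concrete, here is a minimal sketch of information flowing through a tiny neural net. The layer sizes, the sigmoid activation, the random starting weights, and all function names are assumptions chosen for illustration; real deep learning models are far larger and are built with dedicated libraries.

```python
import math
import random


def sigmoid(x):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-x))


def make_layer(n_inputs, n_neurons, rng):
    """Create one layer: a weight list per neuron, plus one bias per neuron."""
    weights = [[rng.uniform(-1, 1) for _ in range(n_inputs)] for _ in range(n_neurons)]
    biases = [0.0] * n_neurons
    return weights, biases


def layer_forward(inputs, weights, biases):
    """Each neuron takes a weighted sum of its inputs, adds a bias, applies sigmoid."""
    return [
        sigmoid(sum(w * x for w, x in zip(neuron_weights, inputs)) + bias)
        for neuron_weights, bias in zip(weights, biases)
    ]


def forward(network, inputs):
    """Pass the inputs through every layer in turn."""
    activations = inputs
    for weights, biases in network:
        activations = layer_forward(activations, weights, biases)
    return activations


rng = random.Random(0)  # fixed seed so the run is repeatable
# Hypothetical shape: 3 inputs -> 4 hidden neurons -> 2 output neurons.
network = [make_layer(3, 4, rng), make_layer(4, 2, rng)]
output = forward(network, [0.5, -1.2, 3.0])
print(output)  # two numbers between 0 and 1
```

The weights and biases here are the parameters; in this sketch they are random, so the output is meaningless until training adjusts them.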
In deep learning, computers learn by example. For instance, if we have a model trained on thousands of pictures of dogs, that model can be leveraged to detect dogs in previously unseen images.
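As a rough sketch of what "learning by example" looks like in code, the snippet below trains a single neuron (the simplest possible case, not a deep network) on labeled points and then queries it on points it has never seen. The toy task, the learning rate, the number of passes over the data, and the function names are all assumptions made for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable sigmoid."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(params, point):
    """Return the neuron's estimated probability that the point is in class 1."""
    w1, w2, b = params
    return sigmoid(w1 * point[0] + w2 * point[1] + b)


def train(examples, epochs=200, learning_rate=0.5):
    """Adjust the parameters a little after every example (gradient descent)."""
    w1, w2, b = 0.0, 0.0, 0.0
    for _ in range(epochs):
        for (x1, x2), label in examples:
            error = label - predict((w1, w2, b), (x1, x2))
            w1 += learning_rate * error * x1
            w2 += learning_rate * error * x2
            b += learning_rate * error
    return w1, w2, b


# Toy labeled examples: a point is class 1 when its coordinates sum to more than 1.
rng = random.Random(0)
examples = []
for _ in range(200):
    x1, x2 = rng.random(), rng.random()
    examples.append(((x1, x2), 1 if x1 + x2 > 1.0 else 0))

params = train(examples)

# Points that were not in the training set.
print(predict(params, (0.9, 0.9)))  # should be above 0.5 (class 1)
print(predict(params, (0.1, 0.1)))  # should be below 0.5 (class 0)
```

The rule "sum greater than 1" is never written into the model; it is only implied by the labels, which is the sense in which the model learns from examples.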
🤔 Why Deep Learning Matters and Its Shortcomings
Deep learning has the ability to learn from large datasets and make complex decisions based on input data; as such it has opened up new possibilities in areas such as image recognition, natural language processing, speech recognition, and autonomous vehicles.
The move from traditional machine learning to deep learning represented a shift from learning by instruction to learning by observation using examples. This evolution has had a significant impact on the wider AI and machine learning landscape:
Data became fuel: The performance of deep learning models such as deep neural nets generally improves with the amount of data the model is trained on. This increased the value represented by access to data.
Unstructured data was unlocked: Deep learning models can automatically learn useful attributes from data that is not organized into a pre-defined format (called unstructured data), such as images, audio, or free text. As a result, they often achieve higher accuracy than traditional machine learning on tasks that involve such data.
End-to-end problem solving was enabled: In traditional machine learning, intermediate representations of data are typically manually engineered by a human expert, based on their understanding of the problem and the data. In contrast, deep learning models can automatically learn intermediate representations from the input data such that they can produce output data without intervention.
There are, however, shortcomings to the deep learning approach, including:
Significant compute requirements: Training a deep learning model is computationally intensive and requires significant resources, including high-end processors and specialized hardware.
High data requirements: Deep learning algorithms typically require large amounts of labeled training data to achieve high accuracy, and they generally continue to improve as more training data is added.
Risk of overfitting: Deep learning models are prone to overfitting, which occurs when a model fits the training data so closely that it memorizes that data instead of learning the underlying patterns that generalize to new, unseen data.
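The gap between memorizing and generalizing can be shown with a small sketch. Here a nearest-neighbour "memorizer" stands in for an over-flexible model; it is not a neural net, and the toy task, the 20% label noise, and the function names are assumptions made purely for illustration.

```python
import random


def make_noisy_data(n, noise, rng):
    """Points in the unit square; class 1 if x + y > 1, with some labels flipped."""
    data = []
    for _ in range(n):
        x, y = rng.random(), rng.random()
        label = 1 if x + y > 1.0 else 0
        if rng.random() < noise:
            label = 1 - label  # simulate a mislabeled example
        data.append(((x, y), label))
    return data


def memorizer_predict(train_set, point):
    """Return the label of the closest stored training example."""
    nearest = min(
        train_set,
        key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
    )
    return nearest[1]


def accuracy(train_set, data):
    """Fraction of points in `data` whose label the memorizer reproduces."""
    correct = sum(memorizer_predict(train_set, p) == label for p, label in data)
    return correct / len(data)


rng = random.Random(0)
train_set = make_noisy_data(200, 0.2, rng)
held_out = make_noisy_data(200, 0.2, rng)  # fresh data from the same source

train_acc = accuracy(train_set, train_set)
test_acc = accuracy(train_set, held_out)
print(f"training accuracy: {train_acc:.2f}")
print(f"held-out accuracy: {test_acc:.2f}")
```

The memorizer reproduces every training label, including the mislabeled ones, so its training accuracy is perfect while its accuracy on held-out data is lower. A large difference between the two numbers is the usual warning sign of overfitting.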
🛠 Forms of Deep Learning
The breakthrough of deep learning gave rise to many influential forms of deep neural nets, notably including:
Convolutional neural networks (CNNs): long the technique of choice for computer vision, and also used for protein structure prediction (for example, in the first version of AlphaFold)
Neural Radiance Fields (NeRFs): leveraged for the reconstruction of 3D scenes – and now 4D – from a few stills
Recurrent Neural Networks (RNNs): commonly used for ordinal or temporal problems, such as language translation, natural language processing (NLP), speech recognition, and image captioning
Bayesian Neural Nets (BNNs): useful as a way to prevent overfitting in domains where data is scarce, such as molecular biology and medical diagnosis
We will cover all of these forms of deep learning, among others, in future editions of The AI Atlas!
Deep learning represented a step change in the power and breadth of applications of machine learning. Understanding how it works and what it implies is therefore crucial to comprehending the latest machine-learning techniques.