CS702(B) Unit 1 Deep Learning Fundamentals study material for RGPV CSE 7th Semester. Learn the history of deep learning, the McCulloch-Pitts neuron, thresholding logic, activation functions, gradient descent optimizers, RNN, BPTT, GRU, LSTM, encoder-decoder models and the attention mechanism.
Unit 1 introduces the foundation of Deep Learning. It covers artificial neuron models, activation functions, optimization algorithms and sequence learning models such as RNN, GRU, LSTM, encoder-decoder architecture and attention mechanism.
Understand history, neuron models, thresholding logic and activation functions.
Learn Gradient Descent, Momentum, Nesterov, AdaGrad, RMSProp and Adam.
Study RNN, BPTT, GRU, LSTM, encoder-decoder and attention mechanism.
Complete syllabus-based topics of Deep & Reinforcement Learning Unit 1.
Deep Learning evolved from artificial neural networks and became powerful due to big data, better hardware and improved training algorithms.
The McCulloch-Pitts neuron (1943) is an early mathematical model of an artificial neuron that takes binary inputs and produces a binary output using threshold logic.
Thresholding logic activates a neuron (output 1) only when the weighted sum of its inputs reaches or exceeds a fixed threshold; otherwise the output is 0.
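As a minimal sketch of the thresholding logic above, the function below fires when the weighted sum of its inputs reaches the threshold. The function names and the choice of unit weights for the AND and OR gates are illustrative assumptions.

```python
def threshold_neuron(inputs, weights, threshold):
    """Return 1 if the weighted sum of inputs reaches the threshold, else 0."""
    weighted_sum = sum(x * w for x, w in zip(inputs, weights))
    return 1 if weighted_sum >= threshold else 0


# With unit weights, the threshold alone decides which logic gate is realised.
def and_gate(a, b):
    return threshold_neuron([a, b], [1, 1], threshold=2)


def or_gate(a, b):
    return threshold_neuron([a, b], [1, 1], threshold=1)


# Truth tables in the input order (0,0), (0,1), (1,0), (1,1).
and_table = [and_gate(a, b) for a in (0, 1) for b in (0, 1)]  # [0, 0, 0, 1]
or_table = [or_gate(a, b) for a in (0, 1) for b in (0, 1)]    # [0, 1, 1, 1]
```

The same neuron computes AND or OR depending only on the threshold, which is why the threshold is treated as the key parameter of this model.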
Activation functions introduce non-linearity into neural networks; without them, a stack of layers collapses into a single linear transformation. Common examples include Sigmoid, Tanh, ReLU and Softmax.
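The four activation functions named above can be sketched in plain Python as follows. This is a scalar, standard-library version written for illustration; the max-subtraction inside `softmax` is a common numerical-stability trick and is an addition here, not something stated in the notes.

```python
import math


def sigmoid(z):
    """Squash z into the range (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def tanh(z):
    """Squash z into the range (-1, 1)."""
    return math.tanh(z)


def relu(z):
    """Pass positive values through unchanged and clip negatives to 0."""
    return max(0.0, z)


def softmax(scores):
    """Turn a list of scores into probabilities that sum to 1."""
    largest = max(scores)  # subtracting the max avoids overflow in exp
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]
```

Sigmoid, Tanh and ReLU act on one value at a time, while Softmax acts on a whole vector of scores and is typically used at the output layer for classification.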
Gradient Descent is an optimization algorithm used to minimize loss by updating weights in the opposite direction of the gradient.
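To make the update rule concrete, here is a small sketch that minimises the toy loss f(w) = (w - 3)^2. The loss function, the learning rate of 0.1 and the step count are assumptions chosen for illustration.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)


def gradient_descent(w=0.0, lr=0.1, steps=100):
    for _ in range(steps):
        w = w - lr * grad(w)  # step against the gradient
    return w


w_final = gradient_descent()  # ends very close to the minimiser w = 3
```

Each step shrinks the distance to the minimum by a constant factor (0.8 with these settings), which is the typical behaviour of gradient descent on a quadratic loss.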
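Momentum-based gradient descent speeds up learning by accumulating an exponentially decaying sum of past gradients, which smooths the update direction and damps oscillations.
Nesterov Accelerated Gradient improves momentum by computing the gradient at the look-ahead position (current weights plus the momentum step) instead of at the current weights.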
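A minimal sketch of the momentum update on the toy loss f(w) = (w - 3)^2 is shown below. The velocity formulation used here (velocity = beta * velocity - lr * gradient) is one common convention, and the hyperparameter values are assumptions.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)


def momentum_gd(w=0.0, lr=0.1, beta=0.9, steps=500):
    velocity = 0.0
    for _ in range(steps):
        # The velocity remembers past gradients, scaled down by beta each step.
        velocity = beta * velocity - lr * grad(w)
        w = w + velocity
    return w


w_final = momentum_gd()  # oscillates around 3 at first, then settles near it
```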
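The look-ahead idea can be sketched as follows, again on the assumed toy loss f(w) = (w - 3)^2. The only change from plain momentum is where the gradient is evaluated; the hyperparameters are illustrative.

```python
def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)


def nesterov_gd(w=0.0, lr=0.1, beta=0.9, steps=300):
    velocity = 0.0
    for _ in range(steps):
        lookahead = w + beta * velocity  # where momentum alone would take us
        velocity = beta * velocity - lr * grad(lookahead)
        w = w + velocity
    return w


w_final = nesterov_gd()  # settles near the minimiser w = 3
```

Because the gradient is measured at the look-ahead point, the update can correct itself before overshooting, which usually reduces oscillation compared with plain momentum.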
Stochastic Gradient Descent (SGD) updates model weights using a single training example, or a small mini-batch, at a time instead of the full dataset.
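As an illustrative sketch, the code below fits the single weight of a model y = w * x with one-example-at-a-time updates. The dataset, the squared-error loss and the learning rate are assumptions made for the example.

```python
import random


def sgd_fit(xs, ys, lr=0.01, epochs=50, seed=0):
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    w = 0.0
    indices = list(range(len(xs)))
    for _ in range(epochs):
        rng.shuffle(indices)  # visit the examples in a random order
        for i in indices:
            error = w * xs[i] - ys[i]
            grad = 2.0 * error * xs[i]  # gradient of (w*x - y)**2 for one example
            w -= lr * grad
    return w


# Data generated from y = 2x, so the fitted weight should approach 2.
w_fit = sgd_fit([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0])
```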
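AdaGrad adapts the learning rate for each parameter by dividing it by the square root of the accumulated sum of that parameter's past squared gradients, so frequently updated parameters take smaller steps.
RMSProp scales the learning rate using an exponentially decaying moving average of squared gradients, which avoids AdaGrad's continually shrinking step size.
Adam combines momentum (a moving average of gradients) with RMSProp-style scaling (a moving average of squared gradients), plus bias correction, to provide efficient adaptive optimization.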
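A minimal single-parameter sketch of AdaGrad is given below, using the assumed toy loss f(w) = (w - 3)^2. It also records the effective learning rate at each step to show how it changes over time; the hyperparameters are illustrative.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)


def adagrad(w=0.0, lr=0.5, eps=1e-8, steps=200):
    accumulated = 0.0  # running sum of squared gradients
    effective_rates = []
    for _ in range(steps):
        g = grad(w)
        accumulated += g * g
        rate = lr / (math.sqrt(accumulated) + eps)
        effective_rates.append(rate)
        w -= rate * g
    return w, effective_rates


w_final, rates = adagrad()
```

Since the accumulated sum never decreases, the effective learning rate can only shrink, which is the main limitation that later optimizers try to address.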
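The moving-average idea can be sketched for a single parameter as follows. The toy loss f(w) = (w - 3)^2, the decay rate of 0.9 and the learning rate are assumptions chosen for illustration.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)


def rmsprop(w=0.0, lr=0.01, decay=0.9, eps=1e-8, steps=1000):
    avg_sq = 0.0  # moving average of squared gradients
    for _ in range(steps):
        g = grad(w)
        avg_sq = decay * avg_sq + (1.0 - decay) * g * g
        w -= lr * g / (math.sqrt(avg_sq) + eps)
    return w


w_final = rmsprop()  # ends close to the minimiser w = 3
```

Unlike a running sum, the moving average forgets old gradients, so the step size does not decay to zero just because training has run for a long time.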
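Below is a single-parameter sketch of the Adam update with the commonly used default decay rates (0.9 and 0.999). The toy loss f(w) = (w - 3)^2 and the learning rate of 0.1 are assumptions for the example.

```python
import math


def grad(w):
    """Gradient of the toy loss f(w) = (w - 3)**2."""
    return 2.0 * (w - 3.0)


def adam(w=0.0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=1000):
    m = 0.0  # moving average of gradients (momentum part)
    v = 0.0  # moving average of squared gradients (RMSProp part)
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1.0 - beta1) * g
        v = beta2 * v + (1.0 - beta2) * g * g
        m_hat = m / (1.0 - beta1 ** t)  # bias correction for the zero start
        v_hat = v / (1.0 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w


w_final = adam()
first_step = adam(steps=1)  # first move has size about lr, whatever the gradient scale
```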
Eigenvalue decomposition is a linear algebra technique that factorizes a square matrix into its eigenvalues and eigenvectors; it is useful in data transformation, dimensionality reduction (for example PCA) and model analysis.
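As a small illustration, the sketch below uses power iteration to find the largest eigenvalue and its eigenvector for a symmetric 2x2 matrix. Power iteration is only one way to obtain a single eigenpair, not a full decomposition, and the example matrix is an assumption.

```python
import math


def mat_vec(matrix, vector):
    return [sum(a * x for a, x in zip(row, vector)) for row in matrix]


def power_iteration(matrix, iterations=100):
    """Estimate the dominant eigenvalue and a unit eigenvector."""
    v = [1.0] + [0.0] * (len(matrix) - 1)
    for _ in range(iterations):
        product = mat_vec(matrix, v)
        norm = math.sqrt(sum(c * c for c in product))
        v = [c / norm for c in product]
    # Rayleigh quotient v.Av (v has unit length) gives the eigenvalue estimate.
    eigenvalue = sum(a * b for a, b in zip(v, mat_vec(matrix, v)))
    return eigenvalue, v


A = [[2.0, 1.0], [1.0, 2.0]]  # eigenvalues are 3 and 1
eigenvalue, eigenvector = power_iteration(A)
```

For this matrix the dominant eigenvector points along (1, 1), the direction in which the matrix stretches vectors the most.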
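RNNs (Recurrent Neural Networks) are neural networks designed for sequential data; they maintain a hidden state that carries information from previous inputs to the current step.
BPTT (Backpropagation Through Time) is used to train RNNs by unfolding the network over time and applying backpropagation to the unfolded network.
Vanishing and exploding gradient problems occur in deep or recurrent networks when gradients become too small or too large as they are propagated back through many layers or time steps.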
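The recurrence can be sketched with a one-dimensional hidden state, h_t = tanh(w_x * x_t + w_h * h_(t-1) + b). The scalar sizes and the weight values are assumptions made to keep the example short.

```python
import math


def rnn_forward(inputs, w_x=0.5, w_h=0.8, b=0.0):
    """Run a scalar RNN over a sequence and return every hidden state."""
    h = 0.0  # initial hidden state
    states = []
    for x in inputs:
        h = math.tanh(w_x * x + w_h * h + b)  # new state mixes input and old state
        states.append(h)
    return states


early = rnn_forward([1.0, 0.0, 0.0])  # signal arrives first, then fades
late = rnn_forward([0.0, 0.0, 1.0])   # signal arrives at the last step
```

The two sequences contain the same values in a different order, yet the final hidden states differ, which shows that the network is sensitive to order and not just to content.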
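The unfolding step can be sketched for a scalar RNN with h_t = tanh(w_x * x_t + w_h * h_(t-1)) and a squared-error loss on the final state only. These modelling choices, and all names below, are assumptions made to keep the example small.

```python
import math


def forward(xs, w_x, w_h):
    """Return hidden states h_0..h_T, with h_0 = 0."""
    hs = [0.0]
    for x in xs:
        hs.append(math.tanh(w_x * x + w_h * hs[-1]))
    return hs


def loss(xs, target, w_x, w_h):
    return 0.5 * (forward(xs, w_x, w_h)[-1] - target) ** 2


def bptt(xs, target, w_x, w_h):
    """Gradients of the loss with respect to w_x and w_h."""
    hs = forward(xs, w_x, w_h)
    grad_wx = grad_wh = 0.0
    dh = hs[-1] - target  # dLoss/dh_T
    for t in range(len(xs), 0, -1):  # walk backwards through time
        da = dh * (1.0 - hs[t] ** 2)  # back through tanh
        grad_wx += da * xs[t - 1]     # shared weights: sum over all steps
        grad_wh += da * hs[t - 1]
        dh = da * w_h                 # pass the error to the previous step
    return grad_wx, grad_wh


xs, target = [0.5, -1.0, 0.8, 0.3], 0.2
grad_wx, grad_wh = bptt(xs, target, w_x=0.5, w_h=0.8)
```

Because the same weights are reused at every time step, their gradients are the sum of the contributions from each step of the unfolded network.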
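A rough way to see both problems: in a simplified linear recurrence, the backpropagated error is multiplied by the recurrent weight once per time step. The sketch below tracks that multiplier; ignoring the activation derivative is a simplifying assumption.

```python
def backprop_scale(recurrent_weight, steps):
    """Factor applied to a gradient after passing back through `steps` time steps."""
    scale = 1.0
    for _ in range(steps):
        scale *= abs(recurrent_weight)
    return scale


vanishing = backprop_scale(0.5, 50)  # about 8.9e-16, effectively zero
exploding = backprop_scale(1.5, 50)  # about 6.4e8, numerically unstable
```

A weight magnitude below 1 makes the gradient vanish and a magnitude above 1 makes it explode, and the effect grows exponentially with sequence length.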
Truncated BPTT reduces training complexity by backpropagating errors only for a limited number of time steps.
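The truncation can be sketched by limiting the backward loop to the last k time steps. This simplified version uses a scalar RNN with a loss on the final state only; the parameter name `k` and the example values are assumptions.

```python
import math


def truncated_bptt(xs, target, w_x, w_h, k):
    """Gradients for w_x and w_h, backpropagating through at most k steps."""
    hs = [0.0]
    for x in xs:  # the forward pass still covers the whole sequence
        hs.append(math.tanh(w_x * x + w_h * hs[-1]))
    grad_wx = grad_wh = 0.0
    dh = hs[-1] - target
    last = len(xs)
    first = max(last - k, 0)  # stop the backward pass k steps from the end
    for t in range(last, first, -1):
        da = dh * (1.0 - hs[t] ** 2)
        grad_wx += da * xs[t - 1]
        grad_wh += da * hs[t - 1]
        dh = da * w_h
    return grad_wx, grad_wh


xs, target = [0.5, -1.0, 0.8, 0.3], 0.2
full = truncated_bptt(xs, target, 0.5, 0.8, k=len(xs))  # same as full BPTT
short = truncated_bptt(xs, target, 0.5, 0.8, k=1)       # last step only
```

A smaller k costs less computation per update but gives an approximate gradient that ignores the influence of earlier time steps.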
The Gated Recurrent Unit (GRU) is an RNN variant that uses an update gate and a reset gate to control information flow and reduce vanishing gradient problems.
Long Short-Term Memory (LSTM) networks use a memory cell with input, forget and output gates to learn long-term dependencies in sequence data.
Encoder-decoder models use an encoder to convert the input sequence into an internal representation (a context vector) and a decoder to generate the output sequence from it.
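A scalar sketch of one GRU step with its two gates (update gate z and reset gate r) is shown below. The parameter names are hypothetical, and the blending convention h_new = (1 - z) * h_prev + z * candidate is one of two conventions found in the literature.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def gru_step(x, h_prev, p):
    """One GRU step for scalar input and state; p holds hypothetical weights."""
    z = sigmoid(p["wz"] * x + p["uz"] * h_prev + p["bz"])  # update gate
    r = sigmoid(p["wr"] * x + p["ur"] * h_prev + p["br"])  # reset gate
    candidate = math.tanh(p["wh"] * x + p["uh"] * (r * h_prev) + p["bh"])
    return (1.0 - z) * h_prev + z * candidate  # blend old state and candidate


params = {"wz": 0.5, "uz": 0.5, "bz": 0.0,
          "wr": 0.5, "ur": 0.5, "br": 0.0,
          "wh": 0.5, "uh": 0.5, "bh": 0.0}

# A strongly negative update-gate bias closes the gate, so the state is copied.
closed = dict(params, bz=-50.0)
h_kept = gru_step(1.0, 0.7, closed)  # stays at about 0.7
```

When the update gate is closed, the old state passes through unchanged, which gives gradients a direct path across many time steps.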
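One LSTM step can be sketched with scalar input, hidden state and cell state, using the standard forget, input and output gates. The parameter names and values are hypothetical and chosen only for illustration.

```python
import math


def sigmoid(a):
    return 1.0 / (1.0 + math.exp(-a))


def lstm_step(x, h_prev, c_prev, p):
    """One LSTM step; returns the new hidden state and the new cell state."""
    f = sigmoid(p["wf"] * x + p["uf"] * h_prev + p["bf"])    # forget gate
    i = sigmoid(p["wi"] * x + p["ui"] * h_prev + p["bi"])    # input gate
    o = sigmoid(p["wo"] * x + p["uo"] * h_prev + p["bo"])    # output gate
    g = math.tanh(p["wg"] * x + p["ug"] * h_prev + p["bg"])  # candidate memory
    c = f * c_prev + i * g  # keep part of the old memory, add part of the new
    h = o * math.tanh(c)    # expose a gated view of the memory cell
    return h, c


params = {name: 0.5 for name in ("wf", "uf", "wi", "ui", "wo", "uo", "wg", "ug")}
params.update(bf=0.0, bi=0.0, bo=0.0, bg=0.0)

# Forget gate fully open and input gate closed: the cell keeps its old value.
keep = dict(params, bf=50.0, bi=-50.0)
h_new, c_new = lstm_step(1.0, 0.2, 0.9, keep)  # c_new stays at about 0.9
```

The additive cell update (old memory times forget gate, plus new candidate times input gate) is what lets information survive over long sequences.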
Attention mechanism helps models focus on important parts of input sequences while generating output.
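A minimal sketch of attention is given below, using dot-product scoring between a decoder query and a set of encoder keys. Dot-product scoring is one of several scoring functions and is an assumption here, as are the example vectors.

```python
import math


def softmax(scores):
    largest = max(scores)
    exps = [math.exp(s - largest) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot_product_attention(query, keys, values):
    """Return attention weights and the weighted sum of the values."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)  # how much focus each input position receives
    dim = len(values[0])
    context = [sum(w * v[d] for w, v in zip(weights, values)) for d in range(dim)]
    return weights, context


query = [1.0, 0.0]
keys = [[1.0, 0.0], [0.0, 1.0], [-1.0, 0.0]]
values = [[10.0, 0.0], [0.0, 10.0], [5.0, 5.0]]
weights, context = dot_product_attention(query, keys, values)
```

The first key matches the query best, so it receives the largest weight and its value dominates the context vector passed to the decoder.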
Deep Learning: Neural network-based learning with multiple layers.
Gradient Descent: Updating weights to minimize the loss.
RNN: Neural network with memory for sequential data.
LSTM/GRU: Improved RNN models that handle long-term dependencies.
Attention: Helps the model focus on the important parts of the input.
| Topic | Expected Frequency | Importance |
|---|---|---|
| McCulloch Pitts Neuron | High | ⭐⭐⭐⭐ |
| Activation Functions | Very High | ⭐⭐⭐⭐⭐ |
| Gradient Descent | Very High | ⭐⭐⭐⭐⭐ |
| AdaGrad, RMSProp, Adam | Very High | ⭐⭐⭐⭐⭐ |
| RNN | Very High | ⭐⭐⭐⭐⭐ |
| BPTT | High | ⭐⭐⭐⭐ |
| Vanishing and Exploding Gradients | Very High | ⭐⭐⭐⭐⭐ |
| GRU | High | ⭐⭐⭐⭐ |
| LSTM | Very High | ⭐⭐⭐⭐⭐ |
| Attention Mechanism | Very High | ⭐⭐⭐⭐⭐ |
Deep Learning is a machine learning technique that uses multiple layers of neural networks to learn complex patterns from data.
Gradient Descent is an optimization algorithm used to minimize loss by updating model weights.
RNN is a neural network used for sequential data by maintaining memory of previous inputs.
LSTM is an improved RNN model that can learn long-term dependencies using gates and memory cells.
Attention mechanism allows a model to focus on important parts of input while generating output.