Shusen Wang
RL-1G: Summary
This lecture is a summary of Reinforcement Learning Basics.
Slides: github.com/wangshusen/DRL.git
Views: 1,539

Videos

RL-1F: Evaluate Reinforcement Learning
Views: 1.3K · 3 years ago
Next video: ua-cam.com/video/DLO401mNOw4/v-deo.html If you want to empirically compare two reinforcement learning algorithms, you can use OpenAI Gym. This lecture introduces three kinds of problems: - Classical control problems include CartPole and Pendulum. - Atari games include Pong, Space Invaders, and Breakout. - MuJoCo includes Ant, Humanoid, and Half Cheetah. Slides: github.com/wangshusen...
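The lecture compares algorithms through the Gym-style interaction interface (`reset`/`step`). Below is a minimal sketch of that loop using a hypothetical toy environment as a stand-in for CartPole; `ToyEnv` and `run_episode` are illustration names, not from the lecture:

```python
import random

class ToyEnv:
    """A stand-in environment exposing a Gym-style reset/step interface.

    The state is just a step counter; the episode ends after 10 steps.
    This is not CartPole -- only the interface shape is the same.
    """
    def __init__(self):
        self.t = 0

    def reset(self):
        self.t = 0
        return self.t          # initial observation

    def step(self, action):
        self.t += 1
        reward = 1.0           # CartPole-style: +1 per surviving step
        done = self.t >= 10    # episode terminates after 10 steps
        return self.t, reward, done

def run_episode(env, seed=0):
    """Run one episode with a uniformly random policy; return total reward."""
    rng = random.Random(seed)
    env.reset()
    total, done = 0.0, False
    while not done:
        action = rng.choice([0, 1])      # random policy over two actions
        _, reward, done = env.step(action)
        total += reward
    return total

episode_return = run_episode(ToyEnv())
```

Empirically comparing two algorithms then amounts to running many such episodes per algorithm and comparing the average returns.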
RL-1E: Value Functions
Views: 3.6K · 3 years ago
Next video: ua-cam.com/video/Rv7uC9v6Eco/v-deo.html Value functions are the expectations of the return. Action-value function Q evaluates how good it is to take action A while being in state S. State-value function V evaluates how good state S is. Slides: github.com/wangshusen/DRL.git
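The relation between the two value functions is V(s) = E[Q(s, A)] with A sampled from the policy. A tiny numerical sketch for one state (the action names and numbers are made up for illustration):

```python
# State-value V as the policy-weighted average of action-values Q:
#   V_pi(s) = E_{A ~ pi}[ Q_pi(s, A) ] = sum_a pi(a|s) * Q_pi(s, a)
policy = {"left": 0.25, "right": 0.75}   # pi(a|s) for one state s
Q = {"left": 1.0, "right": 3.0}          # Q_pi(s, a): how good each action is

V = sum(policy[a] * Q[a] for a in policy)  # how good state s is overall
```

Here V = 0.25·1.0 + 0.75·3.0 = 2.5, so the state is evaluated by averaging action quality under the policy.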
RL-1D: Rewards and Returns
Views: 1.8K · 3 years ago
Next video: ua-cam.com/video/lI8_p7Qeuto/v-deo.html Return is also known as cumulative future reward. Return is defined as the sum of all future rewards. Discounted return gives small weights to rewards in the far future. Slides: github.com/wangshusen/DRL.git
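The discounted return described above can be sketched in a few lines (a minimal illustration, not taken from the lecture's slides):

```python
def discounted_return(rewards, gamma):
    """Discounted return: U_t = r_t + gamma*r_{t+1} + gamma^2*r_{t+2} + ...

    gamma in (0, 1] shrinks the weight of rewards in the far future;
    gamma = 1 recovers the plain (undiscounted) sum of rewards.
    """
    return sum((gamma ** k) * r for k, r in enumerate(rewards))

u = discounted_return([1.0, 1.0, 1.0], gamma=0.5)  # 1 + 0.5 + 0.25
```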
RL-1C: Randomness in MDP, Agent-Environment Interaction
Views: 1.4K · 3 years ago
Next Video: ua-cam.com/video/MeoSqrV5a24/v-deo.html Markov decision process (MDP) has two sources of randomness: - The action is randomly sampled from the policy function. - The next state is randomly sampled from the state-transition function. The agent can interact with the environment. Observing the current state, the agent executes an action. Then the environment updates the state and provi...
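The two sources of randomness can be made concrete with a toy interaction loop; the state names, action names, and probabilities below are invented for illustration:

```python
import random

rng = random.Random(0)

def sample_action(state):
    # First source of randomness: action A ~ pi(. | state)
    probs = {"left": 0.5, "right": 0.5}
    return rng.choices(list(probs), weights=list(probs.values()))[0]

def sample_next_state(state, action):
    # Second source of randomness: next state S' ~ p(. | state, action)
    if action == "left":
        probs = {"s1": 0.7, "s2": 0.3}
    else:
        probs = {"s2": 1.0}
    return rng.choices(list(probs), weights=list(probs.values()))[0]

# Agent-environment interaction: observe state, act, receive new state.
state = "s0"
for _ in range(3):
    action = sample_action(state)
    state = sample_next_state(state, action)
```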
RL-1B: State, Action, Reward, Policy, State Transition
Views: 3.6K · 3 years ago
Next Video: ua-cam.com/video/0VWBr6dBMGY/v-deo.html This lecture introduces the basic concepts of reinforcement learning, including state, action, reward, policy, and state transition. Slides: github.com/wangshusen/DRL.git
RL-1A: Random Variables, Observations, Random Sampling
Views: 3.3K · 3 years ago
Next Video: ua-cam.com/video/GFayVUt2WGE/v-deo.html This is the first lecture on deep reinforcement learning. This lecture introduces basic probability theories that will be used in reinforcement learning. The topics include random variables, observed values, probability density function (PDF), probability mass function (PMF), expectation, and random sampling. Slides: github.com/wangshusen/DRL.git
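Two of these topics fit in a short sketch: computing an expectation exactly from a PMF, and approximating it by random sampling (the specific PMF is an invented example):

```python
import random

rng = random.Random(42)

# Probability mass function (PMF) of a discrete random variable X.
pmf = {0: 0.2, 1: 0.5, 2: 0.3}

# Exact expectation: E[X] = sum_x x * p(x)
exact = sum(x * p for x, p in pmf.items())

# Random sampling: draw observed values of X and average them.
# By the law of large numbers, the average approaches E[X].
samples = rng.choices(list(pmf), weights=list(pmf.values()), k=100_000)
estimate = sum(samples) / len(samples)
```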
Vision Transformer for Image Classification
Views: 126K · 3 years ago
Vision Transformer (ViT) is the new state-of-the-art for image classification. ViT was posted on arXiv in Oct 2020 and officially published in 2021. On all the public datasets, ViT beats the best ResNet by a small margin, provided that ViT has been pretrained on a sufficiently large dataset. The bigger the dataset, the greater the advantage of the ViT over ResNet. Slides: github.com/wangshusen/...
BERT for pretraining Transformers
Views: 13K · 3 years ago
Next Video: ua-cam.com/video/HZ4j_U3FC94/v-deo.html Bidirectional Encoder Representations from Transformers (BERT) is a method for pretraining Transformer models. BERT does not need manually labeled data; it can use any books and web documents to automatically generate training data. Slides: github.com/wangshusen/DeepLearning Reference: Devlin, Chang, Lee, and Toutanova. BERT: Pre-training of deep...
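The automatic training-data generation can be sketched with masked-token prediction. This is a simplified illustration (real BERT sometimes keeps the chosen token or swaps in a random one instead of always writing `[MASK]`); the function name is an assumption:

```python
import random

rng = random.Random(0)

def make_masked_example(tokens, mask_rate=0.15):
    """Hide a random subset of tokens; the model must predict the originals.

    Returns the corrupted input and a map {position: original token},
    which serves as the automatically generated label.
    """
    inputs, targets = [], {}
    for i, tok in enumerate(tokens):
        if rng.random() < mask_rate:
            inputs.append("[MASK]")
            targets[i] = tok       # label = the hidden original token
        else:
            inputs.append(tok)
    return inputs, targets

tokens = "the cat sat on the mat".split()
inputs, targets = make_masked_example(tokens, mask_rate=0.5)
```

Any unlabeled text yields such (input, label) pairs, which is why no manual annotation is needed.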
Transformer Model (2/2): Build a Deep Neural Network (1.25x speed recommended)
Views: 14K · 3 years ago
Next Video: ua-cam.com/video/EOmd5sUUA_A/v-deo.html The Transformer models are state-of-the-art language models. They are based on attention and dense layers without RNNs. In the previous lecture, we built the attention layer and the self-attention layer. In this lecture, we first build multi-head attention layers and then use them to build a deep neural network known as the Transformer. Transformer...
Transformer Model (1/2): Attention Layers
Views: 29K · 3 years ago
Next Video: ua-cam.com/video/J4H6A4-dvhE/v-deo.html The Transformer models are state-of-the-art language models. They are based on attention and dense layers without RNNs. Instead of studying every module of the Transformer, let us try to build a Transformer model from scratch. In this lecture, we eliminate RNNs while keeping attention. We will get an attention layer and a self-attention layer. In ...
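The core operation of the attention layer can be sketched as scaled dot-product attention. Plain-Python lists are used here instead of tensors, and the trainable query/key/value projections are omitted, so this is a simplified illustration of the mechanism rather than the lecture's exact construction:

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def attention(queries, keys, values):
    """Scaled dot-product attention for one head.

    Each query attends to every key; its output is the attention-weighted
    average of the value vectors.
    """
    d = len(keys[0])                       # key dimension, used for scaling
    outputs = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        weights = softmax(scores)          # attention weights sum to 1
        out = [sum(w * v[j] for w, v in zip(weights, values))
               for j in range(len(values[0]))]
        outputs.append(out)
    return outputs

# One query that is more similar to the first key than the second.
out = attention(queries=[[1.0, 0.0]],
                keys=[[1.0, 0.0], [0.0, 1.0]],
                values=[[1.0], [0.0]])
```

Self-attention is the special case where queries, keys, and values all come from the same sequence.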
Self-Attention for RNN (1.25x speed recommended)
Views: 8K · 3 years ago
Next Video: ua-cam.com/video/FC8PziPmxnQ/v-deo.html The original attention was applied only to Seq2Seq models, but attention is not limited to Seq2Seq. When applied to a single RNN, attention is known as self-attention. This lecture teaches self-attention for RNNs. In the original paper of Cheng et al. 2016, attention was applied to LSTM. To make self-attention easier to understand, this lecture...
Attention for RNN Seq2Seq Models (1.25x speed recommended)
Views: 33K · 3 years ago
Next Video: ua-cam.com/video/06r6kp7ujCA/v-deo.html Attention was originally proposed by Bahdanau et al. in 2015. Later on, attention found much broader applications in NLP and computer vision. This lecture introduces only attention for RNN sequence-to-sequence models. The audience is assumed to know RNN sequence-to-sequence models before watching this video. Slides: github.com/wangshusen/DeepL...
Few-Shot Learning (3/3): Pretraining + Fine-tuning
Views: 31K · 4 years ago
This lecture introduces pretraining and fine-tuning for few-shot learning. This method is simple but comparable to the state of the art. This lecture discusses 3 tricks for improving fine-tuning: (1) a good initialization, (2) entropy regularization, and (3) combining cosine similarity with a softmax classifier. Slides: github.com/wangshusen/DeepLearning Lectures on few-shot learning: 1. Basic concep...
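Trick (3) can be sketched as follows: score each class by the cosine similarity between a query feature and that class's mean feature, then turn the scores into probabilities with a softmax. The temperature value and the two-class toy data are assumptions for illustration:

```python
import math

def cosine(u, v):
    """Cosine similarity between two feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv)

def classify(feature, class_means, temperature=10.0):
    """Cosine-similarity scores per class, pushed through a softmax."""
    names = list(class_means)
    scores = [temperature * cosine(feature, class_means[n]) for n in names]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    z = sum(exps)
    return {n: e / z for n, e in zip(names, exps)}

# A query feature close to the "cat" mean feature.
probs = classify([1.0, 0.1], {"cat": [1.0, 0.0], "dog": [0.0, 1.0]})
```

Using cosine similarity (rather than a raw inner product) makes the scores insensitive to feature magnitude, which helps when only a few labeled examples define each class mean.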
Few-Shot Learning (2/3): Siamese Networks
Views: 58K · 4 years ago
Next Video: ua-cam.com/video/U6uFOIURcD0/v-deo.html This lecture introduces the Siamese network. It can find similarities or distances in the feature space and thereby solve few-shot learning. Slides: github.com/wangshusen/DeepLearning Lectures on few-shot learning: 1. Basic concepts: ua-cam.com/video/hE7eGew4eeg/v-deo.html 2. Siamese networks: ua-cam.com/video/4S-XDefSjTM/v-deo.html 3. Pretrain...
17-4: Random Shuffle & Fisher-Yates Algorithm
Views: 2.2K · 4 years ago
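A minimal sketch of the Fisher-Yates shuffle covered in this lecture (written non-destructively on a copy; the classic algorithm shuffles in place):

```python
import random

def fisher_yates_shuffle(items, rng=random):
    """Uniform random shuffle in O(n) time.

    Walk from the last position down; at each position i, swap a[i]
    with a uniformly chosen element from the unshuffled prefix a[0..i].
    """
    a = list(items)
    for i in range(len(a) - 1, 0, -1):
        j = rng.randrange(i + 1)   # 0 <= j <= i
        a[i], a[j] = a[j], a[i]
    return a

shuffled = fisher_yates_shuffle(range(10), random.Random(0))
```

Every permutation is produced with equal probability 1/n!, which naive "swap each element with a random position" variants do not guarantee.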
5-2: Dense Matrices: row-major order, column-major order
Views: 4.6K · 4 years ago
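The two storage orders from this lecture reduce to two index formulas, sketched here for illustration:

```python
def row_major_index(i, j, n_cols):
    """Flat position of element (i, j) when the matrix is stored row by row."""
    return i * n_cols + j

def col_major_index(i, j, n_rows):
    """Flat position of the same element when stored column by column."""
    return j * n_rows + i

# A 2 x 3 matrix [[a, b, c], [d, e, f]] is laid out as:
#   row-major:    [a, b, c, d, e, f]   (C-style)
#   column-major: [a, d, b, e, c, f]   (Fortran-style)
```

Matching the loop order to the storage order keeps memory accesses sequential, which matters for cache performance.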
5-1: Matrix basics: additions, multiplications, time complexity analysis
Views: 4.3K · 4 years ago
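A sketch of the naive matrix product whose time complexity this lecture analyzes: for an m×n times n×p product, the triple loop performs m·n·p scalar multiplications, i.e. O(n³) for square matrices.

```python
def matmul(A, B):
    """Naive matrix product of an m x n matrix A and an n x p matrix B."""
    m, n, p = len(A), len(B), len(B[0])
    assert all(len(row) == n for row in A), "inner dimensions must match"
    C = [[0] * p for _ in range(m)]
    for i in range(m):
        for k in range(n):        # i-k-j order: row-friendly access to B
            for j in range(p):
                C[i][j] += A[i][k] * B[k][j]
    return C

C = matmul([[1, 2], [3, 4]], [[5, 6], [7, 8]])
```

Matrix addition, by contrast, touches each of the m·n entries once, so it is O(m·n).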
17-1: Monte Carlo Algorithms
Views: 6K · 4 years ago
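A classic Monte Carlo example in the spirit of this lecture, estimating pi by random sampling (a standard illustration, not necessarily the lecture's exact example):

```python
import random

def estimate_pi(n_samples, seed=0):
    """Estimate pi by sampling points in the unit square.

    The fraction of points falling inside the quarter circle of radius 1
    approximates its area, pi/4.
    """
    rng = random.Random(seed)
    hits = sum(1 for _ in range(n_samples)
               if rng.random() ** 2 + rng.random() ** 2 <= 1.0)
    return 4.0 * hits / n_samples

pi_hat = estimate_pi(100_000)
```

The estimate's error shrinks like O(1/sqrt(n)): the answer is random, but more samples make it concentrate around the true value.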
3-1: Insertion Sort
Views: 976 · 4 years ago
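A sketch of insertion sort, the topic of this lecture:

```python
def insertion_sort(items):
    """Sort by growing a sorted prefix one element at a time.

    O(n^2) comparisons in the worst case, but only O(n) when the
    input is already (nearly) sorted.
    """
    a = list(items)
    for i in range(1, len(a)):
        key = a[i]
        j = i - 1
        while j >= 0 and a[j] > key:   # shift larger elements right
            a[j + 1] = a[j]
            j -= 1
        a[j + 1] = key                 # drop key into its slot
    return a

result = insertion_sort([5, 2, 4, 6, 1, 3])
```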
6-1: Binary Tree Basics
Views: 657 · 4 years ago
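A minimal binary-tree sketch matching this lecture's topic: a node with left/right children, plus an in-order traversal as one basic operation:

```python
class Node:
    """A binary tree node: a value plus optional left and right subtrees."""
    def __init__(self, value, left=None, right=None):
        self.value, self.left, self.right = value, left, right

def inorder(node):
    """In-order traversal: left subtree, then root, then right subtree."""
    if node is None:
        return []
    return inorder(node.left) + [node.value] + inorder(node.right)

#       2
#      / \
#     1   3
root = Node(2, Node(1), Node(3))
values = inorder(root)
```

For a binary search tree, in-order traversal visits the keys in sorted order.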
Few-Shot Learning (1/3): Basic Concepts
Views: 80K · 4 years ago
2-3: Skip List
Views: 70K · 4 years ago
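A compact skip-list sketch for this lecture's topic: a sorted linked list with randomized "express lanes" giving expected O(log n) search and insert. This is a simplified illustration (no delete, fixed level cap), not the lecture's exact code:

```python
import random

class _Node:
    def __init__(self, key, level):
        self.key = key
        self.forward = [None] * level   # one next-pointer per level

class SkipList:
    MAX_LEVEL = 8

    def __init__(self, seed=0):
        self.rng = random.Random(seed)
        self.head = _Node(None, self.MAX_LEVEL)   # sentinel head node
        self.level = 1

    def _random_level(self):
        # Flip coins: each node joins the next level up with probability 1/2.
        lvl = 1
        while lvl < self.MAX_LEVEL and self.rng.random() < 0.5:
            lvl += 1
        return lvl

    def insert(self, key):
        # Record, per level, the last node before the insertion point.
        update = [self.head] * self.MAX_LEVEL
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
            update[i] = node
        lvl = self._random_level()
        self.level = max(self.level, lvl)
        new = _Node(key, lvl)
        for i in range(lvl):                      # splice into each level
            new.forward[i] = update[i].forward[i]
            update[i].forward[i] = new

    def contains(self, key):
        # Descend from the top level, moving right while keys are smaller.
        node = self.head
        for i in range(self.level - 1, -1, -1):
            while node.forward[i] is not None and node.forward[i].key < key:
                node = node.forward[i]
        node = node.forward[0]
        return node is not None and node.key == key

sl = SkipList()
for k in [5, 1, 9, 3, 7]:
    sl.insert(k)
```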
2-2: Binary Search
Views: 729 · 4 years ago
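A sketch of the binary search covered in this lecture: repeatedly halve the search range of a sorted array, giving O(log n) lookups.

```python
def binary_search(sorted_items, target):
    """Return the index of target in a sorted list, or -1 if absent."""
    lo, hi = 0, len(sorted_items) - 1
    while lo <= hi:
        mid = (lo + hi) // 2
        if sorted_items[mid] == target:
            return mid
        if sorted_items[mid] < target:
            lo = mid + 1           # target can only be in the right half
        else:
            hi = mid - 1           # target can only be in the left half
    return -1

idx = binary_search([1, 3, 5, 7, 9], 7)
```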
2-1: Array, Vector, and List: Comparisons
Views: 4.5K · 4 years ago