Foundations of Artificial Neural Networks & Deep Q-Learning

  • Published Jan 24, 2025

COMMENTS • 11

  • @monkey_see_monkey_do · 4 years ago · +6

    It's like you've been explaining to a 5-year-old kid who is not a human but a monkey - this is THE EXACT appropriate format for me to learn! I can't thank you enough (especially for explaining Q-learning)!

  • @albertodiazdorado4396 · 3 years ago · +1

    How is it even possible that this piece of art has only 2k views?

  • @bayarmaaragchaa7381 · 4 years ago · +4

    Thank you so much!
    Looking forward to the next lesson

  • @gabrielgirodo · 3 years ago · +1

    Hello, I am learning a lot with you. You explain things in the best way I have ever seen in my life! That is a talent not many people have. I am looking forward to the next video. I really hope everything is all right with you and that you are willing to do more videos. They are really important for me. Have a nice day! :)

  • @chuanjiang6931 · 1 year ago · +1

    Where is the next video?

  • @patrickduhirwenzivugira4729 · 3 years ago · +1

    Thank you for the well-explained video. Could you please make an example as you did on Q-learning?

  • @sivakumar-uj4fu · 4 years ago · +1

    Thank you once again, Dr. Daniel. Could I get the code for deep Q-learning? Kindly help me with the code to understand how a NN is used instead of the Bellman equation to compute Q-values.

  • @gemini_537 · 8 months ago

    Gemini: This video is about the foundations of artificial neural networks and deep Q-learning.
    The video starts with introducing artificial neurons and activation functions. An artificial neuron is the building block of artificial neural networks. It receives input values, multiplies each value by a weight, and sums the weighted inputs together. Then, it applies an activation function to this sum to produce an output value. There are many different activation functions, and some of the most common ones are threshold, sigmoid, hyperbolic tangent, and rectified linear unit (ReLU).
    Next, the video explains what a neural network is. A neural network is an interconnected collection of artificial neurons. These neurons are arranged in layers, and each neuron in one layer connects to neurons in the next layer; information flows through the network from the input layer to the output layer. When a neural network is used for supervised learning, it is given a set of training examples, each consisting of an input value and a corresponding output value. The network learns by iteratively adjusting the weights of the connections between the neurons, with the goal that the network accurately predicts the output value for any given input value (a forward-pass sketch in NumPy appears after the comments).
    The video then covers deep Q-learning, which combines Q-learning with deep learning. Q-learning is a reinforcement learning method for learning a policy for an agent. In Q-learning, the agent learns a Q-value for each state-action pair: the expected future reward of taking a particular action in a particular state. Deep Q-learning uses a deep neural network to approximate these Q-values. The input to the network is the state of the environment, and the output is the set of Q-values for all actions the agent can take in that state.
    Finally, the video talks about exploration in deep Q-learning. Exploration matters because it lets the agent learn about states and actions it has not yet tried. Here the exploration-exploitation dilemma is addressed with a softmax function, which converts the Q-values for a state into a probability distribution over the possible actions. The agent then draws its action from this distribution: it usually picks the action that appears to yield the greatest reward, but it occasionally takes an apparently suboptimal action in order to discover information that may yield greater reward in the long run (see the deep Q-learning sketch after the comments).
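
As a companion to the neuron and forward-pass description in the summary above, here is a minimal NumPy sketch. The layer sizes, random weights, and helper names (neuron, forward) are illustrative assumptions rather than code from the video:

    import numpy as np

    # Common activation functions mentioned in the summary above.
    def threshold(x):
        return (x > 0).astype(float)

    def sigmoid(x):
        return 1.0 / (1.0 + np.exp(-x))

    def relu(x):
        return np.maximum(0.0, x)

    # A single artificial neuron: weighted sum of the inputs plus a bias, then an activation.
    def neuron(inputs, weights, bias, activation=sigmoid):
        return activation(np.dot(weights, inputs) + bias)

    # A tiny fully connected network: each layer multiplies by a weight matrix,
    # adds a bias vector, applies an activation, and feeds the result forward.
    def forward(x, layers):
        for W, b, act in layers:
            x = act(W @ x + b)
        return x

    rng = np.random.default_rng(0)
    layers = [
        (rng.normal(size=(4, 3)), np.zeros(4), np.tanh),  # hidden layer: 3 inputs -> 4 units
        (rng.normal(size=(2, 4)), np.zeros(2), sigmoid),  # output layer: 4 units -> 2 outputs
    ]
    print(forward(np.array([0.5, -1.0, 2.0]), layers))

Supervised training, as described in the summary, would then adjust the weight matrices so that the outputs match the training targets; that step is omitted here.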
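
For the deep Q-learning and softmax-exploration part of the summary, a small sketch along the same lines: a two-layer Q-network maps a state to one Q-value per action, the action is drawn from a softmax over those Q-values, and each update nudges Q(s, a) toward the Bellman-style target r + gamma * max Q(s', a'), which is how the network stands in for the tabular update. The sizes, temperature, learning rate, and toy random transitions are assumptions for illustration, not the video's code:

    import numpy as np

    rng = np.random.default_rng(0)

    N_STATE, N_HIDDEN, N_ACTIONS = 4, 16, 3   # toy sizes, chosen arbitrarily
    GAMMA, LR, TAU = 0.99, 1e-2, 1.0          # discount, learning rate, softmax temperature

    # Q-network parameters: state -> hidden (ReLU) -> one Q-value per action.
    W1 = rng.normal(scale=0.1, size=(N_HIDDEN, N_STATE)); b1 = np.zeros(N_HIDDEN)
    W2 = rng.normal(scale=0.1, size=(N_ACTIONS, N_HIDDEN)); b2 = np.zeros(N_ACTIONS)

    def q_values(state):
        """Forward pass: the network outputs Q(s, a) for every action at once."""
        h = np.maximum(0.0, W1 @ state + b1)
        return W2 @ h + b2, h

    def softmax_action(q, tau=TAU):
        """Softmax exploration: higher Q-values are chosen more often, but not always."""
        z = (q - q.max()) / tau                 # subtract the max for numerical stability
        probs = np.exp(z) / np.exp(z).sum()
        return rng.choice(len(q), p=probs), probs

    def td_update(state, action, reward, next_state, done):
        """One gradient step pulling Q(s, a) toward the target r + gamma * max Q(s', a')."""
        global W1, b1, W2, b2
        q, h = q_values(state)
        q_next, _ = q_values(next_state)
        target = reward + (0.0 if done else GAMMA * q_next.max())
        err = q[action] - target                # TD error for the action actually taken
        # Backpropagate the squared-error gradient through the two layers.
        dW2 = np.zeros_like(W2); db2 = np.zeros_like(b2)
        dW2[action] = err * h;   db2[action] = err
        dh = err * W2[action] * (h > 0)         # ReLU gradient mask
        W2 -= LR * dW2; b2 -= LR * db2
        W1 -= LR * np.outer(dh, state); b1 -= LR * dh

    # Toy interaction with made-up transitions, just to show the moving parts.
    state = rng.normal(size=N_STATE)
    for step in range(5):
        q, _ = q_values(state)
        action, probs = softmax_action(q)
        next_state = rng.normal(size=N_STATE)   # a real agent would query the environment here
        reward = rng.normal()
        td_update(state, action, reward, next_state, done=False)
        state = next_state
        print(f"step {step}: action={action}, probs={np.round(probs, 3)}")

Lowering the temperature TAU makes the agent greedier about the highest Q-value; raising it makes action selection closer to uniform, trading exploitation for exploration.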