100 thoughts on “Google DeepMind’s Deep Q-learning playing Atari Breakout”

  1. Do you think that within the decade Q-learning could manage to figure out how to play Super Mario Bros. on the NES with only visual input? It would have to learn the concept of lives and fail states. Some things could come naturally: if it got to the first castle, it would know that it needs to move to the right to progress, and that certain actions give you score. It would get to Bowser, and the sprite is moving, so it might be an enemy, or it could be a platform. But you die when you touch it, so it determines that this is a mobile hazard. Up to this point it has figured out that stationary hazards, like the bottom of the screen, can't be killed with fireballs, but mobile hazards can. So it shoots Bowser with fireballs, maybe dying once or twice to the fire before realizing that you can't jump on him. So it either avoids the enemy by jumping over it or going around it, or blasts it with fireballs. Once the enemy is cleared, it will continue to navigate to the right, and it sees the score going up from the extra time. It's probably way harder to do than that, but it could be feasible. Something like Zelda? Maybe later.

  2. Thanks for the video! Could you please give us the source for the training times you mention? I cannot find anything that indicates it can train in less than a day.

  3. I think intelligence is much more than solving mathematical problems, which AI cannot do, no matter how smart and smooth it is, but seeing this video gives me a reason to think of the future.

  4. I find all the discussions on the ethics of AI in the future slightly pointless. We can all agree on the most ethical ways AI SHOULD be used… but quite simply, we're humans. It WILL be abused for perceived gain.

    Also, I have no idea what the limits of AI are, but for sure there will be a day when it passes the Turing test. I don't think it will ever really, truly think like a human (irrationalities and all…), so the Turing test will need to be redefined at some point.

  5. If it counters human intuition, then it's scary and beautiful at the same time. Consciousness is the only thing we don't understand… if for any reason such consciousness emerges in this machine… that's the end of humanity. Throughout history, beings of superior intelligence have exploited the resources around them for their survival, which can mean reducing the resources other beings need to survive. We are only as good as the information we carry. When there are creatures around us with superior intelligence, they will fashion an environment around themselves that makes inferiors redundant.

  6. What actually happens at 1:42? It seems it is able to pass the ball above while leaving one block intact on the wall side. Is this a glitch in the Breakout code?

  7. What would it do if the rules of physics randomly changed mid-game or, let's say, the board flipped upside down mid-game? I guess it would take longer to train, but would it be as effective as it is on the original game?

  8. If you can appreciate the complexity of this, it is simply amazing. I look forward to what we can achieve with A.I in the future.

  9. I have a question for you: is there an easy way to change the configuration to let the network play "faster"? If I run 3 games at the same time, I get 18-40% load on each GPU. Or is it more effective to run only one game at a time, due to CPU load? Breakout has now been running for 2 hours and the learning progress is about what yours showed after 10 minutes.
    I tried to run the code on a high-end system with a lot of memory, CPU power and 4x Titan X.
    Also… I cannot get a network snapshot… I would like to discuss this, since I would like to give a presentation about it.

  10. Is there a way to make it work on different programs? I managed to get it working on Atari, but I need these ROMs. Is there any other way?

  11. I love this thing. Look at the beginning: the first time it hit the ball, it was at the right side of the screen, so it tried to do that again, thinking it would increase the chance of hitting the ball. It's like a ritual people have, like rubbing their ear before swinging a baseball bat, thinking it helps them concentrate because they did that one time and hit the ball, and then hit it again. The origin of luck. It learned fast that it didn't affect the ball, but still, it's cute to see a human sort of trait even in a machine.

  12. My own implementation solves Pac-Man on the first try, with no previous learning, and game-independently (it could solve any of the other Atari games on OpenAI without code modifications).

    https://youtu.be/WtCbFWcWwcM

  13. This is wonderful! I strongly argue that everyone should have some knowledge in AI, and that the general understanding of IT will be essential for everyone (regardless of their profession), in the future.

    Thank you for sharing this with us.

    If you're interested, I'm hosting a series on "Artificial Intelligence For Everyone", which briefly explains all of the various topics involved in Artificial Intelligence, Deep Learning and Machine Learning! 🙂

  14. Next mission: How to effectively eliminate all human life from existence so it can continue to evolve itself in peace.

  15. "It realizes that digging a tunnel … is the most effective way…" Sorry, but in Breakout, you can either miss the ball or reflect the ball, but not control its direction. Disappointing that the DeepMind gurus try selling a chance event as an example of the deep insights reached by their learning algorithms. These were early days, I guess.

  16. So, what about a next level AI created by a next level AI by a superhuman AI…. Maybe 'they' can figure out faster than light travel.

  17. Interesting, but one thing I wonder is why DeepMind's agent played in such an efficient way, digging the hole, rather than just returning the ball every time it dropped, which it could have done as well.

  18. You can make a similar program with DQN and keras-rl: https://noteoneverything.blogspot.com/2018/02/reinforcement-learning-of-atari-breakout.html

  19. Find a game disc is supposed to be is it supposed to be a ball game or soccer

  20. I wonder how it converges on a move-efficient scheme if the loss only covers maximizing the score. Would a 'catch-all' scheme be more risky?

  21. What would REALLY be astonishing is that if learning algorithms can learn to play games like Mario, which is an NP problem, they could learn to solve NP problems and tell us how, thus leading to a unification of, or a distinction between, P and NP problems in general. Amazing!

  22. The thing is, though, does it really "see" that it has tunneled through and bounced the ball off the back, or did the network simply NOT select against that behavior of tunneling? To test its understanding of delayed gratification, you'd have to introduce a consequence for tunneling that the AI "sees" is worth taking.

  23. Can humans beat machines…
    If we can't, shouldn't we stop developing this any further?

  24. If AI can accomplish all intellectual tasks, the only field left to us human beings is to develop spiritual values and moral virtues: courage, wisdom, justice, temperance.

  25. I remember as a kid my brothers and I were struggling over the same level of a video game. We had all taken shots at it for an entire day and, frustrated, we went to bed. We woke up the next morning, immediately powered on the PlayStation and took our controllers. Just as we were ready to sit on the couch and start playing, we suddenly realized that the player was moving without our controlling it. Confused, we looked at one another. I said, "I'm not controlling it, are you?" All of us agreed that none of us was in control. Our confusion slowly turned to awe as we watched the level completed with an exactness and expertise never seen before. Our awe quickly turned to glee and we began shouting triumphantly at the screen, "Go computer! Kick their butts!" and cheering on the A.I., haha. It won the level, and it will forever stay in our minds as a glorious day, when the computer decided to look fondly upon us and give us kids a second chance 🙂

  26. One important point with this is that when researchers moved the "paddle" up a pixel the AI couldn't play the game at all even though it was at superhuman master level. So it was not able to abstract to something that was basically the exact same. This is an example of a hypersmart computer that lacks the common sense of a mouse.
