Reinforcement Learning, AWS, PACMAN – my week in review

Had a busy week last week. Wrapped up a project for my Cloud Computing class where we stored movies in DynamoDB, a NoSQL database. Apparently Amazon created DynamoDB because it ran into the limitations of relational databases (RDBMSs) such as PostgreSQL, and also because it wanted to prioritize Availability over Consistency in the CAP theorem. So it uses consistent hashing to spread the load across nodes, and a NoSQL design to aim for 99.99% availability. After all, it’s better to be available and inconsistent than down altogether. So we created a movie database from some data in a large JSON file, and wrote a 500-line program to manage it: update, search, etc. I believe our next project is with Flask and AWS Elastic Beanstalk (we have already done projects with AWS S3 and AWS EC2). Hopefully we get around to learning AWS Lambda; that would be exciting.
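To make the consistent hashing idea concrete for myself, here is a minimal sketch of a hash ring in Python. This is not DynamoDB’s actual implementation; the class name HashRing, the node names, and the replica count are all made up for illustration. The point it shows is that when a node goes away, only the keys that node owned get moved.

```python
import bisect
import hashlib


class HashRing:
    """Toy consistent-hash ring (hypothetical names, not DynamoDB's real code)."""

    def __init__(self, nodes, replicas=100):
        self.replicas = replicas  # virtual points per node, to smooth the load
        self._keys = []           # sorted hash positions on the ring
        self._owner = {}          # hash position -> node name
        for node in nodes:
            self.add(node)

    @staticmethod
    def _hash(value):
        # Any stable hash works here; md5 is used only for its even spread.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def add(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            bisect.insort(self._keys, h)
            self._owner[h] = node

    def remove(self, node):
        for i in range(self.replicas):
            h = self._hash(f"{node}#{i}")
            self._keys.remove(h)
            del self._owner[h]

    def lookup(self, key):
        # Walk clockwise to the first point after the key's hash, wrapping around.
        h = self._hash(key)
        idx = bisect.bisect(self._keys, h) % len(self._keys)
        return self._owner[self._keys[idx]]


ring = HashRing(["node-a", "node-b", "node-c"])
titles = [f"movie-{i}" for i in range(1000)]
before = {t: ring.lookup(t) for t in titles}
ring.remove("node-c")
after = {t: ring.lookup(t) for t in titles}
moved = [t for t in titles if before[t] != after[t]]
```

In this sketch, every title in moved used to live on node-c; titles on node-a and node-b stay where they were, which is the property that makes adding or losing a node cheap.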

 

In my Artificial Intelligence class we implemented multiple reinforcement learning algorithms to make our Pacman learn from experience. First we took an MDP approach with value iteration, then did Q-Learning to learn from samples. With some small probability epsilon, our Pacman agent would pick an action at random (exploration) rather than going down the best known path (exploitation). Epsilon was low, so it didn’t explore much.
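Here is a rough sketch of how tabular Q-Learning with an epsilon-greedy policy can look in Python. It is not the class project’s code; the class name QLearningAgent, the state and action names, and the default values for alpha, gamma, and epsilon are all my own assumptions for illustration.

```python
import random
from collections import defaultdict


class QLearningAgent:
    """Tabular Q-learning sketch with epsilon-greedy action selection."""

    def __init__(self, actions, alpha=0.5, gamma=0.9, epsilon=0.1, seed=None):
        self.actions = list(actions)
        self.alpha = alpha      # learning rate
        self.gamma = gamma      # discount factor
        self.epsilon = epsilon  # probability of taking a random action
        self.q = defaultdict(float)  # (state, action) -> estimated value
        self.rng = random.Random(seed)

    def best_value(self, state):
        return max(self.q[(state, a)] for a in self.actions)

    def choose_action(self, state):
        # Explore with probability epsilon, otherwise exploit the best known action.
        if self.rng.random() < self.epsilon:
            return self.rng.choice(self.actions)
        return max(self.actions, key=lambda a: self.q[(state, a)])

    def update(self, state, action, reward, next_state, terminal=False):
        # sample = r + gamma * max_a' Q(s', a'); then nudge Q(s, a) toward it.
        future = 0.0 if terminal else self.best_value(next_state)
        sample = reward + self.gamma * future
        self.q[(state, action)] += self.alpha * (sample - self.q[(state, action)])


agent = QLearningAgent(["left", "right"], alpha=0.5, gamma=0.9, epsilon=0.0)
agent.update("s", "right", 10.0, "t", terminal=True)  # Q(s, right) moves halfway to 10
agent.update("r", "left", 0.0, "s")                   # value flows back from s to r
```

With epsilon set to 0.0 the agent is purely greedy, so after these two updates it picks "right" in state "s". A low but nonzero epsilon keeps most moves greedy while still trying a random action once in a while.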

Lastly we did Approximate Q-Learning, which was an interesting one to implement because it is very similar to gradient descent. We learn the feature weights from our samples. And each new weight depends on the previous weight, alpha, and our error (scaled by that feature’s value)! This was an AHA! moment for me.
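For reference, this is the weight update as I understand it, sketched in Python. The function names and the feature names ("bias", "dist-to-food") are hypothetical, not taken from the project. Q(s, a) is a weighted sum of features, and each weight moves by alpha times the error times its feature value, which is where the gradient descent resemblance comes from.

```python
def q_value(weights, features):
    """Q(s, a) approximated as a weighted sum of feature values."""
    return sum(weights.get(name, 0.0) * value for name, value in features.items())


def update_weights(weights, features, reward, max_next_q, alpha=0.1, gamma=0.9):
    """Return new weights after one approximate Q-learning step."""
    # error = (r + gamma * max_a' Q(s', a')) - Q(s, a)
    difference = (reward + gamma * max_next_q) - q_value(weights, features)
    updated = dict(weights)
    for name, value in features.items():
        # w_i <- w_i + alpha * error * f_i(s, a)
        updated[name] = weights.get(name, 0.0) + alpha * difference * value
    return updated


# Hypothetical features for one (state, action) pair.
weights = {"bias": 1.0, "dist-to-food": -2.0}
features = {"bias": 1.0, "dist-to-food": 0.5}
new_weights = update_weights(weights, features, reward=10.0, max_next_q=0.0)
```

Here the current estimate is 0 and the sample is 10, so the error is 10: the bias weight goes from 1.0 to 2.0, and the dist-to-food weight goes from -2.0 to -1.5 because its feature value is only 0.5.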

 

In my spare time, I finished Google’s Machine Learning Crash Course, which started out as an internal course, and learned quite a bit from it. I had previously gone through Andrew Ng’s Coursera course on deep learning over winter break. https://developers.google.com/machine-learning/crash-course/

 

I met with my professor, Scott Niekum, in a faculty advising session, where he gave me advice on going into the field. Happy I showed up to that!

 

In my free time I also started coding a live chatroom over WebSockets, sort of like Twitch.tv’s. I was inspired by a golang blog post on this topic, and also found that Go tends to be faster than Node.js, a widely used async, single-threaded backend runtime, so I decided to do this in Go. Rob Pike’s talk at a Go conference helped me understand the difference between concurrency and parallelism, and I think Go will be a huge game changer in years to come.

 
