02 Aug 2019

A problem that pops up from time to time in CUDA is when you want to perform a trivial parallel operation on an input array by assigning one thread per input array element but the number of elements in your input array is larger than the number of threads you have available. Or consider the scenario where you have written some CUDA code which works fine with your GPU however someone else tries to run it with an older model GPU and they run into this problem because their GPU has fewer threads than yours. An elegant way to handle this “more data than threads” problem is to use grid-stride loops within your kernels.

12 Jul 2019

The convolution operation has many applications in both image processing and deep learning (i.e. convolutional neural networks). Since convolutions can be performed on different parts of the input array (or image) independently of each other, it is a great fit for parallelization which is why convolutions are commonly performed on GPU. This blog post will cover some efficient convolution implementations on GPU using CUDA. This blog post will focus on 1D convolutions but can be extended to higher dimensional cases.

02 Jul 2019

Temporal difference learning shares many of the benefits of both dynamic programming methods and Monte Carlo methods without many their disadvantages. Like dynamic programming methods, policy evaluation can be updated at each time step but unlike dynamic programming you do not need a model of the environment. Like Monte Carlo methods, you do not need a model of the environemt but unlike Monte Carlo methods you do not need to wait til the end of an episode to make a policy evaluation update. All three of these methods use the same policy iteration strategy which iterates between policy evaluation (different for each method) and policy improvement (in a greedy fashion for each method).

30 Jun 2019

In the last reinforcement learning blog post we covered dynamic programming methods. In this blog post we will cover Monte Carlo (MC) methods. The biggest difference between these two methods is that dynamic programming methods assume a complete knowledge of the environement (via a MDP), but Monte Carlo methods do not. Instead, with Monte Carlo methods, knowledge of the environment is learned through experience. Another significant difference is that Monte Carlo methods can only learn from *episodic* tasks i.e. ones that start and terminate.

29 Jun 2019

This series of blog posts is intended to be a collection of short, concise, cheat-sheet-like notes on different topics relating to reinforcement learning. This first one will cover dynamic programming methods applied to reinforcement learning.

older posts
newer posts