Alex Minnaar

Implementing the DistBelief Deep Neural Network Training Framework with Akka

Presently, most deep neural networks are trained using GPUs due to the enormous number of parallel computations that they can perform. Without the speed-ups provided by GPUs, deep neural networks could take days or even weeks to train on a single machine. However, using GPUs can be prohibitive for several reasons

Word2Vec Tutorial Part II: The Continuous Bag-of-Words Model

In the previous post, the concept of word vectors was explained, as was the derivation of the skip-gram model. In this post we will explore the other Word2Vec model: the continuous bag-of-words (CBOW) model. If you understand the skip-gram model then the CBOW model should be quite straightforward because in many ways they are mirror images of each other. For instance, if you look at the model diagram

Word2Vec Tutorial Part I: The Skip-Gram Model

In many natural language processing tasks, words are often represented by their tf-idf scores. While these scores give us some idea of a word's relative importance in a document, they do not give us any insight into its semantic meaning. Word2Vec is the name given to a class of neural network models that, given an unlabelled training corpus, produce a vector for each word in the corpus that encodes its semantic information. These vectors are useful for two main reasons.
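To make "encodes its semantic information" concrete: semantic relatedness between two word vectors is commonly measured with cosine similarity, so semantically similar words end up with vectors pointing in similar directions. The following is a minimal Scala sketch (the toy vectors and the helper `cosineSimilarity` are illustrative, not taken from the post):

```scala
object WordVectorSketch {
  // Cosine similarity between two word vectors: values near 1.0 indicate
  // vectors pointing in nearly the same direction (i.e. similar words).
  def cosineSimilarity(a: Array[Double], b: Array[Double]): Double = {
    val dot   = a.zip(b).map { case (x, y) => x * y }.sum
    val normA = math.sqrt(a.map(x => x * x).sum)
    val normB = math.sqrt(b.map(x => x * x).sum)
    dot / (normA * normB)
  }

  def main(args: Array[String]): Unit = {
    // Toy 3-dimensional "word vectors", purely for illustration.
    val king  = Array(0.8, 0.1, 0.4)
    val queen = Array(0.7, 0.2, 0.5)
    val apple = Array(0.1, 0.9, 0.2)
    println(f"king ~ queen: ${cosineSimilarity(king, queen)}%.3f")
    println(f"king ~ apple: ${cosineSimilarity(king, apple)}%.3f")
  }
}
```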

Distributed Online Latent Dirichlet Allocation with Apache Spark

In the past, I have studied the online LDA algorithm from Hoffman et al. in some depth, resulting in this blog post and corresponding Scala code. Before going further, I will give a general description of how the algorithm works. In online LDA, minibatches of documents are sequentially processed to update a global topic/word matrix, which defines the topics that have been learned. The processing consists of two steps:
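At a high level, each minibatch yields a local estimate of the topic/word matrix, which is then blended into the global matrix with a decaying step size. A minimal Scala sketch of that blending (my own illustration, not code from the post; the names `updateGlobal`, `lambdaHat`, and `rho` are assumed):

```scala
object OnlineLdaSketch {
  type TopicWordMatrix = Array[Array[Double]] // topics x vocabulary

  // Blend the minibatch estimate lambdaHat into the global matrix lambda.
  // rho is the step size, which decays over time in Hoffman et al.'s
  // online LDA (e.g. rho_t = (tau0 + t)^(-kappa)).
  def updateGlobal(lambda: TopicWordMatrix,
                   lambdaHat: TopicWordMatrix,
                   rho: Double): TopicWordMatrix =
    lambda.zip(lambdaHat).map { case (row, rowHat) =>
      row.zip(rowHat).map { case (l, lHat) => (1.0 - rho) * l + rho * lHat }
    }

  def main(args: Array[String]): Unit = {
    val lambda    = Array(Array(1.0, 2.0), Array(3.0, 4.0))
    val lambdaHat = Array(Array(2.0, 2.0), Array(2.0, 2.0))
    val updated   = updateGlobal(lambda, lambdaHat, rho = 0.1)
    println(updated.map(_.mkString(", ")).mkString("\n"))
  }
}
```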