Alex Minnaar
Machine Learning at University College London. Research Engineer at Nitro.

Facebook Recruiting III Keyword Extraction - Part 6

Test Examples

In this post we test the association rule algorithm on a few posts from the training set. Testing on the training set is usually bad practice, but these examples are purely illustrative. Because the training set is so large and we test on only a few examples, it should not make a significant difference anyway.

Example 1:

Title: php script to echo a post

Body:

<p>Could someone help with a simple PHP script to echo the whole message received with an HTTPPOST.</p> <p>I am sending a string from an android app using HTTPPOST and would like to receive as a response the message received by the POST at the server.</p> <p>The script that I am using will only echo name value pairs </p> <pre><code>echo $_POST('data') </code></pre> <p>works when I post form data, but have not figured out how to echo a string. </p> <p>Thanks</p>

Tags: php android

The most likely association rules for the post title are

| Title Word | Tag Word | Support | Confidence |
|---|---|---|---|
| php | php | 158454 | 0.9415 |
| echo | php | 2237 | 0.6268 |
| echo | echo | 820 | 0.2298 |
| post | php | 7042 | 0.1817 |
| script | php | 8652 | 0.1631 |
| post | post | 6124 | 0.1579 |
| script | javascript | 6657 | 0.1255 |
| script | bash | 6642 | 0.1233 |

As you can see, the most likely association rule is \(php \rightarrow php\). This is good, since "php" is indeed a tag for this post. The other tag for this post is "android"; however, none of the listed association rules corresponds to it. This is not necessarily bad news, because the title "php script to echo a post" does not suggest that the post relates to "android" at all. Perhaps the android-related content is in the post body...
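As a brief aside on reading the Support and Confidence columns: the sketch below assumes the standard association-rule definition, where support is the raw count of posts containing both the word and the tag, and confidence is that support divided by the number of posts containing the word. Under that assumption, dividing support by confidence recovers the word count, so two rules that share a title word should imply roughly the same count. The helper name `implied_word_count` is hypothetical and is not taken from the original code.

```python
# Rows copied from the title table above: (title word, tag, support, confidence).
title_rules = [
    ("php", "php", 158454, 0.9415),
    ("echo", "php", 2237, 0.6268),
    ("echo", "echo", 820, 0.2298),
]


def implied_word_count(support, confidence):
    """Invert confidence = support / count(word) to estimate count(word).

    Assumes the standard definition of confidence; the result is only
    approximate because the table rounds confidence to four decimals.
    """
    return support / confidence


implied_counts = {
    (word, tag): implied_word_count(support, confidence)
    for word, tag, support, confidence in title_rules
}

for (word, tag), count in implied_counts.items():
    print(f"{word} -> {tag}: about {count:.0f} titles contain '{word}'")
```

Both "echo" rules imply nearly the same number of titles containing "echo", which is consistent with the assumed definition.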

The most likely association rules for the post body are

| Body Word | Tag Word | Support | Confidence |
|---|---|---|---|
| android | android | 200854 | 0.8938 |
| php | php | 290432 | 0.7563 |
| echo | php | 13577 | 0.5538 |
| httppost | android | 588 | 0.4273 |
| httppost | asp.net-mvc | 272 | 0.1977 |
| app | android | 128830 | 0.1933 |
| script | php | 73170 | 0.1926 |
| httppost | java | 217 | 0.1577 |

As suspected, the android-related content was in the post body, as shown by the most likely body association rule \(android \rightarrow android\). So the two most likely association rules correspond to the correct tags "php" and "android", with confidences of 0.9415 (from the title) and 0.8938 (from the body) respectively. The most likely incorrect tag is "echo", with a confidence of 0.2298.
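The comparison above can be reproduced with a short sketch. It assumes that each candidate tag is scored by the highest confidence among all title and body rules that predict it. That scoring rule is an illustrative assumption, not necessarily how the final predictor combines rules, and the name `rank_tags` is hypothetical.

```python
# (word, tag, confidence) rows copied from the two tables for Example 1.
title_rules = [
    ("php", "php", 0.9415), ("echo", "php", 0.6268), ("echo", "echo", 0.2298),
    ("post", "php", 0.1817), ("script", "php", 0.1631), ("post", "post", 0.1579),
    ("script", "javascript", 0.1255), ("script", "bash", 0.1233),
]
body_rules = [
    ("android", "android", 0.8938), ("php", "php", 0.7563),
    ("echo", "php", 0.5538), ("httppost", "android", 0.4273),
    ("httppost", "asp.net-mvc", 0.1977), ("app", "android", 0.1933),
    ("script", "php", 0.1926), ("httppost", "java", 0.1577),
]


def rank_tags(*rule_sets):
    """Score each tag by its highest-confidence rule, then sort descending.

    Taking the maximum is an assumption made for illustration only.
    """
    best = {}
    for rules in rule_sets:
        for _word, tag, confidence in rules:
            best[tag] = max(confidence, best.get(tag, 0.0))
    return sorted(best.items(), key=lambda item: item[1], reverse=True)


ranked = rank_tags(title_rules, body_rules)
true_tags = {"php", "android"}

# First tag in the ranking that is not one of the post's actual tags.
top_incorrect = next((tag, conf) for tag, conf in ranked if tag not in true_tags)

for tag, conf in ranked:
    marker = "correct" if tag in true_tags else "incorrect"
    print(f"{tag:12s} {conf:.4f}  {marker}")
```

With the table values, the two correct tags head the ranking and "echo" is the first incorrect one, matching the discussion above.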

Example 2:

Title: Can output of a method be used to autowire another bean?

Body:

<p>I have a following class </p><pre><code>public class Customer {private String firstName;private String lastName;public void setFirstName(String fName) {this.firstName = fName;}public void setLastName(String lName) {this.lastName = lName;}};</code></pre><p>I've another class that does the following.</p><pre><code>public class NameGenerator {public String generateName() {return "Zee Zee";}};</code></pre><p>Is it possible to set the name of customer (inject name into customer) without having passing NameGenerator bean. Rather, I'm expecting to inject the output of <code>generateName()</code> method?</p><p>This question is for sake of understanding if it can or cannot be done and does not necessarily delve into best practices.</p>

Tags: java spring dependency-injection

The most likely association rules for the post title are

| Title Word | Tag Word | Support | Confidence |
|---|---|---|---|
| autowire | spring | 157 | 0.9235 |
| autowire | java | 102 | 0.6000 |
| bean | java | 2236 | 0.4529 |
| autowire | autowired | 63 | 0.3706 |
| bean | spring | 1532 | 0.3103 |
| bean | jsf | 1157 | 0.2343 |
| autowire | spring-mvc | 31 | 0.1824 |
| autowire | autowire | 30 | 0.1765 |

The most likely association rules for the post body are

| Body Word | Tag Word | Support | Confidence |
|---|---|---|---|
| bean | java | 12812 | 0.4382 |
| bean | spring | 8480 | 0.2900 |
| bean | jsf | 7165 | 0.2451 |
| class | java | 137473 | 0.1859 |
| method | c# | 109780 | 0.1682 |
| class | c# | 116603 | 0.1576 |
| inject | java | 1788 | 0.1522 |
| inject | dependency-injection | 1760 | 0.1498 |

In this example, the association rule algorithm does not work as well. The two most likely association rules do correspond to the correct tags "spring" and "java"; however, the third correct tag, "dependency-injection", has a confidence of only 0.1498. As a result, several incorrect tags ("autowired", "jsf", "spring-mvc", "autowire", and "c#") are more likely than the correct tag "dependency-injection".
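To make this concrete, the sketch below lists the incorrect tags that outrank "dependency-injection" for Example 2. As before, it assumes a tag's score is the highest confidence of any title or body rule predicting it. This is an illustrative choice, and the variable names are hypothetical.

```python
# (word, tag, confidence) rows copied from the two tables for Example 2.
title_rules = [
    ("autowire", "spring", 0.9235), ("autowire", "java", 0.6000),
    ("bean", "java", 0.4529), ("autowire", "autowired", 0.3706),
    ("bean", "spring", 0.3103), ("bean", "jsf", 0.2343),
    ("autowire", "spring-mvc", 0.1824), ("autowire", "autowire", 0.1765),
]
body_rules = [
    ("bean", "java", 0.4382), ("bean", "spring", 0.2900),
    ("bean", "jsf", 0.2451), ("class", "java", 0.1859),
    ("method", "c#", 0.1682), ("class", "c#", 0.1576),
    ("inject", "java", 0.1522), ("inject", "dependency-injection", 0.1498),
]
true_tags = {"java", "spring", "dependency-injection"}

# Assumed scoring: keep the highest confidence seen for each tag.
best = {}
for _word, tag, confidence in title_rules + body_rules:
    best[tag] = max(confidence, best.get(tag, 0.0))

ranked = sorted(best.items(), key=lambda item: item[1], reverse=True)

# Incorrect tags whose score exceeds that of the weakest correct tag.
weakest_correct = best["dependency-injection"]
outranking = [
    tag for tag, conf in ranked
    if tag not in true_tags and conf > weakest_correct
]

print("Rank of dependency-injection:",
      [tag for tag, _ in ranked].index("dependency-injection") + 1)
print("Incorrect tags ranked above it:", outranking)
```

Any confidence cutoff low enough to admit "dependency-injection" would therefore also admit these five incorrect tags.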