Monday, March 9, 2015

Nearest-Neighbor Clustering using RCaller - A library for Calling R from Java

RCaller is a software library for calling R from Java. A blog post with the latest downloadable jar and documentation is available here. The latest news can always be followed under the RCaller label on the Practical Code Solutions blog.

A blog post on performing a k-means clustering analysis using RCaller is also available at this link.

In the code below, two double arrays, x and y, are created on the Java side and passed to R. On the R side, the distance matrix d is computed with dist, and the R function hclust performs the clustering using the single-linkage (nearest-neighbor) method. Finally, the merge heights of the clustering tree and a dendrogram plot are returned to Java. The source code, the output text and the returned plot are presented below:




package kmeansrcaller;

import java.io.File;
import rcaller.RCaller;
import rcaller.RCode;

public class SingleLinkageClustering {

    public static void main(String[] args) {
        RCaller caller = new RCaller();
        RCode code = new RCode();
        File dendrogram = null;

        double[] x = new double[]{1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = new double[]{2, 4, 6, 8, 10, 20, 40, 60, 80, 100};

        code.addDoubleArray("x", x);
        code.addDoubleArray("y", y);

        // Compute the Euclidean distance matrix, then run single-linkage clustering in R
        code.addRCode("d <- dist(cbind(x,y))");
        code.addRCode("h <- hclust(d, method=\"single\")");

        try {
            dendrogram = code.startPlot();
            code.addRCode("plot(h)");
            code.endPlot();
        } catch (Exception e) {
            System.out.println("Plot Error: " + e.toString());
        }

        caller.setRCode(code);

        // Path to the Rscript executable; adjust this for your own installation
        caller.setRscriptExecutable("/usr/bin/Rscript");

        caller.runAndReturnResult("h");
        System.out.println(caller.getParser().getNames());

        if (dendrogram != null) {
            code.showPlot(dendrogram);
        }

        double[] heights = caller.getParser().getAsDoubleArray("height");
        for (int i = 0; i < heights.length; i++) {
            System.out.println("Height " + i + " = " + heights[i]);
        }
    }
}




The output is:

[merge, height, order, method, call, dist_method]
Height 0 = 2.23606797749979
Height 1 = 2.23606797749979
Height 2 = 2.23606797749979
Height 3 = 2.23606797749979
Height 4 = 11.1803398874989
Height 5 = 22.3606797749979
Height 6 = 22.3606797749979
Height 7 = 22.3606797749979
Height 8 = 22.3606797749979
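
As a cross-check that does not need R, the merge heights can be recomputed in plain Java. All ten points lie on the line y = 2x, so the nearest-neighbor gaps are sqrt(5) between the first five points, sqrt(125) between (5, 10) and (10, 20), and sqrt(500) between the remaining points, which agrees with the heights printed above. The sketch below is a naive single-linkage implementation written only for illustration; the class name SingleLinkageCheck and the method mergeHeights are hypothetical and are not part of RCaller or of R's hclust.

```java
class SingleLinkageCheck {

    // Returns the n-1 single-linkage merge heights in merge order.
    // Naive O(n^3) approach, intended only for small illustrative inputs.
    static double[] mergeHeights(double[] x, double[] y) {
        int n = x.length;
        if (n < 2) {
            return new double[0];
        }

        // Each point starts in its own cluster, labelled by its index.
        int[] cluster = new int[n];
        for (int i = 0; i < n; i++) {
            cluster[i] = i;
        }

        double[] heights = new double[n - 1];
        for (int step = 0; step < n - 1; step++) {
            double best = Double.POSITIVE_INFINITY;
            int keep = -1;
            int absorb = -1;

            // Find the closest pair of points that belong to different clusters.
            for (int i = 0; i < n; i++) {
                for (int j = i + 1; j < n; j++) {
                    if (cluster[i] == cluster[j]) {
                        continue;
                    }
                    double d = Math.hypot(x[i] - x[j], y[i] - y[j]);
                    if (d < best) {
                        best = d;
                        keep = cluster[i];
                        absorb = cluster[j];
                    }
                }
            }

            // The merge height is the nearest-neighbor distance between the two clusters.
            heights[step] = best;

            // Merge: relabel every point of the absorbed cluster.
            for (int i = 0; i < n; i++) {
                if (cluster[i] == absorb) {
                    cluster[i] = keep;
                }
            }
        }
        return heights;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = {2, 4, 6, 8, 10, 20, 40, 60, 80, 100};

        double[] heights = mergeHeights(x, y);
        for (int i = 0; i < heights.length; i++) {
            System.out.println("Height " + i + " = " + heights[i]);
        }
    }
}
```

The printed values should agree with the R output up to floating-point formatting. For real data sets, hclust remains the better choice, since this sketch rescans all point pairs at every merge.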



A screenshot of the plotted dendrogram is shown here:



Have a nice read!

