In the code below, we create two variables, x and y, in Java and pass them to R. The R function kmeans is applied to the data matrix built by binding x and y as columns (cbind), asking for two clusters. The result is then read back and reported in Java.
package kmeansrcaller;

import rcaller.RCaller;
import rcaller.RCode;

public class KMeansRCaller {

    public static void main(String[] args) {
        RCaller caller = new RCaller();
        RCode code = new RCode();

        double[] x = new double[]{1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = new double[]{2, 4, 6, 8, 10, 20, 40, 60, 80, 100};

        code.addDoubleArray("x", x);
        code.addDoubleArray("y", y);
        code.addRCode("result <- kmeans(cbind(x,y), 2)");

        caller.setRCode(code);
        caller.setRscriptExecutable("/usr/bin/Rscript");
        caller.runAndReturnResult("result");

        System.out.println(caller.getParser().getNames());

        int[] clusters = caller.getParser().getAsIntArray("cluster");
        double[][] centers = caller.getParser().getAsDoubleMatrix("centers");
        double[] totalSumOfSquares = caller.getParser().getAsDoubleArray("totss");
        // RCaller automatically replaces dots with underlines in variable names
        // So the parameter tot.withinss is accessible as tot_withinss
        double[] totalWithinSumOfSquares = caller.getParser().getAsDoubleArray("tot_withinss");
        double[] totalBetweenSumOfSquares = caller.getParser().getAsDoubleArray("betweenss");

        for (int i = 0; i < clusters.length; i++) {
            System.out.println("Observation " + i + " is in cluster " + clusters[i]);
        }

        System.out.println("Cluster Centers:");
        for (int i = 0; i < centers.length; i++) {
            for (int j = 0; j < centers[0].length; j++) {
                System.out.print(centers[i][j] + " ");
            }
            System.out.println();
        }

        System.out.println("Total Within Sum of Squares: " + totalWithinSumOfSquares[0]);
        System.out.println("Total Between Sum of Squares: " + totalBetweenSumOfSquares[0]);
        System.out.println("Total Sum of Squares: " + totalSumOfSquares[0]);
    }
}
The output is:
Observation 0 is in cluster 2
Observation 1 is in cluster 2
Observation 2 is in cluster 2
Observation 3 is in cluster 2
Observation 4 is in cluster 2
Observation 5 is in cluster 2
Observation 6 is in cluster 2
Observation 7 is in cluster 1
Observation 8 is in cluster 1
Observation 9 is in cluster 1
Cluster Centers:
40.0 6.42857142857143
80.0 12.8571428571429
Total Within Sum of Squares: 2328.57142857143
Total Between Sum of Squares: 11833.9285714286
Total Sum of Squares: 14162.5
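The reported numbers can be cross-checked without R. The sketch below is plain Java and does not use RCaller; the class name KMeansSumsOfSquares and its helper methods are made up for this illustration, and the cluster labels are copied by hand from the output above. It recomputes the centroid of each cluster and the three sums of squares, which should satisfy total = within + between.

```java
import java.util.Locale;

// Illustration only: recomputes the k-means summary statistics in plain Java.
// The class and method names are hypothetical and are not part of RCaller.
class KMeansSumsOfSquares {

    // Mean (x, y) of the points labelled `label`.
    // Convention used in this sketch: label 0 selects every point.
    static double[] centroid(double[] x, double[] y, int[] labels, int label) {
        double sumX = 0, sumY = 0;
        int n = 0;
        for (int i = 0; i < x.length; i++) {
            if (label == 0 || labels[i] == label) {
                sumX += x[i];
                sumY += y[i];
                n++;
            }
        }
        return new double[]{sumX / n, sumY / n};
    }

    // Number of points labelled `label`.
    static int size(int[] labels, int label) {
        int n = 0;
        for (int l : labels) {
            if (l == label) n++;
        }
        return n;
    }

    // Sum of squared Euclidean distances from the selected points to their own centroid.
    static double sumOfSquares(double[] x, double[] y, int[] labels, int label) {
        double[] c = centroid(x, y, labels, label);
        double ss = 0;
        for (int i = 0; i < x.length; i++) {
            if (label == 0 || labels[i] == label) {
                double dx = x[i] - c[0];
                double dy = y[i] - c[1];
                ss += dx * dx + dy * dy;
            }
        }
        return ss;
    }

    // Size-weighted squared distances from each cluster centroid to the overall centroid.
    static double betweenSumOfSquares(double[] x, double[] y, int[] labels, int k) {
        double[] overall = centroid(x, y, labels, 0);
        double ss = 0;
        for (int label = 1; label <= k; label++) {
            double[] c = centroid(x, y, labels, label);
            double dx = c[0] - overall[0];
            double dy = c[1] - overall[1];
            ss += size(labels, label) * (dx * dx + dy * dy);
        }
        return ss;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = {2, 4, 6, 8, 10, 20, 40, 60, 80, 100};
        // Cluster labels copied from the output listed above.
        int[] clusters = {2, 2, 2, 2, 2, 2, 2, 1, 1, 1};

        for (int label = 1; label <= 2; label++) {
            double[] c = centroid(x, y, clusters, label);
            System.out.printf(Locale.ROOT, "Cluster %d center: (%.4f, %.4f)%n", label, c[0], c[1]);
        }

        double total = sumOfSquares(x, y, clusters, 0);
        double within = sumOfSquares(x, y, clusters, 1) + sumOfSquares(x, y, clusters, 2);
        double between = betweenSumOfSquares(x, y, clusters, 2);

        System.out.printf(Locale.ROOT, "Within: %.4f%n", within);
        System.out.printf(Locale.ROOT, "Between: %.4f%n", between);
        System.out.printf(Locale.ROOT, "Total: %.4f%n", total);
    }
}
```

The centroids come out as (40, 80) for cluster 1 and roughly (6.4286, 12.8571) for cluster 2. Comparing this with the "Cluster Centers" lines above shows that each printed row holds one variable (x, then y) and each column one cluster, so centers[0][1] is the x coordinate of cluster 2 in that run.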
Have a nice read!