RCaller 3.0 is released with new features.
Please visit the page
http://mhsatman.com/rcaller-3-0
for the source code, compiled binaries, other downloads and the blog post.
Hope you enjoy the project!
Tuesday, May 24, 2016
Saturday, March 14, 2015
Handling all variables in a workspace in R with RCaller
R assigns a value to a variable name with the assignment operator <-, which corresponds to the assign function.
RCaller handles results as list objects. Since R environments are list-like, they can easily be converted to R lists (see the previous blog post on R lists).
Here is an example of using RCaller to get all variables that are created at run time on the R side. As the output shows, the created variables avector, a, b and d are returned to the Java side in a single call, without any manual translation.
package rcallerenvironments;

import rcaller.RCaller;
import rcaller.RCode;

public class RCallerEnvironments {

    public static void main(String[] args) {
        RCaller rcaller = new RCaller();
        RCode code = new RCode();
        rcaller.setRscriptExecutable("/usr/bin/Rscript");

        code.addRCode("a <- 3");
        code.addRCode("b <- 10.45");
        code.addRCode("d <- TRUE");
        code.addRCode("avector <- c(9,6,5,6)");
        code.addRCode("allvars <- as.list(globalenv())");

        rcaller.setRCode(code);
        rcaller.runAndReturnResult("allvars");

        System.out.println(rcaller.getParser().getNames());
        try {
            System.out.println(rcaller.getParser().getXMLFileAsString());
        } catch (Exception e) {
            System.out.println("Error in accessing XML");
        }
    }
}
Have a nice read!
Friday, March 13, 2015
RCaller 2.5 is available for downloading
We are happy to announce that version 2.5 of our easy-to-use Java library for calling R from Java is now available for download. Developers can get the compiled jar file at
https://github.com/jbytecode/rcaller/releases/tag/2.5
This release does not extend the main functionality of the library, but it adds some handy functions for performing calculations and for the further development of the library.
What is new:
* An official BibTeX entry has been added for citing RCaller in projects and papers
* The RealMatrix class is implemented, so matrix operations can be performed in a more Java-like style
* RService is implemented for developing wrapper functions
Where to start?
* Read the web page on RCaller at http://mhsatman.com/tag/rcaller/
* Read the blog entries at http://stdioe.blogspot.com.tr/search/label/rcaller
* Have a look at the source tree at https://github.com/jbytecode/rcaller
* Download the library from https://github.com/jbytecode/rcaller/releases/tag/2.5
Have a nice try!
Migration of RCaller and Fuzuli Projects to GitHub
Google has announced that it is shutting down its code hosting service Google Code, where our two projects RCaller and Fuzuli Programming Language were hosted.
We have therefore migrated our projects to the popular code hosting site GitHub.
The source code of these projects will no longer be committed to the Google Code site. Please check the new repositories.
GitHub pages are listed below:
RCaller:
https://github.com/jbytecode/rcaller
Fuzuli Project:
https://github.com/jbytecode/fuzuli
Monday, March 9, 2015
Nearest-Neighbor Clustering using RCaller - A library for Calling R from Java
RCaller is a software library for calling R from Java. A blog post includes the latest downloadable jar and documentation. The latest news can always be followed under the RCaller label in the Practical Code Solutions blog.
A blog post on performing a k-means clustering analysis using RCaller is also available.
In the code below, two double arrays, x and y, are created on the Java side. These variables are then passed to R. On the R side, the distance matrix d is calculated. The R function hclust performs the main calculations. Finally, the calculated heights of the clustering tree and a dendrogram plot are returned to Java. The source code, output text and the returned plot are presented here:
package kmeansrcaller;

import java.io.File;
import rcaller.RCaller;
import rcaller.RCode;

public class SingleLinkageClustering {

    public static void main(String[] args) {
        RCaller caller = new RCaller();
        RCode code = new RCode();
        File dendrogram = null;

        double[] x = new double[]{1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = new double[]{2, 4, 6, 8, 10, 20, 40, 60, 80, 100};

        code.addDoubleArray("x", x);
        code.addDoubleArray("y", y);

        // Distance matrix and single-linkage hierarchical clustering in R
        code.addRCode("d <- dist(cbind(x,y))");
        code.addRCode("h <- hclust(d, method=\"single\")");

        try {
            dendrogram = code.startPlot();
            code.addRCode("plot(h)");
            code.endPlot();
        } catch (Exception e) {
            System.out.println("Plot Error: " + e.toString());
        }

        caller.setRCode(code);
        caller.setRscriptExecutable("/usr/bin/Rscript");
        caller.runAndReturnResult("h");

        System.out.println(caller.getParser().getNames());
        if (dendrogram != null) {
            code.showPlot(dendrogram);
        }

        double[] heights = caller.getParser().getAsDoubleArray("height");
        for (int i = 0; i < heights.length; i++) {
            System.out.println("Height " + i + " = " + heights[i]);
        }
    }
}
The output is
[merge, height, order, method, call, dist_method]
Height 0 = 2.23606797749979
Height 1 = 2.23606797749979
Height 2 = 2.23606797749979
Height 3 = 2.23606797749979
Height 4 = 11.1803398874989
Height 5 = 22.3606797749979
Height 6 = 22.3606797749979
Height 7 = 22.3606797749979
Height 8 = 22.3606797749979
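The heights above can be cross-checked without R. For single linkage, the merge heights equal the sorted edge weights of a minimum spanning tree over the points, so a few lines of plain Java are enough. This is only a sketch for verification; the class and method names are made up for this post and are not part of RCaller.

```java
import java.util.Arrays;

class SingleLinkageHeights {

    // Returns the sorted single-linkage merge heights of 2-D points,
    // computed as the edge weights of a minimum spanning tree (Prim's algorithm).
    static double[] heights(double[] x, double[] y) {
        int n = x.length;
        boolean[] inTree = new boolean[n];
        double[] best = new double[n]; // shortest known distance to the tree
        Arrays.fill(best, Double.POSITIVE_INFINITY);
        best[0] = 0.0;
        double[] edges = new double[n - 1];
        int count = 0;
        for (int step = 0; step < n; step++) {
            // Pick the point outside the tree that is closest to it
            int next = -1;
            for (int i = 0; i < n; i++) {
                if (!inTree[i] && (next == -1 || best[i] < best[next])) {
                    next = i;
                }
            }
            inTree[next] = true;
            if (step > 0) {
                edges[count++] = best[next];
            }
            // Update the distances of the remaining points to the tree
            for (int i = 0; i < n; i++) {
                if (!inTree[i]) {
                    double d = Math.hypot(x[next] - x[i], y[next] - y[i]);
                    if (d < best[i]) {
                        best[i] = d;
                    }
                }
            }
        }
        Arrays.sort(edges);
        return edges;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = {2, 4, 6, 8, 10, 20, 40, 60, 80, 100};
        double[] h = heights(x, y);
        for (int i = 0; i < h.length; i++) {
            System.out.println("Height " + i + " = " + h[i]);
        }
    }
}
```

Since all points lie on the line y = 2x, each height is sqrt(5) times the gap between neighbouring x values, which agrees with the values returned by hclust up to rounding.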
The screen shot of the plotted graphics is here:
Have a nice read!
Saturday, March 7, 2015
K-means clustering with RCaller - A library for calling R from Java
Here is an example of RCaller, a library for calling R from Java.
In the code below, we create two variables x and y. K-means clustering function kmeans is applied on the data matrix that consists of x and y. The result is then reported in Java.
package kmeansrcaller;

import rcaller.RCaller;
import rcaller.RCode;

public class KMeansRCaller {

    public static void main(String[] args) {
        RCaller caller = new RCaller();
        RCode code = new RCode();

        double[] x = new double[]{1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = new double[]{2, 4, 6, 8, 10, 20, 40, 60, 80, 100};

        code.addDoubleArray("x", x);
        code.addDoubleArray("y", y);
        code.addRCode("result <- kmeans(cbind(x,y), 2)");

        caller.setRCode(code);
        caller.setRscriptExecutable("/usr/bin/Rscript");
        caller.runAndReturnResult("result");

        System.out.println(caller.getParser().getNames());

        int[] clusters = caller.getParser().getAsIntArray("cluster");
        double[][] centers = caller.getParser().getAsDoubleMatrix("centers");
        double[] totalSumOfSquares = caller.getParser().getAsDoubleArray("totss");
        // RCaller automatically replaces dots with underlines in variable names
        // So the parameter tot.withinss is accessible as tot_withinss
        double[] totalWithinSumOfSquares = caller.getParser().getAsDoubleArray("tot_withinss");
        double[] totalBetweenSumOfSquares = caller.getParser().getAsDoubleArray("betweenss");

        for (int i = 0; i < clusters.length; i++) {
            System.out.println("Observation " + i + " is in cluster " + clusters[i]);
        }

        System.out.println("Cluster Centers:");
        for (int i = 0; i < centers.length; i++) {
            for (int j = 0; j < centers[0].length; j++) {
                System.out.print(centers[i][j] + " ");
            }
            System.out.println();
        }

        System.out.println("Total Within Sum of Squares: " + totalWithinSumOfSquares[0]);
        System.out.println("Total Between Sum of Squares: " + totalBetweenSumOfSquares[0]);
        System.out.println("Total Sum of Squares: " + totalSumOfSquares[0]);
    }
}
The output is
[cluster, centers, totss, withinss, tot_withinss, betweenss, size, iter, ifault]
Observation 0 is in cluster 2
Observation 1 is in cluster 2
Observation 2 is in cluster 2
Observation 3 is in cluster 2
Observation 4 is in cluster 2
Observation 5 is in cluster 2
Observation 6 is in cluster 2
Observation 7 is in cluster 1
Observation 8 is in cluster 1
Observation 9 is in cluster 1
Cluster Centers:
40.0 6.42857142857143
80.0 12.8571428571429
Total Within Sum of Squares: 2328.57142857143
Total Between Sum of Squares: 11833.9285714286
Total Sum of Squares: 14162.5
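The reported sums of squares can be verified by hand in plain Java. The sketch below assumes the cluster assignment printed above (observations 0 to 6 in one cluster, 7 to 9 in the other) and recomputes the total, within and between sums of squares. The class and method names are only illustrative and are not part of RCaller.

```java
class KMeansSumsOfSquares {

    // Sum of squared Euclidean distances from the points with indices
    // [from, to) to their own centroid.
    static double sumOfSquares(double[] x, double[] y, int from, int to) {
        int n = to - from;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = from; i < to; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double ss = 0.0;
        for (int i = from; i < to; i++) {
            ss += (x[i] - meanX) * (x[i] - meanX) + (y[i] - meanY) * (y[i] - meanY);
        }
        return ss;
    }

    public static void main(String[] args) {
        double[] x = {1, 2, 3, 4, 5, 10, 20, 30, 40, 50};
        double[] y = {2, 4, 6, 8, 10, 20, 40, 60, 80, 100};

        // Total: all points around the grand mean
        double total = sumOfSquares(x, y, 0, 10);
        // Within: each cluster around its own center
        double within = sumOfSquares(x, y, 0, 7) + sumOfSquares(x, y, 7, 10);
        // Between: what is left of the total
        double between = total - within;

        System.out.println("Total Within Sum of Squares: " + within);
        System.out.println("Total Between Sum of Squares: " + between);
        System.out.println("Total Sum of Squares: " + total);
    }
}
```

The identity totss = tot.withinss + betweenss holds for the figures above: 2328.57 + 11833.93 = 14162.5. The same arithmetic gives the centers (40, 80) and (6.43, 12.86), so in the printed "Cluster Centers" block each column, not each row, corresponds to one cluster.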
Have a nice read!
Monday, June 16, 2014
RCaller 2.4 has just been released
The key properties of this release:
- Added the deleteTempFiles() method to the RCaller class for deleting, at any time, the temporary files created by RCaller.
- runiversal.r is now more compact.
- The stopRCallerOnline() method in the RCaller class now stops the in-memory R instances that are created by runAndReturnResultOnline(). See the example for the RCaller.stopRCallerOnline() method.
Get informed using the formal blog http://stdioe.blogspot.com.tr/search/label/rcaller
Download page: https://drive.google.com/?authuser=0#folders/0B-sn_YiTiFLGZUt6d3gteVdjTGM
Source code: https://code.google.com/p/rcaller/
Home page: http://mhsatman.com/tag/rcaller/
Journal Documentation: http://www.sciencedomain.org/abstract.php?iid=550&id=6&aid=4838#.U59D8_mSy1Y
Labels
rcaller
Friday, June 13, 2014
R GUI written in Java using RCaller
This video demonstrates that the Java version of the R GUI based on RCaller is now faster after the speed improvements. This simple GUI is available in the source tree. Typed commands are passed to R using the online call mechanism of RCaller, and there is a single active R process in the background.
Please follow the rcaller label on this blog to get the latest RCaller news, updates, examples and other materials.
Enjoy watching!
Scholarly papers, projects and theses that cite RCaller
RCaller is now in its 4th year with version 2.3, and it is considerably mature. It is used in many commercial projects as well as in scholarly papers and theses. Here is the list of scholarly papers, projects and theses that I stumbled upon in Google Scholar.
- MingXue Wang; Handurukande, S.B.; Nassar, M., "RPig: A scalable framework for machine learning and advanced statistical functionalities," Cloud Computing Technology and Science (CloudCom), 2012 IEEE 4th International Conference on , vol., no., pp.293,300, 3-6 Dec. 2012
doi: 10.1109/CloudCom.2012.6427480
URL: http://ieeexplore.ieee.org/stamp/stamp.jsp?tp=&arnumber=6427480&isnumber=6427477
- Niya Wang, Fan Meng, Li Chen, Subha Madhavan, Robert Clarke, Eric P. Hoffman, Jianhua Xuan, and Yue Wang. 2013. The CAM software for nonnegative blind source separation in R-Java. J. Mach. Learn. Res. 14, 1 (January 2013), 2899-2903. http://dl.acm.org/citation.cfm?id=2567753
- Meng, Fan. Design and Implementation of Convex Analysis of Mixtures Software Suite, Master's Thesis, 2012. Abstract: Various convex analysis of mixtures (CAM) based algorithms have been developed to address real world blind source separation (BSS) problems and proven to have good performances in previous papers. This thesis reported the implementation of a comprehensive software CAM-Java, which contains three different CAM based algorithms, CAM compartment modeling (CAM-CM), CAM non-negative independent component analysis (CAM-nICA), and CAM non-negative well-grounded component analysis (CAM-nWCA). The implementation works include: translation of MATLAB coded algorithms to open-sourced R alternatives. As well as building a user friendly graphic user interface (GUI) to integrate three algorithms together, which is accomplished by adopting Java Swing API. In order to combine R and Java coded modules, an open-sourced project RCaller is used to handle the establishment of low level connection between R and Java environment. In addition, specific R scripts and Java classes are also implemented to accomplish the tasks of passing parameters and input data from Java to R, run R scripts in Java environment, read R results back to Java, display R generated figures, and so on. Furthermore, system stream redirection and multi-threads techniques are used to build a simple R messages displaying window in Java built GUI. The final version of the software runs smoothly and stable, and the CAM-CM results on both simulated and real DCE-MRI data are quite close to the original MATLAB version algorithms. The whole GUI based open-sourced software is easy to use, and can be freely distributed among the communities. Technical details in both R and Java modules implementation are also discussed, which presents some good examples of how to develop software with both complicate and up to date algorithms, as well as decent and user friendly GUI in the scientific or engineering research fields. http://scholar.lib.vt.edu/theses/available/etd-08202012-162249/
- Emanuel Gonçalves, Julio Saez-Rodriguez. Cyrface: An interface from Cytoscape to R that provides a user interface to R packages, F1000Research 2013, 2:192 Last updated: 20 JAN 2014, http://f1000research.com/articles/2-192/v1/pdf
- Alexandre Rossi Alvares, Norton Trevisan Roman. AgreeCalc: Uma Ferramenta para Análise da Concordância entre Múltiplos Anotadores, http://www.lbd.dcc.ufmg.br/colecoes/stil/2013/001.pdf
- Manuel Piubelli, OSSQuery - A Data Mining System for Investigation on Cause - Effect Relations in OSS, Free University of Bolzano Faculty of Computer Science, Master Thesis, July, 2011, http://pro.unibz.it/library/thesis/00007317_21688.pdf
- Miroslav Batchkarov, An evolutionary approach to lane departure warning, BSc dissertation, University of Sussex, 2011, http://www.sussex.ac.uk/Users/mmb28/resources/diss/report_final.pdf
- Satman, M.H. RCaller: A Software Library for Calling R from Java, British Journal of Mathematics & Computer Science, ISSN: 2231-0851,Vol.: 4, Issue.: 15 (01-15 August), 2014, http://www.sciencedomain.org/abstract.php?iid=550&id=6&aid=4838#.U5WfTPl_t2M
Monday, June 9, 2014
New Documentation for RCaller
As new documentation and a brief introduction, the research paper "RCaller: A Software Library for Calling R from Java" has just been published in the scholarly journal "British Journal of Mathematics and Computer Science".
The aim of this paper is to give a brief introduction to RCaller and to show how to use it in relatively small projects: calling R scripts and commands from Java, generating plots and images, running commands online, and converting and sending plain Java objects to R.
Two other important projects, rJava and Rserve, are compared to RCaller in terms of time efficiency. The comparison shows that rJava and Rserve outperform RCaller in running time, but RCaller seems to be easier to learn and requires less setup effort.
The paper is freely available for download at
http://www.sciencedomain.org/abstract.php?iid=550&id=6&aid=4838#.U5VvkPl_t2M
and the author's page is
http://mhsatman.com/research-paper-rcaller-a-software-library-for-calling-r-from-java/ .
Have a nice read!
Thursday, May 15, 2014
New Release: RCaller 2.3.0
A new version of RCaller has just been uploaded to the Google Drive repository.
The new version includes basic bug fixes, new test files and speed enhancements.
The XML file structure is now smaller in size, and this makes RCaller a little bit faster than the older versions.
The most important addition in this release is the method
public int[] getDimensions(String name)
which reports the dimensions of the object with the given name. Here is an example:
int n = 21;
int m = 23;
double[][] data = new double[n][m];
for (int i = 0; i < data.length; i++) {
    for (int j = 0; j < data[0].length; j++) {
        data[i][j] = Math.random();
    }
}
RCaller caller = new RCaller();
Globals.detect_current_rscript();
caller.setRscriptExecutable(Globals.Rscript_current);

RCode code = new RCode();
code.addDoubleMatrix("x", data);
caller.setRCode(code);
caller.runAndReturnResult("x");

int[] mydim = caller.getParser().getDimensions("x");
Assert.assertEquals(n, mydim[0]);
Assert.assertEquals(m, mydim[1]);
In the code above, a matrix with dimensions 21 and 23 is passed to R and returned to Java. The variable mydim holds the number of rows and columns, which are 21 and 23 as expected. Please use the download link https://drive.google.com/?tab=mo&authuser=0#folders/0B-sn_YiTiFLGZUt6d3gteVdjTGM to access the compiled jar files of RCaller. Good luck!
Monday, April 21, 2014
Matrix Inversion with RCaller 2.2
Here is an example of passing a double[][] matrix from Java to R, making R calculate the inverse of this matrix and handling the result in Java. Note that the code is current for version 2.2 of RCaller.
RCaller caller = new RCaller();
Globals.detect_current_rscript();
caller.setRscriptExecutable(Globals.Rscript_current);
RCode code = new RCode();

double[][] matrix = new double[][]{{6, 4}, {9, 8}};
code.addDoubleMatrix("x", matrix);
// solve(x) with a single argument returns the inverse of x
code.addRCode("s <- solve(x)");
caller.setRCode(code);
caller.runAndReturnResult("s");

double[][] inverse = caller.getParser().getAsDoubleMatrix("s",
        matrix.length, matrix[0].length);

for (int i = 0; i < inverse.length; i++) {
    for (int j = 0; j < inverse[0].length; j++) {
        System.out.print(inverse[i][j] + " ");
    }
    System.out.println();
}
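To know what values to expect back from R, the inverse of this small matrix can be computed directly with the closed-form 2x2 formula. The sketch below is plain Java and does not use RCaller; the class and method names are made up for illustration.

```java
class Inverse2x2 {

    // Inverse of a 2x2 matrix via the adjugate divided by the determinant.
    static double[][] invert(double[][] m) {
        double det = m[0][0] * m[1][1] - m[0][1] * m[1][0];
        if (det == 0.0) {
            throw new ArithmeticException("Matrix is singular");
        }
        return new double[][]{
            { m[1][1] / det, -m[0][1] / det},
            {-m[1][0] / det,  m[0][0] / det}
        };
    }

    public static void main(String[] args) {
        double[][] inverse = invert(new double[][]{{6, 4}, {9, 8}});
        for (double[] row : inverse) {
            System.out.println(row[0] + " " + row[1]);
        }
    }
}
```

For the matrix {{6, 4}, {9, 8}} the determinant is 12, so the inverse is {{8/12, -4/12}, {-9/12, 6/12}}. The values returned by R's solve should agree with this up to rounding.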
Saturday, April 12, 2014
RCaller 2.2.0 has just been released
We plan to fix the recently reported bugs. The most important one caused errors related to the R package Runiversal, which the library requires for generating XML files. The underlying issue was R's package storage policy, which depends on the user who installed the package.
In the most recent version, 2.2, users no longer need to pre-install the R package Runiversal. Simply add RCaller-2.2.0-SNAPSHOT.jar to your classpath and go!
The download link of the compiled library is here [Google Drive link]
The library has been tested on a PC running Ubuntu, and the usual test scenarios succeed in all cases. The library has not been tested on Windows machines.
Please report your problems on the issues page of the Google Code site at http://code.google.com/p/rcaller/issues/list and do not hesitate to contribute to our library.
Saturday, August 17, 2013
A User Document For RCaller
A new research paper that serves as RCaller documentation is freely available at http://www.sciencedomain.org/abstract.php?iid=550&id=6&aid=4838#.U5YSoPmSy1Y
RCaller: A library for calling R from Java
by M.Hakan Satman
August 17, 2013
Contents
1 Introduction
2 Calling R Functions
3 Handling Plots
4 Live Connection
5 Monitoring the Output
6 Conclusion
Abstract
RCaller is an open-source, compact, and easy-to-use library for calling
R from Java. It offers not only an elegant solution for the task but
its simplicity is key for non-programmers or programmers who are not
familiar with the internal structure of R. Since R is not only a statistical
software but an enormous collection of statistical functions, accessing its
functions and packages is of tremendous value. In this short paper, we
give a brief introduction on the most widely-used methods to call R from
Java and highlight some properties of RCaller with short examples. User
feedback has shown that RCaller is an important tool in many cases where
performance is not a central concern.
1 Introduction
R [R Development Core Team(2011)] is an open source and freely distributed
statistics software package for which hundreds of external packages are
available. The core functionality of R is written mostly in C and wrapped
by R functions which simplify parameter passing. Since R manages
the exhaustive dynamic library loading tasks in a clever way, calling
an external compiled function is as easy as calling an R function in R.
However, integration with JVM (Java Virtual Machine) languages is
painful.
The R package rJava [Urbanek(2011a)] provides a useful mechanism for
instantiating Java objects, accessing class elements and passing R objects to
Java methods in R. This library is convenient for the R packages that
rely on external functionality written in Java rather than C, C++ or
Fortran.
The library JRI, which is now a part of the package rJava, uses JNI (Java
Native Interface) to call R from Java [Urbanek(2009)]. Although JNI is the
most common way of accessing native libraries in Java, JRI requires that
several system and environment variables are correctly set before any run,
which can be difficult for inexperienced users, especially those who are not
computer scientists.
The package Rserve [Urbanek(2011b)] uses TCP sockets and acts as a TCP
server. A client establishes a connection to Rserve, sends R commands, and
receives the results. This way of calling R from other platforms is more
general because the handshaking and the protocol initialization are fully platform
independent.
Renjin (http://code.google.com/p/renjin) is another interesting project
that addresses the problem. It solves the problem of calling R from Java by
re-implementing the R interpreter in Java! Accordingly, the project
includes the tasks of writing the interpreter and implementing the internals.
Renjin is intended to be 100% compatible with the original. However, it
is under development and needs help. Nevertheless, an online demo is
available and is updated whenever the source code is
updated.
Finally, RCaller [RCaller Development Team(2011)] is an LGPL’d library
which is very easy to use. It does not do much but wraps the operations well. It
requires no configuration beyond installing an R package (Runiversal) and
locating the Rscript binary distributed with R. Although it is known to be
relatively inefficient compared to other options, its latest release features
significant performance improvements.
2 Calling R Functions
Calling R code from other languages is not trivial. R includes a huge collection
of math and statistics libraries with nearly 700 internal functions and hundreds
of external packages. No comparable library exists in Java. Although libraries
such as the Apache Commons Math [Commons Math Developers(2010)] do
provide many classes for those calculations, their scope is quite limited
compared to R. For example, it is not easy to find a Java library that
calculates quantiles and probabilities of non-central distributions. [Harner
et al.(2009)] affirm that using R’s functionality from
Java saves the user from writing duplicate code in statistical
software.
RCaller is another open source library for performing R operations from
within Java applications in a wrapped way. RCaller prepares R code using the
user input. The user input is generally a Java array, a plain Java object or the
R code itself. It then creates an external R process by running the Rscript
executable. It passes the generated R code and receives the output as XML
documents. While the process is alive, the standard output and the
standard error streams are handled by an event-driven mechanism. The
returned XML document is then parsed and the returned R objects are
extracted to Java arrays.
The short example given below creates two double vectors, passes them
to R, and returns the residuals calculated from a linear regression
estimation.
RCaller caller = new RCaller();
RCode code = new RCode();

double[] xvector = new double[]{1,3,5,3,2,4};
double[] yvector = new double[]{6,7,5,6,5,6};

caller.setRscriptExecutable("/usr/bin/Rscript");
code.addDoubleArray("X", xvector);
code.addDoubleArray("Y", yvector);
code.addRCode("ols <- lm ( Y ~ X )");

caller.setRCode(code);
caller.runAndReturnResult("ols");

double[] residuals = caller.getParser().getAsDoubleArray("residuals");
The lm function returns an R list with a class of lm whose elements are
accessible with the $ operator. The method runAndReturnResult() takes the
name of an R list which contains the desired results. Finally, the method
getAsDoubleArray() returns a double vector with values filled from the vector
residuals of the list ols.
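As a sanity check on the values that come back from R, the residuals of this simple regression can also be computed in plain Java with the textbook least-squares formulas. This sketch does not use RCaller or Apache Commons Math; the class and method names are hypothetical.

```java
class OlsResiduals {

    // Returns {intercept, slope} of the least-squares line y = a + b * x.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;
        double sxy = 0.0;
        double sxx = 0.0;
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        return new double[]{meanY - slope * meanX, slope};
    }

    // Residuals: observed y minus fitted y.
    static double[] residuals(double[] x, double[] y) {
        double[] coef = fit(x, y);
        double[] res = new double[x.length];
        for (int i = 0; i < x.length; i++) {
            res[i] = y[i] - (coef[0] + coef[1] * x[i]);
        }
        return res;
    }

    public static void main(String[] args) {
        double[] xvector = {1, 3, 5, 3, 2, 4};
        double[] yvector = {6, 7, 5, 6, 5, 6};
        for (double r : residuals(xvector, yvector)) {
            System.out.println(r);
        }
    }
}
```

For the data above the slope works out to -0.1 and the intercept to about 6.1333, and the residuals sum to zero as they should for a model with an intercept.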
RCaller uses the R package Runiversal [Satman(2010)] to convert R lists to
XML documents within the R process. This package includes the method
makexml() which takes an R list as input and returns the XML
document as a string. Although some R functions return the results in other
types and classes of data, those results can be returned to the JVM
indirectly. Suppose that obj is an S4 object with members member1 and
member2. These members are accessible with the @ operator like
obj@member1 and obj@member2. These elements can be returned to
Java by constructing a new list like result <- list(m1=obj@member1,
m2=obj@member2).
3 Handling Plots
Although the graphics drivers and the internals are implemented in
C, most of the graphics functions and packages are written in the
R language, and this makes R unique with its graphics library.
RCaller handles a plot with the function startPlot() and receives a
java.io.File reference to the generated plot. The function getPlot() returns
an instance of the javax.swing.ImageIcon class which contains the
generated image in a fully isolated way. A Java example is shown
below:
RCaller caller = new RCaller();
RCode code = new RCode();
File plotFile = null;
ImageIcon plotImage = null;

caller.setRscriptExecutable("/usr/bin/Rscript");
code.R_require("lattice");
try {
    plotFile = code.startPlot();
    code.addRCode("xyplot(rnorm(100)~1:100, type='l')");
    // Close the graphics device so that the plot file is complete
    code.endPlot();
} catch (IOException err) {
    System.out.println("Can not create plot");
}

caller.setRCode(code);
caller.runOnly();
plotImage = code.getPlot(plotFile);
code.showPlot(plotFile);
The method runOnly() is quite different from the method runAndReturnResult().
Because the user only wants a plot to be generated, nothing is returned by
R in the example above. Note that more than one plot can be generated in a
single run.
Handling R plots with a java.io.File reference is also convenient in web
projects. Generated content can be easily sent to clients using output streams
opened from the file reference. However, RCaller uses the temp directory and
does not delete the generated files automatically. Over time this may cause an
OS-level "too many files" error, which can not be caught by a Java program.
Cleaning the generated output with a scheduled task solves this
problem.
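The two points above, streaming a plot file to a client and periodically cleaning the temp directory, can be sketched with the standard library alone. This is a minimal sketch and not part of RCaller: the class PlotFiles, its method names, and the glob pattern passed by the caller are assumptions made for illustration, and the actual names of the files RCaller writes should be checked before choosing a pattern.

```java
import java.io.File;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper, not part of RCaller.
public class PlotFiles {

    /** Copies a generated plot file to an output stream (for example a web response). */
    public static long send(File plotFile, OutputStream out) throws IOException {
        return Files.copy(plotFile.toPath(), out);
    }

    /** Deletes regular files in dir that match glob and are older than maxAge. */
    public static int deleteOldFiles(Path dir, String glob, Duration maxAge)
            throws IOException {
        Instant cutoff = Instant.now().minus(maxAge);
        int deleted = 0;
        try (DirectoryStream<Path> files = Files.newDirectoryStream(dir, glob)) {
            for (Path file : files) {
                if (Files.isRegularFile(file)
                        && Files.getLastModifiedTime(file).toInstant().isBefore(cutoff)) {
                    Files.deleteIfExists(file);
                    deleted++;
                }
            }
        }
        return deleted;
    }

    /** Runs deleteOldFiles at a fixed rate; the caller should shut the scheduler down. */
    public static ScheduledExecutorService scheduleCleanup(
            Path dir, String glob, Duration maxAge, long periodMinutes) {
        ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                deleteOldFiles(dir, glob, maxAge);
            } catch (IOException e) {
                System.err.println("Plot cleanup failed: " + e.getMessage());
            }
        }, 0, periodMinutes, TimeUnit.MINUTES);
        return scheduler;
    }
}
```

Restricting the cleanup to a glob and a minimum age is a deliberate choice here: the temp directory is shared with other programs, and a plot that is still being served should not be deleted.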
4 Live Connection
Each time the method runAndReturnResult() is called, an Rscript instance is
created to perform the operations. This is the main source of the inefficiency of
RCaller. A better approach when R commands are called repeatedly
is to use the method runAndReturnResultOnline(). This method creates an R
instance and keeps it running in the background. This approach avoids the time
required to create an external process, initialize the interpreter, and load
packages in subsequent calls.
The example given below returns the determinant of a given matrix and the
determinant of its inverse in sequence, that is, it uses a single external
instance to perform more than one operation.
RCaller caller = new RCaller();
RCode code = new RCode();
double[][] matrix = new double[][]{{5, 4, 5}, {6, 1, 0}, {9, -1, 2}};
caller.setRExecutable("/usr/bin/R");
caller.setRCode(code);
code.clear();
code.addDoubleMatrix("x", matrix);
code.addRCode("result<-list(d=det(x))");
// The first call creates the R instance and keeps it running
caller.runAndReturnResultOnline("result");
System.out.println("Determinant is "
    + caller.getParser().getAsDoubleArray("d")[0]);
// The second call reuses the same R instance
code.addRCode("result<-list(t=det(solve(x)))");
caller.runAndReturnResultOnline("result");
System.out.println("Determinant of inverse is "
    + caller.getParser().getAsDoubleArray("t")[0]);
This use of RCaller is fast and convenient for repeated commands. Since R is
not thread-safe, its functions can not be called by more than one thread at a
time. Therefore, each thread must create its own R process to perform
calculations simultaneously in Java.
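One way to arrange "one R process per thread" in Java is a ThreadLocal. The sketch below shows only the pattern and makes no RCaller calls: the class PerThreadSession and its nested Session are hypothetical stand-ins, and in a real program Session would presumably hold that thread's own RCaller and RCode instances.

```java
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

public class PerThreadSession {

    /** Hypothetical stand-in for the object that owns one R process. */
    public static final class Session {
        public final String owner = Thread.currentThread().getName();
    }

    // Each thread lazily creates its own Session and reuses it afterwards.
    private static final ThreadLocal<Session> SESSION =
            ThreadLocal.withInitial(Session::new);

    /** Returns the Session that belongs to the calling thread. */
    public static Session current() {
        return SESSION.get();
    }

    public static void main(String[] args) throws Exception {
        ExecutorService pool = Executors.newFixedThreadPool(2);
        try {
            Callable<Session> task = PerThreadSession::current;
            Future<Session> first = pool.submit(task);
            Future<Session> second = pool.submit(task);
            // Each worker thread reports the session it created for itself.
            System.out.println(first.get().owner + " / " + second.get().owner);
        } finally {
            pool.shutdown();
        }
    }
}
```

With this layout no session is ever shared between threads, so no locking around R calls is needed; the cost is one R process per worker thread.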
5 Monitoring the Output
RCaller receives the desired content as XML documents. The content is a list of
the variables of interest which are manually created by the user or returned
automatically by a function. Apart from the generated content, R produces
some output to the standard output (stdout) and the standard error
(stderr) devices. RCaller offers two options to handle these outputs.
The first one is to save them in a text file. The other is to redirect
all of the content to the standard output device. The example given
below shows a conditional redirection of the outputs generated by
R.
if (console) {
    caller.redirectROutputToConsole();
} else {
    caller.redirectROutputToFile(
        "output.txt" /* filename */,
        true /* append? */);
}
6 Conclusion
In addition to being statistical software, R is an extendable library with its
internal functions and external packages. Since the R interpreter was written
mostly in C, linking to custom C/C++ programs is relatively simple.
Unfortunately, calling R functions from Java is not straightforward. The
prominent methods use JNI and TCP sockets to solve this problem.
In addition, Renjin offers a different perspective on this issue. It is a
re-implementation of R in Java which is intended to be 100% compatible
with the original. However, it is under development and needs help.
Finally, RCaller is an alternative way of calling R from Java. It is
packaged in a single jar and it does not require setup beyond the one-time
installation of the R package Runiversal. It supports loading external
packages, calling functions, handling plots and debugging the output
generated by R. It is not the most efficient method compared to the
alternatives, but users report that the performance improvements in the latest
revision and its simplicity of use make it an important tool in many
applications.
References
[Commons Math Developers(2010)] Commons Math
Developers. Apache Commons Math, Release 2.1. Available from
http://commons.apache.org/math/download_math.cgi, Apr. 2010.
URL http://commons.apache.org/math.
[Harner et al.(2009)Harner, Luo, and Tan] E. Harner, D. Luo, and
J. Tan. JavaStat: A Java/R-based statistical computing environment.
Computational Statistics, 24(2):295–302, May 2009.
[R Development Core Team(2011)] R Development Core Team. R:
A Language and Environment for Statistical Computing. R
Foundation for Statistical Computing, Vienna, Austria, 2011. URL
http://www.R-project.org/. ISBN 3-900051-07-0.
[RCaller Development Team(2011)] RCaller
Development Team. RCaller: A library for calling R from Java, 2011.
URL http://code.google.com/p/rcaller.
[Satman(2010)] M. H. Satman. Runiversal: A Package for
converting R objects to Java variables and XML., 2010. URL
http://CRAN.R-project.org/package=Runiversal. R package
version 1.0.1.
[Urbanek(2009)] S. Urbanek. How to talk to strangers: ways
to leverage connectivity between R, Java and Objective C.
Computational Statistics, 24(2):303–311, May 2009.
[Urbanek(2011a)] S. Urbanek. rJava: Low-level R to Java interface,
2011a. URL http://CRAN.R-project.org/package=rJava. R package
version 0.9-2.
[Urbanek(2011b)] S. Urbanek. Rserve: Binary R server, 2011b. URL
http://CRAN.R-project.org/package=Rserve. R package version
0.6-5.
A new research paper that serves as RCaller documentation is freely available at http://www.sciencedomain.org/abstract.php?iid=550&id=6&aid=4838#.U5YSoPmSy1Y