
Saturday, August 17, 2013

A User Document For RCaller

A new research paper that serves as RCaller documentation is freely available at http://www.sciencedomain.org/abstract.php?iid=550&id=6&aid=4838#.U5YSoPmSy1Y




RCaller: A library for calling R from Java

by M.Hakan Satman

August 17, 2013



Abstract

RCaller is an open-source, compact, and easy-to-use library for calling R from Java. It offers an elegant solution to the task, and its simplicity is key for non-programmers or programmers who are not familiar with the internal structure of R. Since R is not only statistical software but also an enormous collection of statistical functions, accessing its functions and packages is of tremendous value. In this short paper, we give a brief introduction to the most widely used methods for calling R from Java and highlight some properties of RCaller with short examples. User feedback has shown that RCaller is an important tool in many cases where performance is not a central concern.

1 Introduction


R [R Development Core Team(2011)] is an open-source and freely distributed statistical software package for which hundreds of external packages are available. The core functionality of R is written mostly in C and wrapped by R functions which simplify parameter passing. Since R manages the tedious task of loading dynamic libraries in a clever way, calling an external compiled function is as easy as calling an R function in R. However, integration with JVM (Java Virtual Machine) languages is painful.
The R package rJava [Urbanek(2011a)] provides a useful mechanism for instantiating Java objects, accessing class elements and passing R objects to Java methods in R. This library is convenient for the R packages that rely on external functionality written in Java rather than C, C++ or Fortran.
The library JRI, which is now a part of the package rJava, uses JNI (Java Native Interface) to call R from Java [Urbanek(2009)]. Although JNI is the most common way of accessing native libraries in Java, JRI requires that several system and environment variables are correctly set before any run, which can be difficult for inexperienced users, especially those who are not computer scientists.
The package Rserve [Urbanek(2011b)] uses TCP sockets and acts as a TCP server. A client establishes a connection to Rserve, sends R commands, and receives the results. This way of calling R from other platforms is more general because the handshaking and the protocol initialization are fully platform independent.
Renjin (http://code.google.com/p/renjin) is another interesting project that addresses the problem. It solves the problem of calling R from Java by re-implementing the R interpreter in Java! By this definition, the project includes the tasks of writing the interpreter and implementing the internals. Renjin is intended to be 100% compatible with the original. However, it is under development and needs help. Nevertheless, an online demo is available, and it is updated whenever the source code is updated.
Finally, RCaller [RCaller Development Team(2011)] is an LGPL’d library which is very easy to use. It does not do much but wraps the operations well. It requires no configuration beyond installing an R package (Runiversal) and locating the Rscript binary distributed with R. Although it is known to be relatively inefficient compared to other options, its latest release features significant performance improvements.

2 Calling R Functions


Calling R code from other languages is not trivial. R includes a huge collection of math and statistics libraries with nearly 700 internal functions and hundreds of external packages. No comparable library exists in Java. Although libraries such as Apache Commons Math [Commons Math Developers(2010)] do provide many classes for those calculations, their scope is quite limited compared to R. For example, it is not easy to find a Java library that calculates quantiles and probabilities of non-central distributions. [Harner et al.(2009)Harner, Luo, and Tan] affirm that using R’s functionality from Java saves the user from writing duplicate code in statistical software.
RCaller is another open-source library for performing R operations from within Java applications in a wrapped way. RCaller prepares R code using the user input. The user input is generally a Java array, a plain Java object or the R code itself. It then creates an external R process by running the Rscript executable. It passes the generated R code and receives the output as XML documents. While the process is alive, the standard output and the standard error streams are handled by an event-driven mechanism. The returned XML document is then parsed and the returned R objects are extracted to Java arrays.
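To make the first step of this pipeline concrete, the sketch below shows one way a Java double array could be rendered as R source text before it is handed to Rscript. The class RCodeSketch and the method doubleArrayToR are hypothetical names used only for illustration. They are not part of the RCaller API, and RCaller's own code generation may differ.

```java
import java.util.StringJoiner;

// Hypothetical helper for illustration only; not part of RCaller.
final class RCodeSketch {

    // Renders a Java double array as an R assignment,
    // for example "X <- c(1.0, 3.0, 5.0)".
    // Assumption: all values are finite; NaN and infinities are not handled.
    static String doubleArrayToR(String name, double[] values) {
        StringJoiner joiner = new StringJoiner(", ", name + " <- c(", ")");
        for (double value : values) {
            joiner.add(Double.toString(value));
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        // Assemble a small R script line by line, as a code buffer would.
        StringBuilder script = new StringBuilder();
        script.append(doubleArrayToR("X", new double[]{1, 3, 5, 3, 2, 4})).append('\n');
        script.append(doubleArrayToR("Y", new double[]{6, 7, 5, 6, 5, 6})).append('\n');
        script.append("ols <- lm(Y ~ X)").append('\n');
        System.out.print(script);
    }
}
```

The generated text is ordinary R code, so it can be combined freely with R statements written by hand.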
The short example given below creates two double vectors, passes them to R, and returns the residuals calculated from a linear regression estimation.
RCaller caller = new RCaller();
RCode code = new RCode();
double[] xvector = new double[]{1,3,5,3,2,4};
double[] yvector = new double[]{6,7,5,6,5,6};

caller.setRscriptExecutable("/usr/bin/Rscript");

code.addDoubleArray("X", xvector);
code.addDoubleArray("Y", yvector);
code.addRCode("ols <- lm ( Y ~ X )");

caller.setRCode(code);

caller.runAndReturnResult("ols");

double[] residuals =
   caller.getParser().
     getAsDoubleArray("residuals");  

The lm function returns an R list with a class of lm whose elements are accessible with the $ operator. The method runAndReturnResult() takes the name of an R list which contains the desired results. Finally, the method getAsDoubleArray() returns a double vector with values filled from the vector residuals of the list ols.
RCaller uses the R package Runiversal [Satman(2010)] to convert R lists to XML documents within the R process. This package includes the function makexml() which takes an R list as input and returns the XML document as a string. Although some R functions return their results in other types and classes of data, those results can be returned to the JVM indirectly. Suppose that obj is an S4 object with members member1 and member2. These members are accessible with the @ operator, as in obj@member1 and obj@member2. They can be returned to Java by constructing a new list like result <- list(m1=obj@member1, m2=obj@member2).

3 Handling Plots


Although the graphics drivers and the internals are implemented in C, most of the graphics functions and packages are written in the R language, and this graphics library is part of what makes R unique. RCaller handles a plot with the function startPlot(), which returns a java.io.File reference to the generated plot. The function getPlot() returns an instance of the javax.swing.ImageIcon class which contains the generated image in a fully isolated way. A Java example is shown below:
RCaller caller = new RCaller();
RCode code = new RCode();
File plotFile = null;
ImageIcon plotImage = null;

caller.
setRscriptExecutable("/usr/bin/Rscript");

code.R_require("lattice");

try{
 plotFile = code.startPlot();
 code.addRCode("xyplot(rnorm(100)~1:100, type='l')");
}catch (IOException err){
 System.out.println("Can not create plot");
}

caller.setRCode(code);
caller.runOnly();

plotImage = code.getPlot(plotFile);
code.showPlot(plotFile);

The method runOnly() is quite different from the method runAndReturnResult(). Because the user only wants a plot to be generated, there is nothing returned by R in the example above. Note that more than one plot can be generated in a single run.
Handling R plots with a java.io.File reference is also convenient in web projects. Generated content can be easily sent to clients using output streams opened from the file reference. However, RCaller uses the temp directory and does not delete the generated files automatically. This may cause an OS-level "too many files" error, which cannot be caught by a Java program. Cleaning the generated output with a scheduled task solves this problem.
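The scheduled cleanup mentioned above could look like the following sketch, which uses only the JDK. The class name TempPlotCleaner, the glob pattern and the one-hour interval are assumptions made for this example. RCaller does not ship such a class, and the pattern must be adapted to the file names that are actually produced.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.file.DirectoryStream;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.attribute.FileTime;
import java.time.Duration;
import java.time.Instant;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

// Hypothetical helper for illustration only; not part of RCaller.
final class TempPlotCleaner {

    // Deletes regular files in dir that match glob and were last modified
    // more than maxAge ago. Returns the number of files removed.
    static int deleteOlderThan(Path dir, String glob, Duration maxAge) {
        Instant cutoff = Instant.now().minus(maxAge);
        int deleted = 0;
        try (DirectoryStream<Path> entries = Files.newDirectoryStream(dir, glob)) {
            for (Path entry : entries) {
                if (Files.isRegularFile(entry)
                        && Files.getLastModifiedTime(entry).toInstant().isBefore(cutoff)) {
                    Files.delete(entry);
                    deleted++;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return deleted;
    }

    // Runs the cleanup once per hour on a background thread.
    // The caller should shut the returned scheduler down when it is no longer needed.
    static ScheduledExecutorService scheduleHourly(Path dir, String glob, Duration maxAge) {
        ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
        scheduler.scheduleAtFixedRate(() -> {
            try {
                deleteOlderThan(dir, glob, maxAge);
            } catch (UncheckedIOException e) {
                // Report and continue; an uncaught exception would cancel future runs.
                System.err.println("Cleanup failed: " + e.getMessage());
            }
        }, 0, 1, TimeUnit.HOURS);
        return scheduler;
    }

    // Small demonstration in a scratch directory: one stale file, one fresh file.
    static int demo() {
        try {
            Path dir = Files.createTempDirectory("plot-cleanup-demo");
            Path stale = Files.createFile(dir.resolve("stale.png"));
            Files.createFile(dir.resolve("fresh.png"));
            Files.setLastModifiedTime(stale,
                    FileTime.from(Instant.now().minus(Duration.ofHours(2))));
            return deleteOlderThan(dir, "*.png", Duration.ofHours(1));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("Deleted files: " + demo());
    }
}
```

Filtering by age keeps the task from removing a plot that a running request is still reading, which is why the sketch does not simply empty the directory.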

4 Live Connection


Each time the method runAndReturnResult() is called, an Rscript instance is created to perform the operations. This is the main source of the inefficiency of RCaller. A better approach when R commands are called repeatedly is to use the method runAndReturnResultOnline(). This method creates an R instance and keeps it running in the background. This approach avoids the time required to create an external process, initialize the interpreter, and load packages in subsequent calls.
The example given below returns the determinant of a given matrix and then the determinant of its inverse, that is, it uses a single external instance to perform more than one operation.
double[][] matrix =
    new double[][]{{5,4,5},{6,1,0},{9,-1,2}};
caller.setRExecutable("/usr/bin/R");
caller.setRCode(code);

code.clear();
code.addDoubleMatrix("x", matrix);
code.addRCode("result<-list(d=det(x))");
caller.runAndReturnResultOnline("result");

System.out.println(
"Determinant is " +
  caller.getParser().
   getAsDoubleArray("d")[0]
   );

code.addRCode("result<-list(t=det(solve(x)))");
caller.runAndReturnResultOnline("result");

System.out.println(
"Determinant of inverse is " +
  caller.getParser().
   getAsDoubleArray("t")[0]
   );

This use of RCaller is fast and convenient for repeated commands. Since R is not thread-safe, its functions cannot be called by more than one thread at a time. Therefore, each thread must create its own R process to perform calculations simultaneously in Java.
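One common JDK pattern for giving every thread its own resource is a ThreadLocal. The sketch below illustrates the idea with a placeholder class: Session is a hypothetical stand-in for an object that would own one external R process, and PerThreadSessions is not part of RCaller.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

// Illustration only; Session is a placeholder, not an RCaller class.
final class PerThreadSessions {

    // Hypothetical stand-in for an object that owns one external R process.
    static final class Session {
        private static final AtomicInteger COUNTER = new AtomicInteger();
        final int id = COUNTER.incrementAndGet();
    }

    // Each thread lazily creates its own Session on first use and then reuses it.
    private static final ThreadLocal<Session> SESSION =
            ThreadLocal.withInitial(Session::new);

    static Session current() {
        return SESSION.get();
    }

    // Runs the given number of tasks on a fixed pool and reports how many
    // distinct sessions the worker threads ended up using.
    static int distinctSessionsUsed(int threads, int tasks) {
        Set<Integer> ids = ConcurrentHashMap.newKeySet();
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        for (int i = 0; i < tasks; i++) {
            pool.execute(() -> ids.add(current().id));
        }
        pool.shutdown();
        try {
            if (!pool.awaitTermination(10, TimeUnit.SECONDS)) {
                throw new IllegalStateException("workers did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ids.size();
    }

    public static void main(String[] args) {
        System.out.println("Sessions used by 3 workers: " + distinctSessionsUsed(3, 12));
    }
}
```

With real external processes, each session would also have to be closed when its thread finishes, which this sketch leaves out.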

5 Monitoring the Output


RCaller receives the desired content as XML documents. The content is a list of the variables of interest which are manually created by the user or returned automatically by a function. Apart from the generated content, R produces some output to the standard output (stdout) and the standard error (stderr) devices. RCaller offers two options to handle these outputs. The first one is to save them in a text file. The other is to redirect all of the content to the standard output device. The example given below shows a conditional redirection of the outputs generated by R.
if(console){
 caller.redirectROutputToConsole();
}else{
 caller.redirectROutputToFile(
     "output.txt" /* filename */,
     true  /* append? */);
}
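For readers who want to see the same two choices expressed with the plain JDK, the sketch below configures a ProcessBuilder either to inherit the console or to append to a file. It is only an analogy: the class OutputRedirectSketch is hypothetical, and RCaller itself handles the streams with its own event-driven mechanism rather than with these calls.

```java
import java.io.File;

// Illustration only; not how RCaller implements its redirection.
final class OutputRedirectSketch {

    // Configures where a child process (for example Rscript) sends its output.
    // logFile is only used when console is false.
    static ProcessBuilder configure(ProcessBuilder builder, boolean console, File logFile) {
        // Merge stderr into stdout so that both end up in the same place.
        builder.redirectErrorStream(true);
        if (console) {
            // Write to the console of the parent Java process.
            builder.redirectOutput(ProcessBuilder.Redirect.INHERIT);
        } else {
            // Append to the log file instead of overwriting it.
            builder.redirectOutput(ProcessBuilder.Redirect.appendTo(logFile));
        }
        return builder;
    }

    public static void main(String[] args) {
        // The process is configured but not started here.
        ProcessBuilder builder = configure(
                new ProcessBuilder("Rscript", "--version"), false, new File("output.txt"));
        System.out.println("stdout redirect: " + builder.redirectOutput().type());
    }
}
```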

6 Conclusion


In addition to being statistical software, R is an extensible library with its internal functions and external packages. Since the R interpreter was written mostly in C, linking to custom C/C++ programs is relatively simple. Unfortunately, calling R functions from Java is not straightforward. The prominent methods use JNI and TCP sockets to solve this problem. In addition, Renjin offers a different perspective on this issue. It is a re-implementation of R in Java which is intended to be 100% compatible with the original. However, it is under development and needs help. Finally, RCaller is an alternative way of calling R from Java. It is packaged in a single jar and it does not require setup beyond the one-time installation of the R package Runiversal. It supports loading external packages, calling functions, handling plots and debugging the output generated by R. It is not the most efficient method compared to the alternatives, but users report that the performance improvements in the latest revision and its simplicity of use make it an important tool in many applications.

References


[Commons Math Developers(2010)]   Commons Math Developers. Apache Commons Math, Release 2.1. Available from http://commons.apache.org/math/download_math.cgi, Apr. 2010. URL http://commons.apache.org/math.
[Harner et al.(2009)Harner, Luo, and Tan]   E. Harner, D. Luo, and J. Tan. JavaStat: A Java/R-based statistical computing environment. Computational Statistics, 24(2):295–302, May 2009.
[R Development Core Team(2011)]   R Development Core Team. R: A Language and Environment for Statistical Computing. R Foundation for Statistical Computing, Vienna, Austria, 2011. URL http://www.R-project.org/. ISBN 3-900051-07-0.
[RCaller Development Team(2011)]   RCaller Development Team. RCaller: A library for calling R from Java, 2011. URL http://code.google.com/p/rcaller.
[Satman(2010)]   M. H. Satman. Runiversal: A Package for converting R objects to Java variables and XML., 2010. URL http://CRAN.R-project.org/package=Runiversal. R package version 1.0.1.
[Urbanek(2009)]   S. Urbanek. How to talk to strangers: ways to leverage connectivity between R, Java and Objective C. Computational Statistics, 24(2):303–311, May 2009.
[Urbanek(2011a)]   S. Urbanek. rJava: Low-level R to Java interface, 2011a. URL http://CRAN.R-project.org/package=rJava. R package version 0.9-2.
[Urbanek(2011b)]   S. Urbanek. Rserve: Binary R server, 2011b. URL http://CRAN.R-project.org/package=Rserve. R package version 0.6-5.





Sunday, April 21, 2013

R Package: mcga

Machine coded genetic algorithm (MCGA) is a fast tool for real-valued optimization problems. It uses the byte representation of variables rather than their real values. It performs the classical (uniform) crossover operation on these byte representations. The mutation operator is also similar to the classical mutation operator, which is to say, it changes a randomly selected byte of a chromosome by +1 or -1 with probability 1/2. In MCGAs there is no need for an encoding-decoding process, and the classical operators are directly applicable to real values. It is fast and can handle a wide search space with high precision. Using a 256-unary alphabet is the main disadvantage of this algorithm, but a moderately sized population is sufficient for many problems. The package also includes the multi_mcga function for multi-objective optimization problems. This function sorts the chromosomes using their ranks calculated from the non-dominated sorting algorithm.
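The two operators described above can be sketched in a few lines of Java. This is an illustration of the idea, not the package's implementation: the class McgaSketch is hypothetical, and it assumes 8-byte doubles laid out in big-endian order, which may differ from the byte layout the mcga package works on.

```java
import java.nio.ByteBuffer;
import java.util.Random;

// Illustration of byte-level genetic operators; not the mcga package's code.
final class McgaSketch {

    // Views a real-valued chromosome as its raw bytes (8 bytes per gene).
    static byte[] toBytes(double[] genes) {
        ByteBuffer buffer = ByteBuffer.allocate(genes.length * Double.BYTES);
        for (double gene : genes) {
            buffer.putDouble(gene);
        }
        return buffer.array();
    }

    // Reads the raw bytes back as real values; no decoding step is needed.
    static double[] toDoubles(byte[] bytes) {
        ByteBuffer buffer = ByteBuffer.wrap(bytes);
        double[] genes = new double[bytes.length / Double.BYTES];
        for (int i = 0; i < genes.length; i++) {
            genes[i] = buffer.getDouble();
        }
        return genes;
    }

    // Uniform crossover: every byte of the child is taken from one of the
    // two parents with probability 1/2. Both parents must have equal length.
    static byte[] uniformCrossover(byte[] first, byte[] second, Random rng) {
        byte[] child = new byte[first.length];
        for (int i = 0; i < child.length; i++) {
            child[i] = rng.nextBoolean() ? first[i] : second[i];
        }
        return child;
    }

    // Mutation: one randomly selected byte is changed by +1 or -1,
    // each with probability 1/2 (byte arithmetic wraps around).
    static void mutate(byte[] chromosome, Random rng) {
        int position = rng.nextInt(chromosome.length);
        chromosome[position] += rng.nextBoolean() ? 1 : -1;
    }

    public static void main(String[] args) {
        Random rng = new Random();
        byte[] mother = toBytes(new double[]{1.5, -2.25});
        byte[] father = toBytes(new double[]{3.0, 0.125});
        byte[] child = uniformCrossover(mother, father, rng);
        mutate(child, rng);
        for (double gene : toDoubles(child)) {
            System.out.println(gene);
        }
    }
}
```

A change in a low-order byte moves a value only slightly, while a change in a high-order byte can move it across orders of magnitude, which is how the byte representation covers a wide search space.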

Package Url:

http://cran.r-project.org/web/packages/mcga/index.html

R Installation:

  install.packages ("mcga")

For help and examples, type

  ?mcga

in the R console.


Friday, August 17, 2012

A nice video on RCaller 2.0


A nice video on RCaller is on YouTube now. The original link is on Quantlabs.net. Thanks to the uploader.

Wednesday, May 30, 2012

An informative video on RCaller

Somebody on the Internet submitted an informative video to YouTube on how to use RCaller for calling R from within Java applications.

It is nice to see that RCaller is used more widely after its 2.0 version.

You can see the embedded video in this entry. Have a nice training!

Sunday, October 30, 2011

RCallerPhp is ready for testing

Hey web guys! RCaller now supports PHP, and we are planning to carry RCaller to other platforms and languages. The first step of our attack plan was to implement a PHP edition, and it is ready for testing now.

The second step is to implement RCaller for Perl and Python. We now have a Perl developer, and his work is in progress. Python is not our primary language, and we are waiting for your help. If you are familiar with R and develop in one of the languages below, join us. At first, we are planning to carry RCaller to

  • Python
  • .Net

And what does it look like? Let's give up Java and speak PHP for a minute:

<?php
include_once ("RCaller.php");

$rcaller = new RCaller();
$rcaller->setRscriptExecutable("/usr/bin/Rscript");
$rcode = new RCode("");
$rcode->clear();
$rcode->addRCode("mylist <- list(x=1:3, y=c(7,8,9))");

$rcaller->setRCode($rcode);
$rcaller->runAndReturnResult("mylist");

$x = $rcaller->getParser()->getAsStringArray("x");
$y = $rcaller->getParser()->getAsStringArray("y");

echo ("X is <br>");
print_r ($x);

echo ("<br><br>Y is <br>");
print_r ($y);
?>


Wow! Nothing changes! When you run this code, you will see the values of x as 1, 2 and 3 and the values of y as 7, 8 and 9. The code above seems 100% compatible with the original library.

If you have used RCaller (Java edition) before, you will probably
understand the whole code. If not, let's have a look at the page RCaller 2.0 - Calling R from Java.


Note that it is as inefficient as the original version, because RCaller creates an external Rscript process each time runAndReturnResult() is called. Be careful before using it in big and critical projects. Another note is about using it with too many users. RCaller uses the temp directory to store its R code and outputs. You need to clear this directory periodically. Otherwise you may get a "too many files" error.


Finally, the source code is ready for use and development. Please visit the RCaller source code and downloads page. The PHP code is stored as a separate project named RCallerPhp.

Test it and do not hesitate to ask us!

Wednesday, September 28, 2011

Passing plain Java objects to R using RCaller

Well, you are using RCaller for your statistical calculations. Probably, you are passing your double arrays to R and typing some R commands in order to get the desired outputs. After the calculation, you handle the returned arrays through the parser. This is the general use of RCaller.

Suppose that you have a Java class which has some variables of the types int, short, long, float, double and String. This class also includes some arrays of the types int[], double[], ..., String[]. Of course it may include some methods, constructors or anything else, but we don't care about those for now. How about passing this class with its publicly defined variables to R? Yeah! It is possible in the latest submitted revision.

Have a look at the Java class below:


class TestClass {

  public int i = 9;
  public float f = 10.0f;
  public double d = 3.14;
  public boolean b = true;
  public String s = "test";
}

This class simply includes five publicly defined variables with basic data types. Our other class inherits from TestClass and defines some additional arrays:


class TestClassWithArrays extends TestClass {

  public int[] ia = new int[]{1, 2, 3, 4, 5};
  public double[] da = new double[]{1.0, 2.0, 3.0, 4.0, 9.9, 10.1};
  public String[] sa = new String[]{"One", "Two", "Three"};
  public boolean[] ba = new boolean[]{true, true, false};
}

OK, they are very simple, but there is no reason for those classes not to have methods. Whether or not those classes have methods, we consider them as data structures.

Let's pass this data structure to R:


TestClassWithArrays tcwa = new TestClassWithArrays();
JavaObject jo = new JavaObject("tcwa", tcwa);

RCaller rcaller = new RCaller();
rcaller.setRscriptExecutable("/usr/bin/Rscript");
rcaller.cleanRCode();

rcaller.addRCode(jo.produceRCode());
rcaller.runAndReturnResult("tcwa");


Well, if there is no exception, we have the results in an R list named "tcwa". This R list includes all of the elements that are defined in TestClassWithArrays and TestClass, together with their values.
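How might the public fields of an object end up as an R list? One plausible approach is Java reflection, sketched below. The class ObjectToRSketch and its methods are hypothetical and only illustrate the idea; this is not the source of RCaller's JavaObject class, whose generated R code may look different.

```java
import java.lang.reflect.Array;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.Arrays;
import java.util.Comparator;
import java.util.StringJoiner;

// Illustration only; not the implementation of RCaller's JavaObject.
final class ObjectToRSketch {

    // Example data holder with public fields, similar to the classes above.
    static class Sample {
        public int i = 9;
        public boolean b = true;
        public String s = "test";
        public double[] da = {1.0, 2.5};
    }

    // Converts a single Java value (scalar or array) to R source text.
    static String toRValue(Object value) {
        if (value == null) {
            return "NULL";
        }
        if (value instanceof Boolean) {
            return ((Boolean) value) ? "TRUE" : "FALSE";
        }
        if (value instanceof CharSequence || value instanceof Character) {
            String escaped = value.toString().replace("\\", "\\\\").replace("\"", "\\\"");
            return "\"" + escaped + "\"";
        }
        if (value.getClass().isArray()) {
            StringJoiner joiner = new StringJoiner(", ", "c(", ")");
            for (int k = 0; k < Array.getLength(value); k++) {
                joiner.add(toRValue(Array.get(value, k)));
            }
            return joiner.toString();
        }
        // Assumption: everything else is a number whose text form R can parse.
        return String.valueOf(value);
    }

    // Builds "name <- list(field = value, ...)" from the public instance
    // fields of obj, including inherited ones, sorted by field name.
    static String toRList(String name, Object obj) {
        Field[] fields = obj.getClass().getFields();
        Arrays.sort(fields, Comparator.comparing(Field::getName));
        StringJoiner joiner = new StringJoiner(", ", name + " <- list(", ")");
        for (Field field : fields) {
            if (Modifier.isStatic(field.getModifiers())) {
                continue;
            }
            try {
                joiner.add(field.getName() + " = " + toRValue(field.get(obj)));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return joiner.toString();
    }

    public static void main(String[] args) {
        System.out.println(toRList("obj", new Sample()));
    }
}
```

The fields are sorted by name because reflection does not guarantee any particular field order.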

As a proof, the related @Test method is available for browsing on Google Code:

@Test
  public void TestClassWithArrays() throws IllegalAccessException, IOException {
    TestClassWithArrays tcwa = new TestClassWithArrays();
    JavaObject jo = new JavaObject("tcwa", tcwa);

    RCaller rcaller = new RCaller();
    rcaller.setRscriptExecutable("/usr/bin/Rscript");
    rcaller.cleanRCode();

    rcaller.addRCode(jo.produceRCode());
    rcaller.runAndReturnResult("tcwa");

    int[] expectedIntArray = rcaller.getParser().getAsIntArray("ia");
    for (int i = 0; i < tcwa.ia.length; i++) {
      assertEquals(expectedIntArray[i], tcwa.ia[i]);
    }


    double delta = 0.0000001; // tolerance for comparing doubles
    double[] expectedDoubleArray = rcaller.getParser().getAsDoubleArray("da");
    for (int i = 0; i < tcwa.da.length; i++) {
      assertEquals(expectedDoubleArray[i], tcwa.da[i], delta);
    }

    String[] expectedStringArray = rcaller.getParser().getAsStringArray("sa");
    for (int i = 0; i < tcwa.sa.length; i++) {
      assertEquals(expectedStringArray[i], tcwa.sa[i]);
    }

  }

The examples show that, on the R side, we can access the elements by the original names defined in the Java class. That sounds good.

Finally, we can pass our Java objects with their contents. This use of RCaller replaces the separate calls to addDoubleArray, addIntArray and the like with the simple command

 JavaObject jo = new JavaObject("tcwa", tcwa);
.
.
.
rcaller.addRCode ( jo.produceRCode() );
 

It is that simple...

Saturday, September 17, 2011

RCaller: Support for sequential commands with a single process

I think this revision will be the foundation of version 2.1. RCaller is considered slow, but it is the easiest way of calling R from Java.

Finally, I have implemented the method runAndReturnResultOnline() for running sequential commands in a single process. What does this mean? Let me give an example to explain:

Suppose that you want to perform a simulation study to measure the success of your new procedure. For this, you decide to draw random numbers from a distribution, calculate something, and handle the results in Java. RCaller creates a new Rscript process for each single iteration. This causes too many operating system calls.

The latest release of RCaller includes a method to avoid this. Let's have a look at the test file:


@Test
  public void onlineCalculationTest() {
    RCaller rcaller = new RCaller();
    rcaller.setRExecutable("/usr/bin/R");
    rcaller.cleanRCode();
    rcaller.addRCode("a<-1:10");
    rcaller.runAndReturnResultOnline("a");
    assertEquals(rcaller.getParser().getAsIntArray("a")[0], 1);

    rcaller.cleanRCode();
    rcaller.addRCode("b<-1:10");
    rcaller.addRCode("m<-mean(b)");
    rcaller.runAndReturnResultOnline("m");
    assertEquals(rcaller.getParser().getAsDoubleArray("m")[0], 5.5, 0.000001);

    rcaller.cleanRCode();
    rcaller.addRCode("a<-1:99");
    rcaller.addRCode("k<-median(a)");
    rcaller.runAndReturnResultOnline("k");
    assertEquals(rcaller.getParser().getAsDoubleArray("k")[0], 50.0, 0.000001);
  }
 

In the first stage, we create an integer vector and get its first element. In the second, we create the same integer vector under a different name and calculate its arithmetic mean. In the last one, we recreate the vector a and get the median, which is equal to 50.

This example uses the same RCaller object. In the first stage, the R process (the executable is /usr/bin/R on my Ubuntu Linux) is started once. In the second stage the same R process is used and no new process is created. In this stage, the vector a is still alive and accessible. In the last stage, b is still alive and a is recreated. So this example does not cause R to open and close three times, but only once.

This modification speeds up RCaller, but it can still be considered slow.
However, it is still easy to use and much faster than the previous implementation.

Have Fun!


Thursday, September 15, 2011

Handling R lists with RCaller 2.0

Since RCaller creates an Rscript process for each single run, it is said to be inefficient for most cases. But there are useful, non-hacky ways to improve on this. Suppose that your aim is to calculate the median of a double vector like this:

@Test
  public void singleResultTest() {
    double delta = 0.0000001;
    RCaller rcaller = new RCaller();
    rcaller.setRscriptExecutable("/usr/bin/Rscript");
    rcaller.cleanRCode();
    rcaller.addRCode("x <- c(6 ,8, 3.4, 1, 2)");
    rcaller.addRCode("med <- median(x)");

    rcaller.runAndReturnResult("med");

    double[] result = rcaller.getParser().getAsDoubleArray("med");

    assertEquals(result[0], 3.4, delta);
  }

However, this example only computes the median of x. Computing the medians of three variables this way needs three processes, which is very slow. Lists are "vector of vectors" objects, but they are different from matrices. A list object in R can hold several types of vectors along with their names. For example


alist <- list (
s = c("string1", "string2", "string3") , 
i = c(5L, 4L, 7L, 6L),
d = c(5.5, 6.7, 8.9)
)
 

The list object alist is formed by three different kinds of vectors: the string vector s, the integer vector i and the double vector d. Their names are s, i and d, respectively. Accessing the elements of this list is straightforward, and there are two ways to do it. The first is the conventional way, using indices. When the example below runs, strvec is set to the string vector s:

strvec <- alist[[1]]

The second way is to use the element name with the $ operator, as in alist$s.

Because a list object can hold R objects together with their names, we can handle more than one result in a single RCaller run. Back to our example, we wanted to calculate the medians of three double vectors in a single run.

@Test
  public void TestLists2()throws Exception {
    double delta = 0.0000001;
    RCaller rcaller = new RCaller();
    rcaller.setRscriptExecutable("/usr/bin/Rscript");
    rcaller.cleanRCode();
    rcaller.addRCode("x <- c(6 ,8, 3.4, 1, 2)");
    rcaller.addRCode("med1 <- median(x)");

    rcaller.addRCode("y <- c(16 ,18, 13.4, 11,12)");
    rcaller.addRCode("med2 <- median(y)");

    rcaller.addRCode("z <- c(116 ,118, 113.4,111,112)");
    rcaller.addRCode("med3 <- median(z)");

    rcaller.addRCode("results <- list(m1 = med1, m2 = med2, m3 = med3)");

    rcaller.runAndReturnResult("results");

    double[] result = rcaller.getParser().getAsDoubleArray("m1");
    assertEquals(result[0], 3.4, delta);

    result = rcaller.getParser().getAsDoubleArray("m2");
    assertEquals(result[0], 13.4, delta);

    result = rcaller.getParser().getAsDoubleArray("m3");
    assertEquals(result[0], 113.4, delta);
  }

This code passes the tests. As a result, we have the medians of three different vectors from a single run. In this way, a large number of vectors can be returned from R in one pass, so this method may be considered efficient. These test files were integrated into the source tree of the project at http://code.google.com/p/rcaller/

Hope it works!

Wednesday, September 7, 2011

Embedding R in Java Applications using Renjin

Embedding R in other languages has a long history among programmers. Rserve, rJava, RCaller and Renjin are prominent efforts in this direction. Their approaches are completely different. Rserve opens server sockets and listens for connections, whatever the client is. It uses its own protocol to communicate with clients, and it passes the commands sent by clients to R. This is the neatest idea for me.

rJava uses JNI (Java Native Interface) to make R and Java interoperate. This is the most common and intuitive way for me.

RCaller sends commands to the R interpreter by creating a process for each single call. Then it receives the results as XML and parses them. It is the easiest and the most inefficient way of calling R from Java. But it works.

And finally, Renjin is a re-implementation of R for the Java Virtual Machine. I think this will be the most rational way of calling R from Java because it is something like

Renjin,
is not for calling R from Java,
is for calling itself and maybe it can be said that: it is for calling java from java :),
for Java programmers who aimed to use R in their projects


That is why I joined this project. External function calls are always painful, whichever way you use.

Renjin is an R implementation in Java.

I think all these paragraphs tell the whole story.

How can we embed Renjin in our Java projects? Let's do something... But we have some requirements:

  1. renjin-core-0.1.2-SNAPSHOT.jar (Download from http://code.google.com/p/renjin/wiki/Downloads?tm=2)
  2. commons-vfs-1.0.jar (Part of apache commons)
  3. commons-logging-1.1.1.jar (Part of apache commons)
  4. guava-r07.jar (http://code.google.com/p/guava-libraries/downloads/list)
  5. commons-math-2.1.jar (Part of apache commons)

OK. These are the Renjin jar and the required jar files. Let's evaluate the R expression "x<-1:10", which creates a vector of integers from one to ten. The code is straightforward to follow.

package renjincall;

import java.io.StringReader;

import r.lang.Context;
import r.lang.EvalResult;
import r.lang.SEXP;
import r.parser.ParseOptions;
import r.parser.ParseState;
import r.parser.RLexer;
import r.parser.RParser;

public class RenjinCall {

  public RenjinCall() {
    // Create and initialize the top-level evaluation context.
    Context topLevelContext = Context.newTopLevelContext();
    try {
      topLevelContext.init();
    } catch (Exception e) {
      System.out.println("Cannot initialize: " + e.toString());
    }

    // Parse the R expression; nothing is evaluated at this point.
    StringReader reader = new StringReader("x<-1:10\n");
    ParseOptions options = ParseOptions.defaults();
    ParseState state = new ParseState();
    RLexer lexer = new RLexer(options, state, reader);
    RParser parser = new RParser(options, state, lexer);
    try {
      parser.parse();
    } catch (Exception e) {
      System.out.println("Cannot parse: " + e.toString());
    }

    // Print the String representation of the parsed expression.
    SEXP result = parser.getResult();
    System.out.println(result);
  }

  public static void main(String[] args) {
    new RenjinCall();
  }
}



We initialize the library, create the lexer and the parser, and handle the result as a SEXP. Finally, we print the SEXP object (not the object itself, but its String representation):

<-(x, :(1.0, 10.0))

This is the parsed version of our "x<-1:10". It contains the same information in a slightly different form. So far we have only parsed the expression; it has not been evaluated. Let's evaluate it:
EvalResult eva = result.evaluate(topLevelContext, topLevelContext.getEnvironment());
System.out.println(eva.getExpression().toString());


Now, the output is

c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)

and this is the well-known representation of R integer vectors. Of course, printing the result as a String is not the whole job. We would rather handle the elements of this vector than print it. Let's do some work on it:

IntVector vector = (IntVector) eva.getExpression();
for (int i = 0; i < vector.length(); i++) {
  System.out.println(
      i + ". element of this vector is: " + vector.getElementAsInt(i));
}

IntVector is defined in the Renjin core library and is for handling integer vectors. We simply used the .length() and .getElementAsInt() methods, much like using Java's ArrayList class. Finally, the result is

0. element of this vector is: 1
1. element of this vector is: 2
2. element of this vector is: 3
3. element of this vector is: 4
4. element of this vector is: 5
5. element of this vector is: 6
6. element of this vector is: 7
7. element of this vector is: 8
8. element of this vector is: 9
9. element of this vector is: 10

It is nice, huh?

Monday, September 5, 2011

Online R Interpreter - Under development

This is the online R interpreter, Renjin, the Java implementation of the popular statistical programme. Note that it is under development and it includes unimplemented functionality and bugs. But it may be nice to try it online, and you can report bugs or join the project. The link is http://renjindemo.appspot.com/

Friday, August 26, 2011

renjin - JVM-based Interpreter for R Language for Statistical Computing

Today, I joined the renjin project with my first patch. I believe that porting R from C to Java makes R available on kinds of computers other than PCs. At first glance, it may make R available on Android systems, for example (except for native libraries).

Thursday, August 25, 2011

Renjin - R interpreter written in Java

Today, I stumbled upon a web page on Google Project Hosting for a project that re-implements R in Java. It is good news for R and Java programmers because it opens a new way to call R functions from Java directly. The project is open source and distributed under the GNU GPL v3. Many functions have been implemented, and the R interpreter works well. I think there is much more work to do; in particular, speed is the main issue. Unfortunately, as far as I could see there are only three developers, and this is wonderful work. Finally, it would be good to be involved in this project. The project web page is http://code.google.com/p/renjin/ and there is a live demo of the interpreter at http://renjindemo.appspot.com/.

More contributors needed for the project RCaller

We need new contributors to enhance the functionality of RCaller. We also need feedback about
  • the types of projects that RCaller is used in
  • frequently used functions of R
  • new functionality required
  • bug reports
We also need a web page other than http://www.mhsatman.com/rcaller. A logo would be good.

We need developers, testers and documenters with skills in Java, R, LaTeX or HTML.

We can enlarge the space spanned by RCaller: derivatives such as PhpCaller or CCaller could be added for PHP and C, respectively. Note that there are already some libraries for calling R from other languages. RCaller is less efficient at run time but faster at development time.

Please join the project.
Google Code page: https://code.google.com/p/rcaller/

Friday, July 22, 2011

Random Number Generation with RCaller 2.0

Java has two standard facilities for generating random numbers. The java.lang.Math class has a random method which is used for generating uniformly distributed random numbers. The second is the java.util.Random class, which has several methods for generating random numbers. We can draw random numbers from several distributions using the probability integral transform. But R has many built-in functions for random number generation from several probability distributions, including the gamma, the binomial, the normal, etc.
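The probability integral transform can be sketched in plain Java. The example below is only an illustration and is not part of RCaller: the class name InverseTransformSketch and its randomExponential method are hypothetical names chosen for this sketch. It draws exponential variates by applying the inverse CDF, -ln(1-u)/rate, to uniform draws from java.util.Random.

```java
import java.util.Random;

class InverseTransformSketch {

    // Draws n values from an exponential distribution with the given rate
    // by applying the inverse CDF to uniform draws.
    // Hypothetical helper for illustration, not an RCaller method.
    static double[] randomExponential(Random rng, int n, double rate) {
        double[] out = new double[n];
        for (int i = 0; i < n; i++) {
            double u = rng.nextDouble();          // uniform on [0, 1)
            out[i] = -Math.log(1.0 - u) / rate;   // inverse of F(x) = 1 - exp(-rate * x)
        }
        return out;
    }

    public static void main(String[] args) {
        double[] sample = randomExponential(new Random(123), 100000, 2.0);
        double sum = 0.0;
        for (double v : sample) {
            sum += v;
        }
        // The sample mean should be close to 1 / rate = 0.5
        System.out.println("Sample mean: " + sum / sample.length);
    }
}
```

This works well when the inverse CDF has a closed form. For distributions such as the gamma or the beta it does not, which is where delegating to R becomes convenient.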


RCaller has a wrapper class, under the package statistics, for generating random numbers from those distributions. The class statistics.RandomNumberGenerator has these methods:


public double[] randomNormal(int n, double mean, double standardDeviation)
public double[] randomLogNormal(int n, double logmean, double logStandardDeviation) 
public double[] randomUniform(int n, double min, double max) 
public double[] randomBeta(int n, double shape1, double shape2)
public double[] randomCauchy(int n, double location, double scale) 
public double[] randomT(int n, int df) 
public double[] randomChisqyare(int n, int df)
public double[] randomF(int n, int df1, int df2)
public double[] randomPoisson(int n, double lambda) 
public double[] randomBinom(int n, int size, double p)
public double[] randomNegativeBinom(int n, int size, double p)
public double[] randomMultinomial(int n, int size, double p)
public double[] randomGeometric(int n, double p) 
public double[] randomWeibull(int n, double shape, double scale) throws 
public double[] randomHyperGeometric(int amount, int n, int m, int k) 
public double[] randomExponential(int n, double theta) throws Exception 
public double[] randomGamma(int n, double shape, double rate, double scale) 


The usage of this class can be seen in the Test5 class in the source of RCaller 2.0:
http://code.google.com/p/rcaller/source/browse/RCaller/src/test/Test5.java

Sunday, July 17, 2011

About the licence of RCaller

The licence of RCaller 2.0 has been changed to the LGPL. That means you can use it in commercial projects without distributing the source code.

For our users who like RCaller...

Tuesday, July 12, 2011

Debugging the R output of RCaller

RCaller 2.0 was submitted to Google Code two or three days ago. Many RCaller users are testing it and, except for a Windows installation, it does not seem to be problematic.

RCaller 2.0 receives the R output as XML files. If the user does not know the returned variable names, or if there is a problem with the results, some debugging is needed.

The function 'getXMLFileAsString()' is now implemented in RCaller; with it, the converted R output can be inspected.

Suppose that we want to run some R code from Java and have a look at the returned XML content. Have a look at this code:



package test;

import rcaller.RCaller;

/**
 *
 * @author Mehmet Hakan Satman
 */

public class Test4 {
   
    public static void main (String[] args){
        new Test4();
    }
   
    public Test4(){
        try{
            /*
             * Creating an instance of RCaller
             */
            RCaller caller = new RCaller();
           
            /*
             * Defining the Rscript executable
             */
            caller.setRscriptExecutable("/usr/bin/Rscript");
           
            /*
             * Some R Stuff
             */
            caller.addRCode("set.seed(123)");
            caller.addRCode("x<-rnorm(10)");
            caller.addRCode("y<-rnorm(10)");
            caller.addRCode("ols<-lm(y~x)");
           
            /*
             * We want to handle the object 'ols'
             */
            caller.runAndReturnResult("ols");
           
            /*
             * Getting R results as XML
             * for debugging issues.
             */
            System.out.println(caller.getParser().getXMLFileAsString());
        }catch(Exception e){
            System.out.println(e.toString());
        }
    }
}



Because of the "set.seed()" function of R, this code should produce the same results on all machines, and the XML content is printed to the console.

The structure of this XML file is simple: each variable is enclosed in its own pair of opening and closing tags.

Another way of handling the variable names is to use "ROutputParser.getNames()". This function simply returns an ArrayList which includes the variable names returned by R.
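To show the idea behind listing variable names without RCaller, the sketch below reads names from an XML string using the JDK's DOM parser. The layout used here (a root element containing variable elements with a name attribute) is an assumption made for this example and may differ from what RCaller actually writes. The class XmlNamesSketch and its method are hypothetical as well.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class XmlNamesSketch {

    // Collects the "name" attribute of every <variable> element.
    // The element and attribute names are assumptions for this sketch.
    static List<String> variableNames(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList nodes = doc.getElementsByTagName("variable");
            List<String> names = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                names.add(((Element) nodes.item(i)).getAttribute("name"));
            }
            return names;
        } catch (Exception e) {
            throw new IllegalArgumentException("Could not parse XML", e);
        }
    }

    public static void main(String[] args) {
        // Hand-written sample document, not real RCaller output
        String xml = "<root>"
                + "<variable name=\"coefficients\"><value>0.5</value><value>1.2</value></variable>"
                + "<variable name=\"residuals\"><value>-0.1</value></variable>"
                + "</root>";
        System.out.println(variableNames(xml));
    }
}
```

When debugging a real run, the string returned by getXMLFileAsString() is the place to check which names and layout are actually present.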

Friday, July 8, 2011

Rcaller 2.0 - Calling R from Java


NEWS:

2016-05-15: New Release - RCaller 3.0

2015-03-13: New Release - RCaller 2.5

2014-07-15: New Release - RCaller 2.4

2014-06-07: New Research Paper on RCaller

2014-05-15: New Release - RCaller 2.3

2014-04-15: New Release -  RCaller 2.2



I have received many e-mails since I first released the early versions of RCaller. Some users found it useful, so I planned to develop a newer and enhanced version of the library. Now, I think, it is ready for testing. Version 2.0.0 of RCaller can be downloaded from

http://code.google.com/p/rcaller/downloads/list

as both a compiled jar file and the source code with the directory structure of NetBeans 7.

The usage of RCaller has changed since version 1.0, but it is still easy to use: it needs no extra libraries, it is platform independent and it is compatible with recent R versions.

Some new features in version 2.0.0 are:
1) Support for plots
2) Easier code generation
3) Enhanced interaction with R

Before anything, install the R package "Runiversal" by typing

install.packages ( "Runiversal" )

in the R interactive interpreter. If the installation succeeds, you are ready to call R from Java.

Let me explain them with some examples. Suppose that we have a double array with the values {1,4,3,5,6,10} and we want to show a time series plot of it. First, we import the needed classes:

import java.io.File;
import rcaller.RCaller;

We are declaring the double array:

double[] numbers = new double[] {1,4,3,5,6,10};

Creating an object of the RCaller class:

RCaller caller = new RCaller();

RCaller needs the Rscript executable (Rscript.exe on Windows), which is shipped with R. You must give RCaller the full path of this file like this:

caller.setRscriptExecutable("/usr/bin/Rscript");

This is the location of the Rscript file on my Ubuntu Linux system. We haven't done much yet, but this call initializes everything:

caller.cleanRCode();

The cleanRCode() method of the RCaller class clears the code buffer and puts some initialization code in it. You can browse the source code if you want to know more about the initialization. Now, we can add our double array to our R code:

caller.addDoubleArray("x", numbers);
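Conceptually, this step turns the Java array into a line of R code that assigns a vector. The sketch below shows one way such a line could be built. It is an assumption for illustration: the class RVectorCodeSketch and the method toRAssignment are hypothetical, and RCaller's own addDoubleArray() may format the generated code differently.

```java
class RVectorCodeSketch {

    // Builds an R assignment such as x<-c(1.0,4.0,3.0) from a Java array.
    // Hypothetical helper; not taken from the RCaller source.
    static String toRAssignment(String name, double[] values) {
        StringBuilder code = new StringBuilder(name).append("<-c(");
        for (int i = 0; i < values.length; i++) {
            if (i > 0) {
                code.append(",");
            }
            code.append(values[i]);   // uses Double.toString formatting
        }
        return code.append(")").toString();
    }

    public static void main(String[] args) {
        double[] numbers = new double[]{1, 4, 3, 5, 6, 10};
        System.out.println(toRAssignment("x", numbers));
    }
}
```

For the array above this prints x<-c(1.0,4.0,3.0,5.0,6.0,10.0), which R reads as a numeric vector.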

Now we have 'x' with the values of numbers[]. Next, we create the time series plot:

File file = caller.startPlot();
caller.addRCode("plot.ts(x)");
caller.endPlot();

Finally, we send the whole code to the R interpreter:

caller.runOnly();

With this call, Rscript runs our code but does not return anything. Afterwards, if we haven't got any errors, we can handle the generated image using

ImageIcon ii=caller.getPlot(file);

or we can show it directly using

caller.showPlot(file);

The generated plot is shown below:



The source code of the entire example is given below:


package test;

import java.io.File;
import javax.swing.ImageIcon;
import rcaller.RCaller;


public class Test1 {

    public static void main(String[] args) {
        new Test1();
    }

    /*
     * Test for simple plots
     */
    public Test1() {
        try {
            RCaller caller = new RCaller();
            caller.setRscriptExecutable("/usr/bin/Rscript");
            caller.cleanRCode();

            double[] numbers = new double[]{1, 4, 3, 5, 6, 10};

            caller.addDoubleArray("x", numbers);
            File file = caller.startPlot();
            caller.addRCode("plot.ts(x)");
            caller.endPlot();
            caller.runOnly();
            ImageIcon ii = caller.getPlot(file);
            caller.showPlot(file);
        } catch (Exception e) {
            System.out.println(e.toString());
        }
    }
}



This example shows how to send vectors to R, call an R function from Java and handle the result in Java. We generate the x and y vectors in Java and send an "ordinary least squares" command to R. After the run, we handle the calculated coefficients, residuals and fitted values in Java. I hope the example is clear enough to understand.


package test;

import rcaller.RCaller;


public class Test2 {
   
    public static void main(String[] args){
        new Test2();
    }
   
    public Test2(){
        try{
            //Creating an instance of class RCaller
            RCaller caller = new RCaller();
           
            
            //Important. Where is the Rscript?
            //This is Rscript.exe in windows
            caller.setRscriptExecutable("/usr/bin/Rscript");
           
           
            //Generating x and y vectors
            double[] x = new double[]{1,2,3,4,5,6,7,8,9,10};
            double[] y = new double[]{2,4,6,8,10,12,14,16,18,30};
           
            //Generating R code
            //addDoubleArray() method converts Java arrays to R vectors
            caller.addDoubleArray("x", x);
            caller.addDoubleArray("y", y);
           
            //ols<-lm(y~x) is totally R Code
            //but we send the x and the y vectors from Java
            caller.addRCode("ols<-lm(y~x)");
           
            //We are running the R code but
            //we want code to send some result to us (java)
            //We want to handle the ols object generated in R side
            //
            caller.runAndReturnResult("ols");
           
           
            //We are printing the content of ols
            System.out.println("Available results from lm() object:");
            System.out.println(caller.getParser().getNames());
           
           
            //Parsing some objects of lm()
            //Residuals, coefficients and fitted.values are some elements of the lm()
            //object returned by R. We parse those elements to use them in Java
            double[] residuals = caller.getParser().getAsDoubleArray("residuals");
            double[] coefficients = caller.getParser().getAsDoubleArray("coefficients");
            double[] fitteds = caller.getParser().getAsDoubleArray("fitted_values");
           
            //Printing results
            System.out.println("Coefficients:");
            for (int i=0;i<coefficients.length;i++){
                System.out.println("Beta "+i+" = "+coefficients[i]);
            }
           
            System.out.println("Residuals:");
            for (int i=0;i<residuals.length;i++){
                System.out.println(i+" = "+residuals[i]);
            }

        }catch (Exception e){
            System.out.println(e.toString());
        }
    }
   
}


I hope it works for you.
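As a cross-check for the regression example, the coefficients that lm(y~x) should report can be computed by hand with the closed-form formulas for simple linear regression. The sketch below does this in plain Java for the same x and y values. The class OlsCheckSketch is a hypothetical helper and not part of RCaller.

```java
class OlsCheckSketch {

    // Returns {intercept, slope} of the least-squares line y = a + b*x.
    static double[] fit(double[] x, double[] y) {
        int n = x.length;
        double meanX = 0.0;
        double meanY = 0.0;
        for (int i = 0; i < n; i++) {
            meanX += x[i];
            meanY += y[i];
        }
        meanX /= n;
        meanY /= n;

        double sxy = 0.0;   // sum of (x - meanX) * (y - meanY)
        double sxx = 0.0;   // sum of (x - meanX)^2
        for (int i = 0; i < n; i++) {
            sxy += (x[i] - meanX) * (y[i] - meanY);
            sxx += (x[i] - meanX) * (x[i] - meanX);
        }
        double slope = sxy / sxx;
        double intercept = meanY - slope * meanX;
        return new double[]{intercept, slope};
    }

    public static void main(String[] args) {
        double[] x = new double[]{1, 2, 3, 4, 5, 6, 7, 8, 9, 10};
        double[] y = new double[]{2, 4, 6, 8, 10, 12, 14, 16, 18, 30};
        double[] beta = fit(x, y);
        System.out.println("Beta 0 = " + beta[0]);
        System.out.println("Beta 1 = " + beta[1]);
    }
}
```

For this data the intercept is approximately -2 and the slope is approximately 2.545 (that is, 28/11), so the values returned through RCaller should agree with these up to rounding.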

Saturday, May 14, 2011

Handling plots with rcaller

Using RCaller is a simple way of calling R scripts from Java. Unfortunately, image handling routines are not implemented, so users must handle this themselves. In R, a plot can be saved to a file in a simple way like this:











#Data generating process
x<-rnorm(100, 0, 2)
#we generated a normal sample with mean 0 and standard deviation 2

png("path/to/file.png")
plot.ts(x)
dev.off()

After running this short script, no screen output is produced, but path/to/file.png is created as a PNG image. After calling this script from Java, the produced image can be loaded like this:

ImageIcon myIcon = new ImageIcon("/path/to/file.png");
 
This image can easily be drawn on a Swing container using

public void paintComponent(Graphics g) {
    super.paintComponent(g);
    myIcon.paintIcon(this, g, 0, 0);
}
 

Thursday, November 11, 2010

Calling R from Java - RCaller

Edit: This version of RCaller is deprecated; please check the new version, 2.0. This entry is about older versions of RCaller and may be outdated. A blog entry for version 2.0 is at http://stdioe.blogspot.com/2011/07/rcaller-20-calling-r-from-java.html, and one for RCaller version 2.2 is at http://stdioe.blogspot.com.tr/2014/04/rcaller-220-is-released.html

 

New version : Rcaller 0.5.2

 

Note: The source page of this article is http://www.mhsatman.com/rcaller.php

[2010/08/07] RCaller now has a new version, 0.5.2, with some bug fixes and additional functionality. Some of the changes and bug fixes were made by John Oliver, who is now the second developer of RCaller.

Change Log for version 0.5.2:

  • Added a multi-threaded StreamReader class to RCaller, for stream reading both stderr & stdout to prevent read blocks.
  • StreamReader will optionally echo what it receives to the parent process stdout & stderr, so you can see what is going on
  • Changed RunRCode to use StreamReader
  • Changed RunRCode to wait for the sub-process to complete before returning
  • int[] RGetAsIntArray(String name) function was added so results from R functions can be handled as integer arrays
  • String[] RGetAsStringArray(String name) function was added in order to handle R results as String arrays
  • Removed extra cat(javaCode) call from makejava.r
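The idea behind reading stdout and stderr on separate threads can be sketched as follows. This is not the RCaller StreamReader class itself: StreamReaderSketch and its methods are hypothetical names, and the sketch only assumes what the change log says (a thread that reads a stream to the end and optionally echoes it).

```java
import java.io.BufferedReader;
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.InputStreamReader;
import java.nio.charset.StandardCharsets;

class StreamReaderSketch extends Thread {

    private final InputStream in;
    private final boolean echo;
    private final StringBuilder captured = new StringBuilder();

    StreamReaderSketch(InputStream in, boolean echo) {
        this.in = in;
        this.echo = echo;
    }

    @Override
    public void run() {
        // Read line by line until the stream ends, so the writer never blocks on a full pipe
        try (BufferedReader reader =
                     new BufferedReader(new InputStreamReader(in, StandardCharsets.UTF_8))) {
            String line;
            while ((line = reader.readLine()) != null) {
                synchronized (captured) {
                    captured.append(line).append('\n');
                }
                if (echo) {
                    System.out.println(line);   // optional echo to the parent's stdout
                }
            }
        } catch (IOException e) {
            // Reading simply stops if the stream fails
        }
    }

    String getCaptured() {
        synchronized (captured) {
            return captured.toString();
        }
    }

    // Starts a reader thread on the stream, waits for it to finish and returns the text.
    static String drain(InputStream in, boolean echo) {
        StreamReaderSketch reader = new StreamReaderSketch(in, echo);
        reader.start();
        try {
            reader.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return reader.getCaptured();
    }

    public static void main(String[] args) {
        InputStream fake = new ByteArrayInputStream(
                "line one\nline two\n".getBytes(StandardCharsets.UTF_8));
        System.out.print(drain(fake, false));
    }
}
```

With a real sub-process, one such thread would be started for process.getInputStream() and another for process.getErrorStream() before waiting for the process to complete.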

RCaller

RCaller is another simple way to call R from Java without JNI. There are lots of queries on the internet about "how to call r from java" or "call r function from java with / without JNI". There are some solutions for this; for example, Rserve is a server application written in C that waits for socket connections, accepts clients, runs the R code sent over the socket streams and returns SEXPs (S / R expressions). Also, rJava is a JNI solution for calling R from Java. But as I see it, users don't want to struggle with these things and look for more practical solutions.


RCaller uses neither sockets nor the JNI interface for calling R functions from Java. RCaller simply runs the Rscript executable using Java's Runtime and Process classes. It then runs R commands passed as arguments and handles the results using streams. RCaller converts R objects to Java double or String arrays using an R script and the BeanShell interpreter. After these operations, the R results can be read by the user through getter methods.
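The general mechanism of starting an external program and reading its output through streams can be sketched with the JDK alone. The example below uses ProcessBuilder rather than Runtime, which is a substitution made for this sketch. ProcessRunSketch and runAndCapture are hypothetical names, and the main method assumes that an Rscript executable can be found on the PATH.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

class ProcessRunSketch {

    // Runs an external command and returns everything it wrote to stdout and stderr.
    static String runAndCapture(List<String> command) {
        try {
            Process process = new ProcessBuilder(command)
                    .redirectErrorStream(true)   // merge stderr into stdout so one reader suffices
                    .start();
            ByteArrayOutputStream buffer = new ByteArrayOutputStream();
            try (InputStream in = process.getInputStream()) {
                byte[] chunk = new byte[4096];
                int read;
                while ((read = in.read(chunk)) != -1) {
                    buffer.write(chunk, 0, read);
                }
            }
            process.waitFor();   // wait for the sub-process to complete
            return new String(buffer.toByteArray(), StandardCharsets.UTF_8);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        try {
            // Assumes Rscript is on the PATH; pass a full path otherwise
            System.out.println(runAndCapture(Arrays.asList("Rscript", "-e", "cat(pi)")));
        } catch (UncheckedIOException e) {
            System.out.println("Could not start Rscript: " + e.getMessage());
        }
    }
}
```

Merging stderr into stdout keeps the sketch short. Reading the two streams separately requires a second reader thread.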


You can use it in Java applications that need some statistical calculations. Implementation and set-up are easy. You can download the source code as a NetBeans project, and the jars. Simply add the two jars to your classpath and start calling R!





Examples

1)Getting Pi from R!



In this example, we call the R code "a<-pi;", which assigns the value of pi to the variable a. Then, we handle this result from Java.


RCaller caller=new RCaller();
        StringBuffer code=new StringBuffer();
        code.append("a<-pi;cat(makejava(a))");
        try{
            caller.RunRCode(code.toString(),false,false);
            System.out.println(caller.RGet("a[0]"));
        }catch (Exception e){
            System.out.println(e);
        }


The result is 3.14159. RCaller always handles results as arrays, so a is not a scalar variable but a double array. The array has only one element, so a[0] is the value sent from R. We have to use cat(makejava(a)) to make the R object 'a' usable in Java.
We call the RunRCode() function with 3 parameters. The last 2 parameters are boolean. If the first of them is true, the content of stderr is written to the console. If the second is true, the content of stdout is written. We set both to false so that neither output is written to the screen.

2)Calculate Linear Regression from Java using R



In this example, we set x and y to random samples drawn from the standard normal distribution and estimate a linear regression using R and Java.


RCaller caller=new RCaller();
        StringBuffer code=new StringBuffer();
        code.append("x<-rnorm(10);");
        code.append("y<-rnorm(10);");
        code.append("ols<-lm(y~x);");
        code.append("cat(makejava(ols));");
        try{
            caller.RunRCode(code.toString(),false,false);
            double[] coefs=caller.RGetAsDoubleArray("coefficients");
            for (int i=0;i<coefs.length;i++) System.out.println(coefs[i]);
        }catch (Exception e){
            System.out.println(e);
        }


The result is
-0.815634476060036
0.637334790434423


so, these are the estimated coefficients of the ordinary least squares regression.

3)Running RCaller in different platforms (Linux, Windows, Mac, etc)



RCaller is pure Java and can run on any platform that the Java virtual machine runs on. You also need to have R installed. The default R engine is the Rscript executable that is distributed with R. The default location of the engine is /usr/bin/Rscript, but the user can change it using the setRScriptExecutableFile(String location) method.


RCaller caller=new RCaller();
        caller.setRScriptExecutableFile("C:\\Program Files\\...\\R\\..\\Rscript.exe");
 //caller.setRScriptExecutableFile("/usr/bin/Rscript");
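Instead of hard-coding the location, the executable can also be looked up on the PATH. The sketch below is one possible approach and is not part of RCaller: RscriptLocatorSketch and findOnPath are hypothetical names. It checks each PATH directory for a file called Rscript (or Rscript.exe on Windows).

```java
import java.io.File;

class RscriptLocatorSketch {

    // Returns the first file called fileName found in the directories of pathValue,
    // or null if there is none.
    static File findOnPath(String fileName, String pathValue) {
        if (pathValue == null) {
            return null;
        }
        for (String dir : pathValue.split(File.pathSeparator)) {
            if (dir.isEmpty()) {
                continue;
            }
            File candidate = new File(dir, fileName);
            if (candidate.isFile()) {
                return candidate;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        boolean windows = System.getProperty("os.name").toLowerCase().startsWith("windows");
        String fileName = windows ? "Rscript.exe" : "Rscript";
        File found = findOnPath(fileName, System.getenv("PATH"));
        System.out.println(found == null
                ? "Rscript not found on PATH"
                : found.getAbsolutePath());
    }
}
```

The returned path could then be passed to setRScriptExecutableFile(), with a fixed fallback such as /usr/bin/Rscript when nothing is found.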

4)What objects are returned after running my R command?



RCaller converts R objects to Java objects. You can list the names of the returned values like this:


RCaller caller=new RCaller();
        StringBuffer code=new StringBuffer();
        code.append("x<-rnorm(10);");
        code.append("y<-rnorm(10);");
        code.append("ols<-lm(y~x);");
        code.append("s<-summary(ols);");
        code.append("cat(makejava(s));");
        try{
            caller.RunRCode(code.toString(),false,false);
            ArrayList fields=caller.getFieldList();
            for (int i=0;i<fields.size();i++) System.out.println(fields.get(i));
        }catch (Exception e){
            System.out.println(e);
        }


The result is:
double[] residuals
double[] coefficients
double[] sigma
double[] df
double[] rsquared
double[] adjrsquared
double[] fstatistic
double[] covunscaled
double[] residuals
double[] coefficients
double[] sigma
double[] df
double[] rsquared
double[] adjrsquared
double[] fstatistic
double[] covunscaled


and these are all returned fields from the summary() R command.

Download source code and jars






Version 0.5.2
Netbeans project and source code: Download
Jars (RCaller.jar and bsh-core-2.0b4.jar): Download

Version 0.5.1
Netbeans project and source code: Download
Jars (RCaller.jar and bsh-core-2.0b4.jar): Download

If you like this solution or have any questions, you can send an e-mail to mhsatman [at] yahoo.com.
Mehmet Hakan Satman, Istanbul University, Faculty of Economics, Department of Econometrics