Tuesday, May 24, 2016

RCaller 3.0 is released!

RCaller 3.0 is released with new features.

Please visit the page

http://mhsatman.com/rcaller-3-0

for the source code, compiled binaries, other downloads and the blog post.

Hope you enjoy the project!


Sunday, September 6, 2015

Deleting a page of a pdf file in Ubuntu

Suppose you have a pdf file with many pages and you want to delete a single page (or a list of pages) from this file in Ubuntu. You can first install the pdftk package using Ubuntu's package manager or by simply typing

sudo apt-get install pdftk

in the command line.

Now suppose you want to exclude the 4th page from the pdf. You can type

pdftk old.pdf cat 1-3 5-end output new.pdf

where old.pdf is the original pdf and new.pdf is the new one, with the 4th page excluded.
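
For a list of pages, the ranges passed to cat can be built programmatically. The following Python sketch is only an illustration: the helper names keep_ranges and pdftk_command are made up for this post, and it assumes the excluded pages are 1-based and that the last page of the file is not among them (otherwise the final "N-end" range would not exist).

```python
def keep_ranges(excluded):
    """Return pdftk page ranges that keep every page not in `excluded`."""
    ranges = []
    start = 1  # first page of the current run of kept pages
    for page in sorted(set(excluded)):
        if page > start:
            end = page - 1
            ranges.append(str(start) if start == end else f"{start}-{end}")
        start = page + 1
    # Everything after the last excluded page is kept.
    ranges.append(f"{start}-end")
    return ranges


def pdftk_command(old, new, excluded):
    """Build the argument list for the pdftk call (hypothetical helper)."""
    return ["pdftk", old, "cat", *keep_ranges(excluded), "output", new]


print(" ".join(pdftk_command("old.pdf", "new.pdf", [4])))
# prints: pdftk old.pdf cat 1-3 5-end output new.pdf
```

The resulting list could be handed to subprocess.run, or simply printed and pasted into the terminal.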

Hope this helps.

Wednesday, September 2, 2015

Parallel Numeric Integration using Python Multiprocessing Library

Parallel algorithms divide a problem across several cores. Some problems, such as matrix calculations and numerical integration, are easy to divide into smaller ones because the parts are independent of each other. For example, integrating a function over a < x < b and over b < x < c and summing the two results equals integrating the same function over a < x < c.
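
As a small illustration of this additivity, the sketch below integrates x^2 over [0, 1] and over [1, 2] separately and compares the sum with the integral over [0, 2]. The helper name riemann, the midpoint rule and the step counts are assumptions made for this example only.

```python
def riemann(func, a, b, steps):
    """Midpoint Riemann sum of func over [a, b] with `steps` slices."""
    width = (b - a) / steps
    return sum(func(a + (k + 0.5) * width) for k in range(steps)) * width


def square(x):
    return x * x


left = riemann(square, 0.0, 1.0, 1000)
right = riemann(square, 1.0, 2.0, 1000)
whole = riemann(square, 0.0, 2.0, 2000)

# The two halves add up to the whole (up to floating-point rounding).
print(left + right, whole)
```

Since the two halves never need each other's values, they can be computed by different processes.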

In Python, the Thread class can be extended by a user-defined class for multi-threaded applications. However, because of the global interpreter lock (GIL), a multi-threaded solution to a CPU-bound problem can take longer even if the CPU has more than one core. The solution is to use the multiprocessing library.

Suppose that we have the standard normal distribution function coded in Python:


 import math

 def normal(x):
   return ((1/math.sqrt(2*math.pi)) * math.exp(-0.5 * math.pow(x,2)))
   
   


Now we define an integrator function that takes the function, the bounds of the domain and a Queue object for storing the result as parameters:


 def Integrator(func, a, b, lstresults):  
   epsilon = 0.00001  
   mysum = 0.0  
   i = float(a)  
   while(i < b):  
     mysum += func(i) * epsilon  
     i += epsilon  
   lstresults.put(mysum)  
   
   


The function Integrator simply integrates the given function from a to b. It uses a single core even if the computer has more. However, this operation can be divided into several smaller parts. Now we define a MultipleIntegrator class to distribute the operation into n parts, where n is a user-defined integer, typically equal to the number of cores.


 class MultipleIntegrator:
   def __init__(self):
     "Init"

   def do(self, func, a, b, cores):
     self.cores = cores
     integrationrange = (float(b) - float(a)) /float(cores)
     self.integrators = [None]*cores
     mya = float(a)
     allresults = Queue()
     result = 0.0
     for i in range(0,cores):
       self.integrators[i] = Process(target=Integrator, args=(func, mya, mya + integrationrange,allresults,))
       self.integrators[i].start()
       mya += integrationrange

     # Collect exactly one result per process before joining;
     # Queue.empty() is not reliable across processes.
     for i in range(0,cores):
       result += allresults.get()

     for myIntegrator in self.integrators:
       myIntegrator.join()

     return(result)
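
The way the interval is cut into equal parts can be looked at in isolation. The sketch below uses a hypothetical helper, split_interval, which is not part of the class above. It computes each bound from its index rather than by repeated addition, which is a small design choice to keep rounding errors from piling up.

```python
def split_interval(a, b, parts):
    """Split [a, b] into `parts` equal sub-intervals as (lower, upper) pairs."""
    a, b = float(a), float(b)
    width = (b - a) / parts
    # Each bound is computed from its index, so rounding errors do not
    # accumulate the way they can with repeated `mya += integrationrange`.
    bounds = [a + k * width for k in range(parts)] + [b]
    return list(zip(bounds[:-1], bounds[1:]))


print(split_interval(-10, 10, 4))
# prints: [(-10.0, -5.0), (-5.0, 0.0), (0.0, 5.0), (5.0, 10.0)]
```

Each pair would then be handed to one Process as the a and b arguments of Integrator.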
   


When we instantiate this class and call the do method with a user-defined function, the integration of func is shared among several cores.
Let's integrate the normal function from -10 to 10 using 2 processes:


 m = MultipleIntegrator()  
 print(m.do(normal, -10, 10,2))  
   


The result and the computation time are:

1.0000039893377686

real    0m1.762s
user    0m3.384s
sys     0m0.024s



 If we increase the number of processes to 4:


 m = MultipleIntegrator()  
 print(m.do(normal, -10, 10,4))  
   


The computation time is reported as:



1.0000039894435513

real    0m1.364s
user    0m5.056s
sys     0m0.028s




 The elapsed (real) time drops from 1.762s to 1.364s thanks to multiprocessing. The whole example is given below:


 from multiprocessing import Process,Queue  
 import math  
   
   
 def normal(x):  
     return ((1/math.sqrt(2*math.pi)) * math.exp(-0.5 * math.pow(x,2)))  
   
   
 def Integrator(func, a, b, lstresults):  
     epsilon = 0.00001  
     mysum = 0.0  
     i = float(a)  
     while(i < b):  
         mysum += func(i) * epsilon  
         i += epsilon  
     lstresults.put(mysum)  
   
   
 class MultipleIntegrator:  
     def __init__(self):  
         "Init"      
   
     def do(self, func, a, b, cores):  
         self.cores = cores  
         integrationrange = (float(b) - float(a)) /float(cores)  
         self.integrators = [None]*cores  
         mya = float(a)  
         allresults = Queue()  
         result = 0.0  
         for i in range(0,cores):  
             self.integrators[i] = Process(target=Integrator, args=(func, mya, mya + integrationrange,allresults,))  
             self.integrators[i].start()  
             mya += integrationrange  
   
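         # Collect exactly one result per process before joining;
         # Queue.empty() is not reliable across processes.
         for i in range(0,cores):
             result += allresults.get()

         for myIntegrator in self.integrators:
             myIntegrator.join()

         return(result)


 if __name__ == "__main__":
     # The guard keeps child processes from re-running this block
     # on platforms that start a fresh interpreter for each process.
     m = MultipleIntegrator()
     print(m.do(normal, -10, 10,4))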
   
   



Note that the standard normal function is a probability density function, so its exact integral can not exceed 1. The results above are slightly larger than 1 because of the approximation and floating-point rounding errors. Several corrections can be added to Integrator to reduce these errors.
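
One possible correction, sketched below, is to fix the number of steps in advance and compute each sample point from an integer index instead of adding epsilon repeatedly. In the original loop, i accumulates rounding error, so a sub-interval may receive one extra sample near its upper bound. The excess of about 3.99e-6 in the results above is at least consistent with one extra sample near x = 0, since epsilon * normal(0) is about 3.99e-6. The name integrate_steps and the use of the midpoint rule are choices made for this sketch, not part of the original code.

```python
import math


def normal(x):
    return (1 / math.sqrt(2 * math.pi)) * math.exp(-0.5 * x * x)


def integrate_steps(func, a, b, steps):
    """Midpoint rule with a fixed, integer number of steps."""
    width = (float(b) - float(a)) / steps
    total = 0.0
    for k in range(steps):
        # The sample point comes from the index k, so it never drifts.
        total += func(a + (k + 0.5) * width)
    return total * width


print(integrate_steps(normal, -10, 10, 200000))
```

A function like this could replace the body of Integrator, with the result put on the queue as before.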

Have fun!

Sunday, August 30, 2015

Overloading Constructor in Python Classes

In Python, the constructor __init__ can not be overloaded. If you are a Java or C++ programmer, you have probably used this facility before, because even the standard libraries of these languages use function overloading.

Although the language does not support constructor overloading, we can follow the factory design pattern to get multiple constructors in Python. In the factory design pattern, we define class-level methods that create new instances with the given parameters. In Python, if a function is decorated with @classmethod, it belongs to the class rather than to an object and receives the class itself as its first argument, cls. This is similar to static methods in Java.
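
To illustrate what cls gives us, here is a minimal sketch with two hypothetical classes, Shape and Circle, which are separate from the Clazz example in this post. Because a class method receives the class it was called on, a factory that calls cls() also produces the right type when it is invoked on a subclass.

```python
class Shape:
    def __init__(self):
        self.name = "shape"

    @classmethod
    def create(cls, name):
        # cls is whichever class the method was called on
        obj = cls()
        obj.name = name
        return obj


class Circle(Shape):
    pass


s = Shape.create("generic")
c = Circle.create("round")
print(type(s).__name__, type(c).__name__)  # prints: Shape Circle
```

A Java static method has no such implicit argument, so the comparison with Java holds only loosely.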

Now suppose we want to write a Python class with three constructors. In the first constructor we want to set param1 to a specific value and the others to zero. In the second constructor we want to set param2 to a specific value and the others to zero, and so on.

 class Clazz:  
   def __init__(self):  
     "please use create1, create2 or create3"  
   
   @classmethod  
   def create1(cls, param1):  
     c = cls()
     c.param1 = param1  
     c.param2 = 0  
     c.param3 = 0  
     return c  
   
   @classmethod  
   def create2(cls, param2):  
     c = cls()
     c.param1 = 0  
     c.param2 = param2  
     c.param3 = 0  
     return c  
   
   @classmethod  
   def create3(cls,param3):  
     c = cls()
     c.param1 = 0  
     c.param2 = 0  
     c.param3 = param3  
     return c  
   
   def dump(self):  
     print("Param1 = %d, Param2 = %d, Param3 = %d" %  
       (self.param1,self.param2,self.param3))  



Of course, we don't need to set the other parameters to zero in every factory method. Let's make the code more compact by giving the parameters class-level defaults:


 class Clazz:  
   
   param1 = 0  
   param2 = 0  
   param3 = 0  
   
   def __init__(self):  
     "please use create1, create2 or create3"  
   
   @classmethod  
   def create1(cls, param1):  
     c = cls()
     c.param1 = param1  
     return c  
   
   @classmethod  
   def create2(cls, param2):  
     c = cls()
     c.param2 = param2  
     return c  
   
   @classmethod  
   def create3(cls,param3):  
     c = cls()
     c.param3 = param3  
     return c  
   
   def dump(self):  
     print("Param1 = %d, Param2 = %d, Param3 = %d" %  
       (self.param1,self.param2,self.param3))  
   
   



And now, we can instantiate some objects from this class using different factory methods:


 myc1 = Clazz.create1(5)  
 myc1.dump()  
   
 myc2 = Clazz.create2(10)  
 myc2.dump()  
   
 myc3 = Clazz.create3(50)  
 myc3.dump()  
   
 myc4 = Clazz()  
 myc4.dump()  
   


myc1 is created by the first factory method. As expected, only the value of the first parameter is changed. The following factory methods set only the other parameters. The output is

 Param1 = 5, Param2 = 0, Param3 = 0  
 Param1 = 0, Param2 = 10, Param3 = 0  
 Param1 = 0, Param2 = 0, Param3 = 50  
 Param1 = 0, Param2 = 0, Param3 = 0  


As seen in the last line of the output, the object myc4 is created using the __init__ constructor and all its parameters have the value zero.
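
A common alternative in Python, shown here only as a sketch, is to give __init__ keyword arguments with default values, so that a single constructor covers all three cases. The class name Clazz2 is hypothetical and is used only to keep it apart from the class above. Its dump method returns the text instead of printing it, which is a small change made for this example.

```python
class Clazz2:
    def __init__(self, param1=0, param2=0, param3=0):
        # Any parameter that is not passed keeps its default of zero
        self.param1 = param1
        self.param2 = param2
        self.param3 = param3

    def dump(self):
        return "Param1 = %d, Param2 = %d, Param3 = %d" % (
            self.param1, self.param2, self.param3)


print(Clazz2(param2=10).dump())  # prints: Param1 = 0, Param2 = 10, Param3 = 0
```

Factory methods remain the better fit when the different constructors need different logic rather than just different values.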

Have fun!

Tuesday, August 25, 2015

Lisp for the C++ programmer: Numerical Integration

In the Lisp for the C++ programmer series, we present some Common Lisp examples in a way similar to examples written in C++ or in other languages that belong to the same family tree as C++.

Here is an example of a Riemann sum in Common Lisp. We first define a function f which takes a single parameter x and returns it, that is, y = f(x) = x, for simplicity. We will change this function later. It is known that the integral of this function from 0 to 1 is 0.50.

Common Lisp code for numerical integration (with an approximate result) is:



 (defun f (x)
     x)
 (defun integrate (f start stop)
     ;; let keeps epsilon and sum local instead of setting undeclared globals
     (let ((epsilon 0.0001)
           (sum 0.0))
         (loop for i from start to stop by epsilon do
             (setq sum (+ sum (* (funcall f i) epsilon))))
         sum))
 (print (integrate 'f 0 1))

The result is 0.4999532. Now we can use a more complex function, for example a normal distribution function with zero mean and unit variance. This function can be defined in Common Lisp as


 (defun normal (x)
     ;; Standard normal density with zero mean and unit variance
     (* (/ 1 (sqrt (* 2 3.141592))) (exp (* -0.5 (* x x)))))
 (defun integrate (f start stop)
     ;; let keeps epsilon and sum local instead of setting undeclared globals
     (let ((epsilon 0.0001)
           (sum 0.0))
         (loop for i from start to stop by epsilon do
             (setq sum (+ sum (* (funcall f i) epsilon))))
         sum))
 (print (integrate 'normal -1 1))



In the code above, we integrate the standard normal density from -1 to 1, and the result is 0.6826645.
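
As a cross-check, written in Python rather than Common Lisp, the exact probability can be computed from the error function, since P(-1 < Z < 1) = erf(1/sqrt(2)) for a standard normal variable Z. This is only a sanity check of the number above, not part of the Lisp example.

```python
import math

# P(-1 < Z < 1) for a standard normal variable Z
exact = math.erf(1 / math.sqrt(2))
print(round(exact, 7))  # prints: 0.6826895
```

The Riemann sum agrees with this value to about four decimal places, which is reasonable for a step size of 0.0001 in single-precision arithmetic.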