Monday, November 28, 2011

String ambiguity in Java



Java has two kinds of data types. The first kind is the primitive types: int, long, double,
float, short, byte, boolean and char. Declaring a variable of one of these types works much as it does
in C or C++. Like its sibling C, Java simply reserves a suitable memory area for the given
type and binds the variable name to it when we type

int i;

in a program. This is simple and clear. The second kind is objects, and the mechanism for instantiating a class to create a new object
is similar to C++. For example, we create an object by instantiating a CCObject class using

CCObject *obj = new CCObject();


in C++, whereas it is

CCObject obj = new CCObject();


in Java. In the C++ example, memory is allocated for the CCObject and its address is stored in the pointer obj, declared as *obj.
We call its method "meth" with code such as

obj->meth();

whereas it is

obj.meth();

in Java. Both objects are allocated dynamically: they do not exist at compile time and are created on the heap
at run time. So when C++ uses the "->" operator, the two object instantiation mechanisms are similar. However, there is
another way of creating objects in C++, which looks like

CCObject obj;
obj.meth();


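and it is quite different from the examples above. Here the object obj has automatic storage: its size and layout are fixed at compile time
and it lives on the stack with no heap allocation, so creating it is typically faster. The dot operator also differs from the same operator
in Java: in C++ it accesses the object directly, while in Java it always goes through a reference. Understanding the difference between
dynamic instantiation and storage determined at compile time is important.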
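
To make the Java side of this comparison concrete, here is a minimal sketch. The CCObject class body, its counter field and the ReferenceDemo class are assumptions made up for illustration; only the names CCObject and meth come from the text above. It shows that a Java object variable holds a reference (so the dot behaves like "->" in C++), while a primitive variable holds the value itself.

```java
// Hypothetical class used only for illustration; it borrows the CCObject name from the text.
class CCObject {
    int counter = 0;

    void meth() {
        counter++;
    }
}

class ReferenceDemo {
    // Calls meth() through a second variable and reads the result through the first.
    static int callThroughAlias() {
        CCObject obj = new CCObject(); // allocated at run time; obj holds a reference
        CCObject alias = obj;          // copies the reference, not the object
        alias.meth();                  // the dot follows the reference, like -> in C++
        return obj.counter;            // the change is visible through obj
    }

    // Modifies a copy of a primitive and returns the original.
    static int copyPrimitive() {
        int i = 1;
        int j = i; // copies the value itself
        j++;
        return i;  // i is unaffected
    }

    public static void main(String[] args) {
        System.out.println(callThroughAlias()); // prints 1
        System.out.println(copyPrimitive());    // prints 1
    }
}
```

Java has no counterpart to the C++ form "CCObject obj;" that places the whole object in the variable; every Java object is reached through a reference.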

Let's have a look at the String class in Java, which is defined in the java.lang package of the Java core library. We can
create two Strings with the code

String s1 = "This is string 1";
String s2 = "And this is the second one!";

and we can do

String cat = s1 + s2;

which would require operator overloading in C++. Java has no operator overloading, that is, you cannot
define the behavior of an operator for a class of your own. But the String class does exactly that!
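
As a rough illustration of what the built-in "+" does for Strings, the sketch below compares it with the same concatenation written with ordinary method calls. The class and method names (ConcatDemo, withPlus, withBuilder) are made up for this example, and the StringBuilder version is only an equivalent written by hand; the bytecode the compiler actually emits depends on the Java version.

```java
class ConcatDemo {
    // Concatenation with the built-in + operator.
    static String withPlus(String a, String b) {
        return a + b;
    }

    // The same result written with ordinary method calls on StringBuilder.
    static String withBuilder(String a, String b) {
        return new StringBuilder().append(a).append(b).toString();
    }

    public static void main(String[] args) {
        String s1 = "This is string 1";
        String s2 = "And this is the second one!";
        System.out.println(withPlus(s1, s2));
        System.out.println(withPlus(s1, s2).equals(withBuilder(s1, s2))); // prints true
    }
}
```

A user-defined class can only offer the second style; the first is reserved for String (and for the numeric primitives).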

The String class is an exception and has different properties from other classes. The Java compiler behaves differently
when it compiles code that involves a String object. Another ambiguity is the use of literals with class methods. Look at this:

String s = "Hello, this is a curse Java string";

Used this way, Java Strings look like built-in data types rather than objects! Of course, the literal is itself a String object (taken from the JVM's string pool), so the line behaves roughly like

String s = new String("Hello, this is a curse Java string");

but what about this:

int l = "Hello, this is a curse Java string".length();

To a Java student reading this line, the Java String looks like an object again! This is a consequence of the odd design of Java Strings.
Java compiles them consistently, but the fact that Strings are an exception spoils the clean pattern of the Java language.
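
The double nature of String literals can be checked with a small sketch. LiteralDemo and its method names are made up for this example. Note that length() is a method on String, so it needs parentheses, and that a literal and new String(...) are not quite the same thing: equal literals share one pooled object, while new String always creates a separate one.

```java
class LiteralDemo {
    // A literal is a full String object, so methods can be called on it directly.
    static int literalLength() {
        return "Hello".length();
    }

    // Two equal literals refer to the same pooled object.
    static boolean sameLiteralObject() {
        String a = "Hello";
        String b = "Hello";
        return a == b;
    }

    // new String(...) creates a distinct object with the same content.
    static boolean newStringIsSameObject() {
        String a = "Hello";
        String c = new String("Hello");
        return a == c;
    }

    // Content comparison ignores object identity.
    static boolean newStringHasSameContent() {
        return "Hello".equals(new String("Hello"));
    }

    public static void main(String[] args) {
        System.out.println(literalLength());           // prints 5
        System.out.println(sameLiteralObject());       // prints true
        System.out.println(newStringIsSameObject());   // prints false
        System.out.println(newStringHasSameContent()); // prints true
    }
}
```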

What is the solution?

Firstly, in my personal opinion, operator overloading would be a good feature for the Java language. It would then be more elegant
to drop this

Matrix A = new Matrix (data1);
Matrix B = new Matrix (data2);
Matrix C = A.transpose().product(A).inverse().product(A.transpose().product(B.transpose()));

and replace it with

Matrix A = new Matrix (data1);
Matrix B = new Matrix (data2);
Matrix C = (A.transpose() * A).inverse() * A.transpose() * B.transpose();

This would make operators on other classes consistent with the way the + operator already works on Strings.
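
For readers who want to try the method-call version in today's Java, here is a minimal sketch. The Matrix class below is hypothetical: it handles only 2x2 matrices, and everything except the method names transpose, product and inverse (taken from the snippet above) is an assumption made for this example, including the MatrixDemo class, the solve and get helpers and the sample data.

```java
// Hypothetical 2x2 matrix class; only the method names follow the snippet in the text.
final class Matrix {
    private final double[][] m;

    Matrix(double[][] data) {
        // Defensive copy of the four entries.
        m = new double[][] {
            { data[0][0], data[0][1] },
            { data[1][0], data[1][1] }
        };
    }

    double get(int row, int col) {
        return m[row][col];
    }

    Matrix transpose() {
        return new Matrix(new double[][] {
            { m[0][0], m[1][0] },
            { m[0][1], m[1][1] }
        });
    }

    Matrix product(Matrix other) {
        double[][] r = new double[2][2];
        for (int i = 0; i < 2; i++) {
            for (int j = 0; j < 2; j++) {
                r[i][j] = m[i][0] * other.m[0][j] + m[i][1] * other.m[1][j];
            }
        }
        return new Matrix(r);
    }

    Matrix inverse() {
        double det = m[0][0] * m[1][1] - m[0][1] * m[1][0];
        if (det == 0.0) {
            throw new ArithmeticException("matrix is singular");
        }
        return new Matrix(new double[][] {
            {  m[1][1] / det, -m[0][1] / det },
            { -m[1][0] / det,  m[0][0] / det }
        });
    }
}

class MatrixDemo {
    // The chained expression from the text, written with method calls only.
    static Matrix solve(Matrix a, Matrix b) {
        return a.transpose().product(a).inverse()
                .product(a.transpose().product(b.transpose()));
    }

    public static void main(String[] args) {
        Matrix a = new Matrix(new double[][] { { 2, 0 }, { 0, 4 } });
        Matrix b = new Matrix(new double[][] { { 2, 4 }, { 6, 8 } });
        Matrix c = solve(a, b);
        System.out.println(c.get(0, 0) + " " + c.get(0, 1));
        System.out.println(c.get(1, 0) + " " + c.get(1, 1));
    }
}
```

With operator overloading, the body of solve could be written with "*" as in the second snippet above; without it, the nesting of product calls has to carry the structure.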

Removing the ambiguity of "a string content".length() and String s = "This is a string" is hard because millions of lines of
Java code use this syntax. One solution is to deprecate this usage and drop it in a later revision. Alternatively, it could stay as it is. But
remember the Basic language and how difficult it was to parse. Writing code easily is not the whole art. A consistent
language is like a deterministic toolbox.

My radical idea is to use C++ syntax on the Java virtual machine. Something is either an object or a pointer to an object. Operators
could be overloaded, just as "Java does it to its Strings!". So how nice would it be if we could compile this code for the JVM:

String *s = new String("Hello there, it would be a Java String!");
int length = s->length;
System::out::println(*s->chars);

and of course

String s = String("Hello there... it is also a Java String");
int length = s.length;
System::out::println(&s.chars);

:)

Moreover, there must be some strange people around who would want their C++-syntaxed Java code compiled for the JVM!
