Skip to main content

Antonio Cisternino's Home Page

Go Search
Home
  
Antonio Cisternino's Home Page > My Blog > Considerations about Generics for Java  

My Blog: Considerations about Generics for Java

Title

Considerations about Generics for Java 

Body

I just found the time to play with Tiger, the upcoming version of Java. One of the most requested features of the language is parametric polymorphism (aka Generics).
I was aware that Pizza/GJ was the chosen implementation for Genericity in Java but I'm afraid the the final result is not what people was looking for.
I tried to consider the following example:
 
class Stack<T> {
  private T[] s;
  int sp;
  public Stack() {
    s = new T[10];
    sp = 0;
  }
}
 
I compiled this example and I got the following error (remember to use -source 1.5 flag to invoke the javac compiler):
 
Stack.java:5: generic array creation
    s = new T[10];
        ^
1 error
 
I was puzzled at first by this error: in a language supporting parametric polymorphism it would a be natural thing to do.
Then I realized what was the problem: generics types are abstractions provided only by the javac and not by JVM.
By this I mean that the following two classes generate the same bytecode:
 
class Stack<T> {
  private Vector<T> s;
  private int sp;
  public Static() {
    s = new Vector<T>();
    sp = 0;
  }
}
 
class Stack {
  private Vector s;
  private int sp;
  public Static() {
    s = new Vector();
    sp = 0;
  }
}
In GJ each parameter type is bound to a type (either interface or class). Within a class the type parameter (T in the example) is assumed to inherits from the bound so that it is possible to typecheck the generic type assuming that on instances of T it is possible to invoke methods inherited from bounds. Thus Vector<T> maps to Vector because T is bound to Object.
GJ converts parametric polymorphism into subtype polymorphism by replacing type parameters with their bounds. Consider for instance the following class:
 
class Foo<T extends Enumeration> {
  public boolean hasMoreElements(T e) {
    return e.hasMoreElements();
  }
}
 
If we compile class Foo and run javap Foo we get the following output:
 
> javap Foo
Compiled from "Foo.java"
class Foo extends java.lang.Object{
    Foo();
    public boolean hasMoreElements(java.util.Enumeration);
}
 
As you can see the type parameter is mapped to the bound and the hasMoreElements accepts a java.util.Enumeration as argument.
Now we can go back to the Stack example and understand why the compiler complains about that simple class. Let us apply the substitution of the type parameter with the bound (which is Object because not specified otherwise):
 
class Stack {
  private Object[] s;
  int sp;
  public Stack() {
    s = new Object[10];
    sp = 0;
  }
}
 
Now if the array s is used inside the class there is no problem, but what if a class behaves as an array factory? Consider the following example:
 
class Foo<T> {
  public T[] m(int sz) {
    return new T[sz];
  }
}
 
What happens if we apply the substitution in this example and we get:
 
class Foo {
  public Object[] m(int sz) {
    return new Object[sz];
  }
}
 
In this case if we run m we get an array of objects and not an array of Ts as expected. In Bracha (http://java.sun.com/j2se/1.5/pdf/generics-tutorial.pdf) it is proposed a factory pattern to work around this problem.
Note that there is also another problem to implement a factory pattern for returning arrays. Java doesn't support variance on arrays (i.e. T inherits from V doesn't imply that T[] inherits from V[]): thus a method returning T[] (with T a parameter type) can't rely on a class that generates an array of the proper type because it can't be returned by the generated method.
Another issue of Generics for Java is the lack of exact runtime type infos. Languages that support only parametric polymorphism can guarantee that a program is typesafe. But when dynamic loading and subtype polimorphims enter the picture things get complex.
Consider the following example:
 
class RTTI {
  public static void Main(String[] args) {
    Vector<String> vs = new Vector<String>();
    Object o = vs;
    Vector<Integer> vi = (Vector<Integer>)o;
    vs.addElement("Hello");
    vi.addElement(2);
    String s = vs.elementAt(1); // Runtime Error Here
  }
}
 
Now we know that Vector<String> and Vector<Integer> map on Vector at runtime it is pretty easy to see that upcasting vs to Object and then back down to vi lead to a problem. And, more important, the compiler has no hope (in general) to find the exact type at compile time.
But if there are so many problems why Sun decided for such a curious implementation of generics? The main answer is "for backward compatibility". Standard Java classes like Vector, can be seen as generic classes without even recompiling the source code. Personally I would have preferred the NextGen extension to Java, proposed by Robert Cartwright, which involves modification of the JVM but a far better type system. As a matter of fact this is the choice made by Microsoft with the upcoming generics: they've modified CLR in order to support true generic types at runtime!
By the way generics in Java doesn't allow to use non reference types as type parameters. Thus parametric polymorphism is just a matter of expressivity of the language, not of performace.
I personally prefer the way chosen by Microsoft with generics CLR (preview available using Rotor http://sharedsourcecli.sscli.net/ and Gyro http://gyro.sscli.net/) and I think it has been a mistake not to impelement a NextGen like implementation of generics in Tiger.

Expires

 

Category

Programming 
Attachments
Created at 3/7/2004 23:31  by Antonio Cisternino 
Last modified at 3/11/2004 23:14  by Antonio Cisternino