To check whether a type is an erased subtype of another we walk along the inheritance hierarchy of the (presumably) subtype and check whether we find a type with the same qualified name as the (erased) supertype. If this is the case we stop the search and return success.
- This does not take reference subtyping rules into account:
List<?>
is not a subtype ofList<Object>
but the current implementation will tell you it is as it only compares theList
type.
Adapting a type is the process of converting a generic type or type variable from the class it is declared in to a subclass. This can change the type quite dramatically as the following examples show:
interface Parent<T> {}
interface Middle<Q> extends Parent<Q> {}
interface Bottom extends Middle<String> {}
Adapting T
to Middle
returns Q
, as the type parameter was renamed in the
subclass. Adapting Q
(or T
) to Bottom
returns String
as the subclass
substitutes a concrete type.
Adaption currently works like this:
-
We compute the hierarchy from our context (subclass) to the supertype. This hierarchy is a tree built from the subclass by examining all supertypes and consists of two types of nodes: Declaration and Glue nodes.
Declaration nodes are created for each type we visit along the way and store their formal type parameters in the order they appear in the type.
Glue nodes are created for each type use. If we have a class
interface Foo<T> extends Bar<T>
, thenBar<T>
is a use of the typeBar
that glues the two declaration nodes (forFoo
andBar
) together. Glue nodes can translate type parameters (e.g. convertT
toQ
) in the example in the beginning or translate a type parameter to a concrete type (e.g.Q
toString
).This hierarchy is then inverted so the root is the top most type. This is needed as we translate a reference from the supertype to the subtype by walking down the tree.
The hierarchy for the example above would be:
| DeclarationNode `Parent` | Formal Arguments: [T] | Children: [ | | GlueNode `Parent` | | Actual Arguments: [Q] | | Children: [ | | | DeclarationNode `Middle` | | | Formal Arguments: [Q] | | | Children: [ | | | | GlueNode `Middle` | | | | Actual Arguments: [String] | | | | Children: [ | | | | | DeclarationNode `Bottom` | | | | | Formal Arguments: [] | | | | | Children: [] | | | | ] | | | ] | | ] | ]
-
If the input is a type parameter we will traverse down the tree until we either arrive at a leaf or find a concrete type (like
String
in the example above).If the input reference is a generic type like
List<? extends T>
it is split apart and each part is recursively translated.
For translating formal type parameters from one method to another we can not apply the same algorithm, as the subclass method can rename them arbitrarily, but we need to return the same names! Consider this:
interface Parent {
<T> void foo(T t);
}
interface Child extends Parent {
<R> void foo(R t);
}
Translating T
from Parent#foo
to Child#foo
should obviously return R
-
but there is no mapping that tells us this! Thankfully method parameters are
easier to translate. We know that a program that typechecks will have a
compatible translated type in the same place in the subclass method as it had
in the superclass method. For this reason we can just search for a usage of T
in Parent#foo
and then return the corresponding type from Child#foo
. In the
example we would find out that T
is used as the type for the first parameter
and would return the type of the first parameter of Child#foo
- R
. Which is
correct!
For this reason the type adaption method check whether it is translating a formal type parameter declared on a method to another method - and short-circuit the operation.
Java is subject to a form of the diamond problem as interfaces allow for multiple inheritance. Due to this there might actually exist multiple paths down from a given root to a leaf. Thankfully Java disallows a type to implement any type more than once with differing parameters, so we can just choose an arbitrary path to follow. This implementation always follows the first child.
It is important to note that this will not cause us to get stuck by taking a wrong turn and no longer having a path down to our target class. We build the hierarchy from the bottom up and therefore ensure that we can get down to where we need to go from any other node in the tree.
If we can not find any hierarchy we currently return the input type unchanged.
Java allows classes to use generics of their enclosing class, e.g.
public class Outer<A> {
public class Inner<B> {
public void bar(A a, B b) {}
}
}
If you now additionally introduce a dual inheritance relationship
public class OuterSub<A1> extends Outer<A1> {
public class InnerSub<B1> extends Outer.Inner<B1> {
public void bar(A1 a, B1 b) {}
}
}
Adapting the method from Outer.Inner#bar
to OuterSub.InnerSub#bar
involves
building the hierarchy from InnerSub
to Outer.Inner
and then adapting A1
and B1
.
While this hierarchy can be used to resolve B
to B1
, it will not be helpful
for adapting A
to A1
: This generic type is passed down through a completely
different inheritance chain.
To solve this, we check whether a type parameter is declared by the class
containing the type reference.
For example, when translating A1
we notice that even though the usage is in
InnerSub#bar
, the declaration of the type parameter was in OuterSub
.
Therefore, we adjust the end of our hierarchy to Outer
instead of Inner
and
the start to the innermost enclosing class inheriting from that, which is
OuterSub
in our example.
Notice that there could be multiple matching enclosing classes, but we
have no way to decide which one to use, and just arbitrarily resolve the
ambiguity by picking the first one.
After this adjustment of the start and end types, the rest of the translation continues normally.
Closely related to type adaption (but not exactly the same!) is translating entire methods. As a first approach we might try to just re-use our type adaption code and adapt the return, parameter and thrown types down to the target class. This, however, is not sufficient as methods might declare their own type parameters. Consider this example:
interface Parent<T> {
<R extends T> R foo();
}
interface Child<Q> extends Parent<Q> {}
If we translated all types using the type adaption algorithm above we might get
R foo();
as there is no equivalent to R
we can adapt to - there just is nothing here
which would correspond to the method type parameter with an updated bound.
What we do instead is cloning the method and copying the formal type parameters but adapting their bound. The example above is first turned into:
<R extends Q> R foo();
After this step the other types are translated using the normal algorithm. As
we keep the names of the formal type parameters the same we do not need to
rewrite the method body, but we might incur clashes with existing type variables
in the subclass (have a look at what happens when you rename Q
to T
in the
subclass and add a T t
parameter to the superclass method).
It is sometimes necessary to find out whether two methods conflict, i.e. whether you can not declare both in the same type. A simple example would be
String foo();
String foo();
Obviously you can only have one of those in a type at a given time! A simple way to detect this is just checking whether the names and parameter types of the methods match.
Sadly, generic erasure makes this problem a lot more difficult. The following two methods also conflict:
void foo(List<String> list);
void foo(List<Integer> list);
One solution to fix this is only looking at the type erasure of the two methods. And this indeed solves the issue here as the methods would erase to:
void foo(List list);
void foo(List list);
Those methods have the same name and parameters and therefore obviously conflict.
Alas, this is still not enough. Consider this:
interface Foo<T> {
void foo(T t);
}
interface SubFoo extends Foo<String> {
void foo(String t);
}
Those two methods would erase - respectively - to:
void foo(Object t);
void foo(String t);
That's a bummer! The erasure of the two methods differs but the two methods
still very much conflict. In fact, one of them is overriding the other! In case
you are curious how this works in Java: The compiler generates a synthetic
bridge method in SubFoo
that delegates to the more specific one:
void foo(Object t) {
this.foo((String) t);
}
We solve this problem by falling down into the slow path, explicitly checking whether one method overrides the other if the parameter type erasure doesn't match. Only doing this when needed speeds up the process.
To detect whether one method overrides we perform a suite of basic checks:
- Do the methods have the same name?
- The same parameter count? The same parameter types?
- Is the declaring type of the first a subtype of the other?
- Are the methods static? Then they can not override anything!
If all of these check out, we grab the biggest hammer we can find in our toolbox: Method adaption. If a method overrides another, their types' erasure must match after adapting the superclass method to the subclass. In our conflict check example above:
interface Foo<T> {
void foo(T t);
}
interface SubFoo extends Foo<String> {
void foo(String t);
}
We would adapt the superclass method to:
void foo(String t);
Which is a match even before computing the type erasure!
If overriding was too easy, this introduces another layer of complexity. Two methods have the same signature if they have the same types after adapting the types.
If two methods conflict or override each other they obviously have the same
signature. But there is another way that can happen and for this context is
everything. In the explanation at the start we casually stated that the types
should be "adapted". But to what? Type adaption needs a context! The answer
isn't clear-cut here and the only user of this function
(AllMethodsSameSignatureFunction
) does the right thing. But let's have a look
at an example to illustrate this point:
interface InterfaceB<T> {
void foo(T t);
}
class TypeA {
void foo(Exception e) {}
}
class TypeB extends TypeA implements InterfaceB<Exception> {}
The interesting question here is whether TypeA#foo
and InterfaceB#foo
have
the same signature. If TypeB
would not exist the answer is a resounding No!
- the two methods have different parameter types.
TypeB
makes this harder though. It implements the interface with the same typeTypeA
uses in its method and, in fact, thefoo
methodTypeB
inherits fromTypeA
is actually the implementationTypeB
uses! In some sense,TypeA#foo
"implements"InterfaceB#foo
here and they do share the same signature!
To solve this problem, isSameSignature
first adapts both input methods to
its context and then passes the resulting methods to isConflicting
. In the
example above the adaption would produce
void foo(Exception t); // from InterfaceB
void foo(Exception e); // from TypeA
and isConflicting
would return true. Crisis averted.
With this you are hopefully well-equipped to tackle the implementation. The first drafts actually weren't much longer than this document - but over time more and more of the problems above arose. Today, the adaption code is a bit more lengthy but hopefully split up well enough that you can go bug hunting after reading this document. Happy coding :)