Reification (n.): When an abstract concept [...] is treated as a concrete 'thing'.[1]
When Java-the-language added generic types in version 1.5, Java-the-virtual-machine (JVM) did not. Generic types are a fiction of the compiler. They exist at compile time, but they are omitted from the generated bytecode and are therefore unavailable at run time. This phenomenon is known as erasure.
By contrast, when C# added generic types in version 2.0, the .NET Common Language Runtime (CLR) added them as well. This allows C# to access information about generic types at run time. We say that C#'s generic types are reified.
Java's type erasure leads to some peculiar limitations. To deal with these limitations, Java programmers often find themselves passing around java.lang.Class<T> objects: concrete manifestations of the otherwise abstract concept that is the type "T".
Starting with version 2.7.2, Scala has added manifests, an undocumented (and still experimental) feature for reifying types. They take advantage of a pre-existing Scala feature: implicit parameters.
To use Manifests, simply add an implicit scala.reflect.Manifest[T] parameter to your method, like so:
This method prints the name of any Scala type. To use it, invoke the method with some type parameter:def name[T](implicit m: scala.reflect.Manifest[T]) = m.toString
When using implicit parameters, you usually have to declare a implicit identifier with the same type as your method expects, in the same scope where the method is called. With Manifests, the compiler automatically injects the implicit parameter for you, as long as it has enough type information to generate a Manifest. Essentially, this is a way of carrying over type information available at compile time into objects available at run time.name[Int => Int]
// returns "scala.Function1[int, int]"
An Example: Generic Collections Optimized for Primitives
In Java, generic collections can't be used with primitive types. The boxed types must be used instead. This has implications for both run time performance and memory consumption. Despite HotSpot's best optimization efforts, boxed types can add significant overhead in performance-critical code.
In C#, by contrast, generic collections are optimized for performance with primitive types. This is possible because C#'s types are reified. If your code calls for a List<T>, C# knows at run time what T is and can use the specific implementation of List that is optimized for your T.
Let's see what Scala can do with Manifests and a lot of help from the fastutil library, which provides 1000+ (!) Java collection classes optimized for primitives.
Now, invoking theimport scala.reflect.Manifest
import it.unimi.dsi.fastutil._
def mkList[T](implicit m: Manifest[T]) = {
(m.toString match {
case "boolean" => new booleans.BooleanArrayList
case "byte" => new bytes.ByteArrayList
case "char" => new chars.CharArrayList
case "double" => new doubles.DoubleArrayList
case "float" => new floats.FloatArrayList
case "int" => new ints.IntArrayList
case "long" => new longs.LongArrayList
case "short" => new shorts.ShortArrayList
case _ => new objects.ObjectArrayList[T]
}).asInstanceOf[java.util.List[T]]
}
mkList method with the proper type (i.e., mkList[Int], mkList[Double], etc.) will create an ArrayList optimized for that particular primitive type. We can verify that the correct classes are being instantiated by doing a little reflection.These optimized collections can have considerable benefits over their unoptimized counterparts inmkList[Int].getClass.getName
// returns "it.unimi.dsi.fastutil.ints.IntArrayList"
mkList[Double].getClass.getName
// returns "it.unimi.dsi.fastutil.doubles.DoubleArrayList"
java.util. Just imagine, for example, the memory overhead of storing boxed Bytes instead of unboxed bytes. Storing just the pointer for one boxed Byte takes up at least four (and possibly eight) bytes!Of course, the same effect could be achieved in Java by passing an explicit java.lang.Class<T> parameter to mkList. The disadvantage is that this imposes a significant overhead on the programmer to create and pass around these Class<T> objects. Scala's support for implicit parameters, and the automatic injection of Manifests by the compiler, makes reifying types on the JVM slightly less painful.
Appendix: Other Methods on Manifests
In this example, the only method we used on Manifests was
toString. What other methods are available? Manifests are as-yet undocumented (not in the official Scaladocs), but looking through the source can give us an idea of what's available.erasureThis method returns the java.lang.Class that corresponds to the run time erasure of the Scala type represented by the Manifest.
<:< and >:>These methods test whether the type represented by the Manifest is a subtype or a supertype (respectively) of the type represented by the given parameter. As the source warns, the current implementation is merely an approximation, as it is based on the erasure of the two types.
equalsThis method tests whether two types are equal. Again, the source warns that the current implementation is merely an approximation, as it is based on the erasure of the two types.
Thanks to David Hall for suggesting the example.
6 comments:
Hi Jorge,
Just be careful not to use anonymous functions to work with the primitive collections or the results might not be what you expect.
In terms of generic specialisations, the following links are interesting:
https://lampsvn.epfl.ch/trac/scala/changeset/16665
https://lampsvn.epfl.ch/trac/scala/wiki/AnnotatedTypeParams
Ismael
Hi Ismael,
Yup, there are lots of pitfalls on the JVM. Unfortunately this is another area where C# "got it right".
As you explore in your blog post, this can be remedied with some compiler hacking. It'd be interesting to see an open source compiler plugin that produced specialized generic functions for use with primitive collections.
And thanks for the links! I wasn't aware of those efforts, and they certainly are interesting.
Hi Jorge,
Yes, I agree that .NET got it right in that respect.
I would also like to see a compiler plugin for that. In fact, I wanted to work on one, but time (or lack thereof) is the issue as usual.
Yes, I am quite interested to see how those efforts turn out.
By the way, two related things that are interesting are the -optimise flag for scalac and scalar replacement for HotSpot. I intend to do some tests with the former, and I've done some with the latter:
http://blog.juma.me.uk/2008/12/17/objects-with-no-allocation-overhead/
Actually, you can obtain compile time information at runtime with Java, overcoming type erasure.
This is possible if you employ anonymous classes. It's a dirty trick, in fact, but it can be done. See my article about this:
http://www.jquantlib.org/index.php/Using_TypeTokens_to_retrieve_generic_parameters
So, once ...
1. it is possible to keep type information at runtime (as I explained in my article)
2. scalac is a compiler which can potentially "implement it right", creating all data structures needed at runtime when RTTI is needed (and eventually not needing at all any trick like I explained),
...my question is:
Why scalac is not able to produce multiple "incarnations" of a class, pretty much what C++ templates do?
Thanks
Richard Gomes
http://www.jquantlib.org/
Richard:
That's a very neat trick! I'm impressed. The downsides are that you can't retrofit others' generic classes with type parameters, there's no way to guarantee such a class is instantiated in such a reified manner, and you can't propagate a requirement that a generic type be reified up the call stack.
To answer your question though: Scala can compile multiple specialized implementations of a class. Look at Scala's @specialized annotation.
--j
your example is great, it helped more than anything elsebuy research papers
Post a Comment