Wednesday, December 3, 2008

Manifests: Reified Types

Reification (n.): When an abstract concept [...] is treated as a concrete 'thing'.[1]
When Java-the-language added generic types in version 1.5, Java-the-virtual-machine (JVM) did not. Generic types are a fiction of the compiler. They exist at compile time, but they are omitted from the generated bytecode and are therefore unavailable at run time. This phenomenon is known as erasure.

By contrast, when C# added generic types in version 2.0, the .NET Common Language Runtime (CLR) added them as well. This allows C# to access information about generic types at run time. We say that C#'s generic types are reified.

Java's type erasure leads to some peculiar limitations. To deal with these limitations, Java programmers often find themselves passing around java.lang.Class<T> objects: concrete manifestations of the otherwise abstract concept that is the type "T".

Starting with version 2.7.2, Scala has added manifests, an undocumented (and still experimental) feature for reifying types. They take advantage of a pre-existing Scala feature: implicit parameters.

To use Manifests, simply add an implicit scala.reflect.Manifest[T] parameter to your method, like so:
def name[T](implicit m: scala.reflect.Manifest[T]) = m.toString
This method prints the name of any Scala type. To use it, invoke the method with some type parameter:
name[Int => Int]
// returns "scala.Function1[int, int]"
When using implicit parameters, you usually have to declare a implicit identifier with the same type as your method expects, in the same scope where the method is called. With Manifests, the compiler automatically injects the implicit parameter for you, as long as it has enough type information to generate a Manifest. Essentially, this is a way of carrying over type information available at compile time into objects available at run time.

An Example: Generic Collections Optimized for Primitives

In Java, generic collections can't be used with primitive types. The boxed types must be used instead. This has implications for both run time performance and memory consumption. Despite HotSpot's best optimization efforts, boxed types can add significant overhead in performance-critical code.

In C#, by contrast, generic collections are optimized for performance with primitive types. This is possible because C#'s types are reified. If your code calls for a List<T>, C# knows at run time what T is and can use the specific implementation of List that is optimized for your T.

Let's see what Scala can do with Manifests and a lot of help from the fastutil library, which provides 1000+ (!) Java collection classes optimized for primitives.
import scala.reflect.Manifest
import it.unimi.dsi.fastutil._

def mkList[T](implicit m: Manifest[T]) = {
(m.toString match {
case "boolean" => new booleans.BooleanArrayList
case "byte" => new bytes.ByteArrayList
case "char" => new chars.CharArrayList
case "double" => new doubles.DoubleArrayList
case "float" => new floats.FloatArrayList
case "int" => new ints.IntArrayList
case "long" => new longs.LongArrayList
case "short" => new shorts.ShortArrayList
case _ => new objects.ObjectArrayList[T]
}).asInstanceOf[java.util.List[T]]
}
Now, invoking the mkList method with the proper type (i.e., mkList[Int], mkList[Double], etc.) will create an ArrayList optimized for that particular primitive type. We can verify that the correct classes are being instantiated by doing a little reflection.
mkList[Int].getClass.getName
// returns "it.unimi.dsi.fastutil.ints.IntArrayList"

mkList[Double].getClass.getName
// returns "it.unimi.dsi.fastutil.doubles.DoubleArrayList"
These optimized collections can have considerable benefits over their unoptimized counterparts in java.util. Just imagine, for example, the memory overhead of storing boxed Bytes instead of unboxed bytes. Storing just the pointer for one boxed Byte takes up at least four (and possibly eight) bytes!

Of course, the same effect could be achieved in Java by passing an explicit java.lang.Class<T> parameter to mkList. The disadvantage is that this imposes a significant overhead on the programmer to create and pass around these Class<T> objects. Scala's support for implicit parameters, and the automatic injection of Manifests by the compiler, makes reifying types on the JVM slightly less painful.

Appendix: Other Methods on Manifests

In this example, the only method we used on Manifests was toString. What other methods are available? Manifests are as-yet undocumented (not in the official Scaladocs), but looking through the source can give us an idea of what's available.

erasure

This method returns the java.lang.Class that corresponds to the run time erasure of the Scala type represented by the Manifest.

<:< and >:>

These methods test whether the type represented by the Manifest is a subtype or a supertype (respectively) of the type represented by the given parameter. As the source warns, the current implementation is merely an approximation, as it is based on the erasure of the two types.

equals

This method tests whether two types are equal. Again, the source warns that the current implementation is merely an approximation, as it is based on the erasure of the two types.

Thanks to David Hall for suggesting the example.

15 comments:

Ismael Juma said...

Hi Jorge,

Just be careful not to use anonymous functions to work with the primitive collections or the results might not be what you expect.

In terms of generic specialisations, the following links are interesting:

https://lampsvn.epfl.ch/trac/scala/changeset/16665

https://lampsvn.epfl.ch/trac/scala/wiki/AnnotatedTypeParams

Ismael

Jorge Ortiz said...

Hi Ismael,

Yup, there are lots of pitfalls on the JVM. Unfortunately this is another area where C# "got it right".

As you explore in your blog post, this can be remedied with some compiler hacking. It'd be interesting to see an open source compiler plugin that produced specialized generic functions for use with primitive collections.

And thanks for the links! I wasn't aware of those efforts, and they certainly are interesting.

Ismael Juma said...

Hi Jorge,

Yes, I agree that .NET got it right in that respect.

I would also like to see a compiler plugin for that. In fact, I wanted to work on one, but time (or lack thereof) is the issue as usual.

Yes, I am quite interested to see how those efforts turn out.

By the way, two related things that are interesting are the -optimise flag for scalac and scalar replacement for HotSpot. I intend to do some tests with the former, and I've done some with the latter:

http://blog.juma.me.uk/2008/12/17/objects-with-no-allocation-overhead/

Richard said...

Actually, you can obtain compile time information at runtime with Java, overcoming type erasure.

This is possible if you employ anonymous classes. It's a dirty trick, in fact, but it can be done. See my article about this:
http://www.jquantlib.org/index.php/Using_TypeTokens_to_retrieve_generic_parameters

So, once ...

1. it is possible to keep type information at runtime (as I explained in my article)

2. scalac is a compiler which can potentially "implement it right", creating all data structures needed at runtime when RTTI is needed (and eventually not needing at all any trick like I explained),

...my question is:

Why scalac is not able to produce multiple "incarnations" of a class, pretty much what C++ templates do?

Thanks

Richard Gomes
http://www.jquantlib.org/

Jorge Ortiz said...

Richard:

That's a very neat trick! I'm impressed. The downsides are that you can't retrofit others' generic classes with type parameters, there's no way to guarantee such a class is instantiated in such a reified manner, and you can't propagate a requirement that a generic type be reified up the call stack.

To answer your question though: Scala can compile multiple specialized implementations of a class. Look at Scala's @specialized annotation.

--j

alia said...

your example is great, it helped more than anything elsebuy research papers

Matthew Burdick said...

Your article is incredibly interesting! Browsing your work has educated me. Learned a lot from it. I will store your site and will pursue to read your future blog posts. Excellent! Kudos! Matthew, essay writing trainer

Mari said...

http://www.reconstructinglawschool.org/2012/03/book-review-living-history.html?showComment=1379972199095#c916584798846166612

Mari said...

look.....essay writing service
Great post! Interesting information and cute writing style. essay writing service
but essay writing service I dont know why I always visit here

Елена Смит said...

I really likeyour blog. There are many interesting posts.
рубанок

steve7876 said...

Your article has helped me to read this subject on a different level. I would like to thank your efforts for exploring this issue.Healthy body INC

get rid of cellulite said...

Like your writing! Still you can do some things to improve it.
schifffonds

M RAZA ABBAS said...

Thanks for a very interesting blog. What else can I write that kind of information in such a perfect approach to get? I have a company that I'm just now working on, and I look for more of these.
vastgoedfonds

get rid of cellulite said...

I am very enjoyed for this blog. Its an informative topic. It help me very much to solve some problems. Its opportunity is so fantastic and working style so speedy. Thanks
Energiefonds

steve7876 said...

The write-up features helped everyone to learn that issue using a different levels. I'd really like in order to give thanks to your energy pertaining to checking out this issue.Best Mobile wallet