Tuesday, December 16, 2008

Growing a language

Today I needed to do some load testing on a web site. All I wanted to do was hammer it with requests and observe how well it held up. I wasn't familiar with any tools that did this, and had no desire to learn them, so I sat down and wrote my own mini-framework in less than a hundred lines of Scala. It's nothing special, but I want to share bits of it to show how you can grow Scala to fit your needs.

HTTP GET

At the heart of any web testing is HTTP. Scala runs on the JVM, so I browsed through some Java libraries for fetching HTTP content, but the existing Java approaches were too verbose for my taste. Java is a systems language and it shows, especially in the library design decisions. For this particular application, I don't want to deal with the nitty-gritty of InputStreams or RetryHandlers. I just want to GET. So I borrowed a page from a Ruby library and wrote the following:

object Http {
def get(url: String): Array[Byte] = ...
}

Http.get("http://scala-tools.org/")
Thanks Ruby, that's much cleaner. (Scala's big win over Ruby? Concurrency for Grownups, as opposed to Ruby's fake kiddie toy concurrency. Thankfully configuring commons-httpclient to manage concurrent requests is fairly straightforward.)
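
The body of Http.get is elided above. As a rough sketch only (this is not the commons-httpclient version the post was built on), here is one way it could be filled in using nothing but the JDK. It assumes you can live without timeouts, retries, or connection pooling, all of which a real load tester would want:

```scala
import java.io.ByteArrayOutputStream
import java.net.URI

object Http {
  // Assumption: plain JDK URL streams stand in for commons-httpclient here.
  // Reads the whole response body into memory and returns it as bytes.
  def get(url: String): Array[Byte] = {
    val in = new URI(url).toURL.openStream()
    try {
      val out = new ByteArrayOutputStream()
      val buffer = new Array[Byte](8192)
      var read = in.read(buffer)
      while (read != -1) {
        out.write(buffer, 0, read)
        read = in.read(buffer)
      }
      out.toByteArray
    } finally in.close()
  }
}
```

The signature matches the one above, so the rest of the examples don't care which implementation sits behind it.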

Timing is Everything

When load testing, it's nice to know how long things take. In Java, you need to spare a few lines of code whenever you want to time an operation. In Scala, once the appropriate abstraction is written, it's a one-liner.
object Time {
def apply[T](action: => T): (T, Long) = ...
def elapsed(action: => Unit): Long = ...
}

val (response, ms) = Time(Http.get("http://scala-blogs.org/"))
val ms = Time.elapsed(Http.get("http://scala-tools.org/"))
A call to Time(...) returns a pair with the result of the action and the time it took to complete that action. I'm using special val syntax to deconstruct the pair and get at its contents. If I don't care about the result of my action, a call to Time.elapsed(...) just returns how long the action took.

How does it work? Each parameter's type has a => in front of it, which makes it a call-by-name parameter. That just means the argument won't get evaluated until I use its name somewhere in my method (and it gets evaluated again every time I use it). So Http.get doesn't get called until my method implementation says it should be called. This lets me record the start and end times for the Http.get operation, and report the elapsed time when my method returns. (Try that in Java!)
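
To make the call-by-name trick concrete, here is a minimal sketch of how Time might be implemented. The signatures are the ones shown above; the bodies are an assumption, including the choice of System.nanoTime (converted to milliseconds) as the clock:

```scala
object Time {
  // `action` is call-by-name: nothing runs until it is referenced below.
  def apply[T](action: => T): (T, Long) = {
    val start = System.nanoTime()
    val result = action // the action is evaluated here, exactly once
    val elapsedMs = (System.nanoTime() - start) / 1000000L
    (result, elapsedMs)
  }

  // Same measurement, but the result of the action is thrown away.
  def elapsed(action: => Unit): Long = apply(action)._2
}
```

Because the action is only referenced once inside apply, it runs exactly once, and it runs between the two clock readings.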

Getting Repetitive

I want to hammer my poor web server and put it under a lot of stress, so repetition is key. Scala has ways to repeat something several times (notably, (1 to n) syntax), but for several reasons (it's lazy by default, I don't care about the indices) I wrote my own.
def repeat[T](n: Int)(what: => T): List[T] = ...

repeat(5) {
println(Time.elapsed(Http.get("http://scala-blogs.org/")))
}

val randomInts: List[Int] = repeat(10)(random.nextInt)

val latencies: List[Long] =
repeat(1000)(Time.elapsed(Http.get("http://scala-blogs.org/")))

val (latencies, totalTime) =
Time(repeat(1000)(Time.elapsed(Http.get("http://scala-blogs.org/"))))
The repeat method takes a number n and an action. It performs the action n times and returns a list (of size n) with the results of the actions.
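
A minimal sketch of repeat, assuming a Scala version (2.8 or later) where List.fill takes its element by name. The wrapper object Repeat is a name made up for this example so the snippet compiles on its own:

```scala
object Repeat {
  // `what` is call-by-name, so it is re-evaluated once for every element.
  // List is strict, so all n actions have run by the time this returns.
  def repeat[T](n: Int)(what: => T): List[T] = List.fill(n)(what)
}
```

The second parameter list is what allows the curly-brace call style, repeat(5) { ... }, to look like a built-in control structure.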

Repetition, Repetition, Repetition (in parallel!)

Of course, in order to really hammer a server we have to request stuff in parallel, not sequentially. Actors make this really easy.
def repeatParallel[T](n: Int)(what: => T): List[T] =
repeat(n)(scala.actors.Futures.future(what)).map(_.apply)
The call to future fires off an actor that will perform the given action and immediately returns a "promise". Calling apply on the promise blocks the current thread until the actor has finished computing and returned the result of the action we requested. The repeatParallel method launches n actors in parallel, and then blocks until they've all finished computing.
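
The scala.actors library has since been deprecated, so the one-liner above won't compile on a current Scala. As a hedged sketch of the same idea with a different tool, here is a version built on scala.concurrent.Future. The object name Parallel and the implicit ExecutionContext parameter are assumptions of this example, not part of the original code:

```scala
import scala.concurrent.{Await, ExecutionContext, Future}
import scala.concurrent.duration.Duration

object Parallel {
  def repeatParallel[T](n: Int)(what: => T)(implicit ec: ExecutionContext): List[T] = {
    // Start all n futures first; each one evaluates `what` on the pool.
    val futures = List.fill(n)(Future(what))
    // Then block on each in turn until every result is available.
    futures.map(f => Await.result(f, Duration.Inf))
  }
}
```

One caveat: the amount of real parallelism depends on the ExecutionContext you pass in. The default global context is sized around the number of CPU cores, so for blocking HTTP calls a dedicated, larger thread pool is probably the better fit.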

Load Testing

So what does my load testing code look like?
def url = ... // randomly generated url
def timedRequest = (Time elapsed (Http get url))

repeat(5) {
val (latencies, ms) = Time(repeatParallel(1000)(timedRequest))
report(latencies, ms)
}
The report will just print out some basic statistics about the latencies (min, max, average, median) as well as the total throughput (number of requests per second over the whole period).
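
The report method isn't shown, so the following is only a guess at its shape: a sketch that computes the statistics listed above from a list of latencies and a total time, both in milliseconds. The Stats case class, the stats helper, and the output format are all invented for this example:

```scala
object Report {
  final case class Stats(min: Long, max: Long, average: Double,
                         median: Double, requestsPerSecond: Double)

  def stats(latencies: List[Long], totalMs: Long): Stats = {
    require(latencies.nonEmpty, "need at least one latency")
    val sorted = latencies.sorted.toVector
    val n = sorted.size
    // Median: middle element, or the mean of the two middle elements.
    val median =
      if (n % 2 == 1) sorted(n / 2).toDouble
      else (sorted(n / 2 - 1) + sorted(n / 2)) / 2.0
    // Throughput over the whole period; guard against a zero-length run.
    val throughput =
      if (totalMs > 0) n * 1000.0 / totalMs else Double.PositiveInfinity
    Stats(sorted.head, sorted.last, sorted.sum.toDouble / n, median, throughput)
  }

  def report(latencies: List[Long], totalMs: Long): Unit = {
    val s = stats(latencies, totalMs)
    println(f"min=${s.min}ms max=${s.max}ms avg=${s.average}%.1fms " +
      f"median=${s.median}%.1fms throughput=${s.requestsPerSecond}%.1f req/s")
  }
}
```

Splitting the arithmetic (stats) from the printing (report) keeps the numbers easy to check on their own.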

These six lines of code will hammer my poor little server with 5,000 requests, printing a report after every thousand requests. Every request will hit a randomly generated URL, so I'm making sure to stress test all the aspects of my web service that I care about. I'm not doing any validation on the responses (I've done separate tests for correctness). In fact, I'm throwing the responses away, but I could easily add some validators. (Scala's XML libraries and its ability to pass functions as objects would make web service validators a cinch.)

The Lesson?

Scala lets you grow the language to mold it to your needs. The guts of my testing logic is about half a dozen lines. The other hundred-or-so lines of code I wrote are hidden away in little utilities here and there, utilities that have extended Scala in tiny ways and made it a more powerful language for the particular task I had. More importantly, all these little utilities are reusable, so now I have a more powerful language to confront whatever tasks I might face tomorrow.

6 comments:

liesen said...

I also find this type of "function wrapping" useful. I often wrap a function call to fire it off asynchronously, with a timeout and a callback.

Here's a quite pleasant HTTP library for Scala: http://technically.us/code/x/pour-some-sugar-on-httpclient/

Channing Walton said...

Cool, I did something similar for load testing an object space.

You might be interested in Apache JMeter though, it's very easy to set up and run.

Channing

renghenthecow said...

code please

Monis Iqbal said...

Nice demo. One thing, though: using curly brackets for single-parameter methods may get confusing, because in some places we could also use ().
