Thursday, January 29, 2009

Announcing ScalaNLP

I'd like to announce the first release of ScalaNLP, an open-source
collection of libraries with the goal of making Natural Language
Processing and Machine Learning research easier.

Currently, ScalaNLP consists of 3 projects:

* Core - A collection of utility libraries.
* SMR - Scala Map Reduce, a wrapper library for Hadoop with some
standalone support.
* Scalala - A Scala linear algebra library inspired by Matlab.

Today, I'm mostly announcing Core, which consists of sampling routines
for many distributions, counters (specialized numeric-valued maps),
and some support for managing datasets. There are essentially no
off-the-shelf tools at the moment; hopefully in the future there will be.
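
To give a rough idea of what a counter (a numeric-valued map) looks like, here is a minimal sketch. It is not ScalaNLP's actual API: the names SimpleCounter, increment, normalized, and CounterDemo are hypothetical and chosen only for illustration.

```scala
import scala.collection.mutable

// Hypothetical illustration only; this is not ScalaNLP's Counter class.
// A counter is a map from keys to numbers where missing keys count as 0.
class SimpleCounter[K] {
  private val counts: mutable.Map[K, Double] =
    mutable.Map.empty[K, Double].withDefaultValue(0.0)

  // Add `amount` to the count stored for `key`.
  def increment(key: K, amount: Double = 1.0): Unit =
    counts.update(key, counts(key) + amount)

  // Current count for `key` (0.0 if the key was never seen).
  def apply(key: K): Double = counts(key)

  // Sum of all counts.
  def total: Double = counts.values.sum

  // Relative frequencies; empty if nothing has been counted yet.
  def normalized: Map[K, Double] = {
    val t = total
    if (t == 0.0) Map.empty[K, Double]
    else counts.iterator.map { case (k, v) => k -> v / t }.toMap
  }
}

object CounterDemo {
  // Count whitespace-separated words in a piece of text.
  def wordCounts(text: String): SimpleCounter[String] = {
    val c = new SimpleCounter[String]
    text.split("\\s+").filter(_.nonEmpty).foreach(w => c.increment(w))
    c
  }
}
```

For example, CounterDemo.wordCounts("the cat saw the dog") holds a count of 2.0 for "the" and a total of 5.0.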

For more information and to get started, please see the wiki at
http://www.scalanlp.org/ .

Thursday, January 15, 2009

Folks,

After many months of "breaking changes", the Lift team is proud to announce Lift 0.10. Lift 0.10 has frozen APIs and barring any material problems, the 0.10 APIs will be the same as the upcoming 1.0 APIs. We're planning to release 1.0 at the end of February.

Lift is an expressive and elegant framework for writing web applications. Lift stresses the importance of security, maintainability, scalability and performance while allowing for high levels of developer productivity. Lift is a Scala web framework.

Over the next 6 weeks, the Lift team will be focusing effort on cleaning up some Lift internals, improving performance, improving documentation, and a new Lift web site.

There are a ton of people who have contributed to Lift over the last two years by writing code, by writing documentation, by asking questions, by building Lift-based apps, and by participating in the Lift community. I would like to extend special thanks to some folks who have made Lift 0.10 both possible and awesome:

- Marius... he's an awesome developer who has amazing vision and
understanding of what Lift is and should be. If I got hit by a bus, Marius
could continue to drive the Lift code base forward.
- DavidB provides great structure and process to Lift. He runs the
scala-tools.org site and keeps all the dependencies and pieces of Lift
and the greater Scala/Maven ecosystem humming.
- Jorge's calm, deliberative way helps bring the needs of all users into
perspective.
- Derek's JPA contributions have moved Lift into a place where it can
play in non-greenfield apps.
- Tyler's energy is infectious.
- Tim's energy is properly infectious.
- Derek, Marius, and Tyler have conspired to deliver some excellent Lift
bookage for those who can convert LyX to PDF.
- Charles who delivers apps and asks questions.
- Debby who is bringing her legendary cat-herding skills to the Lift
committers.

So, as we spirit Lift through the final part of the beta process, thank you all for participating in the community.

Changes in this version include:

New features:
o A Currency class
o Added a nifty mechanism for stateful form management (Hoot)
o Added fix CSS support
o Added support for other JS libraries
o Added JSON forms support
o Added PayPal Integration module

Changes:
o Consolidate LiftRules
o Added HTTP authentication support
o Upgrade to Scala 2.7.3
o Upgrade to Scalacheck 1.5
o Upgrade to Specs 1.4.0
o Added Record/Field generic support
o Changed Can to Box
o Changed RequestState to Req
o Updated LiftView to be more syntactically pleasing
o Fixed a bug with how RequestVars and traits work
o Enhanced the DateTime inputs
o First pass at complete PayPal ITN and PDT stuff
o Refactoring of the PayPal stuff
o Updated to jQuery 1.2.6
o Redesigned Gravatar widget

Thanks,

David

Wednesday, January 14, 2009

Google data twitters

I long had a hunch that the Google Data API is not the only Atom-based API out there, but never had the time to investigate. Now I just had a look at Twitter (although I'm not a user -- yet), and it turns out the XML combinators I wrote some time ago for Google Data work out of the box! I was so happy to notice, I had to blog about it.

GData-Scala-client is a library for using Google services programmatically. The core of this library is a collection of combinators for XML serialization, together with picklers for Atom and other common data. Since Twitter offers an Atom-based feed API, I decided to give it a shot. I fired up a Scala interpreter:


scala> import com.google.gdata._
import com.google.gdata._

scala> val service = new Service("acme-twitter", "tw") {}
service: com.google.gdata.Service = $anon$1@dfee1


Make sure you start scala with your downloaded gdata-scala-client jar on the classpath. The Service class we created takes care of sending simple queries over HTTP and applying a pickler to the returned data. The two parameters are the application name and the service name, but they have no meaning outside Google's realm, so we can be creative. Now let's make a simple query:



scala> import data._
import data._

scala> val atomFeed = new StdAtomFeed
atomFeed: com.google.gdata.data.StdAtomFeed = com.google.gdata.data.StdAtomFeed@11db25

scala> val f = service.query("http://twitter.com/statuses/public_timeline.atom", atomFeed.feedPickler)
f: atomFeed.Feed =
Authors:
Id: tag:twitter.com,2007:Status
Title: (None,Twitter public timeline)
Updated: 2009-01-14T21:04:11.000Z
Entries: Entry:
Authors: (Robert Basic,Some(http://robertbasic.com),None)
Id: Some(tag:twitter.com,2007:http://twitter.com/robertbasic/statuses/1119276026)
Title: (None,robertbasic: @bojanpejic thanks mate :) you'll see it tomorrow in action ;))
...


Surprise! It really worked! (At least, that was my reaction.) Let's see what we had to do: create an instance of StdAtomFeed, which is a class defining the standard contents of an Atom feed, together with the right picklers (serialization objects). Next, we issued a query, passing the URL and the pickler used to deserialize the response.
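
For readers new to the idea, a pickler pairs a "write" function with a matching "read" function, and combinators build bigger picklers out of smaller ones. The sketch below is a toy version that works on a flat list of string tokens instead of XML. It is not the gdata-scala-client API: Pickler, Picklers.text, seq, wrap, and Author are hypothetical names used only to show the shape of the technique.

```scala
// Toy pickler combinators over a list of string tokens.
// Hypothetical names; the real library works on XML and has its own API.
trait Pickler[A] {
  // Serialize a value to tokens.
  def pickle(a: A): List[String]
  // Read a value back; returns it with the unconsumed input, or None on failure.
  def unpickle(in: List[String]): Option[(A, List[String])]
}

object Picklers {
  // Primitive pickler: a single string token.
  val text: Pickler[String] = new Pickler[String] {
    def pickle(a: String): List[String] = List(a)
    def unpickle(in: List[String]): Option[(String, List[String])] = in match {
      case head :: tail => Some((head, tail))
      case Nil          => None
    }
  }

  // Combinator: run two picklers one after the other, yielding a pair.
  def seq[A, B](pa: Pickler[A], pb: Pickler[B]): Pickler[(A, B)] =
    new Pickler[(A, B)] {
      def pickle(ab: (A, B)): List[String] = pa.pickle(ab._1) ++ pb.pickle(ab._2)
      def unpickle(in: List[String]): Option[((A, B), List[String])] =
        pa.unpickle(in).flatMap { case (a, rest1) =>
          pb.unpickle(rest1).map { case (b, rest2) => ((a, b), rest2) }
        }
    }

  // Combinator: adapt a pickler to another type via a pair of conversions.
  def wrap[A, B](p: Pickler[A])(to: A => B, from: B => A): Pickler[B] =
    new Pickler[B] {
      def pickle(b: B): List[String] = p.pickle(from(b))
      def unpickle(in: List[String]): Option[(B, List[String])] =
        p.unpickle(in).map { case (a, rest) => (to(a), rest) }
    }
}

// A small record type, loosely modelled on an Atom author (name and URI).
case class Author(name: String, uri: String)

object PicklerDemo {
  import Picklers._

  // Built entirely from the primitive and the two combinators above.
  val author: Pickler[Author] =
    wrap[(String, String), Author](seq(text, text))(
      { case (n, u) => Author(n, u) },
      a => (a.name, a.uri)
    )
}
```

Pickling an Author with PicklerDemo.author and unpickling the result gives back the same value, which is the property a query relies on when it hands a pickler the response to deserialize.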

We can further play with the Atom feed, inspecting its entries, or learning more about its contents:


scala> f.updated
res10: com.google.gdata.data.util.DateTime = 2009-01-14T21:04:11.000Z

scala> f.entries.filter (_.authors.exists(_.name.startsWith("K") ))
res17: List[atomFeed.Entry] =
List(Entry:
Authors: (Katie,None,None)
Id: Some(tag:twitter.com,2007:http://twitter.com/myheartradio/statuses/1119276031)
Title: (None,myheartradio: @E_Steve Chapter Six...I'm not too far into it yet.)
Updated: 2009-01-14T21:04:11.000Z)


That's about how far I got. Authentication does not work yet, because the library implements the protocol used by Google services and Twitter uses a different one. You can find out more about what's available here, but overall it's good news and I hope others will find this library useful as well.