Monthly Archives: October 2008

In the NYT Tech Section

 

The New York Times already has a story about Live Labs Sandbox, a project we released just a few days ago. I am happy to see one of the Live Labs projects I’ve been driving since its embryonic stages receive this kind of attention! For a closer look, drop by our PDC 2008 session in L.A. or join the discussion about Web 2.0 security in the community forums available on websandbox.livelabs.com.

Live Labs Social Streams

In the post Political Streams Online (work blog) I mentioned that I left my fingerprints on Social Streams, the platform underneath Political Streams. For those who have followed my work for a while, the connections are probably obvious. For those who’d like to get oriented, here are a few starting points:

  • Streams fit right into data flow architectures. A Data Flow Pattern Language covers sources, filters, and sinks, as well as how they interact with each other.
  • Aggregating data from heterogeneous sources is an integration problem. Integration Patterns discusses many proven techniques for tackling those problems.
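
To make the source/filter/sink vocabulary concrete, here is a minimal sketch in Python. It is not how Social Streams is built; the function names (`source`, `merge`, `keep_if`, `collect`) and the sample data are made up for illustration, and generators stand in for real feeds.

```python
from typing import Callable, Iterable, Iterator, List


def source(items: Iterable[str]) -> Iterator[str]:
    """Source: emits items one at a time (stands in for a feed)."""
    for item in items:
        yield item


def merge(*streams: Iterable[str]) -> Iterator[str]:
    """Aggregate several sources into one stream by simple concatenation."""
    for stream in streams:
        yield from stream


def keep_if(predicate: Callable[[str], bool], stream: Iterable[str]) -> Iterator[str]:
    """Filter: passes along only the items that satisfy the predicate."""
    return (item for item in stream if predicate(item))


def collect(stream: Iterable[str]) -> List[str]:
    """Sink: drains the stream into a list."""
    return list(stream)


# Two hypothetical heterogeneous sources, merged, filtered, and collected.
blog_posts = source(["election recap", "cat photos"])
news_items = source(["election polls", "weather"])
result = collect(keep_if(lambda s: "election" in s, merge(blog_posts, news_items)))
print(result)
```

Each stage only knows about the stream it consumes, so sources, filters, and sinks can be swapped or recombined without touching the others.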

Enjoy!

Cloud Computing and LINQ

Due to scheduling conflicts, I will miss the forthcoming workshop on Cloud Computing and Its Applications (CCA08), scheduled to kick off in a couple of days in Chicago. Erik Meijer will be there to present our LINQ-to-Datacenter paper (thanks, Erik!). Here’s the abstract:

A plethora of Cloud/fabric frameworks/substrates have emerged within the industry: S3/EC2, Bigtable/Sawzall, Hadoop/PigLatin. Typically these substrates have low-level, idiosyncratic interfaces with data and query models heavily influenced by legacy SQL.

The many choices translate into high pain of adoption by developers because of the high risk of making the wrong bet. The SQL-like query model translates into high pain of adoption because it doesn’t appeal to developers who embraced object-oriented languages like C# or Java. The SQL-like data model is suboptimal for MapReduce computations because it is not fully compositional. This conservative approach is puzzling because recent language and tool innovations such as Language Integrated Query (LINQ) address precisely the problem of compositional programming with data in modern object-oriented languages.

The proponents of the current substrates have no incentive to come up with a general and developer-friendly abstraction that hides the idiosyncrasies of their proprietary solutions and graduates from the SQL model to a modern, object-oriented and compositional style.

We propose extending the LINQ programming model to massively-parallel, data-driven computations. LINQ provides a seamless transition path from computing on top of traditional stores like relational databases or XML to computing on the Cloud. It offers an object-oriented, compositional model that hides the idiosyncrasies of the underlying substrates. We anticipate that just as the community already built custom LINQ providers for sources such as Amazon, Flickr, or SharePoint, this model will trigger a similar convergence in the space of Cloud-based storage and computation substrates.
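
For readers who want a feel for what "compositional" means here, the following is a rough sketch in Python rather than LINQ itself. The `Query` class, its method names, and the sample partitions are hypothetical and do not come from the paper; the partitions are processed one after another and merely stand in for nodes in a data center.

```python
from collections import Counter
from functools import reduce


class Query:
    """Hypothetical LINQ-like query: records a chain of operators, runs them later."""

    def __init__(self, steps=()):
        self.steps = tuple(steps)

    def where(self, predicate):
        # Returns a new query; the original is left untouched, so queries compose.
        def step(rows):
            return (row for row in rows if predicate(row))
        return Query(self.steps + (step,))

    def select(self, projection):
        def step(rows):
            return (projection(row) for row in rows)
        return Query(self.steps + (step,))

    def _count_partition(self, rows):
        # "Map" phase: apply the operator chain to one partition, count locally.
        for step in self.steps:
            rows = step(rows)
        return Counter(rows)

    def count_by_value(self, partitions):
        # "Reduce" phase: merge the per-partition counts into one result.
        partials = [self._count_partition(p) for p in partitions]
        return reduce(lambda a, b: a + b, partials, Counter())


# Made-up data split across two partitions.
partitions = [["Hadoop", "linq", "S3"], ["LINQ", "hadoop", ""]]
query = Query().where(lambda w: w != "").select(str.lower)
totals = query.count_by_value(partitions)
print(dict(totals))
```

The same `query` object could be handed to a different executor without being rewritten, which is the kind of separation between the query and the substrate that the abstract argues for.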