
Spark-timeseries with Spark 1.6.1 (and Breeze 0.12)

Over the last few days I’ve been experimenting with Spark in the context of processing time series data. The explorations took me to spark-timeseries, a library aimed at just that. Right off the bat I ran into versioning problems: I use Scala 2.11 and Spark 1.6.1, whereas spark-timeseries depends on Scala 2.10 and Spark 1.3.1 (the latter released almost a year ago).

<scala.minor.version>2.10</scala.minor.version>
<scala.complete.version>${scala.minor.version}.4</scala.complete.version>
<spark.version>1.3.1</spark.version>

Going through the pom file I found a scala-2.11 maven profile, which took care of the Scala version mismatch. I created a new one to bring in Spark 1.6.1, and while at it I also upgraded scalanlp/breeze to 0.12 (the latest release as of May 2016). Building Spark-timeseries with these versions amounts to using the following profiles:

mvn package -P scala-2.11,spark-1.6.1

The updates are available from GitHub, on my repo fork. Happy hacking!

WP7 Code: Distance Computations with the GeoLocation API

(This is a repost from my old MSDN blog. I haven’t verified the links, etc.)

In my previous post I showed the most interesting code fragments for a location-aware Windows Phone 7 application. The code generates an event stream corresponding to location readings from the phone’s location subsystem. However, there are many applications that instead of lat/long readings need to compute the traveled distance (for example, when driving, biking, running, or hiking). This post shows how to convert the position readings from the Windows Phone location subsystem into distance measurements.




To begin with, remember that motion on the earth’s surface occurs on an (approximate) sphere rather than on a plane. Consequently Euclidean geometry no longer applies; instead, the Haversine formula provides the great-circle distance between two locations. As implementing Haversine in C# has nothing to do with the phone, I will reuse some code surfaced by a Web search. The implementation’s use of C# extension methods is in line with what I used to bridge between .NET events and RxLINQ event streams. In addition, they make the code read like English, which is pretty neat. The C# Haversine code follows, with the argument types updated to match those from the WP7 GeoLocation API:

public enum DistanceIn { Miles, Kilometers };

public static class Haversine
{
    public static double Between(this DistanceIn @in, GeoPosition<GeoCoordinate> here, GeoPosition<GeoCoordinate> there)
    {
        var r = (@in == DistanceIn.Miles) ? 3960 : 6371;
        var dLat = (there.Location.Latitude - here.Location.Latitude).ToRadian();
        var dLon = (there.Location.Longitude - here.Location.Longitude).ToRadian();
        var a = Math.Sin(dLat / 2) * Math.Sin(dLat / 2) +
                Math.Cos(here.Location.Latitude.ToRadian()) * Math.Cos(there.Location.Latitude.ToRadian()) *
                Math.Sin(dLon / 2) * Math.Sin(dLon / 2);
        var c = 2 * Math.Asin(Math.Min(1, Math.Sqrt(a)));
        var d = r * c;
        return d;
    }

    private static double ToRadian(this double val)
    {
        return (Math.PI / 180) * val;
    }
}
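
The formula is easy to sanity-check off the phone. The following is a minimal Python sketch, not part of the original WP7 code: the function name, the argument order, and the unit strings are assumptions made for illustration, while the radii and the arithmetic mirror the C# above. One degree of longitude along the equator should come out at 2πr/360, roughly 111.19 km.

```python
import math

# Earth radii matching the constants in the C# code (miles, kilometers).
EARTH_RADIUS = {"miles": 3960.0, "kilometers": 6371.0}


def haversine(lat1, lon1, lat2, lon2, unit="kilometers"):
    """Great-circle distance between two lat/long points given in degrees."""
    r = EARTH_RADIUS[unit]
    d_lat = math.radians(lat2 - lat1)
    d_lon = math.radians(lon2 - lon1)
    a = (math.sin(d_lat / 2) ** 2
         + math.cos(math.radians(lat1)) * math.cos(math.radians(lat2))
         * math.sin(d_lon / 2) ** 2)
    # min(1, ...) guards against rounding pushing sqrt(a) slightly above 1.
    c = 2 * math.asin(min(1.0, math.sqrt(a)))
    return r * c


# One degree of longitude along the equator.
one_degree_km = haversine(0.0, 0.0, 0.0, 1.0)
print(round(one_degree_km, 2))
```

The same call with unit="miles" uses the 3960 radius, just like passing DistanceIn.Miles in the C# version.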

Start with the application from my previous blog post and change the XAML (MainPage.xaml) to include a text block and a button in the content panel (new code in green):

<Grid x:Name="ContentPanel" Grid.Row="1" Margin="12,0,12,0">
    <TextBlock Height="30" HorizontalAlignment="Left" Margin="36,69,0,0" Name="textBlock1" Text="(no reading)" VerticalAlignment="Top" Width="392" />
    <Button Content="Start" Height="84" HorizontalAlignment="Left" Margin="121,493,0,0" Name="button1" VerticalAlignment="Top" Width="227" />
</Grid>

Next, wire the button such that taps (i.e., Click events) start or stop the GeoCoordinateWatcher. As clicks are asynchronous events, RxLINQ provides an elegant way to deal with them. First add to the Helpers class an extension method that brings Click events into the realm of Rx:

public static IObservable<RoutedEventArgs> GetClickEventStream(this Button button)
{
    return Observable.Create<RoutedEventArgs>(observer =>
    {
        // Forward every Click event to the subscriber.
        RoutedEventHandler handler = (s, e) =>
        {
            observer.OnNext(e);
        };
        button.Click += handler;
        // Unsubscribing detaches the handler from the button.
        return () => { button.Click -= handler; };
    });
}

Next remove the gcw.Start() call from the OnNavigatedTo override and add an RxLINQ query and subscriber to the button click event stream (new code in green):

protected override void OnNavigatedTo(System.Windows.Navigation.NavigationEventArgs e)
{
    base.OnNavigatedTo(e);

    if (gcw == null)
        gcw = new GeoCoordinateWatcher();

    ShowGeoLocation();
}

// snip

public void ShowGeoLocation()
{
    var statusChanges = from statusChanged in gcw.GetStatusChangedEventStream()
                        where statusChanged.Status == GeoPositionStatus.Ready
                        select statusChanged;

    button1.GetClickEventStream().Scan( false
        , (isStarted, _) =>
        {
            if (isStarted)
                gcw.Stop();
            else
                gcw.Start();
            return !isStarted;
        }).Subscribe(isStarted => button1.Content = isStarted ? "Stop" : "Start" );

    // snip
}

The Scan operator allows the query to carry state from one Click event to the next. Each click event starts or stops the GeoCoordinateWatcher instance depending on the accumulator’s value, and then toggles it. The subscriber updates the button’s content accordingly.
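
The toggle logic can be tried out without Rx. This is a minimal Python sketch that uses itertools.accumulate as a stand-in for Scan; the function name is made up for illustration, and the sketch assumes that Scan with a seed does not emit the seed itself (which is how current Rx.NET behaves), so the seed is dropped explicitly. It needs Python 3.8 or later for the initial argument.

```python
from itertools import accumulate


def toggle_states(clicks, initially_started=False):
    """Return the watcher state after each click, like Scan with a bool seed."""
    states = accumulate(
        clicks,
        lambda is_started, _click: not is_started,  # each click flips the state
        initial=initially_started,
    )
    next(states)  # accumulate emits the seed first; drop it to mirror Scan
    return list(states)


# Three clicks: start, stop, start again.
labels = ["Stop" if started else "Start" for started in toggle_states(range(3))]
print(labels)
```

Each label is what the button would show after the corresponding click: "Stop" while the watcher is running, "Start" once it has been stopped.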

Next the code must ensure that the distance computations take into account only readings that have different lat/long values. The DistinctUntilChanged RxLINQ operator with a comparator that considers only the Latitude and Longitude values provides an elegant solution to the deduplication problem. Here’s the comparator’s code:

public class PositionComparator : IEqualityComparer<GeoPosition<GeoCoordinate>>
{
    public bool Equals(GeoPosition<GeoCoordinate> x, GeoPosition<GeoCoordinate> y)
    {
        return (x.Location.Latitude == y.Location.Latitude && x.Location.Longitude == y.Location.Longitude);
    }

    // snip
}

And here’s the updated positions query (new code in green):

var positions = (from position in positionChanges
                 where position.Location.HorizontalAccuracy <= 100
                 select position).DistinctUntilChanged(new PositionComparator());
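
Note that DistinctUntilChanged only drops consecutive duplicates; a position that shows up again later still passes through. A minimal Python sketch of that behavior, with readings modeled as hypothetical (latitude, longitude, accuracy) tuples and itertools.groupby standing in for the operator:

```python
from itertools import groupby


def distinct_until_changed(readings, key):
    """Keep the first reading of every run of consecutive equal keys."""
    return [next(group) for _, group in groupby(readings, key=key)]


# Made-up (latitude, longitude, horizontal accuracy) readings.
readings = [
    (47.6, -122.3, 30),
    (47.6, -122.3, 50),  # same lat/long as the previous reading: dropped
    (47.7, -122.3, 20),
    (47.6, -122.3, 40),  # same lat/long as the first, but not adjacent: kept
]

# Compare on latitude and longitude only, like PositionComparator.
deduplicated = distinct_until_changed(readings, key=lambda r: (r[0], r[1]))
print(deduplicated)
```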

Computing distances requires two points. An elegant solution to using the current and previous location events is to combine the location event stream with itself such that every event pair represents the current and previous coordinates. The RxLINQ Zip operator combines the event streams, and the Skip operator provides the shift required for this pairing. Here’s the query that computes the distance in km (for imperial units replace Kilometers with Miles):

var distances = positions.Zip(positions.Skip(1), (l, r) => DistanceIn.Kilometers.Between(r, l));

Finally, the Scan operator applied to the distances event stream computes the total distance; the accumulator’s initial value is 0.0, and the display shows meters:

var distance = distances.Scan(0.0, (a, e) => a + e);

distance.Subscribe(d => this.textBlock1.Text = string.Format("Distance so far {0:00.000} m", d * 1000));
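
The Zip/Skip pairing followed by the Scan running total can be sketched on a plain list. This Python version is only an illustration: the function name is hypothetical, positions are simplified to points on a line, and the distance function is passed in so that any metric (such as Haversine) could be plugged in.

```python
from itertools import accumulate


def running_totals(positions, distance):
    """Pair each position with its successor, then accumulate the leg lengths."""
    # zip(positions, positions[1:]) plays the role of Zip with Skip(1).
    legs = [distance(prev, cur) for prev, cur in zip(positions, positions[1:])]
    # accumulate with a 0.0 seed plays the role of Scan(0.0, ...); drop the seed.
    return list(accumulate(legs, initial=0.0))[1:]


# Four positions on a line produce three legs: 1.5, 0.5 and 3.0.
totals = running_totals([0.0, 1.5, 1.0, 4.0], lambda a, b: abs(b - a))
print(totals)
```

A stream of n positions yields n - 1 distances, so the first total only appears once the second position arrives.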

In summary, this post has shown:

  • How to convert Windows Phone 7 lat/long readings into distances,
  • How to use RxLINQ queries for UI (button click events), and
  • How to perform calculations on adjacent elements in an event stream via the Zip and Skip operators.

WP7 Code: Using the GeoLocation API

(This is a repost from my old MSDN blog. I haven’t verified the links, etc.)

I’m kicking off a series of blog posts focused on writing Windows Phone 7 code with one of the APIs that will probably attract many developers interested in getting their feet wet: GeoLocation. Building this code requires:

Before I begin, a few things to be aware of. First, the samples are not intended to be production code. In other words, don’t use this code in your avionics system. Second, I emphasize Windows Phone code at the expense of other aspects. For example, while data binding may provide an elegant solution to updating GUI elements, I’m leaving its implementation as an exercise to the reader 🙂 Caveat emptor.

Enough prose, let’s write some code. Start by opening a new Windows Phone Application in Visual Studio. This will create the necessary directories and unfold the appropriate templates. Add a TextBlock to the Grid called ContentPanel; the XAML will look similar to the following (new content in green):

<Grid x:Name="ContentPanel" Grid.Row="1" Margin="12,0,12,0">
    <TextBlock Height="30" HorizontalAlignment="Left" Margin="0,49,0,0" Name="textBlock1" Text="TextBlock" VerticalAlignment="Top" Width="450" />
</Grid>

Next add a reference to the assembly holding the GeoLocation API (System.Device), and the appropriate using directive:

using System.Device.Location;

As Windows Phone 7 includes the .NET Reactive Framework (Rx) I will be using it for event-based code. To that end add references to System.Observable and Microsoft.Phone.Reactive, and the appropriate using directive:

using Microsoft.Phone.Reactive;

An instance of GeoCoordinateWatcher held in an instance variable of the MainPage class provides access to the location information. The OnNavigatedTo and OnNavigatedFrom methods ensure that the location watcher is started when the page opens, and stopped when it is abandoned.

public partial class MainPage : PhoneApplicationPage
{
    GeoCoordinateWatcher gcw;

    // Constructor
    public MainPage()
    {
        InitializeComponent();
    }

    protected override void OnNavigatedTo(System.Windows.Navigation.NavigationEventArgs e)
    {
        base.OnNavigatedTo(e);

        if (gcw == null)
            gcw = new GeoCoordinateWatcher();

        gcw.Start();

        ShowGeoLocation();
    }

    protected override void OnNavigatedFrom(System.Windows.Navigation.NavigationEventArgs e)
    {
        base.OnNavigatedFrom(e);

        if (gcw != null)
            gcw.Stop();
    }

    // snip

The ShowGeoLocation method contains code that updates the UI (i.e., the text block) with the location information. Before covering that code let’s go over a couple of helper methods that convert the .NET events raised by the GeoCoordinateWatcher class into Rx event streams. The first event of interest is StatusChanged. It signals when the location subsystem is ready and thus able to provide location information. The second event is PositionChanged, which signals whenever the position of the device, as determined by the location subsystem, changes. The following extension methods provide the corresponding event streams:

public static class Helpers
{
    public static IObservable<GeoPositionStatusChangedEventArgs> GetStatusChangedEventStream(this GeoCoordinateWatcher watcher)
    {
        return Observable.Create<GeoPositionStatusChangedEventArgs>(observer =>
        {
            EventHandler<GeoPositionStatusChangedEventArgs> handler = (s, e) =>
            {
                observer.OnNext(e);
            };
            watcher.StatusChanged += handler;
            // Unsubscribing detaches the handler from the watcher.
            return () => { watcher.StatusChanged -= handler; };
        });
    }

    public static IObservable<GeoPositionChangedEventArgs<GeoCoordinate>> GetPositionChangedEventStream(this GeoCoordinateWatcher watcher)
    {
        return Observable.Create<GeoPositionChangedEventArgs<GeoCoordinate>>(observer =>
        {
            EventHandler<GeoPositionChangedEventArgs<GeoCoordinate>> handler = (s, e) =>
            {
                observer.OnNext(e);
            };
            watcher.PositionChanged += handler;
            // Unsubscribing detaches the handler from the watcher.
            return () => { watcher.PositionChanged -= handler; };
        });
    }
}

If you’re building an application that needs to determine the position just once (e.g., what’s around me based on where I am) then the following sequence of queries does just that, updating the contents of the text block with the location information:

public void ShowGeoLocation()
{
    var statusChanges = from statusChanged in gcw.GetStatusChangedEventStream()
                        where statusChanged.Status == GeoPositionStatus.Ready
                        select statusChanged;

    var positionChanges = from s in statusChanges
                          from position in gcw.GetPositionChangedEventStream()
                          select position.Position;

    var positions = from position in positionChanges
                    where position.Location.HorizontalAccuracy <= 100
                    select position;

    positions.Take(1).Subscribe(firstLocationFix =>
        {
            textBlock1.Text = string.Format("{0},{1}"
                , firstLocationFix.Location.Latitude
                , firstLocationFix.Location.Longitude
                );
        }
    );
}

The first query (statusChanges) represents an event stream comprising the status changed events that signal that the location subsystem is ready. The second query (positionChanges) is a join between statusChanges and position changed events; this ensures that the latter is gated by the former. Finally, the third query (positions) filters the position changes based on the horizontal accuracy of each reading. The accuracy depends on how the location subsystem determines the location, and the code above discards readings whose horizontal accuracy is worse than 100 m. The last statement updates the UI with the first reading that satisfies the (by now composed) queries:

  • Filtering of StatusChanged events by Status property
  • Joining between filtered StatusChanged events and PositionChanged events
  • Filtering of resulting PositionChanged events by HorizontalAccuracy property

What if you’re building an application that needs the series of location events (e.g., turn-by-turn navigation) rather than just one location? The Rx queries shown above remain unchanged. The only change involves how the event stream is consumed. The Take operator is gone, and the Action passed to Subscribe updates the text block contents on every reading. Here’s the ShowGeoLocation code, with the updated statement in green:

public void ShowGeoLocation()
{
    var statusChanges = from statusChanged in gcw.GetStatusChangedEventStream()
                        where statusChanged.Status == GeoPositionStatus.Ready
                        select statusChanged;

    var positionChanges = from s in statusChanges
                          from position in gcw.GetPositionChangedEventStream()
                          select position.Position;

    var positions = from position in positionChanges
                    where position.Location.HorizontalAccuracy <= 100
                    select position;

    positions.Subscribe(position =>
        {
            textBlock1.Text = string.Format( "{0},{1}"
                , position.Location.Latitude
                , position.Location.Longitude
                );
        }
    );
}

Consecutive readings may carry the same latitude and longitude, so the display can look frozen even though new readings keep arriving. To illustrate that, the final update to this code adds a sequence number to the display. It also displays the horizontal accuracy. The code implements the sequence with the Scan operator (which applies an accumulator over the events) and an anonymous class encapsulating the sequence number and the position. Here’s the ShowGeoLocation code, with the updated statement in green:

public void ShowGeoLocation()
{
    var statusChanges = from statusChanged in gcw.GetStatusChangedEventStream()
                        where statusChanged.Status == GeoPositionStatus.Ready
                        select statusChanged;

    var positionChanges = from s in statusChanges
                          from position in gcw.GetPositionChangedEventStream()
                          select position.Position;

    var positions = from position in positionChanges
                    where position.Location.HorizontalAccuracy <= 1000
                    select position;

    positions.Scan(new { i = 0
                       , p = default(GeoPosition<GeoCoordinate>)
                       },
                   (a, e) => new { i = a.i + 1
                                 , p = e }
        ).Subscribe(e =>
        {
            textBlock1.Text = string.Format( "{0}:{1},{2}@{3}"
                , e.i
                , e.p.Location.Latitude
                , e.p.Location.Longitude
                , e.p.Location.HorizontalAccuracy
                );
        }
    );
}

In summary, this post has shown:

  • How to get started with the Windows Phone GeoLocation API,
  • How to convert the StatusChanged and PositionChanged events into Rx event streams, and
  • How to write LINQ queries against them.

Back to regular programming

After a long period of silence micro-workflow.com is coming back to regular programming! There have been many changes since the last update: OS, software, platform, and so on. The world is a different, more exciting place now… Stay tuned; as I’m weaning from Facebook I expect more updates and tidbits here 🙂

Asleep at the Wheel

This quick rant is about Amazon.com. Before proceeding further let me assure you that I’m a big fan, having been an Amazon.com customer for over a decade. I’ve also successfully tested their A-to-Z coverage, with my shopping history including phones, camera lenses, lawnmowers (yes, several), as well as books. Yes, they’re great!

In an older post I pointed out that Amazon’s recommendation technology didn’t allow one to specify when they’re purchasing gifts for others. Consequently any such purchase used to throw it off, adding noise to otherwise useful recommendations. That has since been fixed, though it took longer than I thought.

These days I’m longing for another obvious yet missing feature: the ability to sort/filter by location. I purchase used books/music/videos quite frequently, and the default sort by price (increasing in the following screenshot) no longer does it for me:

image

All online stores offer the above experience; nowadays it is the norm. If my shopping experience is any indication, an improved experience is within close reach, involving information that is already available, yet for the moment is blindly pushed from the seller database(s) onto the glass.

Personally I prefer local (or close-by) sellers vs equally (or even slightly lower) priced, but far-away ones. For the items that I buy used the shipping charges tend to be the same. Consequently, in the above example, assuming the item’s cost doesn’t vary wildly, I’d pick a seller from WA over one in CA, over one in TX, over one in MN, over one in NY. To begin with, the delivery time will be shorter. Then, assuming you care about your carbon footprint, the shorter distance translates into a lower environmental impact. However, although Amazon (or other commerce sites for that matter–this experience reflects today’s common practices) has this information readily available, it doesn’t allow me to pivot by location. Consequently I have to do it through visual inspection. Let’s see how long it takes until the location pivot becomes an integral part of the shopping experience!

Services Without Borders

It’s been a few years since my last vacation overseas. Since then I acquired several e-dependencies on services such as Pandora (see my older post on feature extraction) and Hulu, a service I learned about from my colleague Adam Sheppard (you may have read about Adam on Live Labs’ web site). I discovered that these services don’t work from outside the US:

image

image

This is surprising because in both instances the providers know my permanent location from the ZIP code provided when I set up the accounts.

Luckily Pandora and Hulu are not my only options. I am also a Sirius Satellite Radio subscriber, and unlike with the previous accounts that is a paid subscription. Their service did not complain about my accessing it from outside the US. While this finding hasn’t sunk in completely it resonates with what I’m reading in Jonathan Zittrain’s The Future of the Internet (and How to Stop it) about reducing generativity.

Understanding slashdot

I’ve been a slashdot reader since the end of 1997, when I discovered it over the dial-up connection I had at the University of Illinois. While back then I visited /. almost daily, nowadays my visits are much less frequent. During this time the slashdot community expanded and changed (if nothing else we’re all 10 years older). Consequently I no longer have a good grip on how objective and well-researched the typical slashdot post is.

This changed last night, when the slashdot story Microsoft Developing News Sorting Based On Political Bias covered one of the projects I’m involved with (i.e., Blews). The coverage provided some interesting insight about /.

First, in spite of the “news for nerds” tag line, slashdot stories are not necessarily new. Over a week before the /. coverage Matt Hurst blogged about the mainstream media picking up Blews in their TechFest coverage; I also had a similar post. So if you’re looking for fresh nerdy news you’d be better off going elsewhere.

Second, the /. comments cover a wide spectrum: some are objective. Others are amusing. Others make me wonder whether a sequel to Mel Gibson’s 1997 Conspiracy Theory is in the works. Yet they are far from being evenly distributed–on the contrary. So if you’re after a reasonable S/N you’d also be better off seeking that elsewhere. (BTW if Blews resonates with you consider attending ICWSM 2008; several folks from the Blews team as well as myself will be there.)

So with old news and poor S/N what are those coming to /. after?

With Miguel de Icaza on Open Source, Mono, and Moonlight

A few weeks ago I attended Lang.NET Symposium. Charles Torre asked me to participate in a conversation with Miguel de Icaza, who was among the attendees. (While nowadays most people associate Miguel with Mono, our paths crossed–virtually–many years ago, when Tudor Hulubei and Andrei Pitis were working on GIT.) Charles Torre was our host, and we talked about open source, Mono, Moonlight, and various other bits. Our session is now available as a Channel 9 video. (Note: cross-posted from my work blog.)