Spark-timeseries with Spark 1.6.1 (and Breeze 0.12)

Over the last few days I’ve been experimenting with Spark in the context of processing time series data. The explorations took me to spark-timeseries, a library aimed at just that. Right off the bat I ran into versioning problems: I use Scala 2.11 and Spark 1.6.1, whereas spark-timeseries depends on Scala 2.10 and Spark 1.3.1 (the latter released almost a year ago), as the properties in its pom file show:

<scala.minor.version>2.10</scala.minor.version>
<scala.complete.version>${scala.minor.version}.4</scala.complete.version>
<spark.version>1.3.1</spark.version>

Going through the pom file I found a scala-2.11 Maven profile, which took care of the Scala version mismatch. I created a new profile, spark-1.6.1, to bring in Spark 1.6.1, and while at it I also upgraded scalanlp/breeze to 0.12 (the latest release as of May 2016). Building spark-timeseries with these versions amounts to activating both profiles:

mvn package -P scala-2.11,spark-1.6.1
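
For readers who want to do the same in their own checkout, a profile of this kind might look roughly like the sketch below. It is illustrative only and not copied from the fork: the profile id matches the command above and spark.version is the property shown earlier, but breeze.version is an assumed property name, and the real pom may wire the Breeze dependency differently.

```xml
<!-- Hypothetical sketch of a Spark 1.6.1 profile; place it inside <profiles> in pom.xml. -->
<profile>
  <id>spark-1.6.1</id>
  <properties>
    <!-- Overrides the default spark.version (1.3.1) when the profile is active. -->
    <spark.version>1.6.1</spark.version>
    <!-- Assumed property name; adjust to however the pom declares the Breeze version. -->
    <breeze.version>0.12</breeze.version>
  </properties>
</profile>
```

Because the profile only overrides properties, it can be combined with the scala-2.11 profile on the command line, as in the mvn invocation above.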

The updates are available from GitHub, on my repo fork. Happy hacking!