Eric Torreborre

Developer at Ambiata
etorreborre.blogspot.com.au/
@etorreborre

Developer at www.ambiata.com. We use Scala & Haskell to move data around (big and small) and run machine-learning algorithms to push our customers' revenue up. I created the specs2 library and was part of the team that developed the Scoobi library for Hadoop.

YOW! Lambda Jam 2015 Brisbane

Datatypes for the Real World

TALK

“Make illegal states unrepresentable” – Yaron Minsky

Starting from Yaron’s quote, we will see where it traditionally applies: value sanitization (escaping), data structures (NonEmptyList), CS constructs (parsers). Does it still apply to the rest of day-to-day programming, where things look… complicated?
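To make the two traditional cases concrete, here is a minimal sketch in Rust (the talk itself comes from a Scala and Haskell background; Rust is used here only for illustration). The names `NonEmptyList`, `EscapedHtml` and their methods are hypothetical and are not taken from the talk or from any particular library.

```rust
/// A list that always holds at least one element. The first element is
/// stored separately, so an empty value cannot be built.
/// (Hypothetical type, written for illustration.)
#[derive(Debug, Clone, PartialEq)]
pub struct NonEmptyList<T> {
    head: T,
    tail: Vec<T>,
}

impl<T> NonEmptyList<T> {
    pub fn new(head: T, tail: Vec<T>) -> Self {
        NonEmptyList { head, tail }
    }

    /// Returns None for an empty Vec, so emptiness is handled once,
    /// at the boundary, instead of at every use site.
    pub fn from_vec(mut items: Vec<T>) -> Option<Self> {
        if items.is_empty() {
            None
        } else {
            let head = items.remove(0);
            Some(NonEmptyList { head, tail: items })
        }
    }

    /// Total function: no Option and no panic, a head always exists.
    pub fn first(&self) -> &T {
        &self.head
    }

    pub fn len(&self) -> usize {
        1 + self.tail.len()
    }
}

impl<T: Ord> NonEmptyList<T> {
    /// The maximum is always defined for a non-empty list.
    pub fn max(&self) -> &T {
        self.tail
            .iter()
            .fold(&self.head, |best, x| if x > best { x } else { best })
    }
}

/// A string that has been HTML-escaped. The field is private, so code
/// outside the defining module can only obtain a value through `escape`.
/// (Hypothetical type, written for illustration.)
#[derive(Debug, Clone, PartialEq)]
pub struct EscapedHtml(String);

impl EscapedHtml {
    pub fn escape(raw: &str) -> Self {
        let mut out = String::with_capacity(raw.len());
        for c in raw.chars() {
            match c {
                '&' => out.push_str("&amp;"),
                '<' => out.push_str("&lt;"),
                '>' => out.push_str("&gt;"),
                '"' => out.push_str("&quot;"),
                '\'' => out.push_str("&#39;"),
                other => out.push(other),
            }
        }
        EscapedHtml(out)
    }

    pub fn as_str(&self) -> &str {
        &self.0
    }
}
```

A function that renders a page can then ask for an `EscapedHtml` rather than a `&str`, and a function that needs a maximum can ask for a `NonEmptyList`; in both cases the "did someone forget to check?" question moves from runtime to the type signature.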

In particular, we will develop the example of running applications that can execute either on a single Amazon EC2 instance (for testing) or on a full EMR cluster (for production), and that can either take data from S3 (which means it must first be downloaded and put somewhere) or use local data.

Can we make sure we get the right paths? Can we know when it is possible to download data efficiently (using distcopy to go directly from S3 to the cluster for example)? The answer is: create specific data types to model the different cases and tame the complexity.
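One possible shape for such data types is sketched below in Rust. This is an illustration under stated assumptions, not the design presented in the talk: the types `DataSource`, `Target` and `Staging`, the `plan` function, and the path layout are all hypothetical. The point is that the question "can we copy directly from S3 to the cluster?" becomes one arm of an exhaustive match instead of a flag checked in several places.

```rust
/// Where the input data lives. (Hypothetical model.)
#[derive(Debug, Clone, PartialEq)]
pub enum DataSource {
    S3 { bucket: String, key: String },
    Local { path: String },
}

/// Where the application runs. (Hypothetical model.)
#[derive(Debug, Clone, PartialEq)]
pub enum Target {
    /// A single machine, e.g. one EC2 instance used for testing.
    SingleInstance { work_dir: String },
    /// A cluster, e.g. EMR used for production.
    Cluster { hdfs_root: String },
}

/// What has to happen before the application can read its input.
#[derive(Debug, Clone, PartialEq)]
pub enum Staging {
    /// The data is already readable where it is.
    InPlace { path: String },
    /// Download from S3 onto the instance's disk.
    Download { from: String, to: String },
    /// Copy straight from S3 to the cluster, with no intermediate download.
    DistributedCopy { from: String, to: String },
    /// Push a local file to the cluster's storage.
    Upload { from: String, to: String },
}

/// Last path segment, used to name the staged copy.
fn file_name(path: &str) -> &str {
    path.rsplit('/').next().unwrap_or(path)
}

/// Decide how to stage the data. The match is exhaustive, so adding a
/// new source or target forces every combination to be reconsidered.
pub fn plan(source: &DataSource, target: &Target) -> Staging {
    match (source, target) {
        (DataSource::Local { path }, Target::SingleInstance { .. }) => {
            Staging::InPlace { path: path.clone() }
        }
        (DataSource::Local { path }, Target::Cluster { hdfs_root }) => Staging::Upload {
            from: path.clone(),
            to: format!("{}/{}", hdfs_root, file_name(path)),
        },
        (DataSource::S3 { bucket, key }, Target::SingleInstance { work_dir }) => {
            Staging::Download {
                from: format!("s3://{}/{}", bucket, key),
                to: format!("{}/{}", work_dir, file_name(key)),
            }
        }
        (DataSource::S3 { bucket, key }, Target::Cluster { hdfs_root }) => {
            Staging::DistributedCopy {
                from: format!("s3://{}/{}", bucket, key),
                to: format!("{}/{}", hdfs_root, file_name(key)),
            }
        }
    }
}

impl Staging {
    /// The path the application should read once staging is done.
    pub fn input_path(&self) -> &str {
        match self {
            Staging::InPlace { path } => path.as_str(),
            Staging::Download { to, .. }
            | Staging::DistributedCopy { to, .. }
            | Staging::Upload { to, .. } => to.as_str(),
        }
    }
}
```

With this shape, `plan` is the only place that knows which combinations exist, and `input_path` answers "which path do I read from?" for every one of them. A stricter variant could give local and cluster paths distinct types so that they cannot be mixed up; plain strings are used here to keep the sketch short.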

The take-away: datatypes are not only for “data”; they can also encapsulate “logic”.