Open Source Clojure project to Monitor Deforestation from Satellite Imagery

Hi, this is George Kierstein!  I was checking out the recent talks from Clojure/West and ran across an amazing talk by Dan Hammer about his experiences building an Open Source Clojure project to Monitor Deforestation from Satellite Imagery.

After the talk I had a few burning questions, which he was awesome enough to answer:

 

General background? Start in programming?

This is me, Dan Hammer.  I am a PhD student at Berkeley and a Presidential Innovation Fellow at NASA. I work with the NASA CTO, writing open APIs for the Agency’s public data.  I was formerly the Chief Data Scientist at the World Resources Institute, where I led the Data Lab — the technical team behind Global Forest Watch.

My background is not in computer science.  Rather, I am a lost economist who codes to compensate.  I started writing Stata scripts to assess the independent impact of primary care clinics in rural India.  Then, while working with David Wheeler and Robin Kraft, I started to work on deforestation, trying to understand the economic drivers of forest clearing activity.  At the time, there was no open and available information on deforestation at the micro-level — so we generated the data from NASA satellite imagery.  This is a lot of information.  Prepackaged tools to move the data around didn’t cut it.  We moved to Python and R, splitting our tasks on the emergent Amazon Web Services.  I have developed a deep appreciation for engineers who can elegantly move bytes around.

 

In your talk you stated that functional programming, and Clojure in particular, are great for scientific computing. How were you exposed to Clojure in a scientific context and what led you to that conclusion?

I was introduced to Clojure by my friend and paddling teammate, Sam Ritchie.  Sam found Clojure while looking for a way to develop low-maintenance MapReduce applications.  I rewrote the core algorithms in Clojure, which simplified the code and workflow.  It was … oddly fun.  It felt … strangely good.  This was my first exposure to functional programming.  The process of converting long, tangled scripts into a composition of functions was totally consistent with writing proofs in upper-level math courses, like real analysis or topology.  Mapping concepts in functional programming to axioms in functional composition continues to guide development in my projects – and the mapping helps me reason out some of the more esoteric elements of Clojure.

 

Scientific computing has become much more aware of engineering concerns like maintainability, reliability and correctness over the years.  What advantages for science-based projects do you feel Clojure has over traditional approaches?

So this is the big issue.  Robin, Sam, and I learned to write Clojure at around the same time, with Sam forging ahead into Scala and various abstractions over MapReduce jobs.  We maintained our Clojure project for almost a year, until it became clear that the project couldn’t be actively maintained by the deforestation monitoring (read: remote sensing) community.  We are rewriting the code once again in JavaScript for general use and maintenance, which is sort of heartbreaking.  I’m no good at writing poetry.  And I’m not even that good at writing code.  But my time with Clojure was about as close to poetry through code as I can imagine.  The sheer volume of the codebase dropped by an order of magnitude.  It’s strange, then, that the project should be harder to maintain with a smaller, self-documenting codebase, but them’s the breaks.  There just aren’t many Clojure developers working on satellite imagery, and we open sourced the project for widespread use.  I can imagine that a tech shop with a critical mass of Clojure developers feels a lot like the micro-cult around Oakley Hall’s Warlock.  Some of the best software literature ever written.

 

Any recommendations to communicate effectively with a culture steeped in FORTRAN?

There is a slightly steeper learning curve associated with Clojure.  That I’ll admit.  There is, in economic-speak, a relatively high fixed cost associated with Clojure.  But the marginal costs are much lower, with algorithms easily expressed in code.  When showing NASA scientists code written in Clojure, I’ll show them an example of mapping a function across an array, and then the for-loop equivalent in Python or R (and, for good measure, the equivalent lambda functions).  The basics of functional programming, where computation is the evaluation of elemental functions, are highly appreciated by scientists, especially researchers who deal with images or time series, where mapping across partitions of a sequence is common.
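To make that comparison concrete, here is a minimal sketch in Python of the kind of side-by-side Dan describes: a for-loop, the same computation as a map, and the lambda form, followed by mapping across partitions of a sequence.  The function names, the baseline, and the sample values are all made up for illustration; this is not code from the deforestation project.

```python
# Hypothetical example data: raw values for a handful of pixels.
pixels = [1200, 3400, 5600]


def anomaly(value, baseline=1000):
    """Difference between a value and an assumed (made-up) baseline."""
    return value - baseline


# 1. The for-loop version: build the result up step by step.
looped = []
for p in pixels:
    looped.append(anomaly(p))

# 2. The functional version: map the function across the array.
#    The rough Clojure counterpart would be (map anomaly pixels).
mapped = list(map(anomaly, pixels))

# 3. The same thing with an anonymous (lambda) function.
with_lambda = list(map(lambda p: p - 1000, pixels))


def partition(seq, n):
    """Split seq into consecutive chunks of length n, dropping any
    incomplete tail (similar in spirit to Clojure's partition)."""
    return [seq[i:i + n] for i in range(0, len(seq) - n + 1, n)]


# Mapping across partitions of a time series: the peak of each window.
series = [8, 7, 9, 4, 3, 2]
window_peaks = list(map(max, partition(series, 3)))
```

All three versions produce the same list; the map form simply names the function and the data, which is the point scientists tend to pick up on quickly.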

 

There appears to be heightened interest within Government agencies to work more closely with industry and to make data more accessible to the public.  What is your take on this and what role, if any, do you see specific technologies like Clojure playing?

Access to government data is a public good on par with road building or national defense.  Gray Brooks, Nick Mueltder, and others at 18F and USDS who are building api.data.gov are laying the first bricks for this infrastructure.  I am doing my small part in building out the franchised api.nasa.gov alongside the data.nasa.gov team.  One such API provides easy access to relevant earth imagery, where ease is measured by how little hassle developers face in retrieving the precise imagery they want.  Cloud-free for the White House?  Done.  The most recent images for NOAA’s facilities in Alaska?  Done, distributed.  Some of the image screening on the server side (for clouds, for example) will need to be blazing fast.  As this functionality gets built out, I’m sure that the power of Clojure will be more apparent, and used more often (by me, if no one else, at least).

 

Your deforestation tool and Global Forest Watch are great examples of this kind of cross-over. What would you say are some of the most exciting up-and-coming projects in this area?

Water.  Monitoring water from commercial satellite imagery.  The Global Forest Watch (GFW) frameworks are well-suited for distilling high-resolution imagery into tractable information, actionable insight on surface water.  And water is becoming a clear and present issue in earth observation.  Plus, Congress just proposed cuts to NASA’s earth observation budget in some bout of legislative idiocy.  Who will step in to mine the value of both existing NASA imagery and new commercial imagery, like that of Planet Labs?  It’s still unclear.  Previous successes in this field (including GFW) will certainly guide the new, distributed development — possibly within the government and possibly in the private sector.

 

Closing thoughts?

Thanks for reading this, George!!  It’s exciting that this effort is coming from within the government.  There’s something brewing.  And it smells like civic tech, well done.



Thanks for the insight and for taking the time!  I’m excited to see what else is out there, so if you are a scientific programming nerd with a Clojure bent and hear of anything intriguing, or just want to nerd out about it with me, hit me up!

 

 

 

4 comments

  1. Neil says:

    Did Dan mention where the JavaScript sources for the project can be found? Perhaps it would be a worthwhile exercise to look at the two sets of code and ask whether one really is easier to maintain than the other.

    I suspect the truth is that the remote sensing community isn’t familiar with Clojure, which seems likely, but that the JS port isn’t intrinsically easier to maintain just because it’s written in JavaScript.

    • George Kierstein says:

      Hi!

      The Weather Project’s GitHub is here.
      The Clojure version is here.

      I’m not sure whether the matching JavaScript repo is the one they are working on, however. I’ll ask!
