Artificial Intelligence Helps Distinguish the Forests from the Trees (Part 2)

By: Eric Lewandowski, Tom Swartz, Lisa Wang, Adam Kraft, and Mikaela Weisse

Since 2015, World Resources Institute (WRI) and Orbital Insight have worked together with a grant from the Generation Foundation to find new applications of computer vision and deep learning that will support Global Forest Watch in better monitoring the world's forests.

This post is Part 2 of a series that explores our latest project, which aims to identify where oil palm is being planted and grown across large areas of the tropics. Here, we dive into the technological foundation of our oil palm classifier. See Part 1 for a breakdown of the real-world implications of mapping agricultural plantations.

As companies around the world pledge to end deforestation-linked palm oil production, it is critical that we are able to monitor when and where these commodities are actually replacing natural forests. The proliferation of high-resolution satellite imagery and advancements in deep learning are now making it easier to differentiate natural forests from vast oil palm plantations.

Orbital Insight and Global Forest Watch (GFW) are working together to leverage these cutting-edge technologies to create preliminary oil palm maps for Malaysia, Cambodia, Indonesia, and Colombia, with plans to expand to other palm oil-producing countries soon.

Training an algorithm to identify land use

To create a model that identifies industrial oil palm plantations, we used supervised machine learning. This process involves providing the algorithm with example satellite images of oil palm plantations alongside non-plantation areas such as cities, bodies of water and natural forest. With these examples, the model can "learn" what a plantation looks like in satellite imagery. To classify tree plantations reliably, the algorithm needs an abundance of training examples. Our teams manually labeled over 3,000 satellite images that covered a diverse set of plantations and geographies.

After we marked the images, we trained an algorithm to distinguish industrial oil palm plantations using a technique called a convolutional neural network (CNN). We provided the model with a set of images and corresponding maps marked by our human experts. After "learning" with our training examples, the algorithm is able to identify industrial oil palm plantations. While neural networks are difficult to interpret, the model may use clues such as texture and the pattern of trees and roads to identify industrial plantations. We evaluated the performance of our trained models by comparing predictions to ground truth markings (see Figure 1). Following this training phase, models can then be applied to large numbers of images to create country and region-wide maps.
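To give a feel for how a CNN can pick up on texture, here is a minimal sketch of the convolution operation at the heart of such networks. It is not our model: the function name, the hand-written filter, and the toy image patches are all illustrative assumptions. A real network learns many filters from the training examples rather than having them written by hand.

```python
def convolve2d(image, kernel):
    """Slide a small filter over an image (valid mode, no padding).

    This is the basic building block of a CNN layer. `image` and
    `kernel` are 2D lists of numbers.
    """
    kernel_h, kernel_w = len(kernel), len(kernel[0])
    out_h = len(image) - kernel_h + 1
    out_w = len(image[0]) - kernel_w + 1
    output = []
    for r in range(out_h):
        row = []
        for c in range(out_w):
            total = 0.0
            for i in range(kernel_h):
                for j in range(kernel_w):
                    total += image[r + i][c + j] * kernel[i][j]
            row.append(total)
        output.append(row)
    return output


# Hypothetical hand-made filter that responds to horizontal stripes,
# loosely analogous to regularly spaced planting rows.
ROW_KERNEL = [
    [-1, -1, -1],
    [2, 2, 2],
    [-1, -1, -1],
]

# Toy patches: one with alternating rows, one with uniform brightness.
striped_patch = [
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]
uniform_patch = [[1, 1, 1, 1] for _ in range(5)]

striped_response = convolve2d(striped_patch, ROW_KERNEL)
uniform_response = convolve2d(uniform_patch, ROW_KERNEL)
```

In this toy case the striped patch produces strong positive and negative responses, while the uniform patch produces zeros everywhere. That contrast is the kind of signal a trained network could build on when separating regular plantation rows from less regular land cover.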

Mastering high resolution at scale

For this project, we used high-resolution satellite imagery from the imaging company Planet, which allowed the human markers and the algorithm to clearly identify fine details of land cover that wouldn't be visible in freely available but lower-resolution image sources like Landsat.

Planet also has nearly daily global coverage, allowing us to run the algorithm on a large scale and ensuring that we will obtain imagery on clear days, even in extremely cloudy parts of the tropics. We trained an additional CNN to identify which portions of an image were covered by clouds, allowing us to combine the cloud-free parts of each image to increase accuracy and provide greater geographic coverage.
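The idea of combining cloud-free parts of several images can be sketched in a few lines. This is a simplified illustration under stated assumptions, not our production pipeline: the function name is hypothetical, images are plain 2D lists, and the rule used here (keep the first cloud-free observation for each pixel) is just one possible compositing strategy.

```python
def composite_cloud_free(images, cloud_masks):
    """Build one image from several by skipping cloudy pixels.

    images: list of 2D lists of pixel values, in order of preference
            (for example, most recent first).
    cloud_masks: list of 2D lists of booleans, True where a pixel is
                 cloudy, aligned with `images`.
    Returns a 2D list; a pixel is None if every image was cloudy there.
    """
    height, width = len(images[0]), len(images[0][0])
    composite = [[None] * width for _ in range(height)]
    for image, mask in zip(images, cloud_masks):
        for r in range(height):
            for c in range(width):
                # Fill a pixel only if it is still empty and this
                # observation is not flagged as cloud.
                if composite[r][c] is None and not mask[r][c]:
                    composite[r][c] = image[r][c]
    return composite


# Two toy 2x2 "images" of the same area taken on different days.
day_one = [[10, 11], [12, 13]]
day_one_clouds = [[False, True], [True, True]]
day_two = [[20, 21], [22, 23]]
day_two_clouds = [[True, False], [False, True]]

mosaic = composite_cloud_free(
    [day_one, day_two], [day_one_clouds, day_two_clouds]
)
```

Here the top-left pixel comes from the first day, two pixels come from the second day, and the bottom-right pixel stays empty because it was cloudy on both days. The more often an area is imaged, the fewer such gaps remain.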

The frequency and resolution of the Planet images created a massive volume of data, which posed a problem for traditional processing methods. In order to ingest, process and store petabytes of geospatial data (one petabyte is 1 million gigabytes!), Orbital Insight spent thousands of hours developing a cloud computing-based pipeline, which has now analyzed over 600,000 satellite images for this project alone.

A promising start

We now have a process in place that can analyze large volumes of imagery across entire countries. The prototype is successfully identifying large-scale plantations and is correctly classifying over 90 percent of pixels in a dataset marked by our experts.
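For readers curious what "correctly classifying pixels" means in practice, the following is a minimal sketch of a pixel accuracy calculation. The function name, class labels and toy maps are assumptions made for illustration; they are not taken from our evaluation code or data.

```python
def pixel_accuracy(predicted, ground_truth):
    """Fraction of pixels whose predicted class matches the marked class.

    Both arguments are 2D lists of class labels with the same shape.
    """
    correct = 0
    total = 0
    for predicted_row, truth_row in zip(predicted, ground_truth):
        for predicted_label, truth_label in zip(predicted_row, truth_row):
            total += 1
            if predicted_label == truth_label:
                correct += 1
    if total == 0:
        raise ValueError("cannot compute accuracy on empty maps")
    return correct / total


# Toy 2x3 maps using hypothetical class labels.
expert_marking = [
    ["planted", "planted", "not_planted"],
    ["not_planted", "not_planted", "not_planted"],
]
model_prediction = [
    ["planted", "not_planted", "not_planted"],
    ["not_planted", "not_planted", "not_planted"],
]

accuracy = pixel_accuracy(model_prediction, expert_marking)
```

In this toy example, five of the six pixels match, so the accuracy is about 83 percent. A single overall number like this can hide class-specific errors, which is one reason confusions such as oil palm versus banana deserve a separate look.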

We've made significant progress over the last few years, but we still haven't completely accomplished our goal. The current datasets pick up large-scale row plantations reasonably well, but we're still seeing some misclassification between oil palm and other, similar-looking plantations like bananas. The algorithm also can't identify plantations until they are mature enough to be visible in satellite imagery, which means we may not be able to attribute deforestation to oil palm until several years after it occurs.

This project has taught us a great deal about the challenges of implementing deep learning for land use mapping. As noted above, the sheer amount of effort needed to provide training examples was daunting. The algorithm may learn by itself, but it required significant human help to bring together the materials for its learning. This has implications for the feasibility of these methods for other mapping projects. In addition, the price of high-resolution imagery may make expanding these maps further cost-prohibitive.

Figure 1: Planet images (Colombia); Orbital Insight human-marked ground truth (light green: planted forest class, blue: not planted forest); prediction made by our final trained model

Our vision for the future

Over the next couple of months, we intend to update the existing maps through 2018, and expand to Peru, Liberia, Guatemala, Honduras, and Papua New Guinea, which are seeing an influx of oil palm plantations.

Figure 2: Land use results for a region in Borneo (blue: natural forest class, green: planted forest class, orange: urban class)

We see long-term opportunities to better detect oil palm, perhaps by analyzing even higher-resolution imagery. We could also use the process now in place to run the algorithm more frequently and better identify change over time. And, of course, there is the potential to use these cutting-edge methods to map other commodities, forest types or deforestation risks.

This experimental project between GFW and Orbital Insight has taught us a great deal about applying deep learning technology to some of the world's trickiest forest monitoring problems. We see great potential for continuing to refine this approach and expand it beyond oil palm in the years to come, improving the data-driven foundation for the global effort to end deforestation.