ESARDA
Scientific paper

Developing a Big Data Framework for Processing Sentinel-2 Data in the Context of Nuclear Safeguards

Evaluation of Apache Airflow, Rasdaman and Google Earth Engine | ESARDA Bulletin - The International Journal of Nuclear Safeguards and Non-Proliferation

Details

Identification
ISSN: 1977-5296, DOI: 10.3011/ESARDA.IJNSNP.2022.7
Publication date
1 June 2022
Author
Joint Research Centre

Description

Volume: 64, issue 1, June 2022, pages 75-87

Authors: Lisa Beumer and Irmgard Niemeyer

Forschungszentrum Jülich GmbH

Abstract: In recent years, Earth observation (EO) satellites have generated large amounts of geospatial data. Many providers offer their satellite data at low cost or even for free. For example, initiatives such as the Copernicus program, the European Union's Earth observation program, have revolutionized the market. The growing archives of satellite imagery open up a wide range of satellite EO applications, including in the field of nuclear verification, where satellite imagery represents a key source of information for the implementation and verification of nuclear non-proliferation treaties [1]. The data collected, processed, analyzed, and managed for monitoring purposes is not only increasing in volume, but also becoming more heterogeneous, unstructured, and complex. However, Big Data also brings several challenges related to capturing, sharing, transferring, updating, processing, and analyzing the data. To meet these demands, novel technologies have been developed. Apache Airflow, for example, has become a popular tool for defining, scheduling, visualizing, and monitoring Big Data related workflows [2]. For storing and accessing multidimensional raster data, such as satellite imagery, the array database management system Rasdaman has become well established [3]. To analyze these large amounts of data effectively and efficiently, Google has developed a free-to-use cloud computing platform, known as Google Earth Engine (GEE) [4]. In this research, an automated procedure for collecting, storing, processing, and analyzing satellite images was developed based on the tools mentioned above. In the process, the strengths of Airflow became evident, namely the creation of dynamic, fine-grained workflows and the logging of each execution. Furthermore, Rasdaman offers important advantages, such as data-cube analytics based on open standards.
The usability and benefits of GEE with respect to big EO data management and analysis were evaluated through an analysis of two different machine learning algorithms, namely Random Forest (RF) and Classification and Regression Trees (CART). For the target land cover classes, the manually generated classification results were compared with two land cover maps provided by GEE for the years 2017 and 2019. For the Sentinel-2 images, the overall accuracy ranged from 87% to 98% for the RF classifier and from 68% to 83% for the CART classifier.
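
As a reading aid for the overall accuracy figures quoted above, the following is a minimal sketch of how overall accuracy is conventionally derived from a confusion matrix: the number of correctly classified samples divided by the total number of samples. The function name `overall_accuracy` and the counts in `example` are hypothetical and chosen only for illustration; they are not taken from the paper, and the abstract does not state how the authors computed their figures.

```python
def overall_accuracy(confusion):
    """Return the fraction of correctly classified samples.

    confusion[i][j] counts samples of reference class i that the
    classifier assigned to class j, so the diagonal holds the hits.
    """
    total = sum(sum(row) for row in confusion)
    if total == 0:
        raise ValueError("confusion matrix contains no samples")
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    return correct / total


# Hypothetical counts for three land cover classes (illustration only,
# not results from the paper).
example = [
    [50, 2, 3],
    [4, 40, 1],
    [1, 2, 47],
]

print(f"{overall_accuracy(example):.3f}")  # 137 correct of 150 -> 0.913
```

Overall accuracy is a single summary value, so it can hide weak performance on individual classes; per-class measures are usually read alongside it.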

Keywords: Satellite Imagery; Big Data; Data Science; Airflow; Copernicus Hub; Rasdaman; Google Earth Engine

Reference guideline:

Beumer, L., & Niemeyer, I. (2022, June). Developing a Big Data Framework for Processing Sentinel-2 Data in the Context of Nuclear Safeguards. ESARDA Bulletin - The International Journal of Nuclear Safeguards and Non-proliferation, 64(1), 75-87. https://doi.org/10.3011/ESARDA.IJNSNP.2022.7

Files

20 JUNE 2022
Developing a Big Data Framework for Processing Sentinel-2 Data in the Context of Nuclear Safeguards