Long-term ocean observatories generate massive and heterogeneous time series of oceanographic data, ranging from hydrographic profiles to CTD casts and real-time mooring measurements. These time series are fundamental for understanding climate variability, ecosystem dynamics, and anthropogenic pressures.
However, their full scientific value is often constrained by the complexity of data management: raw files accumulate in distributed storage systems, formats vary across instruments and decades, and metadata is frequently incomplete or inconsistent. Transforming these raw sensor outputs into standardized, interoperable time series databases that can be easily integrated into international networks such as EMSO ERIC and OceanSITES is a non-trivial technical challenge that requires flexible and scalable solutions.
In this contribution, we present the design and implementation of a scalable and automated ETL (Extract–Transform–Load) workflow tailored for marine time series data management. The system efficiently handles both near real-time and historical datasets, normalizes heterogeneous sensor outputs into custom databases optimized for time series queries, and applies EuroGOOS RTQC routines to flag anomalies. It further enables advanced cross-querying across multiple sensors and variables, thereby enhancing integrative analysis of oceanographic processes and supporting interactive data exploration through visualization dashboards.
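The extract, transform, flag, and load steps described above can be illustrated with a minimal sketch. It is not the authors' implementation: the column names (`TEMP`, `PSAL`), the table layout, the sample values, and the range limits are all assumptions chosen for illustration. The quality check is a simple global range test in the spirit of RTQC procedures, and SQLite stands in for whatever time series database the workflow actually uses.

```python
import csv
import io
import math
import sqlite3

# Illustrative flag values and limits (assumptions, not the authors' configuration).
FLAG_GOOD, FLAG_BAD, FLAG_MISSING = 1, 4, 9
GLOBAL_RANGE = {"TEMP": (-2.5, 40.0), "PSAL": (2.0, 41.0)}


def range_flag(variable, value):
    """Flag a value as good, bad, or missing using a global range test."""
    if value is None or math.isnan(value):
        return FLAG_MISSING
    low, high = GLOBAL_RANGE[variable]
    return FLAG_GOOD if low <= value <= high else FLAG_BAD


def extract(raw_text):
    """Extract: parse raw CSV text into one dict per timestamp."""
    return list(csv.DictReader(io.StringIO(raw_text)))


def transform(rows):
    """Transform: normalize to long format (time, variable, value, qc_flag)."""
    records = []
    for row in rows:
        for variable in GLOBAL_RANGE:
            text = (row.get(variable) or "").strip()
            value = float(text) if text else None
            records.append((row["time"], variable, value, range_flag(variable, value)))
    return records


def load(connection, records):
    """Load: upsert records so that re-running the pipeline is idempotent."""
    connection.execute(
        "CREATE TABLE IF NOT EXISTS observations ("
        "time TEXT, variable TEXT, value REAL, qc_flag INTEGER, "
        "PRIMARY KEY (time, variable))"
    )
    connection.executemany(
        "INSERT OR REPLACE INTO observations VALUES (?, ?, ?, ?)", records
    )
    connection.commit()


# Hypothetical raw sensor output: one spike and one missing temperature value.
RAW = """time,TEMP,PSAL
2024-01-01T00:00:00Z,18.42,36.71
2024-01-01T01:00:00Z,99.90,36.70
2024-01-01T02:00:00Z,,36.69
"""

connection = sqlite3.connect(":memory:")
load(connection, transform(extract(RAW)))

# Flags assigned to the temperature series, in time order.
temp_flags = [
    flag
    for (flag,) in connection.execute(
        "SELECT qc_flag FROM observations WHERE variable = 'TEMP' ORDER BY time"
    )
]

# Cross-variable query: timestamps where both temperature and salinity are good.
good_pairs = connection.execute(
    "SELECT t.time, t.value, s.value "
    "FROM observations AS t JOIN observations AS s ON s.time = t.time "
    "WHERE t.variable = 'TEMP' AND s.variable = 'PSAL' "
    "AND t.qc_flag = 1 AND s.qc_flag = 1"
).fetchall()

print(temp_flags)
print(good_pairs)
```

Storing observations in long format with a per-value flag keeps the schema independent of any single instrument, which is one plausible way to support the cross-sensor queries mentioned above; a production system would likely add further tests (spike, gradient, stuck value) and richer metadata.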

This integrated workflow highlights several technical advancements in marine time series data management:
- Automation
- Scalability
- Findability
- Accessibility
- Interoperability
- Reusability
The approach presented here demonstrates how the convergence of modern data engineering techniques (ETL pipelines, object storage, containerization) can bridge the gap between raw instrument outputs and interoperable time series databases. By ensuring that long-term marine time series are systematically curated, quality-controlled, and embedded into global infrastructures, this workflow not only enhances the scientific value of individual observatories but also contributes to the broader goals of open and FAIR data strategies.
Authors:
Pablo Fernandez, Andres Cianca, Raquel Suárez Lopez, Eric Delory (Oceanic Platform of the Canary Islands)