Ocean carbon research faces a persistent challenge: high-quality in-situ measurements are extremely sparse and expensive to collect, yet they are essential for understanding marine primary production and its role in global climate processes. Current satellite-based approaches struggle with limited validation data, a problem recognised by the IPCC as a major constraint on understanding of the ocean carbon cycle. We demonstrate how GeoFoundation models pre-trained on abundant unlabelled Sentinel-3 satellite data can dramatically improve performance when only small amounts of in-situ measurements are available. Our approach first pre-trains a model on 512,000 Sentinel-3 tiles spanning global ocean regions to learn generalisable features, then fine-tunes it on sparse oceanographic measurements.

The benefits for data-limited applications are substantial. Our primary production model achieved meaningful performance using only 103 in-situ measurements, equivalent to just 6% of the pixels in a single satellite image. When the training data was reduced to just 19 observations, the foundation model maintained strong performance whilst conventional approaches failed. This demonstrates the approach's potential to extract maximum value from existing sparse datasets and opportunistically collected measurements.

Beyond improved statistical performance, the model captures realistic spatial patterns over large oceanic regions where no training data exists. Large-scale inference reveals detailed coastal productivity structures that conventional physical models typically under-predict, suggesting that the approach has learnt meaningful oceanographic relationships.

We also evaluated the approach for chlorophyll-a concentration estimation using 274 global in-situ measurements. The GeoFoundation model substantially outperformed existing methods, achieving lower RMSE than both decision tree approaches and the operational Sentinel-3 OLCI Level-2 neural network product. Crucially, when applied to large-scale inference over the North Sea, the foundation model produced spatial patterns more similar to the operational product, with a Structural Similarity Index Measure (SSIM) of 0.88, compared with 0.82 for models trained from scratch and 0.68 for the decision tree, demonstrating an improved ability to capture realistic oceanographic features.

The implications for operational ocean carbon monitoring are significant. This methodology could enhance existing observation networks by maximising the value of each expensive ship-based measurement, and could support carbon cycle research in data-poor regions.
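The two evaluation metrics quoted above, RMSE and SSIM, can be illustrated with a minimal sketch. The abstract does not state how either metric was computed, so the function names (`rmse`, `global_ssim`), the toy values, and the single-window form of SSIM used here are assumptions for illustration only. SSIM as usually reported is averaged over local sliding windows, a step this simplification omits.

```python
import math


def rmse(predicted, observed):
    """Root-mean-square error between two equal-length sequences."""
    if len(predicted) != len(observed) or len(predicted) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = [(p - o) ** 2 for p, o in zip(predicted, observed)]
    return math.sqrt(sum(squared) / len(squared))


def global_ssim(x, y, data_range):
    """Single-window SSIM over two flattened images.

    Simplification (assumption): statistics are taken over the whole
    image rather than averaged over local windows.
    """
    if len(x) != len(y) or len(x) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    n = len(x)
    mu_x = sum(x) / n
    mu_y = sum(y) / n
    var_x = sum((a - mu_x) ** 2 for a in x) / n
    var_y = sum((b - mu_y) ** 2 for b in y) / n
    cov = sum((a - mu_x) * (b - mu_y) for a, b in zip(x, y)) / n
    # Conventional stabilising constants, K1 = 0.01 and K2 = 0.03.
    c1 = (0.01 * data_range) ** 2
    c2 = (0.03 * data_range) ** 2
    numerator = (2 * mu_x * mu_y + c1) * (2 * cov + c2)
    denominator = (mu_x ** 2 + mu_y ** 2 + c1) * (var_x + var_y + c2)
    return numerator / denominator


# Hypothetical toy values, not data from the study.
reference_map = [0.10, 0.20, 0.40, 0.80, 0.60, 0.30]
candidate_map = [0.12, 0.18, 0.35, 0.70, 0.65, 0.33]

print("RMSE:", rmse(candidate_map, reference_map))
print("SSIM:", global_ssim(candidate_map, reference_map, data_range=1.0))
```

A lower RMSE indicates closer agreement with point measurements, whereas an SSIM nearer to 1 indicates that two maps share similar spatial structure, which is why the abstract reports the two side by side.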
Authors: Moffat, David (1); Dawson, Geoffrey (2); Vandaele, Remy (3); Taylor, Andrew (4); Tamura-Wicks, Helen (2); Jackson, Sarah (4); Lickorish, Rosie (2); Fraccaro, Paolo (2); Luo, Chunbo (3); Jones, Anne (2)
Organisations: 1: Plymouth Marine Laboratory, United Kingdom; 2: IBM Research Europe; 3: University of Exeter, United Kingdom; 4: STFC Hartree Centre, United Kingdom