HDS-LEE
HDS-LEE is the Helmholtz School for Data Science in Life, Earth and Energy. The partners are RWTH Aachen, Forschungszentrum Jülich, the University of Cologne, DLR Cologne and MPI Cologne. The Geoverbund partners run several PhD projects within HDS-LEE.
The Agrosphere institute of Forschungszentrum Jülich will set up the TSMP-PDAF model for the African continent. The model simulates water, energy and biogeochemical cycles for the subsurface and land surface, including multiscale assimilation of various remote sensing products. Data assimilation (DA) for the coupled land surface-subsurface system at the continental scale has not been done before. It will require research on how to tune the ensemble so that it captures uncertainty correctly. We will focus in particular on varying the soil hydraulic parameter values used as input, to reflect both the uncertainty of the soil map and the uncertainty in deriving soil properties from that map. The feasibility of joint state-parameter estimation will be explored. Nonlinear measurement operators that link remote sensing products to modeled states, and that account for scale differences, will be developed. Interactive data analysis is needed to control DA runs and to speed up tuning toward the optimal setup. The assimilation performance and data value will be evaluated with (scarce) independent in situ data.
We will also run simulations with both pristine (incorrect) and transient (correct) land use and land cover, and both without (incorrect) and with (correct) human water use, and test whether DA can compensate for the incorrect input on the basis of systematic model-data deviations. These simulations allow us to quantify the impact of land use and land cover change and of human water use on changes in the terrestrial water, energy and carbon cycles over the African continent, conditioned on remotely sensed satellite products.
Exploring the large volume of output data (>100 TB) generated by the high-resolution ensemble simulations requires parallel big data analytics methods.
In addition to classical uni-, bi- and multivariate statistics and time series analysis, methods suited to detecting more complex patterns in space and time, such as wavelet analysis and machine learning (ML) algorithms, will be used.
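To make the idea of joint state-parameter estimation with a nonlinear measurement operator more concrete, the following is a minimal Python sketch and not the TSMP-PDAF implementation. It assumes a stochastic ensemble Kalman filter (one of several possible ensemble filters; the project text does not say which one will be used), a toy linear-reservoir model with a single state `theta` and a single log-transformed parameter `log_k`, and a made-up observation operator that squares the state. All function names, parameter values and the model itself are hypothetical and chosen only for illustration.

```python
import numpy as np


def forecast(member, rain=1.0, dt=0.1, n_steps=5):
    """Advance one augmented member [theta, log_k] with a toy reservoir model."""
    theta, log_k = member
    k = np.exp(log_k)  # log-transform keeps the drainage rate positive
    for _ in range(n_steps):
        theta = theta + dt * (rain - k * theta)  # explicit Euler step
    return np.array([theta, log_k])  # the parameter is carried along unchanged


def observe(member):
    """Toy nonlinear measurement operator (assumption): the sensor sees theta**2."""
    return np.array([member[0] ** 2])


def enkf_update(ensemble, obs, obs_var, rng):
    """Stochastic EnKF analysis step on the augmented (state + parameter) ensemble."""
    n_members = ensemble.shape[0]
    predicted = np.array([observe(m) for m in ensemble])  # (n_members, n_obs)
    ens_anom = ensemble - ensemble.mean(axis=0)
    pred_anom = predicted - predicted.mean(axis=0)
    # Sample covariances between augmented vector and predicted observations
    cov_xy = ens_anom.T @ pred_anom / (n_members - 1)
    cov_yy = pred_anom.T @ pred_anom / (n_members - 1)
    cov_yy = cov_yy + obs_var * np.eye(obs.size)
    # Kalman gain K = cov_xy @ inv(cov_yy), computed via a linear solve
    gain = np.linalg.solve(cov_yy, cov_xy.T).T
    # Each member assimilates its own perturbed copy of the observation
    perturbed = obs + rng.normal(0.0, np.sqrt(obs_var), size=predicted.shape)
    return ensemble + (perturbed - predicted) @ gain.T


def run_experiment(seed=0, n_members=100, n_cycles=40, obs_var=0.01):
    """Twin experiment: return the parameter error before and after assimilation."""
    rng = np.random.default_rng(seed)
    true_log_k = np.log(0.5)
    truth = np.array([1.0, true_log_k])
    # The prior parameter ensemble is deliberately biased (centred on log_k = 0)
    ensemble = np.column_stack([
        rng.normal(1.0, 0.2, n_members),
        rng.normal(0.0, 0.5, n_members),
    ])
    prior_error = abs(ensemble[:, 1].mean() - true_log_k)
    for _ in range(n_cycles):
        truth = forecast(truth)
        ensemble = np.array([forecast(m) for m in ensemble])
        obs = observe(truth) + rng.normal(0.0, np.sqrt(obs_var), size=1)
        ensemble = enkf_update(ensemble, obs, obs_var, rng)
    posterior_error = abs(ensemble[:, 1].mean() - true_log_k)
    return prior_error, posterior_error


prior_error, posterior_error = run_experiment()
print(f"parameter error before: {prior_error:.3f}, after: {posterior_error:.3f}")
```

The parameter is never observed directly. It is updated only through its sampled covariance with the predicted observations, which is the mechanism that lets an ensemble filter adjust uncertain inputs such as soil hydraulic parameters. In a real setup, ensemble size, inflation and localization would all need tuning; this sketch uses none of them.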