@axel-lauer
Contributor

@axel-lauer axel-lauer commented Jan 29, 2025

Description

The PR fixes some small issues with the derivation script for sea ice extent, i.e.

  • corrected unit
  • changed list of required variables so the script can be applied to both CMIP5 and CMIP6 data
  • added/updated some comments

This fix is required for calculating and plotting a correct seasonal cycle of Arctic/Antarctic sea ice extent for REF: ESMValGroup/ESMValTool#3891

    sic = cubes.extract_cube(Constraint(name="sic"))
except iris.exceptions.ConstraintMismatchError:
    try:
        sic = cubes.extract_cube(Constraint(name="siconca"))

Contributor

If I remember correctly, siconca was added because some models were missing siconc, but I am not sure if that is still the case.

Contributor Author

That is right. In any case, I think it does not hurt to also try "siconca", so I would prefer to keep this here if that's fine.

Contributor

Well, in that case, it looks like it needs to be re-added because tests are failing due to siconca not being required anymore.

Contributor Author

I just updated that part a bit: 0c8beb9
I cannot get this preprocessor working for CMIP5 data, though, when adding {"short_name": "siconca", "optional": "true"}. Alternatively, I can remove the "siconca" part. Any advice would be highly appreciated...

Contributor

@sloosvel sloosvel Feb 21, 2025

I tested with the latest commit 1eaa6f0, loading CMIP5 sic data, CMIP6 sic data and CMIP6 data that only has siconca available, and it found the data needed for each project:

datasets:

        - {dataset: GISS-E2-1-H, grid: gr} # CMIP6 sic data 
        - {dataset: GISS-E2-1-H, exp: piControl, grid: gn, timerange: '3180/3180'} # CMIP6 siconca data
        - {dataset: GISS-E2-H-CC, project: CMIP5, ensemble: r1i1p1, mip: OImon} # CMIP5 sic data

diagnostics:

  test:
    variables:
      siextent:
        project: CMIP6
        mip: SImon
        timerange: '2000/2000'
        derive: true
        exp: historical
        ensemble: r1i1p1f1
    scripts: null

Let me know if it works for you

Contributor Author

Thank you for looking into this, @sloosvel! I tried this and I don't think we have found the optimal solution yet. If I add a CMIP6 dataset that provides both siconc and siconca (e.g. MPI-ESM1-2-LR), I run into a shape error, e.g.

ValueError: Chunks do not add up to shape. Got chunks=((96,), (192,)), shape=(220, 256)

Also, our current solution does not support processing any observationally based data (e.g. projects OBS, OBS6, ana4mips, obs4MIPs, native5, etc.). Here are some examples of observationally based datasets that I tried:

  - {dataset: ESACCI-SEAICE, project: OBS6, tier: 2, type: sat, version: L4-SICONC-RE-SSMI-12.5kmEASE2-fv3.0-NH,
     supplementary_variables: [{short_name: areacello, mip: Ofx}]}
  - {dataset: HadISST, project: OBS, tier: 2, type: reanaly, version: '1', mip: OImon}
  - {dataset: CFSR, project: ana4mips, tier: 1, type: reanalysis, mip: OImon}

So maybe checking for project == 'CMIP6' or project == 'OBS6' in the required function is enough, with a plain else for all other cases? But then, there is still the shape problem.

Do you have an idea what we could do? I didn't expect this to be so complicated...
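
A minimal sketch of the project check proposed in the comment above, written as a standalone function rather than the actual ESMValCore code. The name `required` mirrors the "required function" mentioned in the thread, and the list-of-dicts return shape follows the `{"short_name": ...}` entry quoted earlier. The mapping itself (siconc for CMIP6 and OBS6, sic for everything else) is an assumption taken from the proposal, not a confirmed implementation.

```python
def required(project):
    """Return the input variables needed to derive siextent (hypothetical sketch).

    Assumption: CMIP6-style projects name sea ice concentration "siconc",
    while all other projects use the CMIP5-style name "sic".
    """
    if project in ("CMIP6", "OBS6"):
        return [{"short_name": "siconc"}]
    # Plain fallback for every other project (CMIP5, OBS, ana4mips, ...).
    return [{"short_name": "sic"}]


for project in ("CMIP6", "OBS6", "CMIP5", "OBS", "ana4mips"):
    print(project, required(project))
```

This covers only the variable lookup; it does not address the chunk shape error reported above.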

Contributor

You are right, trying to include siconca makes things too complicated. I checked and the issue was with only one dataset that was missing siconc. Maybe the data is available nowadays. I will remove the calls to siconca, since it's not worth it to include it just for one dataset.

@codecov

codecov bot commented Mar 10, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 95.57%. Comparing base (4262409) to head (166e51e).
⚠️ Report is 1 commits behind head on main.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2648      +/-   ##
==========================================
- Coverage   95.57%   95.57%   -0.01%     
==========================================
  Files         266      266              
  Lines       15520    15519       -1     
==========================================
- Hits        14834    14833       -1     
  Misses        686      686              


@github-actions
Contributor

In order to maintain a backlog of relevant pull requests, we automatically label them as stale after 180 days of inactivity.

If this pull request is still important to you, please comment below to remove the stale label. Otherwise, this pull request will be automatically closed in 60 days. If this pull request only suffers from a lack of reviewers, please tag the @ESMValGroup/technical-lead-development-team so they can help you find a suitable reviewer.

@github-actions github-actions bot added the Stale label Nov 12, 2025
@axel-lauer
Contributor Author

Even though there was no activity for a while, I would still like to see this merged eventually.

@axel-lauer axel-lauer removed the Stale label Nov 12, 2025
@bouweandela
Member

@sloosvel Would you still be interested in taking a look at this before you leave?

@sloosvel
Contributor

sloosvel commented Dec 5, 2025

I fixed the merge conflicts and removed any mention of siconca. I'll try to test with a recipe soon.

@sloosvel
Contributor

sloosvel commented Dec 11, 2025

I tested a recipe with a dataset of each project that computes and plots the siextent and it seems to run successfully now:

# ESMValTool
---
documentation:
  description: |
    Example recipe

  title: Recipe that runs an example diagnostic written in Python.

  authors:
    - loosveldt-tomas_saskia

preprocessors:
  pp_siextent:
    area_statistics:
      operator: sum
    convert_units:
      units: 10^6 km2
datasets:

        - {dataset: GISS-E2-1-H, grid: gr, project: CMIP6, mip: SImon, ensemble: r1i1p1f1}
        - {dataset: MPI-ESM1-2-LR, grid: gn, project: CMIP6, mip: SImon, ensemble: r1i1p1f1}
        - {dataset: GISS-E2-H-CC, project: CMIP5, ensemble: r1i1p1, mip: OImon, supplementary_variables: [{short_name: areacello, skip: true}]}
        - {dataset: ESACCI-SEAICE, project: OBS6, tier: 2, type: sat, version: L4-SICONC-RE-SSMI-12.5kmEASE2-fv3.0-NH, mip: SImon,
           supplementary_variables: [{short_name: areacello, mip: Ofx}]}
        - {dataset: HadISST, project: OBS, tier: 2, type: reanaly, version: '1', mip: OImon}
        - {dataset: CFSR, project: ana4mips, tier: 1, type: reanalysis, mip: OImon}
diagnostics:

  test:
    variables:
      siextent:
        timerange: '1995/2000'
        derive: true
        exp: historical
        preprocessor: pp_siextent
    scripts:
      script1:
        script: examples/diagnostic.py
        quickplot:
          plot_type: plot

@bouweandela
Member

bouweandela commented Dec 12, 2025

Thanks @sloosvel!

@axel-lauer @sloosvel Do you think this derived variable is actually needed? I noticed that for CMIP6 there are the variables siarean and siareas (total sea ice area in the northern and southern hemisphere respectively), which seems to be what this computes. For other projects, I think you could achieve the same with the preprocessor:

    extract_region:
      # northern hemisphere
      start_longitude: 0
      end_longitude: 360
      start_latitude: 0
      end_latitude: 90
    mask_below_threshold:
      threshold: 15.0
    clip:
      minimum: 100.0
    area_statistics:
      operator: sum
    convert_units:
      units: 10^6 km2

starting from the siconc variable, though the numbers don't seem to be exactly the same.

@axel-lauer
Contributor Author

> @axel-lauer @sloosvel Do you think this derived variable is actually needed? I noticed that for CMIP6 there are the variables siarean and siareas (total sea ice area in the northern and southern hemisphere respectively), which seems to be what this computes. For other projects, you could achieve the same with the preprocessor:

We need this variable to apply a consistent calculation across models and observational datasets. Observational datasets typically provide sea ice concentration only and have "polar gaps", which cannot be accounted for (e.g. by reducing the geographical extent for the calculation or by applying a common mask) when using the pre-calculated CMIP6 total sea ice areas.
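
To illustrate the point about polar gaps, here is a minimal sketch with made-up numbers. Sea ice extent is computed from concentration for a "model" and an "observation" on the same tiny grid, restricted to cells where both have data. All names (`extent`, `common_mask`) and values are hypothetical; only the 15% threshold comes from this thread.

```python
THRESHOLD = 15.0  # percent; cells at or above this count as ice covered


def extent(concentration, cell_area, valid):
    """Sum the area of valid cells whose concentration reaches THRESHOLD."""
    return sum(
        area
        for conc, area, ok in zip(concentration, cell_area, valid)
        if ok and conc is not None and conc >= THRESHOLD
    )


# Hypothetical 1-D "grid" of five cells; None marks the observational polar gap.
cell_area = [100.0, 100.0, 100.0, 100.0, 100.0]  # km2
model_sic = [5.0, 40.0, 80.0, 95.0, 99.0]  # percent
obs_sic = [10.0, 30.0, 85.0, 90.0, None]  # percent

# Common mask: keep only cells where both datasets provide a value.
common_mask = [m is not None and o is not None for m, o in zip(model_sic, obs_sic)]
all_cells = [True] * len(cell_area)

model_full = extent(model_sic, cell_area, all_cells)
model_common = extent(model_sic, cell_area, common_mask)
obs_common = extent(obs_sic, cell_area, common_mask)
print(model_full, model_common, obs_common)
```

In this toy case the model total over all cells is larger than the total over the common mask. A precomputed hemispheric total corresponds to the first number and cannot be restricted afterwards, which is the limitation described above.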

@bouweandela
Member

Would using the preprocessors suggested above do the same thing? All the derivation function seems to do is mask values below 15% and set the remaining values to 1 with units 1, and the latter is the same as setting all values to 100%.
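
The equivalence asked about here can be checked on toy numbers. The sketch below is not the ESMValCore derivation code. It only assumes that an area sum multiplies each cell value by its cell area, and that converting from percent to a fraction divides by 100. Function names and data are hypothetical.

```python
THRESHOLD = 15.0  # percent


def extent_indicator(sic, cell_area):
    """Derivation-style: mask below THRESHOLD, set the rest to 1 (unit 1)."""
    return sum(1.0 * area for conc, area in zip(sic, cell_area) if conc >= THRESHOLD)


def extent_clip(sic, cell_area):
    """Preprocessor-style: mask below THRESHOLD, clip the rest up to 100 %."""
    total_percent = sum(
        max(conc, 100.0) * area
        for conc, area in zip(sic, cell_area)
        if conc >= THRESHOLD
    )
    return total_percent / 100.0  # percent -> fraction


sic = [0.0, 14.9, 15.0, 60.0, 100.0]  # percent, made-up values
cell_area = [50.0, 75.0, 100.0, 125.0, 150.0]  # km2, made-up values

a = extent_indicator(sic, cell_area)
b = extent_clip(sic, cell_area)
print(a, b)
```

Both functions give the same total on this toy grid. Why the numbers differ slightly on real data is left open in the thread.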

5 participants