More Groupers / user stories / strategies #255

Open
dcherian opened this issue Jul 27, 2023 · 2 comments
Labels
documentation Improvements or additions to documentation

Comments

@dcherian
Collaborator

dcherian commented Jul 27, 2023

We need more discussion of strategies for labeling groups, or perhaps just more convenient Grouper objects.

@dcherian dcherian added the documentation Improvements or additions to documentation label Jul 27, 2023
@dcherian dcherian changed the title More user stories / strategies More Groupers / user stories / strategies Jul 27, 2023
@dcherian
Collaborator Author

dcherian commented Aug 3, 2023

Potential SeasonGrouper syntax:

SeasonGrouper(["JF", "MAM", "JJAS", "OND"])
SeasonGrouper(["DJFM", "MAMJ", "JJAS", "SOND"])

The list would be used as expected_groups so the output is in the right order.
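To make the ordering point concrete, here is a rough sketch of how season strings could be turned into integer codes whose order follows the input list. This is not an existing API: `season_months` and `factorize_seasons` are hypothetical helper names, `-1` for "not in any season" is an assumed convention, and ambiguous strings (for example a lone "J") simply resolve to the first match.

```python
MONTH_INITIALS = "JFMAMJJASOND"


def season_months(season: str) -> list[int]:
    """Return the 1-based month numbers spelled out by a string like "DJF"."""
    if not season:
        raise ValueError("empty season string")
    # Doubling the initials lets a season wrap around the end of the year.
    start = (MONTH_INITIALS * 2).find(season)
    if start < 0:
        raise ValueError(f"{season!r} is not a run of consecutive month initials")
    return [(start + i) % 12 + 1 for i in range(len(season))]


def factorize_seasons(months: list[int], seasons: list[str]) -> list[int]:
    """Map each month number to the index of its season in `seasons`.

    Codes index into `seasons`, so the list doubles as the expected group
    order. Months covered by no season get -1 (an assumed convention).
    """
    lookup: dict[int, int] = {}
    for code, season in enumerate(seasons):
        for month in season_months(season):
            if month in lookup:
                raise ValueError(f"month {month} belongs to more than one season")
            lookup[month] = code
    return [lookup.get(month, -1) for month in months]


codes = factorize_seasons([1, 2, 3, 6, 12, 9], ["JF", "MAM", "JJAS", "OND"])
print(codes)  # [0, 0, 1, 2, 3, 2]
```

Note that the second example list above has overlapping seasons (March is in both "DJFM" and "MAMJ"), so a one-code-per-element factorization like this one rejects it. Supporting that case would need a design where an element can land in several groups.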

Here's an interesting thread on SeasonGrouper design: xCDAT/xcdat#416

ds.temporal.group_average(
    'pr',
    freq='season',
    season_config={
        'dec_mode': 'DJF',  
        'drop_incomplete_djf': True, # Or drop_incomplete_season
        'custom_seasons': ['Nov', 'Dec', 'Jan', 'Feb', 'Mar']
    }    
)

All of this is a factorization problem. I like the idea of custom Grouper objects with custom factorization.
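As a sketch of what "custom Grouper objects with custom factorization" might look like: the grouper only has to turn values into integer codes plus labels, and the reduction can stay generic. Everything here is hypothetical (the `Grouper` protocol, `MonthSetGrouper`, and `grouped_mean` are illustrative names, not an existing interface), and it ignores details such as dropping incomplete seasons.

```python
from dataclasses import dataclass
from datetime import date
from typing import Protocol, Sequence


class Grouper(Protocol):
    """Hypothetical interface: a grouper only needs to know how to factorize."""

    def factorize(self, values: Sequence) -> tuple[list[int], list[str]]: ...


@dataclass
class MonthSetGrouper:
    """Hypothetical grouper labelling each date by the named month set it falls in."""

    groups: dict[str, tuple[int, ...]]

    def factorize(self, values: Sequence[date]) -> tuple[list[int], list[str]]:
        labels = list(self.groups)  # dict order fixes the output order
        month_to_code = {
            month: code
            for code, months in enumerate(self.groups.values())
            for month in months
        }
        # -1 marks dates that fall in no group (an assumed convention).
        codes = [month_to_code.get(d.month, -1) for d in values]
        return codes, labels


def grouped_mean(data: Sequence[float], codes: Sequence[int], ngroups: int) -> list[float]:
    """Generic reduction that only sees integer codes, never the grouping rule."""
    sums = [0.0] * ngroups
    counts = [0] * ngroups
    for x, code in zip(data, codes):
        if code >= 0:
            sums[code] += x
            counts[code] += 1
    return [s / n if n else float("nan") for s, n in zip(sums, counts)]


grouper = MonthSetGrouper({"NDJFM": (11, 12, 1, 2, 3), "rest": (4, 5, 6, 7, 8, 9, 10)})
dates = [date(2001, 1, 15), date(2001, 6, 15), date(2001, 12, 15)]
codes, labels = grouper.factorize(dates)
means = grouped_mean([1.0, 10.0, 3.0], codes, len(labels))
print(dict(zip(labels, means)))  # {'NDJFM': 2.0, 'rest': 10.0}
```

The point of the split is that a new grouping rule only needs a new `factorize`; the reduction code does not change.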

@dcherian
Collaborator Author

Some prior art from polars:

  1. https://pola-rs.github.io/polars/py-polars/html/reference/dataframe/api/polars.DataFrame.group_by_dynamic.html

    Group based on a time value (or index value of type Int32, Int64).

    Time windows are calculated and rows are assigned to windows. Different from a normal group by is that a row can be member of multiple groups. By default, the windows look like:

     [start, start + period)
    
     [start + every, start + every + period)
    
     [start + 2*every, start + 2*every + period)
    
     …
    

    where start is determined by start_by, offset, and every (see parameter descriptions below).

  2. https://pola-rs.github.io/polars/py-polars/html/reference/dataframe/api/polars.DataFrame.rolling.html#polars.DataFrame.rolling (I don't think this is normal rolling?)
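The window layout quoted in item 1 can be sketched in a few lines for an integer index. This is only an illustration of the `[start + k*every, start + k*every + period)` pattern, not how polars implements it: `dynamic_windows` is a hypothetical name, `start` is passed in directly rather than derived from `start_by` and `offset`, and emitting empty windows is an assumption.

```python
from typing import Iterator


def dynamic_windows(
    values: list[int], start: int, every: int, period: int
) -> Iterator[tuple[tuple[int, int], list[int]]]:
    """Yield ((lo, hi), members) for half-open windows [lo, hi) spaced by `every`."""
    if every <= 0 or period <= 0:
        raise ValueError("every and period must be positive")
    if not values:
        return
    last = max(values)
    lo = start
    while lo <= last:
        hi = lo + period
        # A value can match several windows whenever period > every.
        yield (lo, hi), [v for v in values if lo <= v < hi]
        lo += every


for window, members in dynamic_windows([0, 1, 2, 3, 4, 5], start=0, every=2, period=3):
    print(window, members)
# (0, 3) [0, 1, 2]
# (2, 5) [2, 3, 4]
# (4, 7) [4, 5]
```

Here 2 and 4 each appear in two windows, which is the "a row can be member of multiple groups" behaviour and the same situation as overlapping seasons.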
