Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[ENH] Add new ERP measures #12434

Draft
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

withmywoessner
Copy link
Contributor

@withmywoessner withmywoessner commented Feb 8, 2024

Reference issue

Adds features from #7848

What does this implement/fix?

  • Mean amplitude
  • Area amplitude (all positive, numeric integration)
  • Fractional area latency
  • Fractional peak latency

EDIT: Also adds function to find erp peak.

@withmywoessner withmywoessner changed the title Erp_measures [ENH] Add new ERP measures Feb 8, 2024
@withmywoessner
Copy link
Contributor Author

withmywoessner commented Feb 8, 2024

@larsoner @drammock I went ahead and created a pull request. There's nothing to review yet. Do you think scipy.integrate.trapezoid is fine to calculate the areas? Or do you know of something faster.

Also, I saw another issue related to Type Hints, do you think these are still useful even though these are 'global' functions not called from Evoked?

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Feb 9, 2024

Hi all, I also wanted to ask if the get_mean(evoked) function is even necessary. Can't you just do np.evoked.data.mean() and that will be the same as integrate.trapz(data, times) when the sampling rate is uniform. Is it ever the case in MEG/EEG research that the sampling rate would be non-uniform?

edit: I guess it would be nice to not have to use other operations to restrict the time interval. So in that case maybe I should just remove (or update) the tutorial that mentions using .data.mean()

@larsoner
Copy link
Member

In MNE-Python sample rates are always uniform

guess it would be nice to not have to use other operations to restrict the time interval. So in that case maybe I should just remove (or update) the tutorial that mentions using .data.mean()

Yeah if it's as easy as evoked.copy().crop(x, y).data.mean() I'd rather teach people this one-liner!

@drammock
Copy link
Member

Do you think scipy.integrate.trapezoid is fine to calculate the areas? Or do you know of something faster.

trapz should be plenty fast, I think

I saw another issue related to Type Hints

Currently the plan is to add type hints to whole submodules at once (e.g., do it for mne.stats, then for mne.minimum_norm, etc), rather than adding them piecemeal to each bit of code that gets changed. We can enable mypy testing for each submodule separately, so that will ensure the type hints stay accurate/complete once they're added. If we add them one function at a time, it's hard to test them because of all the unannotated functions surrounding them.

Copy link
Member

@drammock drammock left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know this is draft mode still, but I wanted to ask why all the other files within stats are at the root level (stats/_adjacency.py, stats/cluster_level.py, etc), but erp is going within a subfolder (stats/erp/_erp.py)? Is there a particular reason you're choosing to organize it that way?

@withmywoessner
Copy link
Contributor Author

Hey @drammock, thanks for helping me with this!

I know this is draft mode still, but I wanted to ask why all the other files within stats are at the root level (stats/_adjacency.py, stats/cluster_level.py, etc), but erp is going within a subfolder (stats/erp/_erp.py)? Is there a particular reason you're choosing to organize it that way?

I thought that's how @larsoner mentioned it should be structured:
#7848 (comment)

@drammock
Copy link
Member

I thought that's how @larsoner mentioned it should be structured

ah OK. He probably had a good reason :)

@withmywoessner
Copy link
Contributor Author

@drammock @larsoner Is it better to match the functionality of get_peak by returning only area of the channel with the highest area for the get_area function? This may be more unintuitive for things like get_fractional_area_latency which would return a latency and I do not think there's any reason one would want just the highest latency.

For that reason, I am leaning more towards all the new functions just returning the new measures for all channels even get_area

@larsoner
Copy link
Member

I am not familiar with what people actually do for ERPs -- for example averaging over a subset of channels then computing area, computing it for a single channel, computing it for multiple channels, etc.

It's a reasonable starting point to match get_peak at least. If people really want it for all channels they can always do a one liner like:

areas = {ch_name: evoked.copy().pick([ch_name]).get_area() for ch_name in evoked.ch_names}

or similar, too.

@drammock
Copy link
Member

I am not familiar with what people actually do for ERPs

same here; I've read some papers (but not recently) and don't have a complete grasp of the ERP literature. I tend to rank the goal state like this:

  1. make it easy to get measures that are super commonly used
  2. make it possible to get less commonly used measures
  3. make our API as consistent as possible while respecting (1) and (2)

Does knowing that resolve any questions in your mind @withmywoessner? If not, we could ping a couple other folks to weigh in on what should be included in "super common" vs "less common"

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Feb 16, 2024

@drammock @larsoner Thank you for the responses. I also think it's important to match existing functionality to make it less confusing, but my main question was, if we are going in the direction of returning a single value for all channels, what single value should I return for latency = get_fractional_latency. I guess it makes the most sense to return the SMALLEST latency which kinda matches the functionality of get_peak.

@drammock
Copy link
Member

if we are going in the direction of returning a single value for all channels, what single value should I return for latency = get_fractional_latency

Sorry if I've missed the discussion/resolution of this question, but are we leaning in the direction of returning a single value across all channels? I know that's what evoked.get_peak() does, and there was some discussion of the question here and in the 3 comments following that one. There I said basically "just use .copy().pick() first", but in hindsight, I'm feeling now like if we're going to go to the trouble of adding functions that are useful for ERP researchers, we should do what they would expect / find most useful. My (possibly wrong or outdated?) impression is that ERP researchers often look at each channel separately. If that's right, shouldn't these new functions operate separately on each channel?

From an API perspective, it's easy enough to have a switch like channelwise=True|False or average=True|False or combine=None|"mean"|"median" that lets the user decide whether to aggregate across channels or not --- but let's only offer that option if it's something the ERP researchers want / will find useful.

A further question is that if we do offer to aggregate, does that mean "combine channels first, then compute the metric" or does it mean "compute metric for each channel, and then combine the per-channel metrics"? If the latter makes more sense, we could at least consider not building in the aggregation (I think we can return a numpy array or pandas series and expect that people can use its .mean() method if they want to).

I guess it makes the most sense to return the SMALLEST latency which kinda matches the functionality of get_peak.

When dealing with latencies, yeah, returning the smallest could be one sensible way to aggregate across channels. So there are further questions around how each metric should be aggregated (is there one obviously right choice for each metric, or do we need to let the users pick? etc). But the question "should we aggregate at all" needs to be decided first.

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Feb 16, 2024

Thanks for the in-depth response @drammock! I am also confused on what approach is best. I did some looking and in this tutorial by Steven J. Luck who created ERPLAB. He mentions that it is common to average across channels (link):

However, it’s often valuable to include the data from multiple channels. One way to do that is to measure from multiple different channels and include channel as a factor in the statistical analysis. However, that adds another factor to the analysis, which increases the number of p values and therefore increases the probability of getting bogus-but-significant effects. Also, interactions between an experimental manipulation and electrode site are difficult to interpret (Urbach & Kutas, 2002, 2006). In most cases, I therefore recommend averaging the waveforms across the channels where the component is large, creating a cluster, and then obtaining the amplitude or latency scores from the cluster waveform. Averaging across channels in this manner tends to produce a cleaner waveform, which decreases the measurement error (as long as we don’t include channels where the ERP effect is substantially smaller). Also, it avoids the temptation to “cherry-pick” the channel with the largest effect.

However, talking to a member of my lab he stated that they prefer to find the peaks/areas of each channel then do a correction for multiple comparisons if they end up doing some statistical analysis.

Sorry if I've missed the discussion/resolution of this question, but are we leaning in the direction of returning a single value across all channels?

Personally, I am leaning more towards returning all the channels even though that does not match the functuality of get_peak(). I think it would aggregation much easier.

A further question is that if we do offer to aggregate, does that mean "combine channels first, then compute the metric" or does it mean "compute metric for each channel, and then combine the per-channel metrics"? If the latter makes more sense, we could at least consider not building in the aggregation (I think we can return a numpy array or pandas series and expect that people can use its .mean() method if they want to).

I think it would be good to include an option to aggregate across channels before computing measures from what I have read. If people are confused on how to do the later method of averaging, I could always write a tutorial later even if it is quite simple.

Also, I know this may be a bad idea, but would it be horrible if we made another get_peak function inside of the erp module that included the extra features like aggregating across channels? I guess these parameters could be added to the existing function, it just bothers me that the current function is called by an Evoked object while the ones I am making will be 'global' functions.

@drammock
Copy link
Member

If Steve Luck recommends (at least in some cases) to aggregate across channels before computing these metrics, then we should make it easy to do that. But I do lean toward making it optional via average=True|False or similar (as suggested above). I think both average and combine are parameters used elsewhere in our codebase for similar needs, so I'm not 100% sure which one to pick: combine is the most flexible as it doesn't bake in a single aggregation method, but if everyone always only uses mean anyway, we might as well just have a simple boolean like average=False|True.

would it be horrible if we made another get_peak function inside of the erp module that included the extra features like aggregating across channels?

horrible because of name clash? I.e., is it acceptable to have both evoked.get_peak() method and mne.stats.erp.get_peak() function? It's not ideal, but I feel like even if we called the function something else like get_erp_peak(), there's still a risk that people might accidentally discover/try evoked.get_peak() and be sad that it doesn't easily do what they want. So distinct names doesn't completely solve that problem. That said, calling all of these functions get_erp_*() might be a good idea anyway --- it would reinforce that these are (probably) the measures you want and are familiar with if you're an ERP researcher. I'm open to other opinions/ideas about naming too though.

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Feb 22, 2024

horrible because of name clash? I.e., is it acceptable to have both evoked.get_peak() method and mne.stats.erp.get_peak() function? It's not ideal,

Yes that's what I was referencing.

@drammock What if we have a parameter compute (This is probably not the best name so other suggestions would be very much appreciated maybe compute_on??) and compute specifies what sort of data the metric is computed on and what channels it returns.

So for instance mne.stats.erp.get_peak() would have compute='maximum' by default so that it matches the functionality of evoked.get_peak() that way users would not get confused because they would do the same thing by default. compute would also have options for all and average. mne.stats.erp.get_fractional_area_latency() could also have a special option for compute like compute='earliest' that sort of matches the way evoked.get_peak() works. However, this may be too much to place on the shoulders of one lone variable 😨

@withmywoessner
Copy link
Contributor Author

Hey @larsoner @drammock just pinging here to see if you had any thoughts on my suggested API implementation, no pressure to look at this now though. Thank you!

@drammock
Copy link
Member

drammock commented Mar 1, 2024

What if we have a parameter compute [...] and compute specifies what sort of data the metric is computed on and what channels it returns.

I'm struggling to keep track of what all the desired computations are, which makes it hard to weigh on on whether the get_peak(..., compute="whatever") is a good way to expose all the desired possiblities. Can you make a list or table showing all the computations you want to support, and then we can work from there to decide how best to group those computations into combinations of functions and parameter values? Example:

  • get fractional peak latency of each channel separately
  • get fractional peak latency after averaging channels
  • get fractional area latency ...
  • get mean amplitude over a window ...

Side note: remember that if it is best/easiest from a UX perspective for the user to have 10 distinct functions with clear, descriptive function names and few or no parameters to supply, then we can build that as the public API --- even if under the hood they all just wrap to a (more complicated) private function (if that's the most efficient / least redundant way to actually do the computations).

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 1, 2024

I'm struggling to keep track of what all the desired computations are, which makes it hard to weigh on on whether the get_peak(..., compute="whatever") is a good way to expose all the desired possiblities. Can you make a list or table showing all the computations you want to support...

Sure, no problem @drammock! Let me know if this makes it more clear.

  • get_fractional_area_lat(fractional_area, tmin, tmax, mode, compute...)

    • get_fractional_area_lat(fractional_area=.9, tmin, tmax, mode, compute='all'...)
    • get_fractional_area_lat(fractional_area, tmin, tmax, mode, compute='average'...)
    • get_fractional_area_lat(fractional_area, tmin, tmax, mode, compute='earliest'...)
    • get_fractional_area_lat(fractional_area, tmin, tmax, mode='intg', compute...)
    • get_fractional_area_lat(fractional_area, tmin, tmax, mode='abs', compute...)
    • get_fractional_area_lat(fractional_area, tmin, tmax, mode='neg', compute...)
    • get_fractional_area_lat(fractional_area, tmin, tmax, mode='pos', compute...)
  • get_area(tmin, tmax, mode, compute...)

    • get_area(tmin, tmax, mode, compute='all'...)
    • get_area(tmin, tmax, mode, compute='average'...)
    • get_area(tmin, tmax, mode, compute='maximum'...)
    • get_area(tmin, tmax, mode='intg', compute...)
    • get_area(tmin, tmax, mode='pos', compute...)
    • get_area(tmin, tmax, mode='neg', compute...)
    • get_area(tmin, tmax, mode='abs', compute...)
  • get_peak(tmin, tmax, mode, compute...)

    • get_peak(tmin, tmax, mode, compute='average'...)
    • get_peak(tmin, tmax, mode, compute='all'...)
    • get_peak(tmin, tmax, mode, compute='maximum'...)
    • get_peak(tmin, tmax, mode='abs', compute...)
    • get_peak(tmin, tmax, mode='pos', compute...)
    • get_peak(tmin, tmax, mode='neg', compute...)
  • get_fractional_peak_lat(fractional_peak, tmin, tmax, mode, compute, return_offset...)

    • get_fractional_peak_lat(fractional_peak=.9, tmin, tmax, mode, compute='all'...)
    • get_fractional_peak_lat(fractional_peak, tmin, tmax, mode, compute='average'...)
    • get_fractional_peak_lat(fractional_peak, tmin, tmax, mode, compute='earliest'...)
    • get_fractional_peak_lat(fractional_peak, tmin, tmax, mode='abs', compute...)
    • get_fractional_peak_lat(fractional_peak, tmin, tmax, mode='neg', compute...)
    • get_fractional_peak_lat(fractional_peak, tmin, tmax, mode='pos', compute...)
    • get_fractional_peak_lat(fractional_peak, tmin, tmax, compute, return_offset=True...)

Areas
mode='intg' -> compute across normal integral

Amps and areas
mode='pos' -> compute across positive vales
mode='neg' -> compute across negative values
mode='abs' -> compute across absolute value of values

compute='all' -> compute metric for each channel then return all channels
compute='avg' -> Average across all channels in Evoked object then compute the metric on the average

For amps and areas
compute='maximum' -> Compute the metric across all channels but only return the absolute value maximum of the values considered (neg, pos). This is probably confusing and should be changed. I just included this to match the functionality of Evoked.get_peak()

For latencies
compute='earliest' -> Return earliest Compute the metric across all channels but only return the earliest latency

return_offset -> return onset as well as the 'offset' - the first time after the peak that the ERP drops to X% as large as the peak.
image

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 1, 2024

fractional_peak will be renamed to fractional_amp

@withmywoessner
Copy link
Contributor Author

Hey @drammock! Just pinging here to see if you have a chance to look over my suggestions because I saw you edited one of my posts. Let me know if I am pinging too much. Thank you!

@drammock
Copy link
Member

drammock commented Mar 8, 2024

Hey @drammock! Just pinging here to see if you have a chance to look over my suggestions because I saw you edited one of my posts. Let me know if I am pinging too much. Thank you!

Edited other post because it cross-ref'd the wrong issue --- I look quickly at (nearly) every new issue/PR as it comes in if I can, just to mentally triage its urgency/difficulty and chime in as appropriate. Rest assured that this one is still on my radar, but there's a lot going on right now and I haven't had to wrap my head around the (long!) list of supported functionality. One thing slowing me down is that I asked for / expected a list of prose descriptions of the functionality you want to support (as a way of helping figure out what the API should look like). What you wrote is a mock-up of the API call for each case.

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 8, 2024

Hey @drammock! Just pinging here to see if you have a chance to look over my suggestions because I saw you edited one of my posts. Let me know if I am pinging too much. Thank you!

Edited other post because it cross-ref'd the wrong issue --- I look quickly at (nearly) every new issue/PR as it comes in if I can, just to mentally triage its urgency/difficulty and chime in as appropriate. Rest assured that this one is still on my radar, but there's a lot going on right now and I haven't had to wrap my head around the (long!) list of supported functionality.

Okay, I understand, thanks for letting me know!

One thing slowing me down is that I asked for / expected a list of prose descriptions of the functionality you want to support (as a way of helping figure out what the API should look like). What you wrote is a mock-up of the API call for each case.

I placed prose descriptions of every keyword and value that keyword can take at the bottom, I thought that would be easier to see everything laid out in the most succinct format instead of just having a prose description of each so that way if I miss something it is easier to see.

If I did a prose description there would be 12 possible combinations for some of them and I thought that would be hard to parse.

EDIT: I'Il add a prose description below each.

@withmywoessner

This comment was marked as outdated.

@drammock
Copy link
Member

drammock commented Mar 8, 2024

I placed prose descriptions of every keyword and value that keyword can take at the bottom

Yes, I saw that. It requires scrolling/scanning back and forth between the API descriptions and the "key".

I thought that would be easier to see everything laid out in the most succinct format instead of just having a prose description of each so that way if I miss something it is easier to see.

Could you do what you need to do locally to make sure you've included all possibilities, rather than imposing that structure here?

If I did a prose description there would be 12 possible combinations for some of them and I thought that would be hard to parse.

I appreciate the "trying to be helpful" aspect of this choice. But so there is no further confusion: it would be easier for me to see descriptions in prose when I'm trying to figure out what an API should look like. By describing what you want to do already in the form of an API makes that task harder for me, because I have to first "un-api" your proposal in order to imagine / visualize what other possible APIs might look like. Feel free to use nested bullets if that helps, or comments like "each of the next 5 bullets also should have a way to specify a restricted time window over which to operate" or whatever --- in other words you don't necessarily need to spell out every detail on every line, if it helps clarity.

@withmywoessner

This comment was marked as outdated.

@drammock
Copy link
Member

OK @withmywoessner sorry for the delay. Here is my take on what the API should be like:

  1. function names get_erp_peak, get_erp_area, get_erp_fractional_peak_latency, and get_erp_fractional_area_latency. I acknowledge that the latter two are very long, but I think there's an advantage in having get_erp_<TAB> bring up all four options in people's editors/iPython terminals.
  2. all four functions get tmin and tmax for bounding the computation in the time domain
  3. all four functions get a parameter mode for how to compute the peak or area.
    a. For peak functions the modes are pos, neg, abs (or maybe positive negative absolute? I'm waffling on that point; maybe best to ask for the full form of each word, but silently also accept the shorter forms too)
    - These determine whether to only consider extrema in one direction, other direction, or both directions
    - These affect both the method for finding the overall peak and also the method for finding the fractional value.
    b. For area functions the modes are pos, neg, abs, integrate. integrate effectively subtracts the negative areas from the positive ones.
    - as above for "peak", these affect the method for finding both overall area and fractional area.
  4. all four functions get a parameter average (boolean) indicating whether to average the signals together first, or operate channel-wise.
    • Also OK would be average: None | "mean" | callable if we think users will ever want to aggregate across channels in some other way besides arithmetic mean.
    • Probably fine to start with False | True and add the other options later only if users ask for it.
  5. Peak-related functions get a parameter strict ( boolean) indicating whether to error if a positive peak was requested but the whole signal is negative (or vice versa).
  6. I don't think we should add parameters / param values to return only the (biggest, earliest) area or peak across channels, because it's not obvious to me that e.g. "earliest" is always what users will want.
  7. In light of point (6), we should talk about what form the output takes. If we're going to force users to use np.min/np.argmin to find e.g. earliest latency (and the channel name where it occurs) we could consider returning something a bit more structured like a Pandas series, named tuple, dict, or similar.

WDYT about that proposal?

@drammock
Copy link
Member

Just to clarify is this a case where they would give a different result?

yes exactly, nice illustration :)

From ERPLab documentation

OK if ERPLab offers the equivalent of my which keyword then let's just do that for now, on the assumption that the side-style use case is probably less common. Does which make sense as a param name? I came up with it after just a few seconds, maybe another name would be clearer. Also are you OK with "onset/offset/both" as the supported param values, or is there something better there? One other approach maybe is to always return onset, and optionally return_offset=True to get both (but no option to get just offset). WDYT?

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 13, 2024

yes exactly, nice illustration :)

Thanks!

One other approach maybe is to always return onset, and optionally return_offset=True to get both (but no option to get just offset). WDYT?

This one matches evoked.get_peak() which has a boolean return_amplitude parameter, but I think which is better because it would leave room to add the functionality you mentioned in side.

Does which make sense as a param name?

I personally like return more, but either is fine!

How about we include a parameter that allows one to specify the number of times the signal is allowed to cross the specified amplitude fraction [before returning the latency]

Do you think accounting for something like this would ever be useful. Maybe I should look for papers of people needing this.

@drammock
Copy link
Member

How about we include a parameter that allows one to specify the number of times the signal is allowed to cross the specified amplitude fraction [before returning the latency]

Do you think accounting for something like this would ever be useful. Maybe I should look for papers of people needing this.

That sounds good. I don't want us to spend extra effort coding/reviewing/testing/maintaining this code path if nobody is going to use it.

One other approach maybe is to always return onset, and optionally return_offset=True to get both (but no option to get just offset). WDYT?

This one matches evoked.get_peak() which has a boolean return_amplitude parameter...

that's a good reason (internal consistency) to stick with return_* boolean params. We also use those in TFR computation, e.g., return_itc.

...but I think which is better because it would leave room to add the functionality you mentioned in side.

see above, I'm not convinced that the side approach is actually useful to folks. TBD based on your lit search whether we even need to leave room for it.

Does which make sense as a param name?

I personally like return more, but either is fine!

we can't use return by itself, it's a reserved keyword in Python.

@withmywoessner
Copy link
Contributor Author

see above, I'm not convinced that the side approach is actually useful to folks. TBD based on your lit search whether we even need to leave room for it.

I browsed a couple papers and most just mentioned using ERPLAB. I believe this is not something we need to support. I thought about emailing Steven Luck, but if he doesn't get back to me. I think I'll just go ahead and proceed with supporting this:

offset gives "the first time after the peak at which the signal drops below N percent of the peak value".

and

output="pandas"|"dict"

@drammock
Copy link
Member

ok so in summary I think the proposal on the table is:

def get_erp_peak(
        tmin,
        tmax,
        picks,
        mode,     # "pos" | "neg" | "abs"
        average,  # bool, whether to compute per-channel or avg across picks
        strict,   # bool, whether to error if e.g. mode is "pos" and signal is all neg
        output,   # "pandas" | "dict" (DF with ch_name as index, or dict of namedtuples)
    ):
    pass

def get_erp_area(
        tmin,
        tmax,
        picks,
        mode,     # "pos" | "neg" | "abs" | "intg"
        average,  # bool, whether to compute per-channel or avg across picks
        output,   # "pandas" | "dict" (DF with ch_name as index, or dict of namedtuples)
):
    pass

def get_erp_frac_peak_latency(
        frac,
        which,    # "onset" | "offset"
        tmin,
        tmax,
        picks,
        mode,     # "pos" | "neg" | "abs"
        average,  # bool, whether to compute per-channel or avg across picks
        strict,   # bool, whether to error if e.g. mode is "pos" and signal is all neg
        output,   # "pandas" | "dict" (DF with ch_name as index, or dict of namedtuples)
):
    pass

def get_erp_frac_area_latency(
        frac,
        which,    # "onset" | "offset"
        tmin,
        tmax,
        picks,
        mode,     # "pos" | "neg" | "abs" | "intg"
        average,  # bool, whether to compute per-channel or avg across picks
        output,   # "pandas" | "dict" (DF with ch_name as index, or dict of namedtuples)
):
    pass

The only bit I'm still hesitating over is output. For the case of average=True both a dataframe or a dict-of-namedtuples seem weird; a scalar, tuple or namedtuple seem adequate (and importantly, I think we clearly shouldn't return a dict in that case, because it breaks unpacking assignment --- the keys will get assigned instead of the values). That makes me lean toward something like this:

  • for get_peak(..., average=True) return a tuple (or maybe namedtuple) of (amplitude, latency)
  • for get_area(..., average=True) return a scalar area
  • for get_frac_peak_latency(..., average=True) return a scalar latency
  • for get_frac_area_latency(..., average=True) return a scalar latency
  • for get_peak(..., average=False) return a tuple (or maybe namedtuple) of arrays (one for amplitude, one for latency)
  • for get_area(..., average=False) return an array of area in same order as picks
  • for get_frac_peak_latency(..., average=False) return an array of latency in same order as picks
  • for get_frac_area_latency(..., average=False) return an array of latency in same order as picks
  • for get_peak(..., average=False, pandas=True) return a DataFrame with ch_names as index and columns amplitude, latency
  • for get_area(..., average=False, pandas=True) return a Series of area with ch_names as index
  • for get_frac_peak_latency(..., average=False, pandas=True) return a Series of latency with ch_names as index
  • for get_frac_area_latency(..., average=False, pandas=True) return a Series of latency with ch_names as index

@withmywoessner WDYT about how to handle output? (note I've called the param pandas=True instead of output="pandas" just to show another possibility, not sure which is better)

@larsoner @cbrnr this might be a good time for you to weigh in too, now that the proposal is (almost) complete

@larsoner
Copy link
Member

function names get_erp_peak, get_erp_area, get_erp_fractional_peak_latency, and get_erp_fractional_area_latency. I acknowledge that the latter two are very long, but I think there's an advantage in having get_erp_ bring up all four options in people's editors/iPython terminals.

My one major comment is that going with function names get_erp_* feels like opting for a pseudo-namespace when we could just create an actual one. It seems most straightforward just to have mne.stats.erp.get_<tab> give you all the ERP measures, compared to either 1) mne.stats.get_erp_<tab>, which provides a namespace-like-tab-completion without all the other nice features of a namespace (like dedicated module and code scopes etc.), or 2) mne.stats.erp.get_erp_<tab> which is redundant. But this overall proposal "make measures for ERPs specifically" seems like a great use case for a dedicated namespace to me.

Pandas is not a hard dependency of MNE-Python, so we should try to come up with a simpler approach that is still reasonably user-friendly. If we do support pandas output, we probably need an output="pandas"|"dict" or similar.

To simplify the output stuff it seems okay to me for these functions to always return a DataFrame (or maybe a DataFrame or Series). It creates more work in some situations, but changing the return type on people also creates (different types of) work in others. We can implement it this way first -- especially if it's simpler! -- and always later add non-DataFrame if there are people who have problems installing Pandas or whatever but want this functionality... but I bet it'll be YAGNI. Things like pandas are much easier to install optionally nowadays than they were 10 years ago.

@withmywoessner
Copy link
Contributor Author

def get_erp_frac_area_latency(
        frac,
        which,    # "onset" | "offset"
        tmin,
        tmax,
        picks,
        mode,     # "pos" | "neg" | "abs" | "intg"
        average,  # bool, whether to compute per-channel or avg across picks
        output,   # "pandas" | "dict" (DF with ch_name as index, or dict of namedtuples)
):
    pass

I forget, did we conclusively decide which is better over return_offset?

for get_peak(..., average=True) return a tuple (or maybe namedtuple) of (amplitude, latency)

Tuple seems like a good choice because people might not be as familar with the syntax of a namedtuple.

for get_area(..., average=False, pandas=True) return a Series of area with ch_names as index

What is the behavior of get_area(..., average=True, pandas=True)? Does it return a scalar or pandas data structure?

for get_frac_peak_latency(..., pandas=True) return a Series of latency with ch_names as index

Can we also return amplitude for this set of parameters?

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 19, 2024

it seems okay to me for these functions to always return a DataFrame (or maybe a DataFrame or Series)

Sounds good. If we go this route maybe we can get rid of the return_* paramaters and just return all the values then the user can just select the ones they want using normal pandas operations @larsoner @drammock:

get_erp_area

Area Channel
0 0.8 fz
1 0.9 cz

get_erp_peak

amplitude Channel latency
0 1.3 fz 0.6
1 1.5 cz 0.7

get_erp_fractional_peak_latency

fractional_peak_latency_onset Channel amplitude fractional_peak_latency_offset
0 1.3 fz 56 2.3
1 1.5 cz 78 2.5

get_erp_fractional_area_latency

fractional_area_latency Channel Area
0 1.3 fz 56
1 1.5 cz 78

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 19, 2024

it seems okay to me for these functions to always return a DataFrame (or maybe a DataFrame or Series)

get_area(..., average=True) would return a series with one element, is that fine?

@drammock
Copy link
Member

I forget, did we conclusively decide which is better over return_offset?

return_offset sort of implies you'll get both (at least that's often the case with other return_* parameters elsewhere in MNE). If that's what is intended (I think it is, based on your mock-up output tables above) then return_offset is fine. I'd also consider ditching the param and always including offset; if we're always returning a dataframe then I'm less concerned about things like unpacking assignments of the return value.

for get_peak(..., average=True) return a tuple (or maybe namedtuple) of (amplitude, latency)

Tuple seems like a good choice because people might not be as familar with the syntax of a namedtuple.

@larsoner is suggesting we don't need to be so cautious with requiring pandas, so let's scrap the whole output parameter and non-pandas options.

for get_area(..., average=False, pandas=True) return a Series of area with ch_names as index

What is the behavior of get_area(..., average=True, pandas=True)? Does it return a scalar or pandas data structure?

I meant to say (not just imply) before that pandas=True is ignored when average=True. But in light of @larsoner's willingness to make pandas required, that is now moot for all cases except possibly this one; get_area(average=True) has only one return value, so maybe that should just give a scalar rather than a one-element Series 🤔

for get_frac_peak_latency(..., pandas=True) return a Series of latency with ch_names as index

Can we also return amplitude for this set of parameters?

If that seems desirable I don't object. it's information that could be derived from the value of frac and a call to get_peak() but if we're all-in on pandas then no harm in adding that column.

Side note: your mock-up output tables all still show channel name in a column; I've said a couple of times I think it's preferable to put channel name as the Series/DataFrame Index. Are you not on board with that? (if not, why not?)

@drammock
Copy link
Member

It seems most straightforward just to have mne.stats.erp.get_<tab> give you all the ERP measures

👍🏻

@withmywoessner
Copy link
Contributor Author

Side note: your mock-up output tables all still show channel name in a column; I've said a couple of times I think it's preferable to put channel name as the Series/DataFrame Index. Are you not on board with that? (if not, why not?)

I am! Sorry forget to change it to include that.

@cbrnr
Copy link
Contributor

cbrnr commented Mar 20, 2024

I don't have any detailed feedback due to limited time, but from browsing the discussion I'd like to mention the following four points:

  1. Please don't put anything in the index. Creating an actual column with channel names makes subsequent transformation much less complex and is also in line with tidy data principles.
  2. Not sure if I missed it, but in ERP analysis we're usually interested in the mean amplitude within a given time segment, and not so much in the peak amplitude. Is this part of this PR?
  3. I like the idea of creating a new module mne.stats.erp!
  4. Related to the previous point, this would then be a perfect place to include the SME (Add Standardized Measurement Error for ERP analysis #10647), or no?

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 20, 2024

I don't have any detailed feedback due to limited time, but from browsing the discussion I'd like to mention the following four points:

  1. Not sure if I missed it, but in ERP analysis we're usually interested in the mean amplitude within a given time segment, and not so much in the peak amplitude. Is this part of this PR?

I'm not sure if this was discussed in the issue or PR, but mean amplitude can be extracted by Evoked.crop(...).get_data().mean() because MNE only supports constant sampling rates so when don't need to worry about doing integration. There is already a tutorial on this, but I plan on updating it at some point to be more consistent with the other erp functions.

  1. Related to the previous point, this would then be a perfect place to include the SME (Add Standardized Measurement Error for ERP analysis #10647), or no?

I think @larsoner implied that this would be best to add as a separate PR to make it less onerous to review. But I plan on it!

@drammock
Copy link
Member

  1. Please don't put anything in the index. Creating an actual column with channel names makes subsequent transformation much less complex and is also in line with tidy data principles.

I'll acknowledge that this could be interpreted to go against the "each variable forms a column" maxim, but it is aligned with the more general principle of "structure datasets to facilitate analysis". If channel names are in the index, the resulting dataframe will be all numeric, so users can get separate stats for each channel and very easily do things like dataframe.mean() which will by default act columnwise. If the channel names are in a column, we instead get something like:

>>> import pandas as pd
>>> foo = pd.DataFrame(dict(foo=range(5), bar=list('abcde')), index=range(5))
>>> foo.mean()
TypeError: Could not convert ['abcde'] to numeric

So to me it's not clear that there's one "right" way to do it here. My proposal seems like it will save users a bit of extra effort in what is likely a very common use case... but ultimately it's a single operation to switch back and forth (.set_index("ch_name") or .reset_index(names="ch_name")) so it doesn't make a huge difference which way we go.

@cbrnr
Copy link
Contributor

cbrnr commented Mar 20, 2024

@drammock if it doesn't make a huge difference then let's please use a regular column. All other data analysis packages (including polars) do not use indexes, because operations are generally much more readable without an index (see also https://docs.pola.rs/user-guide/migration/pandas/#polars-does-not-have-a-multi-indexindex). I appreciate the use case you mentioned, and in this case it is nice to have, but for me still not worth the downsides of an index.

@cbrnr
Copy link
Contributor

cbrnr commented Mar 20, 2024

(And the added benefit of having an integer sequence as an index is that we automatically have channel numbers, which are also nice to have.)

@withmywoessner
Copy link
Contributor Author

I agree with @drammock that it would simplify things, but if you think consistency is most important @cbrnr I am fine with the channel column.

@drammock
Copy link
Member

All other data analysis packages (including polars) do not use indexes

what about pandas?! :)

But sure, as I said before I don't think it makes a huge difference, so since you feel strongly about it we can go with no index.

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 20, 2024

get_area(average=True) has only one return value, so maybe that should just give a scalar rather than a one-element Series 🤔

I think this is the last thing that needs to be addressed @drammock. Just a thought, but could we also have tmin and tmax included in the dataframe. This may be useful when someone wants to find several areas of an erp. This just saves a bit of time as you don't have to add the column yourself.

@drammock
Copy link
Member

get_area(average=True) has only one return value, so maybe that should just give a scalar rather than a one-element Series 🤔

eh, let's be consistent and return a Series there too. If I had to guess, the problems from expecting a Pandas object and not getting one will be more annoying than the problems from having a length-one Series when you expect a scalar.

could we also have tmin and tmax included in the dataframe

that feels like a slippery slope (why not mode too)? I'd prefer to only include values computed by the functions (plus ch_name, because that will nearly always be wanted).
If the user needs tmin/tmax it it's very short to add (df['tmin'] = tmin; df['tmax'] = tmax)

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 20, 2024

eh, let's be consistent and return a Series there too. If I had to guess, the problems from expecting a Pandas object and not getting one will be more annoying than the problems from having a length-one Series when you expect a scalar.

Okay! Sounds good! I think I will begin implementing everything if there is nothing else. Thanks everyone for the help!

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 20, 2024

Another thing @drammock, do you know what the purpose of merge_grads in Evoked.get_peak() I don't know much about MEG analysis.

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 21, 2024

The behavior of Evoked.get_peak(...) is currently that if the mode is 'pos' or 'neg' and there are no positive or negative values it returns an error:
No positive values encountered. Cannot " "operate in pos mode.
No negative values encountered. Cannot " "operate in neg mode.

because we are now passing multiple channels of which a few may not contain positive or negative values it may be annoying to raise an error. That is why I am thinking it is best to return NaN in the pandas dataframe for the corresponding channels and then display a warning that specifies the channels without positive or negative values.

Is this fine with everyone?

@cbrnr
Copy link
Contributor

cbrnr commented Mar 21, 2024

eh, let's be consistent and return a Series there too. If I had to guess, the problems from expecting a Pandas object and not getting one will be more annoying than the problems from having a length-one Series when you expect a scalar.

I disagree. We cannot expect our users to be familiar with pandas, so they will expect a single number for the average area and not a single-element pandas series. I don't think it's a good idea to be consistent with respect to the return type, I'd rather be consistent with users' expectations.

@withmywoessner
Copy link
Contributor Author

withmywoessner commented Mar 21, 2024

I disagree. We cannot expect our users to be familiar with pandas, so they will expect a single number for the average area and not a single-element pandas series. I don't think it's a good idea to be consistent with respect to the return type, I'd rather be consistent with users' expectations.

I don't know if I agree with this. I find it very unlikely that users would be doing erp analysis that only required average area so if they're going to be exposed to pandas anyway we might as well keep it consistent.

I may have said it previously, but when one is examining an erp component it is often the case that you are interested in finding several areas per erp (200-300 ms, 450-550 ms...) and pandas operations seem easier to work with when you have several subjects each with several areas and would like to combine their data in some way.

@cbrnr
Copy link
Contributor

cbrnr commented Mar 21, 2024

OK, let me put it another way. If the function returns a scalar via average=True, I would expect to get a scalar and not a pd.Series with a single entry. This has nothing to do with experience with pandas, which I think I have some.

@drammock
Copy link
Member

drammock commented Mar 21, 2024

We cannot expect our users to be familiar with pandas

if that is true, they won't know how to handle the output of any of the other functions (or the output of get_area(..., average=False). So they'll have to learn.

I don't think it's a good idea to be consistent with respect to the return type, I'd rather be consistent with users' expectations.

How do we know what those expectations are? You're citing your own introspection as evidence of what users in general will expect. I don't think that's valid. If the docstring says "returns a pd.Series if average=True and a pd.Dataframe if average=False" then any user who reads the docs will have the right expectations. We can also easily add a note saying "in the case of average=True, if you just want the scalar, do result.item()". But as I indicated earlier, I think in many cases a one-element series will cause fewer problems than a scalar:

import pandas as pd

area_scalar = 5.1
area_series = pd.Series([5.1], index=["area"])
peak_series = pd.Series([3.0, 0.98], index=["amplitude", "latency"])

2 * area_series + 1  # arithmetic works
pd.concat([area_series, peak_series])  # combining with output from other funcs works

pd.concat([area_scalar, peak_series])  # TypeError:
# cannot concatenate object of type '<class 'float'>'; only Series and DataFrame objs are valid

...so I think many users won't even need to call .item()

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants