Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

separation average #271

Open
adematti opened this issue Jan 30, 2022 · 7 comments
Open

separation average #271

adematti opened this issue Jan 30, 2022 · 7 comments

Comments

@adematti
Copy link

adematti commented Jan 30, 2022

Currently the separation average in each bin (thetaavg, ravg, savg, rpavg) is computed without taking weights into account.
I think, for usual purposes, we would like to include weights (when provided) in this computation?
Typically, we would expect the same average separation, whether a particle is weighted by 0 or is totally removed from the catalog.
It would always be possible, for someone truly interested in the non-weighted separation average, to run the computation without weights.
What do you think?

Best

@lgarrison
Copy link
Collaborator

Yes, I think you're right! Everything should be computed using the weights. Would you like to open a PR?

@manodeep
Copy link
Owner

@adematti Do you need the weighted average separations for your Science case or is this more of a consistency/implementation choice question?

@lgarrison We had a long discussion about this when you first started implementing the weights within Corrfunc. We came to the conclusion that it was better to leave the average separations as is - since that would be the most likely use-case. And that if any user required a weighted separation, then they could "easily" add a custom weight function and keep track of the weighted mean separation.

@adematti
Copy link
Author

adematti commented Jan 30, 2022

@manodeep well, I'd think the weighted version would be the most intuitive convention for the separation average ; this is the definition we will use in DESI --- though we will most certainly make sure cosmological results do not depend on this choice.
In case one'd not want the weighted-average version, it would be straightforward to recompute the pair counts without weights.

Anyway, it's just fine to have this change only in the branch we are currently using in DESI :)

@lgarrison
Copy link
Collaborator

@manodeep, thanks, I had forgotten our discussion! It's coming back to me now. However, since adding the weights over 5 years ago (!), I think I've changed my mind on the issue. It seems to me that assigning a particle a weight of 2 ought to behave as if that particle appeared twice in the input data, at least if the user has specified the pair_product weighting scheme.

Or, the more flexible thing would be to add a new column to the output called weighted_ravg, so that the user has access to both the unweighted and weighted version, just like they have access both npairs and weightavg. But that's more work than just making ravg weighted.

My vote would be to make ravg weighted in Corrfunc proper, but it's also not a tiny amount of work. So if you would like to keep Corrfunc as-is @manodeep or if @adematti you don't have the spare effort for a PR right now, then keeping the change in your fork is fine by me too.

@manodeep
Copy link
Owner

manodeep commented Feb 2, 2022

Alright great - as always I prefer to add a user-option that allows either weighted or unweighted average separations. Would the plan be the same - provide the default PAIR_PRODUCT weighted average for separations? And anything else the user has to code those up?

Regardless, if we do change the behaviour, we should try to keep backwards compatibility by default. That way this is not a breaking change but rather a functionality update.

What do y'all think?

@lgarrison
Copy link
Collaborator

Yeah, we already have an output_ravg field that we normally set to True, but we could allow the user to set that to the string 'weighted' instead. Then the average separation would just inherit whatever weighting scheme the user is using.

@adematti
Copy link
Author

adematti commented Feb 3, 2022

To me this seems like a good idea.
I had weighted average separation implemented upon the current master version in https://github.com/adematti/Corrfunc/tree/separation-avg. Yet, this branch does not allow for the option that you suggest for output_ravg and some things would need to be modified if #258 is accepted.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants