Epoch folding with large datasets #805

Open
rodrigruiz opened this issue Feb 22, 2024 · 4 comments

Comments

@rodrigruiz

Hi,
We are using Stingray to analyse time data, and we are interested in performing an epoch folding search on a large dataset (much larger than fits in the computer's RAM). Our strategy would be to split the dataset into smaller subsets, fold each of them using the fold_events function, sum the folded profiles, and then perform the analysis on the summed profile, as is done in the epoch_folding_search function.
However, we think that this is not possible because some of the functions called from within epoch_folding_search are not public (for example, _folding_search or _profile_fast).
A solution would be to make these functions public, but maybe there is a better strategy for performing this analysis without modifying the Stingray software that we haven't thought of. Any advice from your side would be much appreciated.
Thanks!
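
A minimal sketch of the split, fold, and sum strategy described above. It uses plain NumPy rather than Stingray's own fold_events, so the helper names (`fold_chunk`, `folded_profile_in_chunks`, `ef_statistic`) and the simulated event times are assumptions made for illustration only. The point it shows is that folded profiles are additive, provided every chunk is folded with the same trial frequency and the same reference time:

```python
import numpy as np


def fold_chunk(times, frequency, nbin=16, ref_time=0.0):
    """Fold one chunk of event times; return the counts per phase bin."""
    phases = ((times - ref_time) * frequency) % 1.0
    profile, _ = np.histogram(phases, bins=nbin, range=(0.0, 1.0))
    return profile


def folded_profile_in_chunks(chunks, frequency, nbin=16, ref_time=0.0):
    """Sum the folded profiles of all chunks (same frequency and ref_time)."""
    total = np.zeros(nbin, dtype=np.int64)
    for chunk in chunks:
        total += fold_chunk(np.asarray(chunk, dtype=float), frequency, nbin, ref_time)
    return total


def ef_statistic(profile):
    """Chi-squared of a profile against a flat one, assuming Poisson errors."""
    mean = profile.mean()
    return np.sum((profile - mean) ** 2 / mean)


# Simulated event times (illustrative data, not from the issue)
rng = np.random.default_rng(0)
times = np.sort(rng.uniform(0.0, 1000.0, 50_000))
frequency = 0.789  # hypothetical trial frequency in Hz

whole = fold_chunk(times, frequency)
chunked = folded_profile_in_chunks(np.array_split(times, 7), frequency)
stat = ef_statistic(chunked)
```

In a real search the loop over chunks would sit inside a loop over trial frequencies, with one accumulated profile per frequency, so only one chunk needs to be in memory at a time.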

@matteobachetti
Member

Hi @rodrigruiz, thanks for the Issue.
First of all... having private functions doesn't mean they can't be used. You can import them and use them without issue! Being private usually means that they serve a very specific role (like, optimize a minor part of a computation which would make the code unreadable if in the user-facing function), and they are less documented and more prone to interface changes over time. But if you know what you're doing, importing private functions to solve specific tasks is something that can be done. I do that all the time :).
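
To illustrate the point about private names in general Python terms, here is a small sketch. The module `demo_private_mod` and its functions are hypothetical stand-ins written to a temporary directory, not part of Stingray. A leading underscore is only a naming convention: an explicit import works, and only `from module import *` skips such names (when the module defines no `__all__`):

```python
import pathlib
import sys
import tempfile

# Hypothetical stand-in for a library module with one private helper
source = '''
def _scale(values, factor):
    return [factor * v for v in values]


def double(values):
    return _scale(values, 2)
'''
module_dir = pathlib.Path(tempfile.mkdtemp())
(module_dir / "demo_private_mod.py").write_text(source)
sys.path.insert(0, str(module_dir))

# Importing the private name explicitly works like any other import
from demo_private_mod import _scale

tripled = _scale([1, 2, 3], 3)

# A star import, by contrast, leaves underscore-prefixed names out
star_namespace = {}
exec("from demo_private_mod import *", star_namespace)
star_has_private = "_scale" in star_namespace
star_has_public = "double" in star_namespace
```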

Another thing you might try for your specific need is passing the data with memory mapped arrays. I never tried those particular functions with memory maps, but they should work.
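
A rough sketch of what folding from a memory-mapped array could look like. It uses NumPy only, since whether the Stingray functions accept memory maps is untested, as noted above. The file name, the raw float64 layout, the trial frequency, and the chunk size are all assumptions for illustration:

```python
import os
import tempfile

import numpy as np

# Write a hypothetical file of event times as raw float64 values
path = os.path.join(tempfile.mkdtemp(), "event_times.dat")
rng = np.random.default_rng(1)
n_events = 100_000
writer = np.memmap(path, dtype=np.float64, mode="w+", shape=(n_events,))
writer[:] = np.sort(rng.uniform(0.0, 1000.0, n_events))
writer.flush()
del writer

# Re-open read-only; the 1-D shape is inferred from the file size
events = np.memmap(path, dtype=np.float64, mode="r")

frequency = 1.2345  # hypothetical trial frequency in Hz
nbin = 32
chunk_size = 10_000

profile = np.zeros(nbin, dtype=np.int64)
for start in range(0, events.size, chunk_size):
    # Slicing the memmap touches only this part of the file
    block = np.asarray(events[start:start + chunk_size])
    phases = (block * frequency) % 1.0
    profile += np.histogram(phases, bins=nbin, range=(0.0, 1.0))[0]
```

Only the temporary arrays for one block (`phases` and the per-block histogram) are held in memory at once; the operating system pages the event times in from disk as each slice is read.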

Note also that HENDRICS has an algorithm for fast frequency/fdot searches (it's called the quasi fast-folding algorithm). It can be used through HENzsearch or directly from the API.

@rodrigruiz
Author

Hi @matteobachetti , thanks so much for the detailed answer and the advice!
For now we will try using the private functions, as this is the easiest solution for us given our time constraints. The other solutions that you propose sound interesting as well, and since we plan to continue doing time series analyses, I will definitely look into memory-mapped arrays and the alternative algorithms. But for now we will focus on understanding the details of epoch folding as it is implemented in Stingray.

@matteobachetti
Member

BTW, if you think that a private function should really be made public because it's useful on its own, feel free to let us know!

@rodrigruiz
Author

Sure, thanks!
So far, the plan to use these private functions seems to be working out well. I will come back in two or three weeks and let you know exactly what we used and how it worked out.

2 participants