Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Addition of peak_list definition #1255

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

vvilhelmus
Copy link
Contributor

Greetings again after a long & tough Summer. 👋
Instead of coding the requested compute_hwhm_at_max_power in the Perror #1221 thread, I instead coded the peak_list table definition.
The reason for this is, because the SciPy's find_peaks function does not compute the HWHM at max_power: it instead computes the HWHM of the Prominence, which is a different value than the max_power. Therefore I foresaw potential problems for future users trying to plot the HWHMs in graphs, hence why I coded the peak_list table instead.
Also I have made the requested View-option invisible and auto-detectable instead, so the peak_list table should now work automatically with both Box Least Square and Lomb Scargle periodograms.
I did include a powerlimit-option, which is basically the lower (and as an array also the upper) threshold of peak detection.

I am now going to try to figure out how to write a tutorial with at the moment the preliminary title "Comparison and Combination of TESS data with Kepler data", where the peak_list function will be shown off how it properly works and can be used to its maximal potential. No idea when that will be finished, because I have never done something like that before, but looking forward to any of your feedback in the mean time. 😁

@vvilhelmus vvilhelmus mentioned this pull request Nov 22, 2022
@vvilhelmus
Copy link
Contributor Author

Okay, I have completed and uploaded the "Comparing and Combining of TESS data with Kepler data" tutorial, which took longer to create than expected. This is the tutorial I had in mind when requested and I hope, when going through my tutorial, you will understand why I needed the uncertainties in the periodicity. I also wanted those periodicity uncertainties for my M.Sc. project last year, but was forced to skip them due to time constraints. I tried to stuff as much Python things (which I had learned last year and thought were missing in your current tutorials) into this tutorial, while also trying to keep it as brief as possible. I also wrote this tutorial with citizen scientists in mind, who are new to the field, so there is a lot of hand holding, hopefully you are not bothered by this.

When uploading the docs/source/reference/api/lightkurve.periodogram.Periodogram.peak_list.rst file I received the message that the docs/source/reference/api path is ignored by one of my .gitignore files, so I skipped adding that file. If I need to upload this file too, please let me know. I think I can force it by adding '-f' after the 'git add' command.

I also see now the initial peak_list definition addition upload one month ago got an error message here too and I am not sure why this has happened here, when for the Perror definition there were no issues at all. It seems to have fixed it by itself now... weird! 🤷

Happy Thanksgiving wishes from the Netherlands everyone! 🦃

@orionlee
Copy link
Collaborator

@vvilhelmus The function could be handy! I think adding it would be helpful, and is in line with Lightkurve's spirit. But it'd be great if some domain experts could review / comment on the PR.

Here are some high level feedback. Detailed feedback (more on styles, API consistency, etc.) is added inline.

  1. Consider to add a FWHM column as well for convenience.
  2. Lower HWHM is reported as negative. It surprises me. It seems to break the typical convention for reporting lower error limit (as positive number), unless there is some special convention for HWHM that I am not aware.
  3. Prominence column: what it conveys seems to be similar to what SDE (in BLS literature) does. I don't know if it might confuse people familiar with the domain. It is probably okay, given it is documented in scipy, but it's probably better to hear feedbacks from some domain experts.

The docs/source/reference/api/lightkurve.periodogram.Periodogram.peak_list.rst is generated by doc build system.
To add the function to the documentation, one should add it to docs/source/reference/periodogram.rst . The documentation system will generate the correspond doc based on the docstring.

Copy link
Collaborator

@orionlee orionlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Feedback on miscellaneous style / API consistency.

@@ -507,6 +508,67 @@ def __div__(self, other):
def __rdiv__(self, other):
return self.__rtruediv__(other)

def peak_list(self, powerlimit=None):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

naming: given this is a wrapper over find_peaks(), why not name it as find_peaks() ?
find_peaks() also has the advantage of following the convention that the methods usually start with verbs.

uhwhm_period = uhwhm_int_down + uhwhm_int_remainder * (uhwhm_int_up - uhwhm_int_down)
uhwhm = uhwhm_period - view[peaks]
result = Table(data=[stats['peak_heights'], view[peaks], stats['prominences'], lhwhm, uhwhm],
names=('Power', x, 'Prominence', 'Lower HWHM', 'Upper HWHM'))
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

in Lightkurve, column names for table like structures are usually lower cased without spaces.
So consider rename the column from Power Ratio to power_ratio, lower_hwfm, etc.

result['Lower HWHM'].format = '.5g'
result['Upper HWHM'].format = '+.5g'
result[x+' Ratio'].format = '.3f'
result[x+' Ratio'].unit = 'top peak:peak'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

top peak:peak is a (helpful, clarifying) description, but is not a unit per se. It should be
None or astropy.unit.dimensionless_unscaled .

result.sort('Prominence',
reverse=True)
result[x+' Ratio'] = result[x][0] / result[x]
result['Power'].format = y
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the unit of Power should be dented as well:

    result["power"].unit = self.power.unit

uhwhm_int_remainder = stats["right_ips"] - np.floor(stats["right_ips"])
uhwhm_period = uhwhm_int_down + uhwhm_int_remainder * (uhwhm_int_up - uhwhm_int_down)
uhwhm = uhwhm_period - view[peaks]
result = Table(data=[stats['peak_heights'], view[peaks], stats['prominences'], lhwhm, uhwhm],
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider to use QTable instead of Table and make the columns to be "natively" unit-aware by using Quantity rather than plain Column . This will make the resulting table's API consistent with the majority of Lightcurve API / table-like data structures.

if self.default_view == 'period':
view = self.period
x = 'Periodicity'
y = '.1f'
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

consider rename y to be something more meaningful, e.g. power_format

@vvilhelmus
Copy link
Contributor Author

Thank you for reviewing my code, Sam @orionlee and my apologies for all the rookie mistakes I made (I haven't been schooled as a software developer).
I agree with all the changes and next week I will try to start implementing them along with some other minor changes I wanted to make myself for a while now.
The reason I did not use QTable is because the entire periodogram.py file was not using it. Should other periodogram functions, which use the Table functionality, also be upgraded from Table to QTable to reduce bloat? For now I will only stick to my own definition.

Copy link
Collaborator

@orionlee orionlee left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to handle the case power has units.

x = 'Frequency'
y = None
if powerlimit is None:
powerlimit = float(self.max_power)/20
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This line raised Error if the periodgram'x max_power has units such as electron/s (e.g., if the source lightcurve is not normalized).

@vvilhelmus
Copy link
Contributor Author

Okay, I hope I have fixed the bug by changing float(self.max_power) into self.max_power.value.
Also I have implemented all the suggestions and everything seems to be working on my end, but that's not saying much, because I am already seeing now that "Some checks were not successful" again... 😞

I also tried to find a better way to format the values in the table. because at the moment I am setting the format of every table column individually, wasting like 5 lines of code, which I would like to replace with 1 line of code into something in the spirit of:
QTable(data=bladibla, name=bladibla, formats=bladibla)
or
result.info.format = ['bla', 'di', 'bla']
However, there is no reference of this kind of modification existing in neither the AstroPy documentation nor the often helpful stackoverflow webpages. If you know how this can be done, could you please lemme know? I have been searching for this solution to this minor problem for days!

I haven't modified my tutorial yet by the way. Two weeks ago I was watching Becky Smethurst's Monthly Night Sky News and discovered I made a major mistake in my tutorial with the brightness/faintness calculation by thinking that the log scale of the difference in apparent magnitude is 10^(dm_V), when the right way to calculate that is actually: 10^(0.4*dm_V). Thank you for showing how it is done correctly, Dr. Becky, you are the greatest! Anyway, the tutorial should also be adapted for working with QTables now, and I also want to add two more periodogram graphs showing how to label the peaks with the ratios from the find_peaks QTable, and most likely there will also be other edits, which I am going to attempt to start tomorrow. I have no idea how long it will take though, but it shouldn't take two months... I hope not at least... 😟

@vvilhelmus
Copy link
Contributor Author

Hello again Sam @orionlee
I have now recoded and rewritten my tutorial and during that revision, it suddenly occurred to me where in my find_peaks definition my coding was going wrong. I was using quantities elsewhere in my find_peaks computation instead of values, so I added some more .value attributes and now I have green check marks all over the board, huzza, farewell red cross! 🎉 🥳

I have also made some more changes:
In the find_peaks QTable I have swapped the 'power' column with the 'periodicity' column, because it makes more sense for the x-axis values to be in front of the y-axis values in that QTable. My revised tutorial is using now QTables instead of the old Tables, which made things a lot more simpler. I have also added two more plots explaining my find_peaks function more visually, and I have applied the correct apparent magnitude calculations in my writing.
There are probably still mistakes in my tutorial, which I might have overlooked. Please lemme know if you see any.

Who are those domain experts you mentioned, who can review / comment on this PR by the way?
Could you perhaps @ them, so they will be aware of this thread, or would that be impropriate?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants