Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Porting the line ID tool from tardisanalysis #1890

Open
wants to merge 1 commit into
base: master
Choose a base branch
from

Conversation

jamesgillanders
Copy link
Member

This PR moves the tardis_line_id tool across from tardisanalysis to the main TARDIS repository. The TARDIS line ID tool was originally written by @unoebauer.

Description
Code has been updated to work with the newest version of TARDIS, but behaves as before.

The tool can be used to extract a list of the N most prominent transitions (default has N = 20) within a user-specified wavelength range (default is the entire wavelength range of the simulation). This can be particularly helpful for diagnosing the species with the largest contribution within a wavelength range, which perhaps has a blend of different absorption features.

Motivation and context
This PR simply relocates the old tool, and makes it compatible with the new version of TARDIS.

How has this been tested?

  • Testing pipeline.
  • Other. Ran the tardis_example.yml on the old and new versions. Output plots were similar.

Examples
Here is the output plot from the old tool:

test_line_ID_K2.pdf

And here is the new version:

test_output.pdf

Type of change

  • Bug fix.
  • New feature.
  • Breaking change.
  • None of the above.

Checklist

  • My change requires a change to the documentation.
    • I have updated the documentation accordingly.
    • (optional) I have built the documentation on my fork following the instructions.
  • I have assigned and requested two reviewers for this pull request.

@tardis-bot
Copy link
Contributor

Before a pull request is accepted, it must meet the following criteria:

  • Is the necessary information provided?
  • Is this a duplicate PR?
    • If a new PR is clearly a duplicate, ask how this PR is different from the original PR?
    • If this PR is about to be merged, close the original PR with a link to this new PR that solved the issue.
  • Does it pass existing tests and are new tests provided if required?
    • The test coverage should not decrease, and for new features should be close to 100%.
  • Is the code tidy?
    • No unnecessary print lines or code comments.

@jamesgillanders
Copy link
Member Author

I moved this tool and the opacities.py tool to a folder titled 'analysis_tools' - that's why 2 new files are appearing instead of 1

@codecov
Copy link

codecov bot commented Feb 7, 2022

Codecov Report

Merging #1890 (d73f310) into master (c705946) will not change coverage.
The diff coverage is n/a.

Impacted file tree graph

@@           Coverage Diff           @@
##           master    #1890   +/-   ##
=======================================
  Coverage   62.55%   62.55%           
=======================================
  Files          68       68           
  Lines        7127     7127           
=======================================
  Hits         4458     4458           
  Misses       2669     2669           

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update c705946...d73f310. Read the comment docs.

Copy link
Member

@jaladh-singhal jaladh-singhal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much @jamesgillanders - this looks like a nice addition. I've few comments - let me know if you need help with any of it.

Besides, could you please create a notebook (or rst file) to demonstrate the use of this tool so that it shows up on docs. Having unit-tests will also be nice but I think we might wanna save them for another PR.

Comment on lines +238 to +245
f = open(output_filename, "w")
f.write(
f"# Line Transitions in Wavelength Range "
f"{self.lam_min.value:.1f} - {self.lam_max.value:.1f}"
f" Angstroms\n"
)
dataframe.to_csv(f, sep="\t", index=False)
f.close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
f = open(output_filename, "w")
f.write(
f"# Line Transitions in Wavelength Range "
f"{self.lam_min.value:.1f} - {self.lam_max.value:.1f}"
f" Angstroms\n"
)
dataframe.to_csv(f, sep="\t", index=False)
f.close()
with open(output_filename, "w") as f:
f.write(
f"# Line Transitions in Wavelength Range "
f"{self.lam_min.value:.1f} - {self.lam_max.value:.1f}"
f" Angstroms\n"
)
dataframe.to_csv(f, sep="\t", index=False)

It's cleaner to do file operations this way. It ensures auto-closing.

Comment on lines +260 to +263
dataframe_collapsed = dataframe.groupby(["Species"]).sum()
dataframe_collapsed = dataframe_collapsed.sort_values(
by=["Total no. of transitions"], ascending=False
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since this reuses dataframe that we have calculated already in previous if block, let's put it in above block after L245.

Comment on lines +265 to +274
f = open(
f"{output_filename[:-4]}_collapsed{output_filename[-4:]}", "w"
)
f.write(
f"# Line Transitions in Wavelength Range "
f"{self.lam_min.value:.1f} - {self.lam_max.value:.1f}"
f" Angstroms\n"
)
dataframe_collapsed.to_csv(f, sep="\t", index=True)
f.close()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will move above as well. While doing so, can you please change it to with syntax like I suggested in earlier comment?

self.lam_max = lam_max

def plot_summary(
self, nlines=None, lam_min=None, lam_max=None, output_filename=None
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IMO we should separate out line contributions file writing from plot_summary() by creating another method like save_summary(). It's not clear to me why it has to be done in plotting function - let me know if I'm missing something

self._lines_ids = None
self._lines_ids_unique = None

@property
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you please add docstring to this non-private property (an others below it) so that docstring coverage pipeline passes?

def plot_summary(
self, nlines=None, lam_min=None, lam_max=None, output_filename=None
):

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A docstring should be added here as well, not only because of pipeline but because it's most important function of this class.

from tardis.visualization.tools.sdec_plot import SDECData


class line_identifier:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class line_identifier:
class LineIdentifier(object):

blatant PEP 8 violation 🤣

logger = logging.getLogger(__name__)


class opacity_calculator(object):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
class opacity_calculator(object):
class OpacityCalculator:

@jamesgillanders
Copy link
Member Author

I will revisit this soon and try to get it to a merge-able state

@andrewfullard
Copy link
Contributor

@jamesgillanders can this still be fixed up to merge?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants