Proposal for common data format #12

Open
dfm opened this issue Jun 12, 2013 · 9 comments

Comments

@dfm
Member

dfm commented Jun 12, 2013

We need a way of comparing the output of all of the detrending algorithms. This will probably involve, among other things:

  • running a standardized search algorithm on all the different outputs
  • visualizing the results of the different methods in the same way/simultaneously
  • other things?

Anything that we do will benefit from a common output format for the codes (for obvious reasons).

I see 2 main options:

  1. ASCII tables (gasp!) with specified columns (kbjd, detrended_flux, detrended_flux_uncert, ...)
  2. FITS tables with the same format as the original Kepler data products (including the relevant metadata) with added columns with the same information as above

The first option is far easier to implement in any programming language (lowering the barrier to entry), so I'm inclined to go with that. The second one seems more useful (and self-contained) for the search phase, though, depending on what we decide to do.
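To make option 1 concrete, here is a minimal sketch of what writing and reading such a table could look like. Only the three column names come from the proposal above; the whitespace-delimited layout, the `#`-prefixed header line, and the function names `write_table` and `read_table` are assumptions for illustration, not an agreed format.

```python
import io

# Column names taken from the proposal; the layout itself is an assumption.
COLUMNS = ("kbjd", "detrended_flux", "detrended_flux_uncert")


def write_table(fh, rows):
    """Write rows of (kbjd, flux, uncertainty) as whitespace-delimited text."""
    fh.write("# " + " ".join(COLUMNS) + "\n")
    for row in rows:
        # repr() of a float round-trips exactly, so no precision is lost.
        fh.write(" ".join(repr(float(value)) for value in row) + "\n")


def read_table(fh):
    """Read the table back into a dict mapping column name to a list of floats."""
    names = tuple(fh.readline().lstrip("#").split())
    if names != COLUMNS:
        raise ValueError(f"unexpected header: {names}")
    table = {name: [] for name in names}
    for line in fh:
        fields = line.split()
        if not fields or fields[0].startswith("#"):
            continue  # skip blank lines and comment lines
        for name, field in zip(names, fields):
            table[name].append(float(field))
    return table


# Round trip through an in-memory buffer instead of a real file.
buffer = io.StringIO()
write_table(buffer, [(131.5, 1.0, 0.25), (131.52, 0.999, 0.25)])
buffer.seek(0)
example = read_table(buffer)
```

A format this simple can be produced from almost any language with a formatted print loop, which is the main argument for it.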

Thoughts?

@rwolpert
Member

Is there a way to take the ASCII option, plus an auxiliary function or Perl script to convert it to FITS?
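A rough sketch of such an auxiliary converter, in Python rather than Perl, might look like the following. It assumes a whitespace-delimited table whose first line is a `#`-prefixed list of column names, and it assumes `astropy` and `numpy` are installed; the function names `read_columns` and `ascii_to_fits` are hypothetical. It writes a bare binary table only and does not reproduce the headers or metadata of the original Kepler data products.

```python
def read_columns(path):
    """Parse a table whose first line is '# name1 name2 ...' into lists of floats."""
    with open(path) as fh:
        names = fh.readline().lstrip("#").split()
        columns = {name: [] for name in names}
        for line in fh:
            fields = line.split()
            if not fields or fields[0].startswith("#"):
                continue  # skip blank lines and comment lines
            for name, field in zip(names, fields):
                columns[name].append(float(field))
    return columns


def ascii_to_fits(ascii_path, fits_path):
    """Write the parsed columns to a FITS binary table extension."""
    # Imported here so that read_columns() stays usable without astropy.
    import numpy as np
    from astropy.io import fits

    columns = read_columns(ascii_path)
    hdu = fits.BinTableHDU.from_columns(
        [
            # 'D' is the FITS format code for a double-precision float.
            fits.Column(name=name, format="D", array=np.asarray(values))
            for name, values in columns.items()
        ]
    )
    hdu.writeto(fits_path, overwrite=True)
```

Calling `ascii_to_fits("star.txt", "star.fits")` (file names made up here) would then produce the FITS version of a single light curve.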

@pdbaines
Member

ASCII! 👍

@dfm
Member Author

dfm commented Jun 12, 2013

Yeah, that's a good idea! Sounds like ASCII + an auxiliary script is the way to go.

What columns do we need? I mentioned kbjd, detrended_flux, and detrended_flux_uncert above. Any others?

@eford
Member

eford commented Jun 13, 2013

What about an integer flag, where 0 = this point was used for calculating
the detrending, and non-zero values provide information about why a point
was excluded?
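One way such a flag column could work is sketched below. Only the convention "0 means used, non-zero means excluded" comes from the suggestion above; treating the value as a bitmask, and the specific reasons `OUTLIER`, `GAP`, and `BAD_QUALITY`, are hypothetical choices made up for illustration.

```python
from enum import IntFlag


class PointFlag(IntFlag):
    """Hypothetical exclusion reasons; 0 means the point was used."""

    USED = 0
    OUTLIER = 1       # e.g. rejected by sigma clipping
    GAP = 2           # e.g. too close to a data gap
    BAD_QUALITY = 4   # e.g. flagged upstream as unreliable


def describe(flag_value):
    """Return the list of reasons encoded in one integer flag value."""
    flag = PointFlag(flag_value)
    if flag == PointFlag.USED:
        return ["used"]
    # Skip the zero-valued member; keep every single-bit reason that is set.
    return [m.name.lower() for m in PointFlag if m.value and m in flag]


def used_mask(flag_values):
    """True for every point that went into the detrending."""
    return [value == 0 for value in flag_values]
```

With a bitmask, one point can carry several reasons at once (3 would mean both `OUTLIER` and `GAP`), while a search code that only cares about usable points just tests for zero.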

@aprsa
Member

aprsa commented Jun 13, 2013

We need a way of comparing the output of all of the detrending
algorithms. This will (probably) involve something including but not
limited to:
[snip]

Do we have a list of sandbox KICs somewhere on git? If so, I can run
our detrender on them tonight/tomorrow.

Cheers,
Andrej

@jessielchristiansen
Member

Yes, they are in the detrending/documents directory: a bunch of text files containing all the KICs in the skygroups, all the TCEs identified by the pipeline in Q1-Q12 in the skygroups, three quiet 12th magnitude G stars, three bright variable stars, and a couple of other poster children (e.g. Kepler-37).

@benmontet
Member

Have we decided what our test suite for comparing algorithms will be?
If not, I propose we use at least a subset of the variable stars, with
a selection from each sky group.

@jessielchristiansen
Member

Depends on how your detrending algorithms work. If they need the ensemble of stars to identify common modes, then I would use the whole set of KICs in each skygroup. Otherwise hit up the variable stars!

@dfm
Member Author

dfm commented Jun 13, 2013

The sandbox data are here and an example output is here.
