Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Saturation step should increase flagging for group 3 saturation #8413

Open
stscijgbot-jp opened this issue Apr 4, 2024 · 26 comments · May be fixed by #8499
Open

Saturation step should increase flagging for group 3 saturation #8413

stscijgbot-jp opened this issue Apr 4, 2024 · 26 comments · May be fixed by #8499

Comments

@stscijgbot-jp
Copy link
Collaborator

stscijgbot-jp commented Apr 4, 2024

Issue JP-3593 was created on JIRA by David Law:

Edited based on discussion thread below.

Examination of a few extremely-large outliers in PID 1908 (NIRSpec IFU) indicates that these are being caused by cosmic rays that are not being properly flagged in the jump step.

In this case, ramps are 22 groups long.  Since the data is grouped though (nframes/group=5) cosmic rays that cause saturation in group 3 are also significantly biasing group 2 half-way to saturation.  Since all other groups are saturated the pipeline assumes that groups 1 and 2 are good, resulting in extremely bright ramps.

This isn't as much of a problem when saturation is first flagged in group 4 or greater, as there are multiple differences with which to find and reject the abnormally large slope.  Likewise, it's not an issue when saturation is first flagged in group 2, as this simply results in no slope at all.  A solution can thus concentrate specifically on the case of saturation in group 3.

At the same time, we want to be careful not to introduce excessive flagging when saturation occurs due to normal counts received from a source, as this would decrease the effective dynamic range of some observations.

The proposed solution is therefore to modify the saturation step.  For any data obtained with nframes/group > 1, look for pixels for which SATURATION is set in group 3.  For each of these pixels, look at the counts in group 1.  If the count value in group 1 is less than 1/10 of the value that would be required to trigger SATURATION (i.e., it isn't obviously a very bright source), then also set the SATURATION flag for this pixel in group 2.

Once this has been coded it will be necessary for testing to occur on both NIRSpec IFU data to ensure that it fixes the original problem, and on NIRISS/NIRCAM data to ensure that it doesn't produce any unintended consequences.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Tagging Melanie Clarke , Bryan Hilbert , and Rachel Plesha to be aware and chime in if necessary.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

Instead of having the jump step add saturation flagging to group N-1 when NFRAMES>1, perhaps it would be cleaner and more obvious to have that done already in the saturation step?

@stscijgbot-jp stscijgbot-jp changed the title Jump step should increase flagging for group 3 saturation Saturation step should increase flagging for group 3 saturation Apr 5, 2024
@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Thanks Howard Bushouse ; I agree that makes more sense.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Adding a crude ds9 plot just to make the point for completeness; hacking this change into the pipeline as a post-hook indicates that it should clean up spectra substantially.  spec.png left panel is the spectrum in a sky region in the old cube, right is the new cube with this saturation flag change.  Previously ultra-bright artifacts messed up the ability to see the spectrum in default scaling, whereas the fixed version is substantially improved.  Still some gains to be had, but this should hit the tallest poles.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Also tagging Jane Morrison to be aware of this.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

The description given above for the desired change suggests that any time NFRAMES>1 the group before the one that actually crosses the defined saturation threshold should also be flagged as saturated. Just wondering if there would be any advantage to making this dependent on the exact value of NFRAMES. For example, I can imagine that if NFRAMES is only a few (2-4), then a hit in one of the latter frames could have a much larger impact on the frame-averaged group value, than if NFRAMES were much larger (say > 8). Thoughts?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

I think we want to do the flagging whenever NFRAMES > 1.  Even if there are 8 frames, the slope would still be bad and shouldn't be used.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Kevin Volk on JIRA:

This needs more discussion if it is to be a general change that applies to all modes not just to the NIRSpec IFU case.  So yes this issue with cosmic rays is a known problem, which for example causes many snowballs to not get flagged in the jump step for the frame-averaged cases depending upon where within the group the snowball appears.  However, the proposal to flag group N-1 if group N is over the linearity limit will have an adverse effect on the stars in the field and will change the bright limit down to fainter stars by a factor of 2.  This is not a good thing when we need the stars to align to Gaia, for example.  My own feeling is that for imaging mode generally it is much better to NOT make such a change; the cosmic ray events that pass through the jump step are removed in level 3 when then different dither positions are combined, so this issue does not generally have an adverse effect on the output science products.  We see cosmic ray and snowball residuals in the rate images in NIS read-out all the time, but these are nearly always removed very well in the level 3 products.  I assume that the same is true for NIRCam.

 

Another case where this definitely is not a good idea is the NIRISS SOSS mode where we wish to retain as many groups as possible below the linearity correction threshold (although that normally does not use frame averaging).  Flagging another group as "saturated" is very bad for SOSS mode especially because they are often pushing the bright limit and short ramps are common.

 

For NIRISS there is also the question of how this change might interact with the charge migration correction, although the charge migration threshold is somewhat lower than the linearity limit for most of the pixels, and one could hope that the gap is large enough that such a change would not be an issue. 

 

If this change is specific to the NIRSpec IFU case then all this is moot, but the ticket seems to suggest a general change.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Kevin Volk Following up on this to understand a little better.  The issue really seems to be for very-bright sources in readout modes optimized for faint sources.  Further, saturation in group 2 is irrelevant (since no slopes), and group 4+ doesn't matter as much (since the abnormally-bright slope can be detected as a jump), so we can focus most on cases where saturation is currently being detected in group 3.  This can happen two ways; either from a cosmic ray, or from a very bright source being observed with a readout pattern optimized for faint sources.

For the 4-read groups in NIS, I think the readout would be 1234-5678-9abc for such three-group ramps.  If the third group is flagged as saturated but the first is not, I'd assume that would mean that saturation actually occurred  in either frame 6,7,8, or 9.  If it occurred in frame a, I would think that the group average of 9abc would be 97.5% of saturation level.  I.e., most of the time the group 2 average of frames 5678 would also be biased, regardless of whether saturation occurred due to a CR or the source itself.

For purposes of alignment stars, I assume a small bias in the slope is still a good enough result, and preferable to losing the pixel entirely?  Is that also the case for SOSS though, where a likely-incorrect slope is preferable to no slope?

It may be possible to split by imaging/spectroscopy; are there any observations that have been obtained by SOSS using the NIS read pattern?  I thought this mode required the NISRAPID pattern?

@bhilbert4
Copy link
Collaborator

I think the difference between hard and soft saturation may be important here. For NIRCam, our saturation levels are defined as the levels where the signal becomes non-linear by some amount. So after a pixel crosses the saturation threshold, it's signal can continue to increase. So in the case above where the signal reaches the saturation threshold in frame a, depending on what happens to the signal in frames b and c, the group average could conceivably be above or below the saturation threshold.

@stscijgbot-jp
Copy link
Collaborator Author

stscijgbot-jp commented Apr 17, 2024

Comment by Kevin Volk on JIRA:

FIrst off, the selection of NIS mode may be due to data volume constraints and not primarily because the science targets are faint.  A  lot of NIRISS observations are in parallel with NIRCam and use NIS even for relatively short ramps because of the data volume issue.  Secondly, one still wants to see the brighter objects especially stars along with the faint objects.  This is not only due to aligning images to Gaia, although that is important, but also because existing catalogues like 2MASS or WISE only give data for the brighter objects and one will typically want to make some comparison with existing catalogues for the photometry in the images.  NIRISS almost always has stars that exceed the linearity limit in group 3 when NIS observations are made.

 

Your statement that if the "saturation" happened in frame 10 for a NIS read-out ramp that the previous group would have the data compromised is not correct.  If the linearity limit is exceeded in frame 10 then there is no reason to say that there has to be an issue with the linearity correction in frames 5 through 8.  Yes the linearity correction is less accurate with the frame averaging, but quantifying that is a bit difficult.  Bryan Hilbert did some work on the effects of the frame averaging on the linearity correction some time ago.  I have not tried to quantify this.

 

This is why I really do not like using the term "saturation" for this linearity limit.  These are two different things.  It is a really bad idea to call it saturation when it is something very different.  For NIRISS, roughly half the pixels on the detector reach the A/D limit before they reach actual saturation, and in some cases the linearity correction remains valid almost right to the A/D limit.  So we have not just the linearity limit value but also two additional possible limits to worry about.

 

For your example, if you have a 3-group ramp in NIS readout with frames denoted 1234-5678-9abc as you put it, for uniform illumination if group 3 is above the linearity limit this means that at least frames bc are above the limit.  There is however no way to tell solely from the "saturation" flag whether the situation is that frames 89abc or frames 9abc or frames abc or just frames bc are above the limit.  It depends on the shape of the non-linearity curve around the limit, which varies from one pixel to another, and the actual slope needed to cross the limit in less than 12 data frames but more than 7 data frames (I am assuming that if the limit is exceeded in both frames 7 and 8 the result for this case will be a value over the limit in group 2...one might have an edge case where this is not what happens if the frame 7 value is just slightly over the limit).

 

Finally for SOSS mode the exoplanet people have told me that they would rather have groups that consistently have a small systematic in it from the linearity correction, say, rather than having no data at that frame or group.  One can use NIS readout for any of the NIRISS modes, although for AMI and SOSS observations this is rather unusual.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Kevin Volk Thanks for the explanation, that's helpful.

After discussing a little with Michael Regan, one possible approach would be to revise the algorithm to more carefully target the specific CR failure case in question.

  1. We focus specifically on looking for saturation in group 3 for data with nframes/group > 1.

  2. For any pixel that meets that criterion, look at the counts recorded in group 1.  If the number of counts is more than 1/8 (value TBC) of the way to the saturation limit, then do nothing.  If the number of counts is less than that, set the SATURATED bit in GROUPDQ for that pixel in group 2 as well.

That might allow us to tease apart the case where saturation is happening due to a CR vastly above the normal count rate (which we want to flag), vs due to a bright source (which we don't want to flag).  For saturation due to CRs in anything larger than group 3 we don't make any changes, as the existing code has enough logic to catch these via jump detection.  This approach could potentially fix the NIS residuals at the detector1 stage as well without compromising the cases that you described above.

Thoughts?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Kevin Volk on JIRA:

It may be that this proposed revision will work and not flag pixel core pixels stars in imaging observations.  I am still very leery of making a change to the linearity limit flagging for NIRISS imaging to fix a problem that does not actually affect imaging observations, but at least we can test this change and see if it does what you want for the NIRSpec case without causing issues for NIRISS imaging or WFSS observations.

I would also like to verify that this is specific to the case of saturation in group 3 of a ramp with frame averaging; the description states that the idea is to flag group N-1 as "saturated" when group N is above the "saturation" limit value, not that this is specific to the 3rd group in such ramps.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Kevin Volk The suggestion above was originally to fix the general case of all such groups N, but given this discussion it seems better to refine to the specific case of saturation in just group N=3.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Howard Bushouse on JIRA:

David Law Can you please clarify your statement in the original description of the issue "Likewise, it's not an issue when saturation is first flagged in group 3, as this simply results in no slope at all." If saturation is first flagged in group 3, then groups 1 and 2 are classified as OK and can be used to compute a slope. Did you by any chance mean to say "first flagged in group {}2{}"?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Howard Bushouse Thanks; that was meant to say it's not an issue when saturation is first flagged in group 2.  Fixed in the original text.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Eddie Schlafly on JIRA:

For Roman we have this same issue in the sense that a read can saturate within a group, such that the group does not appear to be saturated but it contains reads which are saturated.  There we address this by adding an extra "read_pattern" argument to the saturation flagging

https://github.com/spacetelescope/stcal/blob/main/src/stcal/saturation/saturation.py#L87-L93

which adjusts the saturation threshold for each group according to how diluted a read at the end of the group would be by earlier unsaturated pixels.  That dilution factor is naturally 1 for nframes = 1 (i.e., no difference there), so it behaves correctly for that case.  It depends on the number of frames in the group, and so it becomes more aggressive at flagging saturation when there are a lot of reads—I think that's correct.  I kind of think this approach is the right one to use for Webb as well.  Sorry for being late to the party.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Eddie Schlafly I'm not sure I fully understand that method.  Let's say nframes=5 per group, and ngroups=3 in the ramp.  Say saturation (in the nframes=1 case) is 50,000 DN.  The dilution factor saturation would then be 50,000/5=10,000 DN, right?

But what if you're looking at a bright source that accumulates at the rate of 2500 DN/frame?  Then you get group1=7500 DN, group2=20,000 DN, group3=32,500 DN.  Nominally both groups 2 and 3 are saturated according to the dilution factor, although if you were to actually look at the final frame in group 3 it would have just 37,500 DN and not actually be saturated?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Eddie Schlafly on JIRA:

In this case the dilution factors are:
group 1: mean(1,2,3,4,5)/5 = 0.6
group 2: mean(6,7,8,9,10)/10 = 0.8
group 3: mean(11,12,13,14,15)/15 = 0.87
So the saturation cuts would be at 30k, 40k, and 43k and no group would be marked saturated. I'm on my phone so I should check my math tomorrow, but that looks right to me?

@drlaw1558
Copy link
Collaborator

Ah, I see, I misunderstood how the dilution factor worked. Thanks for clarifying. In that case I agree for the bright source case but would have the opposite question, in the faint-source with cosmic limit. Suppose an empty part of the field with 1 DN/frame, and an instantly-saturating CR (value 60,000 DN say, with flagging at 50,000) in frame 10.
Then group 1 mean is 3 DN, diluted CR threshold is 0.650,000 = 30,000 = All good
group 2 mean is 12,006, diluted CR threshold is 0.8
50,000 = 40,000 = Not flagging a case that it should
group 3 mean is 60,000, so obviously flags ok

In order to successfully flag the CR in group 2 you need to saturate in frame 6 or 7, but it doesn't catch frames 8/9/10?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Maria Pena-Guerrero on JIRA:

I still don't quite understand the dilution method. I was trying to run a test with jwst data but I got an error because the code is expecting this:

read_pattern : List[List[float or int]] or None

        The times or indices of the frames composing each group

Is this how Roman has read_pattern in the datamodel or is there something special that you do to create that list?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Eddie Schlafly on JIRA:

Ugh, thanks, sorry for being dense.  Yes, I agree that that doesn't get handled well by the dilution factor (though I think the dilution factor does make sense for Webb, independent of this issue).

How do you like the pseudocode proposal below?  It tries to use read_pattern to generate appropriate cuts along the lines of your proposal, and adds an additional cut that the value in the second group must be high before being considered as potentially affected by saturation.

for group in range(groups):

    if group == 2:

        mask = data[0] / mean(read_pattern[0]) * read_pattern[2][-1] < sat_thresh

        # identify groups which we wouldn't expect to saturate by the third group, on the basis of the first group

        mask &= data[1] > sat_thresh / len(read_pattern[1])

        # identify groups with suspiciously large values in the second group.

        mask &= flag[2] & SATURATION != 0

        # identify groups that are saturated in the third group

        flag[1, mask] |= SATURATION

        # flag the 2nd group for the pixels passing that gauntlet.

 

IMO you could generate the read_pattern from nframes by

[[x + 1 + groupstart * nframe for x in range(nframe)] for groupstart in range(ngroup)]

or derive the relevant terms directly; I'm obviously partial to read_pattern for Roman only because we want to be general there.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by David Law on JIRA:

Eddie Schlafly I think that should work.  Testing out a quick implementation of your pseudocode it does essentially what the algorithm proposed above was doing, but without the rather hacky and arbitrary factor of 1/10 that I'd suggested which should make it more general to different numbers of nframes per group.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Eddie Schlafly on JIRA:

Great, yes, I was clearly taking your proposal and trying to make it slightly more generic to be able to accommodate Roman's uneven MA tables!  Slightly nice feature that nframe = 1 is no longer a special case since then data[1] > sat_thresh / len(read_pattern[1]) is just the normal saturation cut.

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Maria Pena-Guerrero on JIRA:

Eddie Schlafly are you putting in a PR for the changes in this ticket with your changes to the code? should I close the current WIP PR?

@stscijgbot-jp
Copy link
Collaborator Author

Comment by Eddie Schlafly on JIRA:

Thanks Maria.  I wasn't planning to put in a PR; I was mostly just looking for a way to implement the changes such that they would work for both Roman and Webb.  I think my suggestion satisfies that need, so if you were able to update the PR along those lines then I would stop objecting on grounds that I don't know what things mean for Roman.  Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants