
Concat_images result not consistent between load/save cycles #986

Open
spolcyn opened this issue Jan 2, 2021 · 0 comments

spolcyn commented Jan 2, 2021

Describe the bug
After using concat_images to concatenate two NIfTI images, then saving the new image to disk and loading it back into memory, the values of the data arrays are no longer equal.

In my case, the source and destination NIfTI files both have uint16 data, so there should be no floating-point precision loss: any conversion to float32 or wider along the way can represent every uint16 value exactly.

To Reproduce

  1. Use concat_images on two NIfTI images (one of which should have an integer data type, e.g., uint16)
  2. Save the result
  3. Load the result
  4. Compare the images' data arrays with np.array_equal

See the test written into the pull request; removing the patch in funcs.py triggers the failure. I'd expect this issue to occur generally with NIfTIs that use an integer data format.

Alternatively, use the script in this gist with your own NIfTI file.

Expected behavior
The post-concat in-memory representation of the data should be equal to the data after it's been saved/re-loaded from disk.

Actual behavior
The data arrays are no longer equal after the save/load round trip.

Likely Reason
This is almost certainly related to the datatype of the dataobj after concat_images. My analysis of the reason is as follows:

  1. Images with uint16 dataobjs enter the function
  2. out_data is created with the correct shape, but with the default dtype=np.float64
  3. out_data receives the data, upcasting it to float64
  4. out_data is used to form the new image
  5. When the new image is written to disk, there is a conflict between the datatype it should have on disk (uint16) and the current dataobj type (float64).
  6. To get around this, a scaling slope is applied -- however, because the header stores the slope as a 32-bit float, the scaling is not exact (at least with the values in my test images).
    a. Note: This can be verified by examining the NIfTI header binary directly. For my test I found an scl_slope of 0x3d4e00ce = 0.0502937361598, and each data element was off by exactly 19 + 5/6. Doing the math, 1/(19 + 5/6) = 0.050420168067227, which is close to but not exactly the scl_slope value. (This may suggest the choice of scl_slope in the save() routine could be tweaked, but without really digging into that code myself, I'm not sure -- the most accurate float32 approximation of that number appears to be 0.0504201687872409820556640625, or 0x3d4e8561 in hex.)
  7. Accordingly, when the image is re-read, the float64 result is slightly off (slightly less than the true value for my test images), and the uint16 result is also incorrect: the values work out slightly below the true value, so the cast floors them to the integer just beneath it.
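The mechanism in steps 2-7 can be simulated in plain NumPy (the slope choice here -- mapping the data max to the uint16 max -- is an assumption for illustration, not nibabel's exact algorithm; the values are hypothetical):

```python
import numpy as np

# Hypothetical uint16 voxel values standing in for the real image data.
data = np.array([1000, 2017, 3296], dtype=np.uint16)

# Steps 2-3: out_data is allocated with the default dtype (float64) and
# the integer data is upcast into it.
out_data = np.empty(data.shape)  # dtype=np.float64 by default
out_data[...] = data

# Steps 5-6: on save, the float64 values must be squeezed back into uint16,
# so a scaling slope is computed and stored as a 32-bit float in the header.
slope = np.float32(out_data.max() / np.iinfo(np.uint16).max)
raw = np.round(out_data / slope).astype(np.uint16)

# Step 7: on re-read, raw * slope does not reproduce the original integers,
# because the float32 slope is not the exact reciprocal of the scale factor.
recovered = raw * np.float64(slope)
print(np.array_equal(recovered, data))  # False
```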

Potential Fix
Passing the dtype explicitly -- out_data = np.empty(out_shape, dtype=img0.header.get_data_dtype()) -- in funcs.py fixes the problem (alternatively, .dataobj.dtype could be used, but going through the header seems more maintainable).
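The difference between the two allocations is easy to see in isolation (img0.header.get_data_dtype() would return uint16 for the images in question; a literal dtype stands in for it here):

```python
import numpy as np

out_shape = (4, 4, 4, 2)
buggy = np.empty(out_shape)                   # defaults to float64 -> upcast
fixed = np.empty(out_shape, dtype=np.uint16)  # proposed fix: preserve the dtype

print(buggy.dtype, fixed.dtype)  # float64 uint16
```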

I've put in a PR with a test modification and my proposed fix; happy to discuss it either there or here.

Related to #1001
