
Documentation suggestion: recommendations for working with integer NIFTI images #1001

pauldmccarthy opened this issue Mar 11, 2021 · 2 comments

@pauldmccarthy (Contributor)

Hi @effigies and @matthew-brett (pinging you both directly, because you've had the most involvement in the relevant bits of the nibabel code).

A colleague of mine (very competent, but fairly new to nibabel and numpy) recently found himself in a very confusing situation, ultimately brought about by the scl_slope / scl_inter rescaling logic baked into nibabel. I'm wondering whether a section should be added to the nibabel documentation giving recommendations on how to work with NIFTI images of integral type.

A full example is given below, but in brief: he was working with a NIFTI1 image of type int32, but manipulating the data as float64 arrays in memory. He was repeatedly loading the image, modifying it, and saving it back out as an int32 image. After a couple of iterations, the floating point imprecision in the scl_slope and scl_inter fields had accumulated to the point that the integral values of his image data were no longer being reconstructed exactly, which caused problems when he tried to create and use a binary mask.

The problem boils down to a combination of two coding patterns that are in common use:

  1. Using get_fdata to retrieve the image data, so that it is always loaded and manipulated as float64, regardless of the storage type.

  2. When creating a new image from an existing image (e.g. after some processing step), using the header from the original image to preserve the NIFTI header metadata (not that there's much of use in there, but that's another matter):

    image1 = nib.load('image1.nii.gz')
    data   = image1.get_fdata()
    
    # do stuff to data
    
    # Save new image, preserving header metadata
    # The catch here is that the data is float64,
    # but the header specifies int32 (or some
    # integral type), so the data gets re-scaled
    # to use up the full range of the storage type.
    image2 = nib.Nifti1Image(data, None, image1.header)
    image2.to_filename('image2.nii.gz')

Using the above patterns with images of integer type can lead to problems, as demonstrated in the example below. So what I'm wondering is whether a section should be added to the nibabel documentation, with some recommended strategies for working with integral data. I'd be happy to attempt a first pass if you think it's a good idea.

I can think of a few solutions/patterns for avoiding this problem, any of which could be the best approach depending on the situation (rough sketches follow the list):

  • When loading, read the data in its storage type
  • When loading, round rather than cast, to avoid truncation
  • When writing, convert the data to the desired storage type
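To make these concrete, here are rough sketches of each pattern (hedged; the filename and the int32 dtype are just examples carried over from the reproduction below):

import numpy as np
import nibabel as nib

image = nib.load('image1.nii.gz')

# 1. Read the data in its on-disk storage type. Note that if
#    scl_slope/scl_inter are set on the image, the scaled data
#    will still come back as floats.
data = np.asanyarray(image.dataobj)

# 2. Round, rather than cast, so that a value like 0.99999999988
#    becomes 1 instead of being truncated to 0.
data = np.rint(image.get_fdata()).astype(image.get_data_dtype())

# 3. When writing, convert the data to the desired storage type
#    and record it in the header, so no rescaling is needed.
out = nib.Nifti1Image(data.astype(np.int32), image.affine)
out.set_data_dtype(np.int32)
out.to_filename('image2.nii.gz')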

To reproduce the situation that my colleague found himself in:

import numpy  as np
import nibabel as nib

# Create some random int32 data
# and save it as a NIFTI image
nib.Nifti1Image(
    np.random.randint(0, 100, (20, 20, 20)).astype(np.int32),
    np.eye(4)).to_filename('image1.nii.gz')

# Now load that image in using
# the recommended get_fdata
image1 = nib.load('image1.nii.gz')
data1  = image1.get_fdata()

# Create a copy of the image, preserving
# the header info from the original image.
# Because data1 is float64, and the image
# header specifies int32, the data will be
# rescaled to use the full int32 range.
image2 = nib.Nifti1Image(data1, None, header=image1.header)
image2.to_filename('image2.nii.gz')

# Load that image in, and create a binary
# mask from it.
#
# (Here, because scl_slope and scl_inter
# are set on the image, it doesn't matter
# whether we use get_fdata or dataobj -
# in either case we are given the data as
# float64).
#
# Again, we use the same header, and the
# data is thus rescaled again and saved
# as int32.
image3             = nib.load('image2.nii.gz')
data3              = np.asarray(image3.dataobj)
data3[data3 <= 50] = 0
data3[data3 >  50] = 1
image3             = nib.Nifti1Image(data3, None, header=image1.header)
image3.to_filename('image3.nii.gz')

print(f'number of voxels in binary mask: {data3.sum()}')

# Re-load the mask from file. We're
# using it as a mask, so we'll cast it
# to an integer type
image4 = nib.load('image3.nii.gz')
data4  = np.asarray(image4.dataobj).astype(np.int8)

# But now we have a problem - the re-scaling
# which took place when image3.nii.gz was
# loaded has produced values which are
# slightly less than 1, so the cast to int8
# has truncated them and produced a binary
# mask containing all zeros.
print(f'number of voxels in binary mask: {data4.sum()}')
print(f'                          Slope: {image4.dataobj.slope}')
print(f'                          Inter: {image4.dataobj.inter}')

Example output (note the teensy scaling factor):

number of voxels in binary mask: 3980.0
number of voxels in binary mask: 0
                          Slope: 2.3283064365386963e-10
                          Inter: 0.5
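For what it's worth, the round trip stays lossless if the data is rounded and converted to the storage type before each save (a sketch applying the third strategy above; with matching array and header dtypes, nibabel does not need to set scl_slope / scl_inter at all):

# As before, but round and convert to int32 before saving, so
# the array dtype matches the header dtype and no rescaling is
# applied.
image1 = nib.load('image1.nii.gz')
data1  = np.rint(image1.get_fdata()).astype(np.int32)

image2 = nib.Nifti1Image(data1, None, header=image1.header)
image2.to_filename('image2.nii.gz')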
@joshuacwnewton (Contributor) commented Nov 11, 2022

Hi there! This issue is of great interest to our project (as indicated by the number of pings in the history above).

We work with integer-typed images quite frequently (e.g. integer-valued labels for image landmarks, binary masks, etc.). When our integer data arrays are coerced to float via get_fdata (recommended since the deprecation of get_data), we get into situations where the header still lists int as the datatype, but the array itself is float-typed. This mismatch then triggers float -> int scaling, which can sometimes change values from, say, 3 to 3.0000457763671875.

To highlight one of the proposed solutions from the issue description:

  1. When loading, loading the input data as its storage type

Right now, to implement this suggestion, we're considering using a pattern akin to:

im.get_fdata().astype(im.header.get_data_dtype())

The idea is to use this pattern on the very first call to .get_fdata in the example above, to avoid the problematic rescaling.

Though, this assumes that the header metadata is trustworthy. I don't know if this is a safe assumption to make...
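One possible safeguard (just a sketch; load_as_header_dtype is a hypothetical helper, not nibabel API) is to verify that the cast is lossless before trusting the header dtype:

import numpy as np

def load_as_header_dtype(im, atol=1e-6):
    # Hypothetical helper: load data as the dtype declared in the
    # header, but only after checking no information is lost.
    data  = im.get_fdata()
    dtype = im.header.get_data_dtype()
    if np.issubdtype(dtype, np.integer):
        rounded = np.rint(data)
        if not np.allclose(data, rounded, atol=atol):
            raise ValueError(f'Data is not integral; header dtype '
                             f'{dtype} may not be trustworthy')
        return rounded.astype(dtype)
    return data.astype(dtype)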

@effigies (Member)

I guess we did not get around to writing documentation, so here are some notes (a combined sketch follows the list):

  1. If you want the old img.get_data() behavior, just use np.asanyarray(img.dataobj). This will keep the original data type unless scaling is applied. If scaling factors are set in the header, you will get floats.
  2. If you want a predictable type, use np.int16(img.dataobj) (or whatever type you want). You are welcome to put in whatever checks you want, or determine your dtype dynamically. You could equivalently use np.asanyarray(img.dataobj, np.int16).
  3. You may wish to do some slicing. It is most efficient on the dataobj: img.dataobj[:, :, :, 0] will give you the first volume, and np.uint8(img.dataobj[:, :, :, 0]) will cast it to uint8.
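Putting those notes together in one place (the filename is a placeholder, int16 / uint8 are just example dtypes, and the slicing line assumes a 4D image):

import numpy as np
import nibabel as nib

img = nib.load('image.nii.gz')

# 1. Old img.get_data() behaviour: storage dtype, unless scaling
#    factors are set, in which case you get floats.
data = np.asanyarray(img.dataobj)

# 2. A predictable dtype, chosen explicitly.
data = np.int16(img.dataobj)
data = np.asanyarray(img.dataobj, np.int16)  # equivalent

# 3. Slice the dataobj to read only what you need, then cast.
#    (Assumes a 4D image; gives the first volume as uint8.)
vol0 = np.uint8(img.dataobj[:, :, :, 0])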
