SEP-0012 -- NDCube 2

SEP 12
title NDCube 2
author(s) Stuart Mumford
contact email stuart@cadair.com
date-creation 2020-10-13
type standard
discussion
status accepted

Introduction

This SEP proposes a redesigned API for the NDCube class and its associated classes in the next major version of the ndcube package. The primary objective of this SEP is to gather as much community feedback as possible on the proposed API before it is released. This is also a prelude to a long-discussed future change: having the GenericMap class in sunpy core inherit from NDCube. This SEP should therefore also be considered a discussion of future changes to the Map API, as the API proposed here would eventually (with a long deprecation period) be adopted by Map for the functionality provided here. That said, the details of migrating Map to this API are out of scope for this SEP; only the eventual API to be adopted is discussed.

Background

The creation of ndcube was originally motivated by the desire to represent and interrogate solar datasets with more than two dimensions, for example two spatial and one spectral (wavelength) dimension. The fundamental objective of the package's primary class, NDCube, is to provide a user interface which unites the data with the coordinate information describing it. The benefit of NDCube is its ability to perform array-like operations which also update the coordinate system. In addition, NDCube uses the coordinate system information to describe the data to the user. NDCube also includes simple summary visualisation support, which uses matplotlib to produce quick, coordinate-aware, and interactive visualisations of the data. The final major component of NDCube is the concept of "extra coords", which allows coordinates that describe the data but are not specified by the primary WCS for the cube to be specified and integrated.

ndcube 2

ndcube 1.0 was built on the functionality which existed in astropy at the time, and provides a very useful foundation for tools such as the sunraster package as well as other efforts to handle multi-dimensional data in Python. While writing ndcube 1.0, a lot of potential functionality was identified which could be added to astropy to improve its WCS support. In addition, between the release of ndcube 1.0 and the development of 2.0, various long-running projects were completed, such as the acceptance and implementation of APE 14: "A shared Python interface for World Coordinate Systems". Once many of these new features had been implemented in Astropy and gWCS, there was an excellent opportunity to reassess the state of NDCube and to rebase it onto this new WCS functionality.

The main motivation for updating ndcube to use the APE 14 interface was to provide support for gWCS which is being used by the DKIST user tools. A lot of the other changes presented in this SEP are a consequence of this decision.

Summary of Changes from ndcube 1

The fundamental change since the original version of ndcube is that the WCS type has been changed to the High Level API described by astropy's APE 14. This opens up many possibilities for future uses of NDCube, as the APE 14 API allows the mapping between pixel and world coordinates to be described by almost any functional transform. This could be the commonly used FITS WCS standard, gWCS, or a custom implementation for a specific facility with different requirements.

Having made this change, which was a large breaking change from the original API, we reviewed all the existing functionality and API as it was adapted to use the high level WCS interface and the result is presented below.

Detailed Description

The API presented here is limited to the three core classes NDCubeABC, ExtraCoordsABC and GlobalCoordsABC. This is primarily an attempt to limit the discussion in this SEP to the core API and not implementation decisions taken in the development of ndcube. If people are interested in the implementation of the proposed API, large amounts of it are implemented on the master branch of the ndcube repository or in open pull requests.

The NDCube class inherits from the astropy NDDataBase class, including the requirement that the .wcs property be a BaseHighLevelWCS instance. The following code snippets of the API use Python type hint syntax where appropriate to highlight accepted types and return types; these are not enforced at runtime.

Overview

Before jumping into the details of the API, this section provides a brief high level summary of the way the following three classes interact. For a detailed introduction to the whole ndcube package please see the ndcube documentation; a high level familiarity with the package is assumed in the rest of this SEP.

The NDCube object inherits from astropy.nddata.NDData and is a container which holds an array, a WCS and extra information about the array like a mask, unit, uncertainty, etc. One of the main pieces of functionality that NDCube adds on top of NDData is an API which allows the user to add "extra" coordinate information to describe the array. This information is stored in the ExtraCoords object which is accessible from the NDCube.extra_coords property. The main use case for these "extra" coords is when an axis of the array has one coordinate described by the primary WCS but that axis also has a secondary coordinate. A concrete example of this in solar physics is a rastering spectrograph, where one of the spatial axes (the array axis along which the raster is built) also has a time coordinate; this coordinate is not normally included in the WCS.

The second added component, new in 2.0, is the GlobalCoords class. Like ExtraCoords, this class holds additional coordinate information; the main difference is that GlobalCoords are not associated with an axis of the array. This almost always means they are scalars, for example a point in space or a time. The global_coords attribute could hold the HPC coordinate of a line spectrum, or the time of a map, in a way which allows predictable programmatic access to this metadata. In addition, ndcube implements the following behaviour: when the array is sliced such that a dimension is dropped, the coordinate from that axis of the WCS is added automatically to GlobalCoords, meaning this metadata is not lost.
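
The dropped-axis behaviour can be illustrated with a toy model. This is a minimal sketch in plain Python, not the ndcube implementation: `ToyCube`, its `axis_coords` list of per-axis lookup tables and its `global_coords` dict are hypothetical stand-ins for a real array, WCS and GlobalCoords object.

```python
from dataclasses import dataclass, field


@dataclass
class ToyCube:
    """Hypothetical stand-in: one (name, values) lookup table per array axis."""
    axis_coords: list                      # e.g. [("time", [...]), ("wavelength", [...])]
    global_coords: dict = field(default_factory=dict)

    def __getitem__(self, item):
        # Normalise a single int/slice to a tuple and pad with full slices.
        items = item if isinstance(item, tuple) else (item,)
        items = items + (slice(None),) * (len(self.axis_coords) - len(items))

        kept = []
        new_globals = dict(self.global_coords)
        for (name, values), index in zip(self.axis_coords, items):
            if isinstance(index, int):
                # An integer index drops the axis: keep its coordinate as a scalar.
                new_globals[name] = values[index]
            else:
                kept.append((name, values[index]))
        return ToyCube(kept, new_globals)


cube = ToyCube([("time", [0.0, 10.0, 20.0]), ("wavelength", [171, 193, 211, 304])])
sliced = cube[1, 1:3]
print(sliced.axis_coords)    # [('wavelength', [193, 211])]
print(sliced.global_coords)  # {'time': 10.0}
```

Indexing the first axis with an integer removes it from the cube, and the time value at that index survives as a global coordinate rather than being discarded.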

On WCSes

The API specification below includes three distinct WCS-like objects (all conforming to the Astropy WCS API): .wcs, .extra_coords and .combined_wcs. The .wcs attribute is the main WCS object for the array; it is required to have the same number of pixel dimensions as there are array dimensions, while the number of world dimensions is arbitrary. The .extra_coords object implements the WCS API but is not required to be a valid WCS for the whole array; its number of pixel dimensions does not have to match the number of array dimensions. This is because the user can add as many coordinates to the extra coords as needed; the ExtraCoords object then keeps a mapping of the pixel dimensions in its WCS to the array dimensions in the NDCube object. Finally, there is .combined_wcs, an APE 14 WCS object which combines the coordinates in the .wcs and .extra_coords properties.
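
The dimension bookkeeping between these three objects can be sketched as follows. This is an illustrative sketch only: `ToyWCS` and `check_dimensions` are hypothetical names that record nothing but dimension counts, and the mapping is assumed to be a list of `(array_dimension, pixel_dimension)` pairs as described for ExtraCoords later in this SEP.

```python
from typing import NamedTuple


class ToyWCS(NamedTuple):
    """Hypothetical stand-in recording only the dimension counts of a WCS."""
    pixel_n_dim: int
    world_n_dim: int


def check_dimensions(array_ndim, wcs, extra_wcs, mapping):
    """Validate the rules stated above and return the combined dimension counts."""
    # .wcs must have one pixel dimension per array dimension.
    if wcs.pixel_n_dim != array_ndim:
        raise ValueError("wcs.pixel_n_dim must equal the number of array dimensions")
    # The extra coords mapping has one (array_dimension, pixel_dimension) pair
    # per pixel dimension of the extra coords WCS.
    if len(mapping) != extra_wcs.pixel_n_dim:
        raise ValueError("mapping must have one pair per extra coords pixel dimension")
    if any(not 0 <= array_dim < array_ndim for array_dim, _ in mapping):
        raise ValueError("mapping refers to an array dimension that does not exist")
    # The combined WCS spans the whole array and carries both sets of world axes.
    return ToyWCS(array_ndim, wcs.world_n_dim + extra_wcs.world_n_dim)


# A 3-D raster cube whose array axis 0 also has a time coordinate.
combined = check_dimensions(
    array_ndim=3,
    wcs=ToyWCS(pixel_n_dim=3, world_n_dim=3),
    extra_wcs=ToyWCS(pixel_n_dim=1, world_n_dim=1),
    mapping=[(0, 0)],
)
print(combined)  # ToyWCS(pixel_n_dim=3, world_n_dim=4)
```

The extra coords WCS on its own has a single pixel dimension, so it cannot be applied to the 3-D array directly; the mapping is what ties it back to array axis 0.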

In an ideal world, this API would propose using the .combined_wcs property as the default for all the WCS operations supported by the NDCube API. This, however, is not currently practical due to implementation challenges. While a robust and performant implementation of .combined_wcs is desirable, blocking the release of ndcube 2.0 on it would be unwise; this API in its current form therefore makes its use optional.

NDCube

Below is a class definition for the Abstract Base Class (ABC) for NDCube. Its purpose is to specify the API as completely as possible, including input and return types. The behaviour of all the methods should be described by their included docstrings; a version of this (likely out of sync) has been submitted as a pull request to ndcube.

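from __future__ import annotations  # allows forward references in the type hints

import abc
from collections.abc import Mapping
from typing import Any, Iterable, Optional, Tuple, Union

import astropy.nddata
import astropy.units as u
import numpy.typing
from astropy.wcs.wcsapi import BaseHighLevelWCS, BaseLowLevelWCS


class NDCubeABC(astropy.nddata.NDDataBase):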

    @abc.abstractmethod
    def __init__(self,
                 data: numpy.typing.ArrayLike,
                 uncertainty: Any = None,
                 mask: numpy.typing.ArrayLike = None,
                 wcs: Union[BaseLowLevelWCS, BaseHighLevelWCS] = None,
                 meta: Mapping = None,
                 unit: Union[str, u.Unit] = None,
                 copy: bool = False):
        """
        This signature is inherited from NDData.

        The only difference is that the wcs argument should be considered mandatory.
        """

    @abc.abstractmethod
    def __getitem__(self, item: Union[int, slice, Iterable[Union[slice, int]]]) -> "NDCubeABC":
        """
        Slice the NDCube object by array index.

        An ndcube object must be sliceable in a similar way to a numpy array,
        however, support for a `step` is not required and no "fancy indexing" is allowed.

        A step is prohibited due to the ambiguity inherent in applying a step to the WCS
        object: does the step mean a sampling, so that the coordinate axis becomes discontinuous,
        or does it mean that the size of the pixels should be increased by a factor of the step?

        Fancy indexing, or
        `advanced indexing <https://numpy.org/doc/stable/reference/arrays.indexing.html#advanced-indexing>`__,
        is not supported, both because of similar issues with applying it to the WCS
        and because it is poorly supported outside of numpy arrays.

        When a slice is applied to a NDCube object it must be applied to the following attributes:

          * data
          * wcs
          * uncertainty
          * mask
          * extra_coords

        The following attributes must be copied unaltered to the returned NDCube:

          * unit

        The following attributes should be sliced if they have a magic attribute,
        ``.__ndcube_can_slice__``, that evaluates to ``True``.
        Otherwise they should be copied unaltered:

          * meta

        """

    @property
    @abc.abstractmethod
    def extra_coords(self) -> ExtraCoordsABC:
        """
        Coordinates not described by ``NDCubeABC.wcs`` which vary along one or more axes.
        """

    @property
    @abc.abstractmethod
    def global_coords(self) -> GlobalCoordsABC:
        """
        Coordinate metadata which applies to the whole cube.
        """

    @property
    @abc.abstractmethod
    def wcs(self) -> BaseHighLevelWCS:
        """
        The WCS transform for the NDCube.

        If a `BaseLowLevelWCS` is provided to the constructor it should be
        "upgraded" to a high level object.
        """

    @property
    @abc.abstractmethod
    def combined_wcs(self) -> BaseHighLevelWCS:
        """
        The WCS transform for the NDCube, including the coordinates specified in ``.extra_coords``.

        This transform should implement the high level wcsapi, and have
        `pixel_n_dim` equal to the number of array dimensions in the
        `.NDCube`. The number of world dimensions should be equal to the
        number of world dimensions in ``self.wcs`` and in ``self.extra_coords`` combined.
        """

    @property
    @abc.abstractmethod
    def array_axis_physical_types(self) -> Iterable[Tuple[str, ...]]:
        """
        Returns the WCS physical types that vary along each array axis.

        Returns an iterable of tuples where each tuple corresponds to an array axis and
        holds strings denoting the WCS physical types associated with that array axis.
        Since multiple physical types can be associated with one array axis, tuples can
        be of different lengths. Likewise, as a single physical type can correspond to
        multiple array axes, the same physical type string can appear in multiple tuples.

        The physical types returned by this property are drawn from the
        `~NDCube.combined_wcs` property so they include the coordinates contained in
        `~.NDCube.extra_coords`.
        """

    @abc.abstractmethod
    def axis_world_coords(self,
                          *axes: Union[int, str],
                          pixel_corners: bool = False,
                          wcs: Optional[Union[BaseHighLevelWCS, ExtraCoordsABC]] = None
                          ) -> Iterable[Any]:
        """
        Returns objects representing the world coordinates of all pixel centers.

        Parameters
        ----------
        axes
            Axis number(s) in numpy ordering or unique substring of
            `.NDCube.wcs.world_axis_physical_types` or
            `.NDCube.wcs.world_axis_names` of axes for which real world
            coordinates are desired. Not specifying axes implies all axes will be
            returned.

        pixel_corners
            If `True` then instead of returning the coordinates of the pixel
            centers the coordinates of the pixel corners will be returned. This
            increases the size of the output along each dimension by 1 as all
            corners are returned.

        wcs
            The WCS object to use to calculate the world coordinates.
            Although technically this can be any valid WCS, it will typically be
            ``self.wcs``, ``self.extra_coords``, or ``self.combined_wcs``, which
            combines both the WCS and extra coords.
            Default=self.wcs

        Returns
        -------
        axes_coords
            An iterable of "high level" objects giving the real world
            coords for the axes requested by user.
            For example, a tuple of `~astropy.coordinates.SkyCoord` objects.
            The types returned are determined by the WCS object.
            The dimensionality of these objects should match that of
            their corresponding array dimensions.

        Example
        -------
        >>> NDCube.axis_world_coords('lat', 'lon')
        >>> NDCube.axis_world_coords(2)

        """

    @abc.abstractmethod
    def axis_world_coords_values(self,
                                 *axes: Union[int, str],
                                 pixel_corners: bool = False,
                                 wcs: Optional[Union[BaseHighLevelWCS, ExtraCoordsABC]] = None
                                 ) -> Iterable[u.Quantity]:
        """
        Returns world coordinate values of all pixel centers.

        Parameters
        ----------
        axes
            Axis number(s) in numpy ordering or unique substring of
            `.NDCube.wcs.world_axis_physical_types` or
            `.NDCube.wcs.world_axis_names` of axes for which real world
            coordinates are desired. Not specifying axes implies all axes will be
            returned.

        pixel_corners
            If `True` then instead of returning the coordinates of the pixel
            centers the coordinates of the pixel corners will be returned. This
            increases the size of the output along each dimension by 1 as all
            corners are returned.

        wcs
            The WCS object to use to calculate the world coordinates.
            Although technically this can be any valid WCS, it will typically be
            ``self.wcs``, ``self.extra_coords``, or ``self.combined_wcs``, which
            combines both the WCS and extra coords.
            Default=self.wcs

        Returns
        -------
        axes_coords
            `~astropy.units.Quantity` or iterable thereof for all requested
            world axes, units determined by the wcs.

        Example
        -------
        >>> NDCube.axis_world_coords_values('lat', 'lon')
        >>> NDCube.axis_world_coords_values(2)

        """

    @abc.abstractmethod
    def crop(self,
             *points: Iterable[Any],
             wcs: Optional[Union[BaseHighLevelWCS, ExtraCoordsABC]] = None
             ) -> "NDCubeABC":
        """
        Crop to the smallest cube in pixel space containing the world coordinate points.

        Parameters
        ----------
        points
            Tuples of high level coordinate objects e.g. SkyCoord.
            These points are passed to ``wcs.world_to_array_index``
            so their number and order must be compatible with the API of that method.

        wcs
            The WCS to use to calculate the pixel coordinates based on the input.
            Will default to the ``.wcs`` property if not given. While any valid WCS
            could be used it is expected that either the ``.wcs`` or
            ``.extra_coords`` properties will be used.

        Returns
        -------
        result: NDCube
        """

    @abc.abstractmethod
    def crop_by_values(self,
                       *points: Iterable[Union[u.Quantity, float]],
                       units: Optional[Iterable[Union[str, u.Unit]]] = None,
                       wcs: Optional[Union[BaseHighLevelWCS, ExtraCoordsABC]] = None
                       ) -> "NDCubeABC":
        """
        Crop to the smallest cube in pixel space containing the world coordinate points.

        Parameters
        ----------
        points
            Tuples of coordinate values, the length of the tuples must be
            equal to the number of world dimensions. These points are
            passed to ``wcs.world_to_array_index_values`` so their units
            and order must be compatible with that method.

        units
            If the inputs are set without units, the user must set the units
            inside this argument as `str` or `~astropy.units.Unit` objects.
            The length of the iterable must equal the number of world dimensions
            and must have the same order as the coordinate points.

        wcs
            The WCS to use to calculate the pixel coordinates based on the input.
            Will default to the ``.wcs`` property if not given. While any valid WCS
            could be used it is expected that either the ``.wcs`` or
            ``.extra_coords`` properties will be used.

        Returns
        -------
        result: NDCube
        """
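
The phrase "smallest cube in pixel space containing the world coordinate points" used by `crop` and `crop_by_values` can be made concrete with a short sketch. It assumes the world-to-array conversion has already been done: `bounding_slices` is a hypothetical helper, and its input stands in for the per-point index tuples a WCS would return from `world_to_array_index`. A real implementation may need to consider more points (for example all corners of the requested region) before taking the bounds.

```python
def bounding_slices(array_indices):
    """
    Return one slice per array axis covering every index in ``array_indices``.

    ``array_indices`` is a sequence of per-point index tuples, standing in for
    the values a WCS would return from ``world_to_array_index``.
    """
    if not array_indices:
        raise ValueError("at least one point is required")
    per_axis = zip(*array_indices)  # regroup: one tuple of indices per axis
    # The stop of a slice is exclusive, so add 1 to include the largest index.
    return tuple(slice(min(axis), max(axis) + 1) for axis in per_axis)


# Two opposite corners of a region in a 3-D array.
item = bounding_slices([(2, 10, 40), (2, 25, 31)])
print(item)  # (slice(2, 3, None), slice(10, 26, None), slice(31, 41, None))
```

The resulting tuple contains only step-free slices, so it can be passed straight to `__getitem__` under the slicing rules given above. Note that the first axis is kept with length 1 rather than dropped.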

Extra Coords

The extra coords property of the NDCube class provides a core unique selling point of the package. It is an easy way to add extra variables which vary with one, or more, of the axes of the data. The proposed implementation of ExtraCoords is heavily dependent on gWCS, however, the goal for this SEP is to formulate the high level API and not the implementation details. This will hopefully allow future iteration on the implementation without breaking the functionality of the NDCube class.

class ExtraCoordsABC(abc.ABC):
    """
    A representation of additional world coordinates associated with pixel axes.

    ExtraCoords can be initialised by either specifying a
    `~astropy.wcs.wcsapi.BaseLowLevelWCS` object and a ``mapping``, or it can be
    built up by specifying one or more lookup tables.

    Parameters
    ----------
    wcs
        The WCS specifying the extra coordinates.
    mapping
        The mapping between the array dimensions and pixel dimensions in the
        extra coords object. This is an iterable of ``(array_dimension, pixel_dimension)`` pairs
        of length equal to the number of pixel dimensions in the extra coords.

    """

    @abc.abstractmethod
    def __init__(self, ndcube: Optional[NDCubeABC] = None):
        self._ndcube = ndcube

    @abc.abstractmethod
    def add(self,
            name: str,
            array_dimension: Union[int, Iterable[int]],
            lookup_table: Any,
            **kwargs):
        """
        Add a coordinate to this ``ExtraCoords`` based on a lookup table.

        Parameters
        ----------
        name
            The name for this coordinate(s).
        array_dimension
            The pixel dimension(s), in the array, to which this lookup table corresponds.
        lookup_table
            The lookup table.
        """

    @abc.abstractmethod
    def keys(self) -> Iterable[str]:
        """
        The world axis names for all the coordinates in the extra coords.
        """

    @property
    @abc.abstractmethod
    def mapping(self) -> Iterable[Tuple[int, int]]:
        """
        The mapping between the array dimensions and pixel dimensions.

        This is an iterable of ``(array_dimension, pixel_dimension)`` pairs
        of length equal to the number of pixel dimensions in the extra coords.
        """

    @property
    @abc.abstractmethod
    def wcs(self) -> BaseHighLevelWCS:
        """
        A WCS object representing the world coordinates described by this ``ExtraCoords``.

        .. note::
            This WCS object does not map to the pixel dimensions of the data array
            in the `.NDCube` object. It only includes pixel dimensions associated
            with the extra coordinates.  For example, if there is only one extra coordinate
            associated with a single pixel dimension, this WCS will only have 1 pixel dimension,
            even if the `.NDCube` object has a data array of 2-D or greater.
            Therefore using this WCS directly might lead to some confusing results.

        """

    @abc.abstractmethod
    def __getitem__(self, item: Union[str, int, slice, Iterable[Union[int, slice]]]) -> "ExtraCoordsABC":
        """
        ExtraCoords can be sliced with either a string, or a numpy like slice.

        When sliced with a string it should return a new ExtraCoords object
        with only those coordinates with the given names. When sliced with a
        numpy array like slice it should return a new ExtraCoords with the
        slice applied. Supporting step is not required and "fancy indexing" is
        not supported.
        """
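
The two indexing modes of `ExtraCoords.__getitem__`, together with the restrictions on steps and fancy indexing, could be separated by a small dispatcher. The sketch below is one possible approach and not part of the proposed API: `classify_item` is a hypothetical helper, and it assumes multi-axis items arrive as tuples, as they do with numpy-style indexing.

```python
import numbers


def classify_item(item):
    """Return ``("names", ...)`` for a string item or ``("slice", ...)`` for an array-like item."""
    if isinstance(item, str):
        return "names", (item,)
    items = item if isinstance(item, tuple) else (item,)
    for part in items:
        if isinstance(part, slice):
            if part.step not in (None, 1):
                raise IndexError("slicing with a step is not supported")
        elif not isinstance(part, numbers.Integral):
            # Lists, index arrays and similar objects are "fancy indexing".
            raise IndexError(f"unsupported index {part!r}: fancy indexing is not supported")
    return "slice", items


print(classify_item("time"))            # ('names', ('time',))
print(classify_item((0, slice(2, 5))))  # ('slice', (0, slice(2, 5, None)))
```

The same check would apply to `NDCube.__getitem__`, which accepts only the array-like form.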

Global Coords

The final part of the NDCube API is a class to provide structured metadata for coordinate values which are applicable to the whole cube. One possible use (not mandated by this SEP) is to have any coordinates dropped by a slicing operation on a higher dimensionality cube be tracked here. This class is only a slight extension of the API for an immutable dictionary, to enable the tracking of the physical type of the coordinate as well as its name and value. The class behaves like a mapping between coordinate name and coordinate object for all read operations.

from collections.abc import Mapping


class GlobalCoordsABC(Mapping):
    """
    A structured representation of coordinate information applicable to a whole NDCube.

    This class acts as a mapping between coordinate name and the coordinate object.
    In addition to this a physical type is stored for each coordinate name.
    A concrete implementation of this class must fulfill the `Mapping` ABC,
    including methods such as `__iter__` and `__len__`.

    Parameters
    ----------
    ndcube : `.NDCube`, optional
        The parent ndcube for this object. Used to extract global coordinates
        from the wcs and extra coords of the ndcube. If not specified only
        coordinates explicitly added will be shown.
    """
    def __init__(self, ndcube: Optional[NDCubeABC] = None):
        self._ndcube = ndcube

    def add(self,
            name: str,
            physical_type: str,
            coord: Any
            ):
        """
        Add a new coordinate to the collection.

        Parameters
        ----------
        name
            The name for the coordinate.
        physical_type
            An `IVOA UCD1+ physical type description for the coordinate
            <http://www.ivoa.net/documents/latest/UCDlist.html>`__. If no matching UCD
            type exists, this can instead be ``"custom:xxx"``, where ``xxx`` is an
            arbitrary string. If not known, can be `None`.
        coord
            The object describing the coordinate value, for example a
            `~astropy.units.Quantity` or a `~astropy.coordinates.SkyCoord`.
        """

    def remove(self, name: str):
        """
        Remove a coordinate from the collection.
        """

    @property
    def physical_types(self):
        """
        A mapping of names to physical types for each coordinate.
        """

    def __getitem__(self, item: str):
        """
        Indexing the object by name should return the coordinate object.
        """
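
To show how little is needed to satisfy this interface, here is a minimal concrete sketch built on a plain dictionary. `ToyGlobalCoords` is a hypothetical name, the parent `ndcube` argument is omitted, and rejecting duplicate names in `add` is an assumption of this sketch rather than a requirement of the SEP. Plain Python values stand in for objects such as `~astropy.units.Quantity`.

```python
from collections.abc import Mapping


class ToyGlobalCoords(Mapping):
    """Hypothetical minimal implementation: read-only mapping plus add/remove."""

    def __init__(self):
        self._coords = {}  # name -> (physical_type, coord)

    def add(self, name, physical_type, coord):
        # Assumption of this sketch: names must be unique.
        if name in self._coords:
            raise ValueError(f"coordinate {name!r} already exists")
        self._coords[name] = (physical_type, coord)

    def remove(self, name):
        del self._coords[name]

    @property
    def physical_types(self):
        return {name: ptype for name, (ptype, _) in self._coords.items()}

    # The three methods required by the Mapping ABC.
    def __getitem__(self, item):
        return self._coords[item][1]

    def __iter__(self):
        return iter(self._coords)

    def __len__(self):
        return len(self._coords)


gc = ToyGlobalCoords()
gc.add("exposure start", "time", "2020-10-13T00:00:00")
gc.add("distance", "pos.distance", 1.0)
print(dict(gc))           # {'exposure start': '2020-10-13T00:00:00', 'distance': 1.0}
print(gc.physical_types)  # {'exposure start': 'time', 'distance': 'pos.distance'}
```

Because the class derives from `Mapping`, methods such as `keys`, `items`, `get` and `in` checks come for free, while item assignment is unavailable, which matches the "immutable dictionary" behaviour described above.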