Add support for list comprehensions / loops in expression language #2067

Open
rly opened this issue Apr 15, 2024 · 1 comment
Labels
arrays enhancement New feature or request

Comments

@rly
Contributor

rly commented Apr 15, 2024

Is your feature request related to a problem? Please describe.
The LinkML expression language has many useful features, but it does not support for loops or list comprehensions. These would be useful when generating lists and arrays.

I would also like range() to be supported.

Describe the solution you'd like
For example, I want to define a class for a list of timestamps with a regular interval. It has an array attribute values_in_s that is defined by a starting time, a sampling rate, and a length. In Python, this would look like:
timestamp_values = [i / sampling_rate + start_time for i in range(length)]
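To make the intent concrete, here is a minimal Python sketch of a sequence that stores only the three parameters and computes each timestamp on demand. The class name RegularTimestamps and its field names are hypothetical; they are not part of LinkML or NWB.

```python
from collections.abc import Sequence
from dataclasses import dataclass


@dataclass(frozen=True)
class RegularTimestamps(Sequence):
    """Hypothetical illustration: timestamps defined by three stored numbers."""

    sampling_rate: float  # samples per second (Hz)
    start_time: float     # time of the first sample, in seconds
    length: int           # number of samples

    def __len__(self) -> int:
        return self.length

    def __getitem__(self, i):
        # Slices return a plain list of the requested timestamps.
        if isinstance(i, slice):
            return [self[j] for j in range(*i.indices(self.length))]
        # Support negative indices the way lists do.
        if i < 0:
            i += self.length
        if not 0 <= i < self.length:
            raise IndexError("timestamp index out of range")
        # The i-th sample occurs i sampling periods after the start.
        return i / self.sampling_rate + self.start_time


# A million timestamps, but only three numbers are held in memory.
ts = RegularTimestamps(sampling_rate=2.0, start_time=10.0, length=1_000_000)
print(len(ts), ts[0], ts[3], ts[-1])
```

Because the class implements the Sequence interface, iteration, membership tests, and slicing such as ts[:4] work without materializing the full list.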

Here is a more concrete example:

  RegularlySampledTimestampSeries:
    description: >-
      A 1D array of timestamps, represented efficiently using a sampling rate, starting time,
      and number of elements (length).
    attributes:
      sampling_rate_in_Hz:
        range: float
        required: true
        unit:
          ucum_code: Hz
      starting_time_in_s:
        range: float
        required: true
        unit:
          ucum_code: s
      length:
        range: integer
        required: true
        implements:
          - linkml:length
      values_in_s:
        range: float
        required: true
        unit:
          ucum_code: s
        array:
          exact_number_dimensions: 1
        # I would like to do the following:
        equals_expression: "[i / {sampling_rate_in_Hz} + {starting_time_in_s} for i in range({length})]"

This is important for adoption by the NWB team and for any other array cases where you don't want to store a million timestamps that can be simply computed by an equation.
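As a rough illustration of the requested behavior, the sketch below evaluates the equals_expression from the example above in plain Python. It is not LinkML's evaluator: the helper name evaluate_expression, the regular-expression substitution of {slot} references, and the use of eval are all assumptions made for illustration.

```python
import re


def evaluate_expression(expression: str, slot_values: dict):
    """Evaluate a '{slot}'-style expression against a dict of slot values.

    Hypothetical helper for illustration only; not LinkML's evaluator.
    """
    # Turn each "{slot_name}" reference into the bare name "slot_name".
    python_source = re.sub(r"\{(\w+)\}", r"\1", expression)
    # Expose only `range` plus the slot values. Everything goes into the
    # globals dict because, on some Python versions, a comprehension body
    # cannot see a separate locals mapping passed to eval().
    namespace = {"__builtins__": {}, "range": range, **slot_values}
    # eval() is used for brevity; it is not safe for untrusted schemas.
    return eval(python_source, namespace)


expression = (
    "[i / {sampling_rate_in_Hz} + {starting_time_in_s} "
    "for i in range({length})]"
)
values_in_s = evaluate_expression(
    expression,
    {"sampling_rate_in_Hz": 2.0, "starting_time_in_s": 10.0, "length": 4},
)
print(values_in_s)
```

A real implementation would likely parse the expression into a restricted syntax tree and allow only a small set of functions such as range, rather than calling eval directly.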

How important is this feature?
• Medium - can do work without it; but it's important (e.g. to save time or for convenience)

When will use cases depending on this become relevant?
• Long-term - 6 months - 1 year

rly added the arrays label Apr 15, 2024
@rly
Contributor Author

rly commented Apr 15, 2024

Here is a more complex 2D case. LinkML does not need to support this case (we can just add it as a convenience function to the API), but it would be nice.

The 1D case above is important because the number of stored values goes from a million to three.

  ElectrodeRecordingData:
    is_a: NwbObject
    implements:
      - linkml:Array
    description: >-
      A 2D array of voltage measurements from electrodes over time.
      This class is designed to represent either:
      1) raw data from a data acquisition system in ADC units (ADU),
          e.g., int16 values that span a range of -32768 to 32767, that need to be converted to volts,
          e.g., float values from -150 mV to 250 mV, using a conversion factor (e.g., 200/32768)
          and offset (e.g., 50 mV).
      2) data that has already been converted to volts.
      Storage of the raw ADC values is preferred over conversion and then storage in volts
      to be more efficient and represent the resolution of the original data.
      See ElectrodeRecording for its usage with axes labels.
    attributes:
      per_electrode_conversion_factor:
        range: float
        multivalued: true  # length must match {raw_values}.shape[1]
        # default value is a list of 1s
      conversion_factor:
        range: float
        # default value is 1
      offset_in_V:
        range: float
        unit:
          ucum_code: V
        # default value is 0
      raw_values:
        range: float
        required: true
        array:
          exact_number_dimensions: 2
      values_in_V:
        range: float
        required: true
        unit:
          ucum_code: V
        array:
          exact_number_dimensions: 2
        equals_expression: "[{per_electrode_conversion_factor}[i] * {conversion_factor} * {raw_values}[:,i] + {offset_in_V}
                            for i in range({raw_values}.shape[1])]"
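
The conversion that the expression above describes can be sketched in plain Python as follows. This is an illustration under stated assumptions, not an existing LinkML or NWB function: the name convert_to_volts is hypothetical, the raw data is assumed to be laid out as rows of time samples with one column per electrode, and the output keeps that same layout.

```python
def convert_to_volts(raw_values, per_electrode_conversion_factor,
                     conversion_factor=1.0, offset_in_V=0.0):
    """Convert raw ADC values to volts (hypothetical helper).

    raw_values is assumed to be a list of rows, one row per time sample,
    with one entry per electrode in each row.
    """
    n_electrodes = len(per_electrode_conversion_factor)
    # One per-electrode factor is needed for every column of the raw data.
    for row in raw_values:
        if len(row) != n_electrodes:
            raise ValueError("each row needs one value per electrode")
    return [
        [per_electrode_conversion_factor[i] * conversion_factor * row[i]
         + offset_in_V
         for i in range(n_electrodes)]
        for row in raw_values
    ]


raw = [[2, 4], [6, 8]]  # two time samples, two electrodes
volts = convert_to_volts(raw, [1.0, 2.0], conversion_factor=0.5,
                         offset_in_V=1.0)
print(volts)
```

With NumPy arrays, the same computation would typically be written as one broadcast expression, raw_values * (per_electrode_conversion_factor * conversion_factor) + offset_in_V, which is the kind of convenience function an API could provide.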

nlharris added the enhancement label Apr 23, 2024