Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support for BrainVoyager VTC, MSK, VMR and NR-VMP files #216

Open
wants to merge 9 commits into
base: master
Choose a base branch
from

Conversation

thomastweets
Copy link

After talking to Emanuele Olivetti and Yaroslav Halchenko about it I submit this pull request for implementing support for BrainVoyager QX files. I started out with VTC, MSK, and NR-VMP files and read and write does work. As this is my first major pull request I would really appreciate some comments or help. I did not write nose tests yet and the code is not thoroughly tested yet either.

I hope that my implementation is not too "dirty" as it was a little difficult for me to implement the variable-length header of BrainVoyager files...

@matthew-brett
Copy link
Member

Thanks a lot - it would be good to support these files.

We really need tests - the problem is that soon becomes very hard to alter the code if we can't confirm it is still working - and it's also much harder to see how the code works without tests.

Can you find some very small (< 30K) example files to test against - and include in the repo?

@thomastweets
Copy link
Author

I will compile some test files and include them. I will also try to dabble in nose tests.

Good to hear that you guys appreciate the new features :)

@thomastweets
Copy link
Author

OK, I added three test files (one for each format) that show up correctly in BrainVoyager.
I am not sure when I will able to look into writing tests for it but it is definitely on my ToDo list.

BTW: The data are just noise, so do not expect any brain-like images.

@matthew-brett
Copy link
Member

Sorry, I just got to look at this one.

Generally the code looks nicely done, thank you. The striking issue is the lack
of tests - well - you know.

I personally prefer to put my constants like the data definitions for the header
at the module global level or at least in the class so they are visible if
necessary for subclassing, say for later versions of the format.

You may want to look at nibabel/tests/test_image_api.py for the definition
of the image API and a rig to run the API conformance for the image formats that
we support.

You've very sensibly added 'order' to the ArrayProxy init - in fact I did the
same thing in the current 'add-proxy-slicing' PR that will soon be merged. I
used a class attribute in order not to change the init for ArrayProxy. In
your case I think you (will) want:

class CArrayProxy(ArrayProxy):
    order = 'C'

class BvFileImage(SpatialImage):
    ...
    ImageArrayProxy = CArrayProxy

Those were the thoughts that came to me from a quick read - thanks.

@thomastweets
Copy link
Author

Thanks for your comments!

I refactored the ArrayProxy implementation.

Regarding the constants: When I started writing the code I was of course inspired by analyze.py. In there the constants are kept on a global level. I like that and it looks much cleaner, indeed. However, BrainVoyager file format have almost always variable-length headers so the constants are no real constants anymore. Would you have a suggestion for a clean solution to that?

I now look at the tests!

@matthew-brett
Copy link
Member

Thanks for this work - I like the changes.

I don't know if it is worth copying so much of the 'from_header' class method
-- is this just for 'as_bv_map'? I put 'as_analyze_map()' in to allow different
analyze-type headers (Analyze, SPM Analyze, Nifti1, Nifti2) to give matching
information to each other. I also imagined that formats like MINC could
generate matching information if someone could be bothered to write an
'as_analyze_map' method for MINC, if you see what I mean. So 'as_bv_map' could
be useful if there's a set of key, value pairs that match between BV header
types, and that would be cleaner than special casing the conversion at the top
of the method, then calling 'super' or similar.

By the way, I can't see 'as_bv_map' in the code.

Do you need 'get_slope_inter', 'set_slope_inter'? For the header object?
Analyze types need this because some scale and some don't - but we don't use
these method currently outside the Analyze types.

Can you point to some documentation for the format somewhere?

Doesn't 'mask_header_dtd' belong at the module global level?

How about putting 'vmp_header_dtd' at the module global level, and the submap
dtd similarly, then use copies of these globals when compiling the full header?

I guess you could keep a template for 'vtc_header_dtd' and fill in the 'prts'
string when you create the real dtd.

@thomastweets
Copy link
Author

Thanks again for your comments!

OK, now the 'as_analyze_map' makes more sense for me! Actually it was still a left-over from adapting analyze.py. I have not implemented 'as_bv_map' myself. I might get rid of it for now, however, it might something nice for having an inter-compatibility between BV filetypes like you suggested.

I actually do not need get/set_slope_inter for BV filetype. This is also just an adaptation left-over. I will get rid of it.

There is some documentation:
VTC files: http://support.brainvoyager.com/installation-introduction/23-file-formats/379-users-guide-23-the-format-of-vtc-files.html
NR-VMP files: http://support.brainvoyager.com/installation-introduction/23-file-formats/377-users-guide-23-the-format-of-nr-vmp-files.html
for the MSK files I adapted a Matlab function, I think.

I will try to refactor the dtype templates so that they reside at the modules global level.

I had a look at test_image_api.py and tried to write some tests. I ran into problems regarding the dtypes - I need to fix the function of get/set_data_dtype. I am on it.

@matthew-brett
Copy link
Member

Thanks a lot for the links. Did you already put them somewhere in the code

  • like at the top-level docstring?

If you find you're having to do dumb stuff to get BV format to work we've
probably got the design wrong, so do let me know and I'll see if I can
refactor.

On Tue, Dec 17, 2013 at 8:20 AM, Thomas Emmerling
notifications@github.comwrote:

Thanks again for your comments!

OK, now the 'as_analyze_map' makes more sense for me! Actually it was
still a left-over from adapting analyze.py. I have not implemented
'as_bv_map' myself. I might get rid of it for now, however, it might
something nice for having an inter-compatibility between BV filetypes like
you suggested.

I actually do not need get/set_slope_inter for BV filetype. This is also
just an adaptation left-over. I will get rid of it.

There is some documentation:
VTC files:
http://support.brainvoyager.com/installation-introduction/23-file-formats/379-users-guide-23-the-format-of-vtc-files.html
NR-VMP files:
http://support.brainvoyager.com/installation-introduction/23-file-formats/377-users-guide-23-the-format-of-nr-vmp-files.html
for the MSK files I adapted a Matlab function, I think.

I will try to refactor the dtype templates so that they reside at the
modules global level.

I had a look at test_image_api.py and tried to write some tests. I ran
into problems regarding the dtypes - I need to fix the function of
get/set_data_dtype. I am on it.


Reply to this email directly or view it on GitHubhttps://github.com//pull/216#issuecomment-30749833
.

@matthew-brett
Copy link
Member

Anything I can do to help here?

@thomastweets
Copy link
Author

Sorry for not responding in a while - I have been travelling and only came back two days ago. I will continue to work on the pull request at the end of this week.

@JasperVanDenBosch
Copy link
Contributor

Looking forward to this. Let me know if you need help, I know Brainvoyager and Python. Also you might want to checkout NeuroElf, the matlab API for brainvoyager files, see how they do this.

@thomastweets
Copy link
Author

OK, this is not abandoned - I had some other work to do and it took me some time to figure out how some things in nose tests work.
I added tests in test_image_api and most pass. However, I still have some issues:

  • dtype conversion is not handled very smoothly; I throw errors when a dtype is used that is not supported (actually only VTC files allow two dtypes, the other file formats allow only one). Would you have an idea how to handle that?
  • same goes for dimensions: VTC and VMP files have almost always four, MSK files have three. Automatic testing fails when it tries to assign a shape with another number of dimensions. What makes it more difficult is that the order of dimensions of the actual data array in BV files is different: ZYXT (T := volumes) for VTC files, NZYX (N := number of submaps) for VMP files.

I will also write more format-specific tests (test_bv.py).

I did some major refactoring to put the header dtype definition at the modules top level (actually putting it into functions in some cases). Indeed, I think this helps readability a lot! I also tried to inherit more from the main BvFileHeader and BvFileImage classes to unify things more.

I put the link to file format informations in the doctstring of each module.

I also got my head around affine transformations for BV files. This document provides quite some information on how axes in BV files are aligned: http://support.brainvoyager.com/documents/Available_Tools/Available_Plugins/niftiplugin_manual_180610.pdf
I think I got it as far as it gets. In the end there is no information on real-world coordinates in the file formats that I implemented.
There is a voxel size information that is relative to an anatomical (VMR) file (so for example voxels in VTC files are set to be twice the size as in VMR files). I implemented that into the zooms.
There is also some translation information in the files: they give information about the start and end points of a bounding box inside a (not directly referenced) anatomical (VMR) file. This bounding box positions the data array itself. I took the vector from the center of the (guessed) coordinate system of the VMR file to the center of this bounding box as a translation for the affine transformation.

Have I understood it right that the affine transformation should produce data in RAS orientation? So that the first axis has values left-to-right (from subjects perspective), the second is posterior-to-anterior, and the third is inferior-to-superior? Are there more things to take into account?

Could you hint me to a straightforward way of visualising such a transform with sample data to compare it to the way that BrainVoyager displays it?

@ilogue Thank you for the hint to NeuroElf! Indeed, I use it on a daily basis. However, MATLAB can feel like a cage if you know the world of Python... ;)

@thomastweets
Copy link
Author

Would you have any new comments for me?

@matthew-brett
Copy link
Member

Ah very sorry to be slow - I have something I have to finish these next few
days - but I have it next on my list. Thanks for reminding me.

On Tue, Mar 11, 2014 at 12:37 AM, Thomas Emmerling <notifications@github.com

wrote:

Would you have any new comments for me?

Reply to this email directly or view it on GitHubhttps://github.com//pull/216#issuecomment-37269894
.

@thomastweets
Copy link
Author

No problem, take your time!

@thomastweets
Copy link
Author

Sorry for bumping this again, but would you have a look and some comments to my open questions?

@matthew-brett
Copy link
Member

A thousand apologies for being so slow - I will have a look now.

@matthew-brett
Copy link
Member

OK - I see my fear that I would have to do some serious reading was justified :)

I'm starting to grasp that you want to support three or four of the many BV formats:

  • VTC: probably stands for 'voyager time course' for image time course data (bv_vtc.py)
  • ?VMR: probably stands for 'voyager MR' for anatomical images. Is this suppported in the 'bv.py' file?
  • VMP : voyager volume map. Two types NR-VMP (native-resolution) and AR-VMP (anatomical resolution). Is it just NR-VMP you support in 'bv_vmp.py'?
  • MSK : mask files. Only documented in an email 'bv_msk.py'.

I will get down to more reading - but it would help a lot if you had time to spend 10 minutes or so on Skype to talk me through it. Is that possible? If you think you might have time, I'll suggest a developer hangout for nibabel to get things moving a little quicker.

from .spatialimages import HeaderDataError, HeaderTypeError
from .batteryrunners import Report

def _make_vtc_header_dtd(fmrlt,prts):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be neater to specify an int length, tuple of int lengths here? Then add the 'S" at the beginning for the field definition, and the field names for the prts? That way you wouldn't have to do that in the calling code.

@thomastweets
Copy link
Author

Thank you for your comments!
I would actually appreciate a skype session or hangout very much! I think that could clarify a lot about the file formats. My time zone is UTC+1 (Netherlands). I would be pretty much free the next days. Maybe you can propose a date and I will send you my skype name.

@matthew-brett
Copy link
Member

How about tomorrow at 2000 your time 1100 my time?

Meanwhile I will do some documentation of the image format.

Do you mind if I invite anyone on the list who wants to join?

@thomastweets
Copy link
Author

That sounds great!
Sure, invite anyone who wants to join.
Maybe this helps you with documentation/getting your head around the formats (from a comment above):
VTC files: http://support.brainvoyager.com/installation-introduction/23-file-formats/379-users-guide-23-the-format-of-vtc-files.html
NR-VMP files: http://support.brainvoyager.com/installation-introduction/23-file-formats/377-users-guide-23-the-format-of-nr-vmp-files.html

I just sent you an email with my contact details.

See you tomorrow!

@matthew-brett
Copy link
Member

I was thinking, that you could define a new format for your header information - maybe a ordered dict, and pass this format into the constructor, as in:

vtc_header_info = OrderedDict( .. )
vtc_header = VTCHeader(vtc_header_info)

The VTCHeader class would implement from_fileobj which would construct the ordered dict from the data in a file, and write_to which would write the ordered dict to the binary format.

I guess you could just use the struct module to read and write the binary format, instead of the numpy structured types: https://docs.python.org/2/library/struct.html

@matthew-brett
Copy link
Member

Alternatively you could have the header being contained by a numpy structured array with the stuff that is fixed, and a dictionary for the strings that are not - such as the strings. Maybe then you wouldn't need the IPython traitlets.

@thomastweets
Copy link
Author

I thought a lot about the different ways there are to make the code more clean. However, they all have their drawbacks:

  • If I use an (ordered) dict I will have to do a line-by-line parsing, which will be quite ugly (compared to the one-timer structured array readout). Additionally I am not sure if a nested dict will play well with the struct to write a binary format. I would need such nested dicts to model multiple sub-maps in VMP files.
  • For the struct itself I need to provide a format just like with the structured array.
  • If use a structured array + a dict for the strings then I run into problems merging them again for writing out a binary stream (the strings would need to be filled in at arbitrary positions of the binary string).

I am just not sure if the final version would be any more readable than right now. Do you share my concerns? ;)

@mtakemiya
Copy link

Is it possible to read VTC files using nibabel at this point?

@thomastweets
Copy link
Author

@mtakemiya Yes that should be possible with the fork on my github account. Indeed, I used it for my own data. However, it is not thoroughly tested yet.

@matthew-brett
Copy link
Member

For the format - are these the main problems?

  1. BV likes null-terminated strings, so the C structs can't be defined beforehand, because they depend on the lengths of the strings.
  2. There can be variable numbers of strings (VTC PRT files)
  3. VMR files have variable numbers of maps, timepoints and parameters.

At some point the data has to be written out in binary format, and its convenient to use the numpy structured array format to do that.

The question is whether that's the best way to store the header data - and read it in.

For example, I think I'm right that - at the moment - it's not easy to add a file to the list of PRT files? Or change a name? Maybe I'm not understanding.

To make it easier to do something like that, I could imagine encoding the string data - say as python strings, recorded as 'O' (object) type in the structured array defining the header. Similarly the PRT files could be of 'O' type, but this time be a list.

For changing things in memory, this would just work as expected, appending to a list, changing the string for another string. For writing you could convert the 'O' field on the fly to to an S field, or the list of strings to a sequence of S fields - I guess. I mean - use a different structured array for storing in memory and writing out to files.

But - you have a much better feeling for the problem than me, so go with what you think is best.

@matthew-brett
Copy link
Member

Where are we with this one?

I'm planning a release soon - maybe we should defer this one until the next release? I will do another release as soon as this is ready.

You probably saw that the OrderedDict PR is merged now.

I think I said that I would do a doc on coordinate systems, and that I would make a better image testing rig. I will get onto those now. Anything else I can help with?

@thomastweets
Copy link
Author

Sorry for the long delay - after HBM there were some project tasks to catch up with that took me quite some time.

The new doc on coordinate systems is definitely helpful and the image testing would be great, too! Apart from that I will have look during the next week whether I can store data in a more convenient/pythonic way between reading and writing. I had already added some facilities to allow for example adding and removing of submaps in VMP files (that would need an on-the-fly header and corresponding structured numpy array change) but the solution is indeed not very clean.

I hope I can finish that over the next week. However, maybe it is wiser to include it after the next release, also to allow more time for testing.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.05%) to 95.901% when pulling a87853d on thomastweets:master into 42959a3 on nipy:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.05%) to 95.9% when pulling 9a13fd2 on thomastweets:master into 42959a3 on nipy:master.

@coveralls
Copy link

Coverage Status

Coverage decreased (-0.05%) to 95.9% when pulling d2929fc on thomastweets:master into 42959a3 on nipy:master.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.05%) to 95.998% when pulling aaa9a95 on thomastweets:master into 42959a3 on nipy:master.

@thomastweets thomastweets changed the title Support for BrainVoyager VTC, MSK, and NR-VMP files Support for BrainVoyager VTC, MSK, VMR and NR-VMP files Aug 18, 2016
@thomastweets
Copy link
Author

@matthew-brett OK, finally coverage increased (and is very high for the BV-related files). This is now definitely ready for review :)

@matthew-brett
Copy link
Member

Thanks - I will try and get to this ASAP.

@thomastweets
Copy link
Author

@matthew-brett I know that this became quite a monster of a PR, but do you think it can be merged for the next release? Just to give you a heads-up: I will start a new job in September and might be (even) more unresponsive in the future unfortunately because of time constraints.

@matthew-brett
Copy link
Member

Sorry to take a while to get back to you, I just had the first day of a course I am teaching, and I got a bit overwhelmed. Have time now, but, is it too late for you? Do you have a day or so to respond to stuff, or will you be maxed out?

@thomastweets
Copy link
Author

Now it took me a while to get back to you ;)
I am now a bit more settled already (move + new job). Over the next week I will be travelling but next Tuesday I should be ready to respond to comments. I will probably respond slower as I can only work on this during the night, though.

Copy link
Member

@matthew-brett matthew-brett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a tiny first set of suggestions, more tomorrow.

str_list : generator of string(s)
"""
currentPos = f.tell()
if strip:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider suffix = b'' if strip else b'\x00'

-------
str_list : generator of string(s)
"""
currentPos = f.tell()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PEP8 naming conventions current_pos for currentPos etc.

if rewind:
f.seek(currentPos)
else:
offset = 0
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

    offsets = [len(lines[s]) + 1 for s in range(nStrings)]
    f.seek(current_pos + sum(offsets))

)


def readCString(f, nStrings=1, bufsize=1000, startPos=None, strip=True,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Consider read_c_string for PEP8 naming?

Parameters
----------
f : fileobj
File object to use
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mention that object should implement tell, seek, read.

# handle conditional fields
elif isinstance(def_or_name, tuple):
if hdr_dict[def_or_name[1]] == def_or_name[2]:
bytes = fileobj.read(calcsize(format))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a builtin for Python 3 - rename to bytes_ or similar?

hdr_dict: OrderedDict
hdr_dict that contains the fields and values to for the respective
BV file format.
parent_hdr_dict: OrderedDict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document None, and is optional, as for parse_BV_header above.

'NrOfLags' in the VMP file header).
"""
hdr_dict = OrderedDict()
for name, format, def_or_name in hdr_dict_proto:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

format is a builtin function, rename to format_ or similar?

hdr_dict: OrderedDict
hdr_dict that contains the fields and values to for the respective
BV file format.
parent_hdr_dict: OrderedDict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Document None, and that it's optional, as for parse_BV_header above?

n_values = hdr_dict[def_or_name]
else:
n_values = parent_hdr_dict[def_or_name]
for i in range(n_values):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe

hdr_size = sum(calc_BV_header_size(format_, value[i], hdr_dict) for i in range(n_values))

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I changed all other comments (from this post) but here I run into failed tests when changing to this. I will still look into it once more, however, for now I left it like it was before.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't follow my own logic here, so not changing is very reasonable.

Copy link
Member

@matthew-brett matthew-brett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Another bite-size bit of review.

if hdr_dict[def_or_name[1]] == def_or_name[2]:
part = pack('<' + format, value)
else:
continue
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you mind putting a comment here to explain this case?

"""
for name, format, def_or_name in hdr_dict_proto:
# handle nested loop fields
if isinstance(format, tuple):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should the format be ignored in this case? Maybe a comment here? To reduce indentation and make intention clear, mabye

    if not isinstance(format, tuple):
        continue
    ...

if format == 'z':
hdr_size += len(value) + 1
# handle array fields
elif isinstance(format, tuple):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This pattern seems to repeat a lot. Can you think of a way of making this more general, to share code between the functions with these same sets of conditionals? I was thinking vaguely about mapping these conditionals to actions with a dictionary or something that. Maybe a silly idea, asking in hope you can come up with something.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree that this is not the most beautiful style here. However, there are a lot of weird edge cases with BV headers and I fear that while trying to make things more general one would make things really hard to understand, too. At least this repetition is confined to these module level functions in bv.py and do not repeat for different file formats.
I will give it some more thought though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I understand, and I agree, if nothing clean and nice comes to mind, the repetition is OK.

hdr_dict before any changes.
hdr_dict_new: OrderedDict
hdr_dict with changed fields in n_fields_name or c_fields_name fields.
parent_old: OrderedDict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note can be None, and is optional.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in:

parent_old: None or OrderedDict, optional

When update_BV_header() is called recursively the not yet updated
(parent) hdr_dict is passed to give access to n_fields_name fields
outside the current scope (see below).
parent_new: OrderedDict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can be None, is optional.

if isinstance(format, tuple):
# calculate the change of array length and the new array length
if def_or_name in hdr_dict_old:
delta_values = hdr_dict_new[def_or_name] - \
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Vague preference for parentheses round expression to allow line breaks, rather than continuation character \ : https://www.python.org/dev/peps/pep-0008/#id19


Parameters
----------
st_array: array
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess this is a 3D array shape (n, 4, 4)?

-------
combined_st : array of shape (4, 4)
"""
if len(st_array) == 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How about:

from functools import reduce
if inv:
     st_array = [np.linalg.inv(aff) for aff in st_array]
return reduce(np.dot, st_array)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on suggested edit here?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't remember where we were with the inverses and ordering? Is this right - to multiply all the affines in the order A[0] @ A[1] @ .... A[N-1] (where @ is matrix multiplication)? So A[N-1] applied first, A[0] applied last? And same for inverses?: inv(A[0]) @ inv(A[1]) @ ... inv(A[N-1])?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thomas - what do you think about this one? Is there any way to check what the order should be?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And - what do you think of my suggested edit above?

array filled with transformation matrices of shape (4, 4)

inv: boolean
Set to true to invert the transformation matrices before
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a slightly odd transformation - can you comment more? I was expecting you to invert the result of the multiplication, not the individual elements.

Is it right that the transformation you want is st_array[-1] followed by st_array[-2] ... etc? I mean, to do mean the order of the arrays to be the order of the transformation, or the opposite (as you get from taking the dot product from left to right).

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I am waiting for a connection flight right now that was delayed 5 hours and I am already jetlagged, so I should definitely not make up my mind about spatial transformation matrices, BUT:
What I want here is to compute a transformation matrix from the orientation of the data in the BV file to their native orientation by reversing all the transformations that were performed on the data. These transformations are stored (e.g.) in the VMR file header in an ordered fashion. So I guess if taking the dot product from left to right gets me there, this is what I want.
But again, I will better try this out with a clear mind :)

@coveralls
Copy link

Coverage Status

Coverage increased (+0.04%) to 95.994% when pulling fdd34b5 on thomastweets:master into 42959a3 on nipy:master.

Copy link
Member

@matthew-brett matthew-brett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So sorry for long delay - was teaching this term.

Nearly there now.

Have you got any other BV files lying around we can add as more extensive tests, using http://nipy.org/nibabel/devel/add_test_data.html#adding-as-a-submodule-to-nibabel-data ?

hdr_dict_proto: tuple
tuple of format described in Notes below.
fileobj : fileobj
File object to use. Make sure that the current position is at the
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for edits

n_values = hdr_dict[def_or_name]
else:
n_values = parent_hdr_dict[def_or_name]
for i in range(n_values):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't follow my own logic here, so not changing is very reasonable.

hdr_dict before any changes.
hdr_dict_new: OrderedDict
hdr_dict with changed fields in n_fields_name or c_fields_name fields.
parent_old: OrderedDict
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As in:

parent_old: None or OrderedDict, optional

# handle only nested loop fields
if not isinstance(pack_format, tuple):
continue
else:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can drop this else: and deindent the block after it.

-------
combined_st : array of shape (4, 4)
"""
if len(st_array) == 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any thoughts on suggested edit here?

[0., 0., 1., 1.],
[0., 0., 0., 1.]]
assert_array_equal(combinedST, correctCombinedST)
combinedST = combine_st(STarray, inv=False)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this test sensitive to order?

with InTemporaryDirectory():
# create a tempfile
path = 'test.header'
fwrite = open(path, 'wb')
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you can be bothered, nicer as:

with open(path, 'wb') as fwrite:
    fwrite.write(binary)

Similarly for open usage below.

@@ -281,7 +281,7 @@ def set_data_dtype(self, datatype):
try:
code = self._data_type_codes[datatype]
except KeyError:
raise MGHError('datatype dtype "%s" not recognized' % datatype)
raise HeaderDataError('datatype dtype "%s" not recognized' % datatype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why these error changes?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With the added supported_np_types check in spatialimages the missing HeaderDataError produced errors in the test_image_api and test_image_load_save tests. HeaderDataError seems to be the "correct" error to raise here.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - thanks.

@@ -50,11 +50,6 @@ def copy(self):
return FunkyHeader(self.shape)


class CArrayProxy(ArrayProxy):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did you mean to move this to the BV module?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(I think based on your suggestion) I included the CArrayProxy class to arrayproxy.py because I need it for BV file formats. As this class was defined in test_arrayproxy.py as well I rather imported it and deleted the duplicate. But maybe I did not understand the logic here (or your question :)).

supported_dtypes = supported_np_types(img.header_class())
if new_dtype not in supported_dtypes:
new_dtype = supported_dtypes.pop()
except:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you specify the expected errors here? Bare except is a bit frightening...

@matthew-brett
Copy link
Member

@thomastweets - will you have any time to respond to my comments here (final stretch)? Or would you prefer I merged this as-is, and then submitted some more PRs with edits?

@thomastweets
Copy link
Author

@matthew-brett I am very sorry for the late response: I am quite busy with my new job, however, I will try to make some time for your comments next weekend. Looking forward to having this completed :)

@matthew-brett
Copy link
Member

@thomastweets - sorry to press you - will have any time to work on this? It's so close ...

@thomastweets
Copy link
Author

Sorry for taking so long!
I implemented most of your comments. For some of them I would need additional feedback from you (see the comments). Also, I do not have access to a BrainVoyager license at the moment but I will try to get my hands on one soon. Therefore, creating more test files is tricky (that is, manually converting existing open datasets into BrainVoyager files as I not sure whether I can share datasets of mine here). I will look into this as soon as possible but my time is a bit restricted these days.

@coveralls
Copy link

Coverage Status

Coverage increased (+0.1%) to 96.056% when pulling 351ca44 on thomastweets:master into 42959a3 on nipy:master.

@nibotmi
Copy link
Contributor

nibotmi commented Mar 7, 2017

☔ The latest upstream changes (presumably #249) made this pull request unmergeable. Please resolve the merge conflicts.

Copy link
Member

@matthew-brett matthew-brett left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just a few more comments.

-------
combined_st : array of shape (4, 4)
"""
if len(st_array) == 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thomas - what do you think about this one? Is there any way to check what the order should be?

-------
combined_st : array of shape (4, 4)
"""
if len(st_array) == 1:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And - what do you think of my suggested edit above?

default_endianness = '<' # BV files are always little-endian
allowed_dtypes = [1, 2, 3]
default_dtype = 2
allowed_dimensions = [3]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This attribute appears to be unused?

endianness=default_endianness,
check=True,
offset=None):
"""Initialize header from binary data block.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this doesn't initialize from a binary data block, but from a hdr_dict OrderedDict instance.

Am I right in think that, on order to know the length of a BV header, you need to parse the header? So it's not easy to pass in a given binary block to this routine, and then parse it, you have to parse it and then pass it in (here as hdr_dict).

In which case, this call isn't compatible with the Nifti1 etc header creation calls, and I think you can / should also drop the endianness parameter, because it's not doing anything at the moment - except getting checked.

if header is None:
return obj
try: # check if there is a specific conversion routine
mapping = header.as_bv_map()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure. Each header defines its preferred map generation method, with default None. So:

Header.map_method_name == None
AnalyzeHeader.map_method_name = 'as_analyze_map'
BvFileHeader.map_method_name = 'as_bv_map'.

Then the Header class has the following method (moved from the AnalyzeHeader class):

    @classmethod
    def from_header(klass, header=None, check=True):
        ''' Class method to create header from another header

        Parameters
        ----------
        header : ``Header`` instance or mapping
           a header of this class, or another class of header for
           conversion to this type
        check : {True, False}
           whether to check header for integrity

        Returns
        -------
        hdr : header instance
           fresh header instance of our own class
        '''
        # own type, return copy
        if type(header) == klass:
            obj = header.copy()
            if check:
                obj.check_fix()
            return obj
        # not own type, make fresh header instance
        obj = klass(check=check)
        if header is None:
            return obj
        # Check for method that returns a mapping this header understands
        map_method = (None if self.map_method_name is None
                                  else getattr(self.header, self.map_method_name, None))
        if map_method is not None:
            # header is convertible from a field mapping
            mapping = map_method()
            for key in mapping:
                try:
                    obj[key] = mapping[key]
                except (ValueError, KeyError):
                    # the presence of the mapping certifies the fields as being
                    # of the same meaning as for this class, so we can
                    # safely discard fields with names not known to this header
                    # type on the basis they are from the wrong dialect
                    pass
            # set any fields etc that are specific to this format (overriden by
            # sub-classes)
            obj._clean_after_mapping()
        # Fallback basic conversion always done.
        # More specific warning for unsupported datatypes
        orig_code = header.get_data_dtype()
        try:
            obj.set_data_dtype(orig_code)
        except HeaderDataError:
            raise HeaderDataError('Input header %s has datatype %s but '
                                  'output header %s does not support it'
                                  % (header.__class__,
                                     header.get_value_label('datatype'),
                                     klass))
        obj.set_data_dtype(header.get_data_dtype())
        obj.set_data_shape(header.get_data_shape())
        obj.set_zooms(header.get_zooms())
        if check:
            obj.check_fix()
        return obj

This the variant where the method goes in the Header class, but it could also be a mixin class, just containing this method, where AnalyzeHeader and BvFileHeader both inherit from both of Header and HeaderMapMixin, as in:

class BvFileHeader(Header, HeaderMapMixin):

if any([d > fc for d in (x, y, z)]):
continue
else:
return fc, fc, fc
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens if some d > 1024 ?


def get_data_offset(self):
"""Return offset into data file to read data."""
self.set_data_offset(calc_BV_header_size(
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Your call. You can make set_data_offset raise an error if you like (parrec.py does this) - or you can let people do a set if they want, hoping they understand the risk.

Do people tend to store stuff in BV files after the header and before the data? That's not uncommon in Analyze, but very rare for Niftis for example. If it's rare, maybe raising an error is better.

By the way, you're using this in the header constructor, so I guess the offset in the header constructor is getting thrown away, in fact. Maybe you can remove it from the header parameter list?

binaryblock = pack_BV_header(self.hdr_dict_proto, self._hdr_dict)
fileobj.write(binaryblock)

def check_fix(self, logger=None, error_level=None):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well - could use a Mixin I guess - as above. Not sure whether it's worth it.

@@ -281,7 +281,7 @@ def set_data_dtype(self, datatype):
try:
code = self._data_type_codes[datatype]
except KeyError:
raise MGHError('datatype dtype "%s" not recognized' % datatype)
raise HeaderDataError('datatype dtype "%s" not recognized' % datatype)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK - thanks.

@effigies
Copy link
Member

@thomastweets Sorry to see this one fall by the wayside. Is there any interest in trying to pick it back up for the 2.5 (July) or 3.0 (October) releases? If you need help resolving conflicts, I'm happy to give it a go.

@effigies effigies added this to New data formats in Nilearn Dev Days 2021 Apr 26, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
No open projects
Nilearn Dev Days 2021
New data formats
Development

Successfully merging this pull request may close these issues.

None yet