-
Notifications
You must be signed in to change notification settings - Fork 25
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Import nexus file as ObservationMatrix #3929
Import nexus file as ObservationMatrix #3929
Conversation
…m a file This commits the result of `bin/rails generate job import_nexus --queue import_nexus` Testing that import_nexus jobs can be created and run: * From `bin/rails c`, do `::ImportNexusJob.perform_later()` * From an app shell, do `QUEUE=import_nexus rake jobs:workoff` - it should report a nexus_import job was run (if you `puts` some text from your job you should see the text in your console)
… Import Nexus task I tried as much as I could to write things so they could be reused in a Filter Documents task (in case that ever comes to be and if the facet I've written is desired there). In that event maybe we'll want to filter by document type as well, and maybe the user should be able to enter an extension and/or type.
… radio list So you can select an extension and then go back to no extensions selected (which returns the most recent documents, which may or may not be available from other tabs depending on recent documentation activity).
… import For now the import code is in a function called `runit` on the ObservationMatrix controller, for ease of testing. Still many TODOs and decisions about how things should work, but this version will import a basic nexus matrix into TW.
As I noted at the defineModel() for the options ref, and as the vue docs point out: if you define your model ref default in defineModel, the default is not passed to the parent - so if your options, like matrix name, are never touched, then you wind up passing an empty options hash to the controller and it raises on an empty required param object. The solution is simply to assign the default outside of defineModel().
…tu taxon matching This is what was initially intended of course.
…names That's what we were already doing, this just makes it optional.
…ng a new one This is required for Qualitative Observations since they have a unique validation.
Note that the particular names assigned to gap states has implications for which existing character states will match the ones being imported (if the user chooses the option to match imported characters to existing ones), in ways users might not be able to guess unless they know the algorithm used to generate gap names.
…not just the first There can be multiple descriptors with the same name
…e, use it for preview That's what we were already doing, now there's an option to turn it off
…requests On GET, as opposed to POST, booleans are passed as strings, not bools
…or preview as import This supports repeated taxa, though the ui doesn't currently differentiate repeats.
…th preview and import This supports duplicate nexus descriptors, though the matrix ui won't differentiate duplicates.
… a little With the file read error combined with the parse error, I think the message 'Nexus parse error: Couldn't find Document with 'id'=0' works fine.
I never felt comfortable putting those in the model, but I'm still kinda fuzzy on why it's okay to have them in the controller just because they rely on current_session_id (does that get passed into a method in the model sometimes or should code that relies on session_current_project_id really not be in the model?).
…s gap states Per @mjy (thanks!).
They'll get discarded if they ever make it back to nexus_parser again.
Ignoring whitespace on those attributes is not good. We'll have to find another route. Can we pre-process the nexus names with the same string cleaning method before match? I'm actually working on some related improvements, I think it should be possible. |
@kleintom I haven't looked at the UI here at all yet, but we can plan for a 2 stage release, first is just match as best as possible, second (post release) is a UI that lets users manually choose target when no match, or over-ride choice that was made as well. I think a "create all as new" is going to be useful as well. Again, I haven't looked at UI, so apologies if this is variously addressed. |
Yes, definitely, a matrix renderer is another issue- that reminds me of another OS project I was involved in a long time ago.. for another story/issue. |
…ame as for Convert A little awkward that those options are only listed for Convert, below the Preview button.
Yes that worked out fine (I think). Importing those 15 morphobank nexus files and then importing them again with otus and descriptors matched results in no new otus, descriptors, character states, or observations.
No problem, that's helpful. We do already support 'create all as new' if you just don't select any of the matching options, but I can see where being able to adjust the matching that does occur could be very helpful in the future. One other related comment: currently you can select 'match otu by taxon' and 'match otu by name' at the same time: it matches by taxon first, then by name. That's what made sense to me, but I'm not actually sure matching both at the same time is useful, and if so if we should give an option in which order to match. Feedback can wait though.
Thanks, looks great to me! |
@kleintom Thanks for the matching, sounds good. In your mind are the next steps to have us do some testing and get you feedback? |
Yes, I think things have pretty much settled on my end, testing and feedback would be great, thanks! |
@kleintom two requests upon first pass (looking great):
|
@jlpereira Can you take a very quick look at the vue here? @kleintom please see the comment on the capitalized symbols, other than that things look ready for merge from the back-end side. A non-blocking minor comment would be that we're going to ultimately try and move away from controller specs, and towards feature if we want to test functionality. I don't think it's worth blocking this merge with a port of the controller to feature here (they are significantly trickier to implement sometimes), but a note for future reference. |
app/javascript/vue/tasks/observation_matrices/import_nexus/components/ImportOptions.vue
Outdated
Show resolved
Hide resolved
app/javascript/vue/tasks/observation_matrices/import_nexus/components/CitationOptions.vue
Outdated
Show resolved
Hide resolved
app/javascript/vue/tasks/observation_matrices/import_nexus/components/ImportPreview.vue
Outdated
Show resolved
Hide resolved
app/javascript/vue/tasks/observation_matrices/import_nexus/components/ImportPreview.vue
Outdated
Show resolved
Hide resolved
app/javascript/vue/tasks/observation_matrices/import_nexus/components/ImportList.vue
Outdated
Show resolved
Hide resolved
My misunderstanding. This reverts commit e8bb42e.
Thanks for the suggestion José!
The one catch here is that if you send an empty options hash param to the controller, rails doesn't include a parameter for it, and then your params.require(:options) fails (it would also fail if options *was* sent as {}...). My old workaround was to set the default values for options in vue (where really to solve the empty options hash issue I would have only needed to set one of them) - which I've removed here in favor of (re)creating an empty hash in the controller and then permitting it as the options hash. I like that better - I no longer need to list out all of my options in vue so that I can assign them defaults (the matching and the citation params are now encapsulated in their own components), and the new rails workaround just acknowledges that rails discards #empty? params.
Feeling good from my end as long as you're happy with the new fixes. |
@kleintom Thanks, this is a fantastic contribution. |
Some questions and comments
["ActiveRecord::RecordNotFound", "Mongoid::Errors::DocumentNotFound", "AbstractController::ActionNotFound", "ActionController::RoutingError", "ActionController::UnknownFormat", "ActionController::UrlGenerationError"]
-ActiveRecord::RecordNotFound
is one I can theoretically hit in the background if I pass a bad document id (which I don't), so currently I'm converting that error to TaxonWorks::Error. I just wanted to check that it's intentional that we ignore RecordNotFound in background jobs (even if it's not intentional it may not be worth changing just for this).