[feature/enhancement] abstract measurement types #62

Open
gottacatchenall opened this issue May 3, 2021 · 9 comments

@gottacatchenall
Member

gottacatchenall commented May 3, 2021

Hi all,

As I start to migrate the list of items in #13 to individual issues, I've found it useful to define the functionality of VEL.jl more precisely, so that I can better plan how the interface between the two will work. This issue mostly relates to my plan to implement different types of Measurements.

As far as I can tell, all this would require from EcoSISTEM to integrate is a change to how storage works in simulate_record!(storage::AbstractArray...).

Right now this is filled via storage[:, :, i] = eco.abundances.matrix

Would it be possible to shift the abundance matrix type in Ecosystem to something that stores an arbitrarily long list of matrices (parameterized by data type) corresponding to different measurements (e.g. abundance and a trait unrelated to the traits being selected on in the EcoSISTEM trait-relationship)?
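A minimal sketch of what such a container might look like, under stated assumptions. The names `Measurement`, `MeasurementSet` and `record!` are hypothetical and do not exist in EcoSISTEM or VEL.jl; the only detail taken from the current code is the `storage[:, :, i] = ...` fill pattern. Each measurement carries its own element type, and recording writes each one into a separate storage array keyed by name.

```julia
# Hypothetical sketch: none of these types or functions exist in EcoSISTEM or VEL.jl.

# One measurement: a species × location matrix with its own element type.
struct Measurement{T}
    name::Symbol
    matrix::Matrix{T}
end

# An ordered list of measurements, standing in for the single abundance matrix.
struct MeasurementSet
    measurements::Vector{Measurement}
end

# Look a measurement up by name.
function Base.getindex(ms::MeasurementSet, name::Symbol)
    for m in ms.measurements
        m.name == name && return m
    end
    throw(KeyError(name))
end

# Copy every measurement into its own 3-d storage array at time step i,
# mirroring the existing `storage[:, :, i] = eco.abundances.matrix` pattern.
function record!(storage::AbstractDict{Symbol}, ms::MeasurementSet, i::Integer)
    for m in ms.measurements
        storage[m.name][:, :, i] = m.matrix
    end
    return storage
end

# Usage: integer abundances plus a floating-point trait, 2 species × 2 locations.
ms = MeasurementSet([
    Measurement(:abundance, [1 2; 3 4]),
    Measurement(:trait, [0.5 1.5; 2.5 3.5]),
])
storage = Dict(
    :abundance => zeros(Int, 2, 2, 3),
    :trait => zeros(Float64, 2, 2, 3),
)
record!(storage, ms, 2)
```

This says nothing about how such a container would be split across processes; it only illustrates the shape of the interface being asked for.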

@richardreeve
Member

Generalising how we store data in an Ecosystem is definitely a good idea - at the moment (@claireh93 will correct me on the details!) I think Ecosystems have two arrays, one for the (many) species under study, and one for any pathogens (can be nothing), but this is really just a stopgap while we work out how we need to generalise it properly like you suggest.

However, it's not as simple as you hope. In particular, we still haven't fixed the parallelisation in the move to transitions. Fundamentally, we can't store things as straight matrices once we have more than one process, because we end up with too many scatter/gather commands and it gets very inefficient. From that point of view this is currently blocked by #63.

It's also not as simple as you might hope because we need to work out how to reconcile this with EcoBase.jl (and Diversity.jl), which are currently predicated on the idea of one set of Things and one set of Places. I don't think we currently have a problem with the idea of only one set of places (though I can imagine a situation where we did, for instance nested places within places), but it sounds like you may be thinking about multiple types of thing? Is that right, or are you thinking of something else?

Talking of simulate_record!(), we need to generalise how we record stuff too, but that may be another issue.

@richardreeve richardreeve added this to To do in EcoSISTEM development via automation May 3, 2021
@gottacatchenall
Member Author

gottacatchenall commented May 3, 2021

Although I'm not entirely familiar with how parallelization is implemented currently, re the way I'm implementing mechanisms I've found that transitions tend to fall into the categories of 1) independent across location, 2) independent across category (e.g. the categories for SIR-type models, or species in biodiversity) and 3) independent across both location and category.

Implementing parallelization of transitions across any given axis of the current state could possibly solve this?
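As a rough illustration of the idea, assuming the state is a category × location matrix: a transition that is independent across one axis can be applied to each slice along that axis concurrently. The function `apply_along!` and the example rule `infect!` are hypothetical names, not part of EcoSISTEM or VEL.jl. The sketch uses shared-memory threads, so it does not address the multi-process scatter/gather cost raised earlier in the thread.

```julia
# Hypothetical sketch; `apply_along!` and `infect!` are not part of any package.
# `state` is a category × location matrix.
function apply_along!(rule!, state::AbstractMatrix, axis::Symbol)
    if axis === :location
        # Independent across locations: each column (one location, all
        # categories) can be updated on its own thread.
        Threads.@threads for j in axes(state, 2)
            rule!(view(state, :, j))
        end
    elseif axis === :category
        # Independent across categories: each row (one category, all
        # locations) can be updated on its own thread.
        Threads.@threads for i in axes(state, 1)
            rule!(view(state, i, :))
        end
    else
        throw(ArgumentError("axis must be :location or :category"))
    end
    return state
end

# Example rule that only mixes categories within a single location:
# a fixed fraction of the first category moves to the second.
function infect!(column)
    moved = 0.1 * column[1]
    column[1] -= moved
    column[2] += moved
    return column
end

state = [100.0 50.0; 0.0 10.0]   # 2 categories × 2 locations
apply_along!(infect!, state, :location)
```

A transition that is independent across both location and category could be dispatched along either axis.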

@gottacatchenall
Member Author

gottacatchenall commented May 3, 2021

> It's also not as simple as you might hope because we need to work out how to reconcile this with EcoBase.jl (and Diversity.jl), which are currently predicated on the idea of one set of Things and one set of Places.

perhaps i misunderstand what you mean, but i think bundling each individual Thing with a Place makes sense, and so each set of measurements within an Ecosystem could correspond with its own set of Places

@richardreeve
Member

I agree about the categorisation of the transitions (if I understand you correctly) - we have a similar breakdown, but that imposes some constraints on how we store the data if we want to parallelise efficiently. I'm not sure what you mean about the bundling though, if you need multiple matrices for storage... it may be that I'm not understanding what your measurements are - some seem to be species traits, and some location (e.g. abiotic) data, both of which we handle separately from species-in-location data, like abundance / biomass / occupancy. All of these have to be handled differently depending on how they are used in the code, because only some information needs to be present on some processes. Unfortunately, to manage memory efficiently and minimise inter-process communication for large simulations, we have to care about when and where different information is used, and it's not currently clear to me how that aligns with how you're thinking about things.

I suspect that there's a reasonable chance we could have a discussion at cross purposes for quite a while about any of these issues - it's not necessarily simple to resolve how to integrate two similar but unrelated frameworks. It might be useful to have a face-to-face chat about this at some point soon when we're all free? It may in particular help us to break this down into smaller pieces that we can implement more easily.

@claireh93
Member

Hi both, I think a discussion sounds good - it's always tricky to discuss these things via GitHub issues! Shall we arrange something for a few weeks' time?

@gottacatchenall
Member Author

yeah sure, i'm working to get VEL.jl further along so it becomes more clear what interfaces are required

@richardreeve
Member

I'm pretty booked out next week and unavailable the week after, but w/c 24th May could be possible or the following week if someone wants to set up a doodle poll?

@gottacatchenall
Member Author

i can send a doodle poll out for the week of the 24th soon. i'm currently reviewing the DynamicGrids.jl and Dispersal.jl paper which has at least partial solutions to the problems posed in this thread with high parallelization efficiency.

might be worth having a discussion with them as well

@richardreeve
Member

Great. I'm pretty much booked out till the end of May now, but free after that still.
