parse_gctx: don't sort returned values #36

Open
dllahr opened this issue Mar 30, 2018 · 6 comments

Comments

@dllahr
Contributor

dllahr commented Mar 30, 2018

Hi @oena @levlitichev

I was thinking about opening a pull request that changes parse_gctx so it no longer returns the dataframes sorted by index/column. The reason: if the metadata comes back in the order it appears in the file, you can pick the entries you are interested in, work out their positional indices, and then load just those with the ridx/cidx options, which is much faster.

Alternatively, the sort could be made optional. What do you think?

@oena
Contributor

oena commented Apr 2, 2018

Hi @dllahr! Not sure I totally follow. Do you mean just for the metadata-only options? Otherwise the IDs are subsetted before hyperslab selection occurs.

@dllahr
Contributor Author

dllahr commented Apr 9, 2018

Sorry, no, I mean that right now when you get the metadata back (and I think when you get everything back) all of the IDs have been sorted. The use case I ran into was:

  1. got just the row metadata back
  2. identified the overlap between the genes I wanted and those that were present
  3. identified the indices of the genes I wanted in the row metadata
  4. attempted to load using the ridx option, got a completely different set of genes
  5. realized that the row metadata had been sorted, rather than returned as it appears in the file
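
The mismatch in the steps above can be reproduced without a GCTX file. The following is a minimal sketch in plain Python, not cmapPy code: the gene IDs are made up, and the list indexing only stands in for a positional load such as the ridx option. It assumes, as described above, that positional loading follows file order while the returned metadata is sorted.

```python
# Hypothetical gene IDs, in the order they are stored in the file.
file_order_ids = ["g_7", "g_2", "g_9", "g_1", "g_5"]

# What the caller receives today: the same IDs, but sorted.
sorted_ids = sorted(file_order_ids)

# Genes the caller actually wants.
wanted = {"g_9", "g_1"}

# Steps 2-3: find positions of the wanted genes in the *sorted* metadata.
idx_from_sorted = [i for i, gid in enumerate(sorted_ids) if gid in wanted]

# Step 4: a positional load indexes into *file* order, so the positions
# computed above point at different genes.
loaded_wrong = [file_order_ids[i] for i in idx_from_sorted]

# If the metadata were returned in file order, the same lookup would work.
idx_from_file = [i for i, gid in enumerate(file_order_ids) if gid in wanted]
loaded_right = [file_order_ids[i] for i in idx_from_file]

print("indices from sorted metadata:", idx_from_sorted, "->", loaded_wrong)
print("indices from file order:     ", idx_from_file, "->", loaded_right)
```

With these made-up IDs, the indices taken from the sorted list select `g_7` and `g_5` instead of the requested genes, while the indices taken from the file-order list select `g_9` and `g_1`.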

@oena
Contributor

oena commented Apr 9, 2018

Ok, gotcha. That does seem like a useful thing to do. Maybe we can start with having it as an option and see how things go?
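
One way the "option" idea could look, as a rough sketch only: the function name `order_ids` and its `sort` parameter are hypothetical and are not taken from the cmapPy source. The default keeps the current sorted behavior so existing callers would be unaffected.

```python
def order_ids(ids_in_file, sort=True):
    """Return IDs sorted (current behavior) or in file order (proposed).

    Hypothetical helper for illustration; not part of cmapPy.
    """
    ids = list(ids_in_file)
    return sorted(ids) if sort else ids


ids_in_file = ["g_7", "g_2", "g_9"]
print(order_ids(ids_in_file))              # sorted, as today
print(order_ids(ids_in_file, sort=False))  # file order, usable for positional loads
```

Defaulting to `sort=True` is a design choice made for this sketch: it preserves backward compatibility, and callers who need positions that match the file can opt out explicitly.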

@saksham219
Contributor

@oena Has this been taken up? I was thinking of working on this.

@oena
Contributor

oena commented Jun 3, 2019

@dllahr did you follow up on this? No worries if not, just checking

@ghost

ghost commented Jun 10, 2019

Sorry I'm late replying, I did not get to doing anything with this.

tnat1031 added a commit that referenced this issue Nov 8, 2019
Issue #36 (add sort option for parse gctx)

3 participants