Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Buffer: fix discrepancy in slicing order #1090

Open
MischaPanch opened this issue Apr 3, 2024 · 7 comments
Open

Buffer: fix discrepancy in slicing order #1090

MischaPanch opened this issue Apr 3, 2024 · 7 comments
Assignees
Labels
Batch and Buffer Improvements in internal data structures, temporary label breaking changes Changes in public interfaces. Includes small changes or changes in keys refactoring No change to functionality

Comments

@MischaPanch
Copy link
Collaborator

It doesn't make sense for buffer[:].obs to be so wildly different from buffer.obs[:]. One is retrieving the full buffer, filled up with zeros, the other just the filled entries.

We should probably never retrieve the full buffer through slicing. There could be a new method like get_full_entry_sequence that is documented to retrieve the non-filled entries as well if it's actually needed.

While addressing this issue, buffer and collector related tests should be adjusted and improved.

@MischaPanch MischaPanch added refactoring No change to functionality breaking changes Changes in public interfaces. Includes small changes or changes in keys Batch and Buffer Improvements in internal data structures, temporary label labels Apr 3, 2024
@MischaPanch MischaPanch added this to To do in Overall Tianshou Status via automation Apr 3, 2024
@dantp-ai
Copy link
Contributor

dantp-ai commented Apr 5, 2024

I can look into this.

Currently this is the signature of ReplayBuffer's __getitem__(index: slice | int | list[int] | np.ndarray).

So the idea here is to not support full-buffer (filled and non-filled entries) retrieval, when slice is provided, for buffer[:].obs because it can be computationally expensive?

What about the cases when user gives a list[int] or np.ndarray with all the indices? Practically it is the same as slice. Should we allow retrieving the full buffer in this case?

@MischaPanch
Copy link
Collaborator Author

I'm not sure when we should ever be retrieving the full buffer. Tbh I haven't given too much thought about the best way of resolving this, it just seems very confusing and arbitrary that buffer.obs[:] and buffer[:].obs would be semantically wildly different entities, right?

@dantp-ai
Copy link
Contributor

I have reviewed this issue again. The different way of retrieving with slicing seems arbitrary to me. The user can check the maximum size via the buffer.max_size attribute, no need to return the empty values as well. A special method to retrieve the full buffer (if anyone ever needs it!?) as you mentioned is more appropriate.

We should probably update buffer.obs[:] (and other slicing methods like start:stop) to retrieve only non-empty values like buffer[:].obs does.

Here are some examples of how indexing is currently different between the two:

In [52]: dummy_buf = ReplayBuffer(size=10)
    ...: for i in range(6):
    ...:     dummy_buf.add(
    ...:         Batch(obs=i, act=i, rew=i, terminated=0, truncated=0, done=0, obs_next=i + 1, info={}),
    ...:     )
    ...: 

In [53]: dummy_buf.obs[2:8]
Out[53]: array([2, 3, 4, 5, 0, 0])

In [54]: dummy_buf[2:8].obs
Out[54]: array([2, 3, 4, 5])

In [55]: dummy_buf.obs[:]
Out[55]: array([0, 1, 2, 3, 4, 5, 0, 0, 0, 0])

In [56]: dummy_buf[:].obs
Out[56]: array([0, 1, 2, 3, 4, 5])

In [57]: dummy_buf.obs[np.arange(10)]
Out[57]: array([0, 1, 2, 3, 4, 5, 0, 0, 0, 0])

In [58]: dummy_buf[np.arange(10)].obs
Out[58]: array([0, 1, 2, 3, 4, 5, 0, 0, 0, 0])

@MischaPanch
Copy link
Collaborator Author

MischaPanch commented Apr 21, 2024

Glad you agree with me on this ^^. I'm not sure whether anywhere in the code the retrieval of the slice with empty values is used. For me it's fine to completely remove it, however, many tests will need to be adjusted, as now many of them rely on this somehow weird retrieval mechanism.

We could live without the full retrieval method until someone actually needs it. It's a good practice to keep the public interface as small as possible

@MischaPanch
Copy link
Collaborator Author

Created the issue accidentally, sry for that

@dantp-ai
Copy link
Contributor

dantp-ai commented Apr 23, 2024

image

Something is off here. As seen above, index is of type str so Batch.__getitem__ immediately returns. How can we get access on the last .[] ?

EDIT: dummy_buffer.obs is of type np.ndarray so we can't have direct access on that slicing.

@MischaPanch
Copy link
Collaborator Author

Sorry, I'm afraid I don't understand what you are asking. We can have a call on Friday if you want to :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Batch and Buffer Improvements in internal data structures, temporary label breaking changes Changes in public interfaces. Includes small changes or changes in keys refactoring No change to functionality
Development

No branches or pull requests

2 participants