Buffer: fix discrepancy in slicing order #1090

MischaPanch · 2024-04-03T11:29:33Z

It doesn't make sense for buffer[:].obs to be so wildly different from buffer.obs[:]. One is retrieving the full buffer, filled up with zeros, the other just the filled entries.

We should probably never retrieve the full buffer through slicing. There could be a new method like get_full_entry_sequence that is documented to retrieve the non-filled entries as well if it's actually needed.

While addressing this issue, buffer and collector related tests should be adjusted and improved.

The text was updated successfully, but these errors were encountered:

dantp-ai · 2024-04-05T09:19:37Z

I can look into this.

Currently this is the signature of ReplayBuffer's __getitem__(index: slice | int | list[int] | np.ndarray).

So the idea here is to not support full-buffer (filled and non-filled entries) retrieval, when slice is provided, for buffer[:].obs because it can be computationally expensive?

What about the cases when user gives a list[int] or np.ndarray with all the indices? Practically it is the same as slice. Should we allow retrieving the full buffer in this case?

MischaPanch · 2024-04-12T10:21:08Z

I'm not sure when we should ever be retrieving the full buffer. Tbh I haven't given too much thought about the best way of resolving this, it just seems very confusing and arbitrary that buffer.obs[:] and buffer[:].obs would be semantically wildly different entities, right?

dantp-ai · 2024-04-21T12:22:43Z

I have reviewed this issue again. The different way of retrieving with slicing seems arbitrary to me. The user can check the maximum size via the buffer.max_size attribute, no need to return the empty values as well. A special method to retrieve the full buffer (if anyone ever needs it!?) as you mentioned is more appropriate.

We should probably update buffer.obs[:] (and other slicing methods like start:stop) to retrieve only non-empty values like buffer[:].obs does.

Here are some examples of how indexing is currently different between the two:

In [52]: dummy_buf = ReplayBuffer(size=10)
    ...: for i in range(6):
    ...:     dummy_buf.add(
    ...:         Batch(obs=i, act=i, rew=i, terminated=0, truncated=0, done=0, obs_next=i + 1, info={}),
    ...:     )
    ...: 

In [53]: dummy_buf.obs[2:8]
Out[53]: array([2, 3, 4, 5, 0, 0])

In [54]: dummy_buf[2:8].obs
Out[54]: array([2, 3, 4, 5])

In [55]: dummy_buf.obs[:]
Out[55]: array([0, 1, 2, 3, 4, 5, 0, 0, 0, 0])

In [56]: dummy_buf[:].obs
Out[56]: array([0, 1, 2, 3, 4, 5])

In [57]: dummy_buf.obs[np.arange(10)]
Out[57]: array([0, 1, 2, 3, 4, 5, 0, 0, 0, 0])

In [58]: dummy_buf[np.arange(10)].obs
Out[58]: array([0, 1, 2, 3, 4, 5, 0, 0, 0, 0])

MischaPanch · 2024-04-21T12:54:16Z

Glad you agree with me on this ^^. I'm not sure whether anywhere in the code the retrieval of the slice with empty values is used. For me it's fine to completely remove it, however, many tests will need to be adjusted, as now many of them rely on this somehow weird retrieval mechanism.

We could live without the full retrieval method until someone actually needs it. It's a good practice to keep the public interface as small as possible

MischaPanch · 2024-04-21T12:56:18Z

Created the issue accidentally, sry for that

dantp-ai · 2024-04-23T14:22:20Z

Something is off here. As seen above, index is of type str so Batch.__getitem__ immediately returns. How can we get access on the last .[] ?

EDIT: dummy_buffer.obs is of type np.ndarray so we can't have direct access on that slicing.

MischaPanch · 2024-04-23T21:52:24Z

Sorry, I'm afraid I don't understand what you are asking. We can have a call on Friday if you want to :)

MischaPanch added refactoring No change to functionality breaking changes Changes in public interfaces. Includes small changes or changes in keys Batch and Buffer Improvements in internal data structures, temporary label labels Apr 3, 2024

MischaPanch added this to To do in Overall Tianshou Status via automation Apr 3, 2024

MischaPanch assigned dantp-ai Apr 23, 2024

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Buffer: fix discrepancy in slicing order #1090

Buffer: fix discrepancy in slicing order #1090

MischaPanch commented Apr 3, 2024

dantp-ai commented Apr 5, 2024 •

edited

MischaPanch commented Apr 12, 2024

dantp-ai commented Apr 21, 2024

MischaPanch commented Apr 21, 2024 •

edited

MischaPanch commented Apr 21, 2024

dantp-ai commented Apr 23, 2024 •

edited

MischaPanch commented Apr 23, 2024

Buffer: fix discrepancy in slicing order #1090

Buffer: fix discrepancy in slicing order #1090

Comments

MischaPanch commented Apr 3, 2024

dantp-ai commented Apr 5, 2024 • edited

MischaPanch commented Apr 12, 2024

dantp-ai commented Apr 21, 2024

MischaPanch commented Apr 21, 2024 • edited

MischaPanch commented Apr 21, 2024

dantp-ai commented Apr 23, 2024 • edited

MischaPanch commented Apr 23, 2024

dantp-ai commented Apr 5, 2024 •

edited

MischaPanch commented Apr 21, 2024 •

edited

dantp-ai commented Apr 23, 2024 •

edited