v0.4.0

@vmoens vmoens released this 25 Apr 20:14

New Features:

  • Better video rendering
    • [Feature] A PixelRenderTransform by @vmoens in #2099
    • [Feature] Video recording in SOTA examples by @vmoens in #2070
    • [Feature] VideoRecorder for datasets and replay buffers by @vmoens in #2069
  • Replay buffer: sampling trajectories is now much easier, cleaner and faster
    • [Benchmark] Benchmark slice sampler by @vmoens in #1992
    • [Feature] Add PrioritizedSliceSampler by @Cadene in #1875
    • [Feature] Span slice indices on the left and on the right by @vmoens in #2107
    • [Feature] batched trajectories - SliceSampler compatibility by @vmoens in #1775
    • [Performance] Faster slice sampler by @vmoens in #2031
  • Datasets: allow preprocessing datasets after download
  • Losses: reduction parameters and non-functional execution
  • Environment API: support "fork" start method in ParallelEnv, better handling of auto-resetting envs.
    • [Feature] Use non-default mp start method in ParallelEnv by @vmoens in #1966
    • [Feature] Auto-resetting envs by @vmoens in #2073
  • Transforms
    • [Feature] Allow any callable to be used as transform by @vmoens in #2027
    • [Feature] invert transforms appended to a RB by @vmoens in #2111
    • [Feature] Extend TensorDictPrimer default_value options by @albertbou92 in #2071
    • [Feature] Fine grained DeviceCastTransform by @vmoens in #2041
    • [Feature] BatchSizeTransform by @vmoens in #2030
    • [Feature] Allow non-sorted keys in CatFrames by @vmoens in #1913
    • [Feature] env.append_transform by @vmoens in #2040
  • New environment and improvements:

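The trajectory sampling highlighted in the replay buffer item above can be pictured with a small standalone sketch. This is not TorchRL's implementation and does not use its API: the helper names (`trajectory_spans`, `sample_slices`) and the flat list of trajectory ids are assumptions made for illustration. It only shows the general idea of drawing a batch as a few contiguous windows, each taken from a single trajectory.

```python
import random
from itertools import groupby


def trajectory_spans(traj_ids):
    """Return (start, stop) index pairs for each run of equal trajectory ids."""
    spans = []
    start = 0
    for _, run in groupby(traj_ids):
        length = len(list(run))
        spans.append((start, start + length))
        start += length
    return spans


def sample_slices(traj_ids, batch_size, num_slices, rng=random):
    """Sample `num_slices` contiguous windows, each inside one trajectory.

    Returns a flat list of `batch_size` storage indices.
    """
    if batch_size % num_slices != 0:
        raise ValueError("batch_size must be divisible by num_slices")
    slice_len = batch_size // num_slices
    # Only trajectories long enough to hold a full window are eligible.
    eligible = [(a, b) for a, b in trajectory_spans(traj_ids) if b - a >= slice_len]
    if not eligible:
        raise ValueError("no trajectory is long enough for the requested slice length")
    indices = []
    for _ in range(num_slices):
        a, b = rng.choice(eligible)
        start = rng.randint(a, b - slice_len)  # randint bounds are inclusive
        indices.extend(range(start, start + slice_len))
    return indices


# Hypothetical flat buffer holding three trajectories of lengths 5, 3 and 6.
traj_ids = [0] * 5 + [1] * 3 + [2] * 6
batch = sample_slices(traj_ids, batch_size=8, num_slices=2, rng=random.Random(0))
```

Here the trajectory of length 3 is never sampled because it cannot hold a window of 4 steps. A sampler could instead return shorter windows for such trajectories; the sketch simply skips them.
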
Other features

  • [Feature] Add time_dim arg in value modules by @vmoens in #1946
  • [Feature] Batched actions wrapper by @vmoens in #2018
  • [Feature] Better repr of RBs by @vmoens in #1991
  • [Feature] Execute rollouts with regular nn.Module instances by @vmoens in #1947
  • [Feature] Logger by @vmoens in #1858
  • [Feature] Passing lists of keyword arguments in reset for batched envs by @vmoens in #2076
  • [Feature] RB MultiStep transform by @vmoens in #2008
  • [Feature] Replace RewardClipping with SignTransform in Atari examples by @albertbou92 in #1870
  • [Feature] reset_parameters for multiagent nets by @matteobettini in #1970
  • [Feature] optionally set truncated = True at the end of rollouts by @vmoens in #2042
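
The last item above, marking the end of a rollout as truncated, can be sketched in plain Python. This is only an illustration of the idea under stated assumptions: `rollout`, `step_fn` and the `mark_last_truncated` flag are hypothetical names, not TorchRL's API, and the choice to leave terminated steps untouched is a design decision of this sketch.

```python
def rollout(step_fn, max_steps, mark_last_truncated=False):
    """Collect up to `max_steps` transitions from a hypothetical `step_fn`.

    `step_fn(t)` must return a dict with at least a boolean "terminated" entry.
    """
    steps = []
    for t in range(max_steps):
        transition = dict(step_fn(t))
        transition.setdefault("truncated", False)
        transition["done"] = transition["terminated"] or transition["truncated"]
        steps.append(transition)
        if transition["done"]:
            break
    if mark_last_truncated and steps and not steps[-1]["terminated"]:
        # The rollout stopped because of the step budget, not the environment.
        steps[-1]["truncated"] = True
        steps[-1]["done"] = True
    return steps


# A toy step function that never terminates on its own.
def never_ending(t):
    return {"reward": 1.0, "terminated": False}


marked = rollout(never_ending, max_steps=3, mark_last_truncated=True)
unmarked = rollout(never_ending, max_steps=3)
```

With the flag set, the final transition carries `truncated=True` and `done=True`, so downstream code that splits data on `done` sees a clear trajectory boundary.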

Miscellaneous

  • Fix onw typo by @kit1980 in #1917
  • Rename SOTA-IMPLEMENTATIONS.md to README.md by @matteobettini in #2093
  • Revert "[BugFix] Fix Isaac" by @vmoens in #2118
  • Update getting-started-5.py by @vmoens in #1894
  • [BugFix, Performance] Fewer imports at root by @vmoens in #1930
  • [BugFix,CI] Fix Windows CI by @vmoens in #1983
  • [BugFix,CI] Fix sporadically failing tests in CI by @vmoens in #2098
  • [BugFix,Refactor] Dreamer refactor by @BY571 in #1918
  • [BugFix] Adaptable non-blocking for mps and non cuda device in batched-envs by @vmoens in #1900
  • [BugFix] Call contiguous on rollout results in TestMultiStepTransform by @vmoens in #2025
  • [BugFix] Dedicated tests for on policy losses reduction parameter by @albertbou92 in #1974
  • [BugFix] Extend with a list of tensordicts by @vmoens in #2032
  • [BugFix] Fix Atari DQN ensembling by @vmoens in #1981
  • [BugFix] Fix CQL/IQL pbar update by @vmoens in #2020
  • [BugFix] Fix Exclude / Double2Float transforms by @vmoens in #2101
  • [BugFix] Fix Isaac by @vmoens in #2072
  • [BugFix] Fix KLPENPPOLoss KL computation by @vmoens in #1922
  • [BugFix] Fix MPS sync in device transform by @vmoens in #2061
  • [BugFix] Fix OOB TruncatedNormal LP by @vmoens in #1924
  • [BugFix] Fix R2Go once more by @vmoens in #2089
  • [BugFix] Fix Ray collector example error by @albertbou92 in #1908
  • [BugFix] Fix Ray collector on Python > 3.8 by @albertbou92 in #2015
  • [BugFix] Fix RoboHiveEnv tests by @sriramsk1999 in #2062
  • [BugFix] Fix _reset data passing in parallel env by @vmoens in #1880
  • [BugFix] Fix a bug in SliceSampler, indexes outside sampler lengths were produced by @vladisai in #1874
  • [BugFix] Fix args/kwargs passing in advantages by @vmoens in #2001
  • [BugFix] Fix batch-size expansion in functionalization by @vmoens in #1959
  • [BugFix] Fix broken gym tests by @vmoens in #1980
  • [BugFix] Fix clip_fraction in PO losses by @vmoens in #2021
  • [BugFix] Fix colab in tutos by @vmoens in #2113
  • [BugFix] Fix env.shape regex matches by @vmoens in #1940
  • [BugFix] Fix examples by @vmoens in #1945
  • [BugFix] Fix exploration in losses by @vmoens in #1898
  • [BugFix] Fix flaky rb tests by @vmoens in #1901
  • [BugFix] Fix habitat by @vmoens in #1941
  • [BugFix] Fix jumanji by @vmoens in #2064
  • [BugFix] Fix load_state_dict and is_empty td bugfix impact by @vmoens in #1869
  • [BugFix] Fix mp_start_method for ParallelEnv with single_for_serial by @vmoens in #2007
  • [BugFix] Fix multiple context syntax in multiagent examples by @matteobettini in #1943
  • [BugFix] Fix offline CatFrames by @vmoens in #1953
  • [BugFix] Fix offline CatFrames for pixels by @vmoens in #1964
  • [BugFix] Fix prints of size error when no file is associated with memmap by @vmoens in #2090
  • [BugFix] Fix replay buffer extension with lists by @vmoens in #1937
  • [BugFix] Fix reward2go for nd tensors by @vmoens in #2087
  • [BugFix] Fix robohive by @vmoens in #2080
  • [BugFix] Fix sampling without replacement with ndim storages by @vmoens in #1999
  • [BugFix] Fix slice sampler compatibility with split_trajs and MultiStep by @vmoens in #1961
  • [BugFix] Fix slicesampler terminated/truncated signaling by @vmoens in #2044
  • [BugFix] Fix strict-length for spanning trajectories by @vmoens in #1982
  • [BugFix] Fix strict_length=True in SliceSampler by @vmoens in #2037
  • [BugFix] Fix unwanted lazy stacks by @vmoens in #2102
  • [BugFix] Fix update in serial / parallel env by @vmoens in #1866
  • [BugFix] Fix vmas stacks by @vmoens in #2105
  • [BugFix] Fixed import for importlib by @DanilBaibak in #1914
  • [BugFix] Make KL-controllers independent of the model by @vmoens in #1903
  • [BugFix] Make sure ParallelEnv does not overflow mem when policy requires grad by @vmoens in #1909
  • [BugFix] More robust _StepMDP and multi-purpose envs by @vmoens in #2038
  • [BugFix] No grad on collector reset by @matteobettini in #1927
  • [BugFix] Non exclusive terminated and truncated by @vmoens in #1911
  • [BugFix] Refactor reductions by @vmoens in #1968
  • [BugFix] Remove split_trajectories's reference to ("next", "done"). by @initmaks in #2094
  • [BugFix] Remove reset on last step of a rollout by @matteobettini in #1936
  • [BugFix] Robust sync for non_blocking=True by @vmoens in #2034
  • [BugFix] Set default value for normalize_advantage to False. by @DobromirM in #2050
  • [BugFix] Set strict=False in tensordict.select() calls for objective classes by @albertbou92 in #2004
  • [BugFix] SliceSampler device and index mesh by @vmoens in #1996
  • [BugFix] Solve recursion issue in losses hook by @vmoens in #1897
  • [BugFix] Update cql docstring example by @BY571 in #1951
  • [BugFix] Update iql docstring example by @BY571 in #1950
  • [BugFix] Use same signature for append_transform in all cases by @vmoens in #2091
  • [BugFix] Use setdefault in _cache_values by @vmoens in #1910
  • [BugFix] Use traj_terminated in SliceSampler by @Cadene in #1884
  • [BugFix] Vmap randomness for value estimator by @BY571 in #1942
  • [BugFix] better device consistency in EGreedy by @vmoens in #1867
  • [BugFix] check_env_specs seeding logic by @vmoens in #1872
  • [BugFix] fix formatting for VideoRecorder docstring by @sriramsk1999 in #1985
  • [BugFix] fix trunc normal device by @vmoens in #1931
  • [BugFix] missing annotations import by @vmoens in #2074
  • [BugFix] state typo in RNG control module by @vmoens in #1878
  • [BugFix] to_observation_norm now works with keys which are not strings by @maxweissenbacher in #2045
  • [BugFix] union -> intersection in _StepMDP check by @vmoens in #2039
  • [CI,Doc] Sanitize version by @vmoens in #2120
  • [CI] Doc on release tag by @vmoens in #2116
  • [CI] Fix CI issues by @vmoens in #2084
  • [CI] Fix Doc CI by @matteobettini in #2106
  • [CI] Fixes sympy error by fixing mpmath version by @vmoens in #1988
  • [CI] Install ffmpeg in Robohive tests by @vmoens in #2063
  • [CI] Install stable torch and tensordict for release tests by @vmoens in #1978
  • [CI] Remove all macos x86 jobs by @vmoens in #2117
  • [CI] Remove x86 OSX jobs by @vmoens in #2112
  • [CI] Schedule workflows for releases by @vmoens in #2114
  • [CI] Temporarily remove snapshot from CI by @vmoens in #2000
  • [CI] Unpin mpmath by @vmoens in #1997
  • [CI] Upgrade 3.8 to 3.10 GPU jobs by @vmoens in #2013
  • [Deprecation] Deprecate in prep for release by @vmoens in #1820
  • [Doc,Feature] Better doc for modules and list of kwargs when possible by @vmoens in #1990
  • [Doc] Fix tutos by @vmoens in #1863
  • [Doc] Getting started tutos by @vmoens in #1886
  • [Doc] Improve PrioritizedSampler doc and get rid of np dependency as much as possible by @vmoens in #1881
  • [Doc] Installation instructions in API ref by @vmoens in #1871
  • [Doc] Per-release doc by @vmoens in #2108
  • [Documentation] Correct MaskedEnv Example in ActionMask Transform Documentation by @Jonathanace in #2060
  • [Examples] Move examples to sota-implementations by @vmoens in #2016
  • [Minor] Add env.shape attribute by @vmoens in #1938
  • [Minor] Lint by @vmoens in #2096
  • [Minor] Move distributed examples to examples by @vmoens in #2097
  • [Minor] Remove duplicate if statement in storages by @vmoens in #2066
  • [Minor] Remove warnings in test_cost by @vmoens in #1902
  • [Minor] Support init lazy storages with add by @vmoens in #2028
  • [Minor] Use the main branch for the M1 build wheels by @DanilBaibak in #1965
  • [Performance] Faster DMC by @vmoens in #2002
  • [Quality] Capture errors in specs transforms by @vmoens in #2092
  • [Quality] Make sure deprec warnings are displayed by @vmoens in #2088
  • [Refactor,Feature] Refactor collector shapes and stack_result in sync collector by @vmoens in #1994
  • [Refactor] Clearer separation between single_task and share_individual_td by @vmoens in #2026
  • [Refactor] Faster and more generic multi-agent nets by @vmoens in #1921
  • [Refactor] Refactor split_trajectories by @vmoens in #1955
  • [Refactor] Remove remnant legacy functional calls by @vmoens in #1973
  • [Refactor] Use filter_empty=False in apply for params by @vmoens in #1882
  • [Refactor] Use filter_empty=True in apply by @vmoens in #1879
  • [Tutorial] PettingZoo Parallel competitive tutorial by @matteobettini in #2047
  • [Versioning] Deprecations for 0.4 by @vmoens in #2109
  • [Versioning] New torch version by @vmoens in #2110
  • [Versioning] v0.4.0 by @vmoens in #1860
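
Two of the fixes listed above (#2087 and #2089) concern reward-to-go. For readers unfamiliar with the quantity, here is a minimal sketch of a discounted reward-to-go over a flat sequence of steps. The function name, its signature and the list-based inputs are assumptions for illustration only; it is not the library's code and handles only the one-dimensional case.

```python
def reward_to_go(rewards, dones, gamma=1.0):
    """Discounted sum of future rewards, restarted at each episode end.

    `dones[t]` is True when step `t` is the last step of its episode.
    """
    out = [0.0] * len(rewards)
    running = 0.0
    for t in reversed(range(len(rewards))):
        if dones[t]:
            # Nothing from later steps belongs to this episode.
            running = 0.0
        running = rewards[t] + gamma * running
        out[t] = running
    return out


# Two episodes stored back to back: steps 0-2 and steps 3-4.
values = reward_to_go(
    rewards=[1.0, 1.0, 1.0, 1.0, 1.0],
    dones=[False, False, True, False, True],
    gamma=0.5,
)
# values == [1.75, 1.5, 1.0, 1.5, 1.0]
```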

New Contributors

A big thanks to our dear contributors as well as the entire user base for helping with this lib!

Full Changelog: v0.3.0...v0.4.0