update user guide #446

KohlerHECTOR · 2024-04-03T15:47:59Z

Description

PR for issues #325 and #353.
I removed some placeholders in the user guide for pages that we would probably never write, and I added a user guide page for plotting and statistics.

Checklist

My code follows the style guideline
To check:
black --check examples rlberry *py
flake8 --select F401,F405,D410,D411,D412 --exclude=rlberry/check_packages.py --per-file-ignores="__init__.py:F401"
I have commented my code, particularly in hard-to-understand areas,
I have made corresponding changes to the documentation,
I have added tests that prove my fix is effective or that my feature works,
New and existing unit tests pass locally with my changes,
I have updated the changelog if necessary,
I have set the label "ready for review" and the checks are all green.

github-actions · 2024-04-03T15:54:25Z

Hello,
The build of the doc succeeded. The documentation preview is available here:
https://rlberry-py.github.io/rlberry/preview_pr

github-actions · 2024-04-04T08:47:37Z

Hello,
The build of the doc succeeded. The documentation preview is available here:
https://rlberry-py.github.io/rlberry/preview_pr

TimotheeMathieu

Thanks for this addition. However, for now, it is a bit difficult to review this. Can you clean up all the outputs?

TimotheeMathieu · 2024-04-08T09:10:59Z

docs/basics/userguide/plot_stats.md

+    [38;21m[INFO] 14:46: [Sb3-PPO[worker: 6]] | max_global_step = 10240 | time/iterations = 4 | rollout/ep_rew_mean = 44.09 | rollout/ep_len_mean = 44.09 | time/fps = 114 | time/time_elapsed = 71 | time/total_timesteps = 8192 | train/learning_rate = 0.0003 | train/entropy_loss = -0.635085660405457 | train/policy_gradient_loss = -0.019524750619893894 | train/value_loss = 51.484919738769534 | train/approx_kl = 0.009435750544071198 | train/clip_fraction = 0.088720703125 | train/loss = 15.843541145324707 | train/explained_variance = 0.25749021768569946 | train/n_updates = 30 | train/clip_range = 0.2 |  [0m
+    INFO:rlberry_logger:[Sb3-PPO[worker: 6]] | max_global_step = 10240 | time/iterations = 4 | rollout/ep_rew_mean = 44.09 | rollout/ep_len_mean = 44.09 | time/fps = 114 | time/time_elapsed = 71 | time/total_timesteps = 8192 | train/learning_rate = 0.0003 | train/entropy_loss = -0.635085660405457 | train/policy_gradient_loss = -0.019524750619893894 | train/value_loss = 51.484919738769534 | train/approx_kl = 0.009435750544071198 | train/clip_fraction = 0.088720703125 | train/loss = 15.843541145324707 | train/explained_variance = 0.25749021768569946 | train/n_updates = 30 | train/clip_range = 0.2 |
+    [38;21m[INFO] 14:46: [Sb3-PPO[worker: 9]] | max_global_step = 10240 | time/iterations = 4 | rollout/ep_rew_mean = 45.69 | rollout/ep_len_mean = 45.69 | time/fps = 117 | time/time_elapsed = 69 | time/total_timesteps = 8192 | train/learning_rate = 0.0003 | train/entropy_loss = -0.6443713787943125 | train/policy_gradient_loss = -0.012846545978391077 | train/value_loss = 55.9754634976387 | train/approx_kl = 0.008374381810426712 | train/clip_fraction = 0.0560546875 | train/loss = 18.558977127075195 | train/explained_variance = 0.21047407388687134 | train/n_updates = 30 | train/clip_range = 0.2 |  [0m
+    INFO:rlberry_logger:[Sb3-PPO[worker: 9]] | max_global_step = 10240 | time/iterations = 4 | rollout/ep_rew_mean = 45.69 | rollout/ep_len_mean = 45.69 | time/fps = 117 | time/time_elapsed = 69 | time/total_timesteps = 8192 | 


Please do not include the Colab output. It is not useful for the docs and may cause bugs.

KohlerHECTOR · 2024-04-08

I don't know how to remove them; this was generated automatically with a Jupyter-to-Markdown converter.

TimotheeMathieu · 2024-04-09

You can just remove them manually.
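
If removing the log lines by hand is tedious, a small script can do the same thing. The following is only a sketch, not something from this PR: the function name `strip_log_output` and the two patterns are assumptions based on the log lines quoted in the diff above, and they may need adjusting for other output.

```python
import re

# ANSI colour sequences such as "\x1b[38;21m" or "\x1b[0m". The ESC byte is
# optional because it is often lost when output is pasted into Markdown.
ANSI_RE = re.compile(r"\x1b?\[[0-9;]+m")

# Assumed prefixes of logger lines, taken from the diff quoted above.
LOG_RE = re.compile(r"^\s*(\[INFO\]|INFO:rlberry_logger:)")


def strip_log_output(markdown_text: str) -> str:
    """Return the Markdown text without the lines that look like logger output."""
    kept = []
    for line in markdown_text.splitlines():
        # Remove colour codes only for detection; kept lines are left untouched.
        cleaned = ANSI_RE.sub("", line)
        if LOG_RE.match(cleaned):
            continue
        kept.append(line)
    return "\n".join(kept)


if __name__ == "__main__":
    sample = "\n".join(
        [
            "Some text",
            "    [38;21m[INFO] 14:46: [Sb3-PPO[worker: 6]] | time/fps = 114 |  [0m",
            "    INFO:rlberry_logger:[Sb3-PPO[worker: 6]] | time/fps = 114 |",
            "More text",
        ]
    )
    print(strip_log_output(sample))
```

The result should still be checked by hand, since any ordinary line that happens to start with one of these prefixes would also be dropped.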

TimotheeMathieu · 2024-04-08T09:12:20Z

docs/basics/userguide/plot_stats.md

+    [38;21m[INFO] 14:46: [Sb3-PPO[worker: 6]] | max_global_step = 10240 | time/iterations = 4 | rollout/ep_rew_mean = 44.09 | rollout/ep_len_mean = 44.09 | time/fps = 114 | time/time_elapsed = 71 | time/total_timesteps = 8192 | train/learning_rate = 0.0003 | train/entropy_loss = -0.635085660405457 | train/policy_gradient_loss = -0.019524750619893894 | train/value_loss = 51.484919738769534 | train/approx_kl = 0.009435750544071198 | train/clip_fraction = 0.088720703125 | train/loss = 15.843541145324707 | train/explained_variance = 0.25749021768569946 | train/n_updates = 30 | train/clip_range = 0.2 |  [0m
+    INFO:rlberry_logger:[Sb3-PPO[worker: 6]] | max_global_step = 10240 | time/iterations = 4 | rollout/ep_rew_mean = 44.09 | rollout/ep_len_mean = 44.09 | time/fps = 114 | time/time_elapsed = 71 | time/total_timesteps = 8192 | train/learning_rate = 0.0003 | train/entropy_loss = -0.635085660405457 | train/policy_gradient_loss = -0.019524750619893894 | train/value_loss = 51.484919738769534 | train/approx_kl = 0.009435750544071198 | train/clip_fraction = 0.088720703125 | train/loss = 15.843541145324707 | train/explained_variance = 0.25749021768569946 | train/n_updates = 30 | train/clip_range = 0.2 |
+    [38;21m[INFO] 14:46: [Sb3-PPO[worker: 9]] | max_global_step = 10240 | time/iterations = 4 | rollout/ep_rew_mean = 45.69 | rollout/ep_len_mean = 45.69 | time/fps = 117 | time/time_elapsed = 69 | time/total_timesteps = 8192 | train/learning_rate = 0.0003 | train/entropy_loss = -0.6443713787943125 | train/policy_gradient_loss = -0.012846545978391077 | train/value_loss = 55.9754634976387 | train/approx_kl = 0.008374381810426712 | train/clip_fraction = 0.0560546875 | train/loss = 18.558977127075195 | train/explained_variance = 0.21047407388687134 | train/n_updates = 30 | train/clip_range = 0.2 |  [0m
+    INFO:rlberry_logger:[Sb3-PPO[worker: 9]] | max_global_step = 10240 | time/iterations = 4 | rollout/ep_rew_mean = 45.69 | rollout/ep_len_mean = 45.69 | time/fps = 117 | time/time_elapsed = 69 | time/total_timesteps = 8192 | 


Please write an explanation of how to interpret the results.

KohlerHECTOR · 2024-04-08

Done, with a TODO.

docs/user_guide.md

github-actions · 2024-04-08T12:36:42Z

Hello,
The build of the doc succeeded. The documentation preview is available here:
https://rlberry-py.github.io/rlberry/preview_pr

github-actions · 2024-04-08T12:57:43Z

Hello,
The build of the doc succeeded. The documentation preview is available here:
https://rlberry-py.github.io/rlberry/preview_pr

KohlerHECTOR added 2 commits April 3, 2024 17:46

updt user guide

64b1210

updt user guide

89d08b6

KohlerHECTOR requested a review from JulienT01 April 3, 2024 15:48

[pre-commit.ci] auto fixes from pre-commit.com hooks

a3f61c9

for more information, see https://pre-commit.ci

KohlerHECTOR added the documentation and ready for review labels Apr 3, 2024

KohlerHECTOR and others added 3 commits April 4, 2024 10:29

added miising png in doc

e9f93c1

Merge branch 'main' of https://github.com/KohlerHECTOR/rlberry

1d77808

[pre-commit.ci] auto fixes from pre-commit.com hooks

266ede3

for more information, see https://pre-commit.ci

KohlerHECTOR added the Marathon label Apr 4, 2024

fixed indent in md

0cb83a0

TimotheeMathieu reviewed Apr 8, 2024

KohlerHECTOR and others added 2 commits April 8, 2024 14:29

rever some changes

54bc2f1

Merge branch 'rlberry-py:main' into main

b6a592c

KohlerHECTOR and others added 3 commits April 8, 2024 14:50

first pass after review. Need to add interpretation of adastop res. and maybe smoothing curve

939c2b2

[pre-commit.ci] auto fixes from pre-commit.com hooks

74d7338

for more information, see https://pre-commit.ci

added missing link and text

49a5413

KohlerHECTOR and others added 2 commits April 8, 2024 15:04

added small adastop interpretation

1885275

[pre-commit.ci] auto fixes from pre-commit.com hooks

4e1bca7

for more information, see https://pre-commit.ci
