Integration of Continual Learning tasks and algorithm (WIP) #45

kalifou · 2019-05-21T20:17:12Z

This PR is a draft - the code is still in development:

Added tasks for CL: circular and square-shaped movement around a target, and reaching a target (fex updates)
Updated data generation: on-policy, grid walker, generative replay
CL algo: policy distillation
Evaluation of catastrophic forgetting
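For readers unfamiliar with policy distillation, the core idea can be sketched as a loss that pulls a student policy's action distribution toward a teacher's. This is a minimal illustration and not the code in this PR. The function names, the use of a KL divergence, and the choice to apply the temperature to the teacher only are all assumptions.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw action scores into probabilities, softened by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def distillation_loss(teacher_logits, student_logits, temperature=1.0, eps=1e-12):
    """KL(teacher || student) for one observation's action distribution.

    Assumption: the temperature is applied to the teacher only; other
    variants scale both distributions.
    """
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits)
    # Clamp student probabilities so a vanishing q never divides by zero.
    return sum(pi * math.log(pi / max(qi, eps)) for pi, qi in zip(p, q) if pi > 0.0)
```

The loss is zero when the student reproduces the teacher's distribution and grows as the two diverge. In training it would be averaged over observations from the teacher's dataset.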

TODO

doc & tests for CF
doc & tests for Distillation
doc & tests for all envs
Refactor code: config files, util functions ...
fix for multiprocessing of on-policy data generation
i.e.:
Loading a model without an environment, this model cannot be trained until it has a valid environment.
THCudaCheck FAIL file=/pytorch/aten/src/THC/THCGeneral.cpp line=55 error=3 : initialization error
Process Process-2:
Traceback (most recent call last):
  File "/home/rene/anaconda3/envs/py35/lib/python3.5/multiprocessing/process.py", line 252, in _bootstrap
    self.run()
  File "/home/rene/anaconda3/envs/py35/lib/python3.5/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/rene/Documents/duplicate/robotics-rl-srl/rl_baselines/utils.py", line 186, in _run
    env_object=None)
  File "/home/rene/Documents/duplicate/robotics-rl-srl/state_representation/models.py", line 97, in loadSRLModel
    split_dimensions=split_dimensions, inverse_model_type=inverse_model_type)
  File "/home/rene/Documents/duplicate/robotics-rl-srl/state_representation/models.py", line 173, in __init__
    self.model = self.model.to(self.device)
  File "/home/rene/anaconda3/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 381, in to
    return self._apply(convert)
  File "/home/rene/anaconda3/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 187, in _apply
    module._apply(fn)
  File "/home/rene/anaconda3/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 193, in _apply
    param.data = fn(param.data)
  File "/home/rene/anaconda3/envs/py35/lib/python3.5/site-packages/torch/nn/modules/module.py", line 379, in convert
    return t.to(device, dtype if t.is_floating_point() else None, non_blocking)
  File "/home/rene/anaconda3/envs/py35/lib/python3.5/site-packages/torch/cuda/__init__.py", line 162, in _lazy_init
    torch._C._cuda_init()
RuntimeError: cuda runtime error (3) : initialization error at /pytorch/aten/src/THC/THCGeneral.cpp:55
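The traceback above shows CUDA initialization failing inside a `multiprocessing` child process. One common cause of this (not confirmed for this PR) is that the parent had already initialized CUDA before the child was forked. A possible direction is to start workers with the `spawn` method, so that each child begins from a fresh interpreter. The sketch below uses only the standard library. `load_model_in_worker` and `run_worker` are hypothetical names, and a placeholder string stands in for the real model loading.

```python
import multiprocessing as mp


def load_model_in_worker(model_path, result_queue):
    """Hypothetical worker: build the model inside the child process."""
    # In the real code, the SRL model loading and the move to the GPU would
    # happen here, in the child; a placeholder string stands in for the model.
    result_queue.put("loaded:" + model_path)


def make_spawn_context():
    """Return a context whose children start from a fresh interpreter."""
    return mp.get_context("spawn")


def run_worker(model_path):
    """Run the worker in a spawned process and return what it reports back."""
    ctx = make_spawn_context()
    queue = ctx.Queue()
    process = ctx.Process(target=load_model_in_worker, args=(model_path, queue))
    process.start()
    result = queue.get()  # read before join so the child can flush its queue
    process.join()
    return result
```

Because `spawn` re-imports the main module in the child, `run_worker(...)` should be called from under an `if __name__ == "__main__":` guard, and the worker function must be defined at module level so it can be pickled.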

Caselles and others added 30 commits March 14, 2019 11:56

Added targets files

967e4ac

new tasks for continual learning: random target, circular and square shaped movement

cbffb3b

Merge branch 'circular_movement_omnibot' of https://github.com/GaspardQin/robotics-rl-srl into circular_movement_omnibot

8949baa

adding args for learning the CL tasks

8f23671

Merge branch 'circular_movement_omnibot' of https://github.com/GaspardQin/robotics-rl-srl into circular_movement_omnibot

6750919

collect CL args for replay

daff917

Merge branch 'circular_movement_omnibot' of https://github.com/GaspardQin/robotics-rl-srl into circular_movement_omnibot

ad352c9

WIP on continual tasks

bc21db4

Continual tasks: added vizu and solved a few bugs

c68baeb

Solved bug on history not getting emptied between episodes

a1d4da9

add penalty for bumping

bf9c6c9

coeff for circular task

7de393d

fix reward shaping with the product operator

edd82fa

adding new task - eight shape (draft)

ed6923c

On-Policy dataset-generator

1920de4

add small fix

e6bdde1

Merge branch 'master' of https://github.com/GaspardQin/robotics-rl-srl into circular_movement_omnibot

93ea793

Generative Replay for Dataset generation

a5b6623

Merge branch 'circular_movement_omnibot' of https://github.com/GaspardQin/robotics-rl-srl into circular_movement_omnibot

b976aa8

fix to on-policy generation for srl based policies

71f29fa

fix to loading args for replay

413c447

Merge branch 'circular_movement_omnibot' of https://github.com/GaspardQin/robotics-rl-srl into circular_movement_omnibot

e99f6f2

first steps towards policy distillation

6082a68

small fix (init OmniRobotManagerBase)

a6dbfe8

cleaning pkgs imports

bc224b8

clean up & loading srl model in distillation script

bcb12db

cross-evaluation

4fb1e7b

cross-evaluation

deaf450

read-me update

ca5df77

plot results

c2d0405

kalifou and others added 30 commits June 13, 2019 11:52

Merge branch 'circular_movement_omnibot' into escape_dev

af944ee

fixed orientation for the chasing agent

486ea7e

bug fixed

12cd0c0

target position update

77f7267

fix merger in case of distillation

7f2ce56

reward update

fe09e49

Merge branch 'escape_dev' of https://github.com/GaspardQin/robotics-rl-srl into escape_dev

b8899f9

reward can be float for circular task and escaping task

ed26f2e

a new dataset merger for the balanced timesteps settings during the merge

83bed1e

Merge pull request #8 from GaspardQin/circular_movement_omnibot_data_fusioner_issue

d8fdb05

reward can be float for circular task and escaping task

Revert "reward can be float for circular task and escaping task"

8701bb0

dataset manager

fbde89d

separator

52e4730

data separator

99d3039

separator

ce2d7e4

sparser dataset

f6c0f03

resampling of data

d45a369

float reward data merger

e622e1f

Merge pull request #9 from GaspardQin/revert-8-circular_movement_omnibot_data_fusioner_issue

e823cbb

bug fix for dataset_merger

separator

6c29da3

dataset_merger can preserve the original dataset for further use

a1d2639

cleaning

bf50fc5

learning

24b16be

preserve original data after merge

6e2272c

resampling for the distillation

d10e9b6

fail to resample, delete

cleaning

11586d1

Merge pull request #11 from GaspardQin/escape_dev

604b1ac

adding escaping task and modify the `dataset merger` to give the option for preserving the dataset after the merge

test4esc&clearning

d4c2bd9

Delete delete_val.ipynb

9527b00

delete jupyter notebook created for testing purpose

Update environment.yml

e2f4c76

kalifou commented May 21, 2019 • edited by ncble
