AGI-init/UnifiedML-legacy


πŸƒ Running The Code

To start a training session, once installed:

python Run.py

Defaults:

Agent=Agents.AC2Agent

task=atari/pong
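For example, to override both of these defaults in one command (using an agent and task featured later in this README):

python Run.py Agent=Agents.DQNAgent task=atari/breakout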

Plots, logs, generated images, and videos are automatically stored in: ./Benchmarking.


Welcome ye, weary Traveller.

Stop here and rest at our local tavern,

Where all your reinforcements and supervisions be served, à la carte!

Drink up! 🍻

πŸ–ŠοΈ Paper & Citing

For detailed documentation, see our πŸ“œ.

@article{UnifiedML,
  title   = {UnifiedML: A Unified Framework For Intelligence Training},
  author  = {Sam Lerman and Chenliang Xu},
  howpublished = {https://github.com/AGI-init/UnifiedML-legacy},
  year    = {2023}
}

If you use this work, please give us a star ⭐ and be sure to cite the above!

An acknowledgment to Denis Yarats, whose excellent DrQV2 repo inspired much of this library and its design.

β˜‚οΈ Unified Learning?

Yes.

Our AC2Agent supports discrete and continuous control, classification, generative modeling, and more.

See example scripts of various configurations below.

🔧 Setting Up

Let's get to business.

1. Clone The Repo

git clone git@github.com:agi-init/UnifiedML-legacy.git
cd UnifiedML-legacy

2. Gimme Some Dependencies

All dependencies can be installed via Conda:

conda env create --name ML --file=Conda.yml

3. Activate Your Conda Env.

conda activate ML

ⓘ Depending on your CUDA version, you may need to reinstall PyTorch with the matching CUDA build from pytorch.org/get-started after activating your Conda environment.

For example, for CUDA 11.6:

pip uninstall torch torchvision torchaudio
pip install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu116
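A quick optional sanity check that the CUDA-enabled build is visible to PyTorch:

python -c "import torch; print(torch.__version__, torch.cuda.is_available())"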

πŸ•ΉοΈ Installing The Suites

1. Atari Arcade


A collection of retro Atari games.

You can install the ROMs via AutoROM if you accept the license. First, install AutoROM:

pip install autorom

Then accept the license.

AutoROM --accept-license

2. DeepMind Control

Comes pre-installed! For any issues, consult the DMC repo.

▶️ Click to play

Video of different tasks in action.

3. Classify


Eight different ladybug species in the iNaturalist dataset.

All datasets come ready-to-use ✅

That's it.

💡 Train Atari example: python Run.py task=atari/mspacman

💡 Train DMC example: python Run.py task=dmc/cheetah_run

💡 Train Classify example: python Run.py task=classify/mnist

πŸ—„οΈ Key files

Run.py handles learning and evaluation loops, saving, distributed training, logging, plotting.

Environment.py handles rollouts.

./Agents contains self-contained agents.

πŸ” Full Tutorials

RL

πŸ” Click to interact

Train DQN Agent to play Ms. Pac-Man:

python Run.py task=atari/mspacman Agent=Agents.DQNAgent

——❖——

Humanoid from pixels with DrQV2 Agent, a state of the art algorithm for continuous control from images:

python Run.py task=dmc/humanoid_walk Agent=Agents.DrQV2Agent

⋆⋅☆⋅⋆

Play Super Mario Bros. with Dueling DQN Agent, an extension of DQN that uses dueling Q networks:

python Run.py task=mario Agent=Agents.DuelingDQNAgent

•⎽⎼⎻⎺⎺⎻⎼⎽⎽⎼✧ ☼ ☽ ✧⎼⎽⎽⎼⎻⎺⎺⎻⎼⎽•

The library's default Agent is our AC2 Agent (Agent=Agents.AC2Agent).

python Run.py
  • +agent.depth=5 can activate a self-supervisor to predict temporal dynamics for a number of timesteps ahead, similar to Dreamer and SPR.
  • +agent.num_actors=5 +agent.num_critics=5 can activate actor-critic ensembling.

In addition to RL, this agent supports classification, generative modeling, and various modes. Therefore we refer to it as a framework, not just an agent. The full array of the library's features and cross-domain compatibilities are supported by this agent.

⎽⎼⎻⎺⎺⎻⎼⎽⎽⎼⎻⎺⎺⎻⎼⎽⎽⎼⎻⎺⎺⎻⎼⎽⎽⎼⎻⎺⎺⎻⎼⎽

Save videos with vlog=true.

🎬 🎥 -> Benchmarking/<experiment>/<agent>/<suite>/<task>_<seed>_Video_Image/

Check out args.yaml for the full array of configurable options available, including

  • N-step rewards (nstep=)
  • Frame stack (frame_stack=)
  • Action repeat (action_repeat=)
  • & more, with per-task defaults in ./Hyperparams/task. Please share your hyperparams if you discover new or better ones! An illustrative combination of a few of these flags follows below.
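For instance (values here are illustrative, not tuned defaults):

python Run.py task=atari/pong nstep=5 frame_stack=4 action_repeat=2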

ⓘ If you'd like to discretize a continuous domain, pass in discrete=true and specify the number of discrete bins per action dimension via num_actions=. If you'd like to continuous-ize a discrete domain, pass in discrete=false. Action space conversions are experimental.
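For example, an illustrative discretization of a DMC continuous-control task using the flags just described (the bin count here is arbitrary):

python Run.py task=dmc/cheetah_run discrete=true num_actions=7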

💡 The below sections describe many features in other domains, but chances are those features will work in RL as well. For example, a cosine annealing learning rate schedule can be toggled with: lr_decay_epochs=100. It will anneal per-episode rather than per-epoch. Different model architectures, image transforms, EMAs, and more are all supported across domains!

The vast majority of these features haven't been tested outside of their respective domains (CV, RL, etc.), so there's plenty of research opportunity!

Classification

πŸ” Click to categorize

CNN on MNIST:

python Run.py task=classify/mnist 

Note: RL=false is the default for classify tasks, which keeps training at standard supervised-only classification.

Variations

Since this is UnifiedML, there are a couple of noteworthy variations. You can ignore these if you are only interested in standard classification via cross-entropy supervision.

  1. With RL=true, an augmented RL update joins the supervised learning update $\text{s.t. } reward = -error$ (experimental).

  2. Alternatively, and interestingly, supervise=false RL=true will only supervise via RL $reward = -error$. This is pure-RL training and actually works!
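Spelled out as a command, the second (pure-RL) variation on MNIST looks like:

python Run.py task=classify/mnist supervise=false RL=true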

Classify environments can actually be great testbeds for certain RL problems since they give near-instant and clear performance feedback.

Ignore these variations for doing standard classification.

Important features

Many popular features are unified in this library and generalized across RL/CV/generative domains, with more being added:

For example,

python Run.py task=classify/cifar10 weight_decay=0.01 transform="{RandomHorizontalFlip:{p:0.5}}" Eyes=Blocks.Architectures.ResNet18

The above reaches 94% accuracy on CIFAR-10 with a ResNet18, which is pretty good. Changing datasets/architectures is as easy as modifying the corresponding task= and Eyes= parts of the above script.

And if you set supervise=false RL=true, we get about the same score... via pure RL.

This library is meant to be useful for academic research, and out of the box supports many datasets, including

  • Tiny-ImageNet (task=classify/tinyimagenet),
  • iNaturalist (task=classify/inaturalist),
  • CIFAR-100 (task=classify/cifar100),
  • & more, normalized and requiring no manual preparation

Generative Modeling

πŸ” Click to synth

Via the generate=true flag:

python Run.py task=classify/mnist generate=true


Synthesized MNIST images, conjured up and imagined by a simple MLP.

Saves to ./Benchmarking/<experiment>/<Agent name>/<task>_<seed>_Video_Image/.

Defaults can be easily modified with custom architectures or even datasets as elaborated in Custom Architectures and Custom Datasets. Let's try the above with a CNN Discriminator:

python Run.py task=classify/mnist generate=true Discriminator=CNN +agent.num_critics=1

+agent.num_critics=1 uses only a single Discriminator rather than ensembling as is done in RL. See How Is This Possible? for more details on the unification.

Or a ResNet18:

python Run.py task=classify/mnist generate=true Discriminator=ResNet18

Let's speed up training by turning off the default image augmentation, which is overkill anyway for this simple case:

python Run.py task=classify/mnist generate=true Aug=Identity +agent.num_critics=1

Aug=Identity substitutes the default random cropping image-augmentation with the Identity function, thereby disabling it.

Generative mode implicitly treats training as offline, and assumes a replay is saved that can be loaded. As long as a dataset is available or a replay has been saved, generate=true will work for any defined visual task, making it a powerful hyper-parameter that can just work. For now, only visual (image) tasks are compatible.

It can even work with RL tasks (due to frame stack, the generated images are technically multi-frame videos).

python Run.py task=atari/breakout generate=true

Make sure you have saved a replay that can be loaded before doing this.

Saving

πŸ” Click to remember

Agents are automatically saved at the end of training:

python Run.py train_steps=2

Agents can be saved periodically and/or loaded with the save_per_steps= or load=true flags respectively:

# Saves periodically
python Run.py save_per_steps=100000

# Load
python Run.py load=true

Agents may be trained without saving by adding the save=false flag.

An experience replay can be saved and/or loaded with the replay.save=true or replay.load=true flags.

# Save
python Run.py replay.save=true

# Load
python Run.py replay.load=true

Online tasks, such as online RL, will create a new replay if replay.load=false, or (careful!) potentially delete the current replay at the end of training if replay.save=false.

By default, classify tasks are offline, meaning you don't have to worry about loading or saving replays. Since the dataset is static, creating/loading is handled automatically.

Click here to learn more about replays


In UnifiedML, replays are an efficient accelerated storage format for data that support both static and dynamic (changing/growing) datasets.

You can disable the use of replays with stream=true, which just sends data to the Agent directly from the environment. In RL, this is equivalent to on-policy training. In classification, it means you'll just use the PyTorch Dataset directly, without all the fancy replay features and accelerations.
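For example, to stream MNIST directly from its underlying PyTorch Dataset rather than a replay:

python Run.py task=classify/mnist stream=true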

Replays are recommended for RL because on-policy algorithmic support is currently limited.

~

Agents and replays save to ./Checkpoints and ./Datasets/ReplayBuffer respectively, per unique experiment; otherwise they are overwritten.

A unique experiment is distinguished by the flags: experiment=, Agent=, suite=, task_name=, and seed=.

You can change the Agent load/save path with load_path=/save_path=, and replay.path= for replays. All three accept string paths e.g. load_path='./Checkpoints/Exp/AC2Agent/classify/MNIST_1.pt'.

Offline RL

πŸ” Click to play retroactively

Offline means the dataset size doesn't grow.

From a saved experience replay, sans additional rollouts:

python Run.py task=atari/breakout offline=true

Assumes a replay is saved.

Implicitly treats replay.load=true and replay.save=true, and only does learning updates and evaluation rollouts.

offline=true is the default for classification, where datasets are automatically downloaded and created into offline replays.

Distributed

πŸ” Click to de-centralize

The simplest way to do distributed training is to use the parallel=true flag,

python Run.py parallel=true 

which automatically parallelizes the Encoder's "Eyes" across all visible GPUs. The Encoder is usually the most compute-intensive architectural portion.

To share whole agents across multiple parallel instances and/or machines,

Click to expand 📖

you can use the load_per_steps= flag.

For example, a data-collector agent and an update agent,

python Run.py learn_per_steps=0 replay.save=true load_per_steps=1
python Run.py offline=true replay.offline=false replay.save=true replay.load=true save_per_steps=2

in concurrent processes.

Since both use the same experiment name, they will save and load from the same agent and replay, thereby emulating distributed training. Just make sure the replay from the first script is created before launching the second script. Highly experimental!

Here is another example of distributed training, via shared replays:

python Run.py replay.save=true 

Then, in a separate process, after that replay has been created:

python Run.py replay.load=true replay.save=true 

Custom Architectures

πŸ” Click to construct

A rich and expressive command line syntax is available for selecting and customizing architectures such as those defined in ./Blocks/Architectures.

ResNet18 on CIFAR-10:

python Run.py task=classify/cifar10 Eyes=ResNet18 

Atari with ViT:

python Run.py Eyes=ViT +eyes.patch_size=7

Shorthands like Aug, Eyes, and Pool make it easy to plug and play custom architectures. All of an agent's architectural parts can be accessed, mixed, and matched with their corresponding recipe shorthand names.

Generally, the rule of thumb is Capital names for paths to classes (such as Eyes=Blocks.Architectures.MLP) and lowercase names for shortcuts to tinker with model args (such as +eyes.depth=1).

Architectures imported in Blocks/Architectures/__init__.py can be accessed directly without entering their full paths; Eyes=ViT works just as well as Eyes=Blocks.Architectures.ViT.

See more examples 📖

CIFAR-10 with ViT:

python Run.py Eyes=ViT task=classify/cifar10 ema=true weight_decay=0.01 +eyes.depth=6 +eyes.out_channels=512 +eyes.mlp_hidden_dim=512 transform="{RandomCrop:{size:32,padding:4},RandomHorizontalFlip:{}}" Aug=Identity

Here is a more complex example, disabling the Encoder's flattening of the feature map, and instead giving the Actor and Critic unique Attention Pooling operations on their trunks to pool the unflattened features. The Identity architecture disables that flattening component.

python Run.py task=classify/mnist Q_trunk=Transformer Pi_trunk=Transformer Pool=Identity

Here is a nice example of the critic using a small CNN for downsampling features:

python Run.py task=dmc/cheetah_run Q_trunk=CNN +q_trunk.depth=1 pool=Identity

A CNN Actor and Critic:

python Run.py Q_trunk=CNN Pi_trunk=CNN +q_trunk.depth=1 +pi_trunk.depth=1 Pool=Identity

A little secret: PyTorch code can be passed directly too, via quotes:

python Run.py "eyes='CNN(kwargs.input_shape,32,depth=3)'"
python Run.py "eyes='torch.nn.Conv2d(kwargs.input_shape[0],32,kernel_size=3)'"

Some blocks have default args which can be accessed with the kwargs. interpolation shown above.

An intricate example of the expressiveness of this syntax:

python Run.py Optim=SGD 'Pi_trunk="nn.Sequential(MLP(input_shape=kwargs.input_shape, output_shape=kwargs.output_shape),nn.ReLU(inplace=True))"' lr=0.01

Both the uppercase and lowercase syntax support direct function calls in place of the usual syntax, with function calls distinguished by quotes and parentheses.

In both syntaxes, the parser automatically registers the imports/class paths in Utils., including the modules/classes torch and torch.nn, and architectures/paths in ./Blocks/Architectures/ like CNN, for direct access without needing to type Utils..

To make a custom architecture, you can use any PyTorch module which outputs a tensor. Woohoo, done.

To make it mix-and-matchable throughout UnifiedML for arbitrary dimensionalities and domains, to generalize as much as possible, you can add:

  1. input_shape and output_shape arguments to the __init__ method, such that your architecture can have a defined adaptation scheme for different possible shapes.
  2. Support arbitrarily many inputs (such as by concatenating them) and weird shapes (by broadcasting them).
  3. A repr_shape(*_) method that pre-computes the output shape given a varying-number of input shape dimensions as arguments.

None of these add-ons are necessary, but if you include all of them, then your architecture can adapt to everything. There are lazy ways to hack all of these features into any architecture, or you can follow the pretty basic templates used in our existing array of architectures. Most of our architectures can probably be used to build whatever architecture you're trying to build, honestly, or at least something similar enough that you could have a good jumping-off point.

In short: to make your own architecture mix-and-matchable, just put it in a PyTorch module with initialization options for input_shape and output_shape, as in the architectures in ./Blocks/Architectures.
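As a minimal sketch (a hypothetical example, not an architecture shipped with the library), such a module might look like the following, assuming the input_shape/output_shape and repr_shape conventions described above:

import math
import torch
from torch import nn

class TinyMLP(nn.Module):
    def __init__(self, input_shape=(128,), output_shape=(64,), hidden_dim=256):
        super().__init__()
        # input_shape is accepted so the framework can pass it; LazyLinear below
        # infers the flattened input size on the first forward pass anyway
        self.output_shape = tuple(output_shape)
        out_dim = math.prod(self.output_shape)
        self.net = nn.Sequential(nn.LazyLinear(hidden_dim), nn.ReLU(),
                                 nn.Linear(hidden_dim, out_dim))

    def repr_shape(self, *_):
        # Pre-computed output shape, regardless of the input dims passed in
        return self.output_shape

    def forward(self, *inputs):
        # Accept arbitrarily many (batch, ...) inputs by flattening and concatenating
        x = torch.cat([i.flatten(start_dim=1) for i in inputs], dim=-1)
        return self.net(x).view(x.shape[0], *self.output_shape)

The names TinyMLP and hidden_dim are purely illustrative; the real templates to copy live in ./Blocks/Architectures.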

The Encoder Eyes automatically adapt 2d conv to 1d conv by the way (if data is 1d).

Custom Optimizers

πŸ” Click to search/explore

You can pass in a path to the Optim= flag or select a built-in PyTorch optimizer like SGD, or both, as below:

python Run.py Optim=Utils.torch.optim.SGD lr=0.1

Equivalently via the expressive recipe interface:

python Run.py Optim=SGD lr=0.1

or

python Run.py "optim='torch.optim.SGD(kwargs.params, lr=0.1)'"

In the first two examples, the lr= flag was optional. The default learning rate is 1e-4, and we could have written +optim.lr= instead.
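Spelled out, that alternative form would look like:

python Run.py Optim=SGD +optim.lr=0.1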

Per-block optimizers are also supported. For example, just the Encoder:

python Run.py encoder.Optim=SGD

Learning rate schedulers: Scheduler= works analogously to Optim=, or just use the lr_decay_epochs= shorthand for cosine annealing, e.g.

python Run.py task=classify/mnist lr_decay_epochs=100

Custom Env

πŸ” Click to let there be light

As an example of custom environments, we provide the Super Mario Bros. game environment in ./Datasets/Suites/SuperMario.py.

To use it, you can just pass in the path to Env= and specify the suite and the task_name of your choosing:

python Run.py Env=Datasets.Suites.SuperMario.SuperMario suite=SuperMario task_name=Mario


Mario trained via DQN.

Any Hyperparams you don't specify will be inherited from the default task, atari/pong in ./Hyperparams/task/atari/pong.yaml, or whichever task is selected.

ⓘ If you want to save Hyperparams and formally define a task, you can create files like ./Hyperparams/task/mario.yaml in the ./Hyperparams/task/ directory:

# ./Hyperparams/task/mario.yaml
defaults:
  - _self_

Env: Datasets.Suites.SuperMario.SuperMario
suite: SuperMario
task_name: Mario
discrete: true
action_repeat: 4
truncate_episode_steps: 250
nstep: 3
frame_stack: 4
train_steps: 3000000
stddev_schedule: 'linear(1.0,0.1,800000)'

Now you can launch Mario with:

python Run.py task=mario

You can also customize env params, such as worlds and stages, with the +env. syntax:

python Run.py task=mario +env.stage=2

Custom Dataset

πŸ” Click to read, parse, & boot up

You can pass in any Dataset as follows:

python Run.py task=classify/custom Dataset=torchvision.datasets.MNIST

That will launch MNIST. Another example, with a custom class and path:

python Run.py task=classify/custom Dataset=Datasets.Suites._TinyImageNet.TinyImageNet

This will initiate a classify task on the custom-defined TinyImageNet Dataset.

You can change the task name as it's saved for benchmarking and plotting, with task_name=. The default is the class name such as TinyImageNet.
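As a rough sketch (hypothetical class and module path, not part of the library), a custom Dataset can follow the usual torchvision-style (input, label) interface:

import torch
from torch.utils.data import Dataset

class MyImages(Dataset):
    def __init__(self, root='./Data', train=True, **kwargs):
        # root/train mirror torchvision conventions; load or index real samples here.
        # Random tensors stand in for actual data in this sketch.
        self.images = torch.rand(1000, 3, 32, 32)    # (N, channels, height, width)
        self.labels = torch.randint(0, 10, (1000,))  # 10 hypothetical classes

    def __len__(self):
        return len(self.images)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]

You would then point Dataset= at its import path, e.g. Dataset=MyDatasets.MyImages (a hypothetical path), just like the examples above.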

UnifiedML is compatible with datasets & domains besides Vision.

Thanks to dimensionality adaptivity (slide 12), you can, for example, train the default CNN architecture on raw 1D audio:

python Run.py task=classify/custom Dataset=Datasets.Suites._SpeechCommands.SpeechCommands Aug=Identity

This gets a perfect score on speech command classification from raw 1D audio with the default CNN setting.

More details and examples 📖

For a non-Vision/Audio tutorial, we provide a full end-to-end example in Crystal classification, reproducing the classification of crystal structures and space groups from X-ray diffraction patterns.


Note: You can also specify an independent test dataset explicitly with TestDataset=.

Recipes

πŸ” Learn to cook

Save hyperparams to .yaml files by defining them in the ./Hyperparams/task/ directory. There are many saved examples already.

If you've defined a .yaml file called my_recipe.yaml for example, you can use it via

python Run.py task=my_recipe

Please share your recipes in our Discussions page if you discover new or better hyperparams for a problem.

Recipes can also be defined temporarily via command line without saving them to .yaml files.

Below is a running list of some out-of-the-ordinary or interesting ones:

python Run.py Eyes=Sequential +eyes._targets_="[CNN, Transformer]" task=classify/mnist
python Run.py task=classify/mnist Pool=Sequential +pool._targets_="[Transformer, AvgPool]" +pool.positional_encodings=false
python Run.py task=classify/mnist Pool=Residual +pool.model=Transformer +pool.depth=2
python Run.py task=classify/mnist Pool=Sequential +pool._targets_="[ChannelSwap, Residual]" +'pool.model="MLP(kwargs.input_shape[-1])"' +'pool.down_sample="MLP(input_shape=kwargs.input_shape[-1])"'
python Run.py task=classify/mnist Pool=RN
python Run.py task=classify/mnist Pool=Sequential +pool._targets_="[RN, AvgPool]"
python Run.py task=classify/mnist Eyes=Perceiver +eyes.depths="[3, 3, 2]"  +eyes.num_tokens=128
python Run.py task=classify/mnist Predictor=Perceiver +predictor.token_dim=32
python Run.py task=classify/mnist Predictor=Perceiver train_steps=2
python Run.py task=dmc/cheetah_run Predictor=load +predictor.path=./Checkpoints/Exp/DQNAgent/classify/MNIST_1.pt +predictor.attr=actor.Pi_head +predictor.device=cpu save=false
python Run.py task=classify/mnist Eyes=Identity Predictor=Perceiver +predictor.depths=10
python Run.py Aug=Sequential +aug._targets_="[IntensityAug, RandomShiftsAug]" +aug.scale=0.05 aug.pad=4

These are also useful for testing whether I've broken things.

Experiment naming, plotting

πŸ” Click to see

Plots automatically save to ./Benchmarking/<experiment>/; the default experiment is experiment=Exp.

python Run.py

📈 📊 --> ./Benchmarking/Exp/

Optionally plot multiple experiments

python Run.py experiment=Exp2 plotting.plot_experiments="['Exp', 'Exp2']"

Alternatively, you can call Plot.py directly

python Plot.py plot_experiments="['Exp', 'Exp2']"

to generate plots. Here, the <experiment> directory name will be the underscore_concatenated union of all experiment names ("Exp_Exp2").

Plotting also accepts regular expressions. For example, to plot all experiments with Exp in the name:

python Plot.py plot_experiments="['Exp.*']"

Another option is to use WandB, which is supported by UnifiedML:

python Run.py logger.wandb=true

You can connect UnifiedML to your WandB account by first running wandb login in your Conda environment.

To do a hyperparameter sweep, just use the -m flag.

python Run.py -m task=atari/pong,classify/mnist seed=1,2,3 

Log video during evaluations with log_media=true.

Publishing

πŸ” Click to write your own paper

We have released our slide deck!

Templates available here

Feel free to use our UnifiedML templates and figures in your work, citing us of course.

Open-source research for minimal redundancy and optimal standardization is the way to go, balancing privacy and de-centrality, and streamlining successive works that depend on ours in good faith. Post your own designs and assets here in the discussion board. Read the rules to keep citations and credit attribution fair.

📊 Agents & Performances

Atari

We can attain 100% mean human-normalized score across the Atari-26 benchmark suite in about 1M environment steps.

The below example script shows how to launch training for just Pong and Breakout with AC2Agent:

python Run.py task=atari/pong,atari/breakout -m

The results are reported for all 26 games and 3 different agents:

Click here to see per-task results.

We found these results to be pretty stable across a range of exploration rates as well:


Each time point averages over 10 evaluation episodes (and 26 games).

DCGAN

The simplest way to do DCGAN is to use the DCGAN architecture:

python Run.py task=classify/celeba generate=true Discriminator=DCGAN.Discriminator Generator=DCGAN.Generator train_steps=50000

We can then improve the results, and speed up training tenfold, by modifying the hyperparameters:

python Run.py task=classify/celeba generate=true Discriminator=DCGAN.Discriminator Generator=DCGAN.Generator z_dim=100 Aug=Identity Optim=Adam '+optim.betas=[0.5, 0.999]' lr=2e-4 +agent.num_critics=1 train_steps=5000

⁉️ How is this possible?

We use our new Creator framework to unify RL discrete and continuous action spaces, as elaborated in our paper.

Then we frame actions as "predictions" in supervised learning. We can even augment supervised learning with an RL phase, treating reward as negative error.

For generative modeling, well, it turns out that the difference between a Generator-Discriminator and Actor-Critic is rather nominal.


🎓 Pedagogy and Research

All files are designed for pedagogical clarity and extensibility for research, to be useful for educational and innovative purposes, with simplicity at heart.

🧑‍🤝‍🧑 Contributing

Please support us financially by Sponsoring.

We are a nonprofit, single-PhD-student team. If possible, donations of compute resources are also appreciated.

Feel free to contact agi.__init__.

I am always looking for collaborators. Don't hesitate to volunteer in any way to help realize the full potential of this library.


MIT license included.

Non-legacy version: here.
