Deep Reinforcement Learning Course Leaderboard
This leaderboard displays trained agents from the Deep Reinforcement Learning Course.
Models are ranked using mean_reward - std_reward.
If you can't find your model, please wait for the next update (every 2 hours).
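The ranking rule above, mean_reward - std_reward, can be sketched in a few lines. This is only an illustration of the stated formula, not the leaderboard's actual code: the `Entry` type, the `score` and `rank` helpers, and the sample entries are all hypothetical.

```python
from typing import NamedTuple


class Entry(NamedTuple):
    """One evaluated model (hypothetical structure for illustration)."""
    model: str
    mean_reward: float
    std_reward: float


def score(entry: Entry) -> float:
    """Ranking score: mean reward minus its standard deviation."""
    return entry.mean_reward - entry.std_reward


def rank(entries: list[Entry]) -> list[Entry]:
    """Return entries sorted from highest to lowest score."""
    return sorted(entries, key=score, reverse=True)


# Made-up entries, not taken from the leaderboard.
entries = [
    Entry("user-a/ppo-demo", 250.0, 40.0),  # score 210.0
    Entry("user-b/dqn-demo", 260.0, 90.0),  # score 170.0
    Entry("user-c/a2c-demo", 200.0, 10.0),  # score 190.0
]

for position, entry in enumerate(rank(entries), start=1):
    print(position, entry.model, score(entry))
```

In this made-up example, the model with the highest mean reward (user-b) ranks last because its standard deviation is large. Subtracting the standard deviation favors agents that perform consistently across evaluation episodes.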
LunarLander-v2
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
10000 | pig-that-will-pierce-the-heavens | dogpizza/Deep-Reinforcement-Learning_Unit-8-1_PPO-LunarLander-v2 | -0.0600000000000022 | -89508.36 | 17.258640080301436 |
CartPole-v1
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
1000 | krishnadasar-sudheer-kumar | dogpizza/Deep-Reinforcement-Learning_Unit4_Cartpole-v1-modified | -16.650000000000006 | 272.3999938964844 | 320436.5 |
FrozenLake-v1-4x4-no_slippery
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
1000 | toomanygoodnamesarebeingusedsoIstickwithth | toomanygoodnamesarebeingusedsoIstickwithth/q-FrozenLake-v1-4x4-noSlippery | 0.6000000000000001 | 500.5 | 288.67 |
FrozenLake-v1-8x8-no_slippery
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
100 | vespasianvindex | butchland/q-FrozenLake-v1-8x8-nonslippery-work1 | 0.87 | 0.87 | 0.2 |
FrozenLake-v1-4x4
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
100 | francisco-perez-sorrosal | MarkusBaumgartner/mwb-rl-unit2-q-FrozenLake-v1-4x4-noSlippery | -0.0399999999999999 | 0.95 | 0.12 |
FrozenLake-v1-8x8
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
100 | TheFuriousGunner | turnip/huggingfaceclass-qtable-FrozenLake-v1-8x8-slip3 | -0.0399999999999999 | -0.35 | 0.17 |
Taxi-v3
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
1000 | toomanygoodnamesarebeingusedsoIstickwithth | dogpizza/Deep-Reinforcement-Learning_Unit2_q-Taxi-v3_reduced-decay-rate | -50.440000000000005 | -102.93 | 2.706732347314747 |
CarRacing-v0
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
10 | knutselmiddag123 | qgallouedec/ppo-CarRacing-v0-1293566242 | 500.25000000000006 | 896.87 | 176.26 |
CarRacing-v2
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
10 | Rudder-AmitBirmal | mehulbhosale/ppo-CarRacing-Rudder-v2 | 352.05999999999995 | 902.33 | 125.12 |
MountainCar-v0
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
100 | open-rl-leaderboard | MohamedMaged262/Q-Learning-using-GYM-MountainCar-v0-Environment | -130.01999999999998 | -110.88 | 143.38 |
SpaceInvadersNoFrameskip-v4
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
1000 | krishnadasar-sudheer-kumar | dogpizza/Deep-Reinforcement-Learning_Unit3_dqn-SpaceInvadersNoFrameskip-v4-v2 | -0.8100000000000023 | 14799.75 | 20276.64 |
PongNoFrameskip-v4
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
100 | open-rl-leaderboard | OpenDILabCommunity/PongNoFrameskip-v4-SampledEfficientZero | -0.0399999999999991 | -20.97 | -1000 |
BreakoutNoFrameskip-v4
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
100 | open-rl-leaderboard | qgallouedec/BreakoutNoFrameskip-v4-rainbow_atari-seed1 | -0.0899999999999998 | 424.81 | 206.31 |
QbertNoFrameskip-v4
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
10 | open-rl-leaderboard | OpenDILabCommunity/QbertNoFrameskip-v4-PPOOffPolicy | 8474.689999999999 | 16071.88 | 8863.18 |
BipedalWalker-v3
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
10 | OpenDILabCommunity | MattStammers/ppo-Bipedal_Walker_v3-HardcoreTrained-take2 | -11.110000000000014 | -119.12 | 101.52 |
Walker2DBulletEnv-v0
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
10 | ThomasSimonini | marci0929/Walker2DBulletEnv-Walker2DBulletEnv-v0-100k | 2285.6600000000003 | 2674.29 | 566.59 |
AntBulletEnv-v0
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
100 | EduardoCGarridoMerchan | sarahpuspdew/DeepRLCourse_Unit6-a2c-AntBulletEnv-v0 | 3421.2200000000003 | 3578.87 | 113.88 |
HalfCheetahBulletEnv-v0
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
10 | zoltantensorfow | qgallouedec/ddpg-HalfCheetahBulletEnv-v0-1712752751 | 2431.4300000000003 | -1605.41 | 1132.58 |
PandaReachDense-v2
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
100 | EduardoCGarridoMerchan | PeterDerLustige/a2c-PandaReachDense-v2-lr-0.0001 | -0.4499999999999999 | -10.03 | 0.08 |
PandaReachDense-v3
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
1000 | krishnadasar-sudheer-kumar | basakderyakilic/A2C-PandaReachDense-v3-PandaReachDense-v3-PandaReachDense-v3 | -0.1699999999999999 | -11.07 | 10.29 |
Pixelcopter-PLE-v0
Ranking | User | Model | Results | Mean Reward | Std Reward |
|---|---|---|---|---|---|
1000 | krishnadasar-sudheer-kumar | ajinkyag24/reinforcement_policy_gradient_with_pytorch_pixelcopter-v1 | -0.0099999999999997 | 183.17 | 169.58 |