Let LLMs wander - Engineering RL Environments
Reinforcement Learning Environments are little worlds
where models can act, get rewards, and learn.
I've been exploring how to design them, figuring out what works and what doesn't.
If you want to learn how to build them, I recorded a practical intro video.
You'll also see how to turn Liquid AI LFM2-2.6B into a Tic-tac-toe master.
🎥 Engineering RL Environments video: https://www.youtube.com/watch?v=71V3fTaUp2Q
---
🌱 LLM RL Environments Lil Course: https://github.com/anakin87/llm-rl-environments-lil-course
🤖🕹️ Play against the trained model: anakin87/LFM2-2.6B-mr-tictactoe
HF collection (datasets + models): https://huggingface.co/collections/anakin87/lfm2-26b-mr-tic-tac-toe
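
To make the "little world" idea concrete, here is a minimal sketch of what a Tic-tac-toe environment could look like. It is not taken from the course or the video: the class name `TicTacToeEnv`, the `reset`/`step` interface, the random opponent, and the reward values (+1 win, 0 draw, -1 loss or illegal move) are all assumptions chosen for illustration. In an LLM setup, the action would presumably be parsed from the model's generated text rather than passed in as an integer.

```python
import random

# All eight winning lines on a 3x3 board, as cell indices 0-8.
WIN_LINES = [
    (0, 1, 2), (3, 4, 5), (6, 7, 8),  # rows
    (0, 3, 6), (1, 4, 7), (2, 5, 8),  # columns
    (0, 4, 8), (2, 4, 6),             # diagonals
]


class TicTacToeEnv:
    """Hypothetical toy environment: the agent plays X against a random O."""

    def __init__(self, seed=0):
        self.rng = random.Random(seed)  # drives the opponent's moves
        self.board = [" "] * 9

    def reset(self):
        """Start a new episode and return the first observation."""
        self.board = [" "] * 9
        return self.render()

    def render(self):
        """Text observation: three rows of cells separated by '|'."""
        return "\n".join("|".join(self.board[i:i + 3]) for i in (0, 3, 6))

    def winner(self):
        """Return 'X' or 'O' if a line is complete, else None."""
        for a, b, c in WIN_LINES:
            if self.board[a] != " " and self.board[a] == self.board[b] == self.board[c]:
                return self.board[a]
        return None

    def step(self, action):
        """Apply the agent's move; return (observation, reward, done)."""
        legal = isinstance(action, int) and 0 <= action <= 8 and self.board[action] == " "
        if not legal:
            # Assumed design choice: an illegal move ends the episode with a penalty.
            return self.render(), -1.0, True
        self.board[action] = "X"
        if self.winner() == "X":
            return self.render(), 1.0, True
        free = [i for i, cell in enumerate(self.board) if cell == " "]
        if not free:
            return self.render(), 0.0, True  # board full: draw
        self.board[self.rng.choice(free)] = "O"  # random opponent replies
        if self.winner() == "O":
            return self.render(), -1.0, True
        return self.render(), 0.0, False


def play_random_episode(env):
    """Roll out one episode with a random legal-move policy; return the final reward."""
    env.reset()
    done, reward = False, 0.0
    while not done:
        legal_moves = [i for i, cell in enumerate(env.board) if cell == " "]
        _, reward, done = env.step(random.choice(legal_moves))
    return reward
```

A training loop would replace the random policy in `play_random_episode` with the model being trained and use the returned reward as the learning signal.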