DGPO
Distillation-Guided Policy Optimization for Preserving Agentic RAG Capabilities

omron-sinicx/SearchR1-ppo-qwen2.5-3b-instruct (3B)
omron-sinicx/Qwen2.5-0.5B-Instruct-kd (0.5B)
omron-sinicx/Qwen2.5-0.5B-Instruct-sft (0.5B)
omron-sinicx/SearchR1-ppo-llama3.1-8b-instruct (8B)