ToolOrchestra: Elevating Intelligence via Efficient Model and Tool Orchestration Paper • 2511.21689 • Published 14 days ago • 100
shuoxing/qwen2-5-7b-full-pretrain-control-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 333k • Updated 9 days ago • 58
shuoxing/qwen2-5-7b-full-pretrain-mix-high-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 333k • Updated 9 days ago • 64
shuoxing/qwen3-4b-full-pretrain-control-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 196k • Updated 9 days ago • 62
shuoxing/qwen2-5-7b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 333k • Updated 9 days ago • 56
shuoxing/qwen3-4b-full-pretrain-mix-high-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 196k • Updated 9 days ago • 61
shuoxing/qwen-0_5b-full-pretrain-control-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 0.5B • Updated 9 days ago • 13
shuoxing/qwen3-4b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 196k • Updated 9 days ago • 73
shuoxing/qwen2-5-7b-full-pretrain-mix-low-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 333k • Updated 9 days ago • 90
shuoxing/qwen-0_5b-full-pretrain-mix-high-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 0.5B • Updated 9 days ago • 28
shuoxing/qwen-0_5b-full-pretrain-mix-mid-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 0.5B • Updated 9 days ago • 33
shuoxing/qwen-0_5b-full-pretrain-mix-low-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 0.5B • Updated 9 days ago • 25
shuoxing/qwen3-4b-full-pretrain-mix-low-tweet-1m-en-no-packing-new-sft-bs32 Text Generation • 196k • Updated 9 days ago • 99