baseten/btest-TinyLlama-1.1B-Chat-v1.0-NVIDIA-A10G-v0.20.0-TP1 • 19
baseten/whisper_trt_large_v3_fixed_20240624_NVIDIA_H100_80GB_HBM3_MIG_3g_40gb_0_13_0
baseten/btest-Qwen-0.5B-NVIDIA-H100-80GB-HBM3-v0.20.0-TP1
baseten/qwen3-rerank-8b-h100
baseten/qwen3-embed-8b-h100
baseten/qwen3-embed-0.6b-h100
baseten/orpheus-3b-0.1-ft-fp8 • 3B • 3
baseten/orpheus-3b-0.1-ft-fp8-fix • 3B • 27
baseten/orpheus-3b-0.1-ft-fp8nokv • 3B • 6
baseten/whisper_trt_large_v3_websocket_bs_24_NVIDIA_H100_80GB_HBM3_MIG_3g_40gb_0_13_0
baseten/Llama-3.2-3B-Instruct-fp8nokv • 3B • 5
baseten/whisper_trt_large_v3_websocket_bs_16_NVIDIA_H100_80GB_HBM3_MIG_3g_40gb_0_13_0
baseten/Llama-3.2-3B-Instruct-fp8 • 3B • 5
baseten/whisper_trt_large_v3_20250604_NVIDIA_L4_0_13_0
baseten/DeepSeek-R1-0528-FP4 • 397B • 7
baseten/whisper_trt_large_v3_turbo_exp20250522_NVIDIA_H100_80GB_HBM3_MIG_3g_40gb_0_13_0
baseten/whisper_trt_large_v3_turbo_exp20250522_NVIDIA_H100_80GB_HBM3_0_13_0
baseten/whisper_trt_large_v3_exp20250522_NVIDIA_H100_80GB_HBM3_0_13_0
baseten/whisper_trt_large_v3_exp20250522_NVIDIA_H100_80GB_HBM3_MIG_3g_40gb_0_13_0
baseten/whisper_trt_large_v3_turbo_test_NVIDIA_L4_0_16_0
baseten/Qwen2.5-32B-Instruct-128k • Text Generation • 33B • 86
baseten/btest-Llama-3.1-8B-Instruct-NVIDIA-H100-80GB-HBM3-v0.18.1-TP1
baseten/btest-Qwen-0.5B-NVIDIA-H100-80GB-HBM3-v0.18.1-TP1
baseten/Llama-4-Scout-17B-16E-fp4 • 62B • 9 • 1
baseten/Llama-4-Scout-17B-16E-fp8 • 108B • 6
baseten/btest-TinyLlama-1.1B-Chat-v1.0-NVIDIA-A10G-v0.20.0r-TP1
baseten/orpheus-3b-0.1-ft • Text-to-Speech • 4B • 23 • 2
baseten/whisper_trt_large_v3_turbo_test_NVIDIA_L4_0_18_2
397B • 1.67k • 1
baseten/Qwen2.5-Coder-32B-Instruct-128k • Text Generation • 33B • 4