
Collections


Collections including paper arxiv:2505.11594

inference acceleration

SageAttention2++: A More Efficient Implementation of SageAttention2

Paper • 2505.21136 • Published May 27 • 45

SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training

Paper • 2505.11594 • Published May 16 • 75

SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training

Paper • 2505.11594 • Published May 16 • 75

Latent Collaboration in Multi-Agent Systems

Paper • 2511.20639 • Published 16 days ago • 113

about 6 hours ago

LLM Pruning and Distillation in Practice: The Minitron Approach

Paper • 2408.11796 • Published Aug 21, 2024 • 57

TableBench: A Comprehensive and Complex Benchmark for Table Question Answering

Paper • 2408.09174 • Published Aug 17, 2024 • 52

To Code, or Not To Code? Exploring Impact of Code in Pre-training

Paper • 2408.10914 • Published Aug 20, 2024 • 45

Open-FinLLMs: Open Multimodal Large Language Models for Financial Applications

Paper • 2408.11878 • Published Aug 20, 2024 • 63

Multimodal Pre-training

Exploring pre-training paradigms of large models across modalities towards Artificial General Intelligence (AGI).

LoQT: Low Rank Adapters for Quantized Training

Paper • 2405.16528 • Published May 26, 2024 • 3

Scaling Vision Pre-Training to 4K Resolution

Paper • 2503.19903 • Published Mar 25 • 41

mohdmus99/slurm_commands

Viewer • Updated Nov 5, 2024 • 73 • 12

lldacing/flash-attention-windows-wheel

Updated May 31 • 280

SageAttention2 Technical Report: Accurate 4 Bit Attention for Plug-and-play Inference Acceleration

Paper • 2411.10958 • Published Nov 17, 2024 • 55

SpargeAttn: Accurate Sparse Attention Accelerating Any Model Inference

Paper • 2502.18137 • Published Feb 25 • 58

SageAttention3: Microscaling FP4 Attention for Inference and An Exploration of 8-Bit Training

Paper • 2505.11594 • Published May 16 • 75

SageAttention: Accurate 8-Bit Attention for Plug-and-play Inference Acceleration

Paper • 2410.02367 • Published Oct 3, 2024 • 49

