PaTH Attention: Position Encoding via Accumulating Householder Transformations (arXiv:2505.16381, published May 22, 2025)
Decoder-Hybrid-Decoder Architecture for Efficient Reasoning with Long Generation (arXiv:2507.06607, published Jul 9, 2025)
Phi-4-Mini-Reasoning: Exploring the Limits of Small Reasoning Language Models in Math (arXiv:2504.21233, published Apr 30, 2025)
Reinforcement Learning for Reasoning in Large Language Models with One Training Example (arXiv:2504.20571, published Apr 29, 2025)
Phi-4-Mini Technical Report: Compact yet Powerful Multimodal Language Models via Mixture-of-LoRAs (arXiv:2503.01743, published Mar 3, 2025)
Evaluating Hallucinations in Chinese Large Language Models (arXiv:2310.03368, published Oct 5, 2023)
A Controlled Study on Long Context Extension and Generalization in LLMs (arXiv:2409.12181, published Sep 18, 2024)
Parallelizing Linear Transformers with the Delta Rule over Sequence Length (arXiv:2406.06484, published Jun 10, 2024)
Gated Slot Attention for Efficient Linear-Time Sequence Modeling (arXiv:2409.07146, published Sep 11, 2024)
Samba: Simple Hybrid State Space Models for Efficient Unlimited Context Language Modeling (arXiv:2406.07522, published Jun 11, 2024)
Sparse Modular Activation for Efficient Sequence Modeling (arXiv:2306.11197, published Jun 19, 2023)