Gamayun's Path to Multilingual Mastery: Cost-Efficient Training of a 1.5B-Parameter LLM Paper • 2512.21580 • Published 16 days ago • 7
Kandinsky 5.0 Video Pro Diffusers Collection • Kandinsky 5.0 Video Pro is a 19B model that generates high-quality HD videos from English and Russian prompts with controllable camera motion. • 4 items • Updated 27 days ago • 10
NeMo Gym Collection • Collection of RL-verifiable data for NeMo Gym • 13 items • Updated 17 days ago • 34
T-pro 2.0: An Efficient Russian Hybrid-Reasoning Model and Playground Paper • 2512.10430 • Published 30 days ago • 113
When Models Lie, We Learn: Multilingual Span-Level Hallucination Detection with PsiloQA Paper • 2510.04849 • Published Oct 6, 2025 • 114
Sample More to Think Less: Group Filtered Policy Optimization for Concise Reasoning Paper • 2508.09726 • Published Aug 13, 2025 • 15
SONAR-LLM: Autoregressive Transformer that Thinks in Sentence Embeddings and Speaks in Tokens Paper • 2508.05305 • Published Aug 7, 2025 • 46
Falcon-H1: A Family of Hybrid-Head Language Models Redefining Efficiency and Performance Paper • 2507.22448 • Published Jul 30, 2025 • 68
Skywork-Reward-V2 Collection • Scaling preference data curation to the extreme • 9 items • Updated Jul 4, 2025 • 26
Geopolitical biases in LLMs: what are the "good" and the "bad" countries according to contemporary language models Paper • 2506.06751 • Published Jun 7, 2025 • 71
Exploring the Latent Capacity of LLMs for One-Step Text Generation Paper • 2505.21189 • Published May 27, 2025 • 61
Quartet: Native FP4 Training Can Be Optimal for Large Language Models Paper • 2505.14669 • Published May 20, 2025 • 78
Falcon-H1 Collection • Falcon-H1 Family of Hybrid-Head Language Models (Transformer-SSM), including 0.5B, 1.5B, 1.5B-Deep, 3B, 7B, and 34B sizes (pretrained & instruction-tuned). • 39 items • Updated about 15 hours ago • 57
Insights into DeepSeek-V3: Scaling Challenges and Reflections on Hardware for AI Architectures Paper • 2505.09343 • Published May 14, 2025 • 74