Models
Datasets
Spaces
Docs
Enterprise
Pricing
Log In
Sign Up

Collections

Discover the best community collections!

Collections including paper arxiv:2401.17268

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 48
SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26, 2024 • 74
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Paper • 2401.16380 • Published Jan 29, 2024 • 50

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 105
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 24
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Paper • 1907.11692 • Published Jul 26, 2019 • 9
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 21

EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation

Paper • 2310.08185 • Published Oct 12, 2023 • 8
GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence

Paper • 2310.05388 • Published Oct 9, 2023 • 4
PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers

Paper • 2311.09180 • Published Nov 15, 2023 • 8
Weaver: Foundation Models for Creative Writing

Paper • 2401.17268 • Published Jan 30, 2024 • 45

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 39
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 81
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 83

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 189
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Paper • 2401.04658 • Published Jan 9, 2024 • 27
Weaver: Foundation Models for Creative Writing

Paper • 2401.17268 • Published Jan 30, 2024 • 45
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30, 2024 • 21

System 2 Attention (is something you might need too)

Paper • 2311.11829 • Published Nov 20, 2023 • 44
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems

Paper • 2311.11315 • Published Nov 19, 2023 • 8
Alignment for Honesty

Paper • 2312.07000 • Published Dec 12, 2023 • 15
Steering Llama 2 via Contrastive Activation Addition

Paper • 2312.06681 • Published Dec 9, 2023 • 14

A Zero-Shot Language Agent for Computer Control with Structured Reflection

Paper • 2310.08740 • Published Oct 12, 2023 • 16
AgentTuning: Enabling Generalized Agent Abilities for LLMs

Paper • 2310.12823 • Published Oct 19, 2023 • 36
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

Paper • 2308.10848 • Published Aug 21, 2023 • 1
CLEX: Continuous Length Extrapolation for Large Language Models

Paper • 2310.16450 • Published Oct 25, 2023 • 10

Large-Scale Automatic Audiobook Creation

Paper • 2309.03926 • Published Sep 7, 2023 • 55
Agents: An Open-source Framework for Autonomous Language Agents

Paper • 2309.07870 • Published Sep 14, 2023 • 42
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 53
StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 31

TinyLlama: An Open-Source Small Language Model

Paper • 2401.02385 • Published Jan 4, 2024 • 95
MM-LLMs: Recent Advances in MultiModal Large Language Models

Paper • 2401.13601 • Published Jan 24, 2024 • 48
SliceGPT: Compress Large Language Models by Deleting Rows and Columns

Paper • 2401.15024 • Published Jan 26, 2024 • 74
Rephrasing the Web: A Recipe for Compute and Data-Efficient Language Modeling

Paper • 2401.16380 • Published Jan 29, 2024 • 50

DocLLM: A layout-aware generative language model for multimodal document understanding

Paper • 2401.00908 • Published Dec 31, 2023 • 189
Lightning Attention-2: A Free Lunch for Handling Unlimited Sequence Lengths in Large Language Models

Paper • 2401.04658 • Published Jan 9, 2024 • 27
Weaver: Foundation Models for Creative Writing

Paper • 2401.17268 • Published Jan 30, 2024 • 45
Efficient Tool Use with Chain-of-Abstraction Reasoning

Paper • 2401.17464 • Published Jan 30, 2024 • 21

Attention Is All You Need

Paper • 1706.03762 • Published Jun 12, 2017 • 105
BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding

Paper • 1810.04805 • Published Oct 11, 2018 • 24
RoBERTa: A Robustly Optimized BERT Pretraining Approach

Paper • 1907.11692 • Published Jul 26, 2019 • 9
DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter

Paper • 1910.01108 • Published Oct 2, 2019 • 21

System 2 Attention (is something you might need too)

Paper • 2311.11829 • Published Nov 20, 2023 • 44
TPTU-v2: Boosting Task Planning and Tool Usage of Large Language Model-based Agents in Real-world Systems

Paper • 2311.11315 • Published Nov 19, 2023 • 8
Alignment for Honesty

Paper • 2312.07000 • Published Dec 12, 2023 • 15
Steering Llama 2 via Contrastive Activation Addition

Paper • 2312.06681 • Published Dec 9, 2023 • 14

EIPE-text: Evaluation-Guided Iterative Plan Extraction for Long-Form Narrative Text Generation

Paper • 2310.08185 • Published Oct 12, 2023 • 8
GROVE: A Retrieval-augmented Complex Story Generation Framework with A Forest of Evidence

Paper • 2310.05388 • Published Oct 9, 2023 • 4
PEARL: Personalizing Large Language Model Writing Assistants with Generation-Calibrated Retrievers

Paper • 2311.09180 • Published Nov 15, 2023 • 8
Weaver: Foundation Models for Creative Writing

Paper • 2401.17268 • Published Jan 30, 2024 • 45

A Zero-Shot Language Agent for Computer Control with Structured Reflection

Paper • 2310.08740 • Published Oct 12, 2023 • 16
AgentTuning: Enabling Generalized Agent Abilities for LLMs

Paper • 2310.12823 • Published Oct 19, 2023 • 36
AgentVerse: Facilitating Multi-Agent Collaboration and Exploring Emergent Behaviors

Paper • 2308.10848 • Published Aug 21, 2023 • 1
CLEX: Continuous Length Extrapolation for Large Language Models

Paper • 2310.16450 • Published Oct 25, 2023 • 10

interesting stuff

Chain-of-Verification Reduces Hallucination in Large Language Models

Paper • 2309.11495 • Published Sep 20, 2023 • 39
Adapting Large Language Models via Reading Comprehension

Paper • 2309.09530 • Published Sep 18, 2023 • 81
CulturaX: A Cleaned, Enormous, and Multilingual Dataset for Large Language Models in 167 Languages

Paper • 2309.09400 • Published Sep 17, 2023 • 85
Language Modeling Is Compression

Paper • 2309.10668 • Published Sep 19, 2023 • 83

Large-Scale Automatic Audiobook Creation

Paper • 2309.03926 • Published Sep 7, 2023 • 55
Agents: An Open-source Framework for Autonomous Language Agents

Paper • 2309.07870 • Published Sep 14, 2023 • 42
PDFTriage: Question Answering over Long, Structured Documents

Paper • 2309.08872 • Published Sep 16, 2023 • 53
StarCoder: may the source be with you!

Paper • 2305.06161 • Published May 9, 2023 • 31

Previous
1
2
Next

Company

TOS Privacy About Jobs

Website

Models Datasets Spaces Pricing Docs