NanoBEIR datasets Collection These datasets are compatible with the (Sparse)NanoBEIREvaluator in Sentence Transformers v5.2+. They are also compatible with the CrossEncoderNanoBEIREvaluator if a bm25 column is present. • 14 items • Updated about 3 hours ago • 7
TurkColBERT: Turkish Late-Interaction Models Collection TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval • 7 items • Updated 13 days ago • 5
Article TurkColBERT: A Benchmark of Dense and Late-Interaction Models for Turkish Information Retrieval 6 days ago • 18
Mistral Large 3 Collection A state-of-the-art, open-weight, general-purpose multimodal model with a granular Mixture-of-Experts architecture. • 4 items • Updated 8 days ago • 73
Ministral 3 Collection A collection of edge models, with Base, Instruct and Reasoning variants, in 3 different sizes: 3B, 8B and 14B. All with vision capabilities. • 9 items • Updated 8 days ago • 119
Article Building Jobly: Semantic Job Matching with RAG and Vector Embeddings 12 days ago • 12
Article Transformers v5: Simple model definitions powering the AI ecosystem 10 days ago • 234
Strong & Small Rerankers Collection Trained on MS MARCO using a similar strategy to https://huggingface.co/cross-encoder/ms-marco-MiniLM-L12-v2, except with the Ettin base models • 4 items • Updated 16 days ago • 1
Tarka Embed V1 Collection Efficient DFKD embeddings for language understanding • 4 items • Updated 8 days ago • 6
Article LightOnOCR-1B: The Case for End-to-End and Efficient Domain-Specific Vision-Language Models for OCR Oct 23 • 62