Keeping you up to date with the latest trends and best performing architectures in this fast evolving field.
Current language models fall short in understanding aspects of the world not easily described in words, and struggle with complex, long-horizon tasks.
Harnessing the power of human-annotated data through Supervised Fine-Tuning (SFT) is pivotal for advancing Large Language Models (LLMs).
We study the fractal structure of language, aiming to provide a precise formalism for quantifying properties that may have been previously suspected but not formally shown.
While dense retrieval has been shown to be effective and efficient across tasks and languages, it remains difficult to create effective fully zero-shot dense retrieval systems when no relevance label is available.
Neural information retrieval (IR) has greatly advanced search and other knowledge-intensive language tasks.
Systems for Open-Domain Question Answering (OpenQA) generally depend on a retriever for finding candidate passages...
Pre-trained language models are increasingly important components across multiple information retrieval paradigms...
Retrieval-augmented language models can better adapt to changes in world state and incorporate long-tail knowledge.
Large language models (LLMs) inevitably exhibit hallucinations since the accuracy of generated texts cannot be secured solely by the parametric knowledge they encapsulate.
The rapid development of large language models has revolutionized code intelligence in software development.
Recent years have witnessed remarkable advances in artificial intelligence generated content (AIGC)...
Language models (LMs) have become ubiquitous in both NLP research and in commercial product offerings.
Large language models (LLMs) are trained on massive internet corpora that often contain copyrighted content.
Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the presence of many downstream tasks, fine-tuning is parameter inefficient: an entire new model is required for every task.
Transformer has been considered the dominating neural architecture in NLP and CV, mostly under supervised settings...
With the widespread use of large language models (LLMs) in NLP tasks, researchers have discovered...
The ML community is rapidly exploring techniques for prompting language models (LMs) and for stacking them into pipelines that solve complex tasks.
Learned representations are a central component in modern ML systems, serving a multitude of downstream tasks.
Vision Transformers (ViT) have been shown to attain highly competitive performance for a wide range of vision applications, such as image classification, object detection and semantic image segmentation.
Is vision good enough for language? Recent advancements in multimodal models primarily stem from the powerful reasoning abilities of large language models (LLMs).