
Papers Read on AI

Keeping you up to date with the latest trends and best performing architectures in this fast evolving…

Episodes

Total: 205

Large language models (LLMs) demonstrate powerful capabilities, but they still face challenges in pr…

This paper introduces PowerInfer, a high-speed Large Language Model (LLM) inference engine on a personal computer (PC)…

Reinforcement Learning (RL) offers a versatile framework for achieving long-term goals. Its generali…

Large language models have exhibited emergent abilities, demonstrating exceptional performance across…

Sparse Mixture-of-Experts (MoE) is a neural architecture design that can be utilized to add learnable…

This paper surveys research works in the quickly advancing field of instruction tuning (IT), a crucial…

We present MegaBlocks, a system for efficient Mixture-of-Experts (MoE) training on GPUs. Our system…

The capacity of a neural network to absorb information is limited by its number of parameters. Conditional computation…

We introduce a novel sequential modeling approach which enables learning a Large Vision Model (LVM)…

We introduce Magicoder, a series of fully open-source (code, weights, and data) Large Language Models (LLMs) for code…

Foundation models, now powering most of the exciting applications in deep learning, are almost universally…

We introduce Adversarial Diffusion Distillation (ADD), a novel training approach that efficiently samples…

The dominant paradigm for instruction tuning is the random-shuffled training of maximally diverse instruction…

Weight initialization plays an important role in neural network training. Widely used initialization…

Denoising diffusion models (DDMs) have attracted attention for their exceptional generation quality…

This paper does not present a novel method. Instead, it delves into an essential, yet must-know baseline…

Large Language Models (LLMs) have shown impressive abilities in natural language understanding and generation…

Large language models (LLMs) have demonstrated remarkable performance and tremendous potential across…

Large language models (LLMs) have dramatically enhanced the field of language intelligence, as demonstrated…

Generating step-by-step "chain-of-thought" rationales improves language model performance on complex reasoning…