Papers Read on AI

Keeping you up to date with the latest trends and best performing architectures in this fast evolvin

Episodes

Total: 205

Since the advent of personal computing devices, intelligent personal assistants (IPAs) have been one

We posit that to achieve superhuman agents, future models require superhuman feedback in order to pr

Recently the state space models (SSMs) with efficient hardware-aware designs, i.e., Mamba, have show

Code generation problems differ from common natural language problems - they require matching the ex

As large language models (LLMs) are adopted as a fundamental component of language technologies, it

This work elicits LLMs' inherent ability to handle long contexts without fine-tuning. The limited le

Information extraction (IE) aims to extract structural knowledge (such as entities, relations, and e

In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for managi

The utilization of long contexts poses a big challenge for large language models due to their limite

Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the pres

Mixtral of Experts

2024/1/12

We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same a

State Space Models (SSMs) have become serious contenders in the field of sequential modeling, challe

This paper presents the first few-shot LLM-based chatbot that almost never hallucinates and has high

With the burgeoning growth of online video platforms and the escalating volume of video content, the

The recent development of large multimodal models (LMMs), especially GPT-4V(ision) and Gemini, has b

In the era of advanced multimodal learning, multimodal large language models (MLLMs) such as GPT-4V

Diffusion-model-based text-to-image generation has achieved impressive results recently. Although current

Driven by curiosity, humans have continually sought to explore and understand the world around them,

This paper introduces 26 guiding principles designed to streamline the process of querying and promp

With the widespread adoption of Large Language Models (LLMs), many deep learning practitioners are l