Keeping you up to date with the latest trends and best performing architectures in this fast evolvin
Since the advent of personal computing devices, intelligent personal assistants (IPAs) have been one
We posit that to achieve superhuman agents, future models require superhuman feedback in order to pr
Recently, state space models (SSMs) with efficient hardware-aware designs, i.e., Mamba, have show
Code generation problems differ from common natural language problems: they require matching the ex
As large language models (LLMs) are adopted as a fundamental component of language technologies, it
This work elicits LLMs' inherent ability to handle long contexts without fine-tuning. The limited le
Information extraction (IE) aims to extract structural knowledge (such as entities, relations, and e
In the era of large language models, Mixture-of-Experts (MoE) is a promising architecture for managi
The utilization of long contexts poses a big challenge for large language models due to their limite
Fine-tuning large pre-trained models is an effective transfer mechanism in NLP. However, in the pres
We introduce Mixtral 8x7B, a Sparse Mixture of Experts (SMoE) language model. Mixtral has the same a
State Space Models (SSMs) have become serious contenders in the field of sequential modeling, challe
This paper presents the first few-shot LLM-based chatbot that almost never hallucinates and has high
With the burgeoning growth of online video platforms and the escalating volume of video content, the
The recent development of large multimodal models (LMMs), especially GPT-4V(ision) and Gemini, has b
In the era of advanced multimodal learning, multimodal large language models (MLLMs) such as GPT-4V
Diffusion-model-based text-to-image generation has achieved impressive results recently. Although current
Driven by curiosity, humans have continually sought to explore and understand the world around them,
This paper introduces 26 guiding principles designed to streamline the process of querying and promp
With the widespread adoption of Large Language Models (LLMs), many deep learning practitioners are l