Keeping you up to date with the latest trends and best-performing architectures in this fast evolvin
We introduce phi-3-mini, a 3.8 billion parameter language model trained on 3.3 trillion tokens, whos
Large-scale recommendation systems are characterized by their reliance on high cardinality, heteroge
We study how to apply large language models to write grounded and organized long-form articles from
In this work, we introduce Mini-Gemini, a simple and effective framework enhancing multi-modality Vi
We present InstantMesh, a feed-forward framework for instant 3D mesh generation from a single image,
We analyze how well pre-trained large language models (e.g., Llama2, GPT-4, Claude 3) can do li
Researchers have made significant progress in automating the software development process in the pas
Large language models (LLMs), exemplified by ChatGPT, have gained considerable attention for their e
In this study, we propose AniPortrait, a novel framework for generating high-quality animation drive
Generating long-form 44.1kHz stereo audio from text prompts can be computationally demanding. Furthe
Creating high-fidelity 3D head avatars has always been a research hotspot, but there remains a great
Parameter-efficient fine-tuning (PEFT) methods seek to adapt large models via updates to a small num
Large language models (LLMs) often generate content that contains factual errors when responding to
We present Jamba, a new base large language model based on a novel hybrid Transformer-Mamba mixture-
Recent years have witnessed rapid development of large language models (LLMs). Despite the stron
We present MegaBlocks, a system for efficient Mixture-of-Experts (MoE) training on GPUs. Our system
We introduce VoiceCraft, a token infilling neural codec language model that achieves state-of-the-a
This paper focuses on task-agnostic prompt compression for better generalizability and efficiency. C
We present a novel application of evolutionary algorithms to automate the creation of powerful found
Jailbreak attacks are crucial for identifying and mitigating the security vulnerabilities of Large L