
arXiv.org e-Print archive
arXiv is a free distribution service and an open-access archive for nearly 2.4 million scholarly articles in the fields of physics, mathematics, computer science, quantitative biology, …
Physics - arXiv.org
Accelerator theory and simulation. Accelerator technology. Accelerator experiments. Beam Physics. Accelerator design and optimization. Advanced accelerator concepts. Radiation …
[2412.19437] DeepSeek-V3 Technical Report - arXiv.org
Dec 27, 2024 · Comprehensive evaluations reveal that DeepSeek-V3 outperforms other open-source models and achieves performance comparable to leading closed-source models. …
Computer Science
These findings establish a practical, privacy-preserving blueprint for deploying open-source SLMs in multilingual clinical NLP settings with limited infrastructure and annotation resources, and …
[2312.00752] Mamba: Linear-Time Sequence Modeling with ...
Dec 1, 2023 · Foundation models, now powering most of the exciting applications in deep learning, are almost universally based on the Transformer architecture and its core attention …
[2502.09992] Large Language Diffusion Models - arXiv.org
Feb 14, 2025 · The capabilities of large language models (LLMs) are widely regarded as relying on autoregressive models (ARMs). We challenge this notion by introducing LLaDA, a diffusion …
[2303.18223] A Survey of Large Language Models - arXiv.org
Mar 31, 2023 · Language is essentially a complex, intricate system of human expressions governed by grammatical rules. It poses a significant challenge to develop capable AI …