<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Flecto</title>
    <link>https://flecto.zer0ai.dev</link>
    <description>Daily curated papers from arXiv cs.AI, auto-converted to rich HTML.</description>
    <language>en</language>
    <lastBuildDate>Wed, 08 Apr 2026 16:13:39 GMT</lastBuildDate>
    <atom:link href="https://flecto.zer0ai.dev/feed.xml" rel="self" type="application/rss+xml"/>
    <item>
      <title>Token Warping Helps MLLMs Look from Nearby Viewpoints</title>
      <link>https://arxiv.org/abs/2604.02870</link>
      <description>Can warping tokens, rather than pixels, help multimodal large language models (MLLMs) understand how a scene appears from a nearby viewpoint? While MLLMs perform well on visual reasoning, they remain fragile to viewpoint changes, as pixel-wise warping is highly sensitive to small depth errors and often introduces geometric distortions. Drawing on theories of mental imagery that posit part-level structural representations as the basis for human perspective transformation, we examine whether image…</description>
      <author>Phillip Y. Lee, Chanho Park, Mingue Park, Seungwoo Yoo, Juil Koo, Minhyuk Sung</author>
      <pubDate>Fri, 03 Apr 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2604.02870</guid>
    </item>
    <item>
      <title>Agentic-MME: What Agentic Capability Really Brings to Multimodal Intelligence?</title>
      <link>https://arxiv.org/abs/2604.03016</link>
      <description>Multimodal Large Language Models (MLLMs) are evolving from passive observers into active agents, solving problems through Visual Expansion (invoking visual tools) and Knowledge Expansion (open-web search). However, existing evaluations fall short: they lack flexible tool integration, test visual and search tools separately, and evaluate primarily by final answers. Consequently, they cannot verify if tools were actually invoked, applied correctly, or used efficiently. To address this, we introduce…</description>
      <author>Qianshan Wei, Yishan Yang, Siyi Wang, Jinglin Chen, Binyu Wang, Jiaming Wang, Shuang Chen, Zechen Li, Yang Shi, Yuqi Tang, Weining Wang, Yi Yu, Chaoyou Fu, Qi Li, Yi-Fan Zhang</author>
      <pubDate>Fri, 03 Apr 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2604.03016</guid>
    </item>
    <item>
      <title>Self-Distilled RLVR</title>
      <link>https://arxiv.org/abs/2604.03128</link>
      <description>On-policy distillation (OPD) has become a popular training paradigm in the LLM community. This paradigm selects a larger model as the teacher to provide dense, fine-grained signals for each sampled trajectory, in contrast to reinforcement learning with verifiable rewards (RLVR), which only obtains sparse signals from verifiable outcomes in the environment. Recently, the community has explored on-policy self-distillation (OPSD), where the same model serves as both teacher and student, with the teacher…</description>
      <author>Chenxu Yang, Chuanyu Qin, Qingyi Si, Minghui Chen, Naibin Gu, Dingyu Yao, Zheng Lin, Weiping Wang, Jiaqi Wang, Nan Duan</author>
      <pubDate>Fri, 03 Apr 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2604.03128</guid>
    </item>
    <item>
      <title>A Simple Baseline for Streaming Video Understanding</title>
      <link>https://arxiv.org/abs/2604.02317</link>
      <description>Recent streaming video understanding methods increasingly rely on complex memory mechanisms to handle long video streams. We challenge this trend with a simple finding: a sliding-window baseline that feeds only the most recent N frames to an off-the-shelf VLM already matches or surpasses published streaming models. We formalize this baseline as SimpleStream and evaluate it against 13 major offline and online video LLM baselines on OVO-Bench and StreamingBench. Despite its simplicity, SimpleStream…</description>
      <author>Yujiao Shen, Shulin Tian, Jingkang Yang, Ziwei Liu</author>
      <pubDate>Thu, 02 Apr 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2604.02317</guid>
    </item>
    <item>
      <title>Meta-Harness: End-to-End Optimization of Model Harnesses</title>
      <link>https://arxiv.org/abs/2603.28052</link>
      <description>Meta-Harness is an automated system that optimizes the harness (wrapper code) around large language models to achieve state-of-the-art performance. By using a coding agent with filesystem access to iterate over execution traces and diagnostic evidence, it discovers harnesses that outperform hand-crafted solutions across text classification (+7.7 points over ACE), math reasoning (+4.7 pts on IMO benchmarks), and agentic coding (37.6% on TerminalBench-2, rank #1 among Haiku 4.5 agents).</description>
      <author>Aiden Grossman, Sanyam Kapoor, Arjun Desai, Benjamin Therien, Cécile Capponi, Christopher Ré, Evan Hubinger, Frieda Rong, Irhum Shafkat, James Caverlee, Jessica Rumbelow, Julien Launay, Liam Dugan, Lintang Sutawika, Mrinank Sharma, Sachin Kumar, Sasha Luccioni, Scott Linderman, Thomas Scialom, Tim Dettmers, Victor Sanh, William Jurayj, Zaid Alyafeai</author>
      <pubDate>Sat, 28 Mar 2026 17:59:04 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.28052</guid>
    </item>
    <item>
      <title>DataFlex: A Unified Framework for Data-Centric Dynamic Training of Large Language Models</title>
      <link>https://arxiv.org/abs/2603.26164</link>
      <description>Data-centric training has emerged as a promising direction for improving large language models (LLMs) by optimizing not only model parameters but also the selection, composition, and weighting of training data during optimization. However, existing approaches to data selection, data mixture optimization, and data reweighting are often developed in isolated codebases with inconsistent interfaces, hindering reproducibility, fair comparison, and practical integration. In this paper, we present DataFlex…</description>
      <author>Hao Liang, Zhengyang Zhao, Meiyi Qiang, Mingrui Chen, Lu Ma, Rongyi Yu, Hengyi Feng, Shixuan Sun, Zimo Meng, Xiaochen Ma, Xuanlin Yang, Qifeng Cai, Ruichuan An, Bohan Zeng, Zhen Hao Wong, Chengyu Shen, Runming He, Zhaoyang Han, Yaowei Zheng, Fangcheng Fu, Conghui He, Bin Cui, Zhiyu Li, Weinan E, Wentao Zhang</author>
      <pubDate>Fri, 27 Mar 2026 08:28:02 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.26164</guid>
    </item>
    <item>
      <title>PixelSmile: Toward Fine-Grained Facial Expression Editing</title>
      <link>https://arxiv.org/abs/2603.25728</link>
      <description>Fine-grained facial expression editing has long been limited by intrinsic semantic overlap. To address this, we construct the Flex Facial Expression (FFE) dataset with continuous affective annotations and establish FFE-Bench to evaluate structural confusion, editing accuracy, linear controllability, and the trade-off between expression editing and identity preservation. We propose PixelSmile, a diffusion framework that disentangles expression semantics via fully symmetric joint training. PixelSmile…</description>
      <author>Jiabin Hua, Hengyuan Xu, Aojie Li, Wei Cheng, Gang Yu, Xingjun Ma, Yu-Gang Jiang</author>
      <pubDate>Thu, 26 Mar 2026 17:59:04 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.25728</guid>
    </item>
    <item>
      <title>Voxtral TTS</title>
      <link>https://arxiv.org/abs/2603.25551</link>
      <description>We introduce Voxtral TTS, an expressive multilingual text-to-speech model that generates natural speech from as little as 3 seconds of reference audio. Voxtral TTS adopts a hybrid architecture that combines auto-regressive generation of semantic speech tokens with flow-matching for acoustic tokens. These tokens are encoded and decoded with Voxtral Codec, a speech tokenizer trained from scratch with a hybrid VQ-FSQ quantization scheme. In human evaluations conducted by native speakers, Voxtral TTS…</description>
      <author>Alexander H. Liu, Alexis Tacnet, Andy Ehrenberg, Andy Lo, Chen-Yo Sun, Guillaume Lample, Henry Lagarde, Jean-Malo Delignon, Jaeyoung Kim, John Harvill, Khyathi Raghavi Chandu, Lorenzo Signoretti, Margaret Jennings, Patrick von Platen, Pavankumar Reddy Muddireddy, Rohin Arora, Sanchit Gandhi, Samuel Humeau, Soham Ghosh, Srijan Mishra, Van Phung, Abdelaziz Bounhar, Abhinav Rastogi, Adrien Sadé, Alan Jeffares, Albert Jiang, Alexandre Cahill, Alexandre Gavaudan, Alexandre Sablayrolles, Amélie Héliou, Amos You, Andrew Bai, Andrew Zhao, Angele Lenglemetz, Anmol Agarwal, Anton Eliseev, Antonia Calvi, Arjun Majumdar, Arthur Fournier, Artjom Joosen, Avi Sooriyarachchi, Aysenur Karaduman Utkur, Baptiste Bout, Baptiste Rozière, Baudouin De Monicault, Benjamin Tibi, Bowen Yang, Charlotte Cronjäger, Clémence Lanfranchi, Connor Chen, Corentin Barreau, Corentin Sautier, Cyprien Courtot, Darius Dabert, Diego de las Casas, Elizaveta Demyanenko, Elliot Chane-Sane, Emmanuel Gottlob, Enguerrand Paquin, Etienne Goffinet, Fabien Niel, Faruk Ahmed, Federico Baldassarre, Gabrielle Berrada, Gaëtan Ecrepont, Gauthier Guinet, Genevieve Hayes, Georgii Novikov, Giada Pistilli, Guillaume Kunsch, Guillaume Martin, Guillaume Raille, Gunjan Dhanuka, Gunshi Gupta, Han Zhou, Harshil Shah, Hope McGovern, Hugo Thimonier, Indraneel Mukherjee, Irene Zhang, Jacques Sun, Jan Ludziejewski, Jason Rute, Jérémie Dentan, Joachim Studnia, Jonas Amar, Joséphine Delas, Josselin Somerville Roberts, Julien Tauran, Karmesh Yadav, Kartik Khandelwal, Kilian Tep, Kush Jain, Laurence Aitchison, Laurent Fainsin, Léonard Blier, Lingxiao Zhao, Louis Martin, Lucile Saulnier, Luyu Gao, Maarten Buyl, Manan Sharma, Marie Pellat, Mark Prins, Martin Alexandre, Mathieu Poirée, Mathieu Schmitt, Mathilde Guillaumin, Matthieu Dinot, Matthieu Futeral, Maxime Darrin, Maximilian Augustin, Mert Unsal, Mia Chiquier, Mikhail Biriuchinskii, Minh-Quang Pham, Mircea Lica, Morgane Rivière, Nathan Grinsztajn, Neha Gupta, Olivier 
Bousquet, Olivier Duchenne, Patricia Wang, Paul Jacob, Paul Wambergue, Paula Kurylowicz, Philippe Pinel, Philomène Chagniot, Pierre Stock, Piotr Miłoś, Prateek Gupta, Pravesh Agrawal, Quentin Torroba, Ram Ramrakhya, Randall Isenhour, Rishi Shah, Romain Sauvestre, Roman Soletskyi, Rosalie Millner, Rupert Menneer, Sagar Vaze, Samuel Barry, Samuel Belkadi, Sandeep Subramanian, Sean Cha, Shashwat Verma, Siddhant Waghjale, Siddharth Gandhi, Simon Lepage, Sumukh Aithal, Szymon Antoniak, Tarun Kumar Vangani, Teven Le Scao, Théo Cachet, Theo Simon Sorg, Thibaut Lavril, Thomas Chabal, Thomas Foubert, Thomas Robert, Thomas Wang, Tim Lawson, Tom Bewley, Tom Edwards, Tyler Wang, Umar Jamil, Umberto Tomasini, Valeriia Nemychnikova, Vedant Nanda, Victor Jouault, Vincent Maladière, Vincent Pfister, Virgile Richard, Vladislav Bataev, Wassim Bouaziz, Wen-Ding Li, William Havard, William Marshall, Xinghui Li, Xingran Guo, Xinyu Yang, Yannic Neuhaus, Yassine El Ouahidi, Yassir Bendou, Yihan Wang, Yimu Pan, Zaccharie Ramzi, Zhenlin Xu</author>
      <pubDate>Thu, 26 Mar 2026 15:23:34 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.25551</guid>
    </item>
    <item>
      <title>RealRestorer: Towards Generalizable Real-World Image Restoration with Large-Scale Image Editing Models</title>
      <link>https://arxiv.org/abs/2603.25502</link>
      <description>Image restoration under real-world degradations is critical for downstream tasks such as autonomous driving and object detection. However, existing restoration models are often limited by the scale and distribution of their training data, resulting in poor generalization to real-world scenarios. Recently, large-scale image editing models have shown strong generalization ability in restoration tasks, especially for closed-source models like Nano Banana Pro, which can restore images while preserving…</description>
      <author>Yufeng Yang, Xianfang Zeng, Zhangqi Jiang, Fukun Yin, Jianzhuang Liu, Wei Cheng, Jinghong Lan, Shiyu Liu, Yuqi Peng, Gang Yu, Shifeng Chen</author>
      <pubDate>Thu, 26 Mar 2026 14:39:39 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.25502</guid>
    </item>
    <item>
      <title>Intern-S1-Pro: Scientific Multimodal Foundation Model at Trillion Scale</title>
      <link>https://arxiv.org/abs/2603.25040</link>
      <description>We introduce Intern-S1-Pro, the first one-trillion-parameter scientific multimodal foundation model. Scaling to this unprecedented size, the model delivers a comprehensive enhancement across both general and scientific domains. Beyond stronger reasoning and image-text understanding capabilities, its intelligence is augmented with advanced agent capabilities. Simultaneously, its scientific expertise has been vastly expanded to master over 100 specialized tasks across critical science fields, including…</description>
      <author>Yicheng Zou, Dongsheng Zhu, Lin Zhu, Tong Zhu, Yunhua Zhou, Peiheng Zhou, Xinyu Zhou, Dongzhan Zhou, Zhiwang Zhou, Yuhao Zhou, Bowen Zhou, Zhanping Zhong, Zhijie Zhong, Haiteng Zhao, Penghao Zhao, Xiaomeng Zhao, Zhiyuan Zhao, Yechen Zhang, Jin Zhang, Wenwei Zhang, Hongjie Zhang, Zhuo Zhang, Wenlong Zhang, Bo Zhang, Chao Zhang, Chen Zhang, Yuhang Zang, Fei Yuan, Jiakang Yuan, Jiashuo Yu, Jinhui Yin, Haochen Ye, Qian Yao, Bowen Yang, Danni Yang, Kaichen Yang, Ziang Yan, Jun Xu, Yicheng Xu, Wanghan Xu, Xuenan Xu, Chao Xu, Ruiliang Xu, Shuhao Xing, Long Xing, Xinchen Xie, Ling-I Wu, Zijian Wu, Zhenyu Wu, Lijun Wu, Yue Wu, Jianyu Wu, Wen Wu, Fan Wu, Xilin Wei, Qi Wei, Bingli Wang, Rui Wang, Ziyi Wang, Zun Wang, Yi Wang, Haomin Wang, Yizhou Wang, Lintao Wang, Yiheng Wang, Longjiang Wang, Bin Wang, Jian Tong, Zhongbo Tian, Huanze Tang, Chen Tang, Shixiang Tang, Yu Sun, Qiushi Sun, Xuerui Su, Qisheng Su, Chenlin Su, Demin Song, Jin Shi, Fukai Shang, Yuchen Ren, Pengli Ren, Xiaoye Qu, Yuan Qu, Jiantao Qiu, Yu Qiao, Runyu Peng, Tianshuo Peng, Jiahui Peng, Qizhi Pei, Zhuoshi Pan, Linke Ouyang, Wenchang Ning, Yichuan Ma, Zerun Ma, Ningsheng Ma, Runyuan Ma, Chengqi Lyu, Haijun Lv, Han Lv, Lindong Lu, Kuikun Liu, Jiangning Liu, Yuhong Liu, Kai Liu, Hongwei Liu, Zhoumianze Liu, Mengjie Liu, Ziyu Liu, Wenran Liu, Yang Liu, Liwei Liu, Kaiwen Liu, Junyao Lin, Junming Lin, Tianyang Lin, Dahua Lin, Jianze Liang, Linyang Li, Peiji Li, Zonglin Li, Zehao Li, Pengze Li, Guoyan Li, Lingkai Kong, Linglin Jing, Zhenjiang Jin, Feifei Jiang, Qian Jiang, Junhao Huang, Zixian Huang, Haian Huang, Zhouqi Hua, Han Hu, Linfeng Hou, Yinan He, Conghui He, Tianyao He, Xu Guo, Qipeng Guo, Aijia Guo, Yuzhe Gu, Lixin Gu, Jingyang Gong, Qiming Ge, Jiaye Ge, Songyang Gao, Jianfei Gao, Xinyu Fang, Caihua fan, Yue Fan, Yanhui Duan, Zichen Ding, Shengyuan Ding, Xuanlang Dai, Erfei Cui, Ganqu Cui, Pei Chu, Tao Chu, Guangran Cheng, Yu Cheng, Kai Chen, Yongkang Chen, Chiyu Chen, Guanzhou Chen, 
Qiaosheng Chen, Sitao Chen, Xin Chen, Haojiong Chen, Yicheng Chen, Weihan Cao, Yuhang Cao, Qinglong Cao, Lei Bai</author>
      <pubDate>Thu, 26 Mar 2026 05:21:45 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.25040</guid>
    </item>
    <item>
      <title>Natural-Language Agent Harnesses</title>
      <link>https://arxiv.org/abs/2603.25723</link>
      <description>Agent performance increasingly depends on harness engineering, yet harness design is usually buried in controller code and runtime-specific conventions, making it hard to transfer, compare, and study as a scientific object. We ask whether the high-level control logic of an agent harness can instead be externalized as a portable executable artifact. We introduce Natural-Language Agent Harnesses (NLAHs), which express harness behavior in editable natural language, and Intelligent Harness Runtime…</description>
      <author>Linyue Pan, Lexiao Zou, Shuo Guo, Jingchen Ni, Hai-Tao Zheng</author>
      <pubDate>Thu, 26 Mar 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.25723</guid>
    </item>
    <item>
      <title>Calibri: Enhancing Diffusion Transformers via Parameter-Efficient Calibration</title>
      <link>https://arxiv.org/abs/2603.24800</link>
      <description>In this paper, we uncover the hidden potential of Diffusion Transformers (DiTs) to significantly enhance generative tasks. Through an in-depth analysis of the denoising process, we demonstrate that introducing a single learned scaling parameter can significantly improve the performance of DiT blocks. Building on this insight, we propose Calibri, a parameter-efficient approach that optimally calibrates DiT components to elevate generative quality. Calibri frames DiT calibration as a black-box reward…</description>
      <author>Danil Tokhchukov, Aysel Mirzoeva, Andrey Kuznetsov, Konstantin Sobolev</author>
      <pubDate>Wed, 25 Mar 2026 20:19:50 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.24800</guid>
    </item>
    <item>
      <title>ClawKeeper: Comprehensive Safety Protection for OpenClaw Agents Through Skills, Plugins, and Watchers</title>
      <link>https://arxiv.org/abs/2603.24414</link>
      <description>OpenClaw has rapidly established itself as a leading open-source autonomous agent runtime, offering powerful capabilities including tool integration, local file access, and shell command execution. However, these broad operational privileges introduce critical security vulnerabilities, transforming model errors into tangible system-level threats such as sensitive data leakage, privilege escalation, and malicious third-party skill execution. Existing security measures for the OpenClaw ecosystem…</description>
      <author>Songyang Liu, Chaozhuo Li, Chenxu Wang, Jinyu Hou, Zejian Chen, Litian Zhang, Zheng Liu, Qiwei Ye, Yiming Hei, Xi Zhang, Zhongyuan Wang</author>
      <pubDate>Wed, 25 Mar 2026 15:27:54 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.24414</guid>
    </item>
    <item>
      <title>CUA-Suite: Massive Human-annotated Video Demonstrations for Computer-Use Agents</title>
      <link>https://arxiv.org/abs/2603.24440</link>
      <description>CUA-Suite is a large-scale ecosystem of expert video demonstrations and dense annotations for professional desktop computer-use agents, providing approximately 55 hours of continuous 30fps video across 87 diverse applications.</description>
      <author>CUA-Suite Team</author>
      <pubDate>Tue, 24 Mar 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.24440</guid>
    </item>
    <item>
      <title>Attention Residuals</title>
      <link>https://arxiv.org/abs/2603.15031</link>
      <description>Residual connections with PreNorm are standard in modern LLMs, yet they accumulate all layer outputs with fixed unit weights. This uniform aggregation causes uncontrolled hidden-state growth with depth, progressively diluting each layer&apos;s contribution. We propose Attention Residuals (AttnRes), which replaces this fixed accumulation with softmax attention over preceding layer outputs, allowing each layer to selectively aggregate earlier representations with learned, input-dependent weights. To address…</description>
      <author>Kimi Team, Guangyu Chen, Yu Zhang, Jianlin Su, Weixin Xu, Siyuan Pan, Yaoyu Wang, Yucheng Wang, Guanduo Chen, Bohong Yin, Yutian Chen, Junjie Yan, Ming Wei, Y. Zhang, Fanqing Meng, Chao Hong, Xiaotong Xie, Shaowei Liu, Enzhe Lu, Yunpeng Tai, Yanru Chen, Xin Men, Haiqing Guo, Y. Charles, Haoyu Lu, Lin Sui, Jinguo Zhu, Zaida Zhou, Weiran He, Weixiao Huang, Xinran Xu, Yuzhi Wang, Guokun Lai, Yulun Du, Yuxin Wu, Zhilin Yang, Xinyu Zhou</author>
      <pubDate>Mon, 16 Mar 2026 09:32:21 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.15031</guid>
    </item>
    <item>
      <title>Emerging Therapeutic Strategies in Asthma: Advances in Treatment, Drug Delivery, Drug Adherence, and Disease Management.</title>
      <link>https://pmc.ncbi.nlm.nih.gov/articles/PMC12966251/</link>
      <description>A comprehensive 22-page review synthesising 2020-2025 evidence on emerging asthma therapeutic strategies across three domains: targeted biologics (anti-IgE, anti-IL-5/5R, anti-IL-4/13, anti-TSLP) for Th2-high disease, regenerative and gene-based approaches (MSC, siRNA/miRNA, mRNA vaccines) that remain preclinical, nanoparticle drug delivery systems (PLGA, chitosan, SLN), and digital health tools (smart inhalers, DTx, environmental monitors) for adherence.</description>
      <author>Lim YX, Choo YN, Looi YT, Chuan YW, Chiam KX, Wong RS, Ng NC, Goh BH</author>
      <pubDate>Fri, 06 Mar 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://pmc.ncbi.nlm.nih.gov/articles/PMC12966251/</guid>
    </item>
    <item>
      <title>Tool Building as a Path to &quot;Superintelligence&quot;</title>
      <link>https://arxiv.org/abs/2602.21061</link>
      <description>The Diligent Learner framework suggests LLMs can achieve superintelligence via test-time search, provided a sufficient step-success probability γ. In this work, we design a benchmark to measure γ on logical out-of-distribution inference. We construct a class of tasks involving GF(2) circuit reconstruction that grow more difficult with each reasoning step, and that are, from an information-theoretic standpoint, impossible to reliably solve unless the LLM carefully integrates all of the information…</description>
      <author>David Koplow, Tomer Galanti, Tomaso Poggio</author>
      <pubDate>Wed, 25 Feb 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2602.21061</guid>
    </item>
    <item>
      <title>AutoHarness: improving LLM agents by automatically synthesizing a code harness</title>
      <link>https://arxiv.org/abs/2603.03329</link>
      <description>Despite significant strides in language models in the last few years, when used as agents, such models often try to perform actions that are not just suboptimal for a given state, but are strictly prohibited by the external environment. For example, in the recent Kaggle GameArena chess competition, 78% of Gemini-2.5-Flash losses were attributed to illegal moves. Often people manually write &quot;harnesses&quot; around LLMs to prevent such failures. In this paper, we demonstrate that Gemini-2.5-Flash can automatically…</description>
      <author>Xinghua Lou, Miguel Lázaro-Gredilla, Antoine Dedieu, Carter Wendelken, Wolfgang Lehrach, Kevin P. Murphy</author>
      <pubDate>Tue, 10 Feb 2026 14:12:54 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.03329</guid>
    </item>
    <item>
      <title>PaperBanana: Automating Academic Illustration for AI Scientists</title>
      <link>https://arxiv.org/abs/2601.23265</link>
      <description>Despite rapid advances in autonomous AI scientists powered by language models, generating publication-ready illustrations remains a labor-intensive bottleneck in the research workflow. To lift this burden, we introduce PaperBanana, an agentic framework for automated generation of publication-ready academic illustrations. Powered by state-of-the-art VLMs and image generation models, PaperBanana orchestrates specialized agents to retrieve references, plan content and style, render images, and iteratively…</description>
      <author>Dawei Zhu, Rui Meng, Yale Song, Xiyu Wei, Sujian Li, Tomas Pfister, Jinsung Yoon</author>
      <pubDate>Fri, 30 Jan 2026 18:33:37 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2601.23265</guid>
    </item>
    <item>
      <title>LTX-2: Efficient Joint Audio-Visual Foundation Model</title>
      <link>https://arxiv.org/abs/2601.03233</link>
      <description>Recent text-to-video diffusion models can generate compelling video sequences, yet they remain silent -- missing the semantic, emotional, and atmospheric cues that audio provides. We introduce LTX-2, an open-source foundational model capable of generating high-quality, temporally synchronized audiovisual content in a unified manner. LTX-2 consists of an asymmetric dual-stream transformer with a 14B-parameter video stream and a 5B-parameter audio stream, coupled through bidirectional audio-video…</description>
      <author>Yoav HaCohen, Benny Brazowski, Nisan Chiprut, Yaki Bitterman, Andrew Kvochko, Avishai Berkowitz, Daniel Shalem, Daphna Lifschitz, Dudu Moshe, Eitan Porat, Eitan Richardson, Guy Shiran, Itay Chachy, Jonathan Chetboun, Michael Finkelson, Michael Kupchick, Nir Zabari, Nitzan Guetta, Noa Kotler, Ofir Bibi, Ori Gordon, Poriya Panet, Roi Benita, Shahar Armon, Victor Kulikov, Yaron Inger, Yonatan Shiftan, Zeev Melumian, Zeev Farbman</author>
      <pubDate>Tue, 06 Jan 2026 18:24:41 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2601.03233</guid>
    </item>
    <item>
      <title>The Future of Epigenetics: Emerging Technologies and Clinical Applications.</title>
      <link>https://pmc.ncbi.nlm.nih.gov/articles/PMC12993786/</link>
      <description>ACS Pharmacol Transl Sci</description>
      <author>Iyer KA, Koynova-Tenchov R, Sasso JM, Thite T, Deng Y, Zhou QA</author>
      <pubDate>Fri, 13 Mar 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://pmc.ncbi.nlm.nih.gov/articles/PMC12993786/</guid>
    </item>
    <item>
      <title>Inhalation-Based Nanoparticle Drug Delivery Targeting the Diseased Lower Airways in Idiopathic Pulmonary Fibrosis.</title>
      <link>https://pmc.ncbi.nlm.nih.gov/articles/PMC12944101/</link>
      <description>A focused review on inhaled nanomedicine for idiopathic pulmonary fibrosis (IPF), covering pulmonary barriers, therapeutic modalities, and nanocarrier platforms with strategies for clinical translation.</description>
      <author>Lee JW, Skibba M, Tang T, Noh H, Brasier AR, Hong S</author>
      <pubDate>Tue, 27 Jan 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://pmc.ncbi.nlm.nih.gov/articles/PMC12944101/</guid>
    </item>
    <item>
      <title>Lung-targeted RNA delivery systems: strategies and therapeutic applications.</title>
      <link>https://pmc.ncbi.nlm.nih.gov/articles/PMC12888636/</link>
      <description>A comprehensive review integrating structure-function design principles with clinical translation insights for lung-targeted RNA delivery, covering LNPs, polymers, peptides, viral vectors, exosomes, and protein carriers.</description>
      <author>Xu S, Li M, Wang T, Chen R, Zhou M, Tang Z, Liu Q, Hu L, Li Z</author>
      <pubDate>Tue, 13 Jan 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://pmc.ncbi.nlm.nih.gov/articles/PMC12888636/</guid>
    </item>
    <item>
      <title>Targeted Protein Degradation in Cancer: PROTACs, New Targets, and Clinical Mechanisms.</title>
      <link>https://pmc.ncbi.nlm.nih.gov/articles/PMC12937832/</link>
      <description>Biomolecules</description>
      <author>Faryal B, Ul Abideen Z, Irfan M, Ahmed H, Jalilov F, Abduraximova L, Ashraf GA</author>
      <pubDate>Thu, 19 Feb 2026 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://pmc.ncbi.nlm.nih.gov/articles/PMC12937832/</guid>
    </item>
    <item>
      <title>Inhalation: A Smart Strategy and Increasing Potential for Drug Delivery.</title>
      <link>https://pmc.ncbi.nlm.nih.gov/articles/PMC12912003/</link>
      <description>Drug Des Devel Ther</description>
      <author>Wang SC, Kuo TH, Rai CI, Chen YC</author>
      <pubDate>2026</pubDate>
      <guid isPermaLink="true">https://pmc.ncbi.nlm.nih.gov/articles/PMC12912003/</guid>
    </item>
    <item>
      <title>Ensifentrine Added on to Dual Bronchodilator or Triple Therapy Demonstrates Clinically Meaningful Improvement in CAT Score in Symptomatic Patients with Chronic Obstructive Pulmonary Disease.</title>
      <link>https://pmc.ncbi.nlm.nih.gov/articles/PMC12912034/</link>
      <description>Int J Chron Obstruct Pulmon Dis</description>
      <author>Siler TM, Rheault T, Reyner D, MacDonald-Berko M, Davidson J, Rickard K</author>
      <pubDate>2026</pubDate>
      <guid isPermaLink="true">https://pmc.ncbi.nlm.nih.gov/articles/PMC12912034/</guid>
    </item>
    <item>
      <title>Agentic Context Engineering: Evolving Contexts for Self-Improving Language Models</title>
      <link>https://arxiv.org/abs/2510.04618</link>
      <description>ACE (Agentic Context Engineering) treats LLM contexts as evolving playbooks, achieving +10.6% on agents and +8.6% on domain-specific benchmarks. Published at ICLR 2026.</description>
      <author>Ahanaf Tazwar Shamim, Farhan Sadik, Taiyeong Lee</author>
      <pubDate>Mon, 06 Oct 2025 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2510.04618</guid>
    </item>
    <item>
      <title>VibeVoice Technical Report</title>
      <link>https://arxiv.org/abs/2508.19205</link>
      <description>This report presents VibeVoice, a novel model designed to synthesize long-form speech with multiple speakers by employing next-token diffusion, which is a unified method for modeling continuous data by autoregressively generating latent vectors via diffusion.</description>
      <author>Zhiliang Peng, Jianwei Yu, Wenhui Wang, Yaoyao Chang, Yutao Sun, Yurong Mou, Xingpeng Wang, Hanyuan Zhang, Haozhe Liu, Jiarui Lu, Zeya Chen, Peng Ye, Furu Wei, Baining Guo</author>
      <pubDate>Tue, 26 Aug 2025 17:09:12 GMT</pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2508.19205</guid>
    </item>
    <item>
      <title>Efficacy and Safety of GLP-1 Receptor Agonists in the Treatment of Type 2 Diabetes Mellitus: A Systematic Review and Network Meta-Analysis</title>
      <link>https://pmc.ncbi.nlm.nih.gov/articles/PMC12230154/</link>
      <description>This Bayesian network meta-analysis of 64 RCTs and 25,572 participants provides the most comprehensive comparison of 8 GLP-1 receptor agonist formulations against conventional antidiabetic drugs. Tirzepatide achieved the greatest HbA1c reduction (−2.3%) and body weight loss (−9.1 kg). Long-acting GLP-1 RAs consistently outperformed short-acting formulations for glycemic control.</description>
      <author>Xiaoyu Ren, Honghao Hua, Yuanqin Wu, Wei Zhang, Xianzhen Long, Yana Bai, Ning Cheng</author>
      <pubDate>Wed, 01 Jan 2025 00:00:00 GMT</pubDate>
      <guid isPermaLink="true">https://pmc.ncbi.nlm.nih.gov/articles/PMC12230154/</guid>
    </item>
    <item>
      <title>Studies on the functionality of the TC-NER ERCC6-M1097V protein variant frequently found in Louisiana patients with PCa upon UV damage.</title>
      <link>https://pmc.ncbi.nlm.nih.gov/articles/PMC12909189/</link>
      <description>Front Oncol</description>
      <author>Ogundepo O, De Benedetti A</author>
      <pubDate>2025</pubDate>
      <guid isPermaLink="true">https://pmc.ncbi.nlm.nih.gov/articles/PMC12909189/</guid>
    </item>
    <item>
      <title>LlamaFactory: Unified Efficient Fine-Tuning of 100+ Language Models</title>
      <link>https://arxiv.org/abs/2403.13372</link>
      <description></description>
      <author></author>
      <pubDate></pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2403.13372</guid>
    </item>
    <item>
      <title>CARLA-Air: Fly Drones Inside a CARLA World -- A Unified Infrastructure for Air-Ground Embodied Intelligence</title>
      <link>https://arxiv.org/abs/2603.28032</link>
      <description></description>
      <author></author>
      <pubDate></pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2603.28032</guid>
    </item>
    <item>
      <title>InCoder-32B-Thinking: Industrial Code World Model for Thinking</title>
      <link>https://arxiv.org/abs/2604.03144</link>
      <description></description>
      <author>Jian Yang, Wei Zhang, Jiajun Wu, Junhang Cheng, Tuney Zheng</author>
      <pubDate></pubDate>
      <guid isPermaLink="true">https://arxiv.org/abs/2604.03144</guid>
    </item>
  </channel>
</rss>