arXiv Daily - 2026

2026-06-25

Today's cs.CV: 94 papers, 1 hit.

1. LEVIRDet: A Million-Scale 159-Category Dataset and Foundation Model for Universal Remote Sensing Object Detection

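Authors: Qinzhe Yang, Dongyu Wang, Haohan Niu, Jia Xu, Zhenwei Shi, Zhengxia Zou
Keywords: object detection
Rating: ⭐ Worth a deep read
Rationale: A million-scale remote sensing object detection dataset plus a foundation model. LEVIRDet-159 is currently the largest remote sensing detection dataset (159 categories / 2.56M boxes). LEVIRDetNet achieves zero-shot cross-domain generalization through online GSD (ground sample distance) prediction and a hierarchical detection head, beating fully supervised SOTA by an average of 5.02 mAP across 9 external benchmarks. The dataset and model will be open-sourced, which should give general-purpose remote sensing detection a significant push.

Proposes the LEVIRDet-159 dataset (159 categories, 2.56M bounding boxes, 700K fine-grained annotations) and the LEVIRDetNet foundation model, which uses online GSD prediction and a hierarchy-aware detection head for cross-sensor, cross-resolution universal remote sensing object detection. Zero-shot, it reaches SOTA on 9 external benchmarks.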

2026-06-24

Today's cs.CV: 116 papers, 4 hits.

1. From Spatial to Spectral: An Efficient, Frequency-Guided Feature Representation Learner for Small Object Detection

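Authors: Yuhan Rui, Shihan Qiao, Yibin Lou, Mingxi Yu, Yutong Wan, Yanqiao Chen, Dongsheng Hou, Zhen Cao, Athena Zhuoming Zhong, Qi Hao
Keywords: object detection, small object, yolo
Rating: ⭐ Worth a deep read
Rationale: Directly relevant to small object detection. Proposes a frequency-domain feature paradigm (moving from spatial to spectral). The DER operator is flexible and general (it works with both CNNs and Transformers). DERNet outperforms YOLOv11 at the same model scale while using only 1/6 of the parameters, validated on multiple benchmarks (VisDrone, UAVDT, TinyPerson, DOTA). A novel and highly practical direction.

Proposes DERNet, a frequency-domain feature representation framework that injects frequency-domain modulation into the backbone, neck, and head through a Decompose-Enhance-Reconstruct operator, efficiently recovering the high-frequency detail of small objects. Reaches SOTA on VisDrone/UAVDT/TinyPerson/DOTAv1; DERNet outperforms YOLOv11 with only 1/6 of the parameters.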

2026-06-09

Today's cs.CV: 217 papers, 7 hits.

1. Edge-Constrained UAV Small-Object Detection with P2 Enhancement and Quantum-Inspired Lightweight Structure Search

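Authors: Wuming Lei, Yanbin Gao, Mingyan Sun, Xiaobin Li, Xuechen Liang
Keywords: object detection
Rating: ⭐ Worth a deep read
Rationale: Applies a quantum-inspired evolutionary algorithm to structure search for a lightweight UAV detection network, combined with a P2 high-resolution branch to improve small object detection. AP_small on VisDrone improves by 31%.

Builds on YOLOX-Nano by adding a P2 high-resolution branch and a quantum-inspired evolutionary algorithm for lightweight structure search. Small-object AP on VisDrone improves by 31%.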

2026-06-05

Today's cs.CV: 101 papers, 2 hits.

1. BMCR: Adaptive Backbone Module Composition via Reinforcement Learning for Remote Sensing Object Detection

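Authors: Wenlin Liu, Xikun Hu, Ping Zhong
Keywords: object detection
Rating: 📖 Worth a skim
Rationale: Remote sensing object detection that uses RL to dynamically compose CNN and ViT modules into an adaptive backbone. The core contributions are module decomposition, OT alignment, and RL routing. Remote sensing is not a primary focus area, but the idea of dynamic cross-backbone composition is worth knowing.

A method for dynamically combining CNN and ViT backbones in remote sensing object detection. Reinforcement learning adaptively assembles an inference path from reusable modules, and an optimal transport interface provides cross-family alignment.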

2. Unveiling the Unknown: Open Vocabulary Object Detection with Scene Graphs

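Authors: Yi Chen, Yinghao Lu, Zhehao Li et al.
Keywords: object detection
Rating: ⭐ Worth a deep read
Rationale: Directly relevant to open-vocabulary detection. Uses scene graphs to model semantic and spatial relations between regions in order to detect novel categories in OVOD, which is a fresh angle. Involves a Relation Attention Module and scene-based textual alignment, and is relevant to 猴哥's detection + scene understanding direction.

An open-vocabulary object detection method that uses scene graph structure to guide the detector toward recognizing unseen categories.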

2026-06-03

Today's cs.CV: 123 papers, 3 hits.

3. Ultralytics YOLO26: Unified Real-Time End-to-End Vision Models

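Authors: Glenn Jocher, Jing Qiu, Mengyu Liu, Shuai Lyu
Keywords: yolo
Rating: ⭐ Worth a deep read
Rationale: The official YOLO26 paper, released by Ultralytics. NMS-free end-to-end architecture, MuSGD optimizer, Progressive Loss.

Summary: YOLO26 achieves native NMS-free inference through a dual-head design, removes DFL, and introduces MuSGD (Muon + SGD, brought over from LLM training), STAL small-object label assignment, and the ProgLoss dynamic loss.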
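
For context on what "NMS-free" removes, here is a minimal sketch of classic greedy non-maximum suppression, the post-processing step that conventional detectors run after the network. This is the generic textbook algorithm, not code from YOLO26 or Ultralytics; the function names and the 0.5 threshold are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

# A box is (x1, y1, x2, y2) with x2 >= x1 and y2 >= y1.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(
    boxes: Sequence[Box],
    scores: Sequence[float],
    iou_threshold: float = 0.5,  # illustrative default, not a YOLO26 setting
) -> List[int]:
    """Return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap too much with any kept box.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    demo_boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    demo_scores = [0.9, 0.8, 0.7]
    print(greedy_nms(demo_boxes, demo_scores))
```

An end-to-end detector is trained so that each object yields a single confident prediction, which makes this sort-and-suppress loop, and its threshold, unnecessary at inference time.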

2026-06-02

Today's cs.CV: 299 papers, 6 hits.

1. FlowOVD: Learning Generative Latent Flows for Zero-shot Open-vocabulary Detection

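Authors: Yao Wei, Andrea Cavallaro, Changjae Oh
Keywords: object detection, open-vocabulary detection
Rating: ⭐ Worth a deep read
Rationale: Models decoder query generation as a rectified flow, replacing conventional discrete query construction. COCO 49.5 AP (+1.2 over GroundingDINO), LVIS 31.5 AP (+4.1). A clean idea with solid results.

Proposes FlowOVD, which uses rectified flow to progressively transform text-agnostic queries into text-guided queries. Outperforms GroundingDINO without additional training data.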

2. LFA: Layer Feature Attention for Run-Time Introspection of 2D Object Detectors in Automated Driving

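Authors: Mert Keser, Alois Knoll
Keywords: object detection, object detector
Rating: ⭐ Worth a deep read
Rationale: Uses attention to aggregate multi-layer backbone features to predict detector errors, outperforming single-layer (last-layer) methods. Relevant to detector reliability analysis.

A lightweight introspection method that learns importance weights over multiple backbone layers. Reaches SOTA introspection performance on KITTI and BDD100K.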

3. Self-Improving Small Object Grounding in LVLMs

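Authors: Tianze Yang, Yucheng Shi, Ruitong Sun, Ninghao Liu, Jin Sun
Keywords: small object
Rating: 📖 Worth a skim
Rationale: Trains an IoU regressor on LVLM attention maps to refine small object localization, with no fine-tuning required. The training-free selector idea in ACS-Free is worth borrowing.

Proposes the ACS framework, which uses attention to encode localization quality and select the best box. Small object localization on COCO and Objects365 improves by up to 19%.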

4. Collaborative Space Object Detection with Multi-Satellite Viewpoints

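Authors: Xingyu Qu, Wenxuan Zhang, Peng Hu
Keywords: object detection, yolo
Rating: 📖 Worth a skim
Rationale: Multi-view fusion for space object detection with a YOLOv9-m baseline. A niche domain, but the multi-view fusion idea is transferable.

Multi-satellite viewpoint fusion + YOLO; three-view RGB raises mAP50 from 0.638 to 0.732.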
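
To put the two mAP50 figures above on a common footing with the percentage gains quoted elsewhere in this digest, the short sketch below computes the absolute and relative improvement. It uses only the two numbers stated in the entry; the helper name `gains` is illustrative.

```python
def gains(before: float, after: float) -> tuple[float, float]:
    """Return (absolute gain, relative gain) of `after` over `before`."""
    absolute = after - before
    relative = absolute / before
    return absolute, relative


if __name__ == "__main__":
    abs_gain, rel_gain = gains(0.638, 0.732)
    # prints: absolute +0.094 mAP50, relative +14.7%
    print(f"absolute +{abs_gain:.3f} mAP50, relative +{rel_gain:.1%}")
```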

2026-05-27 (supplement)

Today's cs.CV: 105 papers, 2 hits.

1. LV-OSD: Language-Vision-Complementary Open-Set Object Detection

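Authors: Yupeng Zhang, Ruize Han, Wei Feng, Song Wang, Liang Wan
Keywords: object detection, open-set, multi-modal
Rating: 📖 Worth a skim
Rationale: Proposes LVDor, a dual-branch open-set detection framework that supports both text and image prompts; the TPDW module dynamically weights and aligns multimodal semantics.

Summary: Poses the LV-OSD problem and designs the LVDor framework to accept both text and image prompts, with a TPDW module that dynamically aligns target semantics.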

2026-05-27

Today's cs.CV: 254 papers, 7 hits.

1. DisDop: Distillation with Domain Priors for Open-Vocabulary Aerial Object Detection

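Authors: Ruihao Xu et al.
Keywords: object detection, open-vocabulary, distillation, aerial detection
Rating: ⭐ Worth a deep read
Rationale: Open-vocabulary detection for UAV aerial imagery; distills domain priors from RemoteCLIP + DINOv3 and reaches SOTA on remote sensing datasets.
Reading notes: 2026-05-23-disdop

Summary: To address the scarcity of UAV-view images and their large gap from natural images, proposes the DisDop framework, which distills multi-level domain priors from remote sensing foundation models (RemoteCLIP, DINOv3) into a lightweight detector.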

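2. AdaFuse-Det: Adaptive Cross-Modal Fusion of Event Cameras for Robust Object Detection in Low-Light RGB Imagery

Authors: Bharatesh Chakravarthi et al.
Keywords: object detection, event camera, cross-modal fusion
Rating: 📖 Worth a skim
Rationale: Event camera + RGB fusion for low-light object detection. The theoretical justification is elegant (minimum-variance linear estimation), and the gains on the LLE-VOS benchmark are clear.

Summary: Proposes AdaFuse-Det, a dual-stream framework that fuses CLAHE-enhanced RGB with event camera data through an adaptive cross-modal fusion module.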
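
As background for the "minimum-variance linear estimation" mentioned above, the sketch below shows the textbook scalar case: two unbiased, independent estimates of the same quantity are combined with inverse-variance weights, which minimizes the variance of the fused estimate. This is an assumption-laden illustration of the general principle, not the paper's fusion module; the function name and the scalar setting are hypothetical.

```python
def fuse_min_variance(
    x1: float, var1: float, x2: float, var2: float
) -> tuple[float, float]:
    """Fuse two unbiased, independent estimates of the same quantity.

    Assumes var1 > 0 and var2 > 0. The weights sum to 1, and the noisier
    source receives the smaller weight (inverse-variance weighting).
    """
    w1 = var2 / (var1 + var2)
    w2 = 1.0 - w1
    fused = w1 * x1 + w2 * x2
    fused_var = (var1 * var2) / (var1 + var2)  # never above min(var1, var2)
    return fused, fused_var


if __name__ == "__main__":
    # Hypothetical values: a noisy low-light RGB cue and a cleaner event cue.
    print(fuse_min_variance(0.0, 4.0, 10.0, 1.0))
```

The fused variance is always at most the smaller of the two input variances, which is the usual argument for why a second modality can help even when it is noisy.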

3. Calibrating Probabilistic Object Detectors with Annotator Disagreement

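Authors: Zhi Qin Tan et al.
Keywords: object detector, calibration, uncertainty
Rating: 📖 Worth a skim
Rationale: Addresses detector calibration under inconsistent annotations; uses no GT and aligns predictive uncertainty with the annotator distribution. The method is general.

Summary: For inconsistent annotation of ambiguous objects, proposes a GT-free calibration framework for probabilistic detectors that aligns classification and localization confidence with the annotation distribution.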

4. Multiscale Real-Time Object Detection in the NMS-Free Era: YOLOv8 vs YOLO26

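Authors: Ozioma C. Oguine
Keywords: yolo, nms-free, object detection
Rating: ⏭ Not recommended
Rationale: A comparative analysis of YOLOv8 and YOLO26 with no new method. YOLO26 is stronger on Pascal VOC, while YOLOv8 still has the edge in GPU latency.

Summary: Compares YOLOv8 and YOLO26 on Pascal VOC and VisDrone; the NMS-free design is not necessarily an advantage in every scenario.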

5. TinyFormer: Preserving Tiny Objects in YOLO-DETR Hybrid Real-time Detectors

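Authors: Jun-Wei Hsieh et al.
Keywords: yolo, detr, tiny object detection, real-time
Rating: ⭐ Worth a deep read
Rationale: A YOLO-DETR hybrid architecture for small object detection with SOTA performance (58.4% AP on small objects). The PBM module and SSA design are well targeted.
Reading notes: 2026-05-24-tinyformer

Summary: TinyFormer unifies YOLO and DETR in a hybrid real-time detector, proposing PBM to preserve high-resolution features and SSA to compensate for the spatial loss introduced by tokenization.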

6. Weakly Supervised Camouflaged Object Detection Based on SAM Model and Mask Guidance

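Authors: Xia Li et al.
Keywords: camouflaged object detection, sam, weakly supervised
Rating: 📖 Worth a skim
Rationale: Uses SAM to generate pseudo labels for weakly supervised camouflaged object detection; the BoxSAM + MGNet approach is worth referencing.

Summary: Proposes MGNet and BoxSAM, using SAM with bounding-box prompts to generate high-quality pseudo labels for weakly supervised training.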

7. SAM3-Assisted Training of Lightweight YOLO Models for Precision Pig Farming

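Authors: Marcos Faria et al.
Keywords: yolo, sam, precision agriculture
Rating: ⏭ Not recommended
Rationale: An agricultural application that distills SAM3 into YOLO for pig farm detection. The method is conventional and application-driven.

Summary: Uses SAM3 to generate zero-shot pseudo labels for training YOLOv8; reaches 79.4% mAP on the PigLife dataset with a 200x reduction in inference latency.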

2026-05-26

Today's cs.CV: 179 papers, 10 hits.

5. MDS-DETR: DETR with Masked Duplicate Suppressor

Rating: ⭐ Worth a deep read
Note: A DETR improvement; the Masked Duplicate Suppressor exploits both one-to-one and one-to-many supervision within a single decoder.

2026-05-21

Today's cs.CV: 141 papers, 6 hits.

1. Decoupling Ego-Motion from Target Dynamics for UAV Detection

Rating: 📖 Worth a skim
Note: Dual-interval motion cues decouple ego-motion from target motion; improves over a YOLOv8 baseline on VisDrone-VID.

2. SADGE: Structure and Appearance Domain Gap Estimation

Rating: 📖 Worth a skim
Note: A metric that quantifies the synthetic-to-real domain gap; DINOv3 + MASt3R fusion achieves Pearson r = 0.88.

3. Synthetic RAW Augmentations for Low-Light Person Detection

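Rating: 📖 Worth a skim
Note: Nighttime pedestrian detection with synthetic RAW low-light augmentation; performance metrics are consistent with those on real low-light data.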

5. Impact of Atmospheric Turbulence and Pointing Error on Earth Observation

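Rating: 📖 Worth a skim
Note: Effect of atmospheric turbulence and jitter on satellite object detection; YOLOv8 recall drops from 91% to below 40%, while RetinaNet is more robust.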

2026-05-20

Today's cs.CV: 134 new papers, 7 hits (the arXiv API returned 429/503, so results were fetched by falling back to the HTML listing page).

1. LER-YOLO: RGB-IR UAV Detection

Rating: 📖 Worth a skim
Note: Reliability-aware MoE for RGB-IR dual-modality small UAV detection; 89.9% AP50.

2. GSA-YOLO: X-ray Security Inspection

Rating: 📖 Worth a skim
Note: Structured sparsity + adaptive knowledge distillation for X-ray security inspection; 8.0G FLOPs / 189 FPS.
