Paper ToDo List about MLLM
A list of MLLM-related papers, both read and to be read
MLLM
📊 Statistics
- Total papers: 14
- To read: 14
- In progress: 0
- Completed: 0
| ID | Status | Year | Date added | Date completed | Paper title |
|---|---|---|---|---|---|
| 1 | ⏳ | 2024 | 2024-12-05 | - | Critic-V: VLM Critics Help Catch VLM Errors in Multimodal Reasoning |
| 2 | ⏳ | 2024 | 2024-12-05 | - | DINO-X: A Unified Vision Model for Open-World Object Detection and Understanding |
| 3 | ⏳ | 2024 | 2024-12-05 | - | Efficient Long Video Tokenization via Coordinate-based Patch Reconstruction |
| 4 | ⏳ | 2024 | 2024-12-05 | - | FINECAPTION: Compositional Image Captioning Focusing on Wherever You Want at Any Granularity |
| 5 | ⏳ | 2024 | 2024-12-05 | - | JanusFlow: Harmonizing Autoregression and Rectified Flow for Unified Multimodal Understanding and Generation |
| 6 | ⏳ | 2024 | 2024-12-05 | - | Look Every Frame All at Once: Video-Ma2mba for Efficient Long-form Video Understanding with Multi-Axis Gradient Checkpointing |
| 7 | ⏳ | 2024 | 2024-12-05 | - | SEAGULL: No-reference Image Quality Assessment for Regions of Interest via Vision-Language Instruction Tuning |
| 8 | ⏳ | 2024 | 2024-12-05 | - | SymDPO: Boosting In-Context Learning of Large Multimodal Models with Symbol Demonstration Direct Preference Optimization |
| 9 | ⏳ | 2024 | 2024-12-06 | - | PaliGemma 2: A Family of Versatile VLMs for Transfer |
| 10 | ⏳ | 2024 | 2024-12-09 | - | OmniFlow: Any-to-Any Generation with Multi-Modal Rectified Flows |
| 11 | ⏳ | 2024 | 2024-12-11 | - | Exploring Multi-Grained Concept Annotations for Multimodal Large Language Models |
| 12 | ⏳ | 2024 | 2024-12-31 | - | Next Token Prediction Towards Multimodal Intelligence: A Comprehensive Survey |
| 13 | ⏳ | 2024 | 2024-12-31 | - | Task Preference Optimization: Improving Multimodal Large Language Models with Vision Task Alignment |
| 14 | ⏳ | 2025 | 2025-01-14 | - | LlamaV-o1: Rethinking Step-by-step Visual Reasoning in LLMs |
Legend:
- ⏳ To read
- 📝 In progress
- ✅ Completed
This post is licensed under CC BY 4.0 by the author.