Kakao | 🍎 Paper Today I Read 🦔

[149] Noise-aware Learning from Web-crawled Image-Text Data for Image Captioning

ICCV 25min 2022Q4 kakao

[143] Honeybee: Locality-enhanced Projector for Multimodal LLM

kakao 2023Q4 MLLM

[126] ViLT: Vision-and-Language Transformer Without Convolution or Region Supervision

multimodal 2021Q1 25min kakao

[72] Sparse DETR: Efficient End-to-End Object Detection with Learnable Sparsity

2021Q4 ICLR object detection sparse kakao