Dataset | 🍎 Paper Today I Read 🦔

[221] Scaling Synthetic Data Creation with 1,000,000,000 Personas

dataset LLM 2024Q3

[151] FOIL it! Find One mismatch between Image and Language caption

dataset 2017 XAI evaluation

[138] ShareGPT4V: Improving Large Multi-Modal Models with Better Captions

multimodal dataset 2023Q4 MLLM

[135] Multimodal C4: An Open, Billion-scale Corpus of Images Interleaved with Text

multimodal dataset NeurIPS 2023Q2

[133] DataComp: In search of the next generation of multimodal datasets

dataset CLIP 2023Q2

[108] Unsupervised Vision-Language Parsing: Seamlessly Bridging Visual Scene Graphs with Language Structures via Dependency Relationships

2022Q1 dataset CVPR graph

[74] “This is my unicorn, Fluffy”: Personalizing frozen vision-language representations

dataset 2022Q3 25min ECCV nvidia CLIP

[41] Panoptic Scene Graph Generation

dataset SGG 2022Q3 25min

[19] Multimodal Explanations: Justifying Decisions and Pointing to the Evidence

multimodal 2018 dataset