Added new VLM blog to `image-text-to-text` task #1465

sergiopaniego · 2025-05-20T13:03:38Z

Adding a link to the Vision Language Models (Better, Faster, Stronger) blog post and removing the link that pointed to this same page.

pcuenca · 2025-05-20T13:23:11Z

packages/tasks/src/tasks/image-text-to-text/about.md

 - [Vision Language Models Explained](https://huggingface.co/blog/vlms)
 - [Welcome PaliGemma 2 – New vision language models by Google](https://huggingface.co/blog/paligemma2)
 - [SmolVLM - small yet mighty Vision Language Model](https://huggingface.co/blog/smolvlm)
 - [Multimodal RAG using ColPali and Qwen2-VL](https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb)
- [Image-text-to-text task guide](https://huggingface.co/tasks/image-text-to-text)


recursion ftw!

Sorry, it should be https://huggingface.co/docs/transformers/tasks/image_text_to_text, my fault.

pcuenca

Good idea to update, let's wait for @merveenoyan's review before merging :)

sergiopaniego · 2025-05-20T13:41:33Z

I was thinking about reviewing the whole task page and updating it with ideas from the blog, but I'd do that in a separate PR 😄

Vaibhavs10

Nice! - no preference on the overhaul - you can do it in this or another (there's no rush)

Vaibhavs10 · 2025-05-20T15:47:41Z

packages/tasks/src/tasks/image-text-to-text/about.md

 - [Vision Language Models Explained](https://huggingface.co/blog/vlms)
 - [Welcome PaliGemma 2 – New vision language models by Google](https://huggingface.co/blog/paligemma2)
 - [SmolVLM - small yet mighty Vision Language Model](https://huggingface.co/blog/smolvlm)
 - [Multimodal RAG using ColPali and Qwen2-VL](https://github.com/merveenoyan/smol-vision/blob/main/ColPali_%2B_Qwen2_VL.ipynb)
- [Image-text-to-text task guide](https://huggingface.co/tasks/image-text-to-text)


recursion ftw!

merveenoyan · 2025-05-20T16:00:37Z

@sergiopaniego can you bring the bullet back with this link? https://huggingface.co/docs/transformers/tasks/image_text_to_text (I can't do suggestions for some reason.)

sergiopaniego · 2025-05-20T16:17:55Z

Updated! Thanks for the pointer @merveenoyan 😄

Added new VLM 25 blog to image-text-to-text task

da9271f

sergiopaniego requested review from SBrandeis, gary149, Wauplin, julien-c, pcuenca and ngxson as code owners May 20, 2025 13:03

pcuenca reviewed May 20, 2025

pcuenca approved these changes May 20, 2025

Vaibhavs10 reviewed May 20, 2025

Included new link

11cde8d

Vaibhavs10 approved these changes May 21, 2025

sergiopaniego merged commit 91003db into main May 21, 2025
4 of 5 checks passed

sergiopaniego deleted the add-new-vlm-blog branch May 21, 2025 12:15
