News

06/25/2025 · Three papers accepted at ICCV 2025!

Our papers "MissRAG: Addressing the Missing Modality Challenge in Multimodal Large Language Models", "What Changed? Detecting and Evaluating Instruction-Guided Image Edits with Multimodal Large Language Models" and "Talking to DINO: Bridging Self-Supervised Vision Backbones with Language for Open-Vocabulary Segmentation" have been accepted to ICCV 2025!

06/21/2025 · Area Chair for WACV 2026

Happy to share that I will serve as an Area Chair for WACV 2026!

05/02/2025 · Tutorial on AI and HPC at ICIAP 2025

We are glad to announce that we will organize, together with NVIDIA, a tutorial on AI and HPC at ICIAP 2025. The tutorial is organized as part of the MINERVA European Project.

02/26/2025 · Three papers accepted at CVPR 2025!

Our papers "Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering", "Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval" and "Hyperbolic Safety-Aware Vision-Language Models" (joint collaboration with UvA) have been accepted to CVPR 2025!

Update: "Hyperbolic Safety-Aware Vision-Language Models" has been selected as a highlight paper!

02/21/2025 · Invited talk at ML Modena

I will be giving an invited talk at ML Modena, on Trustworthy AI. More info and registration here.

02/21/2025 · IRCDL 2026 in Modena

Happy to share that we will host the 2026 edition of IRCDL, the Conference on Information and Research Science Connecting to Digital and Library Science.

02/17/2025 · Area Chair for ACM Multimedia 2025

Happy to share that I will serve as an Area Chair for ACM Multimedia 2025!

01/22/2025 · Paper accepted at ICLR 2025

Our paper "Causal Graphical Models for Vision-Language Compositional Understanding" (with F. Parascandolo, N. Moratelli, E. Sangineto, R. Cucchiara) has been accepted to ICLR 2025!

12/30/2024 · Associate Editor for Computer Vision and Image Understanding (CVIU)

Happy to share that I joined the CVIU Editorial Board as an Associate Editor!

12/06/2024 · CADL Workshop accepted at ISC 2025

Our "Computational Aspects of Deep Learning" Workshop, organized in conjunction with NVIDIA and CINECA, has been accepted at ISC High Performance 2025, the HPC Event.

12/03/2024 · Area Chair for ICCV 2025

Happy to share that I will serve as an Area Chair for ICCV 2025!

11/01/2024 · Two papers accepted at WACV 2025

Glad to announce that our papers, "Semantically Conditioned Prompts for Visual Recognition under Missing Modality Scenarios" (with V. Pipoli, F. Bolelli, S. Sarto, M. Cornia, C. Grana, R. Cucchiara and E. Ficarra) and "Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries" (with R. Amoroso, G. Zhang, R. Koner, R. Cucchiara, V. Tresp) have been accepted at WACV 2025!

09/26/2024 · Paper accepted to NeurIPS 2024

Our paper, "Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments", has been accepted to NeurIPS 2024, Datasets and Benchmarks track!

08/03/2024 · Introducing LLaVA-MORE

Today we are introducing LLaVA-MORE, a family of models that enhances LLaVA by integrating LLaMA 3.1 as the language model. Check out our GitHub repo!

07/20/2024 · Oral paper accepted to BMVC 2024

Our paper, "Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization", has been accepted for oral presentation to BMVC 2024!

07/12/2024 · Research Collaborator and Post-doc Positions Available

We have open research collaborator and post-doc positions within the MINERVA EU project, and the PRIN projects "MUCES - a Multimedia platform for Content Enrichment and Search in audiovisual archives" and "MUSMA: Multimedia Understanding meets Social Media Analysis". If you are interested, please get in touch!

07/04/2024 · MINERVA proposal successful!

Our proposal MINERVA, submitted to the DIGITAL-EUROHPC-JU-2023-AISC-03-01 call, and coordinated by CINECA, has been successfully approved!

07/01/2024 · Three papers accepted at ECCV 2024!

Glad to announce that we have three papers accepted at ECCV 2024: "Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models", "Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities" and "BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues".

05/16/2024 · Paper accepted to ACL 2024

Our paper, "The Revolution of Multimodal Large Language Models: A Survey", has been accepted to the Findings of ACL 2024!

04/14/2024 · Three workshop proposals accepted at ECCV 2024

Glad to announce that I will be organizing three workshops at ECCV 2024: "Computational Aspects of Deep Learning" (CADL) with NVIDIA, "TWYN: Trust What You learN. 1st Workshop on Trustworthiness in Computer Vision" with Leonardo SpA, and AI4DH "Artificial Intelligence for Digital Humanities" with UNIMC. Details soon to follow!

02/27/2024 · Paper accepted at CVPR 2024

Our paper "Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation" (with L. Barsellotti, R. Amoroso, M. Cornia, R. Cucchiara) has been accepted at CVPR 2024! [project page, video, arxiv]

02/10/2024 · Paper accepted at ICRA 2024

Our paper "Mapping High-level Semantic Regions in Indoor Environments without Object Recognition" (with R. Bigazzi, S. Kousik, R. Cucchiara and M. Pavone) has been accepted as oral to ICRA 2024!

01/07/2024 · Workshop and Challenge on DeepFake Analysis and Detection at CVPR 2024

I am co-organizing the second International Workshop on DeepFake Analysis and Detection, which will be held in conjunction with CVPR 2024. See the website (www.dfad.unimore.it) for more!

12/05/2023 · Associate Editor of Pattern Recognition

I have been appointed as Associate Editor for Pattern Recognition.

11/23/2023 · Area Chair for ACM Multimedia 2024

I will serve as Area Chair for ACM Multimedia 2024. The deadline for paper submission is 12th April 2024.

10/31/2023 · Paper accepted at IJCV

Our paper "Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets" (with M. Cornia, G. Fiameni, R. Cucchiara) has been accepted to IJCV!

10/24/2023 · Two papers accepted to WACV 2024

Two papers accepted on semantic (and open-vocabulary) segmentation at WACV 2024. Thanks to R. Amoroso, L. Barsellotti, R. Cucchiara, M. Bernhard, Y. Kindermann, M. Schubert and V. Tresp!

07/30/2023 · PRIN 2022 PNRR project accepted

Our PRIN 2022 project "MUCES - a MUltimedia platform for Content Enrichment and Search in audiovisual archives" (with F. Carrara) has been accepted!

07/14/2023 · Paper accepted at ICCV 2023

We are glad to announce that our paper "With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning" has been accepted to ICCV 2023.

06/18/2023 · PRIN 2022 project accepted

Our PRIN 2022 project "MUSMA: Multimedia Understanding meets Social Media Analysis" (with G. Serra and W. Quattrociocchi) has been accepted!

06/03/2023 · Workshop and Challenge on DeepFake Analysis and Detection at ICCV 2023

I am co-organizing the first International Workshop on DeepFake Analysis and Detection, which will be held in conjunction with ICCV 2023. See the website for more. The paper submission deadline is July 17th, AoE.

05/10/2023 · AIDA e-lecture

I will be delivering the e-lecture "From Images to Text: New forms of Human-AI Interaction" for the International AI Doctoral Academy (AIDA) on May 16th, 2023, 17:00-18:00 CET. See details at: http://www.i-aida.org/ai-lectures/

You can join for free using the Zoom link (password: 148148).

04/20/2023 · Area Chair ACM Multimedia 2023

I will serve as Area Chair for the 31st ACM International Conference on Multimedia. The deadline for paper submission is 30th April 2023.

04/19/2023 · ELLIS Summer School on Large-Scale AI

Our Modena ELLIS Unit is organizing a Summer School of the ELLIS network this year, from September 18th to 25th, at the Modena Technopole. See the website for more.

03/02/2023 · Paper accepted as highlight at CVPR 2023

We are glad to announce that our paper "Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation" has been accepted to CVPR 2023 as a highlight paper (top 2.5% of submissions). arXiv and GitHub.

02/02/2023 · Paper accepted at ICRA 2023

Our paper "Embodied Agents for Efficient Exploration and Smart Scene Description" has been accepted to ICRA 2023 (arxiv).

08/02/2022 · Best paper award at CBMI 2022

Our paper "Retrieval-Augmented Transformer for Image Captioning" has been selected as best paper at the International Conference on Content-based Multimedia Indexing (CBMI 2022)!

05/27/2022 · Best Student Paper Award at ICIAP 2021

Our paper "Investigating Bidimensional Downsampling in Vision Transformer Models" by Paolo Bruno, Roberto Amoroso, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, and Rita Cucchiara has been selected for the Best Student Paper Award at ICIAP 2021.

03/27/2022 · Computational Aspects of Deep Learning (CADL) workshop accepted at ECCV

Together with NVAITC, we are organizing the second Workshop on Computational Aspects of Deep Learning (CADL), which will be hosted at ECCV 2022. Check out the website.

02/15/2022 · Workshop on Artificial Intelligence for Digital Humanities at ICIAP 2021

I am co-organizing the first International Workshop on Artificial Intelligence for Digital Humanities, which will be held in conjunction with ICIAP 2021. See the website for more.

02/07/2022 · Area Chair ACM Multimedia 2022

I will serve as Area Chair for the 30th ACM International Conference on Multimedia. The deadline for paper submission is 31st March 2022.

01/31/2022 · Paper accepted at ICRA 2022

Our paper "Focus on Impact: Indoor Exploration with Intrinsic Motivation" has been accepted to ICRA 2022 and IEEE RA-L. Read it on ArXiv.

01/18/2022 · From Show to Tell: A Survey on Image Captioning accepted at TPAMI

Interested in Image Captioning? Our definitive guide to techniques, datasets and variants has been accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence. Check it out!

01/17/2022 · Associate Editor for ICPR 2022

I will serve as Associate Editor for ICPR 2022.

10/31/2021 · Communications Chair for ECCV 2022

Happy to announce that I will be joining the ECCV 2022 organizing team as Communications Chair.

📣 Update: the official website of ECCV 2022 is out, with deadlines.

07/31/2021 · 📣 We are hiring! One open PhD position

We have a three-year Early-Stage Researcher/PhD position on the integration of vision and language for human-robot interaction, within the MSCA project PERSEO. See the call for information on salary and how to apply.

07/29/2021 · ELLIS Scholar

I have been elected as an ELLIS Scholar in the ELLIS society, the European Laboratory for Learning and Intelligent Systems.

07/25/2021 · Associate Editor for Frontiers in Artificial Intelligence

I will serve as Associate Editor for the specialty section in Pattern Recognition of Frontiers in Artificial Intelligence.

07/25/2021 · Paper accepted to Computer Vision and Image Understanding

Our paper "Video action detection by learning graph-based spatio-temporal interactions" has been accepted to Computer Vision and Image Understanding! You can read the full article here.

06/26/2021 · Submission accepted at ICMR 2021

Our submission "Learning to Select: A Fully Attentive Approach for Novel Object Captioning" has been accepted at ICMR 2021.

04/12/2021 · Submission accepted at NVIDIA GTC 2021

Our submission "More Efficient and Accurate Video Networks: A New Approach to Maximize the Accuracy/Computation Tradeoff" has been accepted for oral presentation at NVIDIA GTC 2021, which will take place online on April 12-16. Watch the talk.

02/28/2021 · Area Chair ACM Multimedia 2021

I will serve as Area Chair for the 29th ACM International Conference on Multimedia. The deadline for paper submission is Apr. 10, 2021.

02/18/2021 · Paper accepted to Computer Vision and Image Understanding

Our paper "Video action detection by learning graph-based spatio-temporal interactions" has been accepted to Computer Vision and Image Understanding! You can read the full article here.

09/23/2020 · Interview with La Repubblica

I have been interviewed by Jaime D'Alessandro on Rep: Scienze, about GPT-3 and Transformer-based language models. You can read the article here.

07/01/2020 · The first Workshop on Computational Aspects of Deep Learning at ICPR 2020

Together with NVAITC, we are organizing the first Workshop on Computational Aspects of Deep Learning (CADL), which will be hosted at ICPR 2020. See the website for more.

06/22/2020 · Call for Demo and Exhibit at ICPR 2020

ICPR invites researchers to present live demonstrations of their research results and systems. Demos are intended as real, practical, and possibly interactive proofs of the presenters’ research ideas and scientific or engineering contributions. They should give the audience the opportunity to discuss working systems, applications and prototypes based on leading-edge research, and to interact directly with the researchers presenting the demo. See the call.

06/20/2020 · Presenting our Meshed-Memory Transformer at CVPR 2020

We are presenting our work on image captioning at CVPR 2020. Take a look at the video presentation, read the paper and share code!

04/13/2020 · Walking through the museum, while staying home

The Galleria Estense of Modena, in partnership with AImageLab, is organizing virtual tours that let visitors walk through the halls of the art gallery without leaving their homes. See the press releases on Gazzetta di Modena and Modena in Diretta.

02/14/2020 · Associate Editor of Pattern Recognition Letters

I have been appointed as Associate Editor at Pattern Recognition Letters.

12/18/2019 · Our Meshed-Memory Transformer ranks first on the COCO Image Captioning Leaderboard! 🏆

With a CIDEr-D of 1.321, our architecture for image captioning is first on COCO Captioning. See the leaderboard.

09/27/2019 · Invited talk at Modena Smart Life

On September 27th I will give an invited talk at the "Modena Smart Life" festival, on Vision, Language and Embodied AI. See the program of the event for further details.

09/11/2019 · Interview at Smart City on Radio24

I have been interviewed by Maurizio Melis on Radio24. You can hear the podcast of the interview here.

08/09/2019 · PersonArt - an interactive demo at Gallerie Estensi

From September 13th to 29th, you can discover your doppelgänger in art with our interactive face similarity demo. See the Gallerie Estensi website for more.

08/05/2019 · LAMV is being used at Facebook to detect harmful content

Our solution for matching and detecting copied videos, published at CVPR 2018, is now being used at production scale at Facebook to detect harmful content.

See the official announcement on the Facebook newsroom website, and the Github repository with the source code.

08/04/2019 · Tutorial at ICIAP 2019 - "Vision, Language and Action: from Captioning to Embodied AI"

See the abstract and program on the tutorial page.

07/05/2019 · I am co-organizing the first NVIDIA Inception Event in Italy

More details at the event page.

07/01/2019 · One paper accepted as oral at BMVC (with F. Landi and M. Corsini)

02/25/2019 · Two papers (with M. Cornia and M. Tomei) accepted at CVPR 2019!

06/29/2018 · Our paper on Human Eye Fixations Prediction has been accepted at Transactions on Image Processing (TIP)!

02/19/2018 · Our paper on Temporal Match Kernels (with M. Douze and H. Jégou) has been accepted at CVPR 2018!

02/13/2018 · I successfully defended my thesis.

04/28/2017 · I did an internship at FAIR (Facebook AI Research) Paris from July to October 2017

03/01/2017 · Our paper "Hierarchical Boundary-Aware Neural Encoder for Video Captioning" has been accepted at CVPR 2017

08/30/2016 · Imagelab will receive a GPU-based server as part of the Facebook AI Research Partnership