
09/26/2024 · Paper accepted to NeurIPS 2024

Our paper, "Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments", has been accepted to NeurIPS 2024, Datasets and Benchmarks track!

08/03/2024 · Introducing LLaVA-MORE

🔥 Today we are introducing LLaVA-MORE, a family of models that enhances LLaVA by integrating LLaMA 3.1 as the language model. Check out our GitHub repo!

07/20/2024 · Oral paper accepted to BMVC 2024

Our paper, "Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization", has been accepted for oral presentation at BMVC 2024!

07/12/2024 · Research Grant Positions Available!

👉 We have open research grant positions within the PRIN projects "MUCES - a Multimedia platform for Content Enrichment and Search in audiovisual archives" and "MUSMA: Multimedia Understanding meets Social Media Analysis". If you are interested, please get in touch!

07/04/2024 · MINERVA proposal successful!

Our proposal MINERVA, submitted to the DIGITAL-EUROHPC-JU-2023-AISC-03-01 call, and coordinated by CINECA, has been successfully approved!

07/01/2024 · Three papers accepted at ECCV 2024!

Glad to announce that we have three papers accepted at ECCV 2024: "Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models", "Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities" and "BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues". 

05/16/2024 · Paper accepted to ACL 2024

Our paper, "The Revolution of Multimodal Large Language Models: A Survey", has been accepted to the Findings of ACL 2024!

04/14/2024 · Three workshop proposals accepted at ECCV 2024

Glad to announce that I will be organizing three workshops at ECCV 2024: "Computational Aspects of Deep Learning" (CADL) with NVIDIA, "TWYN: Trust What You learN. 1st Workshop on Trustworthiness in Computer Vision" with Leonardo SpA, and AI4DH "Artificial Intelligence for Digital Humanities" with UNIMC. Details soon to follow!

02/27/2024 · Paper accepted at CVPR 2024

Our paper "Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation" (with L. Barsellotti, R. Amoroso, M. Cornia, R. Cucchiara) has been accepted at CVPR 2024! [project page, video, arxiv]

02/10/2024 · Paper accepted at ICRA 2024

Our paper "Mapping High-level Semantic Regions in Indoor Environments without Object Recognition" (with R. Bigazzi, S. Kousik, R. Cucchiara and M. Pavone) has been accepted as oral to ICRA 2024!

01/07/2024 · Workshop and Challenge on DeepFake Analysis and Detection at CVPR 2024

I am co-organizing the second International Workshop on DeepFake Analysis and Detection, which will be held in conjunction with CVPR 2024. See the website for more!

12/05/2023 · Associate Editor of Pattern Recognition

I have been appointed as Associate Editor for Pattern Recognition.

11/23/2023 · Area Chair for ACM Multimedia 2024

I will serve as Area Chair for ACM Multimedia 2024. The deadline for paper submission is 12th April 2024.

10/31/2023 · Paper accepted at IJCV

Our paper "Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets" (with M. Cornia, G. Fiameni, R. Cucchiara) has been accepted to IJCV!

10/24/2023 · Two papers accepted to WACV 2024

Two papers accepted on semantic (and open-vocabulary) segmentation at WACV 2024. Thanks to R. Amoroso, L. Barsellotti, R. Cucchiara, M. Bernhard, Y. Kindermann, M. Schubert and V. Tresp!

07/30/2023 · PRIN 2022 PNRR project accepted

Our PRIN 2022 project "MUCES - a MUltimedia platform for Content Enrichment and Search in audiovisual archives" (with F. Carrara) has been accepted!

07/14/2023 · Paper accepted at ICCV 2023

We are glad to announce that our paper "With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning" has been accepted to ICCV 2023.

06/18/2023 · PRIN 2022 project accepted

Our PRIN 2022 project "MUSMA: Multimedia Understanding meets Social Media Analysis" (with G. Serra and W. Quattrociocchi) has been accepted!

06/03/2023 · Workshop and Challenge on DeepFake Analysis and Detection at ICCV 2023

I am co-organizing the first International Workshop on DeepFake Analysis and Detection, which will be held in conjunction with ICCV 2023. See the website for more. The paper submission deadline is July 17th, AoE.

05/10/2023 · AIDA e-lecture

I will be delivering the e-lecture “From Images to Text: New forms of Human-AI Interaction” for the International AI Doctoral Academy (AIDA) on May 16th, 2023, 17:00-18:00 CET. See details in:

You can join for free using the Zoom link (password: 148148).

04/20/2023 · Area Chair ACM Multimedia 2023

I will serve as Area Chair for the 31st ACM International Conference on Multimedia. The deadline for paper submission is 30th April 2023.

04/19/2023 · ELLIS Summer School on Large-Scale AI

Our Modena ELLIS Unit is organizing a Summer School of the ELLIS network this year, from September 18th to 25th, at the Modena Technopole. See the website for more.

03/02/2023 · Paper accepted as highlight at CVPR 2023

We are glad to announce that our paper "Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation" has been accepted to CVPR 2023 as a highlight paper (top 2.5% of submissions). arXiv and GitHub.

02/02/2023 · Paper accepted at ICRA 2023

Our paper "Embodied Agents for Efficient Exploration and Smart Scene Description" has been accepted to ICRA 2023 (arxiv).

08/02/2022 · Best paper award at CBMI 2022

Our paper "Retrieval-Augmented Transformer for Image Captioning" has been selected as best paper at the International Conference on Content-based Multimedia Indexing (CBMI 2022)!

05/27/2022 · Best Student Paper Award at ICIAP 2021

Our paper "Investigating Bidimensional Downsampling in Vision Transformer Models" by Paolo Bruno, Roberto Amoroso, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, and Rita Cucchiara has been selected for the Best Student Paper Award at ICIAP 2021.

03/27/2022 · Computational Aspects of Deep Learning (CADL) workshop accepted at ECCV

Together with NVAITC, we are organizing the second Workshop on Computational Aspects of Deep Learning (CADL), which will be hosted at ECCV 2022. Check out the website.

02/15/2022 · Workshop on Artificial Intelligence for Digital Humanities at ICIAP 2021

I am co-organizing the first International Workshop on Artificial Intelligence for Digital Humanities, which will be held in conjunction with ICIAP 2021. See the website for more.

02/07/2022 · Area Chair ACM Multimedia 2022

I will serve as Area Chair for the 30th ACM International Conference on Multimedia. The deadline for paper submission is 31st March 2022.

01/31/2022 · Paper accepted at ICRA 2022

Our paper "Focus on Impact: Indoor Exploration with Intrinsic Motivation" has been accepted to ICRA 2022 and IEEE RA-L. Read it on arXiv.

01/18/2022 · From Show to Tell: A Survey on Image Captioning accepted at TPAMI

Interested in Image Captioning? Our definitive guide to techniques, datasets and variants has been accepted to IEEE Transactions on Pattern Analysis and Machine Intelligence. Check it out!

01/17/2022 · Associate Editor for ICPR 2022

I will serve as Associate Editor for ICPR 2022.

10/31/2021 · Communications Chair for ECCV 2022

Happy to announce that I will be joining the ECCV 2022 organizing team as Communications Chair. 

📣 Update: the official website of ECCV 2022 is out, with deadlines.

07/31/2021 · 📣 We are hiring! One open PhD position

We have a three-year Early-Stage Researcher/PhD position on the integration of vision and language for human-robot interaction, within the MSCA project PERSEO. See the call for information on salary and how to apply.

07/29/2021 · ELLIS Scholar

I have been elected as an ELLIS Scholar in the ELLIS society, the European Laboratory for Learning and Intelligent Systems.

07/25/2021 · Associate Editor for Frontiers in Artificial Intelligence

I will serve as Associate Editor for the specialty section in Pattern Recognition of Frontiers in Artificial Intelligence.

07/25/2021 · Paper accepted to Computer Vision and Image Understanding

Our paper "Video action detection by learning graph-based spatio-temporal interactions" has been accepted to Computer Vision and Image Understanding! You can read the full article here.

06/26/2021 · Submission accepted at ICMR 2021

Our submission "Learning to Select: A Fully Attentive Approach for Novel Object Captioning" has been accepted at ICMR 2021.

04/12/2021 · Submission accepted at NVIDIA GTC 2021

Our submission "More Efficient and Accurate Video Networks: A New Approach to Maximize the Accuracy/Computation Tradeoff" has been accepted for oral presentation at NVIDIA GTC 2021, which will take place online on April 12-16. Watch the talk.

02/28/2021 · Area Chair ACM Multimedia 2021

I will serve as Area Chair for the 29th ACM International Conference on Multimedia. The deadline for paper submission is April 10, 2021.

02/18/2021 · Paper accepted to Computer Vision and Image Understanding

09/23/2020 · Interview with La Repubblica

I have been interviewed by Jaime D'Alessandro on Rep: Scienze, about GPT-3 and Transformer-based language models. You can read the article here.

07/01/2020 · The first Workshop on Computational Aspects of Deep Learning at ICPR 2020

Together with NVAITC, we are organizing the first Workshop on Computational Aspects of Deep Learning (CADL), which will be hosted at ICPR 2020. See the website for more.

06/22/2020 · Call for Demo and Exhibit at ICPR 2020

ICPR invites researchers to present live demonstrations of their research results and systems. Demos are intended as real, practical, and possibly interactive proofs of the presenters’ research ideas and scientific or engineering contributions. They should give the audience the opportunity to discuss working systems, applications and prototypes based on leading-edge research, and to interact directly with the researchers presenting the demo. See the call.

06/20/2020 · Presenting our Meshed-Memory Transformer at CVPR 2020

We are presenting our work on image captioning at CVPR 2020. Take a look at the video presentation, read the paper and share code!


04/13/2020 · Walking through the museum, while staying home

The Galleria Estense of Modena, in partnership with AImageLab, is organizing virtual tours that let visitors walk through the halls of the art gallery without leaving their homes. See the press releases on Gazzetta di Modena and Modena in Diretta.

02/14/2020 · Associate Editor of Pattern Recognition Letters

I have been appointed as Associate Editor at Pattern Recognition Letters.

12/18/2019 · Our Meshed-Memory Transformer ranks first on the COCO Image Captioning Leaderboard! 🏆

With a CIDEr-D of 1.321, our architecture for image captioning is first on COCO Captioning. See the leaderboard.

09/27/2019 · Invited talk at Modena Smart Life

On September 27th I will give an invited talk at the "Modena Smart Life" festival, on Vision, Language and Embodied AI. See the program of the event for further details.

09/11/2019 · Interview at Smart City on Radio24

I have been interviewed by Maurizio Melis on Radio24. You can listen to the podcast of the interview here.

08/09/2019 · PersonArt - an interactive demo at Gallerie Estensi

From September 13th to 29th you can discover your doppelgänger in art with our interactive face similarity demo.
See the Gallerie Estensi website for more.

08/05/2019 · LAMV is being used at Facebook to detect harmful content

Our solution for matching and detecting copied videos, published at CVPR 2018, is now being used at production scale at Facebook to detect harmful content.

See the official announcement on the Facebook newsroom website, and the GitHub repository with the source code.

08/04/2019 · Tutorial at ICIAP 2019 - "Vision, Language and Action: from Captioning to Embodied AI"

See the abstract and program on the tutorial page.

07/05/2019 · I am co-organizing the first NVIDIA Inception Event in Italy

More details at the event page.

07/01/2019 · One paper accepted as oral at BMVC (with F. Landi and M. Corsini)

02/25/2019 · Two papers (with M. Cornia and M. Tomei) accepted at CVPR 2019!

06/29/2018 · Our paper on Human Eye Fixations Prediction has been accepted at Transactions on Image Processing (TIP)!

02/19/2018 · Our paper on Temporal Match Kernels (with M. Douze and H. Jégou) has been accepted at CVPR 2018!

02/13/2018 · I successfully defended my thesis.

04/28/2017 · I did an internship at FAIR (Facebook AI Research) Paris from July to October 2017

03/01/2017 · Our paper "Hierarchical Boundary-Aware Neural Encoder for Video Captioning" has been accepted at CVPR 2017

08/30/2016 · Imagelab will receive a GPU-based server as part of the Facebook AI Research Partnership