Research Grant Positions Available! · 07/12/2024

👉 We have open research grant positions within the PRIN projects "MUCES - a Multimedia platform for Content Enrichment and Search in audiovisual archives" and "MUSMA: Multimedia Understanding meets Social Media Analysis". If you are interested, please get in touch!

News

Paper accepted to NeurIPS 2024 · 09/26/2024

Our paper, "Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments", has been accepted to NeurIPS 2024, Datasets and Benchmarks track!

Introducing LLaVA-MORE · 08/03/2024

🔥 Today we are introducing LLaVA-MORE, a family of models that enhances LLaVA by integrating LLaMA 3.1 as the language model. Check out our GitHub repo!

Oral paper accepted to BMVC 2024 · 07/20/2024

Our paper, "Revisiting Image Captioning Training Paradigm via Direct CLIP-based Optimization", has been accepted for oral presentation at BMVC 2024!

MINERVA proposal successful! · 07/04/2024

Our proposal MINERVA, submitted to the DIGITAL-EUROHPC-JU-2023-AISC-03-01 call, and coordinated by CINECA, has been successfully approved!

Three papers accepted at ECCV 2024! · 07/01/2024

Glad to announce that we have three papers accepted at ECCV 2024: "Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models", "Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities" and "BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues".

Paper accepted to ACL 2024 · 05/16/2024

Our paper, "The Revolution of Multimodal Large Language Models: A Survey", has been accepted to the Findings of ACL 2024!

Paper accepted as highlight at CVPR 2023 · 03/02/2023

We are glad to announce that our paper "Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation" has been accepted to CVPR 2023 as a highlight paper (top 2.5% of submissions). arXiv and GitHub.

ELLIS Scholar · 07/29/2021

I have been elected as an ELLIS Scholar in the ELLIS society, the European Laboratory for Learning and Intelligent Systems.

Interview with La Repubblica · 09/23/2020

I have been interviewed by Jaime D'Alessandro on Rep: Scienze, about GPT-3 and Transformer-based language models. You can read the article here.

LAMV is being used at Facebook to detect harmful content · 08/05/2019

Our solution for matching and detecting copied videos, published at CVPR 2018, is now being used at production scale at Facebook to detect harmful content.

See the official announcement on the Facebook newsroom website, and the GitHub repository with the source code.

Older news can be found in the news archive.

Featured publications

The complete list is available on the publications page.

Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments
Luca Barsellotti, Roberto Bigazzi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
NeurIPS 2024

Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models
Samuele Poppi, Tobia Poppi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
ECCV 2024

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities
Lorenzo Baraldi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Alessandro Nicolosi, Rita Cucchiara
ECCV 2024

BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues
Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
ECCV 2024

Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation
Luca Barsellotti, Roberto Amoroso, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
Proceedings of the 2024 IEEE/CVF Conference on Computer Vision and Pattern Recognition

The Revolution of Multimodal Large Language Models: A Survey
Davide Caffagni, Federico Cocchi, Luca Barsellotti, Nicholas Moratelli, Sara Sarto, Lorenzo Baraldi, Lorenzo Baraldi, Marcella Cornia, Rita Cucchiara
Findings of the Association for Computational Linguistics: ACL 2024

What’s Outside the Intersection? Fine-grained Error Analysis for Semantic Segmentation Beyond IoU
Maximilian Bernhard, Yannic Kindermann, Roberto Amoroso, Matthias Schubert, Lorenzo Baraldi, Rita Cucchiara, Volker Tresp
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

FOSSIL: Free Open-Vocabulary Semantic Segmentation through Synthetic References Retrieval
Luca Barsellotti, Roberto Amoroso, Lorenzo Baraldi, Rita Cucchiara
Proceedings of the IEEE/CVF Winter Conference on Applications of Computer Vision (WACV)

Generating More Pertinent Captions by Leveraging Semantics and Style on Multi-Source Datasets
M. Cornia, L. Baraldi, G. Fiameni, R. Cucchiara
International Journal of Computer Vision

Fully-Attentive Iterative Networks for Region-based Controllable Image and Video Captioning
M. Cornia, L. Baraldi, A. Tal, R. Cucchiara
Computer Vision and Image Understanding

With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning
Manuele Barraco, Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
Proceedings of the IEEE/CVF International Conference on Computer Vision

Video Surveillance and Privacy: A Solvable Paradox?
Rita Cucchiara, Lorenzo Baraldi, Marcella Cornia, Sara Sarto
IEEE Computer

Superpixel Positional Encoding to Improve ViT-based Semantic Segmentation Models
Roberto Amoroso, Matteo Tomei, Lorenzo Baraldi, Rita Cucchiara
Proceedings of the British Machine Vision Conference 2023

Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation
Sara Sarto, Manuele Barraco, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
Proceedings of the 2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition

Let's ViCE! Mimicking Human Cognitive Behavior in Image Generation Evaluation
Federico Betti, Jacopo Staiano, Lorenzo Baraldi, Lorenzo Baraldi, Rita Cucchiara, Nicu Sebe
Proceedings of ACM International Conference on Multimedia 2023

From Show to Tell: A Survey on Image Captioning
M. Stefanini, M. Cornia, L. Baraldi, S. Cascianelli, G. Fiameni, R. Cucchiara
IEEE TPAMI

Video action detection by learning graph-based spatio-temporal interactions
Matteo Tomei, Lorenzo Baraldi, Simone Calderara, Simone Bronzin, Rita Cucchiara
Computer Vision and Image Understanding

SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability
Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
International Conference on Robotics and Automation

Meshed-Memory Transformer for Image Captioning
Marcella Cornia, Matteo Stefanini, Lorenzo Baraldi, Rita Cucchiara
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition

Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters
Federico Landi, Lorenzo Baraldi, Massimiliano Corsini, Rita Cucchiara
BMVC 2019

Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2019

Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-to-Image Translation
Matteo Tomei, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2019

Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model
Marcella Cornia, Lorenzo Baraldi, Giuseppe Serra, Rita Cucchiara
IEEE Transactions on Image Processing

LAMV: Learning to align and match videos with kernelized temporal layers
Lorenzo Baraldi, Matthijs Douze, Rita Cucchiara, Hervé Jégou
CVPR 2018

Hierarchical Boundary-Aware Neural Encoder for Video Captioning
Lorenzo Baraldi, Costantino Grana, Rita Cucchiara
CVPR 2017

Teaching

The complete list is available on the teaching page.

Architettura dei Calcolatori (2024/2025)
Course material
Ingegneria Informatica
Rita Cucchiara, Lorenzo Baraldi

Computer Vision and Cognitive Systems (2023/2024)
Course material · Upcoming exams
Laurea Magistrale in Ingegneria Informatica
Lorenzo Baraldi, Vittorio Cuculo

AI for Automotive (2023/2024)
Electronic Engineering for Intelligent Vehicles
Rita Cucchiara, Lorenzo Baraldi

Scalable AI (2023/2024)
Course material
Laurea Magistrale in Ingegneria Informatica
Lorenzo Baraldi, Giuseppe Fiameni, Marta Lovino

If you need anything, see the Office hours page to get in touch with me.
