Lorenzo Baraldi

Associate Professor, AImageLab
ELLIS Scholar and Coordinator of the Modena Unit
University of Modena and Reggio Emilia
  • Email: lorenzo.baraldi -at- unimore.it
  • Curriculum: C.V.

Background

I have been an Associate Professor at the University of Modena and Reggio Emilia since 2024, where I work on Deep Learning, Vision-and-Language integration, Large-Scale models and Multimedia. I teach in the courses "Computer Vision and Cognitive Systems", "Scalable AI", and "Computer Architecture". My research interests include Vision-and-Language integration, Multimodal Retrieval, Image and Video Captioning, Visual-Semantic alignment, Large-Scale model development, HPC and Embodied AI.

I have authored more than 120 publications in international journals and conferences. Currently, I serve as an Associate Editor for two journals, Computer Vision and Image Understanding and Pattern Recognition, and act as an Area Chair for ICCV and major multimedia conferences. I am also a Scholar in the ELLIS society (European Laboratory for Learning and Intelligent Systems), where I coordinate the Modena ELLIS Unit.

Since 2021, I have been deputy director of the Interdepartmental Center on Digital Humanities at the University of Modena and Reggio Emilia. Earlier in my career, in 2017, I worked at the Facebook AI Research laboratory in Paris under the supervision of Hervé Jégou. There, I contributed to the development of a video-matching algorithm that was adopted in production on the social network to detect abusive content.

News

Tutorial on AI and HPC at ICIAP 2025

We are glad to announce that we will organize, together with NVIDIA, a tutorial on AI and HPC at ICIAP 2025. The tutorial is organized as part of the MINERVA European Project.

Publication date: 05/02/2025
Three papers accepted at CVPR 2025!

Our papers "Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering", "Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval" and "Hyperbolic Safety-Aware Vision-Language Models" (joint collaboration with UvA) have been accepted to CVPR 2025!

Update: "Hyperbolic Safety-Aware Vision-Language Models" has been selected as a highlight paper!

Update: Four of my co-authors and I have been selected as Outstanding Reviewers.

Publication date: 02/26/2025
IRCDL 2026 in Modena

Happy to share that we will host the 2026 edition of IRCDL, the Conference on Information and Research Science Connecting to Digital and Library Science.

Publication date: 02/21/2025
Area Chair for ACM Multimedia 2025

Happy to share that I will serve as an Area Chair for ACM Multimedia 2025!

Publication date: 02/17/2025
Paper accepted at ICLR 2025

Our paper "Causal Graphical Models for Vision-Language Compositional Understanding" (with F. Parascandolo, N. Moratelli, E. Sangineto, R. Cucchiara) has been accepted to ICLR 2025!

Publication date: 01/22/2025
Associate Editor for Computer Vision and Image Understanding (CVIU)

Happy to share that I joined the CVIU Editorial Board as an Associate Editor!

Publication date: 12/30/2024
CADL Workshop accepted at ISC 2025

Our "Computational Aspects of Deep Learning" Workshop, organized in conjunction with NVIDIA and CINECA, has been accepted at ISC High Performance 2025, the HPC Event.

Publication date: 12/06/2024
Area Chair for ICCV 2025

Happy to share that I will serve as an Area Chair for ICCV 2025!

Publication date: 12/03/2024
Two papers accepted at WACV 2025

Glad to announce that our papers, "Semantically Conditioned Prompts for Visual Recognition under Missing Modality Scenarios" (with V. Pipoli, F. Bolelli, S. Sarto, M. Cornia, C. Grana, R. Cucchiara and E. Ficarra) and "Perceive, Query & Reason: Enhancing Video QA with Question-Guided Temporal Queries" (with R. Amoroso, G. Zhang, R. Koner, R. Cucchiara, V. Tresp) have been accepted at WACV 2025!

Publication date: 11/01/2024

Featured publications

Recurrence-Enhanced Vision-and-Language Transformers for Robust Multimodal Document Retrieval

Davide Caffagni, Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2025

Hyperbolic Safety-Aware Vision-Language Models

Tobia Poppi, Tejaswi Kasarla, Pascal Mettes, Lorenzo Baraldi, Rita Cucchiara
CVPR 2025 Highlight

Augmenting Multimodal LLMs with Self-Reflective Tokens for Knowledge-based Visual Question Answering

Federico Cocchi, Nicholas Moratelli, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2025

Causal Graphical Models for Vision-Language Compositional Understanding

Fiorenzo Parascandolo, Nicholas Moratelli, Enver Sangineto, Lorenzo Baraldi, Rita Cucchiara
ICLR 2025

Personalized Instance-based Navigation Toward User-Specific Objects in Realistic Environments

Luca Barsellotti, Roberto Bigazzi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
NeurIPS 2024, Datasets and Benchmarks track

Safe-CLIP: Removing NSFW Concepts from Vision-and-Language Models

Samuele Poppi, Tobia Poppi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
ECCV 2024

Contrasting Deepfakes Diffusion via Contrastive Learning and Global-Local Similarities

Lorenzo Baraldi, Federico Cocchi, Marcella Cornia, Lorenzo Baraldi, Alessandro Nicolosi, Rita Cucchiara
ECCV 2024

BRIDGE: Bridging Gaps in Image Captioning Evaluation with Stronger Visual Cues

Sara Sarto, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
ECCV 2024

Training-Free Open-Vocabulary Segmentation with Offline Diffusion-Augmented Prototype Generation

Luca Barsellotti, Roberto Amoroso, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2024

The Revolution of Multimodal Large Language Models: A Survey

Davide Caffagni, Federico Cocchi, Luca Barsellotti, Nicholas Moratelli, Sara Sarto, Lorenzo Baraldi, Lorenzo Baraldi, Marcella Cornia, Rita Cucchiara
Findings of ACL 2024

Courses

Scalable AI (2024/2025)
Course material
Laurea Magistrale in Ingegneria Informatica
Lorenzo Baraldi, Giuseppe Fiameni

Computer Vision and Cognitive Systems (2024/2025)
Course material
Laurea Magistrale in Ingegneria Informatica
Rita Cucchiara, Lorenzo Baraldi

AI for Automotive (2024/2025)
Electronic Engineering for Intelligent Vehicles
Rita Cucchiara, Lorenzo Baraldi

Architettura dei Calcolatori (2024/2025)
Course material · Upcoming exams
Ingegneria Informatica
Rita Cucchiara, Lorenzo Baraldi