Lorenzo Baraldi

Tenure Track Assistant Professor (RTD-B), AImageLab
ELLIS Scholar and Coordinator of the Modena Unit
University of Modena and Reggio Emilia
  • Email: lorenzo.baraldi -at- unimore.it
  • Curriculum: C.V.

News

PRIN 2022 PNRR project accepted · 07/30/2023

Our PRIN 2022 project "MUCES - a MUltimedia platform for Content Enrichment and Search in audiovisual archives" (with F. Carrara) has been accepted!

Paper accepted at ICCV 2023 · 07/14/2023

We are glad to announce that our paper "With a Little Help from your own Past: Prototypical Memory Networks for Image Captioning" has been accepted to ICCV 2023.

PRIN 2022 project accepted · 06/18/2023

Our PRIN 2022 project "MUSMA: Multimedia Understanding meets Social Media Analysis" (with G. Serra and W. Quattrociocchi) has been accepted!

Workshop and Challenge on DeepFake Analysis and Detection at ICCV 2023 · 06/03/2023

I am co-organizing the first International Workshop on DeepFake Analysis and Detection, which will be held in conjunction with ICCV 2023. See the website for more. The paper submission deadline is July 17th, AoE.

Area Chair ACM Multimedia 2023 · 04/20/2023

I will serve as Area Chair for the 31st ACM International Conference on Multimedia. The deadline for paper submission is 30th April 2023.

ELLIS Summer School on Large-Scale AI · 04/19/2023

Our Modena ELLIS Unit is organizing a Summer School of the ELLIS network this year, from September 18th to 25th, at the Modena Technopole. See the website for more.

Paper accepted as highlight at CVPR 2023 · 03/02/2023

We are glad to announce that our paper "Positive-Augmented Contrastive Learning for Image and Video Captioning Evaluation" has been accepted to CVPR 2023 as a highlight paper (top 2.5% of submissions). arXiv and GitHub.

ELLIS Scholar · 07/29/2021

I have been elected as an ELLIS Scholar in the ELLIS society, the European Laboratory for Learning and Intelligent Systems.

Interview with La Repubblica · 09/23/2020

I have been interviewed by Jaime D'Alessandro on Rep: Scienze, about GPT-3 and Transformer-based language models. You can read the article here.

LAMV is being used at Facebook to detect harmful content · 08/05/2019

Our solution for matching and detecting copied videos, published at CVPR 2018, is now being used at production scale at Facebook to detect harmful content.

See the official announcement on the Facebook newsroom website, and the GitHub repository with the source code.

Older news can be found in the news archive.

Recent pre-prints

Universal Captioner: Inducing Content-Style Separation in Vision-and-Language Model Training
M. Cornia, L. Baraldi, G. Fiameni, R. Cucchiara

 

Towards Sustainable Video Modeling: Progressive Architecture Shrinkage for Action Recognition
M. Tomei, L. Baraldi, G. Fiameni, S. Bronzin, R. Cucchiara

Tell Me What To Describe: Fully-Attentive Iterative Networks for Region-Controlled Image and Video Captioning
M. Cornia, L. Baraldi, R. Cucchiara

Featured publications

The complete list is available on the publications page.

Multimodal attention networks for low-level vision-and-language navigation
F. Landi, M. Cornia, L. Baraldi, M. Corsini, R. Cucchiara
Computer Vision and Image Understanding

   

From Show to Tell: A Survey on Image Captioning
M. Stefanini, M. Cornia, L. Baraldi, S. Cascianelli, G. Fiameni, R. Cucchiara
IEEE TPAMI

   

Video action detection by learning graph-based spatio-temporal interactions
Matteo Tomei, Lorenzo Baraldi, Simone Calderara, Simone Bronzin, Rita Cucchiara
Computer Vision and Image Understanding

   

SMArT: Training Shallow Memory-aware Transformers for Robotic Explainability
Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
International Conference on Robotics and Automation

 

Meshed-Memory Transformer for Image Captioning
Marcella Cornia, Matteo Stefanini, Lorenzo Baraldi, Rita Cucchiara
2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition

   

Explore and Explain: Self-supervised Navigation and Recounting
Roberto Bigazzi, Federico Landi, Marcella Cornia, Silvia Cascianelli, Lorenzo Baraldi, Rita Cucchiara
Proceedings of the 25th International Conference on Pattern Recognition

 

Embodied Vision-and-Language Navigation with Dynamic Convolutional Filters
Federico Landi, Lorenzo Baraldi, Massimiliano Corsini, Rita Cucchiara
BMVC 2019

   

Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions
Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2019

   

Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-to-Image Translation
Matteo Tomei, Marcella Cornia, Lorenzo Baraldi, Rita Cucchiara
CVPR 2019

   

Predicting Human Eye Fixations via an LSTM-based Saliency Attentive Model
Marcella Cornia, Lorenzo Baraldi, Giuseppe Serra, Rita Cucchiara
IEEE Transactions on Image Processing

   

LAMV: Learning to align and match videos with kernelized temporal layers
Lorenzo Baraldi, Matthijs Douze, Rita Cucchiara, Hervé Jégou
CVPR 2018

   

Hierarchical Boundary-Aware Neural Encoder for Video Captioning
Lorenzo Baraldi, Costantino Grana, Rita Cucchiara
CVPR 2017

 

Teaching

The complete list is available on the teaching page.

Scalable AI (2023/2024)
Laurea Magistrale in Ingegneria Informatica
Lorenzo Baraldi, Giuseppe Fiameni

AI for Automotive (2022/2023)
Course material
Advanced Automotive Electronics Engineering, Electronics Engineering
Rita Cucchiara, Lorenzo Baraldi

Computer Vision and Cognitive Systems (2022/2023)
Course material · Upcoming exams
Laurea Magistrale in Ingegneria Informatica
Rita Cucchiara, Lorenzo Baraldi