This repository contains the data (datasets, video/user summaries, CUS evaluation) from the paper VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method. We originally created the repository in 2011 on a Google Sites page that is now inactive: https://sites.google.com/site/vsummsite/.
Video summarization is an important research topic in video analysis, as it can enable faster browsing of large video collections and more efficient content indexing and access. Essentially, this research area consists of automatically generating a short summary of a video, which can be either static or dynamic. Static video summaries are composed of a set of keyframes extracted from the original video, while dynamic video summaries are composed of a set of shots and are produced taking into account the similarity or domain-specific relationships among all video shots.
In this project, we developed VSUMM, a methodology for producing static video summaries. The method is based on color feature extraction from video frames and a k-means clustering algorithm. We also developed a novel approach for evaluating static video summaries. In this approach, several users manually create video summaries, which are then compared both to our approach and to a number of different techniques in the literature.
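The idea of clustering color features and keeping one frame per cluster can be sketched as follows. This is a minimal illustration, not the VSUMM implementation: it assumes each frame is a flat list of `(r, g, b)` pixel tuples, uses a plain per-channel RGB histogram with 16 bins, and takes the number of clusters `k` as an argument. The function names (`color_histogram`, `kmeans`, `select_keyframes`) are hypothetical.

```python
import random

def color_histogram(pixels, bins=16):
    """Normalized per-channel histogram of a list of (r, g, b) pixels (0-255)."""
    hist = [0.0] * (3 * bins)
    for pixel in pixels:
        for channel, value in enumerate(pixel):
            hist[channel * bins + value * bins // 256] += 1
    total = sum(hist)
    return [h / total for h in hist]

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(features, k, iterations=20, seed=0):
    """Plain Lloyd's k-means; returns (labels, centers)."""
    rng = random.Random(seed)
    centers = [list(f) for f in rng.sample(features, k)]
    labels = [0] * len(features)
    for _ in range(iterations):
        # Assignment step: each feature goes to its nearest center.
        labels = [min(range(k), key=lambda j: squared_distance(f, centers[j]))
                  for f in features]
        # Update step: each center moves to the mean of its members.
        for j in range(k):
            members = [f for f, l in zip(features, labels) if l == j]
            if members:
                centers[j] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers

def select_keyframes(frames, k):
    """Return the indices of one representative frame per non-empty cluster."""
    features = [color_histogram(f) for f in frames]
    labels, centers = kmeans(features, k)
    keyframes = []
    for j in range(k):
        members = [i for i, l in enumerate(labels) if l == j]
        if members:
            # Keep the frame whose histogram is closest to the cluster center.
            keyframes.append(min(
                members,
                key=lambda i: squared_distance(features[i], centers[j])))
    return sorted(keyframes)

# Toy input: five all-red frames followed by five all-blue frames.
red = [(255, 0, 0)] * 4
blue = [(0, 0, 255)] * 4
frames = [red] * 5 + [blue] * 5
keyframes = select_keyframes(frames, 2)
print(keyframes)
```

On the toy input, the two selected indices fall in the red half and the blue half of the sequence respectively. How the number of clusters is chosen for real videos is not covered by this sketch.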
The main contributions of this paper are:
- A mechanism designed to produce static video summaries, which combines the advantages of the main concepts of related work in video summarization;
- A new evaluation method of video summaries, which reduces the subjectivity in the evaluation task, quantifies the summary quality, and allows more objective comparisons among different techniques; and
- A statistically well-founded experimental evaluation of both the proposed summarization technique – contrasted to others in the literature – and the evaluation method.
You can download the data as a single zip file or as individual zip files.
File (zip) | Size (MB) | Description |
---|---|---|
Dataset | 763 | 50 videos from Open Video. All videos are in MPEG-1 format (30 fps, 352 x 240 pixels), in color and with sound. These videos are distributed among several genres (documentary, educational, ephemeral, historical, lecture). Their durations range from 1 to 4 minutes, for approximately 75 minutes of video in total. |
User summary | 31.5 | 250 user summaries. These summaries were created manually by 50 users, each one dealing with 5 videos, meaning that each video has 5 video summaries created by 5 different users. |
VSUMM1 summary | 6.28 | 50 video summaries (ours). |
VSUMM2 summary | 4.91 | 50 video summaries (ours). |
OV summary | 6.30 | 50 video summaries (dataset providers). |
DT summary | 3.68 | 50 video summaries [Mundur et al., 2006]. |
STIMO summary (extended version of VISTO approach). | 5.86 | 50 video summaries [Furini et al., 2010]. |
File (zip) | Size (MB) | Description |
---|---|---|
Dataset | 468 | 50 videos from websites like YouTube. These videos are distributed among several genres (cartoons, news, sports, commercials, TV shows and home videos) and their durations range from 1 to 10 minutes. |
User summary | 45.3 | 250 user summaries. These summaries were created manually by 50 users, each one dealing with 5 videos, meaning that each video has 5 video summaries created by 5 different users. |
VSUMM summary | 7.93 | 50 video summaries (ours). |
CUS evaluation method (jar, example) (updated in February 2014). The jai_core and jai_codec libraries are included in the CUS implementation.
Usage:
java -jar CUS.jar -i [input_file.txt] -o [output_file.txt] -u [number_user_summaries] -a [number_approaches] -t [threshold (default: 0.5)]
-i [input_file.txt]
: The first line contains the video directory name. The following lines contain the directory of each user summary and the summaries of each approach. A blank line separates the videos. Input file format:
[path]/[video1_name]/
[path]/[video1_name]/[user_summary1]/
[path]/[video1_name]/[user_summary2]/
[path]/[video1_name]/[user_summary3]/
[path]/[video1_name]/[user_summary4]/
[path]/[video1_name]/[user_summary5]/
[path]/[video1_name]/[approach1]/
[path]/[video1_name]/[approach2]/

[path]/[video2_name]/
[path]/[video2_name]/[user_summary1]/
[path]/[video2_name]/[user_summary2]/
[path]/[video2_name]/[user_summary3]/
[path]/[video2_name]/[user_summary4]/
[path]/[video2_name]/[user_summary5]/
[path]/[video2_name]/[approach1]/
[path]/[video2_name]/[approach2]/
-u [number_user_summaries]
: The number of user summaries for each video.
-a [number_approaches]
: The number of approaches that produced the automatic summaries.
-t [threshold (default: 0.5)]
: The distance threshold used to decide whether two keyframes match. The CUS evaluation method compares each user summary directly with the automatic summaries. Keyframes from different summaries are compared through their color histograms, and the distance between them is measured using the Manhattan distance. Two keyframes are considered similar if the distance between them is less than the threshold. Once two frames are matched, they are removed from the next iteration of the comparison procedure. The default value is 0.5.
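The matching step described for `-t` can be sketched as below. This is an illustrative reading of the description above, not the code inside CUS.jar: it assumes the color histograms are already computed and normalized, matches each user keyframe to the first unmatched automatic keyframe under the threshold, and uses hypothetical function names (`manhattan`, `match_keyframes`). How CUS.jar builds and scales its histograms is not stated here, so the 0.5 default is only carried over as a parameter.

```python
def manhattan(h1, h2):
    """Manhattan (L1) distance between two histograms of equal length."""
    return sum(abs(a - b) for a, b in zip(h1, h2))

def match_keyframes(user_hists, auto_hists, threshold=0.5):
    """Pair user keyframes with automatic keyframes.

    Returns a list of (user_index, auto_index) pairs. Each keyframe
    appears in at most one pair, because a matched automatic keyframe
    is removed from the pool before the next comparison.
    """
    unmatched_auto = list(range(len(auto_hists)))
    pairs = []
    for u, user_hist in enumerate(user_hists):
        for a in unmatched_auto:
            if manhattan(user_hist, auto_hists[a]) < threshold:
                pairs.append((u, a))
                unmatched_auto.remove(a)  # matched frames leave the pool
                break
    return pairs

# Toy histograms with three bins each.
user = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
auto = [[0.9, 0.1, 0.0], [0.0, 0.0, 1.0], [1.0, 0.0, 0.0]]
pairs = match_keyframes(user, auto)
print(pairs)  # [(0, 0)]
```

From the returned pairs, the number of matching automatic keyframes is `len(pairs)` and the number of non-matching ones is `len(auto) - len(pairs)`.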
Video | Name | #Frames | Duration | Summary |
---|---|---|---|---|
v21 | The Great Web of Water, segment 01 | 3,279 | 1:50 | v21 |
v22 | The Great Web of Water, segment 02 | 2,118 | 1:11 | v22 |
v23 | The Great Web of Water, segment 07 | 1,745 | 0:59 | v23 |
v24 | A New Horizon, segment 01 | 1,806 | 1:01 | v24 |
v25 | A New Horizon, segment 02 | 1,797 | 1:00 | v25 |
v26 | A New Horizon, segment 03 | 6,249 | 3:29 | v26 |
v27 | A New Horizon, segment 04 | 3,192 | 1:47 | v27 |
v28 | A New Horizon, segment 05 | 3,561 | 1:59 | v28 |
v29 | A New Horizon, segment 06 | 1,944 | 1:05 | v29 |
v30 | A New Horizon, segment 08 | 1,815 | 1:01 | v30 |
v31 | A New Horizon, segment 10 | 2,517 | 1:24 | v31 |
v32 | Take Pride in America, segment 01 | 2,691 | 1:30 | v32 |
v33 | Take Pride in America, segment 03 | 3,261 | 1:49 | v33 |
v34 | Digital Jewelry: Wearable Technology for Every Day Life | 4,204 | 3:00 | v34 |
v35 | HCIL Symposium 2002 - Introduction, segment 01 | 2,336 | 1:18 | v35 |
v36 | Senses And Sensitivity, Introduction to Lecture 1 presenter | 4,221 | 2:20 | v36 |
v37 | Senses And Sensitivity, Introduction to Lecture 2 | 3,411 | 1:53 | v37 |
v38 | Senses And Sensitivity, Introduction to Lecture 3 presenter | 4,566 | 2:32 | v38 |
v39 | Senses And Sensitivity, Introduction to Lecture 4 presenter | 5,249 | 2:55 | v39 |
v40 | Exotic Terrane, segment 01 | 2,940 | 1:38 | v40 |
v41 | Exotic Terrane, segment 02 | 2,776 | 1:32 | v41 |
v42 | Exotic Terrane, segment 03 | 2,676 | 1:29 | v42 |
v43 | Exotic Terrane, segment 04 | 4,797 | 2:40 | v43 |
v44 | Exotic Terrane, segment 06 | 2,425 | 1:21 | v44 |
v45 | Exotic Terrane, segment 08 | 2,428 | 1:40 | v45 |
v46 | America's New Frontier, segment 01 | 3,591 | 1:59 | v46 |
v47 | America's New Frontier, segment 03 | 2,166 | 1:12 | v47 |
v48 | America's New Frontier, segment 04 | 3,705 | 2:03 | v48 |
v49 | America's New Frontier, segment 07 | 3,615 | 2:00 | v49 |
v50 | America's New Frontier, segment 10 | 4,830 | 2:41 | v50 |
v51 | The Future of Energy Gases, segment 03 | 2,934 | 1:37 | v51 |
v52 | The Future of Energy Gases, segment 05 | 3,615 | 2:00 | v52 |
v53 | The Future of Energy Gases, segment 09 | 1,884 | 1:02 | v53 |
v54 | The Future of Energy Gases, segment 10 | 2,886 | 1:36 | v54 |
v55 | Oceanfloor Legacy, segment 01 | 1,740 | 0:58 | v55 |
v56 | Oceanfloor Legacy, segment 02 | 2,325 | 1:17 | v56 |
v57 | Oceanfloor Legacy, segment 04 | 3,450 | 1:55 | v57 |
v58 | Oceanfloor Legacy, segment 08 | 3,186 | 1:46 | v58 |
v59 | Oceanfloor Legacy, segment 09 | 2,106 | 1:10 | v59 |
v60 | The Voyage of the Lee, segment 05 | 2,094 | 1:09 | v60 |
v61 | The Voyage of the Lee, segment 15 | 2,094 | 1:15 | v61 |
v62 | The Voyage of the Lee, segment 16 | 2,619 | 1:27 | v62 |
v63 | Hurricane Force - A Coastal Perspective, segment 03 | 2,310 | 1:17 | v63 |
v64 | Hurricane Force - A Coastal Perspective, segment 04 | 5,310 | 2:57 | v64 |
v65 | Drift Ice as a Geologic Agent, segment 03 | 5,310 | 1:31 | v65 |
v66 | Drift Ice as a Geologic Agent, segment 05 | 2,187 | 1:12 | v66 |
v67 | Drift Ice as a Geologic Agent, segment 06 | 2,425 | 1:30 | v67 |
v68 | Drift Ice as a Geologic Agent, segment 07 | 1,950 | 1:05 | v68 |
v69 | Drift Ice as a Geologic Agent, segment 08 | 3,618 | 2:00 | v69 |
v70 | Drift Ice as a Geologic Agent, segment 10 | 1,407 | 0:46 | v70 |
@article{de2011vsumm,
title={VSUMM: A mechanism designed to produce static video summaries and a novel evaluation method},
author={De Avila, Sandra Eliza Fontes and Lopes, Ana Paula Brandao and da Luz Jr, Antonio and de Albuquerque Ara{\'u}jo, Arnaldo},
journal={Pattern Recognition Letters},
volume={32},
number={1},
pages={56--68},
year={2011},
publisher={Elsevier}
}
The authors are grateful to CNPq, CAPES and FAPEMIG, Brazilian research funding agencies, for the financial support to this work.