ViPER: The Video Performance Evaluation Resource
Overview
At the Language and Media Processing Lab, much of our research focuses on analyzing video for semantic content, such as tracking people and detecting text. ViPER is our system for evaluating that work. The Video Performance Evaluation Resource is a toolkit of scripts and Java programs for marking up ground truth in visual data and for evaluating how closely sets of result data approximate that truth.
The Performance Evaluation Problem
To evaluate a video analysis algorithm, or a set of algorithms, it is necessary to define a methodology. Many books and papers already describe methods for evaluating specific types of algorithms, so we decided to develop a general framework for evaluation instead. The idea common to most of the evaluations we do is a comparison between the computer-generated output and some ideal version of 'Truth'.
In some subfields of vision, like document processing, it is possible to generate test data automatically. For video processing, however, it is more common for a human to define the ground truth for each video clip. To ensure that researchers can repeat and verify evaluations, it is important to make the ground truth metadata available to other researchers in a documented format. It is also very useful to have ways of qualitatively verifying the ground truth, such as metadata browsers and editors. ViPER-GT provides tools for creating and editing video metadata.
There are many ways to define how correct a result data set is with respect to a ground truth data set. A metric that looks at difference in size of bounding boxes for text detection may give different results than a more goal-oriented metric which operates on the number of characters or words correctly recognized. ViPER-PE provides tools for solving the evaluation problem.
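To make the contrast between the two kinds of metric concrete, here is a minimal Python sketch. The functions `area_difference` and `word_accuracy` and the sample data are illustrative assumptions, not metrics taken from ViPER-PE. The sketch only shows how a result can score well on box size while scoring worse on recognized words.

```python
def area_difference(truth_box, result_box):
    """Relative difference in area between two (x, y, width, height) boxes."""
    truth_area = truth_box[2] * truth_box[3]
    result_area = result_box[2] * result_box[3]
    return abs(truth_area - result_area) / truth_area


def word_accuracy(truth_text, result_text):
    """Fraction of ground-truth words matched, compared position by position."""
    truth_words = truth_text.split()
    result_words = result_text.split()
    matches = sum(t == r for t, r in zip(truth_words, result_words))
    return matches / len(truth_words)


# Hypothetical text-detection result: the box is nearly right,
# but one of the three words was misrecognized.
truth = {"box": (10, 20, 200, 40), "text": "BREAKING NEWS TONIGHT"}
result = {"box": (12, 22, 196, 40), "text": "BREAKING NEVVS TONIGHT"}

size_error = area_difference(truth["box"], result["box"])
accuracy = word_accuracy(truth["text"], result["text"])
print(f"box area differs by {size_error:.0%}, word accuracy is {accuracy:.0%}")
```

Under these assumptions the box area is off by only a few percent, while a third of the words are wrong, so the choice of metric changes the verdict on the same result.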
The ViPER Ground Truth Authoring Tool
ViPER-GT provides a Java graphical user interface for authoring ground truth. It is designed to allow frame-by-frame markup of video metadata stored in the ViPER format. It is also useful for visualization. For more information, see the ViPER-GT product page.
The ViPER Performance Evaluation Tool
ViPER-PE is a command-line performance evaluation tool. It offers a variety of metrics for comparing video metadata files, so a user can choose how a result data set is compared with ground truth data. It can report precision and recall, perform frame-by-frame and object-based evaluations, and filter the data so that only relevant subsets are evaluated. More information can be found at the ViPER-PE product page.
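As a rough illustration of what a frame-by-frame precision and recall evaluation with filtering involves, here is a minimal Python sketch. It is not ViPER-PE's implementation: the function names, the intersection-over-union overlap measure, the 0.5 threshold, the greedy matching, and the convention of scoring an empty set as 1.0 are all assumptions made for the example.

```python
def overlap_ratio(a, b):
    """Intersection-over-union of two (x, y, width, height) boxes."""
    ax, ay, aw, ah = a
    bx, by, bw, bh = b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union else 0.0


def precision_recall(truth_boxes, result_boxes, threshold=0.5):
    """Greedily match each result box to the best unmatched truth box."""
    unmatched = list(truth_boxes)
    hits = 0
    for box in result_boxes:
        best = max(unmatched, key=lambda t: overlap_ratio(t, box), default=None)
        if best is not None and overlap_ratio(best, box) >= threshold:
            unmatched.remove(best)
            hits += 1
    # Assumed convention: an empty set scores 1.0.
    precision = hits / len(result_boxes) if result_boxes else 1.0
    recall = hits / len(truth_boxes) if truth_boxes else 1.0
    return precision, recall


def evaluate_frames(truth, result, keep=lambda frame: True, threshold=0.5):
    """Score every frame that passes the `keep` filter."""
    scores = {}
    for frame in sorted(set(truth) | set(result)):
        if keep(frame):
            scores[frame] = precision_recall(
                truth.get(frame, []), result.get(frame, []), threshold
            )
    return scores


# Hypothetical data: frame number -> list of boxes.
truth = {1: [(0, 0, 10, 10), (50, 50, 10, 10)], 2: [(0, 0, 10, 10)]}
result = {1: [(1, 1, 10, 10)], 2: [(0, 0, 10, 10), (80, 80, 5, 5)]}

scores = evaluate_frames(truth, result)
filtered = evaluate_frames(truth, result, keep=lambda frame: frame >= 2)
for frame, (precision, recall) in scores.items():
    print(f"frame {frame}: precision={precision:.2f} recall={recall:.2f}")
```

In this sample, frame 1 misses one of two true objects (lower recall) and frame 2 contains one false detection (lower precision). The `keep` predicate stands in for a filter that restricts the evaluation to a subset of the data.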
Other Tools
The ViPER API is a set of Java interfaces and classes that provide programmatic access to data stored in the ViPER format. It offers a generic, object-oriented view of video metadata that is aimed at evaluation. Since ViPER data is stored in XML, it is not difficult to read in the data in languages that cannot interface with Java.
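As a sketch of reading such data without Java, the following Python example parses a small XML fragment with the standard library. The fragment is a simplified, hypothetical stand-in: its element and attribute names (`object`, `attribute`, `bbox`, `framespan`) are assumptions for illustration, and the actual schema is described in the file format documentation.

```python
import xml.etree.ElementTree as ET

# Simplified, hypothetical metadata fragment; not the real ViPER schema.
SAMPLE = """
<viper>
  <data>
    <sourcefile filename="clip.mpg">
      <object name="Person" id="0" framespan="1:3">
        <attribute name="Location">
          <bbox framespan="1:2" x="10" y="20" width="30" height="60"/>
          <bbox framespan="3:3" x="12" y="20" width="30" height="60"/>
        </attribute>
      </object>
    </sourcefile>
  </data>
</viper>
"""


def read_boxes(xml_text):
    """Return {(object name, object id): {frame: (x, y, width, height)}}."""
    root = ET.fromstring(xml_text.strip())
    boxes = {}
    for obj in root.iter("object"):
        key = (obj.get("name"), obj.get("id"))
        for box in obj.iter("bbox"):
            # A span such as "1:2" covers frames 1 and 2 inclusive (assumed).
            first, last = (int(n) for n in box.get("framespan").split(":"))
            rect = tuple(int(box.get(k)) for k in ("x", "y", "width", "height"))
            for frame in range(first, last + 1):
                boxes.setdefault(key, {})[frame] = rect
    return boxes


boxes = read_boxes(SAMPLE)
for (name, ident), frames in boxes.items():
    print(name, ident, frames)
```

A real file may use XML namespaces and other data types, in which case the tag lookups would need to be adjusted accordingly.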
ViPER-Viz is (currently) a set of UNIX scripts that enable a user to compactly visualize ground truth, analysis results, performance evaluation results, or entire video clips, using several flexible representations.
Important Notes
The system is currently under development, so there are still bugs in the program. As such, there is no warranty on this, expressed or implied. Save early, save often. See our bug list, hosted at SourceForge, for more details.
Other Vipers
Viper is a very common name for software products. The one Google finds most related to ours is the GIFT system, which builds on the Viper CBIR project from the University of Geneva's Computer Vision Group. Our project is also unrelated to Prima Graphic's video board of the same name, the car of the same name, and the snake. Nor are we related to the Purdue Video and Image Processing Laboratory.
- viper-toolkit@googlegroups.com - A publicly logged discussion about ViPER, with archives. News of new releases will be posted here.
- viper@cfar.umd.edu - For general questions, ideas, discussion and announcements
- viper-bugs@cfar.umd.edu - For bug reports
Publications
- V.Y. Mariano, J. Min, J.-H. Park, R. Kasturi, D. Mihalcik, D. Doermann, and T. Drayer. Performance Evaluation of Object Detection Algorithms. International Conference on Pattern Recognition, pages 965-969, 2002.
- D. Doermann and D. Mihalcik. Tools and Techniques for Video Performance Evaluation. International Conference on Pattern Recognition, pages 167-170, 2000.
Documentation
For the most up-to-date versions of the manual, see the HTML versions in the documentation directory. (The PDF versions may be out of date.) As always, if something is missing or inaccurate, e-mail questions to viper at cfar.umd.edu.
- Installation Instructions
- Authoring Ground Truth with ViPER-GT
- Using ViPER's Performance Evaluation Tool
- Scripting ViPER
- Introduction to the ViPER File Format
Software
For a list of all the releases we have made available, see the ViPER SourceForge Project download page. This is especially useful if you want ViPER-GT 3.6, the last stable version. (Note: version 3.6 requires JMF and JAI.)
- ViPER Light Distribution (Recommended) [May 25, 2005]
- Full Distribution of ViPER, including source code (Eclipse recommended) [May 25, 2005]
Sample Files
- Small Example File (LAMP-Moving)
- Sample Metadata File: Three people in a room with superimposed text.
- Sample Video File: a short clip of some LAMPers
- Security Footage from PETS
- Sample Metadata File: Tracking people on a security camera.
- Sample Video File: Wide aspect ratio security footage.
Extras for ViPER-GT
The ViPER-GT video annotation tool can run additional scripts and plug-ins. Here are some example scripts. To install them, put them in the ~/.viper/scripts directory. In Windows, download them into the Documents and Settings/username/.viper/scripts folder.
- Add I-Frame Descriptor: Create a new descriptor that represents every I-frame of an MPEG. It also automatically sets the interface to 'Display When Valid'. This means that playing back the video will only display I-frames. To turn this off, click the red icon in the timeline view.
- Invalidate Elsewhen: Invalidate all descriptors on frames where the currently selected descriptor is marked as invalid. This is useful if you have a 'frames of interest' descriptor, e.g. one created by the 'Add I-Frame Descriptor' plug-in.
- Scrub Invalid Attribute Regions: Removes all data stored in attributes that are in invalid frames of their parent descriptor. Removing a check-mark from the 'V' column of the table doesn't change the value of the stored attributes; this means that there is plenty of left-over meaningless data in the file and in memory. So, scrubbing your data isn't necessary, but it results in cleaner files. People who use your data will be pleased.
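Installing one of the scripts above amounts to copying the downloaded file into the `.viper/scripts` directory under your home directory. The following Python sketch shows one way to do that. The helper `install_script` and the file name in the usage comment are hypothetical, and the sketch assumes that `Path.home()` resolves to the same home directory ViPER-GT uses.

```python
import shutil
from pathlib import Path


def install_script(script_path, home=None):
    """Copy a downloaded script into <home>/.viper/scripts; return the new path."""
    home = Path(home) if home is not None else Path.home()
    target_dir = home / ".viper" / "scripts"
    # Create the directory if ViPER-GT has not created it yet.
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy(script_path, target_dir))


# Example usage (hypothetical file name):
# install_script("Downloads/my-viper-script")
```

The optional `home` argument exists only so the function can be pointed at a different directory, for example when trying it out without touching your real settings.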
Developer Information
For information about our future plans and goals, and our current progress towards them, see the developer section of the web site.
We are using SourceForge.net to host the web site and bug list, but we are using an internal CVS server for development of the software. If you wish to contribute bug fixes, you may e-mail links to the patches to viper-bugs@cfar.umd.edu. Also, feel free to send questions to viper@cfar.umd.edu.