LAMP     The Language and Media Processing Laboratory

ViPER Toolkit Software Requirements

This page describes what the software should do when it is completed. If you are a developer and want to modify this document, please read the howto document for editing this page. The system requirements (what software and hardware are required to run the toolkit) are listed on the home page and in the quick-start document.

Three related projects, the Pure Java MPEG Decoder, the chronicle widget, and the LAMP AppLoader, have their own requirements pages. Here are links to the JMPEG requirements, the chronicle requirements, and the AppLoader requirements.

ViPER Goals

ViPER is designed to support video processing evaluation. To a lesser extent, it should support document, audio, and other media processing evaluation. This is broadly divided into two categories: ground truth management and evaluation.

It should be noted that a third category, workflow management and algorithm execution, which is included with many other such systems, is not a current goal of the system. This includes things like automatic execution of algorithms and creation of ground truth from a source document. For examples, see Tapas Kanungo's work with PSET, or Chhabra's and Phillips's Benchmark for Graphics Recognition Systems. We mostly use ViPER to allow different groups to run algorithms and mail us their results, allowing them to use their own computers.

These requirements are goals for the next version of ViPER, v4. This revision will make radical changes to the ground truth tool to support new visualizations and, hopefully, a cleaner workflow. There will also be minor changes to the performance evaluation tool, including the introduction of a fourth evaluation paradigm for activity detection.

Definitions

ViPER
The Video Performance Evaluation Resource (or Video Performance Evaluation Toolkit) provides a set of tools for evaluating video processing algorithms, such as person tracking and text detection.
Candidate
Also known as result data, this is the set of descriptors generated by some algorithm that will be compared to the target data.
Target
Also known as truth data, this is the set of descriptors that represent the true content of the media file.

ViPER-GT: The Ground Truth Authoring and Viewing Tool

Remember Ben Shneiderman's infoviz mantra:

  1. Overview First
  2. Zoom + Filter
  3. Details on Demand

  1. Core System
    1. Metadata Management
      1. The core system must allow the plug-ins to access the metadata via the ViPER API.
      2. The core system must support loading any valid ViPER file and saving it back, so that the two files are equivalent.
      3. The system should have a history of user files, to allow quick loading of recently used files.
      4. The system should be able to handle loading multiple files at a time, for editing and comparing results, and switching between related files in a project (e.g. different camera angles).
      5. The system may support inserting other files or merging two metadata files. If the GT system doesn't support this, a separate tool must exist for this kind of functionality (it is often necessary to divide work among multiple truth editors, and merging is a necessary step).
    2. User Interface Mediation for Modules
      1. There must be a means of reducing clutter and focusing the user's attention on the work at hand. This includes a method for selecting what file to work on, what frames are important, and what descriptors are relevant to the user's task. An example would be an option to show only descriptors that are relevant to another descriptor, appear in a certain frame, or contain a certain lvalue.
      2. There should be a method for supporting DnD. 694122
    3. Basic Graphical User Interface
      1. The menus and windows must follow standard guidelines (e.g. the Windows guidelines).
      2. The main window must show the current sourcefile selected in its title bar. 694123
      3. The user should have access to a list of most recently used files, to save time looking for files to open.
    4. User Preferences Management
      1. Different modules should have their own preferences, accessible in the same manner or through the same interface.
      2. The user should be able to specify their hotkey preferences. 694133
    5. Undo/Redo 694113
      1. Undo/redo should be accessible through the standard hotkeys and the standard place in the menu.
      2. The undo manager should have a user interface to display multiple levels of undo.
      3. Any undo manager may support transactions, to allow multiple actions to be bundled into one by a script.
    6. Documentation
      1. There must exist a clear design specification of the core system, allowing plug-in writers a clear handle on how to extend the app.
      2. The users must have a manual that describes the system from the perspective of a ground truth author or designer.
    7. Ontology Support
    8. Help System
      1. There must be some form of online help, if only a link to the documentation.
      2. There should exist context-sensitive help, explaining certain features.
  2. Canvas Module
    1. A media canvas must exist, which presents a view of the current file and frame, with the metadata appropriately represented directly upon it, and directly manipulatable when possible.
    2. The canvas must be frame accurate - i.e. it will display frame x as per the media format specification, and be consistent with time as well.
    3. The canvas should be fast, displaying the new frame less than a second after the user requests a change. It would be preferable to allow real-time playback of video as well.
    4. The canvas should use standard vector manipulation UI metaphors and gestures, unless there is a clear reason to use different ones. 694134
    5. There should be a way to select multiple objects on the canvas and manipulate them at once. 694114
    6. There should be a way for the user to customize colors or to associate colors with each instance of an object. 694120
    7. The shapes and so forth should be drawn appropriately to the scale of the zoom. 694124
    8. In order to support more clarity while editing ground truth, the canvas should allow the user to manipulate the image quality, e.g. contrast, gamma, and possibly sharpness. 694126
    9. Should have a refresh or snapback button to go to a preferred view clear of clutter or pixel effluvia. 694128
    10. Should have standard video controls to allow real-time scan of video for errors. 694130
    11. It would be nice if there were a way to create a movie of the canvas. 694106
    12. Should have a way of tracing object tracks through a video at near-real-time. 793929
  3. Timeline Module 694089
    1. The new system must have a timeline (see LifeLines or OntoLog to get an idea of what a timeline view is) that shows an overview of the metadata along the t dimension.
    2. The timeline must support direct manipulation paradigms of some sort.
    3. The timeline must provide all the functionality currently provided by the range slider and the buttons beneath it. This includes:
      1. The user can use the timeline widget to scrub through time, for example by dragging a slider, to skip around through the video. The change must be reflected in the canvas.
      2. The user must be able to use the canvas to select frames, down to frame level precision. This may involve the use of a type-in box, or some sort of enhanced fisheye scrubbing.
      3. The user should be able to use the timeline to propagate descriptor data across frames.
      4. The user must be able to use the timeline to interpolate a descriptor between frames.
    4. The display should support dynamic-query type manipulation to display only lines that are relevant to the user.
    5. The timeline should support zooming along the time axis. It may support zooming along the orthogonal axis, but that may prove distracting.
    6. The timeline should present the lines in a coherent ordering, and allow the user to alter the ordering (either with direct manipulation or with some sort of dynamic query type functionality).
    7. The timeline should have thumbnails of important frames (to the user). 694131
  4. Table Module (See the spec)
    1. The table module presents a spreadsheet-like table view of all instances of a descriptor type or set of descriptor types.
    2. The table module must allow direct editing of all attribute values.
    3. The table must have tabs for each object descriptor type, one for all content descriptors, and one for the file descriptor. 694118
    4. It should be possible to hide selected attributes from the view. This will help to reduce clutter and increase user focus. 694116
  5. Property Sheet Module
    1. The property sheets must provide an editable overview of a single descriptor. It may show the descriptor over the course of its life, or for the currently focussed frame.
  6. Ontology/Configuration Editor Module
    1. The system needs a config editor. The current method of editing .gtc files is a significant barrier to the utility of the toolkit.
  7. Tree View of Instances 694117
    1. The new GT tool should include a tree view of the instances, which will allow the user to select the appropriate ones and to compare them.
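
To make the timeline requirement "interpolate a descriptor between frames" more concrete, here is a minimal sketch of one possible approach: linear interpolation of a bounding box between two keyframes. The `BBox` class, the `interpolate_bbox` function, and the choice of linear interpolation are assumptions made for illustration; they are not part of the ViPER API, and the requirements above do not say which interpolation method the tool should use.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BBox:
    """Hypothetical axis-aligned box: top-left corner plus size, in pixels."""
    x: float
    y: float
    width: float
    height: float


def interpolate_bbox(start_frame, start_box, end_frame, end_box):
    """Return {frame: BBox} for every frame from start_frame to end_frame.

    Each field is interpolated linearly between the two keyframe boxes.
    This is an illustrative assumption, not ViPER's documented behavior.
    """
    if end_frame <= start_frame:
        raise ValueError("end_frame must be greater than start_frame")

    span = end_frame - start_frame
    boxes = {}
    for frame in range(start_frame, end_frame + 1):
        # t runs from 0.0 at the first keyframe to 1.0 at the second.
        t = (frame - start_frame) / span
        boxes[frame] = BBox(
            x=start_box.x + t * (end_box.x - start_box.x),
            y=start_box.y + t * (end_box.y - start_box.y),
            width=start_box.width + t * (end_box.width - start_box.width),
            height=start_box.height + t * (end_box.height - start_box.height),
        )
    return boxes
```

A truth author would mark the object at two frames, and the tool would fill in the frames between them, leaving the author to correct only the frames where the object does not move in a straight line.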

ViPER-PE: A Tool for Performance Evaluation

    1. The performance evaluation tool must take in two different representations of metadata for the same source media file and present the user with a numeric valuation of the distance between the two metadata sets.
    2. The performance evaluation tool must support multiple evaluations and scripted evaluation runs.
    3. The performance evaluation tool must support user defined types and evaluation techniques.
    4. The performance evaluation tool must allow the user to evaluate only the section or sections of the truth that are relevant to the user's interest.
    5. The performance evaluation tool must support the ViPER data format.
    6. The user may set parameters to indicate that different metadata classes represent the same thing.
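
As an illustration of the first requirement, a "numeric valuation of the distance between the two metadata sets", the following sketch compares a target track with a candidate track frame by frame. The metric used here (one minus the ratio of intersection area to union area, with a penalty of 1.0 for frames present in only one track) and the names `overlap_distance` and `mean_frame_distance` are assumptions chosen for the example; the requirements above do not specify which metrics ViPER-PE uses.

```python
def overlap_distance(target, candidate):
    """Distance in [0, 1] between two boxes given as (x, y, width, height).

    0.0 means identical boxes, 1.0 means no overlap at all.
    """
    tx, ty, tw, th = target
    cx, cy, cw, ch = candidate

    # Width and height of the intersection rectangle (clamped at zero).
    inter_w = max(min(tx + tw, cx + cw) - max(tx, cx), 0)
    inter_h = max(min(ty + th, cy + ch) - max(ty, cy), 0)
    intersection = inter_w * inter_h

    union = tw * th + cw * ch - intersection
    if union <= 0:
        # Degenerate (zero-area) boxes: treat as a complete miss.
        return 1.0
    return 1.0 - intersection / union


def mean_frame_distance(target_track, candidate_track):
    """Average per-frame distance between two {frame: box} dictionaries.

    A frame that appears in only one track counts as distance 1.0
    (a missed detection or a false alarm).
    """
    frames = sorted(set(target_track) | set(candidate_track))
    if not frames:
        return 0.0

    total = 0.0
    for frame in frames:
        target_box = target_track.get(frame)
        candidate_box = candidate_track.get(frame)
        if target_box is None or candidate_box is None:
            total += 1.0
        else:
            total += overlap_distance(target_box, candidate_box)
    return total / len(frames)
```

Restricting the evaluation to the relevant sections of the truth, as the fourth requirement asks, would amount to filtering both dictionaries to the frames of interest before calling `mean_frame_distance`.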

ViPER-Viz: Scripts and Tools for Analyzing Evaluations