Blog for work on my Master's thesis - a survey of methods for evaluating media understanding, object detection, and pattern matching algorithms. Mostly, it is related to ViPER, the Video Performance Evaluation Resource. If you find a good reference, or would like to comment, e-mail viper at cfar.umd.edu.
Media Processing Evaluation Weblog
Tuesday, June 28, 2005
truth
A video sequence can represent any number of persons, activities, and objects. The goal of a video understanding algorithm is to extract that information automatically - allowing a machine to perform a dull task that would otherwise take dozens of man-hours per hour of footage. ViPER-GT is a tool for doing that dull task by hand. It allows a user to define the set of data that a video represents, and to mark it up in painstaking, frame-by-frame detail. In a sense, ViPER-GT is a video annotation tool, in the same vein as VideoAnnEx or VideoMiner. However, it is more accurate to compare it to Anvil or other tools for performance evaluation. I will present information about ViPER-GT's predecessors (and descendants), information about its design and implementation, some quantification of the effort required to use ViPER per quality of output, and some use cases.
- posted by David @ 3:24 PM
Quartz 2D Extreme + ViPER = Bad
Okay, so I've finally gotten around to installing Mac OS X 10.4 on my Mac, and one of the first things I did was enable Quartz 2D Extreme, which is meant to speed up drawing. However, it also made ViPER's zooming features behave in... questionable ways. It may be my video card (ATI Mobility 9700 with 128MB of RAM), or it could be a driver issue, or something to do with Java on Mac OS X. When you zoom in, the resolution is incorrect - the bitmap is rendered at whatever the last resolution was. This is independent of the Java version - it happens on 1.5 as well as 1.4.2. (1.5 seems faster, but I didn't test that.) I've also noticed a similar problem in iMovie when Q2DE is enabled and I resize the application window. It seems that an image at a specific resolution is being placed in video RAM and then reused after it should have been replaced.
- posted by David @ 12:43 PM
Thursday, June 16, 2005
scripts manager
ViPER-GT needs a utility to manage user scripts. Right now, the user installs scripts by placing them in the ~/.viper/scripts directory. This is a lot of work - especially because most file managers hide the .viper directory. A user should be able to install, upgrade and remove scripts from within ViPER, and maybe browse for new ones. Also, it should be possible to bind a script to a hotkey.
The scripts are already somewhat self-describing; they include a method to get their name. A useful improvement would be for each script to also report a default bundle or classification (for grouping scripts when they become too numerous for a single menu) or a set of tags (for browsing and discovery). Other than that, the scripts shouldn't require too many changes.
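To make the self-description idea a bit more concrete, here is a rough sketch of what the extra metadata could look like. Only the name getter comes from the existing scripts; the `ScriptInfo` interface, `getBundle`, `getTags`, the `SimpleScript` helper and the script names are all made up for illustration and are not ViPER-GT's actual API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Sketch only: every name except getName() is hypothetical, not ViPER-GT's API. */
final class ScriptMenuSketch {

    /** Metadata a script might expose to a scripts manager. */
    interface ScriptInfo {
        String getName();                        // the existing self-description
        default String getBundle() {             // assumed: groups entries into submenus
            return "Ungrouped";
        }
        default List<String> getTags() {         // assumed: used for browsing and discovery
            return Collections.emptyList();
        }
    }

    /** Small immutable implementation used by the demo. */
    static final class SimpleScript implements ScriptInfo {
        private final String name;
        private final String bundle;
        private final List<String> tags;

        SimpleScript(String name, String bundle, String... tags) {
            this.name = name;
            this.bundle = bundle;
            this.tags = Collections.unmodifiableList(Arrays.asList(tags));
        }
        @Override public String getName() { return name; }
        @Override public String getBundle() { return bundle; }
        @Override public List<String> getTags() { return tags; }
    }

    /** Groups script names by bundle (sorted by bundle name) for building submenus. */
    static Map<String, List<String>> groupByBundle(List<? extends ScriptInfo> scripts) {
        Map<String, List<String>> menus = new TreeMap<>();
        for (ScriptInfo s : scripts) {
            menus.computeIfAbsent(s.getBundle(), k -> new ArrayList<>()).add(s.getName());
        }
        return menus;
    }

    public static void main(String[] args) {
        // Hypothetical scripts, invented for the example.
        List<ScriptInfo> scripts = Arrays.asList(
            new SimpleScript("Interpolate Boxes", "Shapes", "interpolation"),
            new SimpleScript("Shrink Faces", "Shapes", "cleanup"),
            new SimpleScript("Export Summary", "Reports"));
        System.out.println(groupByBundle(scripts));
    }
}
```

Using default methods means existing scripts that only provide a name would keep working and simply land in one catch-all bundle.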
There are a few things I want to emulate: Firefox, jEdit and Eclipse all have pretty much what I want, in the form of their plug-in managers. Of them, jEdit is probably the most useful to us directly; it is written in Swing and has a compatible license, so some of the code might be usable directly.
Required Preference Items
I would like the scripts manager to make use of the existing infrastructure as much as possible. As such, it is a good idea to get a handle on the PrefsManager, which is described somewhat in the AppLoader specification. Basically, the PrefsManager is a layer on top of Jena that provides some functionality that was missing when I started using it. Jena is an RDF triplestore for Java. RDF is a good way to keep track of a lot of semi-structured information. XML presents a rooted tree, while RDF's data model is a graph. An RDF document is a list of (subject, predicate, object) triples. This makes describing things with lots of loopy links easier, e.g. describing the connections between JavaBeans in ViPER-GT.
I also find it to be a lot easier to read and more compact than XML, at least in N3 form. (It seems that most of the data I've found on the web is in RDF/XML, which is sort of the worst of both worlds.) Anyway, extending ViPER-GT in any way that touches more than one simple component usually involves editing a lot of N3, so it is important to understand how to read it and how ViPER-GT uses it, both in the apploader packages and in some of the beans that use PrefsManager directly, like the UndoManager (which uses prefs to look up text strings for each undo item). For a simple example of using the user prefs file to track some information, look at how ViperViewMediator's getLocalPathToFile and putCanonicalToLocalMapping keep track of where a file referenced in a gt file can be found on the user's local disk.
Basically, to add elements to the RDF model, you use the changeUser(toRemove,toAdd) method on the PrefsManager. This makes sure the change happens atomically and that the appropriate listeners are notified. Later, to get information, query the unified model field directly, using the normal Jena methods (e.g. getResource and listStatements). You will need to define new properties, such as 'version' and 'last-update-check-time', and probably a 'Script' RDF class.
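As a rough illustration of that remove-then-add pattern, here is a plain-Java stand-in. It is not Jena and not the real PrefsManager: the `PrefsSketch` class, the `Triple` type, the listener interface, the string-based `listStatements` and the `script:shrink-faces` identifier are all assumptions made so the example runs without any libraries. Only the idea of `changeUser(toRemove, toAdd)` followed by a query, and the 'version' property name, come from the text above.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Objects;
import java.util.Set;

/** Plain-Java stand-in for the PrefsManager pattern; NOT Jena and NOT ViPER-GT code. */
final class PrefsSketch {

    /** One statement: subject, predicate, object (plain strings in this sketch). */
    static final class Triple {
        final String subject;
        final String predicate;
        final String object;

        Triple(String subject, String predicate, String object) {
            this.subject = subject;
            this.predicate = predicate;
            this.object = object;
        }
        @Override public boolean equals(Object o) {
            if (!(o instanceof Triple)) return false;
            Triple t = (Triple) o;
            return subject.equals(t.subject) && predicate.equals(t.predicate)
                    && object.equals(t.object);
        }
        @Override public int hashCode() { return Objects.hash(subject, predicate, object); }
    }

    /** Hypothetical listener, called once per change. */
    interface ChangeListener {
        void changed(Set<Triple> removed, Set<Triple> added);
    }

    private final Set<Triple> model = new LinkedHashSet<>();
    private final List<ChangeListener> listeners = new ArrayList<>();

    synchronized void addListener(ChangeListener l) { listeners.add(l); }

    /** Removes and adds statements in one locked step, then notifies each listener once. */
    synchronized void changeUser(Set<Triple> toRemove, Set<Triple> toAdd) {
        model.removeAll(toRemove);
        model.addAll(toAdd);
        for (ChangeListener l : listeners) {
            l.changed(toRemove, toAdd);
        }
    }

    /** Lists statements matching a pattern; a null argument matches anything. */
    synchronized List<Triple> listStatements(String s, String p, String o) {
        List<Triple> out = new ArrayList<>();
        for (Triple t : model) {
            if ((s == null || s.equals(t.subject))
                    && (p == null || p.equals(t.predicate))
                    && (o == null || o.equals(t.object))) {
                out.add(t);
            }
        }
        return out;
    }

    public static void main(String[] args) {
        PrefsSketch prefs = new PrefsSketch();
        prefs.addListener((removed, added) ->
            System.out.println("removed " + removed.size() + ", added " + added.size()));

        String script = "script:shrink-faces";  // hypothetical identifier
        Triple v1 = new Triple(script, "version", "1.0");
        Triple v2 = new Triple(script, "version", "1.1");

        prefs.changeUser(Collections.<Triple>emptySet(), Collections.singleton(v1)); // install
        prefs.changeUser(Collections.singleton(v1), Collections.singleton(v2));      // upgrade

        for (Triple t : prefs.listStatements(script, "version", null)) {
            System.out.println(t.subject + " " + t.predicate + " " + t.object);
        }
    }
}
```

The point of passing both sets in one call is that an upgrade never leaves the model with two version statements, or none, between the remove and the add.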
Where possible, you should use existing techniques and RDF vocabulary. For example, you can use the apploader HOTKEYS vocabulary to bind hotkeys to each script.
Setting Up the Management Protocol
I'd recommend doing whatever Firefox or jEdit does. I'm sure the Firefox stuff is better documented, but the jEdit stuff is browsable.
- posted by David @ 1:49 PM
Thursday, June 02, 2005
Requests from the PSU Meeting on June 1, 2005
- Improved performance: the ability to play back at full speed or faster with fewer dropped frames. While I doubt ViPER will ever achieve VirtualDub-quality scrubbing, especially while written in Java, it can probably be improved considerably. The first step is getting a profiler configured on my machine at work and trying to quantify the performance issues. It might also be a good idea to make a simple piccolo-video player to see what the performance bounds are on that sort of widget.
- Diff-type functionality. The way I see implementing this is by having one .xgtf file as the 'primary' and having several others possibly loaded as 'secondary'. We discussed outputting four different files from a comparison (correct truth, correct result, missed truth and false result) and displaying them with different colors (like the old overlay scripts).
- Shift+constrained orientation of shapes: holding shift while drawing an oriented box or line of a polygon should constrain it to one of the eight major orientations.
- Per-attribute coloring, or descriptor specific styles. For example, the VideoMining annotation tool supports coloring based on another attribute value (male = red, female = blue, etc).
- Better handling of hiding things using the tabs and column header colored sphere icons. Right now, it is too easy to change those accidentally. I should add a pop-up menu, as well, which would help make the icons easier to understand.
- A variety of improvements to the timeline were requested. The most obvious one was making the descriptor summary lines clearer and more meaningful. Right now, if there are more than a couple of descriptors, the line becomes unusable and slow. If there are fewer than a couple, the line is needlessly obscure. Also, the button for 'display where valid' is a horrible UI, and the roll-out of a set of descriptors is similarly useless. It would be nice to constrain annotation to one subset of the video. Finally, you probably shouldn't be able to drag the time-cursor to an invalid frame or type an invalid frame into the frame number box.
- While 'Toggle Display of Invalid' is nice, it should never hide the selected attribute. This leads to the strange problem of accidentally drawing on the screen when you think you should be in 'select' mode. Even if you do mean to draw the shape, you certainly mean for it to be automatically set to 'valid', too.
- 'Advance to next descriptor' seemed well-implemented in the video miner. I'll probably steal their icon idea (an arrow pointing to a box).
- In general, ViPER does not provide enough support for editing existing data. Often it is easier to start from scratch than to fix a problem (e.g. an annotator made all of the faces too large).
- Event-annotation features would be nice - thumbnail display, shot-level editing, etc.
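Of the requests above, the Shift-constrained orientation one is mostly arithmetic: snap the drawn angle to the nearest multiple of 45 degrees. Here is a minimal sketch of that. The `OrientationSnap` class and its method names are invented for illustration, and keeping the segment's length from the anchor point (instead of, say, projecting the mouse position onto the snapped direction) is my assumption about the desired behavior.

```java
/** Sketch of Shift-constrained drawing; names and behavior are assumptions, not ViPER code. */
final class OrientationSnap {

    /** Angle between adjacent allowed orientations: 45 degrees, giving eight directions. */
    static final double STEP = Math.PI / 4;

    /** Snaps an angle in radians to the nearest multiple of 45 degrees, in [0, 2*pi). */
    static double snapAngle(double radians) {
        double snapped = Math.round(radians / STEP) * STEP;
        double twoPi = 2 * Math.PI;
        // Normalize so that negative angles and full turns map into [0, 2*pi).
        return ((snapped % twoPi) + twoPi) % twoPi;
    }

    /**
     * Given the anchor point (ax, ay) and the current mouse point (mx, my), returns the
     * constrained end point {x, y}: same distance from the anchor, snapped direction.
     */
    static double[] constrain(double ax, double ay, double mx, double my) {
        double dx = mx - ax;
        double dy = my - ay;
        double length = Math.hypot(dx, dy);
        double angle = snapAngle(Math.atan2(dy, dx));
        return new double[] {
            ax + length * Math.cos(angle),
            ay + length * Math.sin(angle)
        };
    }

    public static void main(String[] args) {
        // A nearly horizontal drag ends up exactly horizontal.
        double[] end = constrain(0, 0, 10, 1);
        System.out.println(end[0] + ", " + end[1]);
        // 50 degrees snaps to 45 degrees.
        System.out.println(Math.toDegrees(snapAngle(Math.toRadians(50))));
    }
}
```

The same `snapAngle` could be applied to the rotation of an oriented box or to each new edge of a polygon while Shift is held.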