Blog for work on my Masters thesis - a survey of methods for evaluating media understanding, object detection, and pattern matching algorithms. Mostly, it is related to ViPER, the Video Performance Evaluation Resource. If you find a good reference, or would like to comment, e-mail viper at cfar.umd.edu.
Media Processing Evaluation Weblog
Friday, November 21, 2003
RFEs
I met with Dave, back from Comdex. It appears as though MPEG-2 may be required functionality for ViPER-GT. I don't know how we'll manage it, though I suppose it's possible. I'll search around for a Java MPEG-2 decoder, then look into using QT4J or JMF, followed by a close examination of FFMPEG and its offspring (right now, I'm looking at wrapping something higher-level, like VLC), and finally an attempt to get Jonathan back to write another decoder.
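For my own notes, the first sanity check on the JMF route would just be whether it can realize a player for one of our MPEG-2 files at all. A quick feasibility sketch (not ViPER code; it assumes JMF is installed):

import java.io.File;
import javax.media.Manager;
import javax.media.MediaLocator;
import javax.media.Player;

// Feasibility check: can JMF open and realize a player for this file?
public class JmfProbe {
    public static void main(String[] args) throws Exception {
        MediaLocator loc = new MediaLocator(new File(args[0]).toURL());
        Player p = Manager.createRealizedPlayer(loc); // throws if no demuxer/codec
        System.out.println("Opened " + args[0] + ", duration "
                + p.getDuration().getSeconds() + " seconds");
        // Frame accuracy is the real question; this control is often missing.
        Object fpc = p.getControl("javax.media.control.FramePositioningControl");
        System.out.println("FramePositioningControl available: " + (fpc != null));
        p.close();
    }
}

If createRealizedPlayer throws, there's no installed demuxer/codec for the file, which would settle the JMF question quickly.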
Charles is working on points, and then he'll do polygon editing. Dave hasn't indicated a need for ellipses or circles, yet. I'll be working on exporting sets of frames from the interface, and finally implementing marker setting to allow 'export frames between these markers'.
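The export itself should be the easy part once frames can be grabbed as images. A rough sketch of 'export frames between these markers' (FrameSource here is a hypothetical stand-in for whatever decoder interface we end up with, not an existing ViPER class):

import java.awt.image.BufferedImage;
import java.io.File;
import java.io.IOException;
import javax.imageio.ImageIO;

// Hypothetical decoder interface: hand back any frame as an image.
interface FrameSource {
    BufferedImage getFrame(int frameNumber);
}

class FrameExporter {
    // Write every frame in [startMarker, endMarker] to outDir as PNGs.
    static void export(FrameSource src, int startMarker, int endMarker, File outDir)
            throws IOException {
        for (int f = startMarker; f <= endMarker; f++) {
            ImageIO.write(src.getFrame(f), "png", new File(outDir, "frame-" + f + ".png"));
        }
    }
}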
- posted by David @ 12:20 PM
Wednesday, November 19, 2003
Compiling the Current (Nov 19, 2003) Version of FFMPEG with MinGW and MSYS
I had to change a few things. First, it is noteworthy that the makefiles don't have quote marks in the appropriate places (I saw this problem in the root Makefile and the libavformat Makefile, in the CFLAGS declaration: the -I$(SRC) lines should read -I"$(SRC)", and so forth). The other error I received while compiling was in libavcodec, where several .c files seem to require #include <inttypes.h> but are missing it; "ffv1.c" and "jfdctfst.c" are among them. This causes a bunch of warnings about duplicate typedefs, but that shouldn't make a difference.
Update: I also had to add quotes in the configure script, around line 1036 in my version. The line version=`grep '#define FFMPEG_VERSION ' "$source_path/libavcodec/avcodec.h" didn't have quotes around the .h file's path as originally written. Also, I noticed that I wasn't getting the inttypes.h bug on my home computer, but was instead getting warnings about a function being declared twice; I assume this is due to running gcc 3.2 here and 3.3 at home.
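Once the build works, the quickest way to use it from ViPER is probably to shell out to the resulting ffmpeg.exe rather than link against the libraries directly. A minimal sketch, assuming ffmpeg ends up on the PATH and using the standard image-sequence output syntax (the exact options may differ a bit in the current build):

import java.io.File;
import java.io.InputStream;

// Dump a video to numbered JPEG frames by running the ffmpeg binary.
public class FfmpegDump {
    public static void main(String[] args) throws Exception {
        String input = args[0];
        File outDir = new File(args[1]);
        String[] cmd = { "ffmpeg", "-i", input,
                new File(outDir, "frame%05d.jpg").getPath() };
        Process proc = Runtime.getRuntime().exec(cmd);
        // ffmpeg prints progress on stderr; drain it so the process can't block.
        InputStream err = proc.getErrorStream();
        while (err.read() != -1) { /* discard */ }
        System.out.println("ffmpeg exited with " + proc.waitFor());
    }
}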
- posted by David @ 6:55 PM
Monday, November 17, 2003
MSYS and JNI
I have been playing around with getting JNI working under Eclipse, which is possible with the CDT and MinGW. This should allow me to use FFMPEG within ViPER. It might also let me work on the Unix utilities included with ViPER-Full under Windows, which would really help my development process.
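The Java half of a JNI binding is the small part; the interesting part is getting MinGW to build the matching DLL. A minimal sketch of the Java side (the class and method names are hypothetical, not anything in ViPER yet):

// Decoder.java: Java half of a hypothetical JNI binding.
public class Decoder {
    static {
        // Loads decoder.dll on Windows; the DLL is the C half, built with
        // MinGW under MSYS (gcc -shared, with jni.h on the include path).
        System.loadLibrary("decoder");
    }

    // Implemented in C; run javah on this class to generate the header stubs.
    public native boolean open(String path);
    public native int getFrameCount();

    public static void main(String[] args) {
        Decoder d = new Decoder();
        if (d.open(args[0])) {
            System.out.println(args[0] + " has " + d.getFrameCount() + " frames");
        }
    }
}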
I'm going to try to clean up the timeline so it no longer relies on the CVS version of Piccolo, which will let me roll out an alpha. I don't think I'll call it pre-alpha anymore; the VACE presentation went well enough that I think most of the major data-loss bugs are gone, and it is feature-complete enough to be as useful as, or more useful than, ViPER 3.6, although it is still missing some datatype support (polygons, points, circles & ellipses - for more info, see the bug reports and RFEs on sf.net).
- posted by David @ 3:27 PM
Thursday, November 13, 2003
Feature Requests
At the recent demo, several people asked for new features. There were a few requests to support more encodings (MPEG-2 and MPEG-4) and to support audio. I'll look into using FFMPEG or QT4J, but I'm not too optimistic that we'll get the frame accuracy and codec support we require (and JMF is lacking).
Audio support would be very useful, although I'd most likely turn it off by default. How to appropriately present audio while scrubbing is a bit of an issue; I'll look into how Final Cut handles it, I guess. I suppose I could have a separate marker for audio tracks and allow the user to drag that. Decoding shouldn't be impossible; there are open source MP3 decoders for Java, and we already have access to the streams. (Speaking of which, there remains the bug that ViPER only lets you ground-truth the first video stream in an MPEG file. I need to add that to the bug list.) The audio is useful for more than just transcription efforts; it helps in defining activities and people as well. Transcription and speech markup are handled far better in ATLAS, but that doesn't mean we should try to be feature-disjoint.
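One cheap way to present audio while scrubbing would be to precompute one peak value per chunk of samples and paint those as a small waveform strip along the timeline. A sketch of the reduction step, assuming 16-bit signed little-endian PCM (a WAV file, say; MP3 would need one of those Java decoders in front of it):

import java.io.File;
import javax.sound.sampled.AudioFormat;
import javax.sound.sampled.AudioInputStream;
import javax.sound.sampled.AudioSystem;

// Reduce an audio file to one peak value per chunk, for a timeline strip.
public class WaveformEnvelope {
    public static int[] peaks(File audio, int framesPerChunk) throws Exception {
        AudioInputStream in = AudioSystem.getAudioInputStream(audio);
        AudioFormat fmt = in.getFormat();
        int nChunks = (int) ((in.getFrameLength() + framesPerChunk - 1) / framesPerChunk);
        byte[] buf = new byte[framesPerChunk * fmt.getFrameSize()];
        int[] peaks = new int[nChunks];
        for (int c = 0; c < nChunks; c++) {
            int n = in.read(buf); // may come back short; fine for a display sketch
            int peak = 0;
            for (int i = 0; i + 1 < n; i += 2) {
                int s = (buf[i] & 0xff) | (buf[i + 1] << 8); // 16-bit little-endian sample
                peak = Math.max(peak, Math.abs(s));
            }
            peaks[c] = peak;
        }
        in.close();
        return peaks;
    }
}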
Another request was to keep the current frame marker static and move the rest of the timeline around it; this is similar to Daniel's idea of keeping a spatial attribute at the same position in the canvas and moving the video around it, but simpler to implement. I don't know how to provide a UI for this, and I should also remark that there is a third alternative (chosen by several audio editors) of jumping to the next block when the cursor nears the end of the currently visible region. I hate that behavior myself (I prefer the SNES Zelda style of smooth scrolling to the NES Zelda screen-by-screen flip), but I can see why some people might like it (like using a strobe to detect defects in print houses).
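In frame-number terms the two scrolling policies are simple enough to state, which is part of why I'm tempted to offer both as a preference. A sketch (plain arithmetic, no Piccolo):

// Two ways to keep the playhead visible on the timeline.
public class TimelineScroll {
    // Policy A: the frame marker stays put (centered) and the timeline scrolls under it.
    static int centeredViewStart(int currentFrame, int visibleFrames, int totalFrames) {
        int start = currentFrame - visibleFrames / 2;
        return Math.max(0, Math.min(start, totalFrames - visibleFrames));
    }

    // Policy B, the audio-editor style: jump a whole page forward when the
    // marker nears the right edge of the visible region.
    static int pagedViewStart(int currentViewStart, int currentFrame, int visibleFrames) {
        if (currentFrame >= currentViewStart + visibleFrames - 1) {
            return currentFrame; // snap so the marker lands at the left edge
        }
        return currentViewStart; // otherwise leave the view alone
    }
}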
Several requests were made for a working implementation of the change-follows-mouse-while-playing 'manual interpolation' feature. The simple version shouldn't be hard to implement - it is basically 'Propagate + keeping something selected & the mouse in the same state'. It gets harder when you want to change only one aspect of the object (e.g. the centroid is already correct on each frame, but I want to adjust the orientation). This speaks to a more general sentiment some expressed: moving from the current concrete, 'value at t = x' approach to a more fluid functional/transformation approach, where the data is stored as transformations. That would allow things like editing only keyframes and having the system interpolate the rest automatically, with changes to keyframes propagating to the interpolated frames. In other words, Flash-type interpolation features instead of the current 'generate interpolated values' command. Perhaps having a project-file format and compiling down to the gtf format is the way to go. (I've been avoiding explicitly creating project files by putting stuff in the FILE Information descriptor and the user preferences file.)
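To make the keyframe idea concrete, here is roughly what a track for a single numeric attribute would look like; an obox would interpolate x, y, width, height and rotation the same way. (A sketch of the approach, not the current gtf data model.)

import java.util.SortedMap;
import java.util.TreeMap;

// Values are stored only at keyframes; every other frame is derived by
// interpolating between the neighboring keyframes, so editing a keyframe
// automatically updates the frames around it.
public class KeyframeTrack {
    private final SortedMap keys = new TreeMap(); // Integer frame -> Double value

    public void setKeyframe(int frame, double value) {
        keys.put(new Integer(frame), new Double(value));
    }

    public double valueAt(int frame) {
        if (keys.isEmpty()) throw new IllegalStateException("no keyframes set");
        Integer f = new Integer(frame);
        if (keys.containsKey(f)) return ((Double) keys.get(f)).doubleValue();
        SortedMap before = keys.headMap(f); // keyframes strictly before this frame
        SortedMap after = keys.tailMap(f);  // keyframes after it (f itself is absent)
        if (before.isEmpty()) return ((Double) after.get(after.firstKey())).doubleValue();
        if (after.isEmpty()) return ((Double) before.get(before.lastKey())).doubleValue();
        int f0 = ((Integer) before.lastKey()).intValue();
        int f1 = ((Integer) after.firstKey()).intValue();
        double v0 = ((Double) before.get(before.lastKey())).doubleValue();
        double v1 = ((Double) after.get(after.firstKey())).doubleValue();
        double t = (frame - f0) / (double) (f1 - f0);
        return v0 + t * (v1 - v0); // linear; could swap in other easing functions
    }
}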
Another development was the discussion of VML, a proposed video markup language to supplant MPEG-7 and ViPER's data format. (I wouldn't miss our format. I would have to rewrite the API, but I'd welcome a better, more robust data model.)
- posted by David @ 1:56 PM
Thursday, November 06, 2003
HOWTO: Create a Metadata File With ViPER 4
With the new ViPER v4 config editor, it is possible to create a new metadata schema and instance data without having to edit a text file. This how-to describes how to create a new schema, save it, and then add some data to the file. Note that the current config editor doesn't work on files that already contain instance data: it will fail if you edit existing parts of the schema, although you can still add new descriptor types.
Editing the Schema
For a full install, the command line string viper-cfg loads the config editor (make sure you have run source viper.config if you are in a C-shell or . viper-cfg.sh if you are in a Bourne shell variant). In the viper-lite installation, you can double-click on the viper-cfg.jar file in your file browser or run the command java -jar viper-cfg.jar from the install directory at the command line.
The program starts with an empty schema loaded. From the Edit menu, select Create New Descriptor Type. This will create a generic OBJECT descriptor. (An Object descriptor is useful for describing things in a video that may appear multiple times on a single frame, or that can change in value over the course of the video.) Click on the OBJECT Desc0 node in the tree. This will bring up the properties list for the new descriptor type, and allow you to use the Edit >> Create New Attribute Field menu item to create attributes for the descriptor. To change the name of the descriptor from Desc0, click in the Name value field (it says Desc0), change the name to Face, and hit Enter.
If you make a mistake, you can bring up the undo history from the Window menu. The undo window will let you undo/redo actions by double-clicking on them.
Next, create an attribute by clicking on the Create New Attribute Field menu item. This generates a new attribute for the currently selected descriptor type. You can edit the attribute properties (Name, Data Type, Dynamic and Default) by clicking on them. Note that Dynamic can only be altered for Object descriptors; File and Content descriptors only allow static attributes. Change the name of the new attribute from Attr0 to Location, the Data Type to obox, and Dynamic to true. Note that you cannot change a descriptor with dynamic attributes to a Content or File type, either; if you wish to do so, first go through its attributes, setting the Dynamic property to false.
Next, save the file. This is possible through the File menu's Save item. Give the file a name you will remember, like "Sample1.xml". You may now exit the configuration editor.
Describing a Video
Now that you have a metadata schema, you can use it to mark up a video file. Right now, ViPER only supports MPEG-1 motion picture encoding (and its own .info method of using a list of still images as a video). Get an MPEG file you already have, grab one from archive.org, or, better yet, download the sample MPEG from the ViPER website.
Start viper-gt, either by running viper-gt from the command line or by clicking on viper-gt.jar. If it isn't working, follow the advice described above for starting the viper-cfg program.
Open the file you created in the Config Editor. (Use the Open item in the File menu.) This will open the file with no data displayed. In order to add metadata, you must first add a media file to describe.
Open the Source Files window by clicking on the Show Media Files item in the Window menu. This window displays all media files the loaded metadata file describes; it is currently empty. To add a file, click the Add New File button. Select the MPEG-1 video you downloaded. Click on its name in the list to select it. This will load the video into the program, and let you edit the metadata associated with it.
To create a new instance of the Face descriptor, click on the Create button beneath the Face tab on the main window. This will create a new Face instance that is valid only on the current frame and whose attributes are set to their default values. To edit the values, you can double-click on them in the table. If the attribute has a visual representation, like a bounding box, you can single-click the attribute to select it, then use the canvas (the image of the current video frame) to draw or edit the value.
- posted by David @ 1:34 PM
Monday, November 03, 2003
Meeting Notes for Monday, November 3, 2003
In today's meeting, Jian Liang presented some information about the current state of the art in camera-based document analysis, including some of the difficulties and current solutions. He didn't have enough time to present his own work. Dave reminded us that we need to see these presentations as an opportunity to share ideas and request insight from the whole team. Next week, I'll be presenting a demo of ViPER v4. I hope it is in good enough shape to demo; there are still some major features missing.