
Blog for work on my Master's thesis, a survey of methods for evaluating media understanding, object detection, and pattern matching algorithms. Mostly, it is related to ViPER, the Video Performance Evaluation Resource. If you find a good reference, or would like to comment, e-mail viper at cfar.umd.edu.


Media Processing Evaluation Weblog

Monday, December 29, 2003

Sequences

I'm giving up on my plan for subaddressing and combining metadata sequences for now, since it raises too many issues with the current architecture, and am just focusing on the simpler solution of treating each sequence as a separate video. This loses some of the benefits to the user (they won't be able to open just a section of the video and edit that), but it will be easier to design an interface for.

Friday, December 26, 2003

E-Mail from Jonathan about his MPEG-1 Decoder

Recently, I received a message at viper-bugs asking what the limitations of the decoder are. As far as I know, there are none, so I forwarded the message to Jonathan, who actually wrote it, and asked for his thoughts.


The one real limitation that springs to mind is that the decoder isn't designed to handle GOPs with more than 1024 pictures (i.e. GOPs where the 10-bit picture sequence number wraps). That probably wouldn't be too terrible to fix.

It is likely that D-type pictures are not handled 100%, since no provision is made for their display, but I don’t know of any encoder that uses them.

As far as resolution goes, I don't think there should be any limitations. I would have thought that VBR files would work, since I don't use bitrate information for anything. In fact, I believe some of the test files were VBR. I'm pretty sure it has been tested with both CBR and VBR.

Both primary and encapsulated streams should work. Maybe they're trying an MPEG-2 file? Those appear to decode, but produce largely empty frames.

Besides wrapping sequence numbers (it's unusual to have more than 1024 pictures in a GOP, though) and D-pictures (also unusual), neither of which I have ever come across, any strictly valid stream should work, even ones with customized quantizer matrices and such; some streams with bugs can be at least partially displayed. Every weird stream I could find worked. You might need to get a copy of one of their non-working streams and analyze it a bit to see what the problem is.

– Jonathan

p.s. I think it might be possible to turn JMPEG into a component for JMF, but it would probably take some work.
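
Following up on Jonathan's postscript: as I understand it, turning JMPEG into a JMF component would mean wrapping the decoder behind JMF's codec plug-in interface and registering it with the PlugInManager. Below is a rough, untested sketch of that shell. The JmpegCodec class and its decode hand-off are hypothetical placeholders, not anything in JMPEG; only the javax.media interfaces and the registration call come from JMF itself.

    import javax.media.Buffer;
    import javax.media.Codec;
    import javax.media.Format;
    import javax.media.PlugInManager;
    import javax.media.ResourceUnavailableException;
    import javax.media.format.RGBFormat;
    import javax.media.format.VideoFormat;

    // Hypothetical shell, not part of JMPEG or JMF itself.
    public class JmpegCodec implements Codec {
        public String getName() { return "JMPEG MPEG-1 decoder"; }

        public Format[] getSupportedInputFormats() {
            return new Format[] { new VideoFormat(VideoFormat.MPEG) };
        }

        public Format[] getSupportedOutputFormats(Format input) {
            return new Format[] { new RGBFormat() };
        }

        public Format setInputFormat(Format f)  { return f; }
        public Format setOutputFormat(Format f) { return f; }

        public void open() throws ResourceUnavailableException { /* set up decoder state */ }
        public void close() { }
        public void reset() { }

        public Object[] getControls() { return new Object[0]; }
        public Object getControl(String type) { return null; }

        public int process(Buffer in, Buffer out) {
            // Hand the compressed data to the existing decoder and copy the
            // decoded frame into the output buffer (omitted here).
            return BUFFER_PROCESSED_OK;
        }

        // Tell JMF about the codec so Players can pick it up.
        public static void register() {
            PlugInManager.addPlugIn(JmpegCodec.class.getName(),
                    new Format[] { new VideoFormat(VideoFormat.MPEG) },
                    new Format[] { new RGBFormat() },
                    PlugInManager.CODEC);
        }
    }

The real work would be wiring Jonathan's decoder into process() and describing the actual output format, which this sketch glosses over.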

Friday, December 19, 2003

PhotoStuff

PhotoStuff is a MIND project for annotating images with semantic data.

Thursday, December 18, 2003

Quicktime for Java

Am I the only person who wants to program in QT4J? There is only one book I can find on the subject; it doesn't come out until February and looks to be a regurgitation of the emaciated spec.

Perhaps the best idea is to read up on the native QuickTime APIs, since the Java stuff is just a wrapper. For example, there is a good hint on Apple's dev site about how to get the frame rate of an MPEG or QuickTime movie that looks as though it might be translatable into QT4J.
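
If it does translate, I would expect it to come out something like the sketch below. This is untested; the file name is a placeholder, and the calculation is just the usual sample-count-over-duration estimate, which may not be the whole story for MPEG.

    import quicktime.QTSession;
    import quicktime.io.OpenMovieFile;
    import quicktime.io.QTFile;
    import quicktime.std.StdQTConstants;
    import quicktime.std.movies.Movie;
    import quicktime.std.movies.Track;
    import quicktime.std.movies.media.Media;

    // Untested sketch; assumes the movie has at least one video track
    // whose media reports one sample per frame.
    public class FrameRateSketch {
        public static void main(String[] args) throws Exception {
            QTSession.open();
            try {
                Movie movie = Movie.fromFile(
                        OpenMovieFile.asRead(new QTFile("example.mov")));

                // First video track in the movie.
                Track video = movie.getIndTrackType(1,
                        StdQTConstants.videoMediaType,
                        StdQTConstants.movieTrackMediaType);
                Media media = video.getMedia();

                // frames per second ~= samples / (duration / time scale)
                double seconds = (double) media.getDuration() / media.getTimeScale();
                System.out.println("approximate frame rate: "
                        + media.getSampleCount() / seconds);
            } finally {
                QTSession.close();
            }
        }
    }

I suspect MPEG media may report a single sample rather than one per frame, in which case the MPEG-specific part of Apple's hint would still need translating.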

Monday, December 15, 2003

Sequences of Media Files

We have had a request to load files by directory. This leads back to the old request to support sequences of media files (a request from Malach, I believe). My basic plan is to support subaddressing of a media file (only open frames 12-15, say) and sequences at the same time. This raises the problem that you end up with a sequence DAG. That could be avoided by making it illegal to include a sequence as an element of another sequence, but that wouldn't be sporting.
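
To make that concrete, here is a quick sketch of the node structure I have in mind; the names are illustrative only and not part of ViPER. Each node is either a single media file or a sequence of children, and appending a child refuses anything that would create a cycle, so the structure stays a DAG.

    import java.util.ArrayList;
    import java.util.List;

    // Illustrative only; not part of the current ViPER API.
    class MediaNode {
        final String uri;                       // media file URI, or null for a pure sequence
        final List children = new ArrayList();  // child MediaNodes, in playback order

        MediaNode(String uri) { this.uri = uri; }

        // Add a child clip or sub-sequence, rejecting anything that would form a cycle.
        void append(MediaNode child) {
            if (child == this || child.contains(this)) {
                throw new IllegalArgumentException("would create a cycle");
            }
            children.add(child);                // duplicates are fine: it's a DAG, not a tree
        }

        private boolean contains(MediaNode other) {
            for (int i = 0; i < children.size(); i++) {
                MediaNode c = (MediaNode) children.get(i);
                if (c == other || c.contains(other)) return true;
            }
            return false;
        }
    }

Forbidding sequence-in-sequence would then just mean rejecting children that have children of their own, and subaddressing could hang a framespan off each child entry.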

The media-files window would be replaced with a tree view showing a forest of all DAG nodes without parents. There may be duplicate nodes in the tree, but it will be finite. The drop-down at the top of the screen (the URL bar, sort of) would display those top-level nodes, or the siblings of the current one in the selected context.

These could be represented in the XML file with a new, first-level sequences element. Although it would be possible to let a sourcefile then refer to a sequence element or to a media file directly (by adding a 'sourcesequence' attribute to the sourcefile element), I will instead only allow individual files (the atoms) to be ground truthed, and alter the API to treat sequences appropriately. This brings back the problem of descriptor scope: a descriptor ID is unique for a descriptor type within a single sourcefile, but there is no method for saying, for example, that Jeff walks out of this video clip and into another one. This is exactly the sort of thing that sequences will have to express. (It may be useful to express concurrent videos as well.)

