Blog for work on my Master's thesis - a survey of methods for evaluating media understanding, object detection, and pattern matching algorithms. Mostly, it is related to ViPER, the Video Performance Evaluation Resource. If you find a good reference, or would like to comment, e-mail viper at cfar.umd.edu.
Archives
Media Processing Evaluation Weblog
Monday, September 29, 2003
Editable Table
The table finally allows editing in v4. One more step towards parity with v3.6?
- posted by David @ 12:56 PM
Meeting Notes: Monday, September 29, 2003
Today, Stefan Jaeger presented some of his work on handwriting recognition, as well as a brief history of the field (from c. 7000 B.C. onward).
- posted by David @ 12:50 PM
Wednesday, September 17, 2003
Java Vision Toolkit
A new owner seems to be developing a fork of Mark Powell's Java Vision Toolkit. The new JVT fork, developed by Joe Carter, is being hosted on SourceForge.
- posted by David @ 12:08 PM
Tuesday, September 16, 2003
Some Functional Specifications for GTv4
This is intended to give not only an overview of the functionality required of viper-gt, but also to give an idea of the current state of v4.
Within
Old
So far, we have implemented much of the previously existing functionality in the new version. This includes the standard two views: the table and the canvas. While both are still deficient compared to the original implementations, they provide enough functionality to browse an existing data file and perform a few simple operations (delete descriptors, modify strings). The canvas is still frame-accurate, and most of the editing is still frame-oriented. The system can load and save viper+xml files.
New
So far, we have focused on implementing new functionality.
- Unlimited Undo/Redo
- The Ontology Editor
- Chronicle View
- Most Recently Used File List
- Better hotkey support
- Piccolo canvas
However, there is still new stuff we would like to integrate before
finishing.
- Remote control/timeline integration
- Editable chronicle
Without
Intentional
- Open & Save as old data format – use the gtf2xml script if necessary
- P/V Checkboxes in table view – Should be replaced by other visual indication and checkboxes
- Range Slider – should be replaced by remote and chronicle view
- Drawing Toolbox – should be replaced with direct manipulation controls
Unintentional
- SNiPER View, Big Frame – The canvas currently isn’t being cloned
- Method for setting invalid/valid
- Direct editing of table entries
- Method for selecting descriptors/attributes to propagate
- Buttons for adding/deleting/duplicating descriptors
- Hide invalid in table
- Show/hide descriptors in canvas
- Direct editing of shapes
- Configure colors of shapes in canvas
- View source
- Dump images / Create movie
Future
- Ontology editor integration – would require improvements to undo/redo, etc.
- Loading multiple data files / displaying multiple source files
- Drag-and-Drop
- Property Sheet View
- Hierarchies
- Relations
- Hotkey Manager
- Dockable panes
- posted by David @ 9:45 AM
Task 2: Direct Manipulation of Boxes
This task will be broken into two major stages: developing the box editing control, and then integrating the control into ViPER-GT v4. The first stage will further be broken into several sub-parts. The first step is to get a list of boxes and render them to a Piccolo canvas. The second stage is to support user selection, followed by receiving modification events and redrawing. The last stage before integration, and likely the most complex, is to add direct manipulation controls to the boxes when selected and allow direct manipulation.
The main idea of arranging the tasks like this is to give an understanding of how piccolo works, and to develop a methodology that can be used for all the drawing types (points, oboxes, polygons). You might want to keep in mind how you would do this if you were dealing with the other data types, or, more appropriately, an extensible mix of data types.
Step 0: The Basics
First, you have to construct a list of boxes. I would recommend using a Map; this way, you can keep track of the boxes by a key object (probably an Integer or a String). More accurately, I'd recommend you use something like this:
    import java.util.*;
    import javax.swing.event.*;

    public class EventfulHashMap extends HashMap {
        // Listeners to notify whenever the map's contents change
        private Set listeners = new HashSet();

        public EventfulHashMap(int initialCapacity, float loadFactor) {
            super(initialCapacity, loadFactor);
        }

        public EventfulHashMap(int initialCapacity) {
            super(initialCapacity);
        }

        public EventfulHashMap() {
            super();
        }

        public EventfulHashMap(Map m) {
            super(m);
        }

        public void addChangeListener(ChangeListener cl) {
            listeners.add(cl);
        }

        public void removeChangeListener(ChangeListener cl) {
            listeners.remove(cl);
        }

        public void clear() {
            if (!isEmpty()) {
                super.clear();
                fireChangeEvent();
            }
        }

        public Object put(Object key, Object value) {
            Object old = super.put(key, value);
            // Only notify if the stored value actually changed
            boolean same = (null == old) ? (null == value) : (old.equals(value));
            if (!same) {
                fireChangeEvent();
            }
            return old;
        }

        public void putAll(Map m) {
            super.putAll(m);
            fireChangeEvent();
        }

        public Object remove(Object key) {
            Object removed = super.remove(key);
            if (null != removed) {
                fireChangeEvent();
            }
            return removed;
        }

        private void fireChangeEvent() {
            if (!listeners.isEmpty()) {
                ChangeEvent e = new ChangeEvent(this);
                for (Iterator iter = listeners.iterator(); iter.hasNext(); ) {
                    ChangeListener curr = (ChangeListener) iter.next();
                    curr.stateChanged(e);
                }
            }
        }
    }
This will allow you to keep track of changes, and notice if a box is changed through another means.
Step 1: Rendering the Boxes
The next step is to render the boxes to a piccolo canvas.
Step 2: User Selection
This can be accomplished by registering a piccolo listener on the canvas that updates a list of currently selected boxes by key. Repeated clicks on the same spot should probably cycle through the boxes under the cursor. Do whatever PowerPoint does.
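The click-to-cycle behavior can be sketched without touching piccolo at all. In this minimal sketch, `SelectionCycler` is a hypothetical helper (not part of ViPER or piccolo), and it assumes the canvas listener can already produce the keys of all boxes under the click, ordered front to back:

```java
import java.util.List;

// Hypothetical helper: decides which box a click should select.
class SelectionCycler {
    // hits: keys of the boxes under the click, front to back.
    // current: the key selected before the click, or null if none.
    // Returns the key to select next, or null if nothing was hit.
    static <K> K next(List<K> hits, K current) {
        if (hits.isEmpty()) {
            return null;
        }
        // If the current selection is not under the cursor, start at the front.
        int index = (current == null) ? -1 : hits.indexOf(current);
        return hits.get((index + 1) % hits.size());
    }
}
```

Clicking repeatedly on a stack of overlapping boxes then walks through them one at a time and wraps around to the front-most box.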
Step 3: Displaying Changes
This is just a question of responding to change events from the box storage. You might want to modify the above code to use your own Event object, instead of the simpler ChangeEvent, so you can get some idea of what state change occurred without having to iterate through the Map.
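One way such an event object could look is sketched here. The names `BoxMapEvent`, `BoxMapListener`, and `NotifyingBoxMap` are made up for illustration, and the sketch uses generics from a newer Java than the 1.4 this project targets. The only point is that the event carries the key along with the old and new values, so a listener can redraw a single box:

```java
import java.util.ArrayList;
import java.util.EventObject;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical event: describes one change to the box storage.
class BoxMapEvent extends EventObject {
    private final Object key;
    private final Object oldValue; // null if the key was just added
    private final Object newValue; // null if the key was removed

    BoxMapEvent(Object source, Object key, Object oldValue, Object newValue) {
        super(source);
        this.key = key;
        this.oldValue = oldValue;
        this.newValue = newValue;
    }

    Object getKey() { return key; }
    Object getOldValue() { return oldValue; }
    Object getNewValue() { return newValue; }
}

// Hypothetical listener, playing the role ChangeListener plays above.
interface BoxMapListener {
    void boxChanged(BoxMapEvent e);
}

// Minimal storage that reports which entry changed.
class NotifyingBoxMap {
    private final Map<Object, Object> boxes = new HashMap<>();
    private final List<BoxMapListener> listeners = new ArrayList<>();

    void addBoxMapListener(BoxMapListener l) { listeners.add(l); }

    Object put(Object key, Object value) {
        Object old = boxes.put(key, value);
        boolean same = (old == null) ? (value == null) : old.equals(value);
        if (!same) {
            fire(new BoxMapEvent(this, key, old, value));
        }
        return old;
    }

    Object remove(Object key) {
        Object old = boxes.remove(key);
        if (old != null) {
            fire(new BoxMapEvent(this, key, old, null));
        }
        return old;
    }

    private void fire(BoxMapEvent e) {
        // Iterate over a copy so listeners may unregister while being notified.
        for (BoxMapListener l : new ArrayList<>(listeners)) {
            l.boxChanged(e);
        }
    }
}
```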
Step 4: Direct Manipulation
This involves drawing handles around the box edges and allowing the user to click-and-drag the box.
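The geometry of the handles can be worked out before any piccolo code is written. This is a rough sketch, assuming axis-aligned boxes and eight handles (four corners, four edge midpoints); `BoxHandles` and its method names are hypothetical:

```java
import java.awt.geom.Point2D;
import java.awt.geom.Rectangle2D;

// Hypothetical helper for placing and hit-testing resize handles.
class BoxHandles {
    // Returns the eight handle centers, clockwise from the top-left corner.
    static Point2D[] handlePoints(Rectangle2D box) {
        double x0 = box.getMinX(), x1 = box.getCenterX(), x2 = box.getMaxX();
        double y0 = box.getMinY(), y1 = box.getCenterY(), y2 = box.getMaxY();
        return new Point2D[] {
            new Point2D.Double(x0, y0), new Point2D.Double(x1, y0),
            new Point2D.Double(x2, y0), new Point2D.Double(x2, y1),
            new Point2D.Double(x2, y2), new Point2D.Double(x1, y2),
            new Point2D.Double(x0, y2), new Point2D.Double(x0, y1)
        };
    }

    // Returns the index of the first handle within `radius` of the point,
    // or -1 if the point is not on a handle (e.g. it is inside the box,
    // which would start a move instead of a resize).
    static int handleAt(Rectangle2D box, Point2D p, double radius) {
        Point2D[] handles = handlePoints(box);
        for (int i = 0; i < handles.length; i++) {
            if (handles[i].distance(p) <= radius) {
                return i;
            }
        }
        return -1;
    }
}
```

The same split (compute handle positions, then hit-test) should carry over to oriented boxes and polygons, with one handle per vertex.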
Step 5: Integration
The last step is getting it to display ViPER data and integrating the class you've been working on to the current canvas system.
Repeat As Necessary
For points, oriented boxes, circles, and polygons.
- posted by David @ 9:44 AM
Wednesday, September 10, 2003
Task 1: Moving boxes
Okay, so far we have written a Movable interface that lets the attributes move around - or, at least, create instances of objects that represent an offset applied to an old Movable. So, in our MoveActionListener's actionPerformed method, we have something to do. We still have to determine what to do it to and how to move the Movable.
First, you need to get the viper.api.Attribute that is currently selected. Right now, there is no selection model, so I'd suggest that you add a private Attribute selectedAttribute; to the ViperViewMediator. Then right-click on the field definition and select the Source->Generate Getter/Setter menu item. This pulls up a dialog for creating get/set methods. Make sure that getSelectedAttribute and setSelectedAttribute are the only ones checked, and press OK.
We can worry about adding change listeners later. (How will this be set? For now, I'd recommend having a click on a cell in the table view set it, but don't worry about that until you get to the point where you'd like to test your listener.)
So, the first step in your ActionListener's actionPerformed method is to get the attribute. Then, you need to get the value. For static attributes (instances of viper.api.Attribute whose isDynamic() method returns false), you may use the getAttrValue method. For dynamic attributes, you'll have to use the getAttrValueAtInstant(Instant) method, using the Instant from the getMajorMoment() method of the mediator. (To determine if it is static or dynamic, use the isDynamic method on the attribute's AttrConfig object.) You can then test the value for implementation of Movable, and then cast it to Movable or break, as appropriate.
Next, you should parse the command string from the ActionEvent. This will involve using a StringTokenizer to get a string (N, NE, E, SE, etc) and a distance (in pixels). These can be passed to Movable.move(int,int). The result can then be saved back to the attribute using setAttrValue or setAttrValueAtSpan(value, new Span(now, now.next())) as appropriate.
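Parsing the command string can be sketched on its own. `MoveCommand` is a hypothetical helper, and it assumes screen coordinates (x grows to the east, y grows to the south) and that a diagonal such as NE moves the given distance along both axes; adjust if Movable.move expects something else:

```java
import java.util.StringTokenizer;

// Hypothetical helper: turns a command such as "N 1" into a pixel offset.
class MoveCommand {
    final int dx;
    final int dy;

    private MoveCommand(int dx, int dy) {
        this.dx = dx;
        this.dy = dy;
    }

    static MoveCommand parse(String command) {
        StringTokenizer st = new StringTokenizer(command);
        if (st.countTokens() != 2) {
            throw new IllegalArgumentException(
                    "Expected '<direction> <distance>', got: " + command);
        }
        String direction = st.nextToken().toUpperCase();
        int distance = Integer.parseInt(st.nextToken());
        int x;
        int y;
        switch (direction) {
            case "N":  x = 0;  y = -1; break;
            case "NE": x = 1;  y = -1; break;
            case "E":  x = 1;  y = 0;  break;
            case "SE": x = 1;  y = 1;  break;
            case "S":  x = 0;  y = 1;  break;
            case "SW": x = -1; y = 1;  break;
            case "W":  x = -1; y = 0;  break;
            case "NW": x = -1; y = -1; break;
            default:
                throw new IllegalArgumentException("Unknown direction: " + direction);
        }
        return new MoveCommand(x * distance, y * distance);
    }
}
```

Inside actionPerformed, this would be used roughly as `MoveCommand c = MoveCommand.parse(e.getActionCommand());` followed by a call to move with `c.dx` and `c.dy`.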
Okay, so how do you use the action listener? One method would be to make a KeyListener and attach it to the canvas, but this would only work while the canvas has focus. So you should really set up an ActionMap, InputMap thing, as described in the 'How do I set up keybindings' tutorial on java.sun.com. But that is annoying, so I made a script that can do that using the preferences file, assuming you set up the preferences as I showed you earlier today.
This involves adding two things to the preferences model. First, you should add an ActionListener description. I put them all under the Actions comment in the preferences file, so they stay together. (Order of statements is unimportant to n3.) This would be of the form:
:moveAttributeAction a lal:Action ; lal:sendsTo [ lal:listenerBean :mediator ; lal:listenerType "MoveActionListener" ] .
where MoveActionListener is what you called the listener (i.e. :mediator, here an instance of ViperViewMediator, has a method called getMoveActionListener that returns an ActionListener). Then, to the lal:Core (or to the :canvas, it doesn't matter, since this is a window hotkey, and will apply when the focus is on any control in the window) add the following lines:
lal:windowInputAction [ lal:hotkey "up" ; lal:actionCommand "N 1" ; lal:hasAction :moveAttributeAction ] ;
This will add the action to the lal:Core bean, which is the JFrame that holds the others. This means that it won't work when the focus is on another JFrame (e.g. the VideoRemote). I'll probably come up with a method for allowing you to specify 'whole application' hotkeys, something java doesn't have right now. I'll also probably move the video remote to a panel on the main frame, so it shouldn't be an issue for that one, and I don't see why you'd want to move the box while the focus is on either of the other two popups (undo history and sourcefile selection), so it might not be an issue at all.
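For comparison, here is roughly what a window-level hotkey looks like when wired up by hand with Swing's InputMap and ActionMap. This is a sketch of the plain Swing approach, not of what the preferences script does internally; `HotkeyBinder` and the `"hotkey:"` naming are hypothetical. Note that KeyStroke.getKeyStroke(String) expects the key name in upper case ("UP"):

```java
import java.awt.event.ActionEvent;
import java.awt.event.ActionListener;
import javax.swing.AbstractAction;
import javax.swing.JComponent;
import javax.swing.KeyStroke;

// Hypothetical helper: binds a keystroke to an ActionListener by hand.
class HotkeyBinder {
    // While focus is anywhere in the component's window, pressing the
    // keystroke sends the given action command to the listener.
    static void bind(JComponent component, String keyStroke,
                     final String actionCommand, final ActionListener listener) {
        KeyStroke ks = KeyStroke.getKeyStroke(keyStroke);
        if (ks == null) {
            throw new IllegalArgumentException("Unparsable keystroke: " + keyStroke);
        }
        String name = "hotkey:" + actionCommand;
        // The InputMap maps the keystroke to a name...
        component.getInputMap(JComponent.WHEN_IN_FOCUSED_WINDOW).put(ks, name);
        // ...and the ActionMap maps that name to the action to run.
        component.getActionMap().put(name, new AbstractAction(name) {
            @Override
            public void actionPerformed(ActionEvent e) {
                // Forward with our own command string, e.g. "N 1".
                listener.actionPerformed(
                        new ActionEvent(e.getSource(), e.getID(), actionCommand));
            }
        });
    }
}
```

A call such as `HotkeyBinder.bind(canvas, "UP", "N 1", mediator.getMoveActionListener())` would then correspond to the lal:windowInputAction entry above, assuming `canvas` is a JComponent.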
See Also
- Tutorial on keybinding
- Note on how to set up your environment
- Information on how the AppLoader works:
- Notes on the AppLoader (listed as AppLoader spec, but is pretty far from that)
- RDF Schema for some of the preferences (not up to date)
- Using N3 to list RDF triples (Note: this doesn't mention literal typing ("12"^^xsd:int) or string languages ("huis clos"@fr))
- Jena - RDF API for Java
- posted by David @ 6:04 PM
Notes on Developing Using Eclipse
How to edit
First, open eclipse. Depending on the file type, eclipse has different methods for getting at them and for editing them. It will often try to open files in their native editor when you double-click them (from the resource view, say). You can open them in the eclipse editor by right-clicking and getting to 'Open With...'. This won't be a problem with .java files, which it already knows to open in its java editor (different from the text editor, this editor knows about the semantics of the code and can offer additional help for java files). Files may also be edited directly in the comparison view (described below), but I'd recommend against that, except for handling merges.
Java files may also be accessed through the java views (package explorer, type list, java resource view) that are accessed through the java and java browsing perspectives. This way, you can click directly on method names or class names and leap directly to their declarations. You can also navigate through source files by ctrl-clicking on items to go to their declaration, or selecting and right-clicking to search for references to the item.
Keys | Action |
---|---|
C-x | cut |
C-c | copy |
C-v | paste |
C-z | undo |
C-y | redo |
Shift-arrow | select |
Editing itself is like any other windows text editor: typing text inserts text at the cursor (unless you press the insert key, which toggles between insert and overwrite mode) and you may select text with the mouse (click+drag) or by holding down shift while moving the cursor with the arrow keys.
How to Save
The easiest way is ctrl-s. You can also click on the save icon (the disk in the toolbar) or use the file menu's save item. Open files with changes that haven't been saved will have an asterisk in their title bar/tab thing. Eclipse will also prompt you to save before exiting, running code, synchronizing with the repository, or a few other actions.
How to Commit to the Repository
The best way to commit changes to the repository is to use the team->synchronize with repository menu option. The team menu is available on files, directories, and even java packages and classes. 'Update' means you want to get the latest version of the files from the repository, and 'Commit' will try to apply your changes. 'Synchronize' will give you a list of all differences, and let you selectively update and commit files, while giving you a file-comparison view for each changed file. You can select incoming mode (display only possible updates), outgoing mode (display things you've changed), or bidirectional (both), and you can also display only conflicts (files that have changed in both locations, and may require handholding to perform a merge). Before committing, you should try to get all relevant updates and make sure your code still works as planned. Again, you can perform commit/update actions by right-clicking on the items in the 'structure compare' tree, and double-clicking brings up the content comparison view.
When you commit a change, be sure to give it a descriptive label. You probably shouldn't say what can be determined with a diff, but you should say enough so that when a developer gets the 'Team->Show in Resource History' view, they have an idea of what to look for. This means: make the first line informative, and put details in the other lines.
As to when to commit, you should commit as soon as the changes compile without adverse, far-reaching effects. Some new bugs are acceptable; preventing the project from running is not. You should commit co-dependent changes all at once, to limit the amount of time that the codebase (the HEAD branch, the only branch we're using) is incoherent / will not work. You can also re-use the log message across a single commit that is logically the same change.
Advice on What to Look for When Writing Code
One of the more interesting features of the java editor is the automatic warning and error messages that show up, as well as the notes about unfinished tasks. Errors are often underlined, and hovering over them will give the error message. Errors and warnings get little icons on the left side, which can be hovered as well; a light bulb indicates there is an auto-fix available (single-click opens a list of suggested fixes, which may or may not contain the real fix; double-clicking adds a breakpoint). On the right side is a similar list, but it won't have the quick-fixes. It gives a summary of messages through the whole file; this lets you skip to places in the file that have errors directly.
Errors will also appear in the 'Tasks' view. This is a tabular view of all errors, warnings, and 'TODO' comments in the project. You can add filters to remove some of these, or set up your own task keywords to add more (FIXME, or FIXME-clin, for example). This list can grow quite long, especially if you import other projects (piccolo, jena) from cvs to let you debug into them or get the javadoc displayed properly.
How to find the methods of certain classes?
As mentioned above, ctrl-click on a use will go to the definition, and the search feature lets you look through the whole workspace for the declaration of a method or field, or for the places where it is used.
How to run the thing after compiling
As eclipse features incremental compilation, you will rarely have to build a project. You have to be in one of the java views or perspectives (or debug view) to get the run menu to display anything useful. This project also requires 1.4 compatibility (set in the window->preferences under the java menu somewhere). The easiest way is to open the file and select Run As->Java Application. This will automatically add an entry to the Run... dialog. There, you can edit the parameters to the main method and to the launching jvm. (Useful parameters to the jvm include -Xmx512m to increase the heap size and -ea to enable assertions. To pass a property, use -DpropName=value.)
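A quick way to confirm that the jvm parameters from the Run... dialog actually reached the program is a small check class like this one. It is only a sketch; `LaunchCheck` and the property name `propName` are placeholders:

```java
// Hypothetical check class for the launch configuration.
class LaunchCheck {
    // Returns the value passed as -Dname=value, or the fallback if unset.
    static String property(String name, String fallback) {
        return System.getProperty(name, fallback);
    }

    // True only when the jvm was started with -ea.
    static boolean assertionsEnabled() {
        boolean enabled = false;
        // The assignment inside the assert runs only if assertions are on.
        assert enabled = true;
        return enabled;
    }

    public static void main(String[] args) {
        long maxHeapMb = Runtime.getRuntime().maxMemory() / (1024 * 1024);
        System.out.println("max heap (MB): " + maxHeapMb);
        System.out.println("assertions enabled: " + assertionsEnabled());
        System.out.println("propName = " + property("propName", "(not set)"));
    }
}
```

Running it with `-Xmx512m -ea -DpropName=value` as VM arguments should report a heap close to 512 MB, assertions enabled, and the property value.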
If you are having trouble with compiling, you might want to save open files, refresh the project (right-click the project node in the projects view and select 'refresh'), and/or rebuild all (under the Projects menu).
Using junit
You can run junit tests like an application, assuming you are using the test(testname) naming convention, by using run as->JUnit Test.
Using javadoc
Under the Project menu, select 'Generate Javadoc'.
- posted by David @ 2:07 PM
Using sf.net to Handle Development Processes
Not long ago, I started going through the viper bug and rfe lists on sf, cleaning them up. These are pretty useful for keeping track of what we are working on, and what we are putting off. My goal is to make sure that everything I check in has a bug or rfe associated with it. Some revision control systems, like ClearCase, automatically associate log messages with bug comments. All part of Rational's goal of making software development more like flexible manufacturing.
So, all the developers (right now, just Charles and I), have permission to edit any of the bugs and rfes. The current goal is to get out version 4 before the end of october, so the short-term goal is getting out 4.0 alpha as soon as possible. To do this, we need to triage bugs (and add missing ones) into 4.0a, 4.0b, 4.0, and future. 4.0a bugs are things we need to be feature complete and 4.0b are bugs that must be fixed for ViPER to be usable without too much hassle. 4.0 bugs are little ones that we can put off to the last day, at least until we have 4.0b out, and should be very low severity.
Developers should assign their names to the bugs before working on them. (Unless another is assigned, then they should contact the developer and discuss the approach to it). The assignments (mihalcid v. charles_lin) are useful as locks on that part of the code. I don't think you can assign one bug to multiple developers, but either way, it will help avoid duplication/possible merge conflicts.
All of this should keep us goal oriented and allow me to give decent presentations to dave every monday.
- posted by David @ 2:02 PM
Activity Evaluation Techniques & ViPER
I haven't done any evaluation of activity detection using viper yet. Most of the examples I have for ViPER involve things like text detection and person tracking. There is some support for activity detection analysis, and I would like to add more evaluation techniques. Allow me to give a description of what exists for evaluating activity detection algorithms within and without ViPER.
ViPER Tools for Activity Evaluation
Of the currently implemented techniques for evaluation, you might want to use:
- Matching segments
If each activity is marked up as a single event, this method searches for result activities that are similar in span and type. This gives you a list of events that were detected correctly using a given metric and threshold. If you vary the threshold, you can almost get viper to give you a curve, but not quite (it uses a bar graph). I've typed the results into excel to get better graphs, but I don't have any scripts to do that.
- Frame-level matching
Assuming that you are labeling frames, more so than detecting events, this can give you precision/recall for frames per activity type.
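Frame-level matching for a single activity type boils down to counting frames. Here is a minimal sketch, not ViPER's implementation: `FrameLevelScore` is a hypothetical class, each array holds one flag per frame, and precision and recall are reported as 1.0 when their denominator is zero (a convention chosen here):

```java
// Hypothetical scorer: one activity type, one boolean per frame.
class FrameLevelScore {
    final int truePositives;
    final int falsePositives;
    final int falseNegatives;

    FrameLevelScore(boolean[] truth, boolean[] detected) {
        if (truth.length != detected.length) {
            throw new IllegalArgumentException("Arrays must cover the same frames");
        }
        int tp = 0, fp = 0, fn = 0;
        for (int i = 0; i < truth.length; i++) {
            if (detected[i] && truth[i]) {
                tp++;          // activity present and detected
            } else if (detected[i]) {
                fp++;          // detected, but not in the ground truth
            } else if (truth[i]) {
                fn++;          // in the ground truth, but missed
            }
        }
        truePositives = tp;
        falsePositives = fp;
        falseNegatives = fn;
    }

    // Fraction of detected frames that are correct.
    double precision() {
        int detected = truePositives + falsePositives;
        return detected == 0 ? 1.0 : (double) truePositives / detected;
    }

    // Fraction of ground truth frames that were detected.
    double recall() {
        int actual = truePositives + falseNegatives;
        return actual == 0 ? 1.0 : (double) truePositives / actual;
    }
}
```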
All the current metrics assume that things must overlap temporally. If there is no temporal overlap between a segment ground truthed as an event and a detected event, then they can't be counted as a match. This is annoying when, like Ayesh, you would like to compare things (in his case, frames) that all belong to the same segment as if they were temporally coincident, or if you would like to have some sort of smoothing over time.
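To make the temporal overlap requirement concrete, here is a sketch of matching two segments by how much their frame spans overlap. The Dice coefficient is used as one plausible measure and is not necessarily the metric ViPER applies; `SpanOverlap` is hypothetical and spans are half-open, [start, end):

```java
// Hypothetical helper comparing two frame spans [start, end).
class SpanOverlap {
    // Number of frames the two spans share; 0 if they do not overlap.
    static int overlap(int aStart, int aEnd, int bStart, int bEnd) {
        return Math.max(0, Math.min(aEnd, bEnd) - Math.max(aStart, bStart));
    }

    // Dice coefficient: 2 * |A and B| / (|A| + |B|), between 0 and 1.
    static double dice(int aStart, int aEnd, int bStart, int bEnd) {
        int total = (aEnd - aStart) + (bEnd - bStart);
        if (total <= 0) {
            return 0.0;
        }
        return 2.0 * overlap(aStart, aEnd, bStart, bEnd) / total;
    }

    // Two segments with no shared frames never match, whatever the threshold.
    static boolean matches(int aStart, int aEnd, int bStart, int bEnd,
                           double threshold) {
        return overlap(aStart, aEnd, bStart, bEnd) > 0
                && dice(aStart, aEnd, bStart, bEnd) >= threshold;
    }
}
```

With any measure of this kind, a detection that falls just outside the ground truth span scores zero, which is exactly the limitation described above.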
Other Tools and Techniques for Activity Evaluation
There are several metrics and techniques not yet implemented that might be useful.
- ROC curves
This is esp. useful for when you are computing something along the lines of p(signal), which has a pretty direct mapping to event detection (events are signal, everything else is noise). There are some tricks you can do to write scripts to get viper to do this, but a tool that already implements this will likely be more appropriate while viper doesn't support it natively.
For example, to use jrocfit, you would have to first format your results in their format. This is a set of pairs /{0,1} {p(signal)}/. The ground truth would have to be converted to the format (list of true/false values, one per frame/segment, depending on how you are evaluating). I could help you with this. Then you would have to combine the two results, and then copy-and-paste them into the applet.
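Once the results are in the form of (label, p(signal)) pairs, an empirical ROC curve can also be computed directly by sweeping the decision threshold. This sketch does that; it is not what jrocfit does (jrocfit fits a model to the data), and `RocSketch` is a hypothetical name:

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper: empirical ROC points from labeled scores.
class RocSketch {
    // labels[i] is 1 for signal and 0 for noise; scores[i] is p(signal).
    // Each returned point is {falsePositiveRate, truePositiveRate}.
    static List<double[]> rocPoints(final int[] labels, final double[] scores) {
        if (labels.length != scores.length) {
            throw new IllegalArgumentException("Need one score per label");
        }
        int n = labels.length;
        int positives = 0;
        for (int label : labels) {
            positives += label;
        }
        int negatives = n - positives;
        if (positives == 0 || negatives == 0) {
            throw new IllegalArgumentException("Need both signal and noise items");
        }
        // Visit items from the highest score to the lowest.
        Integer[] order = new Integer[n];
        for (int i = 0; i < n; i++) {
            order[i] = i;
        }
        Arrays.sort(order, (a, b) -> Double.compare(scores[b], scores[a]));

        List<double[]> points = new ArrayList<>();
        points.add(new double[] { 0.0, 0.0 });
        int tp = 0, fp = 0;
        for (int i = 0; i < n; i++) {
            if (labels[order[i]] == 1) {
                tp++;
            } else {
                fp++;
            }
            // Emit a point only after the last item in a run of equal scores.
            boolean lastOfTie = (i == n - 1)
                    || scores[order[i]] != scores[order[i + 1]];
            if (lastOfTie) {
                points.add(new double[] {
                    (double) fp / negatives, (double) tp / positives });
            }
        }
        return points;
    }
}
```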
- BLEU variants
There are lots of other techniques, but I don't think I know of a really good way to do activity evaluation yet. I'm still looking.
- posted by David @ 1:40 PM
Monday, September 08, 2003
Meeting Notes for September 8, 2003
A post-doc from Japan, who worked on OCR, is coming. During this meeting, Dave gave a schedule for the upcoming meetings, including a Malach update next week. A presentation from Burcu (her machine translation paper was accepted at a conference) is scheduled for Wednesday at 11. He also mentioned that mounting network shares on the samba server was the cause of some of the hangs we have been seeing while logging into the windows boxes.
Today's presentation was by Gang Zi. Gary's presentation was titled: Groundtruth Transformation and Font Mapping. Dave mentioned that Gary's also looking at effects of compression on OCR. Gary started with an architectural overview of his OCR evaluation system, from getting real and synthetic ground truth to evaluation. There was a focus on the importance of getting ground truth for real degraded images (size changes, perspective distortion) automatically. He previously used four feature points (one at each corner), but still had y-shift. The features were circles instead of crosses; these have better properties: the area of ellipse is pi*A*B; if pixel count of connected component is equal to pi*major*minor, it is a feature point.
But what about logos, tables, and other non-text features? The system puts boxes around logos, and contains truth for the text inside tables, but not the table lines.
Future work included work to make location of feature points more robust and reliable. One idea was to make dots unique so he can register dots automatically under rotation. He also plans to study the statistics of print-copy-fax procedure. He also went through some of the difficulties in mapping glyphs to unicode characters.
- posted by David @ 5:47 PM
Meeting Notes: Monday, September 8, 2003
- posted by David @ 3:04 PM
Thursday, September 04, 2003
Notes to Developers on the Project
Charles has started formatting and posting the notes on developing that I've been sending him. I'll probably start posting them here.
- posted by David @ 8:07 PM
Tuesday, September 02, 2003
Java's KeyStroke Syntax Sucks
The JFC has a method for converting a String into a KeyStroke. It seems straightforward, but the toString method doesn't output a String that is parsable; instead, you must use this KeyStroke2String code. Even more annoying to me is that it uses the KeyEvent names for keys. This is logical enough, but the names themselves are horrible. Braces, brackets and parentheses each have a different naming convention; compare: BRACELEFT, OPEN_BRACKET and LEFT_PARENTHESIS. I assume this is some sort of inheritance from somewhere else, like a standards document with which I should be familiar, but it is annoying. At least the KeyEvent javadoc should specify all keys, like it does for PLUS.
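Here is a rough sketch of how a KeyStroke can be turned back into a String that KeyStroke.getKeyStroke(String) accepts. It is not the KeyStroke2String code mentioned above; `KeyStrokeText` is a hypothetical name. It looks up the KeyEvent constant name by reflection, and does not handle odd cases such as a typed space character:

```java
import java.awt.event.InputEvent;
import java.awt.event.KeyEvent;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import javax.swing.KeyStroke;

// Hypothetical helper: KeyStroke to parsable String.
class KeyStrokeText {
    // Finds the KeyEvent constant for a key code and strips the "VK_" prefix.
    static String keyName(int keyCode) {
        for (Field f : KeyEvent.class.getFields()) {
            if (f.getName().startsWith("VK_")
                    && Modifier.isStatic(f.getModifiers())
                    && f.getType() == int.class) {
                try {
                    if (f.getInt(null) == keyCode) {
                        return f.getName().substring(3);
                    }
                } catch (IllegalAccessException e) {
                    // The VK_ fields are public, so this should not happen.
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalArgumentException("No VK_ constant for key code " + keyCode);
    }

    // Builds "<modifiers> pressed|released <KEY>" or "<modifiers> typed <char>".
    static String toParsableString(KeyStroke ks) {
        StringBuilder sb = new StringBuilder();
        int mods = ks.getModifiers();
        if ((mods & InputEvent.SHIFT_DOWN_MASK) != 0) sb.append("shift ");
        if ((mods & InputEvent.CTRL_DOWN_MASK) != 0) sb.append("control ");
        if ((mods & InputEvent.META_DOWN_MASK) != 0) sb.append("meta ");
        if ((mods & InputEvent.ALT_DOWN_MASK) != 0) sb.append("alt ");
        if ((mods & InputEvent.ALT_GRAPH_DOWN_MASK) != 0) sb.append("altGraph ");
        if (ks.getKeyCode() == KeyEvent.VK_UNDEFINED) {
            // Key-typed strokes carry a character instead of a key code.
            sb.append("typed ").append(ks.getKeyChar());
        } else {
            sb.append(ks.isOnKeyRelease() ? "released " : "pressed ");
            sb.append(keyName(ks.getKeyCode()));
        }
        return sb.toString();
    }
}
```

Passing the result back through KeyStroke.getKeyStroke(String) should give an equal KeyStroke, which makes the pair usable for saving hotkeys to a preferences file.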