This code is unsupported and provided "as is". Use at your own risk.

It implements the dynamic models and the particle-filter tracker described in
this paper:

    http://visual.cs.ucl.ac.uk/pubs/MotionModelPrediction/

For other parts of the pipeline, such as the feature extraction and clip
classification, please check the references in the paper and visit these
websites:

    http://lear.inrialpes.fr/people/wang/dense_trajectories
    http://www.csie.ntu.edu.tw/~cjlin/libsvm/

If you find any of this useful, please cite:

    @INPROCEEDINGS{GarciaCifuentes.etal.BMVC2012,
      author    = {Garc{\'i}a Cifuentes, Cristina and Sturzel, Marc and Jurie, Fr\'{e}d\'{e}ric and Brostow, Gabriel J.},
      title     = {Motion Models that Only Work Sometimes},
      booktitle = {BMVC},
      year      = {2012},
    }

Contact: cristina dot garcia_cifuentes dot 09 at alumni.ucl.ac.uk


0 - Contents
--------------------------------------------------------------------------------

1 - Data and naming conventions
2 - Tracking
3 - Evaluation


1 - Data and naming conventions
--------------------------------------------------------------------------------

The dataset consists of video clips in avi containers, sparse ground-truth
annotations in text format, and the FAST interest point detections used by the
annotation tool. See the dataset readme for more details.

Each clip has a string id, which is the name of the .avi file excluding the
extension. For this code to work, the clips must be converted to individual
png frames under a folder called <clip_id>. Each frame has to be named
<frame_number>.png, where <frame_number> is 5 characters long.

The folders in which the input data are expected (png files, ground truth,
etc.) are hard coded. They have to match the way your data is organized.
Different types of input data are provided separately to facilitate selective
download. However, frames and other data about the same clip can safely be put
under the same folder, as long as the hard-coded directories are modified
accordingly.
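As a sketch of the expected on-disk layout, the following hypothetical Python helper builds the path of one frame (the repository's own code is MATLAB, and the actual directories are hard coded there; `frame_path`, `data_root`, and the 1-based frame index are assumptions for illustration only):

```python
import os

def frame_path(data_root, clip_id, frame_number):
    # Frames for a clip live in a folder named after the clip id,
    # as zero-padded 5-character png file names, e.g. 00007.png.
    return os.path.join(data_root, clip_id, "%05d.png" % frame_number)

print(frame_path("frames", "someclip", 7))
```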
The starting point of each track is automatically extracted from the
ground-truth files.


2 - Tracking
--------------------------------------------------------------------------------

The main function performing the tracking is 'track_video'. The files
'script_track_video.m' and 'script_track_video_tkr.m' provide examples of how
to use it. These scripts use one .mat file per clip, named <clip_id>_info.mat,
provided under the folder info_files. Some settings are hard coded: the number
of particles, the size of the patch, etc.

The tracks are stored for different combinations of motion models and noise
parameters (see paper).


3 - Evaluation
--------------------------------------------------------------------------------

Quality of a track
------------------

The function for evaluation is 'compute_tracking_perfo_on_video', called from
'script_compute_tracking_perfo.m'. The predicted positions are compared to
sparse ground-truth landmarks along a trajectory.

Four quality measures of a track are available:
- '%OK': fraction of landmarks that were correctly estimated.
- '%Frames': fraction of frames lying between correctly estimated landmarks.
- 'AbsOK' and 'AbsFrames': as above, but absolute values rather than relative.

A prediction is considered correct if the errors in both the x and y image
coordinates are below an adjustable threshold. The four measures are computed
and stored for each track for a range of such thresholds. For a selected
measure (hard coded), an average over the number of tracked points in the clip
is computed.

Performance of a motion model on a clip
---------------------------------------

The function 'select_perfo' computes one score per motion model, combining the
scores obtained for different noise parameters. Four types of combination are
available:
- 'default': score of the default parameter value (the rest are ignored).
- 'best': the best score out of all noise parameter values.
- '3best': average of the scores of the 3 best parameter values.
- 'area': area under the curve formed by the scores of all parameter values.

The score type and correctness threshold are hard coded.

The script 'script_print_all_videos_table' calls 'select_perfo' and prints a
table, one line per clip, with the score obtained by each motion model.
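To make the "Quality of a track" measures concrete, here is a hypothetical Python sketch (not the repository's MATLAB code; the function name, calling convention, and in particular the reading of '%Frames' as frames spanned by consecutive correctly estimated landmarks are assumptions):

```python
def track_measures(pred, gt, threshold):
    """pred, gt: dicts {frame_index: (x, y)}; gt holds the sparse landmarks.

    A landmark is correctly estimated when both the x and y errors
    are below `threshold`.
    """
    frames = sorted(gt)
    correct = [f for f in frames
               if abs(pred[f][0] - gt[f][0]) < threshold
               and abs(pred[f][1] - gt[f][1]) < threshold]
    abs_ok = len(correct)
    pct_ok = abs_ok / len(frames)
    # One plausible reading of '%Frames': count frames lying between
    # consecutive (in ground-truth order) correctly estimated landmarks.
    abs_frames = 0
    for a, b in zip(frames, frames[1:]):
        if a in correct and b in correct:
            abs_frames += b - a
    total_span = frames[-1] - frames[0] if len(frames) > 1 else 1
    pct_frames = abs_frames / total_span
    return {'%OK': pct_ok, 'AbsOK': abs_ok,
            '%Frames': pct_frames, 'AbsFrames': abs_frames}
```

In the real pipeline these measures are computed for a whole range of thresholds, so a loop over threshold values would wrap this function.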
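The four combination types used by 'select_perfo' can be sketched as follows (a hypothetical Python illustration, not the repository's MATLAB implementation; the function name, the default-parameter index, and the unit spacing assumed for the 'area' computation are all assumptions):

```python
def combine_scores(scores, combination, default_index=0):
    """scores: one score per noise-parameter value, in parameter order."""
    if combination == 'default':
        # Score of the default parameter value; the rest are ignored.
        return scores[default_index]
    if combination == 'best':
        # Best score out of all noise parameter values.
        return max(scores)
    if combination == '3best':
        # Average of the scores of the 3 best parameter values.
        return sum(sorted(scores, reverse=True)[:3]) / 3.0
    if combination == 'area':
        # Area under the score-vs-parameter curve (trapezoidal rule,
        # assuming unit spacing between parameter values).
        return sum((a + b) / 2.0 for a, b in zip(scores, scores[1:]))
    raise ValueError("unknown combination: %s" % combination)
```

For example, with scores [0.2, 0.8, 0.5, 0.6], 'default' picks 0.2, 'best' picks 0.8, and '3best' averages 0.8, 0.6, and 0.5.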