Video surveillance dataset for people matching

Being able to match objects across multiple views is one of the main building blocks for coordination of multi-camera systems.
This dataset was acquired and annotated to test the ability of algorithms to work in a constraint-free setting: the cameras differ in resolution, position and frame rate, the scene is non-planar, one camera is handheld and moves freely, and no calibration information is provided. Each video is 110 seconds long and contains manual annotations of every track, using IDs that are shared across cameras.
Dataset

Camera 1
Analog camcorder
Resolution 720 x 576
25fps
static camera
Download original
Download resampled 24fps
Camera 2
Apple iPhone
Resolution 1280 x 720
30fps
handheld moving camera
Download original
Download resampled 24fps
Camera 3
Sony DSC-RX100
Resolution 1920 x 1080
50i / 25p
static camera
Download original
Download resampled 24fps
Camera 4
Canon IXUS 300hs
Resolution 1280 x 720
25fps
static camera
Download original
Download resampled 24fps
Camera 5
Nikon D90
Resolution 1280 x 720
30fps
static camera
Download original
Download resampled 24fps
Full Dataset
Ground truth
For each video, a ground-truth file is available that contains the bounding boxes of each track, identified by an ID that is the same across all cameras.
Each track is defined as an ID followed by a sequence of bounding boxes:
< track > ::= < id > < bb_list >
< bb_list > ::= < bb > < bb_list > | < bb >
< bb > ::= < frame > < x > < y > < w > < h >
< id > ::= string
< x > ::= integer
< y > ::= integer
< w > ::= integer
< h > ::= integer
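The grammar above can be read with a few lines of code. The following is a minimal sketch of a parser, assuming (the page does not specify this) that the ground-truth file stores one track per whitespace-separated line: the ID first, followed by repeated frame/x/y/w/h 5-tuples.

```python
# Sketch of a ground-truth parser for the track grammar above.
# Assumption: one track per line, tokens separated by whitespace,
# i.e.  <id> <frame x y w h> <frame x y w h> ...
from collections import namedtuple

BoundingBox = namedtuple("BoundingBox", "frame x y w h")

def parse_tracks(text):
    """Return a dict mapping track ID -> list of BoundingBox."""
    tracks = {}
    for line in text.splitlines():
        tokens = line.split()
        if not tokens:
            continue  # skip blank lines
        track_id, values = tokens[0], tokens[1:]
        if len(values) % 5 != 0:
            raise ValueError(f"track {track_id}: boxes must be 5-tuples")
        boxes = [BoundingBox(*map(int, values[i:i + 5]))
                 for i in range(0, len(values), 5)]
        # IDs are shared across cameras, so merge rather than overwrite
        tracks.setdefault(track_id, []).extend(boxes)
    return tracks
```

Because the same ID identifies the same person in every camera, parsing the ground-truth files of two cameras and intersecting the key sets of the resulting dictionaries gives the cross-view correspondences directly.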
Publications
  • Luca Zini, Francesca Odone and Andrea Cavallaro, Multi-view matching of articulated objects, IEEE Transactions on Circuits and Systems for Video Technology (to appear)
Contacts and acknowledgements

This is a joint work between Università degli Studi di Genova (IT) and Queen Mary University of London (UK)
