Tracking Comparison#
Note
Work in progress. We plan to add more explanation for each algorithm.
Prepare dataset#
First, we’ll prepare the dataset. Tuning also requires the ground-truth tracking data, so we load that as well.
import sportslabkit as slk
from sportslabkit.logger import set_log_level
dataset_path = slk.datasets.get_path("top_view")
path_to_csv = sorted(dataset_path.glob("annotations/*.csv"))[0]
path_to_mp4 = sorted(dataset_path.glob("videos/*.mp4"))[0]
root = slk.utils.get_git_root()
cam = slk.Camera(path_to_mp4)
# For the sake of speed, we'll only use the first 10 frames
n_frames = 10
frames = cam[:n_frames]
bbdf_gt = slk.load_df(path_to_csv)
# Shift a 0-based index so that it starts at 1
# TODO: Hopefully we can get rid of this
if bbdf_gt.index[0] == 0:
    bbdf_gt.index += 1
bbdf_gt = bbdf_gt[:n_frames]
SORT Tracker#
Next, we’ll set up the tracker. We start with SORT, a simple yet effective tracker in SportsLabKit. SORT relies on both a detection model and a motion model, each with its own configuration. Here are the details:
Detection Model - YOLOv8x
    conf - Confidence threshold for detecting objects (Default: 0.5)
    iou - Intersection-over-Union threshold for suppressing duplicate detections (Default: 0.3)
    imgsz - Image size to which the input is resized, affecting detection (Default: 2560x2560 pixels)
Motion Model - KalmanFilterMotionModel
    dt - Time step between consecutive measurements, crucial for predicting object position (Default: 1/30)
    process_noise - Noise in the process model, representing uncertainty in motion prediction (Default: 1e-4)
    measurement_noise - Noise in the measurements, representing sensor noise (Default: 1e-1)
    confidence_scaler - Factor to scale the confidence in prediction, adjusting the influence of measurements vs predictions (Default: 0.5)
Tracking Algorithm - SORTTracker
    metric - The cost metric to use for assignment
    metric_gate - The gate threshold for the cost metric
    max_staleness - The number of frames to wait before removing a track
    min_length - The number of frames to wait before confirming a track
These configurations are essential in setting up the SORT Tracker. The defaults provide a good starting point and can be tuned further to fit the specific use case and environment; the setup code in this section overrides several of them (for example, conf=0.25 and imgsz=960).
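To make the motion-model parameters more concrete, here is a minimal sketch of a constant-velocity Kalman filter for a single coordinate. It is not SportsLabKit's `KalmanFilterMotionModel`: the class `Kalman1D`, the helper `run`, the identity initial covariance, and the diagonal process-noise matrix are all assumptions made for illustration, and `confidence_scaler` is not modelled. The sketch only shows how `dt`, `process_noise`, and `measurement_noise` interact in a textbook filter.

```python
class Kalman1D:
    """Constant-velocity Kalman filter for one coordinate (illustrative only)."""

    def __init__(self, dt=1 / 30, process_noise=1e-4, measurement_noise=1e-1):
        self.dt = dt
        self.q = process_noise      # added to the covariance diagonal at every predict
        self.r = measurement_noise  # variance assumed for each measurement
        self.pos, self.vel = 0.0, 0.0
        # Covariance P = [[p00, p01], [p10, p11]], initialised to the identity.
        self.p00, self.p01, self.p10, self.p11 = 1.0, 0.0, 0.0, 1.0

    def predict(self):
        """x <- F x and P <- F P F^T + Q, with F = [[1, dt], [0, 1]] and Q = q * I."""
        dt = self.dt
        self.pos += dt * self.vel
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11 + self.q
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11 + self.q
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11

    def update(self, z):
        """Blend the prediction with measurement z; return the position gain."""
        s = self.p00 + self.r                # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s  # gains for position and velocity
        residual = z - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        # P <- (I - K H) P with H = [1, 0]
        p00 = (1 - k0) * self.p00
        p01 = (1 - k0) * self.p01
        p10 = self.p10 - k1 * self.p00
        p11 = self.p11 - k1 * self.p01
        self.p00, self.p01, self.p10, self.p11 = p00, p01, p10, p11
        return k0


def run(kf, measurements):
    """Feed measurements one per time step and return the last position gain."""
    gain = None
    for z in measurements:
        kf.predict()
        gain = kf.update(z)
    return gain


# High process noise relative to measurement noise: the filter follows the data.
responsive = Kalman1D(process_noise=500, measurement_noise=0.1)
# Low process noise relative to measurement noise: the filter trusts its own prediction.
smooth = Kalman1D(process_noise=1e-4, measurement_noise=10)

gain_responsive = run(responsive, [10.0, 10.1, 10.2])
gain_smooth = run(smooth, [10.0, 10.1, 10.2])
print(f"gain (responsive): {gain_responsive:.4f}, gain (smooth): {gain_smooth:.4f}")
```

In this sketch the gain is the fraction of the measurement residual that is applied to the position estimate, so raising `process_noise` or lowering `measurement_noise` pulls the estimate toward the detections, while the opposite produces smoother but slower-reacting tracks.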
from sportslabkit.mot import SORTTracker
det_model = slk.detection_model.load(
    model_name='yolov8',
    model=root/'models/yolov8/soccer_top_view-model=yolov8x-imgsz=2048.pt',
    conf=0.25,
    iou=0.6,
    imgsz=960,
    device='mps',  # Apple-silicon GPU backend; use 'cuda' or 'cpu' on other hardware
    classes=0,
    augment=True,
    max_det=35
)
motion_model = slk.motion_model.load(
    model_name='kalmanfilter',
    dt=1/30,
    process_noise=500,
    measurement_noise=10,
    confidence_scaler=1
)
matching_fn = slk.matching.SimpleMatchingFunction(
    metric=slk.metrics.IoUCMM(use_pred_box=True),
    gate=0.9
)
tracker = SORTTracker(
    detection_model=det_model,
    motion_model=motion_model,
    matching_fn=matching_fn,
    max_staleness=2,
    min_length=2
)
bbdf_pred = tracker.track(frames)
hota = slk.metrics.hota_score(bbdf_gt, bbdf_pred)["HOTA"]
print(f"HOTA Score before Tuning (SORTTracker): {hota:.3f}")
Tracking Progress: 100%|██████████| 10/10 [00:12<00:00, 1.24s/it, Active: 28, Dead: 5]
HOTA Score before Tuning (SORTTracker): 0.498
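The `max_staleness` and `min_length` arguments passed to the tracker above control when a track is confirmed and when it is removed. The following is a minimal sketch of that bookkeeping under stated assumptions: the names `Track`, `hits`, `staleness`, `is_confirmed`, `is_removable`, and `lifecycle` are hypothetical, and the exact comparisons (`>=` for confirmation, `>` for removal) are a guess rather than SportsLabKit's implementation.

```python
from dataclasses import dataclass


@dataclass
class Track:
    """Bookkeeping for one track (hypothetical names, not SportsLabKit's)."""
    track_id: int
    hits: int = 0       # frames in which a detection was matched to this track
    staleness: int = 0  # consecutive frames without a match

    def step(self, matched: bool) -> None:
        if matched:
            self.hits += 1
            self.staleness = 0
        else:
            self.staleness += 1


def is_confirmed(track: Track, min_length: int) -> bool:
    # Assumption: a track is confirmed once it has been matched min_length times.
    return track.hits >= min_length


def is_removable(track: Track, max_staleness: int) -> bool:
    # Assumption: a track is dropped once it has gone unmatched for more than
    # max_staleness consecutive frames.
    return track.staleness > max_staleness


def lifecycle(matched_per_frame, min_length=2, max_staleness=2):
    """Return the state of a single track after each frame."""
    track = Track(track_id=0)
    states = []
    for matched in matched_per_frame:
        track.step(matched)
        if is_removable(track, max_staleness):
            states.append("removed")
            break
        states.append("confirmed" if is_confirmed(track, min_length) else "tentative")
    return states


print(lifecycle([True, True, False, False, False]))
```

A larger `min_length` filters out short-lived false positives at the cost of reporting new players later; a larger `max_staleness` lets a track survive brief occlusions but keeps stale tracks around longer.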
import optuna
hparam_search_space = {
    'self': {},
    'motion_model': {
        'process_noise': {'type': 'logfloat', 'low': 100, 'high': 1000},
        'measurement_noise': {'type': 'logfloat', 'low': 0.1, 'high': 100},
    },
    'matching_fn': {
        'gate': {'type': 'float', 'low': 0.1, 'high': 1}
    }
}
sampler = optuna.samplers.CmaEsSampler()
pruner = optuna.pruners.HyperbandPruner()
best_params, best_hota, study = tracker.tune_hparams(
    frames_list=[frames],
    bbdf_gt_list=[bbdf_gt],
    n_trials=10,
    reuse_detections=True,  # Run detection once and reuse the results across trials
    hparam_search_space=hparam_search_space,
    verbose=False,  # Keep the per-trial output quiet
    sampler=sampler,
    pruner=pruner,
    return_study=True,
)
print(f"Best params: {best_params}")
print(f"HOTA score after tuning (SORTTracker): {best_hota:.3f}")
tune_hparams:0284 💬| Hyperparameter search space:
tune_hparams:0286 💬| self:
tune_hparams:0286 💬| motion_model:
tune_hparams:0288 💬| process_noise: {'type': 'logfloat', 'low': 100, 'high': 1000}
tune_hparams:0288 💬| measurement_noise: {'type': 'logfloat', 'low': 0.1, 'high': 100}
tune_hparams:0286 💬| matching_fn:
tune_hparams:0288 💬| gate: {'type': 'float', 'low': 0.1, 'high': 1}
Detecting frames for reuse: 100%|██████████| 10/10 [00:06<00:00, 1.54it/s]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 146.03it/s, Active: 30, Dead: 7]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 197.38it/s, Active: 28, Dead: 6]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 101.45it/s, Active: 31, Dead: 11]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 200.27it/s, Active: 29, Dead: 6]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 188.96it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 146.40it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 183.85it/s, Active: 30, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 195.18it/s, Active: 28, Dead: 7]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 192.69it/s, Active: 29, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 187.59it/s, Active: 29, Dead: 7]
Best params: {'self': {}, 'motion_model': {'process_noise': 285.79024851377466, 'measurement_noise': 0.148890060403773}, 'matching_fn': {'gate': 0.6024539532390903}}
HOTA score after tuning (SORTTracker): 0.500
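The search space above mixes 'float' and 'logfloat' entries. As a rough illustration, and assuming that 'logfloat' denotes log-uniform sampling (the behaviour Optuna provides through `suggest_float(..., log=True)`), the sketch below draws values from both kinds of entry with the standard library only. The helper `sample_param` is hypothetical, and the sketch ignores how the CMA-ES sampler actually picks points; it only shows what the two range types cover.

```python
import math
import random


def sample_param(spec, rng):
    """Draw one value from an entry such as {'type': 'logfloat', 'low': 100, 'high': 1000}."""
    low, high = spec["low"], spec["high"]
    if spec["type"] == "float":
        # Uniform on the raw scale.
        return rng.uniform(low, high)
    if spec["type"] == "logfloat":
        # Uniform on the log scale, so each order of magnitude is equally likely.
        return math.exp(rng.uniform(math.log(low), math.log(high)))
    raise ValueError(f"unknown parameter type: {spec['type']!r}")


rng = random.Random(0)
log_spec = {"type": "logfloat", "low": 0.1, "high": 100}
lin_spec = {"type": "float", "low": 0.1, "high": 100}

log_samples = [sample_param(log_spec, rng) for _ in range(1000)]
lin_samples = [sample_param(lin_spec, rng) for _ in range(1000)]

# Share of samples below 10 under each scheme.
print(sum(v < 10 for v in log_samples) / 1000)
print(sum(v < 10 for v in lin_samples) / 1000)
```

For a range such as 0.1 to 100 that spans three orders of magnitude, log-uniform sampling spends about two thirds of its draws below 10, whereas uniform sampling spends roughly a tenth there. That is why noise parameters are usually searched on a log scale.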
DeepSORT Tracker#
from sportslabkit.mot import DeepSORTTracker
slk.logger.set_log_level('INFO')
det_model = slk.detection_model.load(
    model_name='yolov8',
    model=root/'models/yolov8/soccer_top_view-model=yolov8x-imgsz=2048.pt',
    conf=0.25,
    iou=0.6,
    imgsz=960,
    device='mps',
    classes=0,
    augment=True,
    max_det=35
)
image_model = slk.image_model.load(
    model_name='mobilenetv2_x1_0',
    image_size=(32,32),
    device='mps'
)
motion_model = slk.motion_model.load(
    model_name='kalmanfilter',
    dt=1/30,
    process_noise=500,
    measurement_noise=10,
    confidence_scaler=1
)
matching_fn = slk.matching.MotionVisualMatchingFunction(
    motion_metric=slk.metrics.IoUCMM(use_pred_box=True),
    motion_metric_gate=0.2,
    visual_metric=slk.metrics.CosineCMM(),
    visual_metric_gate=0.2,
    beta=0.9,
)
tracker = DeepSORTTracker(
    detection_model=det_model,
    image_model=image_model,
    motion_model=motion_model,
    matching_fn=matching_fn,
    max_staleness=2,
    min_length=2
)
bbdf_pred = tracker.track(frames)
hota = slk.metrics.hota_score(bbdf_gt, bbdf_pred)["HOTA"]
print(f"HOTA Score before Tuning (DeepSORTTracker): {hota:.3f}")
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
Tracking Progress: 100%|██████████| 10/10 [00:16<00:00, 1.68s/it, Active: 36, Dead: 26]
HOTA Score before Tuning (DeepSORTTracker): 0.408
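DeepSORT-style matching combines a motion cost with an appearance cost. The sketch below shows one common way to do that: gate each cost separately, then take a weighted sum controlled by `beta`. The weighting convention (`beta` on the motion term), the behaviour of the gates, and the function names `cosine_distance` and `blended_cost` are assumptions for illustration; they are not taken from SportsLabKit's `MotionVisualMatchingFunction`.

```python
import math


def cosine_distance(u, v):
    """1 - cosine similarity of two feature vectors (1.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 1.0
    return 1.0 - dot / (norm_u * norm_v)


def blended_cost(motion_cost, visual_cost, beta, motion_gate, visual_gate):
    """Weighted sum of the two costs, or infinity if either cost exceeds its gate."""
    if motion_cost > motion_gate or visual_cost > visual_gate:
        return math.inf  # this track/detection pair may not be matched
    # Assumption: beta weights the motion term and (1 - beta) the visual term.
    return beta * motion_cost + (1 - beta) * visual_cost


# Appearance embeddings of a track and of a candidate detection (made-up values).
track_feature = [0.9, 0.1, 0.0]
detection_feature = [0.8, 0.2, 0.1]
visual = cosine_distance(track_feature, detection_feature)

# A motion cost of 0.1 could be, for example, 1 - IoU for heavily overlapping boxes.
print(blended_cost(0.1, visual, beta=0.9, motion_gate=0.2, visual_gate=0.2))
# The same pair is rejected when the motion cost exceeds its gate.
print(blended_cost(0.5, visual, beta=0.9, motion_gate=0.2, visual_gate=0.2))
```

Under this reading, a `beta` close to 1 makes the tracker behave much like SORT, while a `beta` close to 0 relies almost entirely on appearance.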
import optuna
hparam_search_space = {
    'motion_model': {
        'process_noise': {'type': 'logfloat', 'low': 100, 'high': 1000},
        'measurement_noise': {'type': 'logfloat', 'low': 0.1, 'high': 100},
    },
    'matching_fn': {
        'motion_metric_gate': {'type': 'float', 'low': 1e-4, 'high': 1},
        'visual_metric_gate': {'type': 'float', 'low': 1e-4, 'high': 1},
        'beta': {'type': 'float', 'low': 1e-4, 'high': 1},
    }
}
sampler = optuna.samplers.CmaEsSampler()
pruner = optuna.pruners.HyperbandPruner()
best_params, best_hota, study = tracker.tune_hparams(
    frames_list=[frames],
    bbdf_gt_list=[bbdf_gt],
    n_trials=10,
    reuse_detections=True,  # Run detection once and reuse the results across trials
    hparam_search_space=hparam_search_space,
    verbose=False,  # Keep the per-trial output quiet
    sampler=sampler,
    pruner=pruner,
    return_study=True,
)
print(f"Best HOTA: {best_hota:.3f}")
print(f"HOTA score after tuning (DeepSORTTracker): {best_hota:.3f}")
tune_hparams:0284 💬| Hyperparameter search space:
tune_hparams:0286 💬| motion_model:
tune_hparams:0288 💬| process_noise: {'type': 'logfloat', 'low': 100, 'high': 1000}
tune_hparams:0288 💬| measurement_noise: {'type': 'logfloat', 'low': 0.1, 'high': 100}
tune_hparams:0286 💬| matching_fn:
tune_hparams:0288 💬| motion_metric_gate: {'type': 'float', 'low': 0.0001, 'high': 1}
tune_hparams:0288 💬| visual_metric_gate: {'type': 'float', 'low': 0.0001, 'high': 1}
tune_hparams:0288 💬| beta: {'type': 'float', 'low': 0.0001, 'high': 1}
Detecting frames for reuse: 100%|██████████| 10/10 [00:07<00:00, 1.28it/s]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00, 6.65it/s, Active: 33, Dead: 11]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00, 8.55it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 10.53it/s, Active: 29, Dead: 7]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 10.63it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00, 6.20it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00, 8.50it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 10.55it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 10.68it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00, 8.92it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00, 9.80it/s, Active: 28, Dead: 5]
Best HOTA: 0.499
HOTA score after tuning (DeepSORTTracker): 0.499
ByteTrack Tracker#
from sportslabkit.mot import BYTETracker
slk.logger.set_log_level('INFO')
det_model = slk.detection_model.load(
    model_name='yolov8',
    model=root/'models/yolov8/soccer_top_view-model=yolov8x-imgsz=2048.pt',
    conf=0.25,
    iou=0.6,
    imgsz=960,
    device='mps',
    classes=0,
    augment=True,
    max_det=35
)
image_model = slk.image_model.load(
    model_name='mobilenetv2_x1_0',
    image_size=(32,32),
    device='mps'
)
motion_model = slk.motion_model.load(
    model_name='kalmanfilter',
    dt=1/30,
    process_noise=500,
    measurement_noise=10,
    confidence_scaler=1
)
first_matching_fn = slk.matching.MotionVisualMatchingFunction(
    motion_metric=slk.metrics.IoUCMM(use_pred_box=True),
    motion_metric_gate=0.2,
    visual_metric=slk.metrics.CosineCMM(),
    visual_metric_gate=0.2,
    beta=0.9,
)
second_matching_fn = slk.matching.SimpleMatchingFunction(
    metric=slk.metrics.IoUCMM(use_pred_box=True),
    gate=0.9,
)
tracker = BYTETracker(
    detection_model=det_model,
    image_model=image_model,
    motion_model=motion_model,
    first_matching_fn=first_matching_fn,
    second_matching_fn=second_matching_fn,
    detection_score_threshold=0.6,
    max_staleness=2,
    min_length=2
)
bbdf_pred = tracker.track(frames)
hota = slk.metrics.hota_score(bbdf_gt, bbdf_pred)["HOTA"]
print(f"HOTA Score before Tuning (BYTETracker): {hota:.3f}")
Tracking Progress: 100%|██████████| 10/10 [08:00<00:00, 48.03s/it, Active: 33, Dead: 13]
HOTA Score before Tuning (BYTETracker): 0.424
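ByteTrack's association step first matches tracks against high-score detections and then gives the still-unmatched tracks a second chance against the low-score detections. The `detection_score_threshold`, `first_matching_fn`, and `second_matching_fn` arguments above map naturally onto that scheme. The sketch below illustrates the idea with a plain IoU matcher; the names `iou`, `greedy_match`, and `byte_associate`, the greedy (rather than optimal) assignment, and the single `min_iou` value used in both stages are simplifying assumptions, not SportsLabKit's implementation.

```python
def iou(a, b):
    """IoU of two boxes given as (x1, y1, x2, y2)."""
    iw = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(track_boxes, det_boxes, min_iou):
    """Pair boxes by descending IoU; return (track_idx, det_idx) pairs."""
    pairs = sorted(
        ((iou(t, d), ti, di)
         for ti, t in enumerate(track_boxes)
         for di, d in enumerate(det_boxes)),
        reverse=True,
    )
    matched, used_t, used_d = [], set(), set()
    for score, ti, di in pairs:
        if score < min_iou:
            break  # all remaining pairs overlap even less
        if ti in used_t or di in used_d:
            continue
        matched.append((ti, di))
        used_t.add(ti)
        used_d.add(di)
    return matched


def byte_associate(track_boxes, detections, score_threshold, min_iou=0.3):
    """Two-stage association; returns {track index: detection index}."""
    high = [i for i, d in enumerate(detections) if d["score"] >= score_threshold]
    low = [i for i, d in enumerate(detections) if d["score"] < score_threshold]

    # Stage 1: all tracks against the high-score detections.
    first = greedy_match(track_boxes, [detections[i]["box"] for i in high], min_iou)
    matches = {ti: high[j] for ti, j in first}

    # Stage 2: tracks left over from stage 1 against the low-score detections.
    remaining = [ti for ti in range(len(track_boxes)) if ti not in matches]
    second = greedy_match(
        [track_boxes[ti] for ti in remaining],
        [detections[i]["box"] for i in low],
        min_iou,
    )
    for r, j in second:
        matches[remaining[r]] = low[j]
    return matches


tracks = [(0, 0, 10, 10), (50, 50, 60, 60)]
detections = [
    {"box": (1, 0, 11, 10), "score": 0.9},         # confident detection
    {"box": (51, 50, 61, 60), "score": 0.3},       # e.g. a partly occluded player
    {"box": (200, 200, 210, 210), "score": 0.2},   # low-score box far from any track
]
print(byte_associate(tracks, detections, score_threshold=0.6))
```

The second stage is what lets a track survive a frame in which its player is only weakly detected, while a low-score box that overlaps no existing track is simply left unmatched.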
import optuna
hparam_search_space = {
    'self': {
        'detection_score_threshold': {'type': 'float', 'low': 0.1, 'high': 0.5},
    },
    'motion_model': {
        'process_noise': {'type': 'logfloat', 'low': 100, 'high': 1000},
        'measurement_noise': {'type': 'logfloat', 'low': 0.1, 'high': 100},
    },
    'first_matching_fn': {
        'motion_metric_gate': {'type': 'float', 'low': 1e-4, 'high': 1},
        'visual_metric_gate': {'type': 'float', 'low': 1e-4, 'high': 1},
        'beta': {'type': 'logfloat', 'low': 1e-4, 'high': 1},
    },
    'second_matching_fn': {
        'gate': {'type': 'float', 'low': 0.1, 'high': 1}
    }
}
sampler = optuna.samplers.CmaEsSampler()
pruner = optuna.pruners.HyperbandPruner()
best_params, best_hota, study = tracker.tune_hparams(
    frames_list=[frames],
    bbdf_gt_list=[bbdf_gt],
    n_trials=10,
    reuse_detections=True,  # Run detection once and reuse the results across trials
    hparam_search_space=hparam_search_space,
    verbose=False,  # Keep the per-trial output quiet
    sampler=sampler,
    pruner=pruner,
    return_study=True,
)
print(f"Best HOTA: {best_hota:.3f}")
print(f"HOTA score after tuning (BYTETracker): {best_hota:.3f}")
tune_hparams:0284 💬| Hyperparameter search space:
tune_hparams:0286 💬| self:
tune_hparams:0288 💬| detection_score_threshold: {'type': 'float', 'low': 0.1, 'high': 0.5}
tune_hparams:0286 💬| motion_model:
tune_hparams:0288 💬| process_noise: {'type': 'logfloat', 'low': 100, 'high': 1000}
tune_hparams:0288 💬| measurement_noise: {'type': 'logfloat', 'low': 0.1, 'high': 100}
tune_hparams:0286 💬| first_matching_fn:
tune_hparams:0288 💬| motion_metric_gate: {'type': 'float', 'low': 0.0001, 'high': 1}
tune_hparams:0288 💬| visual_metric_gate: {'type': 'float', 'low': 0.0001, 'high': 1}
tune_hparams:0288 💬| beta: {'type': 'logfloat', 'low': 0.0001, 'high': 1}
tune_hparams:0286 💬| second_matching_fn:
tune_hparams:0288 💬| gate: {'type': 'float', 'low': 0.1, 'high': 1}
Detecting frames for reuse: 100%|██████████| 10/10 [05:23<00:00, 32.39s/it]
Tracking Progress: 100%|██████████| 10/10 [00:59<00:00, 5.93s/it, Active: 23, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:18<00:00, 1.85s/it, Active: 25, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:14<00:00, 1.42s/it, Active: 23, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:23<00:00, 2.31s/it, Active: 26, Dead: 1]
Tracking Progress: 100%|██████████| 10/10 [00:08<00:00, 1.25it/s, Active: 27, Dead: 1]
Tracking Progress: 100%|██████████| 10/10 [00:04<00:00, 2.18it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:18<00:00, 1.83s/it, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:20<00:00, 2.04s/it, Active: 27, Dead: 1]
Tracking Progress: 100%|██████████| 10/10 [00:11<00:00, 1.17s/it, Active: 26, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:12<00:00, 1.23s/it, Active: 25, Dead: 0]
Best HOTA: 0.506
HOTA score after tuning (BYTETracker): 0.506
TeamTrack Tracker#
Work in progress