Tracking Comparison

Note

Work in progress. We plan to add more explanations for each algorithm.

Prepare dataset

First, we’ll prepare the dataset. Tuning also requires the ground-truth tracking data, so we load it alongside the video.

import sportslabkit as slk
from sportslabkit.logger import set_log_level

dataset_path = slk.datasets.get_path("top_view")
path_to_csv = sorted(dataset_path.glob("annotations/*.csv"))[0]
path_to_mp4 = sorted(dataset_path.glob("videos/*.mp4"))[0]

root = slk.utils.get_git_root()
cam = slk.Camera(path_to_mp4)

# For the sake of speed, we'll only use the first 10 frames
n_frames = 10
frames = cam[:n_frames]

bbdf_gt = slk.load_df(path_to_csv)
# TODO: remove this shift once it is no longer needed.
# If the annotation index is 0-based, shift it so that it starts at 1.
if bbdf_gt.index[0] == 0:
    bbdf_gt.index += 1
bbdf_gt = bbdf_gt[:n_frames]

SORT Tracker

Next, we’ll set up the tracker, starting with SORT, a simple yet effective tracker in SportsLabKit. SORT relies on a detection model and a motion model, each with its own configuration. The main parameters are:

  • Detection Model - YOLOv8x

    • conf - Confidence threshold for detecting objects (Default: 0.5)

    • iou - Intersection-over-Union threshold for suppressing duplicate detections (Default: 0.3)

    • imgsz - Image size to which the input is resized before detection; larger sizes tend to help with small objects at the cost of speed (Default: 2560x2560 pixels)

  • Motion Model - KalmanFilterMotionModel

    • dt - Time step between consecutive measurements, crucial for predicting object position (Default: 1/30)

    • process_noise - Noise in the process model, representing uncertainty in motion prediction (Default: 1e-4)

    • measurement_noise - Noise in the measurements, representing sensor noise (Default: 1e-1)

    • confidence_scaler - Factor to scale the confidence in prediction, adjusting the influence of measurements vs predictions (Default: 0.5)

  • Tracking Algorithm - SORTTracker

    • metric - The cost metric to use for assignment

    • metric_gate - The gate threshold for the cost metric

    • max_staleness - The number of frames to wait before removing a track

    • min_length - The number of frames to wait before confirming a track

These settings determine how the SORT tracker behaves. The defaults are a reasonable starting point, and the example below overrides several of them (for example conf=0.25, iou=0.6, and imgsz=960). Tune them further to match your use case and environment.
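To make the motion-model parameters concrete, here is a minimal sketch of a one-dimensional constant-velocity Kalman filter. It is not SportsLabKit's KalmanFilterMotionModel: the class name ConstantVelocityKF1D, the initial covariance, and the choice of process covariance Q = process_noise · I are assumptions made purely for illustration. The sketch only shows where dt, process_noise, and measurement_noise enter the predict and update steps.

```python
from dataclasses import dataclass


@dataclass
class ConstantVelocityKF1D:
    """Toy 1-D constant-velocity Kalman filter (hypothetical, for illustration)."""

    dt: float = 1 / 30               # time step between measurements
    process_noise: float = 1e-4      # scales the motion-model uncertainty (Q)
    measurement_noise: float = 1e-1  # variance of the position measurement (R)
    x: float = 0.0                   # position estimate
    v: float = 0.0                   # velocity estimate
    # Covariance P = [[p00, p01], [p10, p11]]; start out fairly uncertain.
    p00: float = 10.0
    p01: float = 0.0
    p10: float = 0.0
    p11: float = 10.0

    def predict(self) -> float:
        """Propagate one step: x <- F x and P <- F P F^T + Q, F = [[1, dt], [0, 1]]."""
        dt, q = self.dt, self.process_noise
        self.x += self.v * dt
        p00 = self.p00 + dt * (self.p01 + self.p10) + dt * dt * self.p11
        p01 = self.p01 + dt * self.p11
        p10 = self.p10 + dt * self.p11
        p11 = self.p11
        # Assumption: Q = process_noise * I, the simplest possible choice.
        self.p00, self.p01, self.p10, self.p11 = p00 + q, p01, p10, p11 + q
        return self.x

    def update(self, z: float) -> float:
        """Correct the state with a position measurement z (H = [1, 0])."""
        residual = z - self.x
        s = self.p00 + self.measurement_noise  # innovation variance
        k0, k1 = self.p00 / s, self.p10 / s    # Kalman gain
        self.x += k0 * residual
        self.v += k1 * residual
        p00, p01, p10, p11 = self.p00, self.p01, self.p10, self.p11
        # P <- (I - K H) P
        self.p00 = (1 - k0) * p00
        self.p01 = (1 - k0) * p01
        self.p10 = p10 - k1 * p00
        self.p11 = p11 - k1 * p01
        return self.x


# Track an object moving 2 units per step, observed without noise.
kf = ConstantVelocityKF1D(dt=1.0)
for step in range(1, 51):
    kf.predict()
    kf.update(2.0 * step)
```

In a filter of this kind, raising measurement_noise relative to process_noise makes the estimate lean on the motion prediction, while raising process_noise makes it follow the detections more closely. That trade-off is what the tuning ranges for these two parameters explore.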

from sportslabkit.mot import SORTTracker

det_model = slk.detection_model.load(
    model_name='yolov8',
    model=root/'models/yolov8/soccer_top_view-model=yolov8x-imgsz=2048.pt',
    conf=0.25,
    iou=0.6,
    imgsz=960,
    device='mps',
    classes=0,
    augment=True,
    max_det=35
)

motion_model = slk.motion_model.load(
    model_name='kalmanfilter',
    dt=1/30,
    process_noise=500,
    measurement_noise=10,
    confidence_scaler=1
)

matching_fn = slk.matching.SimpleMatchingFunction(
    metric=slk.metrics.IoUCMM(use_pred_box=True),
    gate=0.9
)

tracker = SORTTracker(
    detection_model=det_model,
    motion_model=motion_model,
    matching_fn=matching_fn,
    max_staleness=2,
    min_length=2
)

bbdf_pred = tracker.track(frames)
hota = slk.metrics.hota_score(bbdf_gt, bbdf_pred)["HOTA"]
print(f"HOTA Score before Tuning (SORTTracker): {hota:.3f}")
Tracking Progress: 100%|██████████| 10/10 [00:12<00:00,  1.24s/it, Active: 28, Dead: 5]
HOTA Score before Tuning (SORTTracker): 0.498
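The max_staleness and min_length arguments passed to the tracker above control when a track is confirmed and when it is dropped. The following sketch illustrates one plausible reading of those two parameters. ToyTrack and run_lifecycle are hypothetical names, and the exact comparisons (for example whether a track dies at or after max_staleness missed frames) are assumptions rather than SportsLabKit's actual rules.

```python
from dataclasses import dataclass


@dataclass
class ToyTrack:
    """Hypothetical track record; not the SportsLabKit tracklet class."""

    track_id: int
    hits: int = 0       # frames in which the track was matched to a detection
    staleness: int = 0  # consecutive frames without a match

    def step(self, matched: bool) -> None:
        """Update the counters for one frame."""
        if matched:
            self.hits += 1
            self.staleness = 0
        else:
            self.staleness += 1

    def is_confirmed(self, min_length: int) -> bool:
        # Assumption: a track is confirmed once it has min_length matches.
        return self.hits >= min_length

    def is_dead(self, max_staleness: int) -> bool:
        # Assumption: a track is removed once it exceeds max_staleness misses.
        return self.staleness > max_staleness


def run_lifecycle(matches, max_staleness=2, min_length=2):
    """Feed a per-frame matched/unmatched sequence and record the track state."""
    track = ToyTrack(track_id=0)
    states = []
    for matched in matches:
        track.step(matched)
        if track.is_dead(max_staleness):
            states.append("dead")
            break
        states.append("confirmed" if track.is_confirmed(min_length) else "tentative")
    return states


# Two matched frames followed by three missed frames.
states = run_lifecycle([True, True, False, False, False])
```

Under these assumptions, a small max_staleness removes occluded players quickly and risks creating new IDs when they reappear, while a larger min_length suppresses short-lived false positives at the cost of confirming real tracks later.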
import optuna

hparam_search_space = {
    'self': {},
    'motion_model':{
        'process_noise': {'type': 'logfloat', 'low': 100,'high': 1000},
        'measurement_noise': {'type': 'logfloat','low': 0.1,'high': 100},
    },
    'matching_fn':{
        'gate': {'type': 'float', 'low': 0.1, 'high': 1}
    }
}

sampler = optuna.samplers.CmaEsSampler()
pruner = optuna.pruners.HyperbandPruner()

best_params, best_hota, study = tracker.tune_hparams(
    frames_list=[frames],
    bbdf_gt_list=[bbdf_gt],
    n_trials=10,
    reuse_detections=True,  # run detection once and reuse the results across trials
    hparam_search_space=hparam_search_space,
    verbose=False,  # keep the tuning output quiet
    sampler=sampler,
    pruner=pruner,
    return_study=True,
)

print(f"Best params: {best_params}")
print(f"HOTA score after tuning (SORTTracker): {best_hota:.3f}")
tune_hparams:0284  💬| Hyperparameter search space: 
tune_hparams:0286  💬| self: 
tune_hparams:0286  💬| motion_model: 
tune_hparams:0288  💬| 	process_noise: {'type': 'logfloat', 'low': 100, 'high': 1000} 
tune_hparams:0288  💬| 	measurement_noise: {'type': 'logfloat', 'low': 0.1, 'high': 100} 
tune_hparams:0286  💬| matching_fn: 
tune_hparams:0288  💬| 	gate: {'type': 'float', 'low': 0.1, 'high': 1} 
Detecting frames for reuse: 100%|██████████| 10/10 [00:06<00:00,  1.54it/s]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 146.03it/s, Active: 30, Dead: 7]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 197.38it/s, Active: 28, Dead: 6]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 101.45it/s, Active: 31, Dead: 11]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 200.27it/s, Active: 29, Dead: 6]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 188.96it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 146.40it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 183.85it/s, Active: 30, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 195.18it/s, Active: 28, Dead: 7]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 192.69it/s, Active: 29, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 187.59it/s, Active: 29, Dead: 7]
Best params: {'self': {}, 'motion_model': {'process_noise': 285.79024851377466, 'measurement_noise': 0.148890060403773}, 'matching_fn': {'gate': 0.6024539532390903}}
HOTA score after tuning (SORTTracker): 0.500
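The search space above mixes 'float' and 'logfloat' entries. The sketch below shows the difference under the assumption that 'logfloat' means sampling uniformly in log space, which is what Optuna's trial.suggest_float(..., log=True) does. The helper sample_param is hypothetical and is not part of SportsLabKit or Optuna.

```python
import math
import random


def sample_param(spec, rng):
    """Draw one value for a search-space entry such as
    {'type': 'logfloat', 'low': 100, 'high': 1000}."""
    low, high = spec["low"], spec["high"]
    if spec["type"] == "float":
        # Uniform on the linear scale.
        return rng.uniform(low, high)
    if spec["type"] == "logfloat":
        # Assumption: 'logfloat' is uniform on the log scale (log-uniform).
        return math.exp(rng.uniform(math.log(low), math.log(high)))
    raise ValueError(f"unsupported type: {spec['type']}")


rng = random.Random(0)
space = {
    "process_noise": {"type": "logfloat", "low": 100, "high": 1000},
    "measurement_noise": {"type": "logfloat", "low": 0.1, "high": 100},
    "gate": {"type": "float", "low": 0.1, "high": 1},
}
# One hypothetical trial: a value for every parameter in the space.
trial = {name: sample_param(spec, rng) for name, spec in space.items()}
```

Log-uniform sampling spends as many draws between 0.1 and 1 as between 10 and 100, which suits parameters such as noise levels whose plausible values span several orders of magnitude.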

DeepSORT Tracker

from sportslabkit.mot import DeepSORTTracker

slk.logger.set_log_level('INFO')
det_model = slk.detection_model.load(
    model_name='yolov8',
    model=root/'models/yolov8/soccer_top_view-model=yolov8x-imgsz=2048.pt',
    conf=0.25,
    iou=0.6,
    imgsz=960,
    device='mps',
    classes=0,
    augment=True,
    max_det=35
)

image_model = slk.image_model.load(
    model_name='mobilenetv2_x1_0',
    image_size=(32,32),
    device='mps'
)

motion_model = slk.motion_model.load(
    model_name='kalmanfilter',
    dt=1/30,
    process_noise=500,
    measurement_noise=10,
    confidence_scaler=1
)

matching_fn = slk.matching.MotionVisualMatchingFunction(
    motion_metric=slk.metrics.IoUCMM(use_pred_box=True),
    motion_metric_gate=0.2,
    visual_metric=slk.metrics.CosineCMM(),
    visual_metric_gate=0.2,
    beta=0.9,
)

tracker = DeepSORTTracker(
    detection_model=det_model,
    image_model=image_model,
    motion_model=motion_model,
    matching_fn=matching_fn,
    max_staleness=2,
    min_length=2
)

bbdf_pred = tracker.track(frames)
hota = slk.metrics.hota_score(bbdf_gt, bbdf_pred)["HOTA"]
print(f"HOTA Score before Tuning (DeepSORTTracker): {hota:.3f}")
[W NNPACK.cpp:64] Could not initialize NNPACK! Reason: Unsupported hardware.
Tracking Progress: 100%|██████████| 10/10 [00:16<00:00,  1.68s/it, Active: 36, Dead: 26]
HOTA Score before Tuning (DeepSORTTracker): 0.408
import optuna

hparam_search_space = {
    'motion_model':{
        'process_noise': {'type': 'logfloat', 'low': 100,'high': 1000},
        'measurement_noise': {'type': 'logfloat','low': 0.1,'high': 100},
    },
    'matching_fn':{
        'motion_metric_gate': {'type': 'float', 'low': 1e-4, 'high': 1},
        'visual_metric_gate': {'type': 'float', 'low': 1e-4, 'high': 1},
        'beta': {'type': 'float', 'low': 1e-4, 'high': 1},
    }
}

sampler = optuna.samplers.CmaEsSampler()
pruner = optuna.pruners.HyperbandPruner()

best_params, best_hota, study = tracker.tune_hparams(
    frames_list=[frames],
    bbdf_gt_list=[bbdf_gt],
    n_trials=10,
    reuse_detections=True,  # run detection once and reuse the results across trials
    hparam_search_space=hparam_search_space,
    verbose=False,  # keep the tuning output quiet
    sampler=sampler,
    pruner=pruner,
    return_study=True,
)

print(f"Best HOTA: {best_hota:.3f}")
print(f"HOTA score after tuning (DeepSORTTracker): {best_hota:.3f}")
tune_hparams:0284  💬| Hyperparameter search space: 
tune_hparams:0286  💬| motion_model: 
tune_hparams:0288  💬| 	process_noise: {'type': 'logfloat', 'low': 100, 'high': 1000} 
tune_hparams:0288  💬| 	measurement_noise: {'type': 'logfloat', 'low': 0.1, 'high': 100} 
tune_hparams:0286  💬| matching_fn: 
tune_hparams:0288  💬| 	motion_metric_gate: {'type': 'float', 'low': 0.0001, 'high': 1} 
tune_hparams:0288  💬| 	visual_metric_gate: {'type': 'float', 'low': 0.0001, 'high': 1} 
tune_hparams:0288  💬| 	beta: {'type': 'float', 'low': 0.0001, 'high': 1} 
Detecting frames for reuse: 100%|██████████| 10/10 [00:07<00:00,  1.28it/s]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00,  6.65it/s, Active: 33, Dead: 11]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00,  8.55it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 10.53it/s, Active: 29, Dead: 7]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 10.63it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00,  6.20it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00,  8.50it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 10.55it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:00<00:00, 10.68it/s, Active: 28, Dead: 5]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00,  8.92it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:01<00:00,  9.80it/s, Active: 28, Dead: 5]
Best HOTA: 0.499
HOTA score after tuning (DeepSORTTracker): 0.499
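The DeepSORT configuration above combines a motion metric and a visual metric through beta and two gates. The sketch below shows one common way such a combination can work. The blending rule (beta weights the motion cost and 1 − beta the visual cost) and the gating rule (a pair is rejected if either cost exceeds its gate) are assumptions, not a description of MotionVisualMatchingFunction's actual implementation. Both function names are hypothetical.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two non-zero embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / norm


def blended_cost(motion_cost, visual_cost, beta=0.9,
                 motion_gate=0.2, visual_gate=0.2):
    """Combine a motion cost and a visual cost into one assignment cost.

    Assumed rule: reject the pair if either cost exceeds its gate,
    otherwise return a beta-weighted average of the two costs.
    """
    if motion_cost > motion_gate or visual_cost > visual_gate:
        return math.inf
    return beta * motion_cost + (1.0 - beta) * visual_cost


# Appearance embeddings of a track and a detection (made-up values).
track_embedding = [0.9, 0.1, 0.0]
detection_embedding = [0.8, 0.2, 0.1]
visual = cosine_distance(track_embedding, detection_embedding)
cost = blended_cost(motion_cost=0.1, visual_cost=visual)
```

With a rule of this form, beta close to 1 makes the association behave like plain SORT, and smaller values let appearance decide between candidates whose boxes overlap similarly.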

ByteTrack Tracker

from sportslabkit.mot import BYTETracker

slk.logger.set_log_level('INFO')
det_model = slk.detection_model.load(
    model_name='yolov8',
    model=root/'models/yolov8/soccer_top_view-model=yolov8x-imgsz=2048.pt',
    conf=0.25,
    iou=0.6,
    imgsz=960,
    device='mps',
    classes=0,
    augment=True,
    max_det=35
)

image_model = slk.image_model.load(
    model_name='mobilenetv2_x1_0',
    image_size=(32,32),
    device='mps'
)

motion_model = slk.motion_model.load(
    model_name='kalmanfilter',
    dt=1/30,
    process_noise=500,
    measurement_noise=10,
    confidence_scaler=1
)

first_matching_fn = slk.matching.MotionVisualMatchingFunction(
    motion_metric=slk.metrics.IoUCMM(use_pred_box=True),
    motion_metric_gate=0.2,
    visual_metric=slk.metrics.CosineCMM(),
    visual_metric_gate=0.2,
    beta=0.9,
)

second_matching_fn = slk.matching.SimpleMatchingFunction(
    metric=slk.metrics.IoUCMM(use_pred_box=True),
    gate=0.9,
)

tracker = BYTETracker(
    detection_model=det_model,
    image_model=image_model,
    motion_model=motion_model,
    first_matching_fn=first_matching_fn,
    second_matching_fn=second_matching_fn,
    detection_score_threshold=0.6,
    max_staleness=2,
    min_length=2
)

bbdf_pred = tracker.track(frames)
hota = slk.metrics.hota_score(bbdf_gt, bbdf_pred)["HOTA"]
print(f"HOTA Score before Tuning (BYTETracker): {hota:.3f}")
Tracking Progress: 100%|██████████| 10/10 [08:00<00:00, 48.03s/it, Active: 33, Dead: 13]
HOTA Score before Tuning (BYTETracker): 0.424
import optuna

hparam_search_space = {
    'self': {
        'detection_score_threshold': {'type': 'float', 'low': 0.1, 'high': 0.5},
    },
    'motion_model':{
        'process_noise': {'type': 'logfloat', 'low': 100,'high': 1000},
        'measurement_noise': {'type': 'logfloat','low': 0.1,'high': 100},
    },
    'first_matching_fn':{
        'motion_metric_gate': {'type': 'float', 'low': 1e-4, 'high': 1},
        'visual_metric_gate': {'type': 'float', 'low': 1e-4, 'high': 1},
        'beta': {'type': 'logfloat', 'low': 1e-4, 'high': 1},
    },
    'second_matching_fn':{
        'gate': {'type': 'float', 'low': 0.1, 'high': 1}
    }
}

sampler = optuna.samplers.CmaEsSampler()
pruner = optuna.pruners.HyperbandPruner()

best_params, best_hota, study = tracker.tune_hparams(
    frames_list=[frames],
    bbdf_gt_list=[bbdf_gt],
    n_trials=10,
    reuse_detections=True,  # run detection once and reuse the results across trials
    hparam_search_space=hparam_search_space,
    verbose=False,  # keep the tuning output quiet
    sampler=sampler,
    pruner=pruner,
    return_study=True,
)

print(f"Best HOTA: {best_hota:.3f}")
print(f"HOTA score after tuning (BYTETracker): {best_hota:.3f}")
tune_hparams:0284  💬| Hyperparameter search space: 
tune_hparams:0286  💬| self: 
tune_hparams:0288  💬| 	detection_score_threshold: {'type': 'float', 'low': 0.1, 'high': 0.5} 
tune_hparams:0286  💬| motion_model: 
tune_hparams:0288  💬| 	process_noise: {'type': 'logfloat', 'low': 100, 'high': 1000} 
tune_hparams:0288  💬| 	measurement_noise: {'type': 'logfloat', 'low': 0.1, 'high': 100} 
tune_hparams:0286  💬| first_matching_fn: 
tune_hparams:0288  💬| 	motion_metric_gate: {'type': 'float', 'low': 0.0001, 'high': 1} 
tune_hparams:0288  💬| 	visual_metric_gate: {'type': 'float', 'low': 0.0001, 'high': 1} 
tune_hparams:0288  💬| 	beta: {'type': 'logfloat', 'low': 0.0001, 'high': 1} 
tune_hparams:0286  💬| second_matching_fn: 
tune_hparams:0288  💬| 	gate: {'type': 'float', 'low': 0.1, 'high': 1} 
Detecting frames for reuse: 100%|██████████| 10/10 [05:23<00:00, 32.39s/it]
Tracking Progress: 100%|██████████| 10/10 [00:59<00:00,  5.93s/it, Active: 23, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:18<00:00,  1.85s/it, Active: 25, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:14<00:00,  1.42s/it, Active: 23, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:23<00:00,  2.31s/it, Active: 26, Dead: 1]
Tracking Progress: 100%|██████████| 10/10 [00:08<00:00,  1.25it/s, Active: 27, Dead: 1]
Tracking Progress: 100%|██████████| 10/10 [00:04<00:00,  2.18it/s, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:18<00:00,  1.83s/it, Active: 28, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:20<00:00,  2.04s/it, Active: 27, Dead: 1]
Tracking Progress: 100%|██████████| 10/10 [00:11<00:00,  1.17s/it, Active: 26, Dead: 0]
Tracking Progress: 100%|██████████| 10/10 [00:12<00:00,  1.23s/it, Active: 25, Dead: 0]
Best HOTA: 0.506
HOTA score after tuning (BYTETracker): 0.506
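The BYTETracker above takes two matching functions and a detection_score_threshold. The general ByteTrack idea is to associate high-score detections first and then give the remaining tracks a second chance against low-score detections. The sketch below illustrates that two-stage flow with a 1 − IoU cost. The function names, the greedy matcher (standing in for an optimal assignment solver), and the gate values are assumptions for illustration and do not reproduce SportsLabKit's implementation.

```python
def iou(a, b):
    """Intersection-over-Union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def greedy_match(track_ids, det_ids, tracks, dets, gate):
    """Pair tracks and detections in order of ascending cost (1 - IoU).

    Pairs whose cost exceeds the gate are never matched.
    """
    pairs = sorted(
        (1.0 - iou(tracks[t], dets[d]["box"]), t, d)
        for t in track_ids
        for d in det_ids
    )
    matches, used_tracks, used_dets = [], set(), set()
    for cost, t, d in pairs:
        if cost <= gate and t not in used_tracks and d not in used_dets:
            matches.append((t, d))
            used_tracks.add(t)
            used_dets.add(d)
    return matches


def two_stage_associate(tracks, dets, score_threshold=0.6, gate1=0.5, gate2=0.9):
    """Stage 1: high-score detections. Stage 2: low-score ones vs. leftover tracks."""
    high = [i for i, d in enumerate(dets) if d["score"] >= score_threshold]
    low = [i for i, d in enumerate(dets) if d["score"] < score_threshold]
    first = greedy_match(list(tracks), high, tracks, dets, gate1)
    matched = {t for t, _ in first}
    remaining = [t for t in tracks if t not in matched]
    second = greedy_match(remaining, low, tracks, dets, gate2)
    return first, second


# Predicted boxes for two tracks, and two detections with different scores.
tracks = {1: (0, 0, 10, 10), 2: (20, 20, 30, 30)}
dets = [
    {"box": (1, 1, 11, 11), "score": 0.9},    # confident detection
    {"box": (21, 20, 31, 30), "score": 0.3},  # weak detection, e.g. occluded
]
first, second = two_stage_associate(tracks, dets)
```

In this example, track 1 is matched to the confident detection in the first stage, and track 2 is recovered in the second stage from a detection that a single-threshold tracker would have discarded.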

TeamTrack Tracker

Work in progress