Evaluation

The iMED Challenge evaluates held-out surgical scenes using task-specific metrics aligned with the official Synapse task definitions: geometric accuracy for iMED PE and photometric fidelity for iMED NVS.

Evaluation metrics

iMED PE

Primary metric: Absolute Trajectory Error (ATE) after scale alignment to the metric trajectory

Also reported: % Successfully Registered Frames, Runtime
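As a rough illustration of the primary metric, the sketch below aligns an estimated camera trajectory to the metric ground truth with a least-squares similarity transform (scale, rotation, translation; the Umeyama method) and then reports the RMSE of the remaining position error. The choice of a full similarity alignment, the RMSE form of ATE, and the function names `align_similarity` and `ate_rmse` are assumptions made for this example; the official scripts define the authoritative computation.

```python
import numpy as np

def align_similarity(est, gt):
    """Least-squares scale s, rotation R, translation t with gt ~ s * R @ est + t.

    est, gt: (N, 3) arrays of time-matched camera positions (assumed layout).
    """
    mu_e, mu_g = est.mean(axis=0), gt.mean(axis=0)
    e, g = est - mu_e, gt - mu_g
    cov = g.T @ e / len(est)                 # 3x3 cross-covariance
    U, D, Vt = np.linalg.svd(cov)
    S = np.eye(3)
    if np.linalg.det(U) * np.linalg.det(Vt) < 0:
        S[2, 2] = -1.0                       # avoid returning a reflection
    R = U @ S @ Vt
    var_e = (e ** 2).sum() / len(est)        # variance of the estimated points
    s = np.trace(np.diag(D) @ S) / var_e
    t = mu_g - s * R @ mu_e
    return s, R, t

def ate_rmse(est, gt):
    """RMSE of position error after similarity alignment (assumed ATE form)."""
    s, R, t = align_similarity(est, gt)
    aligned = s * (R @ est.T).T + t
    err = np.linalg.norm(aligned - gt, axis=1)
    return float(np.sqrt(np.mean(err ** 2)))
```

An estimate that differs from the ground truth only by scale, rotation, and translation yields an ATE of approximately zero under this sketch, so the metric penalizes trajectory shape error rather than the arbitrary scale of a monocular reconstruction.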

iMED NVS

Primary metrics: PSNR, SSIM

Also reported: Training FPS, Rendering FPS
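For orientation, PSNR between a rendered view and its held-out reference can be sketched as follows. The function name `psnr`, the `max_val` parameter, and the assumption that images are arrays scaled to [0, 1] are illustrative choices and are not taken from the official scripts. SSIM is windowed and more involved; a library routine such as scikit-image's `structural_similarity` is one possible way to compute it.

```python
import numpy as np

def psnr(rendered, reference, max_val=1.0):
    """Peak signal-to-noise ratio in dB between two same-shaped images.

    max_val is the largest possible pixel value (1.0 for images assumed
    to be scaled to [0, 1], 255 for 8-bit images).
    """
    rendered = np.asarray(rendered, dtype=np.float64)
    reference = np.asarray(reference, dtype=np.float64)
    mse = np.mean((rendered - reference) ** 2)
    if mse == 0:
        return float("inf")                  # identical images
    return float(10.0 * np.log10(max_val ** 2 / mse))
```

Higher values indicate a closer match: a uniform error of 0.1 on a [0, 1] image gives a mean squared error of 0.01, which corresponds to 20 dB.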

Metric summary

| Task | Per-sequence computation | Aggregate summary | Missing outputs |
| --- | --- | --- | --- |
| iMED PE | ATE after scale alignment to the metric trajectory; successful registration percentage and runtime are also reported | Mean ATE over all test sequences | Missing outputs count as failure cases with zero successful registrations |
| iMED NVS | PSNR and SSIM computed on held-out Endoscope 1 views; training FPS and rendering FPS are also reported | Mean PSNR and mean SSIM over all test sequences | Missing outputs contribute zero-quality frames in the summary |
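The aggregation for iMED NVS can be sketched as below, under one possible reading of the missing-output rule: every expected frame without a submitted score is counted as 0, each sequence is averaged over its expected frames, and the summary is the mean over sequences. The function names, the dictionary layout, and the exact value assigned to a missing frame are assumptions for illustration only; the official scripts may handle these cases differently.

```python
from statistics import mean

def sequence_score(frame_scores, expected_frames):
    """Mean score over all expected frames of one sequence.

    frame_scores: dict mapping frame id -> score (e.g. PSNR or SSIM).
    Frames that are missing from the submission score 0.0 (assumption).
    """
    return mean(frame_scores.get(frame, 0.0) for frame in expected_frames)

def challenge_summary(submission, expected):
    """Mean of per-sequence scores over every test sequence.

    submission: dict mapping sequence id -> {frame id: score}.
    expected:   dict mapping sequence id -> list of expected frame ids.
    A sequence that is absent from the submission scores 0.0 on all frames.
    """
    return mean(
        sequence_score(submission.get(seq, {}), frames)
        for seq, frames in expected.items()
    )
```

For example, with two test sequences of two frames each, a submission that scores 30 and 20 on the first sequence and omits the second would receive (25 + 0) / 2 = 12.5 under this reading, which shows how incomplete submissions pull the summary down.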

Data usage policy

External data

Publicly available external datasets and pretrained models are allowed, provided their use is documented clearly in the submitted method description.

Private data

Private clinical data and other non-public institutional datasets are not allowed for training, tuning, or evaluation.

Validation and test philosophy

  • Participants are expected to self-evaluate on public development data using official scripts.
  • Intermediate validation submissions are planned before the final test phase.
  • Final rankings are computed only on the official test scenes.

Publication policy

Teams with valid submissions are eligible for inclusion in the joint challenge paper, with up to three named contributors per team depending on total participation. Teams must submit a structured method report and may later publish separate papers that cite the official challenge publication.