VISTA3D — Multi-class Worker (v3)

Full-body CT multi-class segmentation in the browser. A single sliding pass produces all 118 class channels (MONAI EVERYTHING_PROMPT) via a tile-based GPU accumulator. Encoder/decoder/post_mapping run once per patch and are shared across all classes; only the final class-embedding matmul scales with N_cls. Runs in a Web Worker, so the UI stays live during the multi-minute sliding pass.  ·  ← v2 (single-class)
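A minimal 1-D sketch of the overlap-averaging the sliding pass performs (the real worker does this in 3-D on the GPU, tiled across the 118 class channels; constant importance weighting assumed here, and `run` stands in for the per-patch network pass):

```typescript
// Sliding-window inference sketch: run `run` once per patch, accumulate
// outputs and overlap counts, then divide. 1-D, constant importance.
function slidingAverage(
  run: (patch: Float32Array) => Float32Array, // one network pass per patch
  input: Float32Array,
  win: number,
  stride: number,
): Float32Array {
  const L = input.length;
  // Number of windows, MONAI-style: last window is clamped to the edge.
  const n = Math.max(1, Math.ceil((L - win) / stride) + 1);
  const accum = new Float32Array(L);
  const weight = new Float32Array(L);
  for (let k = 0; k < n; k++) {
    const s = Math.min(k * stride, L - win); // clamp final patch to the end
    const out = run(input.subarray(s, s + win));
    for (let i = 0; i < win; i++) {
      accum[s + i] += out[i];
      weight[s + i] += 1;
    }
  }
  for (let i = 0; i < L; i++) accum[i] /= weight[i] || 1;
  return accum;
}
```

With an identity `run`, overlapping contributions average back to the input, which is a quick sanity check for the accumulator math.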
✓ Node M1 verified (2026-04-24): 213×213×163 / 18 patches / 118 classes → 71.5 s total, 2 tiles × 1.75/1.74 GB accum. Liver voxel recall 99.23% vs single-class reference. Single-class rel_rms 4.4e-7 vs MONAI SlidingWindowInferer. All fp32, same precision as PyTorch MPS (~1e-6 floor).
⚠ Download: ~786 MB weights + 28 MB canonical CT.
✨ Dual memory mode, auto-picked per device. A real clinical CT (512×512×300 → canonical 239×239×200) pushes peak memory to ~10 GB, which fits M1 Pro/Max unified memory on the fast path; M1 8 GB and 8 GB discrete GPUs should pick Safe. 2 GB integrated GPUs would still require fp16 (not yet implemented).
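A hedged sketch of the mode auto-pick; the threshold, mode names, and the idea of passing in a reported memory figure are all illustrative, not the actual heuristic:

```typescript
type MemMode = "fast" | "safe";

// Pick "fast" only when total (unified) memory comfortably exceeds the
// ~10 GB peak quoted above; everything else falls back to "safe".
// 16 GB cutoff is an assumption matching the M1 Pro/Max vs M1 8 GB split.
function pickMemMode(totalGB: number): MemMode {
  return totalGB >= 16 ? "fast" : "safe";
}
```

Note that `navigator.deviceMemory` is capped at 8 in browsers that expose it, so a real implementation cannot rely on it alone to distinguish 16 GB from 32 GB machines.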
The demo CT uses a Python-preprocessed canonical volume (bit-exact vs MONAI; this is what the "liver recall 99.23%" parity figure measures). The upload path runs our on-device preprocess (ScaleIntensityRange + Orient→RAS + Spacing→1.5 mm iso, trilinear with align_corners=False); it diverges ~5% rel_rms from VISTA3D's training-time align_corners=True convention, but is fine for upload visualization.
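The intensity step can be sketched as a MONAI-style ScaleIntensityRange: a linear map from an input HU window to an output range, with clipping. The HU window in the usage line is illustrative, not necessarily VISTA3D's:

```typescript
// Linearly map [aMin, aMax] (HU window) to [bMin, bMax], clipping outside.
// Mirrors the semantics of MONAI's ScaleIntensityRange transform.
function scaleIntensityRange(
  v: Float32Array,
  aMin: number,
  aMax: number,
  bMin = 0,
  bMax = 1,
  clip = true,
): Float32Array {
  const scale = (bMax - bMin) / (aMax - aMin);
  const out = new Float32Array(v.length);
  for (let i = 0; i < v.length; i++) {
    let x = (v[i] - aMin) * scale + bMin;
    if (clip) x = Math.min(bMax, Math.max(bMin, x));
    out[i] = x;
  }
  return out;
}

// Example (illustrative window): maps -1000 HU → 0, 0 HU → 0.5, 1000 HU → 1.
const normalized = scaleIntensityRange(
  Float32Array.from([-1000, 0, 1000, 2000]),
  -1000,
  1000,
);
```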

Log

Axial slice viewer

CT canonical input
Label map (argmax + threshold)
Overlay
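The "argmax + threshold" label map above can be sketched as: per voxel, take the winning class channel, but keep it only if its score clears a threshold, else background. The channel-major layout, sigmoid scoring, and 0.5 threshold are assumptions for illustration:

```typescript
// Per-voxel label = argmax over class channels, kept only where the winning
// sigmoid probability clears `thresh`; 0 = background, classes 1-indexed.
// Layout assumption: logits[c] is the flat volume for class channel c.
function labelMap(
  logits: Float32Array[],
  nVox: number,
  thresh = 0.5,
): Uint8Array {
  const labels = new Uint8Array(nVox); // all background initially
  for (let v = 0; v < nVox; v++) {
    let best = -Infinity;
    let arg = 0;
    for (let c = 0; c < logits.length; c++) {
      if (logits[c][v] > best) {
        best = logits[c][v];
        arg = c;
      }
    }
    const prob = 1 / (1 + Math.exp(-best)); // sigmoid on the winning logit
    if (prob >= thresh) labels[v] = arg + 1;
  }
  return labels;
}
```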