SynthSeg 2.0 — 33-class brain MRI

Contrast-agnostic whole-brain segmentation (Iglesias group, MGH; Apache-2.0). A single 5-level 3D U-Net (13.24 M parameters) feeds softmax → GaussianBlur(σ=0.5) → argmax. The forward pass runs on WebGPU + WASM; all post-network ops run in SIMD128 host helpers. Output verified 100.0000 % bit-exact against ORT-CPU at 256³ on a real T1 scan.
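The post-network chain (softmax → GaussianBlur(σ=0.5) → argmax) can be sketched on the CPU as below. This is an illustrative reference, not the demo's SIMD128 helpers: the function names, the channel-major [C, D·H·W] layout, the 3-tap kernel radius, and the clamp-to-edge border mode are all assumptions. It returns class indices (0..C-1), not FreeSurfer label IDs.

```typescript
type Dims = { d: number; h: number; w: number };

// Normalised 1-D Gaussian taps. The radius is an assumption of this sketch.
function gaussianTaps(sigma: number, radius: number): number[] {
  const taps: number[] = [];
  let sum = 0;
  for (let i = -radius; i <= radius; i++) {
    const t = Math.exp(-(i * i) / (2 * sigma * sigma));
    taps.push(t);
    sum += t;
  }
  return taps.map((t) => t / sum);
}

// In-place softmax across the channel axis of a channel-major buffer.
function softmaxChannels(x: Float32Array, classes: number, voxels: number): void {
  for (let v = 0; v < voxels; v++) {
    let max = -Infinity;
    for (let c = 0; c < classes; c++) max = Math.max(max, x[c * voxels + v]);
    let sum = 0;
    for (let c = 0; c < classes; c++) {
      const e = Math.exp(x[c * voxels + v] - max);
      x[c * voxels + v] = e;
      sum += e;
    }
    for (let c = 0; c < classes; c++) x[c * voxels + v] /= sum;
  }
}

// Blur one channel along one axis, clamping reads at the volume border.
function blurAxis(src: Float32Array, dims: Dims, axis: 0 | 1 | 2, taps: number[]): Float32Array {
  const size = [dims.d, dims.h, dims.w];
  const stride = [dims.h * dims.w, dims.w, 1];
  const r = (taps.length - 1) / 2;
  const dst = new Float32Array(src.length);
  for (let z = 0; z < dims.d; z++)
    for (let y = 0; y < dims.h; y++)
      for (let x = 0; x < dims.w; x++) {
        const pos = [z, y, x];
        const base = z * stride[0] + y * stride[1] + x;
        let acc = 0;
        for (let k = -r; k <= r; k++) {
          const p = Math.min(size[axis] - 1, Math.max(0, pos[axis] + k));
          acc += taps[k + r] * src[base + (p - pos[axis]) * stride[axis]];
        }
        dst[base] = acc;
      }
  return dst;
}

// softmax -> separable Gaussian blur per class -> running argmax over classes.
function segment(logits: Float32Array, classes: number, dims: Dims, sigma = 0.5): Uint8Array {
  const voxels = dims.d * dims.h * dims.w;
  const probs = new Float32Array(logits); // copy, so the caller's logits stay untouched
  softmaxChannels(probs, classes, voxels);
  const taps = gaussianTaps(sigma, 1);
  const labels = new Uint8Array(voxels);
  const best = new Float32Array(voxels).fill(-Infinity);
  for (let c = 0; c < classes; c++) {
    let ch: Float32Array = probs.subarray(c * voxels, (c + 1) * voxels);
    for (const axis of [0, 1, 2] as const) ch = blurAxis(ch, dims, axis, taps);
    for (let v = 0; v < voxels; v++) {
      if (ch[v] > best[v]) { best[v] = ch[v]; labels[v] = c; }
    }
  }
  return labels;
}
```

Keeping a running maximum per voxel means only one blurred channel is alive at a time, rather than a second full 33-class volume. With σ = 0.5 the centre tap dominates, so the blur only flips voxels whose class probabilities are already close.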
✓ Direct 256³ inference, no tiling. The fused synthseg_skip_up_conv3d.wgsl kernel eliminates the 4.83 GB concatenation buffer at decoder level 3, which would exceed WebGPU's 4 GB single-buffer cap. Instead of materializing the concatenation, the kernel reads skip_0 and dec2_bn directly and applies the nearest-neighbor 2× upsample on the fly.
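To make the fused read concrete, here is a small CPU sketch of the index arithmetic such a kernel could use. It is not the contents of synthseg_skip_up_conv3d.wgsl. The function names are hypothetical, the total of 72 channels is inferred from the 4.83 GB figure (4.83 GB / (256³ × 4 bytes) ≈ 72), and the concatenation order (skip channels first, then upsampled decoder channels) is an assumption.

```typescript
// Size in bytes of an fp32 tensor with `channels` channels over a side^3 volume.
function concatBytes(side: number, channels: number): number {
  return side ** 3 * channels * 4;
}

// Value that cat(skip, upsample2x(dec))[c, z, y, x] would hold, computed
// without ever building the concatenated tensor.
//   skip: [skipCh, side, side, side]       full-resolution skip connection
//   dec:  [decCh, side/2, side/2, side/2]  half-resolution decoder output
function readFused(
  skip: Float32Array,
  dec: Float32Array,
  side: number,
  skipCh: number,
  c: number,
  z: number,
  y: number,
  x: number,
): number {
  if (c < skipCh) {
    return skip[((c * side + z) * side + y) * side + x];
  }
  // Nearest-neighbour 2x upsample: every output voxel maps to floor(coord / 2).
  const half = side >> 1;
  const dc = c - skipCh;
  return dec[((dc * half + (z >> 1)) * half + (y >> 1)) * half + (x >> 1)];
}

// 72 channels at 256^3 in fp32, compared against a 4 GiB buffer cap.
const catBytes = concatBytes(256, 72); // 4_831_838_208 bytes
const exceedsCap = catBytes > 4 * 2 ** 30;
```

A convolution built on `readFused` touches only the two source tensors, so the peak allocation is the two inputs plus the output rather than an extra 72-channel intermediate.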
⚠ Memory: the full 256³ × 33-class logits accumulator is 2.21 GB (2.06 GiB) in fp32. Apple Silicon UMA machines (16 GB+) handle this comfortably; a discrete-GPU machine whose browser tab is limited to under 4 GB may run out of memory. The forward pass stays on the GPU; readback is auto-chunked into 3 MAP_READ slabs of ≤ 0.7 GiB each to avoid the Dawn / wasm_webgpu limit on bindings larger than 2 GB.
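The slab arithmetic can be checked with a short planning sketch. It assumes a channel-major logits layout and a split along the class axis, neither of which the page states. `planReadback` and the `Slab` type are hypothetical names, and the 0.7 GiB cap is passed in as a parameter.

```typescript
interface Slab {
  firstClass: number;
  classCount: number;
  byteOffset: number;
  byteLength: number;
}

// Split a [classes, voxels] fp32 buffer into slabs of whole class volumes,
// each no larger than maxSlabBytes.
function planReadback(voxels: number, classes: number, maxSlabBytes: number): Slab[] {
  const bytesPerClass = voxels * 4; // fp32
  const classesPerSlab = Math.floor(maxSlabBytes / bytesPerClass);
  if (classesPerSlab < 1) {
    throw new Error("slab limit is smaller than one class volume");
  }
  const slabs: Slab[] = [];
  for (let first = 0; first < classes; first += classesPerSlab) {
    const count = Math.min(classesPerSlab, classes - first);
    slabs.push({
      firstClass: first,
      classCount: count,
      byteOffset: first * bytesPerClass,
      byteLength: count * bytesPerClass,
    });
  }
  return slabs;
}

// 256^3 voxels, 33 classes, 0.7 GiB cap: three slabs of 11 classes each.
const plan = planReadback(256 ** 3, 33, 0.7 * 2 ** 30);
```

Under these assumptions each slab is 738,197,504 bytes (about 0.69 GiB) and the three together cover the full 2,214,592,512-byte accumulator. In a real readback each slab would be copied into a MAP_READ staging buffer and mapped in turn.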


Orthogonal slices (T1 + 33-class overlay)