custom_ops ViT-B/16 web test

ImageNet 1000-class Vision Transformer (ViT-B/16: patch size 16, hidden size 768, 12 blocks, 12 heads). The input is the precomputed x.bin (a 224×224 RGB image, ImageNet-normalized). All compute (WebGPU + WASM) runs in a dedicated Web Worker, so the page stays responsive during model init.
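The geometry stated above fixes the size of the input tensor, so a loader can check x.bin before handing it to the worker. The following is a minimal sketch, not the harness's actual loader: it assumes x.bin is a raw, headerless float32 buffer with 3 × 224 × 224 elements, and `parseInput` is a hypothetical helper name.

```typescript
// Geometry taken from the description above (ViT-B/16).
const IMAGE_SIZE = 224;
const PATCH_SIZE = 16;
const HIDDEN = 768;
const HEADS = 12;
const CHANNELS = 3; // RGB

// Derived quantities.
const patchesPerSide = IMAGE_SIZE / PATCH_SIZE;           // 14
const numPatches = patchesPerSide * patchesPerSide;       // 196
const headDim = HIDDEN / HEADS;                           // 64
const inputElements = CHANNELS * IMAGE_SIZE * IMAGE_SIZE; // 150528

// Hypothetical helper: validates the byte length and views the buffer as float32.
// ASSUMPTION: x.bin is raw float32 with no header; the real file layout may differ.
function parseInput(buffer: ArrayBuffer): Float32Array {
  const expectedBytes = inputElements * Float32Array.BYTES_PER_ELEMENT;
  if (buffer.byteLength !== expectedBytes) {
    throw new Error(
      `x.bin: expected ${expectedBytes} bytes, got ${buffer.byteLength}`
    );
  }
  return new Float32Array(buffer);
}
```

Failing early on a size mismatch keeps a truncated or wrongly exported file from surfacing later as a confusing Top-5 difference.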

custom_ops ViT-B/16 — verification harness

Top-5

Top-5 (PyTorch reference)
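The two Top-5 panels are meant to be compared against each other. One way to do that check in code is sketched below, assuming the model output is a vector of raw logits and that probabilities come from a softmax. `topK`, `topKMatches`, the `Prediction` shape, and the tolerance value are hypothetical and not taken from the harness.

```typescript
interface Prediction {
  classIndex: number;
  probability: number;
}

// Numerically stable softmax followed by selection of the k most likely classes.
function topK(logits: Float32Array, k: number = 5): Prediction[] {
  let max = -Infinity;
  for (let i = 0; i < logits.length; i++) {
    if (logits[i] > max) max = logits[i];
  }
  const exps = new Float64Array(logits.length);
  let sum = 0;
  for (let i = 0; i < logits.length; i++) {
    exps[i] = Math.exp(logits[i] - max); // subtract max to avoid overflow
    sum += exps[i];
  }
  const all: Prediction[] = [];
  for (let i = 0; i < exps.length; i++) {
    all.push({ classIndex: i, probability: exps[i] / sum });
  }
  all.sort((a, b) => b.probability - a.probability);
  return all.slice(0, k);
}

// Compares two ranked lists: same classes in the same order,
// with probabilities within an (assumed) absolute tolerance.
function topKMatches(
  actual: Prediction[],
  reference: Prediction[],
  tolerance: number = 1e-3
): boolean {
  if (actual.length !== reference.length) return false;
  return actual.every(
    (p, i) =>
      p.classIndex === reference[i].classIndex &&
      Math.abs(p.probability - reference[i].probability) <= tolerance
  );
}
```

Requiring identical order is a strict criterion: two classes with nearly equal probability can swap places under small numeric differences, so a looser check (same set of classes, or a tolerance on the logits) may suit some runs better.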

Log