This browser complements the narrative Benchmark Results and TabArena Results pages with an interactive explorer backed by the published public data bundles.
The benchmark tab covers the public HDLSS Val-18 / Val-19 / Val-20 / Val-21 bundle, including per-run metrics, profile summaries, dataset metadata, and the SOTA comparison bands shown on the results page. The Auto Router tab summarizes the packaged V25 router evidence and candidate policy. The TabArena tab loads the public general-tabular snapshot when that bundle is available at publish time.
Use the benchmark tab to slice the HDLSS validation surface by family, campaign, campaign scope, tier, or domain. Use the Auto Router tab to inspect the V25 training-CV policy, candidate selections, and current holdout status. Use the TabArena tab to inspect the general-tabular comparison snapshot and the per-dataset gap against the current official best method. For a compact guide to the campaign families and datasets exposed here, see Browser Data Guide. For the published seeds and run settings behind a profile, open the Profile Config Browser.
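If you prefer to reproduce a benchmark-tab slice offline, the filtering described above can be sketched in a few lines of Python. This is only an illustration: the row layout, the field names (`dataset`, `profile`, `family`, `tier`, `score`), the sample values, and the assumption that a higher score is better are all hypothetical and are not taken from the published bundle schema.

```python
# Hypothetical rows; field names and values are illustrative only,
# not the schema of the published data bundle.
RUNS = [
    {"dataset": "ds_a", "profile": "p1", "family": "fam_x", "tier": "core", "score": 0.81},
    {"dataset": "ds_a", "profile": "p2", "family": "fam_y", "tier": "core", "score": 0.86},
    {"dataset": "ds_a", "profile": "p3", "family": "fam_x", "tier": "extended", "score": 0.90},
    {"dataset": "ds_b", "profile": "p1", "family": "fam_x", "tier": "core", "score": 0.74},
    {"dataset": "ds_b", "profile": "p2", "family": "fam_x", "tier": "core", "score": 0.70},
]


def slice_runs(runs, **filters):
    """Keep only the rows whose fields equal every given filter value."""
    return [r for r in runs if all(r.get(k) == v for k, v in filters.items())]


def best_profile_per_dataset(runs):
    """Map each dataset to its highest-scoring row (assumes higher is better)."""
    best = {}
    for row in runs:
        current = best.get(row["dataset"])
        if current is None or row["score"] > current["score"]:
            best[row["dataset"]] = row
    return best


if __name__ == "__main__":
    selected = slice_runs(RUNS, family="fam_x", tier="core")
    for dataset, row in sorted(best_profile_per_dataset(selected).items()):
        print(dataset, row["profile"], row["score"])
```

Filtering first and then taking the per-dataset maximum mirrors the "best filtered profile per dataset" idea: changing the filters changes which profile wins each dataset.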
Interactive browser
Interactive result explorer
Explore the published benchmark results at your own pace, compare the strongest profiles, and jump into the exact seeds and run settings behind any profile when you want the full picture.
Best filtered profile per dataset. Click a point to focus the dataset detail view.
Family frontier
Mean filtered profile performance by experiment family.
Dataset detail
Top profiles for the selected dataset, or the global filtered frontier when nothing is selected.
V25 policy slices
Training-CV aggregate and the latest available frozen-router holdout context.
Candidate selection counts
How often each supported V25 candidate was selected by the calibrated policy.
Training datasets
Out-of-fold V25 policy deltas against the current default, aggregated by dataset.
Leaderboard snapshot
Overall Elo ladder with the current `tabnetics (general)` row highlighted.
Per-dataset gap to best official method
A positive delta means tabnetics is behind the official best method on that dataset; a negative delta means tabnetics wins that dataset slice.
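The sign convention for the per-dataset gap can be made concrete with a small sketch. It assumes a lower-is-better metric such as an error rate, so the gap is the tabnetics value minus the best official value; the function name `gap_to_best`, the method names, and the numbers are hypothetical and do not come from the TabArena snapshot.

```python
def gap_to_best(tabnetics_error, official_errors):
    """Return (best_method, delta) for one dataset.

    Assumes a lower-is-better metric. delta > 0 means tabnetics is behind
    the best official method; delta < 0 means tabnetics wins the dataset.
    """
    best_method = min(official_errors, key=official_errors.get)
    return best_method, tabnetics_error - official_errors[best_method]


if __name__ == "__main__":
    # Hypothetical values chosen only to show both signs.
    print(gap_to_best(0.25, {"method_1": 0.5, "method_2": 0.125}))
    print(gap_to_best(0.25, {"method_1": 0.5, "method_2": 0.375}))
```

For a higher-is-better metric the subtraction would be reversed to keep the same reading of positive and negative deltas.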
Documentation and webpages on this site are generated from authoritative internal sources using a combination of deterministic rules and generative AI. Errors are possible. Please report issues via GitHub Discussions or email [email protected].