TabArena Benchmark Results

Context: TabArena is a NeurIPS 2025 living tabular benchmark; the comparison on this page uses its 38 classification datasets. The page positions the current tabnetics general run against the current TabArena leaderboard using the same core scoring model: task-weighted pairwise battles, maximum-likelihood (MLE) Elo with bootstrap confidence intervals, average rank, win rate, mean reciprocal rank (MRR), and normalized score.
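To make the scoring model more concrete, the sketch below fits maximum-likelihood Elo ratings to pairwise battles using a Bradley-Terry model with a plain minorization-maximization update. It is an illustration under stated assumptions, not the TabArena or tabnetics implementation: the function names, the toy error values, the tie handling (half a win each), the base-10 400-point scale, and the anchoring of one method at 1000 are all choices made for this example. Task weighting and the bootstrap are left out.

```python
import math
from itertools import combinations


def pairwise_wins(errors):
    """Turn {dataset: {method: error}} into wins[(a, b)] = battles a won against b.

    Lower error wins; a tie counts as half a win for each side (assumption).
    """
    wins = {}
    for per_method in errors.values():
        for a, b in combinations(sorted(per_method), 2):
            if per_method[a] < per_method[b]:
                score_a = 1.0
            elif per_method[a] > per_method[b]:
                score_a = 0.0
            else:
                score_a = 0.5
            wins[(a, b)] = wins.get((a, b), 0.0) + score_a
            wins[(b, a)] = wins.get((b, a), 0.0) + (1.0 - score_a)
    return wins


def mle_elo(wins, anchor, anchor_elo=1000.0, iterations=2000):
    """Bradley-Terry maximum-likelihood strengths via MM updates, mapped to Elo."""
    methods = sorted({m for pair in wins for m in pair})
    strength = {m: 1.0 for m in methods}
    for _ in range(iterations):
        updated = {}
        for a in methods:
            won = sum(wins.get((a, b), 0.0) for b in methods if b != a)
            if won == 0:
                raise ValueError(f"{a} has no wins; its MLE rating is unbounded below")
            denom = sum(
                (wins.get((a, b), 0.0) + wins.get((b, a), 0.0))
                / (strength[a] + strength[b])
                for b in methods
                if b != a
            )
            updated[a] = won / denom
        total = sum(updated.values())
        strength = {m: s / total for m, s in updated.items()}
    # Shift the scale so the anchor method sits exactly at anchor_elo.
    return {
        m: anchor_elo + 400.0 * math.log10(strength[m] / strength[anchor])
        for m in methods
    }


# Toy metric errors (made up for illustration, not benchmark results).
toy_errors = {
    "d1": {"method_a": 0.10, "method_b": 0.20, "method_c": 0.30},
    "d2": {"method_a": 0.15, "method_b": 0.10, "method_c": 0.25},
    "d3": {"method_a": 0.20, "method_b": 0.30, "method_c": 0.10},
    "d4": {"method_a": 0.05, "method_b": 0.07, "method_c": 0.09},
}
elo = mle_elo(pairwise_wins(toy_errors), anchor="method_b")
for method, rating in sorted(elo.items(), key=lambda kv: -kv[1]):
    print(f"{method}: {rating:.1f}")
```

Anchoring mirrors the nearby-rows table further down this page, where RF (default) sits at exactly 1000.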

Note: this is an informational snapshot showing how tabnetics competes on general tabular data. General-tabular optimization is not currently the focus of the library, and this page should be read as a reference point rather than as a formal leaderboard submission.

Interactive browser: The same public snapshot is also available in the static Results Browser alongside the HDLSS (high-dimension, low-sample-size) benchmark explorer.

General profile

Run configuration

Parameter	Value
Profile	general
Seeds	42
Max training samples	50000
Task timeout	3600 s
Workers	12
Classifier oracle	MNPO hybrid
Leaderboard bootstrap	200 rounds

Current run snapshot

On the 38 classification datasets currently covered by the merged general profile run, tabnetics (general) receives an Elo of 1012.1 in the overall leaderboard-style comparison, with a normalized score of 0.105.
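As a reading aid for the Elo figure, the sketch below converts an Elo gap into an expected head-to-head score under the conventional logistic Elo model. The base-10, 400-point scale is an assumption (this page does not state which scale the leaderboard uses), and `expected_score` is a name made up for the example. The comparison value of 1000 is the RF (default) entry in the nearby leaderboard rows on this page.

```python
def expected_score(elo_a: float, elo_b: float, scale: float = 400.0) -> float:
    """Expected score of A against B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((elo_b - elo_a) / scale))


# tabnetics (general) at 1012.1 against RF (default) at 1000, both from this page.
p = expected_score(1012.1, 1000.0)
print(f"{p:.3f}")  # roughly 0.517 if the 400-point scale applies
```

Under that assumption, a gap of about 12 points is close to an even matchup, and it is small next to the +105/-121 interval reported for the row.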

The overall leaderboard row and its nearest neighbors are shown below, followed by the corresponding binary and multiclass rows and the dataset-level results from this run against the current official benchmark table.

Tabnetics row

method	Elo	Elo 95% CI	Score	Rank	Winrate	MRR
tabnetics (general)	1012.1	+105/-121	0.105	32.56	0.283	0.116

Nearby overall leaderboard rows

method	Elo	Score	Rank	Winrate
NN_TORCH (default)	1079.2	0.017	29.68	0.348
FASTAI (default)	1040.5	0.038	31.39	0.309
tabnetics (general)	1012.1	0.105	32.56	0.283
RF (default)	1000	0.022	33.04	0.272
LR (tuned + ensemble)	980.6	0.027	33.78	0.255
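The Rank and Winrate columns above appear to be tied by a simple identity: a method with average rank r among N methods beats, on average, a fraction (N - r) / (N - 1) of its opponents. The check below applies this to the five rows shown. N = 45 is an assumption, since the number of compared methods is not stated on this page; 45 is the largest value in the Dataset rank column of the per-dataset table and the value that makes these rows line up. `winrate_from_rank` is a name made up for the example.

```python
N_METHODS = 45  # assumption: not stated on this page


def winrate_from_rank(avg_rank: float, n_methods: int = N_METHODS) -> float:
    """Average fraction of opponents beaten, given an average rank (1 = best)."""
    return (n_methods - avg_rank) / (n_methods - 1)


# (method, reported Rank, reported Winrate), copied from the table above.
rows = [
    ("NN_TORCH (default)", 29.68, 0.348),
    ("FASTAI (default)", 31.39, 0.309),
    ("tabnetics (general)", 32.56, 0.283),
    ("RF (default)", 33.04, 0.272),
    ("LR (tuned + ensemble)", 33.78, 0.255),
]

for method, rank, winrate in rows:
    implied = winrate_from_rank(rank)
    print(f"{method:24s} implied={implied:.3f} reported={winrate:.3f}")

# Largest disagreement between the implied and the reported win rate.
max_gap = max(abs(winrate_from_rank(rank) - winrate) for _, rank, winrate in rows)
```

If the identity holds, Rank and Winrate carry the same information on two scales, so the ordering of methods is the same under either column.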

Binary leaderboard row

method	Elo	Elo 95% CI	Score	Rank	Winrate	MRR
tabnetics (general)	1008.3	+135/-151	0.076	32.84	0.276	0.099

Multiclass leaderboard row

method	Elo	Elo 95% CI	Score	Rank	Winrate	MRR
tabnetics (general)	1027.3	+269/-481	0.213	31.49	0.307	0.176
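The wider interval on the multiclass row is consistent with resampling only 8 datasets rather than the 30 in the binary subset. The sketch below shows a percentile bootstrap over datasets with 200 rounds, matching the round count in the run configuration. It resamples the mean of the multiclass Dataset rank values from the per-dataset table further down, not Elo, so it illustrates the resampling idea only. The function name, the seed, and the choice of statistic are assumptions for the example, and the leaderboard's actual procedure may differ.

```python
import random
import statistics


def bootstrap_ci(values, statistic, rounds=200, alpha=0.05, seed=0):
    """Percentile bootstrap: resample with replacement, recompute, take quantiles."""
    rng = random.Random(seed)
    n = len(values)
    estimates = sorted(
        statistic([values[rng.randrange(n)] for _ in range(n)])
        for _ in range(rounds)
    )
    lower = estimates[int((alpha / 2) * (rounds - 1))]
    upper = estimates[int((1 - alpha / 2) * (rounds - 1))]
    return lower, upper


# Dataset rank of tabnetics on the 8 multiclass datasets, copied from this page.
multiclass_ranks = [45, 37, 42, 45, 1, 5, 42, 32]

point = statistics.mean(multiclass_ranks)
low, high = bootstrap_ci(multiclass_ranks, statistics.mean)
print(f"mean rank {point:.2f}, 95% bootstrap interval [{low:.2f}, {high:.2f}]")
```

With so few datasets, a single resample that happens to include or omit the two top-5 results shifts the mean substantially, which is why small subsets tend to produce wide intervals.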

Per-dataset comparison against the current official best method

Dataset	problem_type	Bal. Acc.	Tabnetics metric_error	Best official metric_error	Best official method	Delta vs best	Dataset rank	Selected model
APSFailure	binary	0.957	0.0139	0.0071	TABICL (default)	0.0069	42	mnpo_lr
Amazon_employee_access	binary	0.756	0.1437	0.1168	CAT (tuned)	0.0269	11	mnpo_rf
Bank_Customer_Churn	binary	0.766	0.1589	0.1256	TABPFNV2 (tuned)	0.0333	39	mnpo_lr
Bioresponse	binary	0.788	0.1459	0.1243	XGB (tuned + ensemble)	0.0216	36	mnpo_rf
Diabetes130US	binary	0.573	0.39	0.3277	GBM (tuned + ensemble)	0.0623	41	mnpo_lr
E-CommereShippingData	binary	0.688	0.2715	0.2557	TABPFNV2 (default)	0.0159	42	mnpo_lr
Fitness_Club	binary	0.735	0.1911	0.1781	TABPFNV2 (default)	0.013	32	mnpo_lr
GiveMeSomeCredit	binary	0.71	0.2281	0.1329	TABM (tuned + ensemble)	0.0953	42	mnpo_lr
HR_Analytics_Job_Change_of_Data_Scientists	binary	0.73	0.2208	0.1947	TABICL (default)	0.0261	42	mnpo_rf
Is-this-a-good-customer	binary	0.678	0.298	0.2495	EBM (default)	0.0485	41	mnpo_lr
Marketing_Campaign	binary	0.784	0.1342	0.0806	TABPFNV2 (tuned + ensemble)	0.0536	42	mnpo_lr
NATICUSdroid	binary	0.931	0.0199	0.0126	TABICL (default)	0.0074	40	mnpo_lr
bank-marketing	binary	0.697	0.2395	0.2344	CAT (default)	0.0051	27	mnpo_lr
blood-transfusion-service-center	binary	0.734	0.2189	0.2445	FASTAI (tuned + ensemble)	-0.0256	1	mnpo_lr
churn	binary	0.856	0.0934	0.0695	MNCA (default)	0.0238	38	mnpo_xgb
coil2000_insurance_policies	binary	0.632	0.3008	0.2268	TABPFNV2 (tuned + ensemble)	0.0739	40	mnpo_lr
credit-g	binary	0.661	0.2452	0.2037	GBM (tuned + ensemble)	0.0416	42	mnpo_lr
credit_card_clients_default	binary	0.695	0.271	0.2121	TABICL (default)	0.0589	42	mnpo_nb
customer_satisfaction_in_airline	binary	0.939	0.0138	0.0049	REALMLP (tuned + ensemble)	0.0089	36	mnpo_rf
diabetes	binary	0.805	0.1137	0.1556	TABPFNV2 (default)	-0.0419	1	mnpo_lr
hazelnut-spread-contaminant-detection	binary	0.927	0.0244	0.0076	TABDPT (default)	0.0168	22	mnpo_lgbm
heloc	binary	0.715	0.2051	0.1987	TABPFNV2 (tuned + ensemble)	0.0064	27	mnpo_lr
in_vehicle_coupon_recommendation	binary	0.75	0.1748	0.1483	TABM (tuned + ensemble)	0.0265	23	mnpo_lgbm
jm1	binary	0.654	0.2904	0.2239	TABICL (default)	0.0665	44	mnpo_lr
kddcup09_appetency	binary	0.74	0.1867	0.1542	CAT (default)	0.0325	25	mnpo_lr
online_shoppers_intention	binary	0.821	0.0996	0.0627	TABPFNV2 (tuned + ensemble)	0.0369	42	mnpo_lr
polish_companies_bankruptcy	binary	0.777	0.0681	0.0187	TABPFNV2 (tuned + ensemble)	0.0494	31	mnpo_lgbm
qsar-biodeg	binary	0.865	0.0822	0.0615	TABICL (default)	0.0207	40	mnpo_lr
seismic-bumps	binary	0.651	0.2824	0.2166	TABICL (default)	0.0658	42	mnpo_lr
taiwanese_bankruptcy_prediction	binary	0.837	0.0993	0.0547	REALMLP (tuned + ensemble)	0.0446	42	mnpo_lr
MIC	multiclass	0.373	2.2882	0.4303	TABM (tuned + ensemble)	1.858	45	mnpo_elastic_net_lr
SDSS17	multiclass	0.964	0.1221	0.0723	RF (tuned + ensemble)	0.0498	37	mnpo_rf
anneal	multiclass	0.894	0.1004	0.0156	TABPFNV2 (default)	0.0847	42	mnpo_lr
hiva_agnostic	multiclass	0.349	1.43	0.1738	RF (tuned)	1.2562	45	mnpo_knn
maternal_health_risk	multiclass	0.84	0.3628	0.4048	TABDPT (default)	-0.0419	1	mnpo_xgb
splice	multiclass	0.975	0.1048	0.0993	TABPFNV2 (tuned + ensemble)	0.0056	5	mnpo_xgb
students_dropout_and_academic_success	multiclass	0.693	0.6439	0.5266	TABPFNV2 (tuned + ensemble)	0.1173	42	mnpo_lr
website_phishing	multiclass	0.885	0.2802	0.2215	TABPFNV2 (tuned + ensemble)	0.0587	32	mnpo_rf
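As a rough cross-check of the leaderboard row, the Dataset rank column can be aggregated directly. The sketch below takes the unweighted mean rank and mean reciprocal rank over the 38 rows above. It does not apply the task weighting used by the leaderboard, so the values should land near, not on, the reported Rank of 32.56 and MRR of 0.116. Variable names are made up for the example.

```python
# Dataset rank column, copied in table order from this page.
binary_ranks = [
    42, 11, 39, 36, 41, 42, 32, 42, 42, 41,
    42, 40, 27, 1, 38, 40, 42, 42, 36, 1,
    22, 27, 23, 44, 25, 42, 31, 40, 42, 42,
]
multiclass_ranks = [45, 37, 42, 45, 1, 5, 42, 32]
all_ranks = binary_ranks + multiclass_ranks

# Unweighted aggregates; the leaderboard's task weighting is not reproduced here.
mean_rank = sum(all_ranks) / len(all_ranks)
mrr = sum(1.0 / rank for rank in all_ranks) / len(all_ranks)
first_places = sum(1 for rank in all_ranks if rank == 1)

print(f"datasets: {len(all_ranks)}")
print(f"mean rank: {mean_rank:.2f}")
print(f"mean reciprocal rank: {mrr:.3f}")
print(f"first places: {first_places}")
```

Run as written, this gives a mean rank of about 33.3 and an MRR of about 0.11. Most of the MRR comes from the three first-place results (blood-transfusion-service-center, diabetes, and maternal_health_risk), since reciprocal rank rewards top finishes far more than mid-table ones.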

Interpretation

This page is intended as an informational view of the current tabnetics run against the current TabArena leaderboard. It provides a general-tabular reference point while the library remains focused primarily on HDLSS problems.

Documentation and webpages on this site are generated from authoritative internal sources using a combination of deterministic rules and generative AI. Errors are possible. Please report issues via GitHub Discussions or by email.