Background — Methods and References

Tabnetics implements both novel contributions developed as part of this project and established methods from the feature selection, distribution fitting, and HDLSS classification literature. This document maps each component to its theoretical foundation.

Novel contributions

These components were developed specifically for tabnetics or adapted here to the HDLSS setting. Where there is a direct conceptual precursor, it is cited explicitly.

MNPO — Nash Multi-Portfolio Optimization

MNPO is the core aggregation engine. It formulates method-portfolio selection as a multiplayer game and solves for a Nash equilibrium via KL-regularized mirror descent on the method-weight simplex (Freund & Schapire, 1999). The multiplayer Nash framing draws conceptual inspiration from Wu et al., Multiplayer Nash Preference Optimization (arXiv:2509.23102, 2025), which generalizes Nash-style optimization from two-player to multiplayer preference settings in the context of LLM alignment.

However, the tabnetics adaptation differs from Wu et al. in several fundamental ways that make it a distinct contribution rather than a direct application:

Fixed methods vs. evolving policies — In Wu et al., players are LLM policies that update their parameters across iterations. In tabnetics, the selector surface consists of 40 fixed feature-selection methods (39 engineered selectors plus a random-baseline reference); the “game” determines portfolio weights over them, not policy updates.
Heterogeneous oracles vs. shared preference model — Wu et al.’s formal convergence guarantees (§3.1 of the paper) apply only to the homogeneous case where all players share one preference oracle. Tabnetics uses 2–11 heterogeneous oracles (performance, stability, complexity, etc.), placing it in the general-sum regime where the paper explicitly states no formal convergence guarantees hold (§3.3).
Small-sample regime — Wu et al. train on 60K+ samples with unlimited reward-model queries. Tabnetics estimates each pairwise preference from 5 CV folds, yielding a 6-point discrete scale with substantial quantization noise.
Pairwise, not Plackett-Luce — The paper’s key multiplayer theoretical contribution (Plackett-Luce listwise comparisons) is not used; tabnetics constructs standard pairwise preference matrices.
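
To make the quantization in the small-sample point concrete, the following minimal sketch builds a pairwise preference matrix from per-fold scores. The helper name `pairwise_preference`, the strict-win rule, and the example scores are illustrative assumptions, not the Tabnetics API. With 5 folds, each off-diagonal entry can only take one of the six values 0, 0.2, 0.4, 0.6, 0.8, or 1.

```python
import numpy as np

def pairwise_preference(fold_scores):
    """Fraction of CV folds in which method i strictly beats method j.

    fold_scores: array of shape (n_methods, n_folds), higher is better.
    Hypothetical helper; ties count as a loss for both methods.
    """
    s = np.asarray(fold_scores, dtype=float)
    # Compare every ordered method pair fold by fold, then average over folds.
    pref = (s[:, None, :] > s[None, :, :]).mean(axis=2)
    np.fill_diagonal(pref, 0.5)  # a method is indifferent to itself
    return pref

# Illustrative balanced-accuracy scores for 3 methods over 5 folds.
scores = np.array([
    [0.80, 0.75, 0.82, 0.78, 0.81],
    [0.79, 0.77, 0.80, 0.76, 0.83],
    [0.70, 0.72, 0.71, 0.69, 0.73],
])
P = pairwise_preference(scores)
```

Every off-diagonal entry of `P` is a multiple of 1/5, so two methods that differ only slightly in true performance can easily land on the same or even a reversed preference value.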

The solver itself (mirror descent on a simplex with KL regularization toward a reference prior) is mathematically well established independently of the Wu et al. paper. What Tabnetics adds for the HDLSS setting:

Method portfolios instead of policy populations — mirror descent selects portfolio weights over heterogeneous HDLSS selectors and classifier candidates.
HDLSS-specific oracle utilities — the utility matrix is built from balanced accuracy, stability, complexity, robustness, and diversity signals that matter when p >> n.
Pipeline-level integration — the Nash portfolio is embedded inside a distribution-aware, regime-aware, validation-gated HDLSS pipeline rather than used as a standalone optimization objective.
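
The solver described above can be sketched as a self-play entropic mirror descent update. This is a minimal illustration under stated assumptions: a single fused preference matrix `P` (with `P[i, j]` the preference for method i over method j), a uniform reference prior, and hypothetical values for the temperature `tau` and step size `eta`. It does not model the heterogeneous multi-oracle case and is not the Tabnetics implementation.

```python
import numpy as np

def kl_mirror_descent(P, ref=None, tau=0.1, eta=0.5, n_iter=500):
    """Self-play mirror descent on the simplex with a KL pull toward `ref`.

    Each step ascends  <w, P w_t> - tau * KL(w || ref)  in the entropic
    geometry, which gives the closed-form update
        w_{t+1} ∝ w_t^(1 - eta*tau) * ref^(eta*tau) * exp(eta * P w_t).
    """
    P = np.asarray(P, dtype=float)
    n = P.shape[0]
    ref = np.full(n, 1.0 / n) if ref is None else np.asarray(ref, dtype=float)
    log_ref = np.log(ref)
    log_w = log_ref.copy()  # start at the reference prior
    for _ in range(n_iter):
        payoff = P @ np.exp(log_w)  # expected preference of each method vs. the current mix
        log_w = (1.0 - eta * tau) * log_w + eta * tau * log_ref + eta * payoff
        log_w -= np.logaddexp.reduce(log_w)  # renormalize in log space
    return np.exp(log_w)

# Illustrative 3-method preference matrix (rows beat columns with these rates).
P = np.array([
    [0.5, 0.6, 1.0],
    [0.4, 0.5, 1.0],
    [0.0, 0.0, 0.5],
])
weights = kl_mirror_descent(P)
```

Larger `tau` keeps the weights closer to the reference prior; smaller `tau` concentrates mass on the methods that win most pairwise comparisons.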

Key novel elements inside this HDLSS adaptation:

Multi-oracle pairwise preference framework — oracles cast pairwise preferences between candidate method subsets; these are fused via weighted voting or Banzhaf indices.
Banzhaf / Shapley weighting for oracles — oracle influence weights are computed from cooperative game theory rather than fixed by hand.
CVaR oracle — a tail-risk oracle that optimizes conditional value-at-risk over fold-level balanced accuracy.
Complementarity oracle — measures feature-set complementarity via partial information decomposition (PID) or mutual-information redundancy terms.
Oracle redundancy penalty — detects and down-weights oracles whose recommendations are collinear, preventing double-counting.
Adaptive portfolio sizing — the number of methods retained by MNPO scales with dataset difficulty and the distribution of oracle scores.
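
Two of the elements above have compact textbook definitions, sketched below. The first function computes exact Banzhaf values by enumerating coalitions, which is feasible for the 2 to 11 oracles mentioned earlier. The second computes a lower-tail CVaR over fold scores. The function names, the coalition value function, the tail level `alpha`, and the example data are all hypothetical; how Tabnetics defines oracle coalitions and tail levels is not specified here.

```python
from itertools import combinations
from math import ceil

def banzhaf_values(n_players, value):
    """Exact Banzhaf value: average marginal contribution over all coalitions
    of the other players. `value` maps a frozenset of player indices to a float."""
    out = []
    for i in range(n_players):
        others = [j for j in range(n_players) if j != i]
        total = 0.0
        for r in range(len(others) + 1):
            for subset in combinations(others, r):
                s = frozenset(subset)
                total += value(s | {i}) - value(s)
        out.append(total / 2 ** (n_players - 1))
    return out

def cvar_lower_tail(fold_scores, alpha=0.4):
    """Mean of the worst ceil(alpha * n) fold scores (higher score is better)."""
    scores = sorted(fold_scores)
    # The small epsilon guards against floating-point fuzz pushing ceil up by one.
    k = max(1, ceil(alpha * len(scores) - 1e-12))
    return sum(scores[:k]) / k

# Toy coalition game: a coalition of oracles "wins" if it has at least 2 members.
def majority(coalition):
    return 1.0 if len(coalition) >= 2 else 0.0

raw = banzhaf_values(3, majority)
oracle_weights = [v / sum(raw) for v in raw]  # normalize to sum to 1
tail = cvar_lower_tail([0.80, 0.75, 0.82, 0.78, 0.81], alpha=0.4)
```

In the toy game every oracle is symmetric, so each receives the same weight; the CVaR example averages the two weakest of five folds.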

Regime-gated pipeline routing

A lightweight regime detector classifies datasets into HDLSS tiers (extreme, moderate, mild) and routes each tier to a pre-configured pipeline profile. This avoids running expensive methods (e.g., copula knockoffs on $n < 40$ datasets) where they are statistically unreliable.
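
A minimal sketch of such a router is shown below. Only the `n < 40` cutoff for copula knockoffs comes from the text above; the feature-to-sample ratio thresholds, the profile fields, and all names are hypothetical placeholders rather than the shipped configuration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class PipelineProfile:
    name: str
    use_copula_knockoffs: bool

# Hypothetical profiles; the real profiles carry many more settings.
PROFILES = {
    "extreme": PipelineProfile("extreme", use_copula_knockoffs=False),
    "moderate": PipelineProfile("moderate", use_copula_knockoffs=True),
    "mild": PipelineProfile("mild", use_copula_knockoffs=True),
}

def detect_regime(n_samples, n_features, small_n=40,
                  extreme_ratio=100.0, moderate_ratio=10.0):
    """Classify a dataset into an HDLSS tier from its shape alone.

    small_n follows the n < 40 example in the text; the ratio
    thresholds are illustrative assumptions.
    """
    if n_samples <= 0:
        raise ValueError("n_samples must be positive")
    ratio = n_features / n_samples
    if n_samples < small_n or ratio >= extreme_ratio:
        return "extreme"
    if ratio >= moderate_ratio:
        return "moderate"
    return "mild"

def route(n_samples, n_features):
    """Return the pre-configured profile for the detected tier."""
    return PROFILES[detect_regime(n_samples, n_features)]
```

For example, `route(30, 5000)` returns the extreme profile, which disables copula knockoffs.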

Auto-router score model

The V25 auto-router is a packaged runtime selector that predicts balanced accuracy and macro-F1 for a finite set of supported pipeline candidates from dataset-computable descriptors. It turns campaign evidence into an always-on calibration layer: users can run the pipeline without manually choosing validation-era flags, while the model is constrained to profiles that were actually observed in the validation corpus.

The router is deliberately conservative. It excludes dataset identity, validation tiers, and holdout labels from the input feature vector, uses dataset-level cross-validation during training, and applies calibrated thresholds so uncertain cases fall back to the current default-like candidate.

Distribution fitting as a preprocessing stage

While individual distribution families are standard, using distribution fitting as a CDF-based preprocessing step inside a feature-selection pipeline — with bootstrap-calibrated goodness-of-fit tests, L-moment prescreening, and multimodal fallback — is a pipeline-level contribution.
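
The sketch below illustrates two of the ingredients named above: a parametric-bootstrap calibration of the Kolmogorov-Smirnov p-value and the CDF-based transform to Gaussian scores. Calibration matters because the standard KS p-value assumes fully specified parameters and becomes conservative when they are estimated from the same sample. The function names, the clipping constant `eps`, and the bootstrap size are assumptions for illustration; L-moment prescreening and the multimodal fallback are omitted.

```python
import numpy as np
from scipy import stats

def bootstrap_ks_pvalue(x, dist, n_boot=200, seed=0):
    """Parametric-bootstrap p-value for the KS statistic of a fitted family.

    `dist` is a scipy.stats continuous distribution such as stats.norm.
    Parameters are refitted on every bootstrap replicate.
    """
    x = np.asarray(x, dtype=float)
    rng = np.random.default_rng(seed)
    params = dist.fit(x)
    d_obs = stats.kstest(x, dist.cdf, args=params).statistic
    exceed = 0
    for _ in range(n_boot):
        xb = dist.rvs(*params, size=x.size, random_state=rng)
        d_b = stats.kstest(xb, dist.cdf, args=dist.fit(xb)).statistic
        exceed += int(d_b >= d_obs)
    # Add-one correction keeps the p-value strictly positive.
    return (exceed + 1) / (n_boot + 1), params

def cdf_to_gaussian(x, dist, params, eps=1e-6):
    """Map values through the fitted CDF, then through the normal quantile function."""
    u = np.clip(dist.cdf(np.asarray(x, dtype=float), *params), eps, 1.0 - eps)
    return stats.norm.ppf(u)

# Illustrative feature column with n = 40 samples.
rng = np.random.default_rng(42)
x = rng.normal(loc=5.0, scale=2.0, size=40)
p_value, params = bootstrap_ks_pvalue(x, stats.norm, n_boot=100)
z = cdf_to_gaussian(x, stats.norm, params)
```

A small `p_value` would indicate that the candidate family fits poorly, in which case a pipeline could try another family before transforming the feature.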

Tri-gate validation protocol

A three-level promotion framework (method-gate → portfolio-gate → campaign-gate) ensures that pipeline changes are validated at the portfolio level with paired statistical tests (Wilcoxon signed-rank) across the full benchmark catalog.
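
As a hedged illustration of the paired test used at the gates, the sketch below compares per-dataset scores for a baseline and a candidate pipeline with `scipy.stats.wilcoxon`. The function name `paired_gate`, the one-sided alternative, the significance level, and the scores are assumptions; the actual gate criteria are not specified here.

```python
import numpy as np
from scipy.stats import wilcoxon

def paired_gate(baseline, candidate, alpha=0.05):
    """One-sided Wilcoxon signed-rank test on paired per-dataset scores.

    Passes when the candidate is significantly better than the baseline.
    """
    baseline = np.asarray(baseline, dtype=float)
    candidate = np.asarray(candidate, dtype=float)
    result = wilcoxon(candidate, baseline, alternative="greater")
    return {
        "p_value": float(result.pvalue),
        "median_gain": float(np.median(candidate - baseline)),
        "passes": bool(result.pvalue < alpha),
    }

# Illustrative balanced accuracies on 10 benchmark datasets.
baseline = np.array([0.71, 0.80, 0.65, 0.77, 0.90, 0.58, 0.83, 0.69, 0.74, 0.88])
gains = np.array([0.010, 0.025, 0.015, 0.030, 0.005, 0.040, 0.020, 0.035, 0.045, 0.050])
improved = paired_gate(baseline, baseline + gains)
regressed = paired_gate(baseline, baseline - gains)
```

Pairing by dataset removes between-dataset difficulty from the comparison, which is why a signed-rank test is preferred here over comparing mean scores.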

Implemented methods

Each section lists the methods implemented in tabnetics and the papers they are based on.

Implementation note: several methods and benchmark backends are exposed through optional third-party libraries (for example Boruta, SHAP, MAPIE, FLAML, TabPFN, and pytabkit). Their upstream licenses/terms still apply when those integrations are enabled; see Using Tabnetics -> Third-party integrations and licenses.

Feature selection — stability-based

Method	Reference
Stability Selection (Lasso)	Meinshausen & Bühlmann. “Stability selection.” J. Royal Statistical Society B, 72(4):417–473, 2010.
Complementary Subsampling	Shah & Samworth. “Variable selection with error control.” J. Royal Statistical Society B, 75(1):55–80, 2013.
TIGRESS	Haury et al. “TIGRESS: Trustful Inference of Gene REgulation using Stability Selection.” BMC Systems Biology, 6:145, 2012.
IPSS (Integrated Path Stability Selection)	Melikechi et al. “Integrated path stability selection.” arXiv:2403.15877, 2024.
Cluster Stability Selection	Faletto & Bien. “Cluster stability selection.” Computational Statistics & Data Analysis, 177:107579, 2022.

Feature selection — knockoff-based

Method	Reference
Copula Knockoffs (D-vine)	Román-Vásquez et al. “Vine copula knockoff filter for high-dimensional controlled variable selection.” arXiv:2410.00650, 2024.
Knockoff Filter (general framework)	Candès et al. “Panning for gold: ‘model-X’ knockoffs for high dimensional controlled variable selection.” J. Royal Statistical Society B, 80(3):551–577, 2018.
Derandomized Knockoffs	Ren & Candès. “Derandomizing knockoffs.” arXiv:2205.00556, 2022.

Feature selection — filter and information-theoretic

Method	Reference
mRMR (Minimum Redundancy Maximum Relevance)	Peng, Long & Ding. “Feature selection based on mutual information.” IEEE Trans. Pattern Analysis & Machine Intelligence, 27(8):1226–1238, 2005.
JMI (Joint Mutual Information)	Yang & Moody. “Data visualization and feature selection: new algorithms for nongaussian data.” NIPS, 1999.
CMIM (Conditional Mutual Information Maximisation)	Fleuret. “Fast binary feature selection with conditional mutual information.” JMLR, 5:1531–1555, 2004.
FCBF (Fast Correlation-Based Filter)	Yu & Liu. “Efficient feature selection via analysis of relevance and redundancy.” JMLR, 5:1205–1224, 2004.
HSIC Lasso	Climente-González et al. “Block HSIC Lasso: model-free biomarker detection for ultra-high dimensional data.” Bioinformatics, 35(14):i427–i435, 2019.

Feature selection — tree and wrapper

Method	Reference
Boruta	Kursa & Rudnicki. “Feature selection with the Boruta package.” J. Statistical Software, 36(11):1–13, 2010.
RFECV (Recursive Feature Elimination with Cross-Validation)	Guyon et al. “Gene selection for cancer classification using support vector machines.” Machine Learning, 46:389–422, 2002.
TreeSHAP	Lundberg et al. “From local explanations to global understanding with explainable AI for trees.” Nature Machine Intelligence, 2:56–67, 2020.

Feature selection — multiclass-specific

Method	Reference
Nearest Shrunken Centroids	Tibshirani et al. “Diagnosis of multiple cancer types by shrunken centroids of gene expression.” PNAS, 99(10):6567–6572, 2002.
OVA ensemble	Rifkin & Klautau. “In defense of one-vs-all classification.” JMLR, 5:101–141, 2004. Representative source for the one-vs-all decomposition that the Tabnetics selector adapts to feature ranking.
ECOC class-aware decomposition	Dietterich & Bakiri. “Solving multiclass learning problems via error-correcting output codes.” JAIR, 2:263–286, 1995.
SIR / SAVE / PFC (sufficient dimension reduction)	Li. “Sliced inverse regression for dimension reduction.” JASA, 86(414):316–327, 1991; Cook & Weisberg. “Discussion of Li (1991).” JASA, 86(414):328–332, 1991; Cook & Forzani. “Principal fitted components for dimension reduction in regression.” Statistical Science, 23(4):485–501, 2008.

Feature selection — pairwise and AUC-based

Method	Reference
WMW AUC filter	Bamber. “The area above the ordinal dominance graph and the area below the receiver operating characteristic graph.” Journal of Mathematical Psychology, 12(4):387–415, 1975. Provides the ROC/AUC interpretation behind the Wilcoxon-Mann-Whitney ranking used here.
k-TSP (k Top Scoring Pairs)	Tan et al. “Simple decision rules for classifying human cancers from gene expression profiles.” Bioinformatics, 21(20):3896–3904, 2005.
Joint AUC+L1 selector	Ma et al. “Prediction-based structured variable selection through the receiver operating characteristic curves.” Biometrics, 67(3):896–905, 2011. Representative source for sparse ROC/AUC-aware logistic selection in the same family as the Tabnetics implementation.

Feature selection — game-theoretic weights

Concept	Reference
Multiplayer Nash preference framing (conceptual inspiration for Tabnetics MNPO)	Wu et al. Multiplayer Nash Preference Optimization. arXiv:2509.23102, 2025. Tabnetics draws conceptual inspiration from the multiplayer Nash framing but differs fundamentally in player semantics, oracle structure, and data regime (see MNPO section above).
Multiplicative weights / online mirror descent	Freund & Schapire. “Adaptive game playing using multiplicative weights.” Games and Economic Behavior, 29(1–2):79–103, 1999. The algorithmic foundation for MNPO’s equilibrium solver.
Banzhaf value (oracle weighting)	Wang & Jia. “Data Banzhaf: A Robust Data Valuation Framework for Machine Learning.” AISTATS, 2023.
Kernel Banzhaf	Fumagalli et al. “KernelSHAP-IQ: Weighted Least Square Optimization for Shapley Interactions.” arXiv:2405.10852, 2024.
Shapley value	Shapley. “A value for n-person games.” Contributions to the Theory of Games, 2:307–317, 1953.
QRE (Quantal Response Equilibrium)	McKelvey & Palfrey. “Quantal response equilibria for normal form games.” Games and Economic Behavior, 10(1):6–38, 1995.

Distribution fitting

Component	Reference
Parametric families (20+)	Standard implementations: normal, log-normal, gamma, Weibull, beta, GEV, GPD, Johnson $S_B$/$S_U$, skew-normal, folded-normal, inverse-Gaussian, Burr III/XII, Dagum, sinh-arcsinh, etc. via `scipy.stats`.
L-moment prescreening	Hosking. “L-moments: analysis and estimation of distributions using linear combinations of order statistics.” J. Royal Statistical Society B, 52(1):105–124, 1990.
Bootstrap-calibrated GOF	Parametric bootstrap following Efron & Tibshirani. An Introduction to the Bootstrap, 1994, to calibrate Kolmogorov–Smirnov and Cramér–von Mises p-values for small samples.
Maximum product spacing (MPS)	Ranneby. “The maximum spacing method. An estimation method related to the maximum likelihood method.” Scandinavian J. Statistics, 11(2):93–112, 1984.
CRPS scoring	Gneiting & Raftery. “Strictly proper scoring rules, prediction, and estimation.” JASA, 102(477):359–378, 2007.
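
For the L-moment prescreening row above, the sample L-moments of Hosking (1990) can be computed from unbiased probability-weighted moments, as in this minimal sketch. The function name is hypothetical, and how Tabnetics uses the resulting L-skewness and L-kurtosis to shortlist candidate families is not shown.

```python
import numpy as np

def sample_l_moments(x):
    """Return (l1, l2, t3, t4): L-location, L-scale, L-skewness, L-kurtosis.

    Uses the unbiased probability-weighted moments b0..b3 of the sorted sample.
    Requires at least 4 observations and a non-constant sample.
    """
    x = np.sort(np.asarray(x, dtype=float))
    n = x.size
    if n < 4:
        raise ValueError("need at least 4 observations")
    i = np.arange(n)  # zero-based rank, so i equals (rank - 1)
    b0 = x.mean()
    b1 = np.sum(i / (n - 1) * x) / n
    b2 = np.sum(i * (i - 1) / ((n - 1) * (n - 2)) * x) / n
    b3 = np.sum(i * (i - 1) * (i - 2) / ((n - 1) * (n - 2) * (n - 3)) * x) / n
    l1 = b0
    l2 = 2 * b1 - b0
    l3 = 6 * b2 - 6 * b1 + b0
    l4 = 20 * b3 - 30 * b2 + 12 * b1 - b0
    return l1, l2, l3 / l2, l4 / l2

l1, l2, t3, t4 = sample_l_moments([1.0, 2.0, 3.0, 4.0, 5.0])
```

For this evenly spaced, symmetric sample the L-skewness is zero, as expected.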

Batch correction

Method	Reference
ComBat	Johnson, Li & Rabinovic. “Adjusting batch effects in microarray expression data using empirical Bayes methods.” Biostatistics, 8(1):118–127, 2007.

Classification

Method	Reference
PLS-DA	Barker & Rayens. “Partial least squares for discrimination.” J. Chemometrics, 17(3):166–173, 2003.
Sparse PLS-DA	Lê Cao, Boitard & Besse. “Sparse PLS discriminant analysis: biologically relevant feature selection and graphical displays for multiclass problems.” BMC Bioinformatics, 12:253, 2011. The `spls_da_classifier` backend follows this family and uses BER-driven component selection rather than variance-maximization heuristics.
DLDA (Diagonal LDA)	Dudoit, Fridlyand & Speed. “Comparison of discrimination methods for the classification of tumors using gene expression data.” JASA, 97(457):77–87, 2002.
HDRDA-style regularized DA	Yata & Aoshima. “Effective PCA for high-dimension, low-sample-size data with noise reduction via geometric representations.” J. Multivariate Analysis, 105:193–215, 2012; Aoshima & Yata. “Two-stage procedures for high-dimensional data.” Sequential Analysis, 30(4):356–399, 2011. `hdrda` is a lightweight internal backend anchored to this HDLSS regularization/noise-reduction line rather than a byte-for-byte reproduction of a single package.
Distance-Weighted Discrimination (DWD)	Marron et al. “Distance-weighted discrimination.” JASA, 102(480):1267–1271, 2007. The `dwd_classifier` backend follows the DWD family and uses a generalized-mean-distance style optimization path for practical sklearn compatibility.
ECOC multiclass wrappers	Dietterich & Bakiri. “Solving multiclass learning problems via error-correcting output codes.” JAIR, 2:263–286, 1995. `ecoc_hdrda`, `ecoc_dwd`, and `ecoc_svm_linear` wrap binary-capable HDLSS backends with an ECOC scaffold.
Random Fourier Features + LR	Rahimi & Recht. “Random Features for Large-Scale Kernel Machines.” NeurIPS, 2007. `rff_lr` adds a controlled nonlinear kernel approximation without leaving the linear-model training regime.
Nearest subspace classifier	Tsuda. “Subspace classifier in the Hilbert space.” Pattern Recognition Letters, 20(5):513–519, 1999. `near_subspace` is the classical nearest-subspace / reconstruction-error family adapted to the current HDLSS pipeline.
Spatial-median DA	Hall, Titterington & Xue. “Median-Based Classifiers for High-Dimensional Data.” JASA, 104(488):1597–1608, 2009. `spatial_median_da` is a lightweight robust distance classifier anchored to spatial/geometric-median HDLSS work rather than an exact reproduction of a single published estimator.
Copula discriminant analysis	Han, Zhao & Liu. “CODA: High Dimensional Copula Discriminant Analysis.” JMLR, 14:629–671, 2013; Tekle & de Leon. “Gaussian copula distributions for mixed data, with application in discrimination.” J. Statistical Computation and Simulation, 86(9):1643–1659, 2016. `copula_da` is a simplified Gaussian-copula-style backend intended for the pipeline’s CDF-to-Gaussian feature space.
TabPFN	Hollmann et al. “Accurate predictions on small data with a tabular foundation model.” Nature, 637:319–326, 2025.
TabM	Gorishniy et al. “TabM: Advancing Tabular Deep Learning With Parameter-Efficient Ensembling.” ICLR, 2025. Two backends: numpy approximation (`tabm`) and official PyTorch implementation via pytabkit (`tabm_official`).
RealMLP	Holzmüller et al. “Better by Default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data.” NeurIPS, 2024. Two backends: numpy approximation (`realmlp`) and official RealMLP-TD via pytabkit (`realmlp_td`).
CPDA (Copula Probabilistic DA)	Internal contribution. Copula-based probabilistic discriminant analysis for HDLSS classification; fits marginal CDFs per class and models joint dependence via a Gaussian copula.
pytabkit	Holzmüller et al. “Better by Default: Strong Pre-Tuned MLPs and Boosted Trees on Tabular Data.” NeurIPS, 2024. Optional backend library providing sklearn-compatible wrappers for TabM and RealMLP-TD.
Conformal prediction (MAPIE)	Taquet et al. “MAPIE: an open-source library for distribution-free uncertainty quantification.” arXiv:2207.12274, 2022.
UBayFS	Jenul et al. “UBayFS: An R package for user guided feature selection.” JOSS, 7(79):4848, 2022.

Tabnetics treats conformal prediction as an uncertainty and efficiency layer, not as a point-accuracy optimizer. For the singleton-rate / compactness interpretation used in the validation analyses, see Wang, Sun & Dobriban 2025 and Hallberg Szabadváry et al. 2025.
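
To illustrate what the singleton rate measures, the sketch below builds split-conformal prediction sets from class probabilities and reports the fraction of sets containing exactly one label. It uses the common score 1 minus the probability of the true class; this choice, the function names, and the toy probabilities are assumptions for illustration and do not reproduce the MAPIE API or the Tabnetics validation code.

```python
import math
import numpy as np

def conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1):
    """Split-conformal prediction sets as a boolean mask (n_test, n_classes)."""
    cal_probs = np.asarray(cal_probs, dtype=float)
    cal_labels = np.asarray(cal_labels, dtype=int)
    test_probs = np.asarray(test_probs, dtype=float)
    n = cal_labels.size
    # Nonconformity score: 1 - probability assigned to the true class.
    scores = 1.0 - cal_probs[np.arange(n), cal_labels]
    # Finite-sample rank; epsilon guards against floating-point fuzz in ceil.
    k = math.ceil((n + 1) * (1 - alpha) - 1e-12)
    qhat = np.inf if k > n else np.sort(scores)[k - 1]
    return (1.0 - test_probs) <= qhat

def singleton_rate(sets):
    """Fraction of prediction sets that contain exactly one label."""
    return float((sets.sum(axis=1) == 1).mean())

# Toy calibration data: 9 samples, true class 0, with these true-class probabilities.
p_true = np.array([0.9, 0.8, 0.95, 0.7, 0.85, 0.6, 0.9, 0.75, 0.3])
cal_probs = np.column_stack([p_true, 1.0 - p_true])
cal_labels = np.zeros(9, dtype=int)
test_probs = np.array([[0.9, 0.1], [0.5, 0.5], [0.2, 0.8]])

sets = conformal_sets(cal_probs, cal_labels, test_probs, alpha=0.1)
rate = singleton_rate(sets)
```

A higher singleton rate at a fixed coverage level means the classifier's probabilities are sharp enough to commit to one label more often, which is an efficiency property rather than a point-accuracy gain.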

The public classifier surface now mixes exact paper-driven reproductions, lightweight family-inspired implementations, and wrapper-style deployment helpers. In particular, `hdrda`, `near_subspace`, `spatial_median_da`, and `copula_da` should be read as reference-anchored practical variants of those method families, while `ecoc_*` entries are deployment wrappers around binary-capable backends.

Multi-omics

Component	Reference
DIABLO-style multi-block PLS	Singh et al. “DIABLO: an integrative approach for identifying key molecular drivers from multi-omics assays.” Bioinformatics, 35(17):3055–3062, 2019.
MINT batch correction	Rohart et al. “MINT: a multivariate integrative method to identify reproducible molecular signatures across independent experiments and platforms.” BMC Bioinformatics, 18:128, 2017.
Multi-omics review (cancer)	Cai et al. “Machine learning for multi-omics data integration in cancer.” iScience, 25(2):103798, 2022.

Benchmark datasets

Tabnetics includes a curated registry of HDLSS benchmark datasets. Key sources:

Source	Reference
CuMiDa (curated microarrays)	Feltes et al. “CuMiDa: An extensively curated microarray database for benchmarking and testing of machine learning approaches.” J. Computational Biology, 26(4):376–386, 2019.
de Souto benchmark	de Souto et al. “Clustering cancer gene expression data: a comparative study.” BMC Bioinformatics, 9:497, 2008.
Statnikov multi-category	Statnikov et al. “A comprehensive comparison of random forests and support vector machines for microarray-based cancer classification.” BMC Bioinformatics, 9:319, 2008.
MAQC-II consortium	Shi et al. “The MicroArray Quality Control (MAQC)-II study of common practices for the development and validation of microarray-based predictive models.” Nature Biotechnology, 28:827–838, 2010.
Leukemia (Golub)	Golub et al. “Molecular classification of cancer: class discovery and class prediction by gene expression monitoring.” Science, 286(5439):531–537, 1999.
MLL leukemia	Armstrong et al. “MLL translocations specify a distinct gene expression profile that distinguishes a unique leukemia.” Nature Genetics, 30:41–47, 2002.
Glioma (Nutt)	Nutt et al. “Gene expression-based classification of malignant gliomas.” Cancer Research, 63(7):1602–1607, 2003.