Computation
Showing new listings for Wednesday, 9 April 2025
- [1] arXiv:2406.08366 (replaced) [pdf, html, other]
-
Title: Highest Probability Density Conformal Regions
Subjects: Methodology (stat.ME); Computation (stat.CO)
This paper proposes a new method for finding the highest predictive density set or region within the heteroscedastic regression framework. This framework enjoys the property that any highest predictive density set is a translation of some scalar multiple of a highest density set for the standardized regression error, with the same prediction accuracy. The proposed method leverages this property to efficiently compute conformal prediction regions, using signed conformal inference and kernel density estimation, in conjunction with any conditional mean and scale estimators. While most conformal prediction methods output prediction intervals, this method adapts to the target. When the target is multi-modal, the proposed method outputs an approximation of the smallest multi-modal set. When the target is uni-modal, the proposed method outputs an approximation of the smallest interval. Under mild regularity conditions, we show that these conformal prediction sets are asymptotically close to the true smallest prediction sets. Because of the conformal guarantee, the method has guaranteed coverage even in finite samples. With simulations and a real data analysis, we demonstrate that the proposed method outperforms existing methods when the target is multi-modal, and gives similar results when the target is uni-modal. Supplementary materials, including proofs and additional images, are available online.
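The translation-and-scaling property described in this abstract can be illustrated with a minimal split-conformal sketch. It is not the paper's signed conformal procedure: it is a simplified density-threshold variant, and the function names, the fixed bandwidth, the grid, the simulated bimodal errors, and the point estimates mu(x) = 10 and sigma(x) = 2 are all assumptions made for illustration.

```python
import math
import random


def gaussian_kde(sample, bandwidth):
    """Return a function that evaluates a Gaussian KDE fitted to `sample`."""
    norm = 1.0 / (len(sample) * bandwidth * math.sqrt(2.0 * math.pi))

    def density(e):
        return norm * sum(math.exp(-0.5 * ((e - s) / bandwidth) ** 2) for s in sample)

    return density


def density_threshold(density, calibration, alpha):
    """Split-conformal cutoff: the k-th smallest calibration density,
    with k = floor(alpha * (n + 1)); k = 0 means the whole line is kept."""
    values = sorted(density(e) for e in calibration)
    k = math.floor(alpha * (len(values) + 1))
    return values[k - 1] if k >= 1 else 0.0


def error_region(density, cutoff, grid):
    """Collect grid points with density >= cutoff into (lo, hi) intervals."""
    intervals, start, prev = [], None, None
    for e in grid:
        if density(e) >= cutoff:
            if start is None:
                start = e
            prev = e
        elif start is not None:
            intervals.append((start, prev))
            start = None
    if start is not None:
        intervals.append((start, prev))
    return intervals


def predictive_region(intervals, mu_x, sigma_x):
    """Translate and scale the error region: y = mu(x) + sigma(x) * e."""
    return [(mu_x + sigma_x * lo, mu_x + sigma_x * hi) for lo, hi in intervals]


def demo(seed=0, n=200, alpha=0.1):
    rng = random.Random(seed)

    def draw():
        # Hypothetical standardized errors from a two-component mixture.
        return rng.gauss(rng.choice([-2.0, 2.0]), 0.5)

    fit = [draw() for _ in range(n)]          # used only to fit the KDE
    calibration = [draw() for _ in range(n)]  # used only for the cutoff
    density = gaussian_kde(fit, bandwidth=0.3)
    cutoff = density_threshold(density, calibration, alpha)
    grid = [-5.0 + 0.025 * i for i in range(401)]
    region = error_region(density, cutoff, grid)
    # Assumed point estimates mu(x) = 10 and sigma(x) = 2 at some covariate x.
    return predictive_region(region, mu_x=10.0, sigma_x=2.0)


if __name__ == "__main__":
    print(demo())
```

Because the cutoff is computed once on the standardized errors, the same error region can be reused for every covariate value; only the final translation and scaling depend on x. When the estimated error density has well-separated modes, the region can come out as a union of disjoint intervals rather than a single interval.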
- [2] arXiv:2406.16465 (replaced) [pdf, other]
-
Title: Genealogical processes of sequential Monte Carlo methods and other non-neutral population models under rapid mutation
Subjects: Probability (math.PR); Populations and Evolution (q-bio.PE); Computation (stat.CO)
We show that genealogical trees arising from a broad class of non-neutral models of population evolution converge to the Kingman coalescent under a suitable rescaling of time. As well as non-neutral biological evolution, our results apply to genetic algorithms encompassing the prominent class of sequential Monte Carlo (SMC) methods. The time rescaling we need differs slightly from that used in classical results for convergence to the Kingman coalescent, which has implications for the performance of different resampling schemes in SMC algorithms. In addition, our work substantially simplifies earlier proofs of convergence to the Kingman coalescent, and corrects an error common to several earlier results.
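As a small illustration of the genealogies this abstract refers to, the sketch below traces ancestral lineages backwards through repeated multinomial resampling steps. It rests on simplifying assumptions that are not taken from the paper: multinomial resampling only, weights held fixed across generations, and hypothetical function names. Under multinomial resampling, two offspring share a parent with probability equal to the sum of squared normalized weights, which is 1/N for equal weights. Classical Kingman limits measure time in units of the inverse of this pair-merger probability; the slightly different rescaling used in the paper is not reproduced here.

```python
import random


def pair_merger_probability(weights):
    """Probability that two offspring pick the same parent under multinomial resampling."""
    total = sum(weights)
    return sum((w / total) ** 2 for w in weights)


def resample_parents(weights, rng):
    """Multinomial resampling: each offspring draws its parent index independently."""
    n = len(weights)
    return rng.choices(range(n), weights=weights, k=n)


def generations_to_common_ancestor(weights, rng, max_generations=100_000):
    """Trace all N lineages backwards until they share a single ancestor.

    Returns the number of generations needed, or None if the cap is reached.
    """
    lineages = set(range(len(weights)))
    for generation in range(1, max_generations + 1):
        parents = resample_parents(weights, rng)
        lineages = {parents[i] for i in lineages}
        if len(lineages) == 1:
            return generation
    return None


if __name__ == "__main__":
    rng = random.Random(0)
    equal = [1.0] * 32
    print("pair-merger probability:", pair_merger_probability(equal))
    print("generations to common ancestor:", generations_to_common_ancestor(equal, rng))
```

Making the weights more uneven raises the pair-merger probability, so lineages coalesce in fewer generations on average. This is one way to see why the choice of resampling scheme affects the time scale of the genealogy.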
- [3] arXiv:2410.09697 (replaced) [pdf, html, other]
-
Title: Provable Convergence and Limitations of Geometric Tempering for Langevin Dynamics
Subjects: Machine Learning (stat.ML); Machine Learning (cs.LG); Computation (stat.CO)
Geometric tempering is a popular approach to sampling from challenging multi-modal probability distributions by instead sampling from a sequence of distributions which interpolate, using the geometric mean, between an easier proposal distribution and the target distribution. In this paper, we theoretically investigate the soundness of this approach when the sampling algorithm is Langevin dynamics, proving both upper and lower bounds. Our upper bounds are the first analysis in the literature under functional inequalities. They assert the convergence of tempered Langevin in continuous and discrete time, and their minimization leads to closed-form optimal tempering schedules for some pairs of proposal and target distributions. Our lower bounds demonstrate a simple case where geometric tempering takes exponential time, and further reveal that geometric tempering can suffer from poor functional inequalities and slow convergence, even when the target distribution is well-conditioned. Overall, our results indicate that geometric tempering may not help, and can even be harmful for convergence.
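To make the geometric interpolation concrete, here is a minimal sketch of unadjusted Langevin dynamics along a geometric tempering path. The Gaussian proposal and target, the linear schedule, the step size, and all function names are assumptions chosen for illustration; they are not the schedules or examples analysed in the paper. The score of the tempered density proposal^(1 - lam) * target^lam is the matching convex combination of the two scores, which is all the sampler needs.

```python
import math
import random


def proposal_score(x, proposal_std):
    """Score (gradient of the log-density) of the proposal N(0, proposal_std^2)."""
    return -x / proposal_std ** 2


def target_score(x, target_mean):
    """Score of the target N(target_mean, 1)."""
    return -(x - target_mean)


def tempered_score(x, lam, proposal_std, target_mean):
    """Score of proposal^(1 - lam) * target^lam; normalization does not matter."""
    return ((1.0 - lam) * proposal_score(x, proposal_std)
            + lam * target_score(x, target_mean))


def tempered_langevin(n_particles, n_steps, step_size, proposal_std, target_mean, rng):
    """Unadjusted Langevin steps along the linear schedule lam_k = k / n_steps."""
    particles = [rng.gauss(0.0, proposal_std) for _ in range(n_particles)]
    noise_scale = math.sqrt(2.0 * step_size)
    for k in range(1, n_steps + 1):
        lam = k / n_steps
        particles = [
            x
            + step_size * tempered_score(x, lam, proposal_std, target_mean)
            + noise_scale * rng.gauss(0.0, 1.0)
            for x in particles
        ]
    return particles


if __name__ == "__main__":
    rng = random.Random(0)
    samples = tempered_langevin(1000, 200, 0.05, 2.0, 3.0, rng)
    print("sample mean:", sum(samples) / len(samples))
```

With two Gaussians every intermediate distribution is again Gaussian, so this toy case is easy to sample and says nothing about the hard cases in the lower bounds. It only shows the mechanics of the path and the schedule.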
- [4] arXiv:2411.02770 (replaced) [pdf, html, other]
-
Title: A spectral mixture representation of isotropic kernels to generalize random Fourier features
Comments: 19 pages, 16 figures
Subjects: Machine Learning (cs.LG); Probability (math.PR); Computation (stat.CO); Machine Learning (stat.ML)
Rahimi and Recht (2007) introduced the idea of decomposing positive definite shift-invariant kernels by randomly sampling from their spectral distribution. This famous technique, known as Random Fourier Features (RFF), is in principle applicable to any such kernel whose spectral distribution can be identified and simulated. In practice, however, it is usually applied to the Gaussian kernel because of its simplicity, since its spectral distribution is also Gaussian. Clearly, simple spectral sampling formulas would be desirable for broader classes of kernels. In this paper, we show that the spectral distribution of positive definite isotropic kernels in $\mathbb{R}^{d}$ for all $d\geq1$ can be decomposed as a scale mixture of $\alpha$-stable random vectors, and we identify the mixing distribution as a function of the kernel. This constructive decomposition provides a simple and ready-to-use spectral sampling formula for many multivariate positive definite shift-invariant kernels, including exponential power kernels, generalized Matérn kernels, generalized Cauchy kernels, as well as newly introduced kernels such as the Beta, Kummer, and Tricomi kernels. In particular, we retrieve the fact that the spectral distributions of these kernels are scale mixtures of the multivariate Gaussian distribution, along with an explicit mixing distribution formula. This result has broad applications for support vector machines, kernel ridge regression, Gaussian processes, and other kernel-based machine learning techniques for which the random Fourier features technique is applicable.
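The spectral sampling idea can be sketched for two textbook cases. The Gaussian kernel has Gaussian frequencies. The exponential kernel exp(-||x - y|| / lengthscale) has a multivariate Cauchy spectral distribution, which can be drawn as a Gaussian vector divided by an independent random scale, a simple instance of a Gaussian scale mixture. This is not the paper's general mixing formula; the function names and the cosine/sine feature layout are assumptions made for illustration.

```python
import math
import random


def gaussian_frequencies(n_features, dim, lengthscale, rng):
    """Spectral samples for exp(-||x - y||^2 / (2 * lengthscale^2)): Gaussian frequencies."""
    return [[rng.gauss(0.0, 1.0) / lengthscale for _ in range(dim)]
            for _ in range(n_features)]


def cauchy_frequencies(n_features, dim, lengthscale, rng):
    """Spectral samples for exp(-||x - y|| / lengthscale): a multivariate Cauchy,
    drawn as a Gaussian vector divided by an independent |N(0, 1)| scale."""
    freqs = []
    for _ in range(n_features):
        scale = abs(rng.gauss(0.0, 1.0)) or 1e-12  # guard against an exact zero
        freqs.append([rng.gauss(0.0, 1.0) / (scale * lengthscale) for _ in range(dim)])
    return freqs


def features(x, freqs):
    """Map x to paired cosine and sine features whose dot products estimate the kernel."""
    norm = 1.0 / math.sqrt(len(freqs))
    out = []
    for w in freqs:
        phase = sum(wi * xi for wi, xi in zip(w, x))
        out.extend((norm * math.cos(phase), norm * math.sin(phase)))
    return out


def approx_kernel(x, y, freqs):
    """Monte Carlo kernel estimate: the average of cos(w . (x - y)) over frequencies."""
    fx, fy = features(x, freqs), features(y, freqs)
    return sum(a * b for a, b in zip(fx, fy))


if __name__ == "__main__":
    rng = random.Random(0)
    x, y = [0.3, -0.2], [1.0, 0.5]
    dist = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    gauss = approx_kernel(x, y, gaussian_frequencies(4000, 2, 1.0, rng))
    expo = approx_kernel(x, y, cauchy_frequencies(4000, 2, 1.0, rng))
    print("Gaussian kernel: estimate", gauss, "exact", math.exp(-0.5 * dist ** 2))
    print("Exponential kernel: estimate", expo, "exact", math.exp(-dist))
```

Only the frequency sampler changes between the two kernels; the feature map and the downstream linear model stay the same. That is what makes a ready-to-use sampling formula for further kernel families attractive.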
- [5] arXiv:2412.11692 (replaced) [pdf, html, other]
-
Title: A partial likelihood approach to tree-based density modeling and its application in Bayesian inference
Subjects: Methodology (stat.ME); Statistics Theory (math.ST); Computation (stat.CO); Machine Learning (stat.ML)
Tree-based priors for probability distributions are usually specified using a predetermined, data-independent collection of candidate recursive partitions of the sample space. To characterize an unknown target density in detail over the entire sample space, candidate partitions must have the capacity to expand deeply into all areas of the sample space with potential non-zero sampling probability. Such an expansive system of partitions often incurs prohibitive computational costs and makes inference prone to overfitting, especially in regions with little probability mass. Thus, existing models typically make a compromise and rely on relatively shallow trees. This hampers one of the most desirable features of trees, their ability to characterize local features, and results in reduced statistical efficiency. Traditional wisdom suggests that this compromise is inevitable to ensure coherent likelihood-based reasoning in Bayesian inference, as a data-dependent partition system that allows deeper expansion only in regions with more observations would induce double dipping of the data. We propose a simple strategy to restore coherency while allowing the candidate partitions to be data-dependent, using Cox's partial likelihood. Our partial likelihood approach is broadly applicable to existing likelihood-based methods and, in particular, to Bayesian inference on tree-based models. We give examples in density estimation in which the partial likelihood is endowed with existing priors on tree-based models and compare with the standard, full-likelihood approach. The results show substantial gains in estimation accuracy and computational efficiency from adopting the partial likelihood.