Hemant Ishwaran

Professor, Graduate Program Director, Director of Statistical Methodology, Division of Biostatistics, University of Miami

Spotlight

Lu M. and Ishwaran H. (2021). Cure and death play a role in understanding dynamics for COVID-19: data-driven competing risk compartmental models, with and without vaccination. To appear in PLOS ONE. [pdf] [supplemental pdf]

Lee D.K., Chen N. and Ishwaran H. (2021). Boosted nonparametric hazards with time-dependent covariates. Ann. Statist., 49(4), 2101-2128. [pdf]

Research Interests

Big Data, Boosting, Cancer Staging, Causal Inference, Forests (Trees, Ensembles), Missing Data, Nonparametric Bayes, Survival (Machine Learning), Variable Selection (Frequentist and Bayes)

Brief Biography

Postdoctoral Fellow, Health Care Policy, Harvard University, 1995
PhD Statistics, Yale University, 1993
MSc Applied Statistics, Oxford University, 1988
BSc Mathematical Statistics, University of Toronto, 1987

Selected New and Old Papers [Full List]

Mantero A. and Ishwaran H. (2021). Unsupervised random forests. Stat. Anal. Data Mining, 14(2), 144-167. [pdf]

O'Brien R. and Ishwaran H. (2019). A random forests quantile classifier for class imbalanced data. Pattern Recognit., 90, 232-249. [pdf] [html]

Ishwaran H. and Lu M. (2019). Standard errors and confidence intervals for variable importance in random forest regression, classification, and survival. Stat. Med., 38, 558-582. [pdf]

Lu M., Sadiq S., Feaster D.J. and Ishwaran H. (2018). Estimating individual treatment effect in observational data using random forest methods. J. Comp. Graph. Statist., 27(1), 209-219. [pdf] [arXiv:1701.05306]

Pande A., Li L., Rajeswaran J., Ehrlinger J., Kogalur U.B., Blackstone E.H. and Ishwaran H. (2017). Boosted multivariate trees for longitudinal data. Mach. Learning, 106(2), 277-305. [pdf]

Tang F. and Ishwaran H. (2017). Random forest missing data algorithms. Stat. Anal. Data Mining, 10, 363-377. [pdf] [arXiv:1701.05305]

Ishwaran H. (2015). The effect of splitting on random forests. Mach. Learning, 99, 75-118. [pdf]

Ehrlinger J. and Ishwaran H. (2012). Characterizing L2Boosting. Ann. Statist., 40, 1074-1101. [pdf]

Ishwaran H., Kogalur U.B., Gorodeski E.Z., Minn A.J. and Lauer M.S. (2010). High-dimensional variable selection for survival data. J. Amer. Stat. Assoc., 105, 205-217. [pdf]

Ishwaran H., Blackstone E.H., Hansen C.A. and Rice T.W. (2009). A novel approach to cancer staging: application to esophageal cancer. Biostatistics, 10, 603-620. [pdf]

Ishwaran H., James L.F. and Zarepour M. (2009). An alternative to the m out of n bootstrap. J. Stat. Plann. Inference, 139, 788-801. [pdf]

Ishwaran H., Kogalur U.B., Blackstone E.H. and Lauer M.S. (2008). Random survival forests. Ann. Appl. Statist., 2, 841-860. [pdf]

Ishwaran H. (2007). Variable importance in binary regression trees and forests. Electronic J. Statist., 1, 519-537.

Ishwaran H. and Rao J.S. (2005). Spike and slab variable selection: frequentist and Bayesian strategies. Ann. Statist., 33, 730-773. [pdf]

Ishwaran H. and Rao J.S. (2003). Detecting differentially expressed genes in microarrays using Bayesian model selection. J. Amer. Stat. Assoc., 98, 438-455. [pdf]

Ishwaran H. and James L.F. (2003). Generalized weighted Chinese restaurant processes for species sampling mixture models. Stat. Sinica, 13, 1211-1235. [pdf]

Ishwaran H. and Zarepour M. (2002). Exact and approximate sum-representations for the Dirichlet process. Can. J. Statist., 30, 269-283. [pdf]

Ishwaran H. and James L.F. (2001). Gibbs sampling methods for stick-breaking priors. J. Amer. Stat. Assoc., 96, 161-173. [pdf]

Ishwaran H., James L.F. and Sun J. (2001). Bayesian model selection in finite mixtures by marginal density decompositions. J. Amer. Stat. Assoc., 96, 1316-1332. [pdf]

Ishwaran H. and Gatsonis C. (2000). A general class of hierarchical ordinal regression models with applications to correlated ROC analysis. Can. J. Statist., 28, 731-750. [pdf]

Ishwaran H. (1999). Information in semiparametric mixtures of exponential families. Ann. Statist., 27, 159-177. [pdf]

Ishwaran H. (1996). Identifiability and rates of estimation for scale parameters in location mixture models. Ann. Statist., 24, 1560-1571. [pdf]


randomForestSRC

Fast OpenMP parallel computing of unified Breiman random forests for regression, classification, survival analysis, competing risks, multivariate outcomes, unsupervised learning, quantile regression, and class-imbalanced q-classification. Missing data imputation, including missForest and multivariate missForest. Fast subsampling random forests. Confidence intervals for variable importance. Minimal depth variable selection. Visualize trees in your Safari or Google Chrome browser. Anonymous random forests for data privacy.
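A minimal sketch of a typical workflow using the package's rfsrc(), vimp(), and subsample() functions on the bundled veteran data; settings shown are illustrative, not recommendations.

    library(randomForestSRC)

    ## random survival forest on the bundled veteran lung cancer data
    data(veteran, package = "randomForestSRC")
    obj <- rfsrc(Surv(time, status) ~ ., data = veteran, ntree = 500)
    print(obj)  # out-of-bag performance and forest summary

    ## permutation variable importance; subsample() yields
    ## confidence intervals for VIMP via subsampling
    vi <- vimp(obj)
    print(vi$importance)
    ci <- subsample(obj)
    plot(ci)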

NEW! Mahalanobis splitting for correlated outcomes in multivariate regression.
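A hedged sketch of the new split rule, assuming the splitrule = "mahalanobis" option and the Multivar() formula interface as advertised above; the mtcars responses are purely illustrative.

    library(randomForestSRC)

    ## multivariate regression forest with Mahalanobis splitting,
    ## which accounts for correlation between the two responses
    mv.obj <- rfsrc(Multivar(mpg, disp) ~ ., data = mtcars,
                    splitrule = "mahalanobis")
    print(mv.obj)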

R package (CRAN build) | GitHub (beta builds)

randomForestSRC vignettes

spikeslab

Spike and slab R package for high-dimensional linear regression models. Uses a generalized elastic net for variable selection. Parallel processing enabled. [pdf]
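A minimal usage sketch on simulated p >> n data; the bigp.smalln option is taken from the package documentation, and all settings are illustrative.

    library(spikeslab)

    ## simulated high-dimensional regression: n = 100, p = 250
    set.seed(1)
    n <- 100; p <- 250
    x <- matrix(rnorm(n * p), n, p)
    y <- 2 * x[, 1] - 3 * x[, 2] + rnorm(n)

    ## spike and slab regression with the high-dimensional filter
    fit <- spikeslab(x = x, y = y, bigp.smalln = TRUE)
    print(fit)  # summary of the fitted gnet model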


BAMarray (3.0)

Java software for analyzing microarray data using Bayesian Analysis of Variance for Microarrays (BAM). [pdf]

boostmtree

Boosted multivariate trees for longitudinal data [pdf]

R package implementing Friedman's gradient descent boosting algorithm for longitudinal data using multivariate tree base learners. Time-covariate interaction effects are modeled using penalized B-splines (P-splines) with an estimated adaptive smoothing parameter.
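A sketch of the documented workflow: simulate longitudinal data with simLong() and fit boosted multivariate trees with boostmtree(); argument values are illustrative.

    library(boostmtree)

    ## simulate longitudinal data: 50 subjects, up to 5 time points
    dta <- simLong(n = 50, N = 5, rho = 0.80, model = 2)$dtaL

    ## gradient boosting with multivariate tree base learners:
    ## M boosting steps, nu shrinkage (regularization)
    fit <- boostmtree(dta$features, dta$time, dta$id, dta$y,
                      M = 300, nu = 0.1)
    print(fit)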


l2boost

Componentwise boosting for linear regression [pdf]

R package providing an efficient implementation of Friedman's gradient boosting algorithm with the L2 loss function and componentwise linear base learners. Includes the elasticNet data augmentation of Ehrlinger and Ishwaran (2012), which adds an L2 penalization (lambda) similar to the elastic net.
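A minimal sketch on simulated data; the lambda argument for the elasticNet augmentation follows the package description above and should be checked against the documentation.

    library(l2boost)

    ## simulated sparse linear model
    set.seed(1)
    n <- 100; p <- 40
    x <- matrix(rnorm(n * p), n, p)
    y <- x[, 1] - 2 * x[, 2] + rnorm(n)

    ## componentwise L2 boosting: M steps, step size (shrinkage) nu
    fit <- l2boost(x, y, M = 500, nu = 1e-2)
    coef(fit)  # coefficient estimates at the final boosting step

    ## elasticBoost variant: adds L2 penalization through lambda
    fit.en <- l2boost(x, y, M = 500, nu = 1e-2, lambda = 0.1)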