The simultaneous testing of multiple hypotheses is common to the analysis of high-dimensional data sets. The two-group model, first proposed by Efron, identifies significant comparisons by allocating observations to a mixture of an empirical null and an alternative distribution. In the Bayesian nonparametrics literature, many approaches have suggested using mixtures of Dirichlet Processes in the two-group model framework. Here, we investigate employing mixtures of two-parameter Poisson-Dirichlet Processes instead, and show how they provide a more flexible and effective tool for large-scale hypothesis testing. Our model further employs nonlocal prior densities to allow separation between the two mixture components. We obtain a closed-form expression for the exchangeable partition probability function of the two-group model, which leads to a straightforward Markov Chain Monte Carlo implementation. We compare the performance of our method for large-scale inference in a simulation study and illustrate its use on both a prostate cancer data set and a case-control microbiome study of the gastrointestinal tracts in children from underdeveloped countries who have been recently diagnosed with moderate-to-severe diarrhea.

Denti, F., Guindani, M., Leisen, F., Lijoi, A., Wadsworth, W. D., Vannucci, M., Two-group Poisson-Dirichlet mixtures for multiple testing, <<BIOMETRICS>>, 2020; 77 (2): 622-633. [doi:10.1111/biom.13314] [http://hdl.handle.net/10807/201724]

Two-group Poisson-Dirichlet mixtures for multiple testing

Denti, Francesco (first author); 2021

Language: English

Use this identifier to cite or link to this document: https://hdl.handle.net/10807/201724