R/BHC: fast Bayesian hierarchical clustering for microarray data

Abstract Background: Although the use of clustering methods has rapidly become one of the standard computational approaches in the literature of microarray gene expression data analysis, little attention has been paid to uncertainty in the results obtained. Results: We present an R/Bioconductor port of a fast novel algorithm for Bayesian agglomerative hierarchical clustering and demonstrate its use in clustering gene expression microarray data. The method performs bottom-up hierarchical clustering, using a Dirichlet Process (infinite mixture) to model uncertainty in the data and Bayesian model selection to decide at each step which clusters to merge. Conclusion: Biologically plausible results are presented from a well-studied data set: expression profiles of A. thaliana subjected to a variety of biotic and abiotic stresses.

Our method avoids several limitations of traditional methods, for example the need to specify the number of clusters in advance and the lack of a principled way to choose a distance metric.
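
To make the merge rule described in the abstract concrete, the following is a minimal Python sketch of Bayesian agglomerative clustering in the general style of Heller and Ghahramani's formulation. It is not the R/BHC implementation and does not reproduce its speed optimisations. Several things are assumptions made for illustration: the data are binary, the cluster likelihood is a Beta-Bernoulli marginal with a uniform prior, and every name (`log_marginal`, `merge`, `bhc`, `cut_tree`, `alpha`) is hypothetical. At each step the pair of clusters with the highest posterior merge probability is merged, and the tree is cut wherever that probability drops below 0.5, so the number of clusters is not fixed in advance.

```python
import math
from itertools import combinations


def log_beta(a, b):
    """Log of the Beta function B(a, b)."""
    return math.lgamma(a) + math.lgamma(b) - math.lgamma(a + b)


def log_marginal(rows, a=1.0, b=1.0):
    """log p(D | H1): all rows share one Bernoulli rate per column.

    Assumption: binary data with an independent Beta(a, b) prior per column.
    """
    n = len(rows)
    total = 0.0
    for column in zip(*rows):
        ones = sum(column)
        total += log_beta(a + ones, b + n - ones) - log_beta(a, b)
    return total


def log_add(x, y):
    """Numerically stable log(exp(x) + exp(y))."""
    m = max(x, y)
    return m + math.log(math.exp(x - m) + math.exp(y - m))


class Node:
    def __init__(self, rows, members, log_d, log_tree, log_r=0.0, children=()):
        self.rows = rows            # data points under this node
        self.members = members      # their original row indices
        self.log_d = log_d          # Dirichlet Process bookkeeping term
        self.log_tree = log_tree    # log p(D | tree)
        self.log_r = log_r          # log posterior merge probability
        self.children = children


def merge(left, right, alpha):
    """Build the candidate parent of two nodes and score the merge."""
    rows = left.rows + right.rows
    log_ag = math.log(alpha) + math.lgamma(len(rows))  # log(alpha * Gamma(n))
    log_d = log_add(log_ag, left.log_d + right.log_d)
    # H1: one cluster, weighted by the Dirichlet Process prior pi = alpha*Gamma(n)/d
    log_h1 = (log_ag - log_d) + log_marginal(rows)
    # H2: the two subtrees stay separate, weighted by 1 - pi
    log_h2 = (left.log_d + right.log_d - log_d) + left.log_tree + right.log_tree
    log_tree = log_add(log_h1, log_h2)
    return Node(rows, left.members + right.members, log_d, log_tree,
                log_h1 - log_tree, (left, right))


def bhc(data, alpha=1.0):
    """Greedy bottom-up clustering; returns the root of the tree."""
    nodes = [Node([row], [i], math.log(alpha), log_marginal([row]))
             for i, row in enumerate(data)]
    while len(nodes) > 1:
        # Score every pair and keep the merge with the highest probability.
        best = max((merge(x, y, alpha) for x, y in combinations(nodes, 2)),
                   key=lambda node: node.log_r)
        nodes = [n for n in nodes if n not in best.children] + [best]
    return nodes[0]


def cut_tree(node, threshold=0.5):
    """Cut the tree at merges whose posterior probability is below threshold."""
    if not node.children or node.log_r >= math.log(threshold):
        return [sorted(node.members)]
    left, right = node.children
    return cut_tree(left, threshold) + cut_tree(right, threshold)


if __name__ == "__main__":
    toy = [[1, 1, 1, 1, 0, 0, 0, 0]] * 3 + [[0, 0, 0, 0, 1, 1, 1, 1]] * 3
    print(cut_tree(bhc(toy)))
```

This sketch rescores every pair at every step, which is simple but slow. It is meant only to show how Bayesian model selection replaces a distance metric when choosing which clusters to merge.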
