My blog on quantitative financial analysis, artificial intelligence for stock investment and trading, and the latest progress in signal processing and machine learning

Thursday, July 7, 2011

When Bayes Meets Big Data

In the June issue of The ISBA Bulletin, Michael Jordan wrote an article titled "The Era of Big Data". The article discusses the possibilities and challenges of applying Bayesian techniques to Big Data (e.g., terabytes, petabytes, exabytes, and zettabytes of data). Michael points out several advantages of Bayes over non-Bayes, which I quote here:

(1) Analyses of Big Data often have an exploratory flavor rather than a confirmatory flavor. Some of the concerns over family-wise error rates that bedevil classical approaches to exploratory data analysis are mitigated in the Bayesian framework.

(2) In the sciences, Big Data problems often arise in the context of “standard models,” which are often already formulated in probabilistic terms. That is, significant prior knowledge is often present and directly amenable to Bayesian inference.

(3) Consider a company wishing to offer personalized services to tens of millions of users. Large amounts of data will have been collected for some users, but for most users there will be little or no data. Such situations cry out for Bayesian hierarchical modeling.

(4) The growing field of Bayesian nonparametrics provides tools for dealing with situations in which phenomena continue to emerge as data are collected. For example, Bayesian nonparametrics not only provides probability models that yield power-law distributions, but it provides inferential machinery that incorporate these distributions.
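
To make point (3) a bit more concrete, here is a minimal sketch of the shrinkage idea behind hierarchical modeling. It is my own toy illustration, not something from Michael's article. I assume a Beta-Binomial model for each user's click rate, and the prior parameters `prior_a` and `prior_b`, the user names, and the counts are all made up. A real hierarchical model would learn the prior from the whole user population rather than fix it by hand.

```python
def posterior_mean(successes, trials, prior_a=2.0, prior_b=8.0):
    """Posterior mean of a rate under a Beta(prior_a, prior_b) prior.

    With k successes in n trials, the posterior is
    Beta(prior_a + k, prior_b + n - k), whose mean is
    (prior_a + k) / (prior_a + prior_b + n).
    The prior values are assumptions chosen for illustration only.
    """
    return (prior_a + successes) / (prior_a + prior_b + trials)


# Hypothetical users: (clicks, impressions).
users = {
    "heavy_user": (450, 1000),  # lots of data: estimate stays near 450/1000
    "light_user": (1, 2),       # little data: estimate is pulled toward the prior
    "new_user": (0, 0),         # no data: estimate equals the prior mean
}

estimates = {name: posterior_mean(k, n) for name, (k, n) in users.items()}

for name, value in estimates.items():
    print(f"{name}: {value:.3f}")
```

The raw rate for the light user would be 1/2, but with only two impressions the estimate is pulled most of the way back to the prior mean of 0.2, while the heavy user's estimate barely moves from the raw rate.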

Based on my experience in compressed sensing, I feel that Bayes provides a more flexible way to exploit structured sparsity. This power cannot be obtained from non-Bayes methods. However, Bayes is computationally demanding, so combining Bayes and non-Bayes is my research theme in compressed sensing. This is why I wrote these two papers:

Z. Zhang, B. D. Rao, Iterative Reweighted Algorithms for Sparse Signal Recovery with Temporally Correlated Source Vectors, ICASSP 2011.

Z. Zhang, B. D. Rao, Exploiting Correlation in Sparse Signal Recovery Problems: Multiple Measurement Vectors, Block Sparsity, and Time-Varying Sparsity, ICML 2011 Workshop on Structured Sparsity.





Pictures above: Nepenthes jamban (growing on my patio)
This rare species was discovered on the island of Sumatra, Indonesia, in 2005. The pitchers have a unique toilet shape, so the plant was affectionately named jamban, which means toilet in Indonesian.
