My blogs reporting quantitative financial analysis, artificial intelligence for stock investment & trading, and latest progress in signal processing and machine learning

Wednesday, May 29, 2013

Use Block Sparse Bayesian Learning (BSBL) for Practical Problems

In LinkedIn, Phil, Leslie, and me have a productive discussion on the use of BSBL and EM-GM-AMP in practical problems. The whole discussions can be seen at here. For convenience, I copied my words on the use of BSBL and related issues below.

Below are several practical examples where intra-block correlation exists and BSBL can be used. In fact, there are many examples in practice.

1. Localization of distributed sources (not point-sources). In EEG/MEG source localization, the sources are generally not just a point. They have areas, so they are called distributed sources. When modeling the problem using a sparse linear regression model: y=Ax + v, the coefficient vector x is expected to contain several nonzero blocks, each nonzero block corresponding to a distributed source. Entries in the same nonzero block are highly correlated to each other in amplitude, since they are all associated with the same source.

(Remark: There are a number of work assuming the sources are point-sources as well. Thus the problem becomes a traditional DOA problem. However, in this case, BSBL can be applied as well, which I will explain later.)


2. In compressed sensing of ECG, an ECG signal has clear block structure (generally all the blocks are nonzero, but some blocks are very close to zero), and in each “block” the entries are highly correlated in amplitude. One can see the Fig.1 in my T-BME paper ( http://dsp.ucsd.edu/~zhilin/papers/Zhang_TBME2012.pdf ) to get such feeling.

(Remark: Almost all the physiological signals have correlation in the time domain, although some of them have not clear block-structure. However, BSBL can also be used for such signals. I will explain later).


3. One should note that the mathematical model used in compressed sensing is a linear regression model, and such a model has numerous applications in almost every field. For many applications, the regression coefficients have block structure. In each block, the entries are associated with a same physical “force” or a “factor”, and thus correlated in their amplitude.


In fact, as long as a signal has block structure, then it is highly possible that intra-block correlation also exists. Even if the intra-block correlation= 0, it is not harmful to run BSBL, because, as you can see from my T-SP paper, it also has excellent performance (and outperforms all the known block-structure-based algorithms) when intra-block correlation = 0. You do not need to test whether the correlation exists. You just need to run. That’s it. 



 Now I want to emphasize that BSBL can be used in the general-sparse problems (i.e., no structure in the coefficient vector x). This is due to two factors.

1.One factor is that truly sparse signals do not exist in most practical problems; in fact, they are non-sparse.

As I have mentioned in our personal communication, and also mentioned by Phil and many other people, the truly sparse signal does not exist in most practical problems. Those “sparse” signals are compressive; most of their entries in the time domain (or the coefficients in some transformed domains) are close to zero but strictly nonzero. In other words, they are non-sparse ! Using general-sparse recovery algorithms (such as Lasso, CoSaMP, OMP, FOCUSS, the basic SBL) can recover those entries with large amplitude. But they always have challenges in recovery of the entries with small amplitude. So, the quality of the recovered data has a “glass ceilling”, which is not very high.

However, BSBL has a unique property, i.e., recovering non-sparse signals (or signals with non-sparse representation coefficients) very well. As for the recovery of non-sparse signals, I do have a number of work showing this. Please refer to my T-BME papers:
[1] Compressed Sensing for Energy-Efficient Wireless Telemonitoring of Noninvasive Fetal ECG via Block Sparse Bayesian Learning
[2] Compressed Sensing of EEG for Wireless Telemonitoring with Low Energy Consumption and Inexpensive Hardware

In fact, we have achieved much better results than [1-2], which will be release soon.


2.The second factor is that BSBL is a kind of multiscale regression algorithms (although very naïve).


In DOA or similar applications (e.g. earth-quake detection, brain source localization, or some communication problems), the matrix A (remind the model: y=Ax + v) is highly coherent (i.e., columns of A are highly correlated to each other). In this case, even if x is a very sparse vector, this problem is very difficult, especially in noisy situations. Using the basic SBL can get better performance than most existing algorithms. But we found that using BSBL can get much better performance. This is mainly because that BSBL, using the block partition, divides the whole search space (i.e. the number of whole locations in x) into a number of sub search spaces (i.e. the number of candidate nonzero blocks). This makes the localization problems become easier. Since x is sparse, generally the nonzero blocks are only a few and zero blocks are many. During iteration, the zero blocks are deleted in BSBL gradually. And the problem becomes easier and easier with iteration. I can dynamically change the block partition in BSBL according to some criterion. However, experience showed that it is not necessary to do this. BSBL can eventually find the correct locations of nonzero entries in x (although the rest entries in a nonzero block have very small amplitude).

The Fig.4 in my ICASSP 2012 paper ( http://sccn.ucsd.edu/~zhang/Zhang_ICASSP2012.pdf ) may give some feeling on what I said “the rest entries in a nonzero block have very small amplitude”.



Last, but not least, I want to say that a comparison between BSBL with other algorithms is helpful to everybody. I will be very happy to see the result. But I suggest one performs the comparison in some practical problems, since the readers generally come from various application fields with questions similar like “which algorithms is the best one for my problems”. Putting the comparison in specific practical problems is more informative and really helpful. Computer simulations have several problems, such as the suitable performance index, the consistency of the simulation model with the underlying models in practical problems, what’s the criterion to choose the algorithms (e.g. for an audio compressed sensing problem, does one think it is a general-sparse recovery problem, or a block-sparse recovery problem, or a non-sparse recovery problem?). Due to these issues, the conclusions may be not solid, and even more or less misleading.

MSE is generally not a good performance index. For example, for image quality, MSE is not recommended (what’s the suitable performance index for images is still a hot topic in the image processing field). In my experience on compressed sensing of EEG, I even found that MSE always misleading when I compared BSBL with my STSBL algorithms. This is why in my two T-BME papers (and other papers coming out), I used a task-oriented performance measure criterion. That is, after recovered the data,

(1) performing task-required signal processing or pattern recognition on the recovered data, obtaining result A;
(2) performing the same task-required signal processing or pattern recognition on the original data (or the recovered data by another algorithm), obtaining result B;
(3) comparing result A with result B, which tell me which algorithm has better data recovery ability. 



PS:
Many people asked me whether my BSBL codes can be used for complex-valued problems. My answer is YES. But you need to transform your complex-valued problem into a real-valued problem, as I showed here: https://sites.google.com/site/researchbyzhang/publication3/BSBL4complex.pdf?attredirects=0
This transform is very simple. You just need no more than 1 minute to do this.

I will update my BSBL codes in the near future such that no need to do the transform.





Friday, April 12, 2013

Compressed Sensing of EEG Using Wavelet Dictionary Matrices

Since my paper "Compressed Sensing of EEG for Wireless Telemonitoring with Low Energy Consumption and Inexpensive Hardware (IEEE T-BME, vol.60, no.1, 2013)" has been published, lots of people asked me how to do the compressed sensing of EEG using wavelets. Their problem was that Matlab has no function to generate the DWT basis matrix (i.e. the matrix D in my paper). One has to generate such matrices using other wavelet toolboxes. Now I updated my codes, where I gave a guide to generate such dictionary matrices using the wavelab (http://www-stat.stanford.edu/~wavelab/) , and there is a demo to show how to use a DWT basis matrix as the dictionary matrix for compressed sensing of EEG (demo_useDWT.m). The codes ('Compressed Sensing of EEG_v2.zip) can be downloaded at here

BTW: Please keep in mind that EEG is generally not sparse except to some special situations.