My blogs reporting quantitative financial analysis, artificial intelligence for stock investment & trading, and latest progress in signal processing and machine learning

## Friday, April 15, 2011

### Is MMV more suitable for dynamic environment (time-varying sparsity) than KF-CS and LS-CS? (1)

It is viewed that the MMV model is an special case of the dynamic sparse model, and thus some people probably believe that algorithms for the MMV model may not be suitable for the dynamic sparse model. Here I'll give an experiment to show that it is not the case. It is more probably that algorithms for the MMV model is more suitable for the dynamic sparse model.

First, I'll give the necessary background on the two models.

The MMV model (i.e. multiple measurement vector model) is an extension of the basic compressed sensing model. It is expressed as:
Y = AX + V,
where Y is the N by L measurement matrix (known), A is the N by M dictionary matrix (known), and X is the unknown M by L coefficient matrix (unknown). V is the unknown noise matrix. Clearly, when L=1, the model is the basic compressed sensing model (the model from which the Lasso, Basis Pursuit, Matching Pursuit and other numerous algorithms were derived). In the MMV model, a key assumption is that there are few nonzero rows in X (i.e.  the common sparsity assumption). It is worthy mentioning that under mild conditions algorithms' recovery performance is exponentially increased with increasing L (i.e. the number of measurement vectors). And in many applications (such as MRI image series processing, Source localization, etc) we can always obtain several or many measurement vectors. Many algorithms have been proposed, such as M-SBL [1] (assuming each row in X has no temporal correlation) and T-SBL/T-MSBL [2] (assuming each row in X could have temporal correlation).

However, in applications we cannot ensure all the column vectors in X have the same support (i.e. indexes of the nonzero elements in a column vector). It is more probably that the support of columns is slowly changing, i.e. the columns in X have time-varying sparsity patterns. This model is generally called the dynamic sparse model. Several algorithms have been proposed, such as the Kalman Filtered Based Compressed Sensing (KF-CS) [3], Least-Square Compressed Sensing (LS-CS) [4], and the algorithm based on Belief Propagation [5].

Since the common sparsity assumption in the MMV model is not satisfied in the dynamic sparse model, one may think that algorithms derived in the MMV model may not be suitable for the dynamic sparse model. However, as long as the number of total nonzero rows in X is not too many (e.g. less than (N+L)/2 [6]), MMV algorithms can be used for the dynamic sparse model, and even have better performance.

Here is an experiment, which compares the M-SBL,  KF-CS and LS-CS. The simulation data and the KF-CS and LS-CS codes all come from the author's website: http://home.engineering.iastate.edu/~namrata/research/LSCS_KFCS_code.zip. The M-SBL code comes from my website: http://dsp.ucsd.edu/~zhilin/ARSBL_1p2.zip.

I directly used the data generated by the m-file: runsims_final.m in the LSCS_KFCS_code.zip downloaded via the above link,  and used the original matlab command to call KF-CS and LS-CS algorithms, i.e:
[x_upd,T_hat] = kfcs_full(y,Pi0,A,Q,R)
[x_upd_lscs,T_hat_lscs,x_upd_csres] = lscs_full(y,Pi0,A,Q,R);
This commands were given by the author. I didn't change anything.

The command to call M-SBL is given by:
[x_sbl,gamma,gamma_used,k] = MSBL(A, y, R(1,1), 0, 1e-4, 1000, 0);

Experiment was repeated for 63  trials. The averaged results (from Time 0 to Time 60) are shown in the following pic (click the pic for large view):
Here we can see, both KF-CS  and LS-CS have large errors when the sparsity profile of the columns in X changes (indicated by the four large bumps). In contrast, M-SBL does not have this phenomenon and has much small errors in most duration. KF-Genie indicates the Kalman-Filter estimation when the support of each column in X is known.

The averaged elapse time is (in my notebook):
KF-CS is 168.5 seconds
LS-CS is 168.3 seconds
M-SBL is  19.7 second

I have to emphasize, if we segment the whole data into several shorter segments, and then perform M-SBL on each segment and concatenate the results from each segment, we can get better performance.

So, the conclusion is, the MMV model has great potential for the problem of time-varying sparsity patterns. This is somewhat like that people generally view each short segment of a nonstationary signal as a stationary signal. So, viewing a time-varying sparsity model as concatenation of several MMV models is probably a better way to solve the time-varying sparsity model.

Some of the results can be found in [7].

Reference:

[1] D.P. Wipf and B.D. Rao, An Empirical Bayesian Strategy for Solving the Simultaneous Sparse Approximation Problem, IEEE Transactions on Signal Processing, vol. 55, no. 7, July 2007

[2] Z.Zhang, B.D.Rao, Sparse Signal Recovery with Temporally Correlated Source Vectors Using Sparse Bayesian Learning, accepted by IEEE Journal of Selected Topics in Signal Processing, [arXiv:1102.3949]

[3] Namrata Vaswani, Kalman Filtered Compressed Sensing, IEEE Intl. Conf. Image Proc. (ICIP), 2008

[4] Namrata Vaswani, "LS-CS-residual (LS-CS): Compressive Sensing on the Least Squares Residual", IEEE Trans. Signal Processing, August 2010

[5] J.Ziniel, L.C.Potter, P.Schniter, Tracking and smoothing of time-varying sparse signals via approximate belief propagation, Asilomar 2010

[6] SF Cotter, BD Rao, Sparse solutions to linear inverse problems with multiple measurement vectors, IEEE Trans. on Signal Processing, 2005

1. Hi! I think it's an interesting idea to bring in the MMV ideas in to look at the dynamic compressive sensing problem, but there's one thing missing, I think, in the comparison. You compare to the KF-CS and LS-CS which are meant to try to run in real-time, while (as far as I know) MMV techniques are batch processing techniques. This is why they suffer so dramatically at the points where the sparsity changes a great deal - those curves are exactly the curves that appear in other real-time tracking solutions (non-genie Kalman, RLS etc...).

So my questions are: 1) did you adapt the MMV solution to run causally? If not, also the timing is deceptive since a causal run with a moving window will need ~19s while the causal solutions only seem to take ~3s per iteration.
2) Have to compared to other batch processing techniques (for instance Giannakis has some batch processing \ell_1 solutions for the dynamic problem)?

2. Hi, shabtai,

Thank you for your comments. Yes, KF-CS and LS-CS can run in real-time. However, MMV algorithms are not absolutely batch algorithms. They can run in block on-line style, i.e, collecting several measurement vectors (say, 3 or 5) and then processing the small-sized batch, obtaining the reconstruction results for this batch, and then continuing the next small batch. In this blog-poster, I showed the result of MSBL on the whole data. And in the series-2 and series-3 posters, I showed the results of MSBL implemented in block on-line style. In practice, it is possible to collect several measurement vectors and then apply MMV algorithms on the collected measurement vectors. Since lots of MMV algorithms can have fast speed, in some applications real-time requirement may also be satisfied by MMV algorithms. (I have to say, SBL algorithms are not the very fast ones. However, there exist super fast MMV algorithms)

As we know, (under mild conditions) algorithms performance exponentially increases with increasing number of measurement vectors (even collect two vectors!), I think using MMV in dynamic CS has benefits.

(1) I think the block on-line method is strictly not a causal way (actually is a smooth way). In the experiment of this poster, I applied MSBL on the whole data, and the total time was about 19s. In this experiment, the input parameter MIN_GAMMA was set to be 1e-4 (pls. see the MSBL file).

In the experiment of the series-3 poster (http://marchonscience.blogspot.com/2011/04/is-mmv-more-suitable-for-dynamic_18.html), I used the block on-line method. The total elapse time (to process the whole data) was about 3s. However, in this experiment the input argument MIN_GAMMA for MSBL was set to be 1e-3. Note that different values of MIN_GAMMA can result in significant change in speed. You can download the code of MSBL and try different values of MIN_GAMMA and you will be surprised by the speed change (could be more than 100 times faster or slower).

(2) I didn't compare their algorithms. It is interesting to see such comparison. However, since the code of MSBL (and my T-MSBL/T-SBL soon) can be downloaded from the link I provided, every one, if interested in such comparison, can carry out the experiments and I am very happy to see the results.

Again, thank you for your comments. If I cannot explain clear, feel free to tell me.

3. Hi Zhilin,

Thanks for the response! I'm still not sure about the timing - it seems very dependent on the parameters, so I'll have to play around with it a bit on my own.

Thanks again,