Since my previous work focused on ICA with applications to ECG, I have a strong interest in compressed sensing applied to ECG telemonitoring via wireless body-area networks. This is a promising application of compressed sensing because the ECG signal is "believed" to be sparse and compressed sensing can save much power. Thus, I have read dozens of papers on this emerging application. But I have to say, I am totally confused by current work in this direction. My main confusion is that few works seriously consider the noise.
You may ask: where is the noise? Let's see the basic compressed sensing model:
y = A x + v.
Of course, provided that the sensor devices are of high quality, the noise vector v can be very small. However, the signal x (i.e., the recorded ECG signal before compression) contains strong noise! Note that the application is telemonitoring via wireless body-area networks. Simply put, a battery-powered device is placed on your body to record various physiological data and send these data (via Bluetooth) to your cell phone, iPhone, iPad, etc. for advanced processing; the data are then forwarded to remote terminals for other uses. In this application, you are free to walk around. Each movement, even a very small one, may introduce large disturbances and noise into the recorded signal.
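To see why the noise in x matters, here is a minimal sketch (with a synthetic, ECG-like sparse signal and a made-up baseline-wander term standing in for breathing artifacts, not real data): even if the sensor noise v is tiny, the in-signal noise is compressed together with x and dominates the effective measurement noise.

# Minimal sketch of the model y = A x + v with a noisy x, under the
# synthetic assumptions described above (not real ECG data).
import numpy as np

rng = np.random.default_rng(0)

n, m = 512, 128                                   # signal length, number of measurements
x_clean = np.zeros(n)
x_clean[rng.choice(n, 15, replace=False)] = rng.normal(0, 1, 15)   # sparse "QRS-like" spikes

# Baseline wander: a slow sinusoid mimicking respiration artifacts.
t = np.arange(n)
wander = 0.5 * np.sin(2 * np.pi * t / 200.0)
x_noisy = x_clean + wander                        # what the body sensor actually records

A = rng.normal(0, 1.0 / np.sqrt(m), (m, n))       # random sensing matrix
v = 1e-3 * rng.normal(0, 1, m)                    # small sensor noise

y = A @ x_noisy + v                               # y = A(x_clean + wander) + v

# The effective measurement noise is A @ wander + v, dominated by the wander:
print(np.linalg.norm(A @ wander), np.linalg.norm(v))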
To get a basic feeling for this, I paste an ECG signal recorded from the abdomen of a pregnant woman who was quietly lying on a bed (not walking), so the major noise comes from her breathing. (I know that ECG sensors are generally placed on the chest. This example is just to show the noise amplitude and how it changes the sparsity of the signal.) Let's look at the raw ECG data:
Can you see the noise from her breathing? Is the signal sparse or compressible? You could use some threshold to remove the noise, but you may also lose important components of the ECG signal (e.g., the P wave, T wave, etc.). Moreover, the threshold should be data-adaptive: different people have ECGs with different amplitudes, and the contact quality between sensor and skin also affects the amplitude, so you need an algorithm to adaptively choose a suitable threshold (a minimal sketch of one such rule is given below). That thresholding algorithm in turn increases the complexity of the chip design and the power consumption, which could make this application of compressed sensing impractical. Note that the woman was quietly lying on the bed. In a real body-area-network application, the noise from arm movement, walking, or even running is far larger than this.
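For illustration, here is a minimal sketch of one possible data-adaptive rule (not taken from any specific paper): scale the cutoff by a robust noise estimate such as the median absolute deviation. The problem described above is visible in the code itself: any component whose amplitude is comparable to the estimated noise level, such as a small P or T wave, gets zeroed out along with the noise.

# One possible data-adaptive threshold (illustrative assumption, not a
# published method): cutoff proportional to a robust noise estimate.
import numpy as np

def adaptive_threshold(x, k=3.0):
    """Zero out samples whose magnitude is below k times a robust noise estimate."""
    sigma = np.median(np.abs(x - np.median(x))) / 0.6745   # MAD estimate of the noise level
    tau = k * sigma                                        # data-adaptive cutoff
    x_thr = np.where(np.abs(x) > tau, x, 0.0)              # low-amplitude waves are lost too
    return x_thr, tau

Even such a simple rule adds computation on the sensor node, which is exactly the power and chip-complexity concern raised above.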
So, I strongly suggest that future work on this topic seriously consider the noise from movement and derive "super" compressed sensing algorithms for this application. The MIT-BIH dataset (used in many existing papers) is thus not suitable. In one of my papers in preparation, I tried many famous algorithms and all of them failed. A main reason is that the field of compressed sensing lacks algorithms that account for noise in the signal itself.
Monday, September 19, 2011
Sunday, September 11, 2011
Erroneous analyses widely exist in neuroscience (and beyond)
Today Neuroskeptic posted a new blog entry, "Neuroscience Fails Stats 101?", which discusses a recently published paper:
S. Nieuwenhuis, B. U. Forstmann, and E.-J. Wagenmakers, "Erroneous analyses of interactions in neuroscience: a problem of significance," Nature Neuroscience, vol. 14, no. 9, 2011.
The paper mainly discusses significance tests. However, I have to say that when people apply machine learning techniques to neuroscience data (e.g., EEG, fMRI), erroneous analyses (even logically wrong ones) also exist. Sometimes the errors are not explicit, which makes them more harmful.
One example is the application of ICA to EEG/MEG/fMRI data. A key assumption of ICA is the independence (or at least uncorrelatedness) of the "sources". This assumption is obviously violated in these neuroscience data, but some people seem to be too brave when using ICA for analysis.
I am not saying that using ICA to analyze neuroscience data is wrong. My point is that people should be more careful when using it:
(1) First, you should deeply understand ICA. You need to read enough classical papers, or even carefully read a book (e.g., A. Hyvarinen's book, Independent Component Analysis).
I have seen some people read only one or two papers and then jump straight into the "ICA analysis" job. Due to the availability of various ICA toolboxes for neuroscience, some people did not read any paper at all and could not even correctly write down the basic ICA model (really!); a minimal example of that model is sketched after this list.
This is very dangerous, because ICA is a complicated model and, unfortunately, neuroscience is an even more complicated field (probably the most complicated field in science). Nobody in the world has exact knowledge of the "sources" of EEG/MEG/fMRI data. As a result, people cannot tell whether the ICA separation is successful. This differs from other fields, where people can easily check whether their ICA worked. For example, when people use ICA to separate speech signals, they can listen to the separated signals to judge whether the separation succeeded. But in neuroscience, you CANNOT. We still lack much knowledge about these "sources" of EEG/MEG/fMRI data. This requires analysts to deeply understand the mathematical tools they are using: their sensitivity, their robustness, all the ways they can fail, etc.
It has been observed that ICA can split a signal emitted from a single active brain area into two or more "independent sources". It has been observed that ICA only provides a temporally averaged spatial distribution. It has also been observed that ICA fails when several brain activities are coupled. Yet all these warnings are ignored by those brave people.
(2) Be careful when chaining two or more advanced machine learning analyses (e.g., ICA separation in one domain followed by ICA separation in another domain, ICA followed by another exploratory data analysis, etc.). Due to the mismatch between ICA models and neuroscience data, errors always exist. However, we have no knowledge of the errors introduced by ICA, so they are unpredictable, and they can be unpredictably amplified when another advanced machine learning algorithm is applied after ICA. The same goes for applying other advanced algorithms in succession.
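As promised above, here is a minimal sketch of the basic noiseless ICA model (observations are linear mixtures of statistically independent sources) using synthetic sources where the ground truth is known. With EEG/MEG/fMRI the true sources are unknown, which is exactly why judging the separation is so hard; this example only illustrates the model itself.

# Basic ICA model X = S A^T on synthetic sources; with real brain data
# there is no ground-truth S to compare against.
import numpy as np
from sklearn.decomposition import FastICA

rng = np.random.default_rng(0)
t = np.linspace(0, 8, 2000)

# Two synthetic independent sources (in neuroscience, the true sources are unknown).
s1 = np.sin(2 * np.pi * 1.0 * t)                 # sinusoidal source
s2 = np.sign(np.sin(2 * np.pi * 0.3 * t))        # square-wave source
S = np.c_[s1, s2]

A = np.array([[1.0, 0.5],                        # mixing matrix (unknown in practice)
              [0.4, 1.0]])
X = S @ A.T                                      # observed mixtures

ica = FastICA(n_components=2, random_state=0)
S_est = ica.fit_transform(X)                     # estimated sources (up to scale and order)

# Here we can compare S_est with S because we simulated S ourselves;
# with real EEG/MEG/fMRI recordings this comparison is impossible.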
In summary, ICA is a tiger: to tame it, the handler needs to be very skilled; otherwise, the handler will be seriously harmed by it.
---------------------------------------------------------------------------------------------
Nepenthes x dyeriana.
This Nepenthes was given to me by my friend Bob as a gift. It is a rare hybrid. The photo was taken by my friend Luo.
Bayesian Group Lasso Using Non-MCMC?
Recently I read several papers on the Bayesian group Lasso. A common characteristic of these works is that they adopt an MCMC approach for inference. Due to MCMC, these algorithms unfortunately run very slowly. I am wondering whether there exists a Bayesian group Lasso that does not rely on MCMC?
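For reference, the non-Bayesian group Lasso needs no sampling at all: proximal gradient methods solve it with a simple group soft-thresholding step. The sketch below shows only that proximal step (my own illustration, not a Bayesian algorithm), just to make the block-sparsity structure of the penalty concrete; the open question is whether a comparably cheap deterministic inference scheme (e.g., variational Bayes) exists for the Bayesian formulation.

# Group soft-thresholding: the proximal operator of the group Lasso penalty,
# used by (non-Bayesian) proximal gradient solvers.
import numpy as np

def group_soft_threshold(w, groups, tau):
    """Shrink each group of coefficients toward zero by tau in l2 norm.

    w      : 1-D coefficient vector
    groups : list of index arrays, one per group
    tau    : threshold (step size times regularization weight)
    """
    w_new = w.copy()
    for idx in groups:
        norm = np.linalg.norm(w[idx])
        if norm <= tau:
            w_new[idx] = 0.0                     # the whole group is zeroed out
        else:
            w_new[idx] = (1.0 - tau / norm) * w[idx]
    return w_new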
Thursday, September 1, 2011
2011 Impact Factor of Journals
The newest JCR report came out in June. The following are some journals within my research scope. Of course, impact factors do not tell the whole story: a paper published in a journal with a high impact factor is not necessarily better than a paper published in a journal with a lower impact factor. So, just for fun.
Signal Processing:
IEEE Signal Processing Magazine (Impact Factor: 5.86)
IEEE Transactions on Signal Processing (TSP) (Impact Factor: 2.651)
IEEE Journal of Selected Topics in Signal Processing (J-STSP) (Impact Factor: 2.647)
Elsevier Signal Processing (Impact Factor: 1.351)
IEEE Signal Processing Letters (Impact Factor: 1.165)
EURASIP Journal on Advances in Signal Processing (EURASIP JASP) (Impact Factor: 1.012)
Biomedical Signal Processing:
NeuroImage (Impact Factor: 5.932)
Human Brain Mapping (Impact Factor: 5.107)
IEEE Transactions on Medical Imaging (Impact Factor: 3.545)
IEEE Transactions on Neural Systems and Rehabilitation Engineering (Impact Factor: 2.182)
Journal of Neuroscience Methods (Impact Factor: 2.1)
IEEE Transactions on Biomedical Engineering (Impact Factor: 1.782)
---------------------------------------------------------------
Nepenthes jamban
The following picture won first prize in the POTM Contest in July. It is the first time I have won it :)