Today Neuroskeptic posted a new blog entry: "Neuroscience Fails Stats 101?", which introduced a recently published paper:
S. Nieuwenhuis, B. U. Forstmann, and E.-J. Wagenmakers, "Erroneous analyses of interactions in neuroscience: a problem of significance," Nature Neuroscience, vol. 14, no. 9, 2011
The paper mainly discusses significance tests. However, I'd like to add that when people apply machine learning techniques to neuroscience data (e.g. EEG, fMRI), erroneous analyses (even logically wrong ones) also exist. Sometimes the errors are not explicit, which makes them more harmful.
One example is the application of ICA to EEG/MEG/fMRI data. A key assumption of ICA is the statistical independence, or at least the uncorrelatedness, of the "sources". This assumption is obviously violated in these neuroscience data. But some people seem to be too brave when using ICA for their analysis.
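For reference, the noiseless linear ICA model behind this assumption is usually written as below. The symbols x, A, s and W are notation introduced here for illustration; they are not taken from the paper above or from any particular toolbox.

```latex
\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t), \qquad
p(\mathbf{s}) = \prod_{i=1}^{n} p_i(s_i), \qquad
\hat{\mathbf{s}}(t) = \mathbf{W}\,\mathbf{x}(t)
```

Here x(t) is the vector of observed channels, A is the unknown mixing matrix, and s(t) holds the sources, which are assumed to be mutually independent (with at most one of them Gaussian). ICA estimates an unmixing matrix W so that the components of the estimate are as independent as possible, and even in the ideal case the sources are recovered only up to permutation and scaling. If the true sources are coupled, the factorization in the middle no longer holds, and the model is being applied outside its assumptions.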
I am not saying that using ICA to analyze neuroscience data is wrong. My point is that people should be more careful when using it:
(1) First, you should deeply understand ICA. You need to read enough of the classical papers, or even carefully read a book (e.g. A. Hyvärinen's book, Independent Component Analysis).
I have seen some people read only one or two papers and then jump into the "ICA analysis" job. Because various ICA toolboxes for neuroscience are readily available, some people have not read any paper at all, and cannot even write down the basic ICA model correctly (really!).
This is very dangerous, because ICA is a complicated model and, unfortunately, neuroscience is an even more complicated field (probably the most complicated field in science). Nobody in the world has exact knowledge of the "sources" of EEG/MEG/fMRI data. As a result, people don't know whether the ICA separation was successful. This is different from other fields, where people can easily check whether their ICA worked. For example, when people use ICA to separate speech signals, they can listen to the separated signals to judge whether the separation was successful or not. But in neuroscience, you CANNOT. We still lack much knowledge about these "sources" of EEG/MEG/fMRI data. This requires analysts to deeply understand the mathematical tools they are using: their sensitivity, their robustness, all the ways in which they can fail, etc.
It has been observed that ICA can split a signal emitted from one active brain area into two or more "independent sources". It has been observed that ICA only provides a temporally averaged spatial distribution. It has also been observed that ICA fails when several brain activities are coupled. However, all these warnings are ignored by those brave people.
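The point about coupled activity can be checked in a simulation, where, unlike with real EEG/MEG/fMRI data, the true sources are known. The sketch below is only a toy under stated assumptions: it uses a brute-force kurtosis-based ICA for two channels written for this post (not FastICA, Infomax, or any toolbox routine), and the mixing matrix, the source distributions and all function names are made up for illustration.

```python
import numpy as np

def whiten(x):
    """Center a (channels, samples) array and whiten it to identity covariance."""
    x = x - x.mean(axis=1, keepdims=True)
    eigval, eigvec = np.linalg.eigh(np.cov(x))
    return (eigvec / np.sqrt(eigval)) @ eigvec.T @ x

def excess_kurtosis(y):
    """Excess kurtosis of each row (zero for a Gaussian signal)."""
    y = (y - y.mean(axis=1, keepdims=True)) / y.std(axis=1, keepdims=True)
    return (y ** 4).mean(axis=1) - 3.0

def toy_ica_2ch(x, n_angles=360):
    """Toy 2-channel ICA: whiten, then grid-search the rotation that
    maximizes the summed absolute excess kurtosis of the outputs."""
    z = whiten(x)
    best, best_score = z, -np.inf
    for theta in np.linspace(0.0, np.pi / 2, n_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        y = np.array([[c, s], [-s, c]]) @ z
        score = np.abs(excess_kurtosis(y)).sum()
        if score > best_score:
            best, best_score = y, score
    return best

def match_score(true_s, est_s):
    """For each true source, take its best |correlation| with any estimate;
    return the worst of these (1.0 means every source was recovered)."""
    n = true_s.shape[0]
    corr = np.abs(np.corrcoef(true_s, est_s)[:n, n:])
    return corr.max(axis=1).min()

rng = np.random.default_rng(0)
n_samples = 20000

# Two independent, unit-variance, non-Gaussian sources (hypothetical choices).
s1 = rng.laplace(scale=1 / np.sqrt(2), size=n_samples)
u = rng.uniform(-np.sqrt(3), np.sqrt(3), size=n_samples)
mixing = np.array([[1.0, 0.6], [0.4, 1.0]])  # arbitrary mixing matrix

# Case 1: the ICA assumption holds.
independent = np.vstack([s1, u])
score_independent = match_score(independent, toy_ica_2ch(mixing @ independent))

# Case 2: the second source is coupled to the first (correlation about 0.8).
coupled = np.vstack([s1, 0.8 * s1 + 0.6 * u])
score_coupled = match_score(coupled, toy_ica_2ch(mixing @ coupled))

print(f"independent sources: worst match = {score_independent:.3f}")
print(f"coupled sources:     worst match = {score_coupled:.3f}")
```

In the coupled case the algorithm still returns two clean-looking components, but they correspond to the underlying independent parts rather than to the two coupled sources that actually generated the data. Nothing in the output itself signals this; it is visible here only because the ground truth is available.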
(2) Be careful when chaining two or more advanced machine learning analyses (e.g. ICA separation in one domain followed by ICA separation in another domain, ICA followed by another exploratory data analysis, etc.). Because the ICA model is inconsistent with neuroscience data, errors always exist. However, we don't have any knowledge of the errors produced by ICA. So these errors are unpredictable, and they can also be unpredictably amplified when we apply another advanced machine learning algorithm after ICA. The same goes for using other advanced algorithms in succession.
In summary, ICA is a tiger, and to control it, the controller needs to be very skilled; otherwise, the controller will be seriously harmed by it.
Nepenthes x dyeriana.
This Nepenthes was given to me by my friend Bob as a gift. It is a rare hybrid. The photo was taken by my friend Luo.