Development of signal analysis software


Christophe Pouzat, CNRS Paris
Samuel Garcia, CNRS Lyon

The scientific community using multi-electrode arrays (MEAs) is currently facing a number of problems:

The volume of data generated during an experiment is growing exponentially. Extracellular recordings can produce anywhere from a few tens of MB to a few tens of GB per hour of acquisition. Beyond the problem of storage, analysing these data raises particularly acute methodological problems.
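As a rough order-of-magnitude check on these figures, the raw data rate of a recording is the number of channels times the sampling rate times the size of one sample. The sketch below uses illustrative parameters that are assumptions, not values taken from any particular setup: 16-bit samples, one electrode sampled at 10 kHz, and a 64-channel array sampled at 30 kHz.

```python
def bytes_per_hour(n_channels, sampling_rate_hz, bytes_per_sample=2):
    """Raw (uncompressed) data volume produced by one hour of acquisition."""
    seconds_per_hour = 3600
    return n_channels * sampling_rate_hz * bytes_per_sample * seconds_per_hour


# Assumed example configurations (hypothetical, for illustration only).
single_electrode = bytes_per_hour(n_channels=1, sampling_rate_hz=10_000)
array_64 = bytes_per_hour(n_channels=64, sampling_rate_hz=30_000)

print(f"single electrode: {single_electrode / 1e6:.0f} MB per hour")  # 72 MB
print(f"64-channel array: {array_64 / 1e9:.1f} GB per hour")          # 13.8 GB
```

Under these assumptions the two configurations fall at the two ends of the range quoted above.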

Analysis of extracellular signals, whether at the level of cell discharges (action potentials or spikes) or of network phenomena (local field potential, LFP), requires increasingly sophisticated expertise. Moreover, the level of ‘noise’ in these signals, combined with the great complexity of the methods, precludes purely automatic analysis. As a result, the time spent analysing the data routinely and significantly exceeds the time spent acquiring it.

The neurophysiology community suffers from poor sharing of analysis methods and tools. Some laboratories are fortunate enough to have strong analytical skills (researchers or engineers), which often leads to algorithms that are developed and used only ‘locally’. These developments are not widely disseminated in the community, and the scenario in which a student or postdoctoral researcher has to develop a complete set of tools alone is all too common. The situation is all the more frustrating because these near-duplicate toolkits are often less well programmed than algorithms freely available online.

The burgeoning literature presenting new methods is often difficult to access for neurobiologists wishing to try them out, and the corresponding algorithms are not easy to find. This leads to one of several outcomes: the methods are not used, they are reimplemented in an error-prone and time-consuming way, or a long search is needed to obtain the original code.

Sometimes the community has several apparently incompatible analysis methods at its disposal. The classic example is spike sorting, which consists of assigning each cell discharge detected in the raw data to the neuron that produced it. Few laboratories have the time or expertise to compare different methods. A laboratory's choice of method is therefore often made without any real comparison of, or competition between, the available algorithms.

We propose to respond to these problems by developing an ‘open’ approach within this ‘analysis tools’ theme. The term ‘open’ covers two aspects here:

  • The software already available in the GDR laboratories (OpenElectrophy, SpikeOMatic, STAR, etc.) is free and its source code is available; in other words, it is open source.
  • The openness of these tools should also be understood as giving the user a wide range of methods that can be applied to the same dataset, that is, openness with regard to methods. This second point stems from the conviction that the data should ‘decide’ which method to use.

The main thrusts of the theme will therefore be as follows:

  • Identify the tools already available and in use in the GDR laboratories.
  • Pool these tools and make them compatible with each other.
  • Centralise different datasets so that algorithms can be tested and compared.

In a way, we are proposing to extend the concept of a ‘database’ to that of a ‘methods database’.
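To make the idea of a ‘methods database’ concrete, here is a minimal sketch in Python, under stated assumptions. It is not the interface of OpenElectrophy, SpikeOMatic or STAR; the names METHODS, register, score and compare_methods are hypothetical, the two detectors are deliberately simple stand-ins for real spike detection methods, and the trace is synthetic. The point is only that several methods are registered under a common interface, applied to the same dataset, and scored against the same known event times.

```python
import statistics

# Hypothetical registry: method name -> function(trace) -> list of sample indices.
METHODS = {}


def register(name):
    """Decorator adding a detection method to the registry."""
    def decorator(func):
        METHODS[name] = func
        return func
    return decorator


def upward_crossings(trace, threshold):
    """Indices where the trace crosses the threshold from below."""
    return [i for i in range(1, len(trace))
            if trace[i] >= threshold and trace[i - 1] < threshold]


@register("fixed_threshold")
def fixed_threshold(trace):
    # Fixed amplitude threshold; the value 3.0 is an arbitrary assumption.
    return upward_crossings(trace, 3.0)


@register("median_threshold")
def median_threshold(trace):
    # Threshold at 4 times a robust noise estimate, median(|x|) / 0.6745.
    sigma = statistics.median(abs(x) for x in trace) / 0.6745
    return upward_crossings(trace, 4.0 * sigma)


def score(detected, truth, tolerance=1):
    """Match detections to known event times within a tolerance (in samples)."""
    unmatched = list(detected)
    hits = 0
    for t in truth:
        match = next((d for d in unmatched if abs(d - t) <= tolerance), None)
        if match is not None:
            hits += 1
            unmatched.remove(match)
    return {"hits": hits,
            "missed": len(truth) - hits,
            "false_positives": len(unmatched)}


def compare_methods(trace, truth):
    """Apply every registered method to the same data and score each one."""
    return {name: score(method(trace), truth) for name, method in METHODS.items()}


# Synthetic dataset: small periodic 'noise' plus four events of known amplitude.
noise = [0.2, -0.1, 0.1, -0.2, 0.0]
trace = [noise[i % 5] for i in range(100)]
events = {20: 5.0, 50: 5.0, 65: 2.0, 80: 5.0}  # sample index -> amplitude
for index, amplitude in events.items():
    trace[index] = amplitude

results = compare_methods(trace, sorted(events))
for name, result in results.items():
    print(name, result)
```

On this toy trace the fixed threshold misses the small event at sample 65 while the noise-scaled threshold finds all four. That says nothing about real recordings; it only illustrates how a shared dataset with known event times lets the data ‘decide’ between methods.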