The amount of available data relevant for clinical decision support is growing rapidly, and much faster than our ability to analyze and interpret it. As a result, the potential of the data to contribute to determining the diagnosis, therapy, and prognosis of an individual patient is not adequately exploited.
The hope of deriving benefit from the data for an individual patient must be accompanied by reliable and diligent biostatistical analysis, which faces serious challenges that are not always clear to non-statisticians. The aim of this paper is to discuss principles of statistical analysis of big data in research and routine applications in clinical medicine, focusing on particular aspects of psychiatry.
The paper argues that the biostatistical analysis of data in a specialty field such as psychiatry requires different approaches and different experience than in other clinical fields. This is illustrated by a description of common complications in the analysis of psychiatric data.
Challenges of analyzing big data in both psychiatric research and routine practice are explained; such analysis is far from a routine service activity that merely applies standard methods of multivariate statistics and/or machine learning. Research questions that are important in current psychiatric research are presented and discussed from the biostatistical point of view.