08.00.13 Mathematical and instrumental methods of Economics
BASIC REQUIREMENTS FOR DATA ANALYSIS METHODS (ON THE EXAMPLE OF CLASSIFICATION TASKS)
Description: Classification methods need to be put in order. Doing so will increase their role in solving applied problems, in particular in the diagnostics of materials. The first step is to formulate the requirements that classification methods must satisfy; an initial formulation of such requirements is the main content of this work. Mathematical classification methods are treated as part of applied statistics. The paper discusses the natural requirements on data analysis methods and on the presentation of calculation results that follow from the achievements and ideas accumulated by the national probabilistic and statistical scientific school. Concrete recommendations are given on a number of issues, and individual errors are criticized. In particular, data analysis methods must be invariant with respect to the permissible transformations of the scales in which the data are measured, i.e., the methods should be adequate in the sense of measurement theory. Every specific statistical method of data analysis rests on some probabilistic model. That model should be clearly described and its premises justified, either from theoretical considerations or experimentally. Data processing methods intended for real-world problems should be investigated for stability with respect to permissible deviations in the initial data and in the model premises. The accuracy of the solutions produced by the method should be stated. When results of statistical analysis of real data are published, their accuracy must be indicated (confidence intervals). To assess the quality of a classification algorithm, it is recommended to use predictive power rather than the proportion of correct forecasts. Mathematical research methods are divided into "exploratory analysis" and "evidence-based statistics." Specific requirements for data processing methods arise in connection with their "docking" when they are executed in sequence.
The article discusses the limits of applicability of probabilistic-statistical methods. Concrete statements of classification problems and typical errors made when applying various methods to solve them are also considered.
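The invariance requirement stated above can be illustrated with a minimal sketch. It assumes data measured on an ordinal scale, where any strictly increasing transformation is permissible. The two groups and the choice of the square root as the transformation are hypothetical and serve only as an illustration; they are not taken from the article. The sketch checks whether the conclusion "group A is larger than group B" survives the transformation when the groups are compared by means and by medians.

```python
import math
from statistics import mean, median

# Hypothetical ordinal measurements for two groups (illustrative values only).
group_a = [1, 2, 10]
group_b = [3, 4, 5]


def compare(stat, xs, ys):
    """Return -1, 0 or 1 when stat(xs) is below, equal to or above stat(ys)."""
    a, b = stat(xs), stat(ys)
    return (a > b) - (a < b)


def transform(xs, g):
    """Apply the scale transformation g to every observation."""
    return [g(x) for x in xs]


# A strictly increasing map, assumed here to be a permissible transformation
# of an ordinal scale.
g = math.sqrt

mean_before = compare(mean, group_a, group_b)
mean_after = compare(mean, transform(group_a, g), transform(group_b, g))
median_before = compare(median, group_a, group_b)
median_after = compare(median, transform(group_a, g), transform(group_b, g))

print("means:  ", mean_before, "->", mean_after)
print("medians:", median_before, "->", median_after)
```

With these values the comparison of means changes sign after the transformation, while the comparison of medians does not. In this sense a conclusion based on means is not adequate for ordinal data, whereas one based on medians is.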