Bauman Moscow State Technical University
List of articles written by the authors of the organization
-
NON-NUMERICAL DATA STATISTICS IS A CENTRAL PART OF MODERN APPLIED STATISTICS
08.00.13 Mathematical and instrumental methods of Economics
Description: In 1979, non-numerical data statistics was singled out as an independent area of applied statistics. Initially, the term "statistics of objects of non-numerical nature" was used to denote this area of mathematical methods of economics. Our basic textbook in this field is titled "Non-Numeric Statistics". Non-numerical data statistics is one of the four main areas of applied statistics (along with statistics of numbers, multivariate statistical analysis, and statistics of time series and random processes). It is divided into statistics in spaces of a general nature and sections devoted to specific types of non-numerical data (statistics of interval data, statistics of fuzzy sets, statistics of binary relations, etc.). Currently, statistics in spaces of a general nature is the central part of applied statistics, and non-numerical data statistics, which includes it, is the main area of applied statistics. This statement is confirmed, in particular, by an analysis of publications in the section "Mathematical Research Methods" of the journal "Industrial Laboratory. Diagnostics of Materials", the main venue for Russian studies on applied statistics. This article analyzes the basic ideas of non-numerical data statistics against the background of the development of applied statistics, from the perspective of a new paradigm of mathematical research methods. Various types of non-numerical data are described. The historical path of statistical science is analyzed, and the development of non-numerical data statistics is discussed. The article analyzes the basic ideas of statistics in spaces of a general nature: average values, laws of large numbers, extremal statistical problems, nonparametric estimates of the probability density, classification methods (diagnostics and cluster analysis), and statistics of integral type. Some statistical methods for analyzing data in specific spaces of non-numerical nature are briefly considered: nonparametric statistics (real distributions usually differ significantly from normal), statistics of fuzzy sets, the theory of expert estimates (the Kemeny median is the sample average of expert orderings), etc. Some unsolved problems in non-numerical data statistics are also discussed.
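The remark that the Kemeny median plays the role of a sample average for expert orderings can be illustrated with a minimal sketch. It assumes strict orderings without ties and measures the distance between two orderings by the number of item pairs they rank in opposite order (for strict orderings this is proportional to the Kemeny distance). The function names are hypothetical, and the brute-force search over all permutations is practical only for a handful of items.

```python
from itertools import combinations, permutations


def kendall_distance(a, b):
    """Count item pairs that orderings a and b rank in opposite order."""
    pos_a = {item: i for i, item in enumerate(a)}
    pos_b = {item: i for i, item in enumerate(b)}
    return sum(
        1
        for x, y in combinations(a, 2)
        if (pos_a[x] - pos_a[y]) * (pos_b[x] - pos_b[y]) < 0
    )


def kemeny_median(orderings):
    """Return an ordering minimizing the total distance to all expert orderings.

    Assumption: every expert ranks the same set of items, with no ties.
    If several orderings reach the minimum, the first one found is returned.
    """
    items = sorted(orderings[0])
    return min(
        permutations(items),
        key=lambda candidate: sum(kendall_distance(candidate, o) for o in orderings),
    )


# Three hypothetical experts ranking three items, best first.
experts = [("A", "B", "C"), ("A", "B", "C"), ("B", "A", "C")]
print(kemeny_median(experts))  # -> ('A', 'B', 'C')
```

In this example the ordering A, B, C has total distance 1 to the three expert opinions, while every other ordering has total distance at least 2.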
-
SYSTEM OF MODELS AND METHODS OF TESTING THE HOMOGENEITY OF TWO INDEPENDENT SAMPLES
08.00.13 Mathematical and instrumental methods of Economics
Description: The new paradigm of mathematical research methods allows a systematic analysis of the various statements of statistical analysis problems and of the methods for solving them, based on the probabilistic-statistical model of data generation accepted by the researcher. Methods for testing the homogeneity of two independent samples are a classical area of mathematical statistics. In the more than 110 years since the publication of Student's fundamental article, various criteria have been developed for testing the statistical hypothesis of homogeneity in various statements, and their properties have been studied. However, there is an urgent need to systematize the accumulated scientific results. It is necessary to analyze the whole variety of problem statements for testing the homogeneity of two independent samples, as well as the corresponding statistical criteria. This article is devoted to such an analysis. It summarizes the main results on methods for testing the homogeneity of two independent samples and compares them, making it possible to analyze the diversity of such methods systematically and to select the most appropriate one for processing specific data. Based on the basic probabilistic-statistical model, the main statements of the problem of testing the homogeneity of two independent samples are formulated. A comparative analysis is given of the Student and Cramer-Welch criteria, which are designed to test the equality of mathematical expectations, and a recommendation for the widespread use of the Cramer-Welch criterion is substantiated. Among nonparametric methods for testing homogeneity, the Wilcoxon, Smirnov, and Lehmann-Rosenblatt criteria are considered. Two myths about the Wilcoxon criterion are dispelled. Based on an analysis of the publications of the founders, the incorrectness of the term "Kolmogorov-Smirnov criterion" is shown. To test absolute homogeneity, i.e. the coincidence of the distribution functions of the samples, it is recommended to use the Lehmann-Rosenblatt criterion. Current problems in the development and application of nonparametric criteria are discussed, including the difference between nominal and real significance levels, which makes it difficult to compare the power of criteria, and the need to take into account coincidences of sample values (from the point of view of the classical theory of mathematical statistics, the probability of coincidences is 0).
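As an illustration of testing the equality of mathematical expectations, the following minimal sketch computes a statistic of the Cramer-Welch type in its commonly cited form: the difference of sample means divided by the square root of the sum of the sample variances, each divided by its sample size. It assumes samples large enough for the standard normal approximation to be reasonable. The function name and the two-sided p-value are illustrative choices, not taken from the article.

```python
from math import sqrt
from statistics import NormalDist, mean, variance


def cramer_welch(x, y):
    """Return (T, two-sided p-value) for the hypothesis of equal expectations.

    T = (mean(x) - mean(y)) / sqrt(s_x^2 / m + s_y^2 / n), where s^2 is the
    unbiased sample variance. Under the hypothesis, T is approximately
    standard normal for large samples (an asymptotic assumption).
    """
    m, n = len(x), len(y)
    t = (mean(x) - mean(y)) / sqrt(variance(x) / m + variance(y) / n)
    p_value = 2 * (1 - NormalDist().cdf(abs(t)))
    return t, p_value


# Hypothetical measurements from two independent samples.
sample_a = [12.1, 11.8, 12.5, 12.0, 11.9, 12.3, 12.2, 12.4]
sample_b = [12.6, 12.9, 12.4, 13.1, 12.7, 12.8, 13.0, 12.5]
t_stat, p = cramer_welch(sample_a, sample_b)
print(f"T = {t_stat:.3f}, p = {p:.4f}")
```

Unlike the Student statistic with a pooled variance, this form does not require the two variances to be equal.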
-
PRICING METHOD BASED ON THE ESTIMATION OF DEMAND FUNCTION
08.00.13 Mathematical and instrumental methods of Economics
Description: When solving some problems of economics and management at an enterprise, it becomes necessary to determine the retail price of a product or service given a known wholesale price or producer price. We propose determining the retail price from a survey of potential consumers about the maximum price they would pay for the product or service in question. The retail price is calculated by maximizing the economic effect, which equals the product of the gain from selling one unit of the goods and the demand function; the demand function is estimated from the consumer survey. To solve the optimization problem, we approximate the demand function using the least squares method. As examples, the linear and power models of the demand function are analyzed. Ways of further developing the proposed approach are discussed, and unresolved scientific problems are formulated. Methods for estimating the demand function require further elaboration for the case where many respondents give identical answers and tend toward “round numbers”; because of such coincidences, the Kolmogorov criterion cannot be used to assess the accuracy of the restored demand function. Various parametric and nonparametric approaches of regression analysis should be adapted to the problem of restoring the dependence of demand on price, as should methods for solving the corresponding optimization problems.
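The linear case of this approach can be sketched as follows. The sketch assumes that the demand at price p is the number of respondents whose stated maximum price is at least p, that the demand function is fitted as D(p) = a - b*p by ordinary least squares over the distinct stated prices, and that the economic effect is (p - p0) * D(p), where p0 is the wholesale price. All function names and the survey data are hypothetical.

```python
def empirical_demand(max_prices):
    """Return (price, demand) pairs; demand at p counts respondents with max price >= p."""
    prices = sorted(set(max_prices))
    return [(p, sum(1 for m in max_prices if m >= p)) for p in prices]


def fit_linear_demand(points):
    """Least-squares fit of D(p) = a - b*p; returns (a, b)."""
    n = len(points)
    mean_p = sum(p for p, _ in points) / n
    mean_d = sum(d for _, d in points) / n
    sxx = sum((p - mean_p) ** 2 for p, _ in points)
    sxy = sum((p - mean_p) * (d - mean_d) for p, d in points)
    slope = sxy / sxx
    intercept = mean_d - slope * mean_p
    return intercept, -slope


def optimal_retail_price(a, b, wholesale_price):
    """Maximizer of (p - p0) * (a - b*p) for b > 0: p* = (a + b*p0) / (2*b)."""
    return (a + b * wholesale_price) / (2 * b)


# Hypothetical survey: the maximum price each respondent would pay.
answers = [10, 20, 30, 40]
a, b = fit_linear_demand(empirical_demand(answers))
print(optimal_retail_price(a, b, wholesale_price=10))
```

The closed-form optimum follows from setting the derivative of (p - p0)(a - b*p) to zero; for the power model the same steps apply after taking logarithms of price and demand before fitting.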
-
BASIC REQUIREMENTS FOR DATA ANALYSIS METHODS (ON THE EXAMPLE OF CLASSIFICATION TASKS)
08.00.13 Mathematical and instrumental methods of Economics
Description: There is a need to bring order to classification methods. This will increase their role in solving applied problems, in particular in the diagnostics of materials. To this end, it is first necessary to develop the requirements that classification methods must satisfy. The initial formulation of such requirements is the main content of this work. Mathematical classification methods are considered as part of applied statistics. The article discusses the natural requirements for data analysis methods and for the presentation of calculation results that follow from the achievements and ideas accumulated by the national probabilistic and statistical scientific school. Concrete recommendations are given on a number of issues, and individual errors are criticized. In particular, data analysis methods must be invariant with respect to the permissible transformations of the scales in which the data are measured, i.e. methods should be adequate in the sense of measurement theory. A specific statistical method of data analysis is always based on some probabilistic model. The model should be clearly described and its premises justified, either from theoretical considerations or experimentally. Data processing methods intended for real-world problems should be investigated for stability with respect to permissible deviations in the initial data and in the model premises. The accuracy of the solutions given by the method should be indicated. When publishing the results of a statistical analysis of real data, it is necessary to indicate their accuracy (confidence intervals). As an indicator of the quality of a classification algorithm, it is recommended to use predictive power rather than the proportion of correct forecasts. Mathematical research methods are divided into "exploratory analysis" and "evidence-based statistics". Specific requirements for data processing methods arise in connection with their "docking" during sequential execution. The article discusses the limits of applicability of probabilistic-statistical methods. Concrete statements of classification problems and typical errors in applying various methods for solving them are also considered.
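The invariance requirement can be illustrated with a small sketch. It assumes data measured on an ordinal scale, for which every strictly increasing transformation is permissible, and checks whether the conclusion "the first sample has a smaller average than the second" survives such a transformation. The data and the helper name are hypothetical.

```python
from math import sqrt
from statistics import mean, median


def same_order(stat, x, y, transform):
    """True if comparing stat(x) with stat(y) gives the same result before and after transform."""
    before = stat(x) < stat(y)
    after = stat([transform(v) for v in x]) < stat([transform(v) for v in y])
    return before == after


# Hypothetical ordinal scores; sqrt is strictly increasing on non-negative values.
x = [1, 1, 9]
y = [3, 3, 3]
print("mean comparison preserved:  ", same_order(mean, x, y, sqrt))
print("median comparison preserved:", same_order(median, x, y, sqrt))
```

Here the arithmetic mean of x exceeds that of y before the transformation and falls below it afterwards, while the comparison of medians is unchanged. In this sense the median is adequate for ordinal data and the arithmetic mean is not.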
-
TECHNOLOGY TRANSFER MODELS BETWEEN THE DEFENSE AND CIVIL SECTOR OF ECONOMY
08.00.13 Mathematical and instrumental methods of Economics
Description: The article considers the problem of increasing the efficiency of budget expenditures through the transfer of military technology to the civilian sector of the economy. An analysis of foreign experience shows that in a number of countries private companies are widely involved in solving some of the infrastructure problems of the military sphere. In the USA, private companies provide communications and other information services to government security agencies, which makes it possible to develop private business on the one hand and to save budget funds on the other. An analysis of domestic experience shows that using military technologies to produce civilian products and services can in some cases save significant time and other resources. A model of the interaction of civilian companies with the defense complex and a diffusion model of military technologies have been developed. The article proposes creating new structures that adapt military technologies to the requirements of civilian customers, as well as a database of adapted technologies and a technical investment center that supports small and medium-sized enterprises in acquiring equipment and technical documentation. The authors believe that the proposed approaches to technology transfer will stimulate innovative activity in the country, reduce import dependence, and increase the efficiency of budget expenditures.
-
PROBABILITY-STATISTICAL MODELS OF CORRELATION AND REGRESSION
08.00.13 Mathematical and instrumental methods of Economics
Description: The correlation and determination coefficients are widely used in statistical data analysis. According to measurement theory, Pearson's linear paired correlation coefficient is applicable to variables measured on an interval scale. It cannot be used in the analysis of ordinal data. The nonparametric Spearman and Kendall rank coefficients estimate the relationship between ordinal variables. The critical value for testing whether the correlation coefficient differs significantly from 0 depends on the sample size. Therefore, using the Chaddock scale is incorrect. With a passive experiment, correlation coefficients can reasonably be used for prediction, but not for control. To obtain probabilistic-statistical models intended for control, an active experiment is required. The effect of outliers on the Pearson correlation coefficient is very large. As the number of analyzed sets of predictors increases, the maximum of the corresponding correlation coefficients, which serve as indicators of approximation quality, increases noticeably (the effect of “inflation” of the correlation coefficient). Four main regression analysis models are considered. Models of the least squares method with a deterministic independent variable are distinguished. The distribution of deviations is arbitrary; however, to obtain the limit distributions of the parameter estimates and of the regression dependence, we assume that the conditions of the central limit theorem are satisfied. The second type of model is based on a sample of random vectors. The dependence is nonparametric, and the distribution of the two-dimensional vector is arbitrary. Estimating the variance of the independent variable, and using the determination coefficient as a quality criterion for the model, make sense only in the model based on a sample of random vectors. Time series smoothing is discussed. Methods of restoring dependencies in spaces of a general nature are considered. It is shown that the limiting distribution of the natural estimate of the dimensionality of the model is geometric, and that the construction of an informative subset of features encounters the effect of “inflation” of the correlation coefficient. Various approaches to the regression analysis of interval data are discussed. The analysis of the variety of regression analysis models leads to the conclusion that there is no single “standard model”.
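The "inflation" effect described above can be reproduced with a small simulation. The sketch assumes a response and a pool of candidate predictors that are all independent standard normal samples, so every true correlation is zero, and tracks the largest absolute sample correlation as more candidates are examined. The sample size, the number of predictors, and the seed are arbitrary illustrative choices.

```python
import random


def pearson(x, y):
    """Sample Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def max_abs_correlation(y, predictors):
    """Largest |r| between the response y and any of the candidate predictors."""
    return max(abs(pearson(x, y)) for x in predictors)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 20
y = [rng.gauss(0, 1) for _ in range(n)]
predictors = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(200)]

# The maximum can only grow as the candidate pool is enlarged.
for k in (1, 10, 50, 200):
    print(k, round(max_abs_correlation(y, predictors[:k]), 3))
```

Because the best of many unrelated predictors can show a sizeable sample correlation purely by chance, a correlation selected as the maximum over many candidates should not be compared against the critical value for a single pre-specified predictor.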