A Maximum‐Entropy Based Heuristic for Density Estimation from Data in Histogram Form*
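To address the problem of histogram data from different sources being grouped inconsistently, a maximum‐entropy based heuristic is proposed that uses known means and medians to estimate densities; it applies to business, market research, risk analysis, and similar settings.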
ABSTRACT We examine a specific but pervasive problem in the use of secondary or published data. The data are summarized in histogram form, perhaps with additional mean or median information; two published sources yield histogram‐type summaries of the same variable but do not group its values the same way; the researcher wishes to answer a question using information from both sources; and the original, detailed data underlying the published summaries, which could give a better answer, are unavailable. We review relevant aspects of maximum‐entropy (ME) estimation and develop a heuristic for generating ME density estimates from data in histogram form when additional means and medians may be known. Examples from several business and scientific areas illustrate the heuristic's use. Areas of application include business and social or market research, risk analysis, and individual risk profile analysis. Some instructional or classroom applications are possible as well.
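As a minimal illustration of the regrouping problem, and not the heuristic developed in this paper, consider the simplest case in which only the bin probabilities on bounded bins are known. The ME density is then uniform within each bin, so probabilities can be moved onto a second source's grouping in proportion to bin overlap. The function name `rebin_uniform` and the example bin edges and probabilities below are hypothetical, chosen only for illustration; known means or medians would change the within-bin shape and are not handled here.

```python
def rebin_uniform(edges, probs, new_edges):
    """Regroup bin probabilities onto new_edges.

    Assumes a uniform density within each original bin, which is the
    maximum-entropy choice when only bin probabilities are known.
    """
    regrouped = []
    for lo, hi in zip(new_edges[:-1], new_edges[1:]):
        mass = 0.0
        for a, b, p in zip(edges[:-1], edges[1:], probs):
            # Length of the intersection of [lo, hi] and [a, b]; zero if disjoint.
            overlap = max(0.0, min(hi, b) - max(lo, a))
            mass += p * overlap / (b - a)
        regrouped.append(mass)
    return regrouped


# Hypothetical source A: bins [0,10), [10,20), [20,40) with these probabilities.
edges = [0, 10, 20, 40]
probs = [0.2, 0.5, 0.3]

# Hypothetical source B groups the same variable as [0,15), [15,40).
new_edges = [0, 15, 40]

regrouped = rebin_uniform(edges, probs, new_edges)
print(regrouped)
```

Under these assumptions the middle bin is split evenly between the two new groups, so the regrouped probabilities are approximately 0.45 and 0.55.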