由随机森林确定的目标函数的约束优化

Constrained optimization of objective functions determined from random forests

Production and Operations Management · 2022

被引 25

人大 AFT50UTD24ABS 4

Max Biggs · 弗吉尼亚大学
Rim Hariss · 麦吉尔大学
Georgia Perakis · 麻省理工学院通讯

中文导读

研究如何用混合整数线性规划方法，在随机森林评估下做出受约束的最优决策，并证明通过采样少量树可得到近似最优解，适用于投资和陪审团选择等场景。

Abstract

In this paper, we examine a data‐driven optimization approach to making optimal decisions as evaluated by a trained random forest, where these decisions can be constrained by an arbitrary polyhedral set. We model this optimization problem as a mixed‐integer linear program. We show this model can be solved to optimality efficiently using pareto‐optimal Benders cuts for ensembles containing a modest number of trees. We consider a random forest approximation that consists of sampling a subset of trees and establish that this gives rise to near‐optimal solutions by proving analytical guarantees. In particular, for axis‐aligned trees, we show that the number of trees we need to sample is sublinear in the size of the forest being approximated. Motivated by this result, we propose heuristics inspired by cross‐validation that optimize over smaller forests rather than one large forest and assess their performance on synthetic datasets. We present two case studies on a property investment problem and a jury selection problem. We show this approach performs well against other benchmarks while providing insights into the sensitivity of the algorithm's performance for different parameters of the random forest.

数学优化机器学习运筹学决策科学

阅读原文 ↗