Optimal deep neural networks by maximization of the approximation power

Computers and Operations Research · 2023

Cited by 10

ABS 3

Hector Calvo-Pardo
Tullio Mancini
José Olmo (corresponding author)

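Summary

The paper proposes an optimal architecture for deep neural networks of a given size. Width and depth are chosen by maximizing the lower bound on the number of linear regions approximated by a network with ReLU activations. Monte Carlo simulations and the Boston Housing dataset show that the resulting architecture outperforms cross-validation and grid search.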

Abstract

We propose an optimal architecture for deep neural networks of a given size. The optimal architecture is obtained by maximizing the lower bound on the maximum number of linear regions approximated by a deep neural network with a ReLU activation function. The accuracy of the approximating function depends on the network structure, which is characterized by the number of nodes and by the dependence and hierarchy between nodes within and across layers. We show how the accuracy of the approximation improves as the width and depth of the network are chosen optimally. A Monte Carlo simulation exercise illustrates that the optimized architecture outperforms cross-validation methods and grid search for linear and nonlinear prediction models. An application of the methodology to the Boston Housing dataset confirms empirically that our method outperforms state-of-the-art machine learning models.
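
The abstract does not state which lower bound is maximized. As a rough illustration only, the sketch below assumes the well-known bound of Montúfar et al. (2014) for a ReLU network with input dimension n0 and hidden widths n_1, ..., n_L (each at least n0): the product over the first L-1 layers of floor(n_i / n0)^n0, times the sum over j = 0..n0 of C(n_L, j). It further assumes equal-width layers and a fixed budget of hidden nodes, which is a simplification and not necessarily the authors' procedure. The function names are hypothetical.

```python
from math import comb, log


def log_region_lower_bound(n0, widths):
    """Log of a lower bound on the max number of linear regions of a ReLU net.

    Assumed form (Montufar et al., 2014), valid when every width >= n0:
        prod_{i<L} floor(n_i / n0)^n0  *  sum_{j=0..n0} C(n_L, j)
    """
    if any(w < n0 for w in widths):
        raise ValueError("bound requires every hidden width >= n0")
    *hidden, last = widths
    # Contribution of the first L-1 hidden layers.
    log_prod = sum(n0 * log(w // n0) for w in hidden)
    # Contribution of the last hidden layer.
    log_sum = log(sum(comb(last, j) for j in range(n0 + 1)))
    return log_prod + log_sum


def best_equal_width_depth(n0, total_nodes):
    """Search depths with equal-width layers using at most total_nodes hidden nodes.

    Equal widths are an illustrative assumption, not the paper's stated method.
    Returns (depth, width, log_bound) for the best candidate.
    """
    best = None
    for depth in range(1, total_nodes // n0 + 1):
        width = total_nodes // depth  # guaranteed >= n0 for these depths
        score = log_region_lower_bound(n0, [width] * depth)
        if best is None or score > best[2]:
            best = (depth, width, score)
    return best
```

Calling best_equal_width_depth(n0, total_nodes) with the number of input features and a node budget returns the depth and width that maximize this assumed bound. Working in logs keeps the comparison stable when the bound itself becomes very large.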

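Keywords: deep learning; neural network architecture; function approximation; machine learning; optimization; Monte Carlo methods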
