Essential concepts
XNet, a novel neural network architecture built on a Cauchy activation function, outperforms Kolmogorov-Arnold Networks (KANs) and Multi-Layer Perceptrons (MLPs) in function approximation, solving partial differential equations, and time series forecasting.
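This summary does not reproduce XNet's exact parameterization, so the sketch below rests on an assumption: a rational Cauchy-type activation of the form (λ1·x + λ2) / (x^2 + d^2) with trainable λ1, λ2, and d, chosen to match the quoted "localized response with decay at both ends". The class and parameter names (CauchyActivation, XNetLike) are illustrative, not the authors'.

```python
import torch
import torch.nn as nn

class CauchyActivation(nn.Module):
    """Rational Cauchy-type activation: (l1*x + l2) / (x^2 + d^2).

    Assumed form, not the paper's exact definition; it reproduces the
    reported localized response that decays at both ends.
    """
    def __init__(self, num_features: int):
        super().__init__()
        self.l1 = nn.Parameter(torch.randn(num_features))
        self.l2 = nn.Parameter(torch.randn(num_features))
        self.d = nn.Parameter(torch.ones(num_features))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return (self.l1 * x + self.l2) / (x * x + self.d * self.d)

class XNetLike(nn.Module):
    """One layer of Cauchy 'basis functions' followed by a linear readout."""
    def __init__(self, in_dim: int, num_basis: int, out_dim: int = 1):
        super().__init__()
        self.proj = nn.Linear(in_dim, num_basis)  # affine map into each basis unit
        self.act = CauchyActivation(num_basis)
        self.readout = nn.Linear(num_basis, out_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.readout(self.act(self.proj(x)))
```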
Summary
The paper presents a comprehensive comparison of three neural network architectures - XNet, Kolmogorov-Arnold Networks (KANs), and Multi-Layer Perceptrons (MLPs) - across various computational tasks.
Function Approximation:
- XNet outperforms KANs in approximating both irregular and high-dimensional functions, exhibiting smoother transitions at discontinuities and higher accuracy.
- XNet achieves significantly lower Mean Squared Error (MSE), Root Mean Squared Error (RMSE), and Mean Absolute Error (MAE) than KANs, especially for the Heaviside step function and complex high-dimensional scenarios (see the sketch after this list).
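To make the Heaviside experiment concrete, the sketch below fits the step function with 64 Cauchy basis units, reusing XNetLike from the sketch above. The optimizer, learning rate, and step count are illustrative assumptions rather than the paper's protocol, so the resulting errors need not match the reported figures.

```python
import torch

torch.manual_seed(0)
x = torch.linspace(-1.0, 1.0, 2000).unsqueeze(1)
y = (x > 0).float()  # Heaviside step function

model = XNetLike(in_dim=1, num_basis=64)  # 64 basis functions, as in the experiment
opt = torch.optim.Adam(model.parameters(), lr=1e-2)

for step in range(5000):
    opt.zero_grad()
    loss = torch.mean((model(x) - y) ** 2)  # MSE objective
    loss.backward()
    opt.step()

with torch.no_grad():
    err = model(x) - y
    mse = torch.mean(err ** 2)
    print(f"MSE={mse.item():.2e}  RMSE={mse.sqrt().item():.2e}  "
          f"MAE={err.abs().mean().item():.2e}")
```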
Solving Partial Differential Equations (PDEs):
- Within the Physics-Informed Neural Network (PINN) framework, XNet demonstrates superior performance in solving the 2D Poisson equation compared to both MLPs and KANs.
- A width-200 XNet is 50 times more accurate and 2 times faster than a 2-layer, width-10 KAN in solving the Poisson equation (a minimal PINN loss sketch follows this list).
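For readers unfamiliar with the PINN framing, the sketch below shows the loss typically minimized for the 2D Poisson equation -Δu = f: a PDE residual on interior collocation points plus a boundary-condition term. The source term f, boundary data g, and sampling of points are placeholders, since the paper's exact configuration is not given in this summary.

```python
import torch

def laplacian(model, xy: torch.Tensor) -> torch.Tensor:
    """u_xx + u_yy at points xy (shape [N, 2]) via automatic differentiation."""
    xy = xy.clone().requires_grad_(True)
    u = model(xy)
    grads = torch.autograd.grad(u.sum(), xy, create_graph=True)[0]
    u_xx = torch.autograd.grad(grads[:, 0].sum(), xy, create_graph=True)[0][:, 0]
    u_yy = torch.autograd.grad(grads[:, 1].sum(), xy, create_graph=True)[0][:, 1]
    return u_xx + u_yy

def pinn_loss(model, interior, boundary, f, g):
    """Residual of -Δu = f on interior points plus u = g on boundary points."""
    residual = laplacian(model, interior) + f(interior)  # -Δu = f  =>  Δu + f = 0
    bc = model(boundary).squeeze(-1) - g(boundary)
    return torch.mean(residual ** 2) + torch.mean(bc ** 2)
```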
Time Series Forecasting:
- By integrating XNet into the LSTM architecture, the authors introduce the XLSTM model, which consistently outperforms traditional LSTMs in accuracy and reliability across synthetic and real-world financial time series datasets (see the sketch below).
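This summary does not specify how XNet is wired into the LSTM, so the sketch below assumes one plausible integration: replacing the LSTM's usual linear output head with an XNet-style Cauchy layer (XNetLike from the first sketch). The class name XLSTMLike and its arguments are illustrative.

```python
import torch
import torch.nn as nn

class XLSTMLike(nn.Module):
    """LSTM backbone with an XNet-style forecasting head (assumed wiring)."""
    def __init__(self, in_dim: int, hidden: int, num_basis: int, horizon: int = 1):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, batch_first=True)
        self.head = XNetLike(in_dim=hidden, num_basis=num_basis, out_dim=horizon)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(x)            # (batch, seq_len, hidden)
        return self.head(out[:, -1, :])  # forecast from the final hidden state

# Example: one-step-ahead forecasts from univariate windows of length 30.
model = XLSTMLike(in_dim=1, hidden=32, num_basis=128)
pred = model(torch.randn(8, 30, 1))  # -> shape (8, 1)
```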
The authors conclude that XNet's enhanced function approximation capabilities, computational efficiency, and versatility make it a promising alternative to established neural network architectures, with potential applications in diverse domains such as image recognition, computer vision, and scientific computing.
Statistics
- Heaviside step function: XNet with 64 basis functions achieves MSE 8.99e-08, RMSE 3.00e-04, and MAE 1.91e-04, outperforming a [1,1] KAN with 200 grids.
- 2D function f(x, y) = exp(sin(πx) + y^2): XNet with 5,000 basis functions achieves MSE 3.9767e-07, RMSE 6.3061e-04, and MAE 4.0538e-04, while a [2,1,1] KAN achieves MSE 3.0227e-07, RMSE 5.4979e-04, and MAE 1.6344e-04.
- High-dimensional function exp(0.5 sin(π(x1^2 + x2^2)) + x3·x4): XNet with 5,000 basis functions achieves MSE 2.3079e-06, RMSE 1.5192e-03, and MAE 8.3852e-04, while a [4,2,2,1] KAN achieves MSE 2.6151e-03, RMSE 5.1138e-02, and MAE 3.6300e-02.
- 2D Poisson equation: a width-200 XNet achieves MSE 1.0937e-09, RMSE 3.3071e-05, and MAE 2.1711e-05, outperforming a 2-layer, width-10 KAN (MSE 5.7430e-08, RMSE 2.3965e-04, MAE 1.8450e-04) and a [2,20,20,1] PINN (MSE 1.7998e-05, RMSE 4.2424e-03, MAE 2.3300e-03).
Quotes
"XNet significant improves speed and accuracy across various tasks in both low and high-dimensional spaces, redefining the scope of data-driven model development and providing substantial improvements over established time series models like LSTMs."
"Inspired by the mathematical precision of the Cauchy integral theorem, Li et al. (2024) introduced the XNet architecture, a novel neural network model that incorporates a uniquely designed Cauchy activation function."
"Empirical evaluations reveal that the Cauchy activation function possesses a localized response with decay at both ends, significantly benefiting the approximation of localized data segments."