Key Concepts
The authors propose a two-stage graph pointer network (GPN) model that can efficiently solve large-scale quadratic assignment problems (QAP) using reinforcement learning.
Summary
The authors first extend the original graph pointer network (GPN) to solve the matrix-input traveling salesman problem (TSP), a generalization of the Euclidean TSP. They then extend the model further to solve the QAP, an even more general combinatorial optimization problem.
The key aspects of the proposed approach are:
- Matrix Input TSP Extension:
  - The extended GPN model takes a distance matrix as input, eliminating the need for Euclidean coordinates.
  - The authors demonstrate that removing the LSTM component from the original GPN encoder speeds up inference without reducing accuracy.
- Two-Stage GPN for QAP:
  - The authors introduce the distance-flow product (DFP) matrix as the input to the GPN model for QAP.
  - They propose a two-stage GPN architecture, where the first stage selects the focused block in the DFP matrix, and the second stage generates the solution using the elements in the selected block.
  - This two-stage approach allows the model to generate a permutation of size n from the n^2 × n^2 DFP matrix.
- Experimental Evaluation:
  - The authors evaluate the proposed models on benchmark instances from TSPLIB and QAPLIB.
  - The results show that the extended GPN for matrix input TSP outperforms the original GPN, especially for larger problem sizes.
  - The two-stage GPN for QAP provides semi-optimal solutions for most benchmark instances, outperforming a greedy algorithm and running faster than conventional heuristic methods.
The authors demonstrate the effectiveness of their two-stage GPN approach in solving large-scale QAP instances efficiently using reinforcement learning.
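The two input representations summarized above can be illustrated with a small plain-Python sketch. It is not the authors' code. It assumes the DFP matrix is the Kronecker product of the flow and distance matrices, which matches the stated n^2 × n^2 size, though the paper's exact definition may differ. All function names and the 3 × 3 matrices are hypothetical, chosen only for illustration.

```python
from itertools import permutations


def tour_length(dist, tour):
    """Length of a closed tour read directly from a distance matrix.

    No coordinates are needed, so the matrix may be asymmetric.
    """
    n = len(tour)
    return sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))


def dfp_matrix(flow, dist):
    """Assumed DFP construction: Kronecker product of flow and distance.

    Entry [i*n + a][j*n + b] equals flow[i][j] * dist[a][b], so the
    result is an n^2 x n^2 matrix made of n x n blocks.
    """
    n = len(flow)
    return [
        [flow[i][j] * dist[a][b] for j in range(n) for b in range(n)]
        for i in range(n)
        for a in range(n)
    ]


def qap_cost(flow, dist, perm):
    """Standard QAP objective: facility i is placed at location perm[i]."""
    n = len(perm)
    return sum(
        flow[i][j] * dist[perm[i]][perm[j]]
        for i in range(n)
        for j in range(n)
    )


def qap_cost_from_dfp(dfp, perm):
    """Same objective, read off the DFP matrix entries picked by perm."""
    n = len(perm)
    return sum(
        dfp[i * n + perm[i]][j * n + perm[j]]
        for i in range(n)
        for j in range(n)
    )


# Made-up example data (not taken from TSPLIB or QAPLIB).
dist = [[0, 2, 9], [1, 0, 6], [7, 3, 0]]
flow = [[0, 5, 2], [5, 0, 3], [2, 3, 0]]
dfp = dfp_matrix(flow, dist)

# Exhaustive search is only feasible for tiny n; it serves as a reference here.
best_perm = min(permutations(range(3)), key=lambda p: qap_cost(flow, dist, p))
```

Under this assumption, every permutation of size n selects n^2 entries of the DFP matrix whose sum is the QAP cost, which is why a model reading the DFP matrix can in principle score candidate assignments without seeing the flow and distance matrices separately.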
Statistics
The authors report the following key metrics and figures:
The gap between the best-known solution and the obtained solution for matrix input TSP ranges from 2% to 32%.
The gap between the best-known solution and the obtained solution for QAP ranges from 9% to 30%, except for the chr instances which have too many zeros in the input matrices.
For the tai50a instance, the proposed two-stage GPN for QAP runs 50.5 times faster than the WAITS heuristic method.
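The gap and speedup figures above are presumably computed in the usual way: relative excess over the best-known cost, and ratio of execution times. The sketch below assumes those conventional definitions, since the summary does not spell them out. The function names and input numbers are hypothetical.

```python
def optimality_gap(obtained, best_known):
    """Relative gap in percent between an obtained cost and the best-known cost."""
    if best_known <= 0:
        raise ValueError("best_known must be positive")
    return 100.0 * (obtained - best_known) / best_known


def speedup(baseline_seconds, model_seconds):
    """How many times faster the model is than the baseline."""
    if model_seconds <= 0:
        raise ValueError("model_seconds must be positive")
    return baseline_seconds / model_seconds


# Made-up numbers for illustration only, not results from the paper.
example_gap = optimality_gap(obtained=110.0, best_known=100.0)
example_speedup = speedup(baseline_seconds=10.0, model_seconds=2.0)
```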
Quotes
"The results show that, in almost all cases, our two-stage GPN provides better solutions than those provided by the greedy algorithm."
"Our two-stage GPN outperforms conventional heuristic methods in terms of the execution time, while the solution quality is inferior to conventional methods."