A Prior Information-Based Reinforcement Learning Algorithm for the 2d Online Irregular Bin Packing Problem with Uncertain Demands

Media type: E-Book

Title: A Prior Information-Based Reinforcement Learning Algorithm for the 2d Online Irregular Bin Packing Problem with Uncertain Demands

Contributor: Ren, He [Author]; Zhong, Rui [Author]

Imprint: [S.l.]: SSRN, [2023]

Extent: 1 online resource (34 p.)

Language: English

DOI: 10.2139/ssrn.4463495

Keywords: Deep reinforcement learning; Prior information; Physical constraint; 2D-OIBPP

Description: The online 2D irregular bin packing problem (2D-OIBPP) with uncertain demands is a critical challenge in the manufacturing and assembly industry, where a robot must place incoming objects within a limited time to maximize overall packing density. The demanding requirements for both real-time operation and optimality make it difficult to obtain a satisfactory solution. In this paper, we propose a deep reinforcement learning approach that leverages physical constraints and prior information to achieve better performance. Our method can handle two common working conditions of the online packing process: the robot knows either the information of both the caught object and the next one, or only the information of the caught object. To enhance the generalization ability of our approach, we use vision technology to convert the bin occupancy into a matrix of 0s and 1s and propose a new state representation, the no-fit matrix. The experimental results demonstrate that our algorithm achieves human-level performance in most cases, providing an approximately optimal solution in a short time while balancing the real-time and optimality requirements of industrial production. Moreover, we propose a novel reinforcement learning training method called Maximum Worth Reinforcement Learning (MWRL) to solve optimization problems with incomplete Markov chains, which achieved better performance in the comparison study. Finally, we demonstrate the effectiveness of our algorithm in the real world by implementing it with a robotic manipulator.

Access State: Open Access
