Deep Reinforcement Learning for Master Bay Planning on Container Vessels

Publication: Conference contribution, not published in proceedings or journal; conference abstract; research; peer-reviewed


Container vessel stowage planning concerns placing the containers on port load lists onto vessels. A stowage plan can be decomposed into master bay planning, which assigns cargo to general areas, and slot planning, which places cargo into the slots of bays [1]. The goal is to maximize vessel utilization on the fronthaul and minimize operational costs by creating robust stowage plans [1]. Due to stowage and seaworthiness rules, this optimization problem belongs to the class of NP-hard problems [1]. The first phase of another hierarchically decomposed NP-hard problem has recently been solved through deep reinforcement learning (DRL), which performs well on large and complex unseen instances [2]. To our knowledge, there is only one piece of work on DRL for stowage optimization, which solves an incomplete slot planning problem with a single port of discharge and uniform containers [3]. Hence, novel DRL formulations are necessary to efficiently solve realistic and unseen instances.

This paper will experiment with DRL by formulating the master bay planning problem (MBP) as a finite Markov decision process defined in terms of states (load lists, bay traits and capacities, and remaining cargo onboard), actions (allocating containers to bays), and rewards (total stowage cost, capacity utilization on the fronthaul, and satisfaction of seaworthiness and stowage rules). As container allocation has no direct reward, a neural network is used to prescribe appropriate actions. Iteratively, the agent searches for an optimal policy that maximizes rewards through Markovian decision-making, after which the network parameters are updated by proximal policy optimization. This stochastic on-policy method limits the risk of disruptive changes to solutions. Furthermore, a simulation environment will enable learning from unobserved problem instances.
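
To make the formulation concrete, the following is a minimal Python sketch of how such a Markov decision process could look. It is not the authors' model: the class `ToyMasterBayEnv`, its fields, the unit reward per loaded container, and the `overstow_penalty` proxy for stowage cost are all hypothetical simplifications introduced here for illustration. The function `ppo_clipped_term` shows the standard per-sample clipped surrogate used by proximal policy optimization, which is the mechanism that keeps a single update from changing the policy too much.

```python
from dataclasses import dataclass, field
import random


@dataclass
class ToyMasterBayEnv:
    """Toy finite MDP for master bay planning (illustrative assumption only)."""
    bay_capacity: list              # max containers per bay (hypothetical units)
    load_list: list                 # port of discharge (POD) index per container
    overstow_penalty: float = 1.0   # assumed proxy for stowage cost
    bays: list = field(default_factory=list)  # PODs allocated to each bay
    cursor: int = 0                 # index of the next container to allocate

    def reset(self):
        self.bays = [[] for _ in self.bay_capacity]
        self.cursor = 0
        return self._state()

    def _state(self):
        # State: remaining load list, free capacity per bay, cargo onboard.
        free = [cap - len(b) for cap, b in zip(self.bay_capacity, self.bays)]
        return (tuple(self.load_list[self.cursor:]), tuple(free),
                tuple(tuple(b) for b in self.bays))

    def valid_actions(self):
        # Action: choose a bay; a bay is valid while it has free capacity.
        return [i for i, (cap, b) in enumerate(zip(self.bay_capacity, self.bays))
                if len(b) < cap]

    def is_done(self):
        return self.cursor >= len(self.load_list) or not self.valid_actions()

    def step(self, bay):
        if bay not in self.valid_actions():
            raise ValueError(f"bay {bay} is full or does not exist")
        pod = self.load_list[self.cursor]
        # Assumed cost proxy: count cargo already in the bay that leaves at an
        # earlier port and could therefore be blocked by the new container.
        blocked = sum(1 for earlier in self.bays[bay] if earlier < pod)
        self.bays[bay].append(pod)
        self.cursor += 1
        # Reward: +1 for utilization, minus the blocking penalty.
        reward = 1.0 - self.overstow_penalty * blocked
        return self._state(), reward, self.is_done()


def random_rollout(env, seed=0):
    """Play one episode with a uniformly random policy; return total reward."""
    rng = random.Random(seed)
    env.reset()
    total = 0.0
    while not env.is_done():
        _, reward, _ = env.step(rng.choice(env.valid_actions()))
        total += reward
    return total


def ppo_clipped_term(ratio, advantage, eps=0.2):
    """Per-sample PPO clipped surrogate: min(r * A, clip(r, 1-eps, 1+eps) * A)."""
    clipped = max(1.0 - eps, min(ratio, 1.0 + eps))
    return min(ratio * advantage, clipped * advantage)
```

In this sketch a learned policy would replace the random choice in `random_rollout`, and `ratio` would be the probability of the chosen bay under the new policy divided by its probability under the policy that generated the episode.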

If DRL scales well for MBP, this opens a new avenue for stowage research. We recognize sufficient data on vessel types, stacking rules in slot planning, and revenue management in stowage plans as future challenges. However, we believe that more AI research advances the field toward fully automated stowage planning.
Publication date: 14 Sep 2022
Number of pages: 2
Status: Published - 14 Sep 2022
Event: International Conference on Computational Logistics 2022 - Universitat Pompeu Fabra, Barcelona, Spain
Duration: 21 Sep 2022 to 23 Sep 2022


Conference: International Conference on Computational Logistics 2022
Location: Universitat Pompeu Fabra

