Deep Reinforcement Learning for Master Bay Planning on Container Vessels

Research output: Contribution to conference (not published in proceedings or journal). Conference abstract, research, peer-reviewed.


Container vessel stowage planning concerns the placement of containers from port load lists onto vessels. A stowage plan can be decomposed into master bay planning, which assigns cargo to general areas of the vessel, and slot planning, which places cargo into the slots of bays [1]. The goal is to maximize vessel utilization on the fronthaul and minimize operational costs by creating robust stowage plans [1]. Due to stowage and seaworthiness rules, this optimization problem belongs to the class of NP-hard problems [1]. The first phase of another hierarchically decomposed NP-hard problem has recently been solved through deep reinforcement learning (DRL), which performs well on large and complex unseen instances [2]. To our knowledge, there is one piece of work on DRL for stowage optimization, which solves an incomplete slot planning problem with a single port of discharge and uniform containers [3]. Hence, novel DRL formulations are necessary to efficiently solve realistic and unseen instances.

This paper will experiment with DRL by formulating the master bay planning problem (MBP) as a finite Markov decision process, defined in terms of states (load lists, bay traits and capacities, and remaining cargo onboard), actions (allocating containers to bays), and rewards (total stowage cost, capacity utilization on the fronthaul, and satisfaction of seaworthiness and stowage rules). As container allocation has no direct reward, a neural network is used to prescribe appropriate actions. Iteratively, the agent searches for an optimal policy that maximizes rewards through Markovian decision-making, after which network parameters are updated by proximal policy optimization. This stochastic on-policy method limits the risk of disruptive changes to solutions. Furthermore, a simulation environment will enable learning from unobserved problem instances.
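The state-action-reward formulation above can be illustrated with a toy environment. This is a minimal sketch under simplifying assumptions (uniform TEU capacities, a capacity penalty standing in for the full seaworthiness and stowage rules); all class and parameter names are hypothetical and not the paper's implementation:

```python
class MasterBayPlanningEnv:
    """Toy MDP: allocate container groups from a load list to vessel bays."""

    def __init__(self, bay_capacities, load_list):
        self.bay_capacities = list(bay_capacities)  # TEU capacity per bay
        self.load_list = list(load_list)            # TEU size of each container group
        self.reset()

    def reset(self):
        self.remaining = list(self.bay_capacities)  # free capacity per bay
        self.idx = 0                                # index of next group to place
        return self._state()

    def _state(self):
        # State: remaining bay capacities plus position in the load list.
        return (tuple(self.remaining), self.idx)

    def step(self, bay):
        """Action: assign the current container group to `bay`."""
        size = self.load_list[self.idx]
        if self.remaining[bay] < size:
            reward = -10.0  # penalty stands in for a violated stowage rule
        else:
            self.remaining[bay] -= size
            # Reward the gain in overall capacity utilization.
            reward = size / sum(self.bay_capacities)
        self.idx += 1
        done = self.idx == len(self.load_list)
        return self._state(), reward, done

env = MasterBayPlanningEnv(bay_capacities=[4, 4], load_list=[2, 2, 2, 2])
state, total, done = env.reset(), 0.0, False
while not done:
    # Stand-in greedy policy: pick the emptiest bay. In the proposed method,
    # a PPO-trained neural network would prescribe this action instead.
    action = max(range(len(env.remaining)), key=lambda b: env.remaining[b])
    state, reward, done = env.step(action)
    total += reward
```

Episodes generated this way (states, actions, rewards) are exactly the trajectories an on-policy method like PPO would collect to update the policy network.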

If DRL scales well for MBP, this opens a new avenue for stowage research. We recognize obtaining sufficient data on vessel types, handling stacking rules in slot planning, and incorporating revenue management into stowage plans as future challenges. However, we believe that more AI research advances the field toward fully automated stowage planning.
Original language: English
Publication date: 14 Sept 2022
Number of pages: 2
Publication status: Published - 14 Sept 2022
Event: International Conference on Computational Logistics 2022, Universitat Pompeu Fabra, Barcelona, Spain
Duration: 21 Sept 2022 - 23 Sept 2022




  • Container Stowage
  • Master Bay Planning
  • Deep Reinforcement Learning
  • Markov Decision Process
  • Maritime Logistics

