Statistical Assessment of Plans via Probabilistic Optimization of Reliability

Publication: Theses › PhD thesis

Abstract

Autonomous Underwater Vehicles (AUVs) require efficient and reliable
task planning to operate for extended periods without human
intervention. However, current task planning methodologies often rely
on deliberately abstract domain models to ensure tractability of the planning
task. This abstraction can lead to unreliable or underperforming
plans in real-world execution. In this thesis, I present a novel probabilistic
decision-making model aimed at enhancing the reliability of AUV task
planning and assisting in plan selection through various techniques. The
model supports reasoning by inference from past events based on the
current state (introspection) as well as reasoning based on past experiences
to retrieve knowledge (retrospection). By evaluating plans and
anticipating future events before mission execution (prospection), the
proposed framework enhances autonomous decision-making capabilities.
The framework comprises three components: Plan Generation, Plan Evaluation,
and Plan Execution. Plans can be generated from a knowledge
base, a classical AI planner, or Large Language Models (LLMs).
First, I propose a framework that enhances the ability of underwater
robots to autonomously recover from anomalous situations. This framework
is built around a knowledge model developed in three stages: first,
I create a deterministic knowledge base to describe the health of hardware,
software, and environmental components involved in a mission;
second, I model these components probabilistically by defining the likelihood
of failures, faults, and fixes; and finally, I combine deterministic
and probabilistic knowledge into a minimal ROS package designed to
detect failures, isolate underlying faults, propose fixes, and determine
the most likely successful outcome. The approach is motivated by a
camera fault scenario and is demonstrated using both a real AUV and
a simulated Remotely Operated Vehicle (ROV) experiencing a thruster
failure.
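The fault-isolation and fix-selection step described above can be illustrated with a small Bayesian calculation. This is a minimal sketch only: the fault names, fix names, and all probabilities are hypothetical values chosen for illustration, and the code is not the ROS package from the thesis.

```python
# Hypothetical numbers for illustration only; none come from the thesis.
# Prior probability of each underlying fault.
PRIOR = {"lens_fouling": 0.5, "driver_crash": 0.3, "cable_fault": 0.2}
# P(observed failure "no image" | fault).
LIKELIHOOD = {"lens_fouling": 0.2, "driver_crash": 0.9, "cable_fault": 0.8}
# P(fix succeeds | fault) for each candidate fix.
FIX_SUCCESS = {
    "restart_driver": {"lens_fouling": 0.0, "driver_crash": 0.95, "cable_fault": 0.1},
    "switch_camera": {"lens_fouling": 0.9, "driver_crash": 0.5, "cable_fault": 0.9},
}


def posterior(prior, likelihood):
    """Bayes' rule: P(fault | failure) is proportional to P(failure | fault) * P(fault)."""
    joint = {fault: prior[fault] * likelihood[fault] for fault in prior}
    evidence = sum(joint.values())
    return {fault: p / evidence for fault, p in joint.items()}


def rank_fixes(post, fix_success):
    """Expected success of each fix, marginalised over the fault posterior."""
    return {
        fix: sum(post[fault] * success[fault] for fault in post)
        for fix, success in fix_success.items()
    }


post = posterior(PRIOR, LIKELIHOOD)
most_likely_fault = max(post, key=post.get)
fix_scores = rank_fixes(post, FIX_SUCCESS)
best_fix = max(fix_scores, key=fix_scores.get)
print(most_likely_fault, best_fix)
```

Note that the fix with the highest expected success need not be the one that targets the single most likely fault, which is why the sketch marginalises over all faults rather than committing to one.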
Second, I automate plan generation using a risk-aware planning
framework, and a subset of these plans is assessed using various risk
metrics. Traditional planning problems are typically addressed with
risk-neutral optimization, focusing on a single objective such as minimizing
time or energy consumption. However, optimal abstract plans can
become suboptimal or unreliable during physical execution. To address
this, I introduce a method that generates diverse high-level plans and
evaluates them using risk measures including Variance, Entropy, Value
at Risk (VaR), Conditional Value at Risk (CVaR), and Entropic Value
at Risk (EVaR). These metrics provide a comprehensive assessment of
the potential risks associated with each plan. The method is evaluated
using a realistic underwater robot simulation across various scenarios,
demonstrating its feasibility and effectiveness. Among these metrics,
Variance identifies the minimum-risk plan, while VaR, CVaR, and EVaR
offer a more nuanced ranking of the candidate plans. The superiority of
the approach is demonstrated through two physics-based simulation
scenarios: pipeline inspection and subsea infrastructure inspection.
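To make the risk measures named above concrete, the following sketch computes empirical Variance, VaR, CVaR, and EVaR from sampled plan costs. It is a sketch under stated assumptions: the two plans and their cost samples are hypothetical, the confidence level of 0.9 is an arbitrary choice, and EVaR is approximated by a simple grid search rather than by whatever estimator the thesis uses.

```python
import math
import statistics


def value_at_risk(costs, level):
    """Smallest cost c such that at least `level` of the samples are <= c."""
    ordered = sorted(costs)
    k = max(1, math.ceil(level * len(ordered) - 1e-9))
    return ordered[k - 1]


def conditional_value_at_risk(costs, level):
    """Rockafellar-Uryasev form: VaR plus the scaled mean excess over VaR."""
    var = value_at_risk(costs, level)
    mean_excess = sum(max(c - var, 0.0) for c in costs) / len(costs)
    return var + mean_excess / (1.0 - level)


def entropic_value_at_risk(costs, level, z_grid=None):
    """Grid approximation of inf over z > 0 of ln(E[exp(z X)] / (1 - level)) / z."""
    if z_grid is None:
        z_grid = [0.01 * i for i in range(1, 1001)]  # assumed search range
    peak = max(costs)
    best = math.inf
    for z in z_grid:
        # Shifting by the maximum (log-sum-exp) keeps exp() from overflowing.
        shifted = sum(math.exp(z * (c - peak)) for c in costs) / len(costs)
        log_mgf = z * peak + math.log(shifted)
        best = min(best, (log_mgf - math.log(1.0 - level)) / z)
    return best


def risk_report(costs, level=0.9):
    return {
        "mean": statistics.fmean(costs),
        "variance": statistics.pvariance(costs),
        "var": value_at_risk(costs, level),
        "cvar": conditional_value_at_risk(costs, level),
        "evar": entropic_value_at_risk(costs, level),
    }


# Hypothetical mission-cost samples (for example, minutes) for two candidate plans.
PLANS = {
    "direct": [8, 9, 9, 10, 10, 10, 11, 11, 12, 40],    # cheap on average, heavy tail
    "safe": [13, 13, 14, 14, 14, 14, 14, 15, 15, 15],  # costlier on average, tight spread
}

reports = {name: risk_report(costs) for name, costs in PLANS.items()}
risk_neutral_choice = min(reports, key=lambda name: reports[name]["mean"])
risk_averse_choice = min(reports, key=lambda name: reports[name]["cvar"])
print(risk_neutral_choice, risk_averse_choice)
```

In this toy data the plan with the lowest mean cost is not the plan with the lowest tail risk, which is the situation a risk-aware ranking is meant to expose.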
Third, I explore plan generation with LLMs, drawing on past incident
response plans. Testing marine vessel hardware
and software in field trials is essential to avoid technical and environmental
failures, yet such trials are costly. Generating realistic domain
descriptions based on past missions enhances the value of simulation
testing and reduces the need for expensive real-world trials. In this
work, I generate scenarios from unstructured Incident Response Plan
(IRP) documents using LLMs and convert them into structured planning
programs. The two synthesized marine test-domain datasets contain
approximately 90% parsable, 75% solvable, and 57% correct planning
programs. I further evaluate the diversity of the generated plans using
metrics from the machine learning domain, including L2, Cosine, Wasserstein,
and BERTScore. To improve the embedding quality, I fine-tune
a pre-trained model, CodeBERT, for PDDL domain inputs. Among all
models, fine-tuned CodeBERT best captured the diversity of generated
plans (evaluated on 10 samples out of 51), while among pre-trained models,
Code-LLaMA showed stronger performance based on the L2 metric.
However, these metrics are limited in detecting structural variations
between plan elements.
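As an illustration of how distance-based diversity scores of this kind can be computed, the sketch below averages pairwise L2, cosine, and one-dimensional Wasserstein distances over plan embeddings. The three-dimensional vectors are hypothetical stand-ins for model embeddings, and treating each embedding's coordinates as an empirical sample for the Wasserstein distance is an assumption made here, not necessarily the formulation used in the thesis. BERTScore is omitted because it requires a pre-trained model.

```python
import math
from itertools import combinations


def l2_distance(u, v):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def cosine_distance(u, v):
    """1 - cosine similarity; both vectors must be non-zero."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("cosine distance is undefined for zero vectors")
    return 1.0 - dot / (norm_u * norm_v)


def wasserstein_1d(u, v):
    """1-Wasserstein distance between two equal-sized empirical samples."""
    if len(u) != len(v):
        raise ValueError("this sketch only handles equal-sized samples")
    return sum(abs(a - b) for a, b in zip(sorted(u), sorted(v))) / len(u)


def mean_pairwise_distance(embeddings, distance):
    """Diversity score: average distance over all unordered pairs."""
    pairs = list(combinations(embeddings, 2))
    return sum(distance(u, v) for u, v in pairs) / len(pairs)


# Hypothetical embeddings of three generated plans.
EMBEDDINGS = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [1.0, 0.0, 0.1],
]

diversity = {
    "l2": mean_pairwise_distance(EMBEDDINGS, l2_distance),
    "cosine": mean_pairwise_distance(EMBEDDINGS, cosine_distance),
    "wasserstein": mean_pairwise_distance(EMBEDDINGS, wasserstein_1d),
}
print(diversity)
```

Under this coordinate-as-sample assumption, the Wasserstein distance between the first two vectors is zero because they contain the same values in a different order, so the score ignores where in the vector a value sits.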
This thesis enhances autonomous underwater vehicle task planning
by combining these methods to overcome existing limitations and boost
the reliability, variety, and effectiveness of AUV missions.
Original language: English
Qualification: PhD
Supervisor(s)
  • Wasowski, Andrzej, Principal supervisor
  • Heinrich, Stefan, Co-supervisor
Award date: 5 Aug 2025
ISBNs, print: 978-87-7949-548-7
ISBNs, electronic: 978-87-7949-566-1
Status: Published - 2025
