TY - JOUR
T1 - A distributed and morphology-independent strategy for adaptive locomotion in self-reconfigurable modular robots
AU - Christensen, David Johan
AU - Schultz, Ulrik Pagh
AU - Støy, Kasper
PY - 2013
Y1 - 2013
AB - In this paper, we present a distributed reinforcement learning strategy for morphology-independent life-long gait learning for modular robots. All modules run identical controllers that locally and independently optimize their action selection based on the robot’s velocity as a global, shared reward signal. We evaluate the strategy experimentally mainly on simulated, but also on physical, modular robots. We find that the strategy: (i) for six of seven configurations (3–12 modules) converges in 96% of the trials to the best known action-based gaits within 15 min, on average, (ii) can be transferred to physical robots with a comparable performance, (iii) can be applied to learn simple gait control tables for both M-TRAN and ATRON robots, (iv) enables an 8-module robot to adapt to faults and changes in its morphology, and (v) can learn gaits for robots with up to 60 modules, but a divergence effect becomes substantial from 20–30 modules. These experiments demonstrate the advantages of a distributed learning strategy for modular robots, such as simplicity in implementation, low resource requirements, morphology independence, reconfigurability, and fault tolerance.
KW - Self-reconfigurable modular robots
KW - Locomotion
KW - Online learning
KW - Distributed control
KW - Fault tolerance
M3 - Journal article
SN - 0921-8890
VL - 61
SP - 1021
EP - 1035
JO - Robotics and Autonomous Systems
JF - Robotics and Autonomous Systems
IS - 9
ER -