Deep Reinforcement Learning for Joint Cruise Control and Intelligent Data Acquisition in UAVs-Assisted Sensor Networks
Ref: CISTER-TR-231101       Publication Date: 8, Nov, 2023

Unmanned aerial vehicle (UAV)-assisted sensor networks (UASNets) are experiencing significant growth in civil applications worldwide. Just as UASNets have revolutionized military operations with improved surveillance, precise targeting, and enhanced communication systems, they are now driving transformative change in numerous civilian sectors by offering greater efficiency, safety, and cost-effectiveness. For instance, UASNets improve disaster management through timely surveillance and advance precision agriculture with detailed crop monitoring. A fundamental aspect of these new capabilities is the collection of data from rugged and remote areas. Owing to their excellent mobility and maneuverability, UAVs are employed to collect data from ground sensors in harsh environments, in applications such as natural disaster monitoring, border surveillance, and emergency response monitoring. One major challenge in these scenarios is that UAV movement affects channel conditions and can cause packet loss. Fast movement leads to poor channel conditions and rapid signal degradation, so transmitted packets are lost. Slow movement, on the other hand, can cause buffer overflows at the ground sensors, because newly arrived data is not collected promptly by the UAV.
We address this challenge by minimizing packet loss through joint optimization of the velocity controls and data collection schedules of multiple UAVs. The state of each ground sensor comprises its battery level, data queue length, and channel quality. Because the UAVs lack up-to-date knowledge of the ground sensors’ states, we propose a multi-UAV deep reinforcement learning-based scheduling algorithm (MADRL-SA). This algorithm allows the UAVs to asymptotically minimize packet loss due to buffer overflows and poor channel conditions, even when individual UAVs hold outdated knowledge of the network state.
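The two loss mechanisms named above can be illustrated with a toy single-sensor model. This is only a sketch under stated assumptions, not the MADRL-SA formulation from the thesis: the names `SensorState` and `step`, the fixed buffer capacity, the per-packet energy cost, and the reading of channel quality as a per-packet delivery probability are all hypothetical choices made for illustration.

```python
import random
from dataclasses import dataclass


@dataclass
class SensorState:
    battery: float          # remaining energy, arbitrary units (assumed)
    queue_len: int          # packets waiting in the sensor buffer
    channel_quality: float  # assumed per-packet delivery probability in [0, 1]


def step(sensor, arrivals, scheduled, capacity, batch, rng, energy_per_packet=1.0):
    """Advance one time slot and return the number of packets lost in it."""
    channel_loss = 0
    if scheduled:
        # The sensor sends at most `batch` packets, limited by queue and battery.
        affordable = int(sensor.battery // energy_per_packet)
        sent = min(batch, sensor.queue_len, affordable)
        for _ in range(sent):
            # Assumption: a failed transmission is dropped, not retransmitted.
            if rng.random() >= sensor.channel_quality:
                channel_loss += 1
        sensor.queue_len -= sent
        sensor.battery -= sent * energy_per_packet
    # New data that does not fit in the buffer is lost to overflow.
    overflow = max(0, sensor.queue_len + arrivals - capacity)
    sensor.queue_len = min(capacity, sensor.queue_len + arrivals)
    return channel_loss + overflow
```

In this toy model, a sensor that is not visited in time loses packets to overflow, while a sensor visited over a poor channel loses them in transmission, which is the trade-off the scheduling algorithm has to balance.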
Furthermore, in UASNets, swift UAV movement results in poor channel conditions and fast signal attenuation, which increases the age of information (AoI) of the collected data. Slow movement, in contrast, prolongs flight time and thereby also increases the AoI of the ground sensors. Each UAV should additionally account for the movements of the other UAVs, coordinating velocities to minimize the average AoI. Finding an equilibrium solution among the UAVs that optimizes velocity and reduces the average AoI is therefore crucial.
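For readers unfamiliar with AoI, a common discrete-time bookkeeping rule is sketched below: a sensor's age grows by one slot while its data goes uncollected and resets when a UAV collects it. The function names and the reset value of 1 are assumptions for illustration; the thesis may define the AoI update differently.

```python
def update_aoi(ages, collected, reset_value=1):
    """Return the per-sensor ages after one time slot.

    ages: list of current ages, one entry per ground sensor.
    collected: set of sensor indices whose data was collected in this slot.
    reset_value: age assigned right after a successful collection (assumed).
    """
    return [reset_value if i in collected else age + 1
            for i, age in enumerate(ages)]


def average_aoi(ages):
    """Average age across all ground sensors."""
    return sum(ages) / len(ages)


ages = update_aoi([3, 5, 2], collected={1})
print(ages, average_aoi(ages))
```

Both failure modes described above raise this average: a failed transmission leaves a sensor's age growing, and a long flight delays the next reset.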
To address this challenge, we propose a new mean-field flight resource allocation optimization to minimize the AoI of sensory data. The trade-off between UAV movements and AoI is formulated as a mean-field game (MFG). We introduce a new mean-field hybrid proximal policy optimization (MF-HPPO) scheme to handle the expanded solution space of the MFG optimization. This scheme minimizes the average AoI by optimizing the UAV trajectories and the data collection schedules of the ground sensors over mixed continuous and discrete actions. Additionally, we incorporate a long short-term memory (LSTM) network in MF-HPPO to predict the time-varying network state and stabilize the training.
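As a rough illustration of what "mixed continuous and discrete actions" means for a PPO-style policy, the sketch below samples a discrete sensor selection from a categorical distribution and a continuous velocity from a Gaussian, sums their log-probabilities, and evaluates the standard PPO clipped surrogate for a single sample. It is a generic hybrid-action sketch, not the MF-HPPO scheme itself: the mean-field term and the LSTM are omitted, and all function names and parameters are hypothetical.

```python
import math
import random


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def gaussian_log_pdf(x, mu, sigma):
    """Log-density of a univariate normal distribution."""
    return (-0.5 * ((x - mu) / sigma) ** 2
            - math.log(sigma) - 0.5 * math.log(2 * math.pi))


def sample_hybrid_action(logits, mu, sigma, rng):
    """Sample (sensor index, velocity) and return their joint log-probability."""
    probs = softmax(logits)
    sensor = rng.choices(range(len(probs)), weights=probs)[0]  # discrete part
    velocity = rng.gauss(mu, sigma)                            # continuous part
    # Assumption: the two parts are independent given the state,
    # so the joint log-probability is the sum of the two terms.
    log_prob = math.log(probs[sensor]) + gaussian_log_pdf(velocity, mu, sigma)
    return sensor, velocity, log_prob


def ppo_clip_objective(new_log_prob, old_log_prob, advantage, eps=0.2):
    """PPO clipped surrogate for one sample (to be maximized)."""
    ratio = math.exp(new_log_prob - old_log_prob)
    clipped = max(1.0 - eps, min(1.0 + eps, ratio))
    return min(ratio * advantage, clipped * advantage)


rng = random.Random(0)
sensor, velocity, log_prob = sample_hybrid_action([0.1, 0.5, -0.2], 10.0, 2.0, rng)
```

In a full training loop the logits, mean, and standard deviation would come from a neural network, and the objective would be averaged over a batch of collected transitions.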

Yousef Emami

PhD Thesis, University of Porto.

Record Date: 21, Nov, 2023