Multi-agent coverage controller: a deep reinforcement learning method for coverage path planning in multi-agent systems
Abstract
Article received: 09.12.2025
This work presents the Multi-Agent Coverage Controller (MACC), a specialized deep reinforcement learning method for the coverage path planning problem in multi-agent systems. The method addresses key challenges inherent to coverage path planning: sparse and noisy rewards, high gradient variance, the difficulty of credit assignment among agents, and the need to scale to a variable number of agents. MACC integrates four mechanisms: an adaptive clipping-interval width, advantage-modulation gating, a counterfactual baseline for the centralized critic, and a multi-head self-attention mechanism with a presence mask. Theoretical properties of the method are established, demonstrating optimization stability and reduced variance of the gradient estimates. A comprehensive ablation study shows the contribution of each mechanism to agent coordination, the spatial distribution of trajectories, and overall coverage speed. Experiments on a set of satellite maps indicate that MACC substantially improves coverage completeness and speed over the baseline configuration, with the best results obtained when all four mechanisms are used jointly.
Keywords: multi-agent system, coverage path planning, deep reinforcement learning, adaptive clipping-interval width, advantage-modulation gating, counterfactual baseline, multi-head self-attention mechanism, agent coordination