Precise environmental perception is critical to the reliability of autonomous driving systems. While collaborative perception mitigates the limitations of single-agent perception through information sharing, it faces a fundamental communication-performance trade-off. Existing communication-efficient approaches typically assume MB-level data transmission per collaboration round, which may fail under practical network constraints. To address these issues, we propose InfoCom, an information-aware framework that establishes a theoretical foundation for communication-efficient collaborative perception via extended Information Bottleneck principles. Departing from mainstream feature-manipulation approaches, InfoCom introduces a novel information purification paradigm that extracts the minimal sufficient task-critical information under Information Bottleneck constraints. Its core innovations are: i) an Information-Aware Encoding that condenses features into minimal messages while preserving perception-relevant information; ii) a Sparse Mask Generation that identifies spatial cues at negligible communication cost; and iii) a Multi-Scale Decoding that progressively recovers perceptual information through mask-guided mechanisms rather than simple feature reconstruction. Comprehensive experiments across multiple datasets demonstrate that InfoCom achieves near-lossless perception while reducing communication overhead from megabyte to kilobyte scale: 440-fold and 90-fold reductions per agent compared with Where2comm and ERMVP, respectively.
InfoCom is a communication-efficient collaborative perception framework based on a novel information purification paradigm, consisting of three core modules: (1) Information-Aware Encoding condenses task-critical information from high-dimensional intermediate features into minimal sufficient representations by extending the Information Bottleneck principle; (2) Sparse Mask Generation identifies essential spatial cues with minimal communication overhead; (3) Multi-Scale Decoding progressively recovers perceptual information through mask-guided reconstruction.
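The encode-mask-decode pipeline can be sketched as below. This is an illustrative mock-up, not the paper's implementation: the random projection stands in for the learned Information-Aware Encoder, the activation-energy heuristic stands in for the learned Sparse Mask Generation, and all shapes and the keep ratio are made-up toy values.

```python
import numpy as np

rng = np.random.default_rng(0)

def information_aware_encode(feat, proj):
    """Condense a high-dimensional feature map (H, W, C) into a compact
    message (H, W, d). `proj` is a random stand-in for the encoder that
    the paper trains under the Information Bottleneck objective."""
    return feat @ proj  # (H, W, C) @ (C, d) -> (H, W, d)

def sparse_mask(feat, keep_ratio=0.05):
    """Keep only the spatial cells with the highest activation energy.
    A binary (H, W) mask costs roughly H*W bits, i.e. negligible bandwidth."""
    energy = np.abs(feat).sum(axis=-1)               # (H, W)
    thresh = np.quantile(energy, 1.0 - keep_ratio)
    return energy >= thresh                          # boolean (H, W)

def build_message(feat, proj, keep_ratio=0.05):
    """Transmit the mask plus condensed features at masked locations only."""
    mask = sparse_mask(feat, keep_ratio)
    msg = information_aware_encode(feat, proj)[mask]  # (K, d)
    return mask, msg

# Toy numbers: a 100x100 BEV grid with 256 channels, condensed to d=8.
H, W, C, d = 100, 100, 256, 8
feat = rng.standard_normal((H, W, C)).astype(np.float32)
proj = rng.standard_normal((C, d)).astype(np.float32)

mask, msg = build_message(feat, proj)
raw_kb = feat.nbytes / 1024                           # full feature map
sent_kb = (msg.astype(np.float32).nbytes + mask.size / 8) / 1024
print(f"raw: {raw_kb:.0f} KB, sent: {sent_kb:.1f} KB")
```

Even in this toy setting, transmitting the mask plus masked low-dimensional features drops the payload from megabyte to kilobyte scale, which is the effect the three modules are designed to achieve.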
Experimental results highlight three strengths of InfoCom. i) \textit{Exceptional communication efficiency}: InfoCom requires only kilobyte-level communication volume, comparable to Late Collaboration and significantly lower than other feature-based solutions. Specifically, its bandwidth consumption is over 400 times lower than Where2comm's, only 1\% of ERMVP's, and over 4000 times lower than Standard Collaboration's. ii) \textit{Superior perception performance}: despite minimal communication overhead, InfoCom maintains perception performance on par with the bandwidth-intensive Standard Collaboration while significantly outperforming Where2comm; among the baselines, ERMVP shows the smallest performance gap to InfoCom. iii) \textit{Optimal communication-performance trade-off}: InfoCom achieves state-of-the-art performance gain per unit bandwidth. For example, on the OPV2V dataset, InfoCom attains an average performance gain of $1.8 \times 10^{-2}$ per kilobyte, substantially exceeding Where2comm ($3.2 \times 10^{-5}$) and ERMVP ($1.7 \times 10^{-4}$).
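The trade-off comparison follows directly from the reported per-kilobyte gains; the short computation below reproduces the implied ratios from the numbers quoted above.

```python
# Average performance gain per kilobyte on OPV2V, as reported above.
gain_per_kb = {
    "InfoCom":    1.8e-2,
    "Where2comm": 3.2e-5,
    "ERMVP":      1.7e-4,
}

# Efficiency of InfoCom relative to each baseline.
ratio_w2c = gain_per_kb["InfoCom"] / gain_per_kb["Where2comm"]
ratio_ermvp = gain_per_kb["InfoCom"] / gain_per_kb["ERMVP"]
print(f"vs Where2comm: {ratio_w2c:.0f}x, vs ERMVP: {ratio_ermvp:.0f}x")
```

This puts InfoCom's bandwidth efficiency at roughly two to three orders of magnitude above both baselines.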
@inproceedings{Wei2026infocom,
author = {Quanmin Wei and Penglin Dai and Wei Li and Bingyi Liu and Xiao Wu},
title = {InfoCom: Kilobyte-Scale Communication-Efficient Collaborative Perception with Information Bottleneck},
booktitle = {AAAI Conference on Artificial Intelligence (AAAI)},
year = {2026}
}