This article investigates the application of deep learning (DL) to the fusion of omnidirectional (O-D) infrared (IR) sensors and O-D visual sensors to improve the intelligent perception of autonomous robotic systems. Recent techniques primarily focus on O-D and conventional visual sensors for applications in localization, mapping, and tracking. Robotic vision systems have not sufficiently utilized the combination of O-D IR and O-D visual sensors, coupled with DL, for the extraction of vegetation material. We highlight the contrast between current approaches and our deep vegetation-learning sensor fusion. This article introduces two architectures for the fusion of O-D IR and O-D visual sensors: 1) two autoencoders feeding into a four-layer convolutional neural network (CNN) and 2) two deep CNN feature extractors feeding a deep CNN fusion network (DeepFuseNet), both designed to reduce the false detections inherent in indices-based spectral decomposition. We compare our DL results to our previous work with normalized difference vegetation index (NDVI) and IR region-based spectral fusion, and to traditional machine learning approaches. This work demonstrates that fusing the O-D IR and O-D visual streams with our DeepFuseNet DL approach outperforms both the previous NDVI fused with far-IR region segmentation and the traditional machine learning approaches. Experimental results validate a 92% reduction in false detections compared to traditional indices-based detection. This article contributes a novel method for the fusion of O-D visual and O-D IR sensors using two CNN feature extractors feeding into a deep CNN (DeepFuseNet).