21st International Conference on Intelligent Data Engineering and Automated Learning, IDEAL 2020, Guimaraes, Portugal, 4-6 November 2020, vol. 12489 LNCS, pp. 77-87
© 2020, Springer Nature Switzerland AG. Convolutional Neural Networks (CNNs) have recently been applied to video classification, where various methods for combining the appearance (spatial) and motion (temporal) information from video clips have been considered. The most common method for combining the two is to average the prediction scores at the softmax layer. Inspired by the Mycin uncertainty system for combining production rules in expert systems, this paper proposes using the Mycin formula for decision fusion in two-stream convolutional neural networks. Based on the intuition that spatial information is more useful than temporal information for video classification, this paper also proposes multiplication and asymmetrical multiplication for decision fusion, aiming to better combine the spatial and temporal information in two-stream convolutional neural networks. The experimental results show that (i) both spatial and temporal information are important, but the decision from the spatial stream should dominate, with the decision from the temporal stream serving as a complement, and (ii) the proposed asymmetrical multiplication method for decision fusion significantly outperforms both the Mycin method and the averaging method.
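The fusion rules named in the abstract can be illustrated with a minimal sketch that operates on per-class softmax scores from the two streams. The function names and the example scores are hypothetical. The Mycin rule is written in its standard form for two non-negative certainty factors, s + t - s*t. The abstract does not define asymmetrical multiplication, so the variant below (damping the temporal score with an exponent `alpha` below 1 so that the spatial stream dominates) is only an assumption for illustration and should not be read as the paper's formula.

```python
def average_fusion(spatial, temporal):
    """Baseline: mean of the two streams' softmax scores, per class."""
    return [(s + t) / 2 for s, t in zip(spatial, temporal)]


def mycin_fusion(spatial, temporal):
    """Mycin combination of two non-negative certainty factors: s + t - s*t."""
    return [s + t - s * t for s, t in zip(spatial, temporal)]


def multiplication_fusion(spatial, temporal):
    """Symmetric multiplication of the two streams' scores, per class."""
    return [s * t for s, t in zip(spatial, temporal)]


def asymmetric_multiplication_fusion(spatial, temporal, alpha=0.5):
    """ASSUMPTION: hypothetical asymmetric variant, not the paper's definition.

    With 0 < alpha < 1 the temporal score is flattened before multiplying,
    so the spatial score has more influence on the fused result.
    """
    return [s * (t ** alpha) for s, t in zip(spatial, temporal)]


def predict(scores):
    """Index of the class with the highest fused score."""
    return max(range(len(scores)), key=scores.__getitem__)


if __name__ == "__main__":
    # Made-up softmax outputs for a three-class problem.
    spatial_scores = [0.5, 0.4, 0.1]
    temporal_scores = [0.2, 0.6, 0.2]

    for fuse in (average_fusion, mycin_fusion,
                 multiplication_fusion, asymmetric_multiplication_fusion):
        fused = fuse(spatial_scores, temporal_scores)
        print(fuse.__name__, fused, "-> class", predict(fused))
```

The fused scores are not renormalized here, because only their ranking matters when picking the predicted class.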