ObjectGraphs: Using Objects and a Graph Convolutional Network for the Bottom-up Recognition and Explanation of Events in Video

Post published: June 28, 2021
Post category: News / Publication

Paper presentation by CERTH at the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)!

Authors: Nikolaos Gkalelis, Andreas Goulas, Damianos Galanopoulos, Vasileios Mezaris

Abstract: In this paper a novel bottom-up video event recognition approach is proposed, ObjectGraphs, which utilizes a rich frame representation and the relations between objects within each frame. Following the application of an object detector (OD) on the frames, graphs are used to model the object relations and a graph convolutional network (GCN) is utilized to perform reasoning on the graphs. The resulting object-based frame-level features are then forwarded to a long short-term memory (LSTM) network for video event recognition. Moreover, the weighted in-degrees (WiDs) derived from the graph’s adjacency matrix at frame level are used for identifying the objects that were considered most (or least) salient for event recognition and contributed the most (or least) to the final event recognition decision, thus providing an explanation for the latter. The experimental results show that the proposed method achieves state-of-the-art performance on the publicly available FCVID and YLI-MED datasets.
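To make the graph reasoning and the explanation step in the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation. The adjacency construction (row-wise softmax over dot-product similarities), the single-layer GCN, the toy features, and all function names are assumptions made for illustration only; the object detector and the LSTM are omitted. The weighted in-degree (WiD) of object j is taken here to be the sum of column j of the adjacency matrix, under the assumed convention that A[i][j] is the weight of the edge from object i to object j.

```python
import math

def matmul(P, Q):
    """Plain-Python matrix product of two lists of lists."""
    return [[sum(p * q for p, q in zip(row, col)) for col in zip(*Q)] for row in P]

def softmax(row):
    """Numerically stable softmax over one row."""
    m = max(row)
    exps = [math.exp(v - m) for v in row]
    total = sum(exps)
    return [e / total for e in exps]

def adjacency_from_features(X):
    """Hypothetical adjacency: softmax-normalised dot-product similarities.

    Assumption for this sketch; the abstract does not specify the construction.
    """
    sims = [[sum(a * b for a, b in zip(xi, xj)) for xj in X] for xi in X]
    return [softmax(row) for row in sims]

def gcn_layer(A, X, W):
    """One graph convolution step: ReLU(A X W)."""
    return [[max(0.0, v) for v in row] for row in matmul(matmul(A, X), W)]

def weighted_in_degrees(A):
    """WiD of object j = sum of the weights of all edges pointing to j."""
    n = len(A)
    return [sum(A[i][j] for i in range(n)) for j in range(n)]

# Toy frame with three detected objects, each with a 2-D feature vector
# (real detector features would be much higher-dimensional).
X = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
W = [[1.0, 0.0], [0.0, 1.0]]  # identity weights, a placeholder for learned ones

A = adjacency_from_features(X)
H = gcn_layer(A, X, W)          # object-level features after graph reasoning
wids = weighted_in_degrees(A)   # one saliency score per object

# Objects ordered from most to least salient under this sketch's WiD score.
ranking = sorted(range(len(wids)), key=lambda j: -wids[j])
print(ranking, wids)
```

In this toy frame the two objects with identical features reinforce each other and receive a higher WiD than the third object, which is the kind of per-object saliency signal the abstract describes using for explanation. In a full pipeline the per-object outputs `H` would be pooled into a frame-level feature and passed to the LSTM; that step is not shown here.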

To download the paper, click here:

ObjectGraphs: Using Objects and a Graph Convolutional Network for the Bottom-up Recognition and Explanation of Events in Video

© All rights reserved

Imprint | Privacy Policy
