
Video-Based Human Action Recognition Using Spatial Pyramid Pooling and 3D Densely Convolutional Networks

Authors

Listed:
  • Wanli Yang

    (School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China)

  • Yimin Chen

    (School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China
    Shanghai Institute for Advanced Communication and Data Science, Shanghai 200444, China)

  • Chen Huang

    (School of Computer Engineering and Science, Shanghai University, Shanghai 200444, China)

  • Mingke Gao

    (The 32nd Research Institute, China Electronics Technology Group Corporation, No. 63 Chengliugong Road, Jiading District, Shanghai 200444, China)

Abstract

In recent years, the application of deep neural networks to human behavior recognition has become an active research topic. Although remarkable progress has been made in image recognition, many problems remain unsolved for video. Convolutional neural networks typically require a fixed-size image input, which not only constrains the network structure but also affects recognition accuracy. This problem has been addressed for images, but not yet for video. To remove the fixed-size constraint on video frames in video recognition, we propose a three-dimensional (3D) densely connected convolutional network based on spatial pyramid pooling (3D-DenseNet-SPP). As the name implies, the network combines three components: 3D CNN, DenseNet, and SPPNet. Our models were evaluated separately on the KTH and UCF101 datasets. The experimental results showed that our model performs better on video-based behavior recognition than existing models.
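
The idea behind spatial pyramid pooling, producing a fixed-length feature vector from an input of any size, can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function name `spp_max_pool`, the pyramid levels (1, 2, 4), and the single-channel 2D feature map are assumptions chosen for illustration, and the abstract does not state the configuration used in 3D-DenseNet-SPP.

```python
def spp_max_pool(feature_map, levels=(1, 2, 4)):
    """Max-pool a 2D feature map (a list of rows) into a fixed-length vector.

    For each pyramid level n (hypothetical defaults: 1, 2, 4), the map is
    split into an n x n grid of bins and the maximum of each bin is kept,
    so the output always has sum(n * n for n in levels) values.
    """
    height = len(feature_map)
    width = len(feature_map[0])
    pooled = []
    for n in levels:
        for i in range(n):
            # Floor for the start and ceiling for the end, so the bins
            # cover the whole map and are never empty.
            top = (i * height) // n
            bottom = ((i + 1) * height + n - 1) // n
            for j in range(n):
                left = (j * width) // n
                right = ((j + 1) * width + n - 1) // n
                pooled.append(max(
                    feature_map[r][c]
                    for r in range(top, bottom)
                    for c in range(left, right)
                ))
    return pooled


# Two feature maps of different sizes yield vectors of the same length.
small = [[r * 5 + c for c in range(5)] for r in range(4)]    # 4 x 5
large = [[r * 13 + c for c in range(13)] for r in range(9)]  # 9 x 13
print(len(spp_max_pool(small)), len(spp_max_pool(large)))    # prints "21 21"
```

Because the number of bins depends only on the pyramid levels, a layer placed after this pooling step sees the same input length whatever the frame size, which is the property the abstract relies on. A 3D variant would additionally bin along the temporal axis.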

Suggested Citation

  • Wanli Yang & Yimin Chen & Chen Huang & Mingke Gao, 2018. "Video-Based Human Action Recognition Using Spatial Pyramid Pooling and 3D Densely Convolutional Networks," Future Internet, MDPI, vol. 10(12), pages 1-11, November.
  • Handle: RePEc:gam:jftint:v:10:y:2018:i:12:p:115-:d:184859

    Download full text from publisher

    File URL: https://www.mdpi.com/1999-5903/10/12/115/pdf
    Download Restriction: no

    File URL: https://www.mdpi.com/1999-5903/10/12/115/
    Download Restriction: no