CSCi 8980 Edge Cloud Research: The Design and Implementation of a Wireless Video Surveillance System

Summary:

Current wired surveillance systems are expensive to deploy because of the infrastructure they require, which restricts camera coverage. The authors propose Vigil, a real-time distributed wireless video surveillance system that leverages edge computing to scale surveillance to many cameras and expand the coverage region under limited bandwidth. The major challenge in a wireless system is the limited capacity of the wireless spectrum, so Vigil aims to minimize bandwidth consumption without sacrificing surveillance accuracy. It consists of two major components:

Edge Computing Nodes (ECN): These are small computing platforms, such as a laptop or an embedded system, attached to the cameras. Each ECN receives the video feed from its connected camera and performs initial processing on the stream, such as face detection, indexing, compression and short-term storage of the video. It uploads the analytics of the feed to the controller.
Controller: The controller is located in the cloud. It receives user queries and runs a frame scheduling algorithm that requests the ECNs to upload only the relevant video frames. This content-aware uploading strategy suppresses a large fraction of redundant data transfer to the cloud, thereby saving bandwidth.
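
The split between the two components can be sketched in a few lines of Python. This is only an illustration of content-aware uploading, not the paper's scheduling algorithm: the FrameSummary record, the select_frames helper, the use of a face count as the utility, and the greedy budget rule are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass
class FrameSummary:
    """Analytics an ECN might upload instead of the frame itself (hypothetical)."""
    ecn_id: str
    frame_index: int
    util: int        # assumed utility, e.g. number of faces detected in the frame
    size_bytes: int  # size of the compressed frame if it were uploaded


def select_frames(summaries, budget_bytes):
    """Controller side: greedily request the highest-utility frames that fit
    in the byte budget. A stand-in for the paper's scheduler, not a copy."""
    chosen = []
    remaining = budget_bytes
    for s in sorted(summaries, key=lambda s: s.util, reverse=True):
        if s.util == 0:
            break  # frames with no detected objects are never requested
        if s.size_bytes <= remaining:
            chosen.append(s)
            remaining -= s.size_bytes
    return chosen


summaries = [
    FrameSummary("ecnA", 0, util=0, size_bytes=100),
    FrameSummary("ecnA", 1, util=3, size_bytes=100),
    FrameSummary("ecnB", 0, util=1, size_bytes=100),
    FrameSummary("ecnB", 1, util=2, size_bytes=150),
]
requested = select_frames(summaries, budget_bytes=250)
print([(s.ecn_id, s.frame_index) for s in requested])
```

With a 250-byte budget only two of the four frames are requested, and the empty frame is never uploaded, which is the bandwidth saving the summary describes.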

Architecture and Design:

Each ECN calls a frameUtility function on its frames to generate an array of analytics data called the util. These analytics are uploaded to the controller, which then decides which frames are most valuable and requests the respective ECNs to upload them. Two designs are discussed in the paper:

Intra-cluster processing: When several cameras capture the same scene from different angles, their frames may overlap and show the same object or person. The design uses a re-identification algorithm to identify frames that contain the same objects, and based on this the controller runs a scheduling algorithm to request frames from the ECNs in the same cluster.
Inter-cluster processing: The paper defines a metric, useful objects per second (ops), and schedules uploads across clusters so as to maximize the number of useful objects per second delivered to the controller. The ops value captures how many useful objects per second the frame at the ith index of the selected image sequence from cluster c will deliver if it is selected for transmission.
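
The two designs can be illustrated with a small Python sketch. It is a simplification under stated assumptions and not the paper's Vigil DRR algorithm: pick_cluster_frames, ops and next_cluster are hypothetical helpers, re-identification is assumed to have already assigned a shared ID to each object, and ops is assumed to be the number of new objects divided by the frame's transmission time.

```python
def pick_cluster_frames(frames):
    """Intra-cluster: frames maps camera id -> set of re-identified object ids.
    Greedily order frames by how many not-yet-seen objects each one adds."""
    seen = set()
    order = []
    remaining = dict(frames)
    while remaining:
        cam = max(remaining, key=lambda c: len(remaining[c] - seen))
        gain = len(remaining[cam] - seen)
        if gain == 0:
            break  # every remaining frame is redundant with what was selected
        order.append((cam, gain))
        seen |= remaining.pop(cam)
    return order


def ops(new_objects, frame_bytes, link_bytes_per_sec):
    """Useful objects per second if this frame is sent (assumed definition)."""
    return new_objects / (frame_bytes / link_bytes_per_sec)


def next_cluster(candidates):
    """Inter-cluster: candidates maps cluster id -> (new_objects, frame_bytes,
    link_bytes_per_sec) for that cluster's next frame; pick the highest ops."""
    return max(candidates, key=lambda c: ops(*candidates[c]))


cluster = {"cam1": {"a", "b"}, "cam2": {"b", "c", "d"}, "cam3": {"a"}}
print(pick_cluster_frames(cluster))

candidates = {"lobby": (3, 50_000, 100_000), "garage": (1, 50_000, 100_000)}
print(next_cluster(candidates))
```

In this example cam3 is never requested because its only object already appears in cam1's frame, and the lobby cluster is served first because its next frame delivers more useful objects for the same transmission time.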

Strength:

The paper provides a very comprehensive description of the design and implementation. The examples were very useful in understanding frameUtility, utils and the Vigil DRR algorithm. The initial scoping and problem description were very clear, with assumptions explicitly stated, leaving little room for speculation.
The evaluation methodology is good, providing experimental results for both the intra-cluster and inter-cluster algorithms discussed in the paper.
The system proposed is flexible in the sense that the definition of the utility can be changed based on the specific use case without changing the underlying algorithms.
Vigil outperforms the Round Robin and single-camera approaches at high activity levels, providing a gain of 23-30%.
The inter-cluster design outperforms the traditional benchmarks at all activity levels and bandwidths.
The system is already deployed at two indoor and one outdoor site for the purpose of testing.

Weakness:

Each camera requires an ECN to connect to the controller, which adds infrastructure and management overhead. The battery or power requirements of these devices should also be considered.
Video and image processing at the edge nodes requires additional hardware, and some of these algorithms can be very computationally intensive.

Discussion Points:

The re-identification algorithm fails when the objects are very close together in two frames. The authors plan to address this in future work by switching to a traditional algorithm in such cases. How feasible is this switching, keeping in mind that these are real-time systems?
In outdoor scenarios, face detection or object detection can be tricky since there is a lot of interference and noise. What will be the accuracy of content-based frame selection in such real-life scenarios?

2 comments:

Unknown, 6 March 2017 at 21:15
This paper is pretty good. It creates a method for transmitting useful information from one place and implements a method for balancing the transmissions from different places to the controller. One question about this paper is that it relies on the analysis results of the video. As far as I know, current image recognition algorithms may not work very well in all conditions. So, using reflected signals (similar to radar, but with reflected Wi-Fi signals instead) to detect moving objects may be much better.
There is a paper about how to use Wi-Fi to detect moving objects:
See Through Walls with Wi-Fi!
http://conferences.sigcomm.org/sigcomm/2013/papers/sigcomm/p75.pdf
jon_weissman, 7 March 2017 at 08:20
Good summary. Point about battery is well-taken for some deployments, though CCTV may use AC power. Also, accuracy could become an issue as you indicate. Different priority or utility functions may be useful in such cases.

Zheng: thanks for the paper suggestion.


Monday 6 March 2017