Two pathways are connected at an angle, so that their ends form the connection areas between them. This embedding helps to track the suspicious person as he travels along the path. In step 1, the input to the algorithm is the frame stream from a single camera; each frame i is processed by the algorithm.
In step 2, the second input is the bbox of the person selected by the operator, who is to be tracked.
In step 3, the output is defined as the appropriate pedestrian bounding box. In step 5, the correlation tracker is initialized with the bbox for correlation tracking. In step 7, a for loop is set up to process each frame i generated by the camera. In step 8, the YOLO box is placed on the selected suspicious person in frame i.
In step 9, the correlation box is placed on the same person in the same frame i. In step 10, the IoU between the YOLO box and the correlation box is calculated in order to obtain an accurate bounding box. In step 11, the calculated IoU is compared with the threshold set by the operator. In steps 12 and 13, if the IoU exceeds the threshold, the tracker is updated with the YOLO box to follow the movement, and the YOLO box is appended to the path set. In step 14, if the IoU does not exceed the threshold, then in step 15 the CF box is appended to the path set.
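The IoU check in steps 10 to 15 can be sketched as follows. This is a minimal illustration rather than the authors' implementation: the corner-format `Box` type, the function names `iou` and `select_box`, and the example threshold of 0.5 are assumptions introduced here.

```python
from typing import List, Tuple

# Assumed box format: (x1, y1, x2, y2) with x1 < x2 and y1 < y2.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def select_box(yolo_box: Box, cf_box: Box, threshold: float) -> Tuple[Box, bool]:
    """Return the box to append to the path set and whether the
    correlation tracker should be updated with the YOLO box."""
    if iou(yolo_box, cf_box) > threshold:
        return yolo_box, True   # steps 12 and 13: trust the YOLO box
    return cf_box, False        # steps 14 and 15: keep the CF box


# Example for one frame (hypothetical coordinates and threshold).
path_set: List[Box] = []
box, reseed = select_box((10, 10, 50, 90), (12, 12, 52, 92), threshold=0.5)
path_set.append(box)
```

Returning a flag alongside the box keeps the decision separate from the tracker object, so the same function can be used with any correlation tracker that can be re-initialized from a bounding box.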
Steps 16 and 17 close the IoU threshold condition and the per-frame loop for the camera, respectively. In step 18, the final path set is returned as the output of the algorithm.
In the cross-camera tracking algorithm, step 1 takes as input frame i of the initial j-th camera, where the operator selects the suspicious person. In step 2, the k-th camera is selected as the connecting camera based on the camera metadata. In step 3, the bbox is the selection of the suspicious person used for tracking.
Step 4 gives the output format of the algorithm: a set of bounding boxes for the suspicious person. In step 5, tracking of the suspicious person starts in the initial camera C_j. In step 6, the access area is looked up in the camera metadata, so that continuous trajectory tracking of the suspicious person is maintained.
Step 8 shows the function call that the algorithm triggers when the suspicious person disappears. It takes as input the bbox calculated for the person and the access areas that connect to that camera according to the camera metadata. In step 9, C_k is the new camera area where the person is expected to be tracked. In step 10, once camera C_k has been decided, the algorithm accesses the frames provided by that camera.
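The hand-off in steps 8 to 10 can be sketched as a lookup in the camera metadata. The dictionary layout, the camera names, the pixel coordinates and the helper `next_camera` below are hypothetical stand-ins for the metadata file; the rule of matching the centre of the last bbox against an access area is also an assumption.

```python
from typing import Dict, List, Optional, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format

# Hypothetical metadata: for each camera, the access areas inside its view
# and the camera that each access area leads to.
CAMERA_METADATA: Dict[str, List[dict]] = {
    "C1": [
        {"area": (0, 0, 100, 480), "next": "C2"},
        {"area": (540, 0, 640, 480), "next": "C3"},
    ],
    "C2": [
        {"area": (540, 0, 640, 480), "next": "C1"},
    ],
}


def center(box: Box) -> Tuple[float, float]:
    """Centre point of a bounding box."""
    return ((box[0] + box[2]) / 2.0, (box[1] + box[3]) / 2.0)


def next_camera(current: str, last_bbox: Box) -> Optional[str]:
    """Return the camera connected to the access area in which the
    person was last seen, or None if no access area matches."""
    cx, cy = center(last_bbox)
    for entry in CAMERA_METADATA.get(current, []):
        x1, y1, x2, y2 = entry["area"]
        if x1 <= cx <= x2 and y1 <= cy <= y2:
            return entry["next"]
    return None
```

For example, `next_camera("C1", (560, 100, 620, 300))` selects `"C3"` in this toy layout, because the centre of the last bbox lies in the right-hand access area of C1.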
In step 11, the previously selected suspicious person is identified by the two-stage gait recognition strategy described in Section 2. Finally, in step 12, bbox tracking of the suspicious person continues.
In this section, we present several experiments conducted according to the methodology proposed in this paper.
The system configuration used for the experiments is given in Table 3. Table 4 gives the camera configuration used for the experiments in this paper. As shown in Figure 11, the dataset considered in this paper was recorded using a camera network in the entrance area of the department building. Three cameras were used (camera 1, camera 2 and camera 3); each covered a different path, and their views did not overlap.
In this experiment, we considered three different scenes, with camera 1 and camera 2 in the corridor and camera 3 inside a classroom. The scene shows two students walking together. In the test scenario, the same students appear in different clothes.
In Figure 12, a different scenario was considered, in which the dataset was taken from Camera 4 and Camera 5 inside the Cloud Innovation School (CIS) and from Camera 8 in its surrounding area. Camera 4 and Camera 5 recorded the training data, and Camera 8 was used for the test, which includes four students and occlusion effects. In Figure 13, we considered a laboratory environment, in which we first checked the bounding box detection capability of YOLO.
YOLO was found to be better than other detection methods, but it does not necessarily detect the object in every frame, for example when flashing effects occur. Because of such limitations, YOLO was not able to track all objects successfully. The CF was observed to be better at tracking objects, even without training.
The frame became smaller and smaller as the object approached, which may have led to loss of the object in some cases, so this limitation needed to be overcome. The object was also lost when it encountered an obstacle. Here, the CF was found to be insufficiently trained to cope with obstacles, which may reduce robustness. The CF also lost the target when severe changes in object size occurred during movement. Technically, the CF may have missed the target during tracking because of heavy shadowing, or because the wrong template was updated owing to the scale relationship.
These reasons led us to devise a new method to achieve better results. Even if two objects overlap each other, they are separated by different bounding boxes and categorized separately. Therefore, the target frame is continuous and stable. In Figure 15a, as stated before, the correlation filter (CF) was not able to overcome the limitation in handling object size during detection and tracking. Therefore, if the object is in continuous movement, the bounding box lags behind the path, as shown in the tracking results. It can easily be seen that the object whose bounding box previously lagged behind is now completely aligned with its box and is detected and tracked accurately.
Hence, the limitation of size stability was resolved. In Figure 16, a new scenario is considered for detection and tracking across cameras. Algorithm 2 was used to recognize the gait features of all people and to label each person uniquely based on these features. Once the features were detected, as shown in Figure 16b and subsequent images, the algorithm started tracking the detected people. Even when these people changed positions, their identities and tracking were maintained. In Figure 17, we considered a new scenario that uses the model of Figure 9 for recognition and tracking across multiple camera views.
Using Algorithm 2, once the features have been recognized and tracked in CAM1 (Figure 16), tracking continues in CAM2, as shown in Figure 17a, where gait features are fetched to uniquely identify a person by his posture even if he changes clothes and hat. In Figure 17b, once the gait features have been fetched, the algorithm recovers the identity previously assigned to each person and continues to track them in the subsequent images. This tracking across multiple camera views allows us to track a suspicious person within the Tunghai University campus and to generate, at the end, a map of the path along which the suspicious person was detected, recognized and tracked.
This helps to resolve any incident that has occurred, and it supports precautionary measures to avoid harm or damage in the future, for the safety of the campus environment. Table 5 lists the IoU values, where IoU measures the overlap between the predicted bounding box and the ground truth bounding box. In Figure 18a, the mean average precision (mAP) is used to calculate the accuracy of object detection. It can be represented mathematically as:
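Assuming the standard definitions of these two quantities, they can be written as below. The symbols are introduced here for illustration and are not taken from the paper: B_p is the predicted box, B_gt the ground truth box, p_c(r) the precision of class c at recall r, and C the number of classes.

```latex
\mathrm{IoU}(B_p, B_{gt}) = \frac{\lvert B_p \cap B_{gt} \rvert}{\lvert B_p \cup B_{gt} \rvert},
\qquad
\mathrm{AP}_c = \int_0^1 p_c(r)\,\mathrm{d}r,
\qquad
\mathrm{mAP} = \frac{1}{C} \sum_{c=1}^{C} \mathrm{AP}_c
```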
For YOLOv3 the mean average precision is given as 0. In Figure 18b, the learning curve for gait classification using the LSTM model is evaluated over epochs. These statistics represent the accuracy of the various methods used in the methodology section. The target scenario of pedestrian tracking across cameras is that human objects can be recognized and tracked when they appear in different cameras. To assess the efficacy of the proposed method, precision is defined as the ratio of people who are correctly recognized when they appear in different cameras. For cross-camera scenarios in our test environment, the mean average precision is measured as 0.
Several well-known databases, such as Duke [ 34 ] and MARS [ 35 ], are used for evaluating multiple object tracking algorithms in batch mode. This means that images of human objects must be collected and trained on in advance. For suspicious detection and tracking, it is not possible to collect and train on the image data before a criminal event happens. The MOTChallenge is the leading platform for evaluating the performance of multiple object tracking algorithms.
It consists of 11 datasets with a total of frames or s of video. For performance evaluation of online object tracking algorithms, nine evaluation indicators of MOT15 are measured, defined as follows [ 37 , 38 ]. Frag: the number of track fragmentations, where a track is interrupted by a missed detection.
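The Frag indicator can be illustrated for a single ground-truth track with a short sketch. It rests on an assumption that the text does not state: a missed detection counts as a fragmentation only when the track is picked up again later, so a miss at the very end of the sequence is not counted. The function name `count_fragmentations` and the boolean per-frame input are hypothetical.

```python
from typing import Sequence


def count_fragmentations(tracked: Sequence[bool]) -> int:
    """Count interruptions of one track.

    tracked[i] is True when the target was matched in frame i.
    Only tracked-to-missed transitions that occur between the first
    and the last matched frame are counted.
    """
    hits = [i for i, matched in enumerate(tracked) if matched]
    if not hits:
        return 0
    first, last = hits[0], hits[-1]
    frag = 0
    for i in range(first + 1, last + 1):
        if tracked[i - 1] and not tracked[i]:
            frag += 1
    return frag
```

Summing this count over all ground-truth tracks gives a sequence-level value under the same assumption.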
Several popular online tracking approaches were selected for comparison on human object detection and tracking. The results and comparisons are shown in Table 6.
It also works well for other indicators. Values in bold in Table 6 are the best results among the compared approaches.
CamNeT is a non-overlapping camera network tracking dataset for evaluating multi-target tracking algorithms [ 44 ]. The dataset is composed of five to eight cameras covering both indoor and outdoor scenes at a university. It has six scenarios, and each scenario video lasts at least five minutes.
Because of the low quality of its image frames, only five feature points can be used for re-identification estimation, in contrast to our dataset, which can capture 14 posture points per image frame. Table 7 shows the precision results of some experiments on CamNeT. Object tracking applications usually adopt probability-based algorithms such as the Kalman filter, the correlation filter, or combinations or variants of these two methods. The Kalman filter uses previous object movement tracks to predict the probabilities of the next object locations.
However, it often fails in cases of missed detection caused by occlusion, overlapping humans, varying illumination and the like. The correlation filter method, on the other hand, tries to find the most similar bounding boxes between consecutive image frames. Theoretically, the Kalman filter is suitable for simple use cases such as radar-based object tracking, where images consist of the target object and noise signals. To meet the preconditions for using the Kalman filter, image pre-processing would be costly, since it would have to obtain high-quality images from cameras across different kinds of scenes and scenarios.
In addition, offline testing results show that the correlation filter performs much better than the Kalman filter [ 45 , 46 ]. These are the main reasons why STAM-CCF chooses the correlation filter, instead of the Kalman filter, as the baseline for the design of suspicious tracking within a camera. Although the correlation filter performs better than the Kalman filter in cross-camera tracking scenarios, it also faces the problem of missed detection. Missed detection causes tracking of the suspicious person to fail over many image frames, and the suspicious person is identified again only under lucky conditions. In our offline experiments, in most such cases the filter mismatched the human objects and failed to track suspicious persons under missed detection conditions.
This means that additional complementary functions are needed to solve the missed detection problem. Fortunately, the problem can be addressed by leveraging YOLO object detection to obtain the nearest bounding box with maximum response. After overcoming the within-camera tracking problems, STAM-CCF is left with the problem of re-identifying the suspicious person across multiple cameras.
That is, STAM-CCF has to compare the first human image frames from each camera to identify which human object is the tracked suspicious person. Intuitively, only geographically adjacent cameras can be candidate cameras, because human movement speed is currently limited to 36 km per hour. Hence, the relative location information of each camera is used to compute the likelihood that the suspicious person appears there, which greatly reduces the computation resources and cost. Finally, to re-identify the suspicious person among the candidate cameras, posture and gait features are used to compute the similarity of the first images within YOLO bounding boxes.
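The speed-based pruning of candidate cameras can be sketched as follows. Only the 36 km per hour limit comes from the text; the camera coordinates in metres, the straight-line distance and the function name `candidate_cameras` are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple

MAX_SPEED_KMH = 36.0                             # limit quoted in the text
MAX_SPEED_MS = MAX_SPEED_KMH * 1000.0 / 3600.0   # 10 metres per second


def candidate_cameras(
    positions: Dict[str, Tuple[float, float]],
    source: str,
    elapsed_s: float,
) -> List[str]:
    """Cameras that a person could have reached from `source`
    within `elapsed_s` seconds, by straight-line distance."""
    sx, sy = positions[source]
    reach = MAX_SPEED_MS * elapsed_s
    return sorted(
        cam
        for cam, (x, y) in positions.items()
        if cam != source and math.hypot(x - sx, y - sy) <= reach
    )


# Hypothetical camera layout in metres.
positions = {"C1": (0.0, 0.0), "C2": (30.0, 40.0), "C3": (300.0, 400.0)}
nearby = candidate_cameras(positions, "C1", elapsed_s=6.0)
```

With this layout, six seconds after the person leaves C1 only C2 (50 m away) is within the 60 m reach, so C3 does not need to be searched yet.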
STAM-CCF has two major procedures. The first is to track the suspicious person within a camera, and the other is to compute the similarity between the first YOLO bounding box images from different cameras. For within-camera tracking, the correlation filter always suggests the bounding box with maximum response.
To ensure the accuracy of object tracking, the YOLO bounding box with the largest IoU value is used to adjust the correlation filter bounding box. The assumption is that a human has a limited movement speed and cannot move farther than a specific distance within two to five image frames when the camera runs at 15 fps or more. Based on the rule, the IoU larger than 0. Applied recursively, this keeps the suspicious person tracked correctly within a camera.
Even though the experimental results show that STAM-CCF performs well in most use cases, there are still limitations and exceptions that should be handled within a camera. Take the overlapping case as an example: the suspicious person is correctly identified in the previous image frame, but then two humans overlap. If the suspicious person is hidden in the background and another human is in the foreground, there will be only one YOLO bounding box with the largest IoU value. The implementation should handle such exceptions by counting and keeping the number of YOLO bounding boxes as long as no human object is standing near the boundary of the image frame.
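One possible reading of this counting rule is sketched below: if fewer YOLO boxes are detected than in the previous frame, and nobody was close enough to the frame boundary to have walked out of view, an overlap is suspected and the previous count should be kept. The margin of 20 pixels, the frame size and the function names are assumptions, not values from the paper.

```python
from typing import List, Tuple

Box = Tuple[float, float, float, float]  # (x1, y1, x2, y2), assumed format


def near_boundary(box: Box, width: float, height: float, margin: float) -> bool:
    """True when the box touches the margin band along the frame edges."""
    x1, y1, x2, y2 = box
    return (
        x1 <= margin
        or y1 <= margin
        or x2 >= width - margin
        or y2 >= height - margin
    )


def occlusion_suspected(
    prev_boxes: List[Box],
    curr_boxes: List[Box],
    width: float,
    height: float,
    margin: float = 20.0,
) -> bool:
    """A drop in the number of boxes is attributed to overlap only if
    no person in the previous frame could have left the view."""
    if len(curr_boxes) >= len(prev_boxes):
        return False
    return not any(near_boundary(b, width, height, margin) for b in prev_boxes)
```

When the function returns True, a tracker could keep the stored person count and continue with the correlation filter box instead of switching to the single remaining YOLO box.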
The other major procedure is to compute the similarity of posture and gait features for YOLO bounding boxes. Because of the cold-start problem, STAM-CCF will not have a large enough data set of the suspicious person's posture and gait features and, thus, may be error-prone. For example, a human may stand at different angles to the camera. To solve this problem, with the help of the camera location information, only humans in the connection area are taken as candidates for similarity comparison.
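The candidate filtering and similarity comparison can be sketched as below. The text does not say which similarity measure is used, so cosine similarity is an assumption here, as are the feature vectors, the `in_connection_area` flag, the acceptance threshold of 0.8 and the function names.

```python
import math
from typing import List, Optional, Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two feature vectors (0.0 if either is zero)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def reidentify(
    target: Sequence[float],
    candidates: List[dict],
    min_similarity: float = 0.8,
) -> Optional[str]:
    """Return the id of the most similar candidate standing in the
    connection area, or None if no candidate reaches the threshold."""
    best_id: Optional[str] = None
    best_score = min_similarity
    for cand in candidates:
        if not cand["in_connection_area"]:
            continue  # skip people outside the connection area
        score = cosine_similarity(target, cand["features"])
        if score >= best_score:
            best_id, best_score = cand["id"], score
    return best_id
```

Restricting the comparison to people in the connection area means a person with near-identical features elsewhere in the frame is never matched, which is the point of using the camera location information.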
Suspicious tracking across multiple cameras is in strong demand for application scenarios such as human safety assurance, intruder detection and alarm, criminal tracking and the like. To enable multiple-camera tracking and to overcome the obstacles that mainly come from cold-start problems and low-quality images, the proposed STAM-CCF combines correlation filters, YOLO object detection, and posture and gait features to realize cross-camera tracking.
In addition, it leverages location information to ensure good system performance by re-identifying only images from adjacent or candidate cameras. Besides sharing implementation experience, several scenarios in a university were designed to test the feasibility and performance of STAM-CCF. Owing to limited resources and a tight schedule, other application scenarios, such as factories and smart cities, remain to be deployed and tested in the near future.
Sensors (Basel). Published online Jul 9. Received Jun 1; Accepted Jul 4.
Abstract: There is strong demand for real-time suspicious tracking across multiple cameras in intelligent video surveillance for public areas, such as universities, airports and factories.
Keywords: suspicious tracking, surveillance, multi-camera tracking, feature based tracking.
STAM-CCF Objectives: Recently, several surveillance techniques have used deep-learning based object detection algorithms that help to count new objects or registered people. The STAM-CCF objectives can be presented as follows. Solving cold-start problems of tracking suspicious persons: suspicious tracking is the crucial factor in any surveillance system.
STAM-CCF: Suspicious Tracking Across Multiple Camera Based on Correlation Filters
Literature Survey: Previous studies on within-camera tracking mainly focus on one-camera application scenarios and aggregate all video streams in a grid screen monitored by a human.
Materials and Methods: In this section, we plan the design by deciding the criteria for camera positions within the departmental building and for scanning the cameras using an optimal tree traversal, which avoids scanning all cameras and improves performance.
Detection and Tracking Model: As shown in Figure 5, in the human tracking flowchart (a), a recorded video from a particular day that is to be inspected is selected.
Suspicious Person Tracking: As shown in Figure 6, the objective of this diagram is to explain specifically how the selected suspicious person is detected and tracked within the system.
Two-stage Strategy for Gait Recognition: In Figure 8, long short-term memory (LSTM) is used in conjunction with a gait energy image [ 32 ] to learn features with temporal information and to perform cross-view gait recognition.
Cross Camera Metadata: As shown in Figure 9 and Figure 10, the camera map and the metadata in XML file format are presented, respectively.
Table 3. System Configuration.
Table 4. Camera Configuration.
Table 5. IoU Threshold.
Table 7. Precision results on CamNeT.
Author Contributions: Methodology, R.
Funding: This research received no external funding.
Conflicts of Interest: The authors declare no conflict of interest.
References 1. Natarajan P. ACM Trans. TOMM ; 11 doi: Tripathi R. Suspicious human activity recognition: A review.
Akdemir U. Chuang C. Carried object detection using ratio histogram and its application to suspicious event analysis. IEEE Trans. Circuit Syst. Video Technol. Ryoo M. Stochastic representation and recognition of high-level group activities. Ibrahim N. Sujith B. Crime detection and avoidance in ATM: A new framework. Valera M. Intelligent distributed surveillance systems: A review. IEE Proc. Vision Image Signal. Morris B. A survey of vision-based trajectory learning and analysis for surveillance.
Circuits Syst. Abidi B. Survey and analysis of multimodal sensor planning and integration for wide area surveillance. ACM Comput. Javed O. Image and Video Processing. Aghajan H. Multi-Camera Networks Principles and Applications. Kim H. Seema A. Towards efficient wireless video sensor networks: A survey of existing node architectures and proposal for a flexi-WVSNP design. IEEE Commun. Roy-Chowdhury A. Camera networks: The acquisition and analysis of videos over wide areas. Tavli B. A survey of visual sensor network platforms. Tools Appl. Vezzani R. People reidentification in surveillance and forensics: A survey.
Song M. Tan Y.