Convolutional Neural Network for the automated detection of PFM-1 butterfly mines in orthomosaics

Overview

In our study, we deployed the Faster Regional-CNN (Faster R-CNN) [26]. This type of CNN has been applied successfully across remote sensing, from detecting maize tassels to airplanes to gravity waves [27,28,29]. We chose it because it detects small objects faster and more accurately than R-CNNs [30], Fast R-CNNs [31], Spatial Pyramid Pooling-Nets [32], and “You Only Look Once” (YOLO) networks [33,34,35].

To automate the detection and mapping of the PFM-1 landmines, the CNN was trained and tested twice. The first time, the training data consisted of 165 RGB images cropped from six orthophotos. The orthophotos came from three flights over the same 10 × 20 m rubble environment and three flights over the same 10 × 20 m grass environment. Both the grass and rubble datasets were collected in fall 2019 and contain 28 PFM-1 mines, four KSF-Casings, and two KSF-Caps scattered throughout the field. All training and testing was done on a dual-socket Intel(R) Xeon(R) Silver 4114 CPU @ 2.20 GHz with 128 GB of RAM and a Titan V GPU with 12 GB of RAM. The CNN took 37 min to train over 50 epochs.

[Figure: CNN workflow diagram (WorkfloW_CNN_V2-05.png)]

We tested the first model on withheld imagery of a 10 × 20 m rubble environment: the same site as one of the training environments, but flown in fall 2017, two years before the training data. The CNN was tested on 18 images and took 1.87 s to produce a 0.7030 average precision (AP) for the PFM-1, a 0.7273 AP for the KSF-Casing, and a mean AP of 0.7152 (Table 2). The second time, the training data consisted of a randomly selected 70% of the total images (128 RGB images), while the testing data consisted of the remaining 30% (55 RGB images). This model took 29 min to train over 50 epochs (Figure 10). Testing took 5.47 s and produced a 0.9983 AP for the PFM-1, a 0.9879 AP for the KSF-Casing, and a mean AP of 0.9931, as shown in Table 2.
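The 70/30 split and the mean AP values quoted above can be checked with a short sketch. Only the image counts and per-class AP values come from the text. The file names, the fixed seed, and the helper names `split_dataset` and `mean_ap` are illustrative assumptions, not the authors' code.

```python
import random

def split_dataset(items, train_fraction=0.7, seed=0):
    """Randomly split items into (train, test) lists.

    The seed is an assumption for reproducibility; the source only
    says the 70% sample was randomly selected.
    """
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]

def mean_ap(per_class_ap):
    """Mean average precision: the unweighted mean of per-class AP."""
    return sum(per_class_ap.values()) / len(per_class_ap)

# 165 training crops + 18 withheld test images = 183 images in total.
# File names are hypothetical placeholders.
images = [f"crop_{i:03d}.png" for i in range(183)]
train, test = split_dataset(images)
print(len(train), len(test))  # 128 55

first_run = {"PFM-1": 0.7030, "KSF-Casing": 0.7273}
second_run = {"PFM-1": 0.9983, "KSF-Casing": 0.9879}
print(mean_ap(first_run), mean_ap(second_run))
```

Averaging the two per-class values gives the mean AP figures reported in Table 2 to within rounding.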

Our CNN took 1.87 s to detect scattered PFM-1 landmines in a 10 × 20 m minefield. At that rate, inspecting one square kilometer would take about 2 h and 36 min ((1.87 s/200 m²) × 1,000,000 m² = 9350 s ≈ 2 h and 36 min), with a 71.5% accuracy of landmine identification. Each flight over a 10 × 20 m minefield took 3 min and 30 s.
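The extrapolation above is a linear scaling of inference time with area. A minimal sketch of that arithmetic follows; the function names are illustrative, and the assumption that detection time scales linearly with area is the one implied by the text.

```python
def inference_time_for_area(seconds_per_plot, plot_area_m2, target_area_m2):
    """Scale a measured inference time linearly to a larger area."""
    return seconds_per_plot / plot_area_m2 * target_area_m2

def format_hms(total_seconds):
    """Format a duration in seconds as hours, minutes, and seconds."""
    total = round(total_seconds)
    hours, remainder = divmod(total, 3600)
    minutes, seconds = divmod(remainder, 60)
    return f"{hours} h {minutes} min {seconds} s"

# 1.87 s for a 10 x 20 m plot, scaled to one square kilometer.
seconds = inference_time_for_area(1.87, 10 * 20, 1_000_000)
print(round(seconds), format_hms(seconds))  # 9350 2 h 35 min 50 s
```

The result, 2 h 35 min 50 s, rounds to the 2 h and 36 min quoted in the text. It covers CNN inference only, not flight time.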

After increasing the volume of training and testing data and diversifying it in terms of environmental conditions, landmine orientation in three-dimensional space, host environments, and presence of clutter, we improved our CNN to a 92.0% accuracy of landmine identification. Additionally, the CNN can now return centimeter-accurate coordinates of where suspect mines may lie. The final step toward perfecting the CNN is to train and test it on active minefields in Afghanistan to see how effective it is in the field.
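The text does not say how detections are turned into coordinates. One common approach, sketched here purely as an assumption, is to map the center of each predicted bounding box from pixel space to map space with the orthomosaic's affine geotransform. The six-element layout follows the GDAL convention; the origin, the 0.5 cm pixel size, and the bounding box are hypothetical values.

```python
def pixel_to_map(col, row, gt):
    """Apply a GDAL-style affine geotransform to a pixel position.

    gt = (x_origin, pixel_width, row_rotation,
          y_origin, column_rotation, pixel_height)
    pixel_height is negative for a north-up image.
    """
    x = gt[0] + col * gt[1] + row * gt[2]
    y = gt[3] + col * gt[4] + row * gt[5]
    return x, y

def bbox_center_to_map(bbox, gt):
    """Map the center of a (xmin, ymin, xmax, ymax) pixel box to map units."""
    xmin, ymin, xmax, ymax = bbox
    return pixel_to_map((xmin + xmax) / 2, (ymin + ymax) / 2, gt)

# Hypothetical north-up orthomosaic: 0.5 cm pixels, made-up origin in meters.
geotransform = (500000.0, 0.005, 0.0, 4650000.0, 0.0, -0.005)
# Hypothetical detection box in pixel coordinates.
detection = (1000, 2000, 1040, 2024)

easting, northing = bbox_center_to_map(detection, geotransform)
print(f"{easting:.3f} {northing:.3f}")
```

In this sketch the positional accuracy of a reported mine depends on the orthomosaic's ground sampling distance and georeferencing quality, not on the detector alone.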

Description adapted from the paper below:
Baur J, Steinberg G, Nikulin A, Chiu K, de Smet TS. Applying Deep Learning to Automate UAV-Based Detection of Scatterable Landmines. Remote Sensing. 2020 Jan;12(5):859.

Project Authors: Gabriel Steinberg, Jasper Baur, Alex Nikulin, Kenneth Chiu and Timothy de Smet