

Int J Fire Sci Eng > Volume 37(2); 2023 > Article
Lee, Jeong, and Jung: Development of a Forest Fire Detection System Using a Drone-based Convolutional Neural Network Model


Because forest fires cause environmental destruction, ecosystem collapse, and severe damage to human lives and nature, developing a real-time, accurate, and stable forest fire detection system has become a critical issue in modern society. In this study, a drone-based forest fire detection system was developed using a convolutional neural network (CNN) model. Real-time forest fire detection models were developed using the CNN-based MobileNet algorithm, and their fire detection performance was evaluated. The results indicated that errors decreased and accuracy tended to increase during model training and validation as training progressed. Moreover, among the MobileNet V1, V2, and V3 models, the V1 model exhibited the highest validation accuracy of 0.9466 and the highest accuracy of 0.9667 on the new test dataset used for model evaluation.

1. Introduction

Recently, owing to the worsening global warming effect, an increase in dry days, and human carelessness worldwide, large forest fires have continuously occurred in many countries [1]. Figure 1 shows a domestic forest fire [2]. Forest fires occur in forest areas and cause serious damage, such as human casualties and the destruction of the natural environment. Large forest fires may lead to the long-term devastation of human lives and the natural environment owing to ecosystem destruction [3]. Figure 2 shows the status of domestic forest fire damage. An average of 537 forest fires occurred each year from 2013 to 2022, resulting in an average loss of 3,560 ha of forest area. These statistics show that forest fires have been continuously occurring in Korea over the past ten years, increasing the total damage area [4]. Therefore, early forest fire detection is crucial for fire prevention.
The conventional forest fire detection method mainly depends on visual monitoring by forest inspectors. However, this method requires considerable manpower, has high costs, and is limited in its ability to monitor a wide area in real time [5]. Recently, forest fire detection systems that use drone-based convolutional neural network (CNN) models have attracted attention. In addition to their low cost, drones can effectively search a wide area and allow real-time analysis by collecting image data [6]. In this regard, CNN models have been used in several studies to classify fire and non-fire incidents, and various algorithms have been applied to effectively detect fires and identify their locations. Roh et al. conducted a comparative analysis of fire detection and warning systems using edge computing technology. They attempted to improve fire safety using different CNN-based image classification models, including MobileNetV2, ResNet101, and EfficientNetB0. Their results showed that EfficientNetB0 exhibited the highest performance in classifying fire and non-fire incidents with an accuracy of 95.2%, followed by ResNet101 at 95.0% and MobileNetV2 at 91.8% [7]. A literature review also shows that previous fire detection studies mostly focused on fire and non-fire data; however, smoke data also need to be considered for early forest fire detection. Therefore, a combination of drones and CNN models was used in this study to develop a system that can detect and effectively respond to forest fires in their early stages with high accuracy. The most suitable model for forest fire detection was selected through an analysis of various CNN models and then used to design and verify a system that detects forest fires in real time using drone-based data collection and deep learning technology.
The developed system enables the real-time detection and monitoring of forest fires as well as rapid and effective response to forest fires by providing information on the fire situation through images for decision-making on limiting damage caused by forest fires.

2. Theoretical Background

2.1 Convolutional Neural Network

CNN models are algorithms based on deep learning technology and are mainly applied in image analysis fields such as face recognition, autonomous driving, and signal analysis. Figure 3 shows the architecture of a CNN model [8]. A CNN model uses two-dimensional images as input data, extracts the unique features of the input images using convolution operations, and classifies the objects in the input images using the extracted features [9]. A CNN model consists of a feature extraction stage and one or more fully connected layers. The feature extraction stage consists of multiple convolution layers and pooling layers. The convolution layers apply filters to the input images to extract their features, which are passed to the pooling layers after nonlinearity is introduced through activation functions such as ReLU [10]. The pooling layers reduce the size of the feature maps extracted by the convolution layers. The features that have passed through the convolution and pooling stages are converted from a two-dimensional array into a one-dimensional array through flattening and transferred to the fully connected layers [11]. A CNN model that follows this process exhibits high performance when two-dimensional images are used as input data. In this study, the fire detection performance was analyzed using CNN models, which are expected to effectively extract the features of fire images collected using real-time fire monitoring imaging equipment. A deep learning model-based fire detection system was then designed based on an analysis of the performance results.
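The convolution, ReLU, pooling, and flattening steps described above can be illustrated with a small pure-Python sketch. It is not the model used in this study: the 6 × 6 image, the 3 × 3 vertical-edge filter, and all function names are hypothetical values chosen only to make the data flow visible.

```python
# Toy sketch of convolution -> ReLU -> max pooling -> flatten.
# The image, the filter, and all function names are illustrative assumptions.

def conv2d(image, kernel):
    """2D convolution as used in CNNs (no padding, stride 1)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h = len(image) - kh + 1
    out_w = len(image[0]) - kw + 1
    return [
        [
            sum(image[i + di][j + dj] * kernel[di][dj]
                for di in range(kh) for dj in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Replace negative responses with zero to introduce nonlinearity."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool2x2(feature_map):
    """Keep the largest value in each non-overlapping 2 x 2 window."""
    return [
        [max(feature_map[i][j], feature_map[i][j + 1],
             feature_map[i + 1][j], feature_map[i + 1][j + 1])
         for j in range(0, len(feature_map[0]) - 1, 2)]
        for i in range(0, len(feature_map) - 1, 2)
    ]

def flatten(feature_map):
    """Convert the 2D feature map into a 1D vector for a dense layer."""
    return [v for row in feature_map for v in row]

# Hypothetical 6 x 6 image: dark left half (0), bright right half (1).
image = [[0, 0, 0, 1, 1, 1] for _ in range(6)]
# Hypothetical 3 x 3 filter that responds to dark-to-bright vertical edges.
kernel = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]

features = conv2d(image, kernel)   # 4 x 4 feature map
activated = relu(features)         # same size, negatives clipped
pooled = max_pool2x2(activated)    # 2 x 2 feature map
vector = flatten(pooled)           # 1D vector of length 4
```

In a trained CNN the filter values are learned rather than fixed by hand, and many filters are applied in parallel; the sketch only shows how the array sizes shrink from 6 × 6 to 4 × 4 to 2 × 2 before flattening.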

2.2 MobileNet model

MobileNet has a small operation quantity despite its relatively deep neural network structure compared to typical CNN models because it uses separable convolution layers, which extract image features with far fewer operations than standard convolution layers. Separable convolution layers are a type of convolution layer that splits a standard convolution into two steps. The first step uses depthwise convolution layers, which apply a separate filter to each input channel (for example, each RGB channel) instead of summing over all channels in every filter, thereby decreasing the number of parameters and the operation quantity. The second step uses pointwise convolution layers, which apply 1 × 1 convolutions to combine the channel outputs and adjust the depth, while maintaining nonlinearity using ReLU [12,13]. Figure 4 shows the architecture of MobileNet, which uses separable convolution layers [14]. With this structure, MobileNet was developed to enable real-time execution and efficient model inference in resource-limited environments such as mobile devices and drones. Owing to its small model size and low operation quantity, it can run efficiently on the drones used in this study and provides the high inference speed required to make prompt decisions in real time. Therefore, MobileNet is suitable for vision-based detection tasks such as forest fire detection. In situations where the early detection of forest fires and rapid response are required, the lightweight model provides high performance in image processing. An experiment was therefore performed in this study to compare the MobileNet V1, MobileNet V2, and MobileNet V3 models, and the performance of the three models was evaluated through vision-based detection.
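The saving from separable convolution layers can be made concrete by counting weights and multiplications. The sketch below assumes a hypothetical layer (3 × 3 filters, 32 input channels, 64 output channels, 112 × 112 output feature map) and ignores bias terms; these values are illustrative and are not taken from the models trained in this study.

```python
# Parameter and multiplication counts for one convolution layer.
# Layer sizes are hypothetical; bias terms are ignored for simplicity.

def standard_conv_cost(k, c_in, c_out, out_h, out_w):
    """Standard convolution: every output channel filters every input channel."""
    params = k * k * c_in * c_out
    mults = params * out_h * out_w
    return params, mults

def separable_conv_cost(k, c_in, c_out, out_h, out_w):
    """Depthwise (one k x k filter per input channel) plus pointwise (1 x 1)."""
    depthwise_params = k * k * c_in
    pointwise_params = c_in * c_out
    params = depthwise_params + pointwise_params
    mults = params * out_h * out_w
    return params, mults

std_params, std_mults = standard_conv_cost(3, 32, 64, 112, 112)
sep_params, sep_mults = separable_conv_cost(3, 32, 64, 112, 112)

# The cost ratio equals 1/c_out + 1/k^2, here 1/64 + 1/9 (about 0.127).
ratio = sep_params / std_params

print(f"standard:  {std_params} weights, {std_mults} multiplications")
print(f"separable: {sep_params} weights, {sep_mults} multiplications")
print(f"ratio:     {ratio:.4f}")
```

For 3 × 3 filters the ratio approaches 1/9 as the number of output channels grows, which is why the separable layer needs roughly eight to nine times fewer operations than a standard layer of the same shape.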

3. Experiment

3.1 Fire detection model development process

The steps for the development of the fire detection model shown in Figure 5 are as follows:
• A hardware and software environment suitable for developing a forest fire detection model was constructed.
• Open data were collected for forest fire detection, with a focus on fire, non-fire, and smoke images.
• For CNN model training, image preprocessing, such as data augmentation and image size normalization, was performed in a form suitable for the input layer.
• MobileNet models were then constructed using the TensorFlow library, with their performance examined through training, validation, and evaluation processes.

3.2 Experimental setup

Table 1 shows the specifications of the main equipment used in the experiment. In terms of hardware, a high-performance computer with a 2.20 GHz Intel® Xeon® Silver 4210 CPU, an NVIDIA GeForce RTX 3090 GPU, and 192 GB of RAM was used. The software used in the model implementation included TensorFlow 2.5.0, Python 3.8.0, CUDA 11.2, and OpenCV 4.6.0. The drone on which the real-time fire detection CNN models were to be deployed was equipped with an ultra-small HD camera, and its flight was controlled using a smartphone via a Wi-Fi connection.

3.3 Data collection

Figure 6 shows examples of the fire-, smoke-, and non-fire-related datasets. A total of 4,800 images were collected from open datasets such as Kaggle and used as input data for MobileNet [15-17]. Of these, 4,500 images were assigned to the training and validation dataset of the fire detection model, and the remaining 300 images were used as the evaluation dataset. Of the 4,500 images, the training and validation data comprised 1,200 images (80%) and 300 images (20%), respectively, for each of the fire, smoke, and non-fire classes. Of the 300 images set aside for model evaluation, 100 images each of the fire, smoke, and non-fire classes were used as test data.
Figure 7(a) shows a fire image, and Figure 7(b) shows a non-fire image that can be mistaken for fire. The non-fire data collected from Kaggle comprised 1,000 typical non-fire images and 500 images that can be mistaken for fire owing to lighting or sunset. Collecting data with such varied characteristics leads to higher performance in correctly identifying non-fire incidents, even in situations that can be mistaken for fire. Therefore, the fire detection model developed in this study is expected to exhibit high performance even in unwanted fire alarm situations, which have attracted recent social interest.
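The dataset split described in this section can be checked with simple arithmetic. The sketch below uses only the counts stated in the text (4,800 images, three classes, 100 test images per class, and an 80%/20% training/validation split); the variable names are illustrative.

```python
# Arithmetic check of the dataset split described in Section 3.3.
# All counts come from the text; the names are illustrative.

CLASSES = ("fire", "smoke", "non_fire")
TOTAL_IMAGES = 4800
TEST_PER_CLASS = 100
TRAIN_PERCENT = 80  # remaining 20% is used for validation

per_class = TOTAL_IMAGES // len(CLASSES)                        # 1,600
train_val_per_class = per_class - TEST_PER_CLASS                # 1,500
train_per_class = train_val_per_class * TRAIN_PERCENT // 100    # 1,200
val_per_class = train_val_per_class - train_per_class           # 300

split = {
    name: {"train": train_per_class, "val": val_per_class, "test": TEST_PER_CLASS}
    for name in CLASSES
}

# The non-fire training/validation images combine typical non-fire scenes
# with fire-like scenes (lighting, sunset), as stated in the text.
non_fire_typical, non_fire_fire_like = 1000, 500

for name, counts in split.items():
    print(name, counts)
```

The 1,000 typical and 500 fire-like non-fire images add up to the 1,500 training and validation images of the non-fire class, so the counts given in this section are mutually consistent.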

3.4 Data augmentation

In this study, the performance of CNN-based fire detection monitoring was improved by applying data preprocessing and augmentation technologies. The data preprocessing techniques used in this study include image size adjustment and data augmentation. All images of the datasets were normalized to a size of 224 × 224 × 3 pixels to reduce the complexity of computation and ensure standardized data input. Figure 8 shows the data augmentation used to increase the diversity of image data and prevent overfitting ((a) rotating image, (b) flipping image, (c) shifting image, and (d) zooming image) [18].
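The four augmentation operations named above (rotation, flipping, shifting, and zooming) can be sketched on a tiny grayscale image stored as a nested list. This is only an illustration under simplifying assumptions: the rotation is fixed at 90°, the shift is one pixel with zero fill, and the zoom is a 2× center crop with nearest-neighbor upscaling. The text does not state the library or parameter ranges actually used in this study, so all function names and values here are hypothetical.

```python
# Toy versions of the four augmentation operations on a nested-list image.
# Angles, shift sizes, and zoom factors are illustrative assumptions.

def rotate90(img):
    """Rotate the image clockwise by 90 degrees."""
    return [list(row) for row in zip(*img[::-1])]

def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]

def shift_right(img, pixels, fill=0):
    """Shift the image to the right, padding the left edge with `fill`."""
    width = len(img[0])
    return [([fill] * pixels + row)[:width] for row in img]

def zoom_center_2x(img):
    """Crop the central half and upscale it 2x (nearest neighbor).

    Assumes the height and width are even.
    """
    h, w = len(img), len(img[0])
    top, left = h // 4, w // 4
    crop = [row[left:left + w // 2] for row in img[top:top + h // 2]]
    return [[crop[i // 2][j // 2] for j in range(w)] for i in range(h)]

image = [
    [1, 2, 3, 4],
    [5, 6, 7, 8],
    [9, 10, 11, 12],
    [13, 14, 15, 16],
]

augmented = {
    "rotated": rotate90(image),
    "flipped": flip_horizontal(image),
    "shifted": shift_right(image, 1),
    "zoomed": zoom_center_2x(image),
}
```

Each operation keeps the image size unchanged, so the augmented images can be fed to the same 224 × 224 × 3 input layer after resizing.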

3.5 Model training performance evaluation indicator

In this study, confusion matrix analysis was conducted to visualize the performance of the models and to compute multiple evaluation indicators, including accuracy, precision, recall, and F1_Score. The confusion matrix shows the number of true positive (TP), true negative (TN), false positive (FP), and false negative (FN) values, which are useful in determining the accuracy and predictive ability of a model [19]. Table 2 shows the confusion matrix for the classifier.
In this study, MobileNet-based fire detection models were trained and their performances were evaluated. The trained models were then applied to fire detection monitoring equipment to analyze the classification results. The accuracy, precision, recall, and F1_Score evaluation indicators were obtained using Eqs. (1), (2), (3), and (4), respectively. These evaluation indicators are generally used to evaluate model performance in the multiclass classification tasks of CNN models [20]. Accuracy is the ratio of the data correctly predicted by the model to the entire data, (TP + TN)/(TP + TN + FP + FN), and precision is the ratio of the true positive data to all data predicted to be positive by the model, TP/(TP + FP). Moreover, recall is the ratio of the true positive data to all actually positive data, TP/(TP + FN), and F1_Score, which is the harmonic mean of precision and recall, 2 × Precision × Recall/(Precision + Recall), evaluates the performance of the model considering both indicators [7].
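The indicators can be computed per class from a multiclass confusion matrix by treating one class as positive and the other two as negative. In the sketch below, the 3 × 3 matrix is an assumption: it was inferred from the MobileNet V1 precision and recall values in Table 3 together with the 100 test images per class, and was not read from Figure 10. The function names are illustrative.

```python
# Per-class precision, recall, and F1_Score from a 3-class confusion matrix.
# Rows are actual classes, columns are predicted classes.
# ASSUMPTION: this matrix is inferred from Table 3 (MobileNet V1), not taken
# directly from the published confusion matrix figure.

CLASSES = ["fire", "non_fire", "smoke"]
CONFUSION = [
    [99, 1, 0],    # actual fire
    [0, 100, 0],   # actual non-fire
    [6, 3, 91],    # actual smoke
]

def per_class_metrics(matrix, index):
    """Return (precision, recall, f1) with class `index` treated as positive."""
    tp = matrix[index][index]
    fn = sum(matrix[index]) - tp                 # positives predicted as another class
    fp = sum(row[index] for row in matrix) - tp  # other classes predicted as positive
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def accuracy(matrix):
    """Share of all samples that lie on the diagonal."""
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    total = sum(sum(row) for row in matrix)
    return correct / total

results = {name: per_class_metrics(CONFUSION, i) for i, name in enumerate(CLASSES)}
overall = accuracy(CONFUSION)

for name, (p, r, f1) in results.items():
    print(f"{name:9s} precision={p:.4f} recall={r:.4f} f1={f1:.4f}")
print(f"accuracy={overall:.4f}")
```

With this inferred matrix, the rounded values agree with the MobileNet V1 rows of Table 3, which suggests that the table reports one overall accuracy per model and precision, recall, and F1_Score per class.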

4. Experiment and Experimental Results

In this study, fire, smoke, and non-fire image data were used to develop a CNN model that can classify fire situations. CNN models, such as MobileNet V1, MobileNet V2, and MobileNet V3, were used to evaluate the performance of the drone-based fire detection system. As these models are based on image processing technology and can be used to identify and detect fires, the performance indicators of each model were analyzed. Figure 9 shows the training and validation results of each model. The training accuracy represents the training data prediction accuracy of the model, while the validation accuracy indicates the validation data prediction accuracy of the model. Additionally, the training loss evaluates model performance using the training data, while validation loss measures the generalization performance of the model for the unseen validation data. The graphs in Figure 9 show that the error decreased and accuracy tended to increase as training progressed for all three models. For MobileNet V1 and MobileNet V2, the error rapidly decreased, thereby increasing their accuracy. However, the performance of MobileNet V3 was degraded compared to MobileNet V1 and MobileNet V2.
Figure 10 shows the confusion matrices obtained by evaluating the models on the test dataset. Each test confusion matrix represents the prediction accuracy of the MobileNet V1, MobileNet V2, or MobileNet V3 model in classifying the 100 test images of each class as fire, non-fire, or smoke. Based on the matrix, it is possible to evaluate the model performance and identify the proportion of images correctly classified as fire, non-fire, and smoke. The x-axis of the matrix denotes the class predicted by the model, while its y-axis represents the actual class. Considering the experimental results, the MobileNet V1 model exhibited the highest performance in determining fire situations. In addition, the models generally classified the smoke class most accurately and were weaker at separating the fire and non-fire classes, because non-fire scenes such as lamps, lights, and sunsets were recognized as fire in many cases. However, the MobileNet V1 model exhibited higher accuracy because non-fire situations were recognized as fire in fewer cases. Table 3 lists the accuracy, precision, recall, and F1_Score (harmonic mean) of each model on the test dataset. When the MobileNet V1, MobileNet V2, and MobileNet V3 models were evaluated on the fire, smoke, and non-fire classes, the MobileNet V1 model determined fire situations most accurately: the overall accuracy was approximately 96% for MobileNet V1, 87% for MobileNet V2, and 61% for MobileNet V3. As seen in Table 3, MobileNet V1 exhibited the highest performance, with an overall accuracy of more than 95% and an F1_Score above 0.95 for each of the fire, non-fire, and smoke classes. Therefore, MobileNet V1 has the most suitable model structure for detecting forest fires using drones and is thus the most suitable model for use in the real-time forest fire monitoring system.
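Because the test set contains 100 images per class, the overall accuracy of each model can be cross-checked from the per-class recall values in Table 3: recall × 100 gives the number of correctly classified images in each class. The sketch below performs that check; the recall values are copied from Table 3, and the function and variable names are illustrative.

```python
# Cross-check of the overall accuracies against the per-class recalls in Table 3.
# Assumes 100 test images per class, as stated in Section 3.3.

IMAGES_PER_CLASS = 100

RECALL = {
    "MobileNet V1": {"fire": 0.9900, "non_fire": 1.000, "smoke": 0.9100},
    "MobileNet V2": {"fire": 0.8100, "non_fire": 0.8800, "smoke": 0.9200},
    "MobileNet V3": {"fire": 0.5000, "non_fire": 0.4900, "smoke": 0.8500},
}

def overall_accuracy(recalls, n_per_class):
    """Correctly classified images over all test images."""
    correct = sum(round(r * n_per_class) for r in recalls.values())
    return correct / (n_per_class * len(recalls))

accuracies = {
    model: overall_accuracy(recalls, IMAGES_PER_CLASS)
    for model, recalls in RECALL.items()
}
best_model = max(accuracies, key=accuracies.get)

for model, acc in accuracies.items():
    print(f"{model}: {acc:.4f}")
print("best:", best_model)
```

The recomputed values (290/300, 261/300, and 184/300) round to the accuracies reported in Table 3, so the reported per-class recalls and overall accuracies are consistent with each other.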
Table 4 shows the evaluation results of the developed fire detection model, indicating its main predictions for fire, non-fire, and smoke. The images in Table 4(a) were correctly predicted by the model according to their classes. In particular, the model correctly classified a non-fire image that could be mistaken for fire or smoke. However, detection performance was degraded in the case of smoke images that can be mistaken for fire or non-fire, as shown in Table 4(b). Despite these limitations, MobileNet V1 is considered suitable for fire detection because its F1_Score remains high even for the smoke class (0.9529). Figure 11 shows the experimental scenes used to test the detection model on real data. This experiment was performed to evaluate the model's ability to accurately detect fire, non-fire, and smoke.

5. Conclusions

In this study, a drone-based real-time forest fire detection model was developed using a convolutional neural network (CNN) algorithm. First, 1,600 images were collected for each of the fire, non-fire, and smoke classes. Data preprocessing and augmentation techniques were then used to supplement the collected image data, with the resulting data used in the input layer of the forest fire detection model. Second, the structure of the CNN model was analyzed to efficiently detect forest fires in real time, and the MobileNet algorithm, which is designed around separable convolution layers, was used to allow real-time execution and efficient model inference on mobile devices and drones. Based on this process, a real-time forest fire detection model that exhibits high performance despite its small size and small operation quantity was developed. Third, a comparative analysis was conducted on the MobileNet V1, MobileNet V2, and MobileNet V3 models to verify the performance of the MobileNet algorithm. The training and validation results showed that the error decreased and accuracy tended to increase as training progressed in all three models. In the comparative analysis, the MobileNet V1 model exhibited the highest validation accuracy of 0.9466 among the three models. Finally, the developed model was evaluated on the new test dataset. In this evaluation, the MobileNet V1 model exhibited the highest performance with a test accuracy of 0.9667. When the prediction results were analyzed in detail, the developed model exhibited excellent prediction results even in non-fire situations that can be mistaken for fire, thus confirming that MobileNet V1 is the most suitable model for the developed drone-based real-time forest fire detection system. In future research, the experimental results of this study will be used to collect various data and conduct research on preprocessing and deep learning algorithms to improve model performance.
In addition, artificial intelligence will be applied to develop an optimal forest fire detection drone system.


This study is supported by the Korea Agency for Infrastructure Technology Advancement (KAIA) grant funded by the Ministry of Land, Infrastructure and Transport (Grant 202102220002).


Conflicts of Interest

Authors must identify and declare any personal circumstances or interests that may be perceived as inappropriately influencing the representation or interpretation of reported research results. Declare conflicts of interest or state “The authors declare no conflict of interest.”

Author Contributions

Conceptualization, K.J. and J.L.; methodology, J.L.; software, J.L.; validation, J.L., K.J. and H.J.; formal analysis, J.L.; investigation, H.J.; resources, K.J.; data curation, J.L.; writing (original draft preparation), K.J.; writing (review and editing), J.L.; visualization, H.J.; supervision, H.J.; project administration, H.J.; funding acquisition, H.J. All authors have read and agreed to the published version of the manuscript.

Figure 1
Domestic forest fire (Uljin, Gyeongsangbuk-do).
Figure 2
Status of forest fire damage.
Figure 3
Convolutional neural network architecture.
Figure 4
MobileNet architecture.
Figure 5
Study methodology flowchart.
Figure 6
Various fire-, non-fire-, and smoke-related datasets.
Figure 7
Fire and non-fire images.
Figure 8
Data augmentation.
Figure 9
Training and validation results for various MobileNet models.
Figure 10
Prediction of the confusion matrix.
Figure 11
Experimental scenes used in real dataset detection.
Table 1
Specifications of the Deployment Environment
Division | Device | Specification
Hardware | CPU | Intel® Xeon® Silver 4210 CPU @ 2.20 GHz
Hardware | RAM | 192 GB
Software | Development Environment | TensorFlow 2.5.0
Software | Development Environment | Python 3.8.0
Software | Development Environment | CUDA 11.2
Software | Development Environment | OpenCV 4.6.0
 | Angle of View | 78°
Drone | Battery Capacity | 1.1 Ah/3.8 V
Drone | Picture | 5MP (2592 × 1936)
Drone | Angle of View | 82.6°
Drone | Video | HD720 P30
Drone | Save Format | JPG (Photo), MP4 (Video)
Table 2
Confusion Matrix for the Classifier
Division | Predicted True | Predicted False
Actual True | True Positive | False Negative
Actual False | False Positive | True Negative
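Table 3
Test Dataset Classification Results
Model | Class | Accuracy (per model) | Precision | Recall | F1_Score
MobileNet V1 | Fire | 0.9667 | 0.9429 | 0.9900 | 0.9659
MobileNet V1 | Non-fire | | 0.9615 | 1.000 | 0.9804
MobileNet V1 | Smoke | | 1.000 | 0.9100 | 0.9529
MobileNet V2 | Fire | 0.8700 | 0.8438 | 0.8100 | 0.8265
MobileNet V2 | Non-fire | | 0.8544 | 0.8800 | 0.8670
MobileNet V2 | Smoke | | 0.9109 | 0.9200 | 0.9154
MobileNet V3 | Fire | 0.6133 | 0.5102 | 0.5000 | 0.5051
MobileNet V3 | Non-fire | | 0.7101 | 0.4900 | 0.5799
MobileNet V3 | Smoke | | 0.6391 | 0.8500 | 0.7296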
Table 4
Results of the Classification Using Test Image Data
(a) True Classification
Raw Image | (image) | (image) | (image)
Human Cognition | Fire | Non-Fire | Smoke
CNN Result | Fire | Non-Fire | Smoke
(b) False Classification
Raw Image | (image) | (image) | (image)
Human Cognition | Smoke | Smoke | Smoke
CNN Result | Non-Fire | Fire | Fire


1. S. M. Hong, Y. J. Yu, Y. W. Kim and H. G. Lee, “Study of Improve Sensing Cycle Scheme for Sensor based Forest Fire Detect System”, Proceedings of the Korea Information Processing Society Conference, No. 05a, pp. 104-107 (2021), https://doi.org/10.3745/PKIPS.y2021m05a.104.
2. S. H. Cho, “The Wildfires that were Brutally Terrifying - Covering the Scene of a Large Forest Fire in Uljin, North Gyeongsang Province, March 4”, Broadcast Journalist, Vol. 66, pp. 30-33 (2022).
3. C. G. Kim, Y. S. Choung, K. Y. Joo and K. S. Lee, “Effects of Hillslope Treatments for Vegetation Development and Soil Conservation in Burned Forests”, Journal of Ecology and Environment, Vol. 29, No. 3, pp. 295-303 (2006), https://doi.org/10.5141/JEFB.2006.29.3.295.
4. Y. H. Kim and E. S. Baek, “Expansion of Firefighting Work Area by Improving the Forest Fire Response System in South Korea”, Fire Science and Engineering, Vol. 36, No. 4, pp. 66-73 (2022), https://doi.org/10.7731/KIFSE.cc578790.
5. K. Zhang, J. B. Park, Y. H. Park and G. H. Cho, “A Design of Forest Fire Monitoring System Based on Sensor Network”, Proceedings of the Korean Information Processing Society, Vol. 14, No. 2, pp. 843-845 (2007).
6. Y. W. Shin and J. H. Park, “Analysis of the Effectiveness of Fire Drone Missions at Disaster Sites: An Empirical Approach”, Fire Science and Engineering, Vol. 34, No. 5, pp. 112-119 (2020), https://doi.org/10.7731/KIFSE.cba54f4c.
7. J. H. Roh, S. H. Min and M. S. Gong, “Analysis of Fire Prediction Performance of Convolutional Neural Network-Based Classification Models”, Fire Science and Engineering, Vol. 36, No. 6, pp. 70-77 (2022), https://doi.org/10.7731/KIFSE.9e906e7a.
8. H. Y. Jung, S. G. Choi and B. H. Lee, “Rotor Fault Diagnosis Method Using CNN-Based Transfer Learning with 2D Sound Spectrogram Analysis”, Electronics, Vol. 12, No. 3, pp. 480 (2023), https://doi.org/10.3390/electronics12030480.
9. Y. J. Kim and E. G. Kim, “Image based Fire Detection using Convolutional Neural Network”, Journal of the Korean Society for Information and Communication Studies, Vol. 20, No. 9, pp. 1649-1656 (2016), http://doi.org/10.6109/jkiice.2016.20.9.1649.
10. S. Albawi, T. A. Mohammed and S. Al-Zawi, “Understanding of a Convolutional Neural Network”, International Conference on Engineering and Technology, pp. 1-6 (2017), https://doi.org/10.1109/ICEngTechnol.2017.8308186.
11. K. Zhang, J. G. Wang, H. T. Shi, X. C. Zhang and Y. H. Tang, “A Fault Diagnosis Method Based on Improved Convolutional Neural Network for Bearings under Variable Working Conditions”, Measurement, Vol. 182, pp. 109749 (2021), https://doi.org/10.1016/j.measurement.2021.109749.
12. X. Z. Xu, M. Du, H. X. Guo, J. Y. Chang and X. Y. Zhao, “Lightweight FaceNet Based on MobileNet”, International Journal of Intelligence Science, Vol. 11, No. 1, pp. 1-16 (2021), https://doi.org/10.4236/ijis.2021.111001.
13. C. H. Tu, J. H. Lee, Y. M. Chan and C. S. Chen, “Pruning Depthwise Separable Convolutions for MobileNet Compression”, International Joint Conference on Neural Networks, pp. 1-8 (2020), https://doi.org/10.1109/IJCNN48605.2020.9207259.
14. W. Wang, Y. T. Zou, X. Wang, J. Y. You and Y. H. Luo, “A Novel Image Classification Approach via Dense-MobileNet Models”, Mobile Information Systems, pp. 8 (2020), https://doi.org/10.1155/2020/7602384.
15. B. Dincer, Wildfire Detection Image Data, Kaggle, https://www.kaggle.com/datasets/brsdincer/wildfire-detection-image-data.
16. M. S. Prasad, Forest Fire Images, Kaggle, https://www.kaggle.com/datasets/mohnishsaiprasad/forest-fire-images.
17. C. Cristancho, FOREST FIRE IMAGE DATASET, Kaggle, https://www.kaggle.com/datasets/cristiancristancho/forest-fire-image-dataset.
18. D. A. van Dyk and X. L. Meng, “The Art of Data Augmentation”, Journal of Computational and Graphical Statistics, Vol. 10, No. 1, pp. 1-50 (2001), https://doi.org/10.1198/10618600152418584.
19. F. Rustam, M. A. Siddique, H. U. R. Siddiqui, S. Ullah, A. Mehmood, I. Ashraf and G. S. Choi, “Wireless Capsule Endoscopy Bleeding Images Classification Using CNN Based Model”, IEEE Access, Vol. 9, pp. 33675-33688 (2021), https://doi.org/10.1109/ACCESS.2021.3061592.
20. H. Y. Kim, X. F. Zhang, Y. S. Kim and I. H. Jung, “Comparison of the Performance of CNN Models for Retinal Diseases Diagnosis”, Journal of the Korean Society for Intelligent Systems, Vol. 32, No. 1, pp. 51-60 (2022), https://doi.org/10.5391/JKIIS.2022.32.1.51.


Copyright © 2023 by Korean Institute of Fire Science and Engineering.
