Smoke Particle-Source Prediction Model Based on Multiple Optical Wavelengths Using Deep Learning

Article information

Int J Fire Sci Eng. 2023;37(2):20-29
Publication date (electronic) : 2023 June 30
doi :
Defense & Safety ICT Research Department, Electronics and Telecommunications Research Institute, 218 Gajeong-ro, Yuseong-gu, Daejeon, 34129, Republic of Korea
Corresponding Author, TEL: +82-42-860-5142, E-Mail:
Received 2023 May 30; Revised 2023 June 14; Accepted 2023 June 15.


Installing smoke detectors has become crucial because inhaling smoke during a fire can cause fatal injury. Smoke detectors are reported to be highly efficient at detecting smoke particles from fire; however, they may generate false alarms because they cannot reliably distinguish fire smoke from smoke generated by daily activities. Despite the frequent occurrence of these false alarms, research on predicting the source type from smoke particles remains insufficient. This study describes the development of an intelligent smoke detector for false alarm reduction that predicts the occurrence and type of fire, and evaluates its performance using the light-scattering characteristics of fire and non-fire sources. First, a previous experimental dataset covering three fire sources and three non-fire sources was collected to train the model on the light-scattering characteristics of the smoke generated from each source. Next, to reduce the computing power, the collected dataset was preprocessed using the median and RobustScaler. Finally, we evaluated the prediction performance of three deep learning models: RNN, LSTM, and CNN-LSTM. The results confirmed that the scattering intensity of smoke particles has unique characteristics for each source. With the data preprocessing applied, all three models achieved an accuracy of 0.90 or higher. However, some errors occurred between sources with similar scattering intensities. The proposed method differs from existing methods in that it shows the possibility of predicting fire and non-fire sources, and it can be used as an alternative for reducing false alarms in the future.

1. Introduction

Recently, the importance of installing smoke detectors has been highlighted because of the risk of fatal casualties from inhaling smoke during a fire [1,2]. In Korea, smoke detectors are installed according to the National Fire Safety Code (NFSC) 203 [3], and the installation rate of smoke detectors in buildings is increasing. Although smoke detectors detect smoke generated by fires with high efficiency [4], they may not always distinguish between actual smoke from a fire and smoke from cigarettes, cooking aerosols, and steam [5]. The problems caused by such frequent false alarms are growing and require special attention because they influence people's evacuation-related decision-making in the event of a real fire [6].

Several researchers have conducted studies to distinguish fire smoke from non-fire smoke and prevent false alarms [7]. Kim et al. [8] developed a real-time monitoring system algorithm that distinguished between fire and non-fire alarms. The system enables real-time monitoring by applying the symmetrical coordinate method to the electrical signal of the smoke detector, which changes in the event of a fire, and to the voltage sensor within the protection zone. That study conceived a smoke detector that predicts false alarms in real time and laid the basis for a system that can respond actively in the early stages of a fire alarm. Chan et al. [9] adjusted the sensitivity settings, drift compensation, and smoothing filter of a smoke detector to prevent false alarms. They installed it in an actual building and confirmed that false fire alarms were reduced by a factor of 0.5. Recently, with the combination of computer vision and deep learning, research on image-based fire-detection models with improved performance has increased [10,11]. Ahn et al. [12] developed a YOLO-based early fire-detection model that can be installed in CCTV. To avoid false alarms, the model was trained not to misclassify cigarette and cooking images as fire, which yielded a high-performance model. Furthermore, its fire-detection speed was experimentally compared with that of a general fire detector.

In summary, smoke detection through sensors (e.g., voltage sensors and CCTV) and smoke detection algorithms (e.g., symmetrical coordinate methods and YOLO) has been effectively verified. However, research on predicting the sources using smoke particles and deriving the predictive performance has been insufficient. Therefore, this study set the following research goals to fill this knowledge gap: (1) to identify the light-scattering characteristics of each source, that is, smoke from fire or smoke from non-fire sources, and (2) to establish preprocessing steps and models that predict the sources with high performance. Accordingly, this study proposes a method for distinguishing between fire and non-fire data collected using a smoke detector.

The purpose of this study was to predict the occurrence of fire and the type of source using a smoke detector, and to evaluate the prediction performance. The study was conducted as shown in Figure 1. First, we collected a smoke-detection dataset based on multiple optical wavelengths from previous fire experiments. Next, we performed data preprocessing to reduce the computing power and make the method generally applicable. Finally, we applied the three prediction models and analyzed the causes of the prediction errors. The results of this study are valuable because they can help improve the reliability of fire alarms and reduce human casualties by predicting fires in the future. Furthermore, this study aims to contribute to fire-detection technology for building a safety net.

Figure 1

Research procedure.

2. Smoke Detector Collection and Data Preprocessing

2.1 Multiple optical wavelength-based smoke detector

Fire detectors installed to detect fires in buildings can be classified into three typical categories: 1) heat detectors that detect fire when a certain temperature is reached, 2) smoke detectors that detect fire when smoke particles are generated, and 3) flame detectors that detect fire by detecting flames using ultraviolet or infrared rays [13].

Smoke detectors can be divided into two types according to the method used to detect smoke particles: ionization detectors and photoelectric detectors. An ionization detector detects the change in ion current that occurs when smoke particles enter. A photoelectric detector detects the scattering of light caused by smoke particles in the case of a fire.

In previous studies, a smoke detector capable of detecting the scattering intensity of smoke particles was utilized [14,15]. In a previous experiment, a smoke detector was adopted to repeatedly measure the scattering intensity of light with multiple optical wavelengths [16]. The scattering intensity of smoke particles derived from each source indicates the unique characteristics of the sources. In this study, we attempt to evaluate the performance of a smoke detector by training and predicting the data from which the unique characteristics of the sources are derived through experiments using deep learning models.

2.2 Previous experiment dataset for data collection

In this study, a previous fire experimental dataset was collected. The fire experiment followed the procedure of Han et al. [16] and used the same experiment chamber. The chamber is a carbon-steel square duct with a cross section of 0.3 m × 0.3 m (0.09 m²). The particles generated by the sources flowed into the test section through ducts. In previous studies, paper, kerosene, and polyurethane were selected as fire sources, whereas vapor and cooking patty were selected as non-fire sources. In this study, dust was added to the non-fire source dataset. Table 1 lists the sources used in this experiment. In the experimental dataset used in this study, 20 repeated experiments were conducted for each source to reduce deviations and errors, and 10 smoke detectors were installed in each experiment.

Fire Sources and Non-fire Alarm Sources

Figure 2 shows the results of each type of experiment. The results show representative characteristics of the scattering intensity detected by the smoke detector after the sources were ignited. The x-axis of each graph represents time [s] and the y-axis represents the intensity for each wavelength. The intensity is a digital index that was used only as a relative comparison value in this study. The four optical wavelengths are denoted λ1, λ2, λ3, and λ4.

Figure 2

Scattering intensity of sources.

Each source was ignited 30 s after the start of data storage. Most of the change in the scattering intensity occurred between 45 and 80 s because of the time needed for smoke particles to reach the smoke detector. The filter paper (Figure 2(a)) had an intensity of up to 200 upon ignition and a large periodic wave. When kerosene was ignited (Figure 2(b)), the intensity rose to nearly 100 and remained there. When polyethylene was ignited (Figure 2(c)), the intensity gradually decreased and reached a negative value. When dust was generated (Figure 2(d)), the intensity gradually increased in a parabolic shape. When the patty was being cooked (Figure 2(e)), the intensity gradually reached a negative value in a form similar to that of polyethylene. When water vapor was generated (Figure 2(f)), the intensity was characterized by a rapid increase. In the first five graphs, the y-axis was fixed at -100 to 250. However, the water vapor graph ranged from -200 to 1,200. In other words, the rise for water vapor was significantly larger than that of the other specimens.

2.3 Data preprocessing

We applied labels from 1-6 to the previously collected experimental dataset. The normal state, neither fire nor non-fire, was labeled as 0. We proceeded with preprocessing considering the practical use of the smoke detector. We particularly focused on versatility, reducing computing power, and improving accuracy. First, for versatility, the collected data were subtracted from the average value in the normal state. For this purpose, the scattering intensity of multiple optical wavelengths was measured for 30 s in the normal state. The average value of the data collected for 30 s was derived, and the value of the incoming data was subtracted from the average value. This process involves changing all normal states of several detectors to zero so that they can be used universally, even when various detectors are used in any position.

Next, we modified the algorithm such that only one data point was selected every 1 s to reduce computing power. To select a representative data point, the median value was selected from approximately 50 data points collected over 1 s. This data preprocessing can contribute to reducing the computing power by removing outliers and minimizing the amount of data to be processed when using a smoke detector.
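The two preprocessing steps described above (zeroing the normal state and keeping one median point per second) can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names, the tuple-per-sample data layout, and the sign convention (reading minus baseline) are assumptions, and the numbers are synthetic.

```python
from statistics import mean, median

def subtract_baseline(samples, baseline_samples):
    """Shift readings so that the normal state sits at zero.

    samples, baseline_samples: lists of per-wavelength tuples.
    Assumption: the offset is computed as reading minus normal-state mean.
    """
    n_channels = len(baseline_samples[0])
    baseline = [mean(s[c] for s in baseline_samples) for c in range(n_channels)]
    return [tuple(s[c] - baseline[c] for c in range(n_channels)) for s in samples]

def median_per_second(samples, rate_hz=50):
    """Keep one representative point per second: the channel-wise median.

    rate_hz=50 follows the "approximately 50 data points" per second in the text.
    Incomplete trailing seconds are dropped.
    """
    reduced = []
    for start in range(0, len(samples) - rate_hz + 1, rate_hz):
        chunk = samples[start:start + rate_hz]
        n_channels = len(chunk[0])
        reduced.append(tuple(median(s[c] for s in chunk) for c in range(n_channels)))
    return reduced

# Synthetic demo: two wavelengths, three samples per "second" for brevity.
baseline = [(10.0, 20.0), (12.0, 22.0)]
raw = [(11.0, 21.0), (13.0, 20.0), (500.0, 21.0),   # 500.0 is a deliberate outlier
       (12.0, 23.0), (11.0, 22.0), (14.0, 21.0)]
centered = subtract_baseline(raw, baseline)
reduced = median_per_second(centered, rate_hz=3)
print(reduced)
```

Using the median rather than the mean means the single outlier in the first second does not affect the retained value.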

Finally, normalization was performed during training to improve the accuracy. Each feature of the data has its own scale and variance. If one feature varies much more than the others, the deep learning algorithm can have difficulty finding patterns. Furthermore, the result can converge to zero or diverge to infinity. Therefore, it is important to perform scaling during data preprocessing to prevent this problem. Scikit-learn [17] provides various types of scalers. As shown in Figure 2, the scattering intensity derived from the previous experiment varied significantly according to the source type. In particular, water vapor showed a larger deviation than the other sources. Therefore, a normalization that minimizes the influence of outliers is necessary.

In this study, RobustScaler [18], which uses the median and the interquartile range (IQR), was selected. RobustScaler is characterized by less influence from outliers. RobustScaler is expressed in Eq. (1), where q1 is the first quartile value of the entire dataset, q2 is the median value, and q3 is the third quartile value.

(1) $\mathrm{RobustScaler} = \dfrac{x - q_2}{q_3 - q_1}$
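As a worked illustration of Eq. (1), the sketch below applies the scaling to a single feature in plain Python. It is not the scikit-learn implementation; the helper names are hypothetical, and the quartiles are computed by linear interpolation between sorted values, which is an assumption about the quartile definition.

```python
def percentile(sorted_vals, p):
    """Linear-interpolation percentile (p in [0, 100]) of an already sorted list."""
    k = (len(sorted_vals) - 1) * p / 100.0
    lo = int(k)
    hi = min(lo + 1, len(sorted_vals) - 1)
    return sorted_vals[lo] + (sorted_vals[hi] - sorted_vals[lo]) * (k - lo)

def robust_scale(values):
    """Apply Eq. (1): (x - q2) / (q3 - q1) to every value of one feature."""
    s = sorted(values)
    q1, q2, q3 = percentile(s, 25), percentile(s, 50), percentile(s, 75)
    iqr = q3 - q1
    if iqr == 0:
        raise ValueError("interquartile range is zero; the feature cannot be scaled")
    return [(x - q2) / iqr for x in values]

# Synthetic intensities: the last value plays the role of a large outlier.
plain = robust_scale([0, 1, 2, 3, 4])
with_outlier = robust_scale([0, 1, 2, 3, 1000])
print(plain)
print(with_outlier)
```

In this toy data the outlier changes neither the median nor the quartiles, so the first four scaled values are identical in both runs. A mean and standard deviation based scaler would compress them instead.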

3. Prediction Models

3.1 Model selection

In this study, representative deep learning models that have been used in many recent studies were used to compare model performance [19]. The three models are the recurrent neural network (RNN), long short-term memory (LSTM), and convolutional neural network-long short-term memory (CNN-LSTM). An RNN is specialized for training on repetitive and sequential data and is characterized by a circular structure. The LSTM was developed to solve the vanishing gradient problem of RNNs. The LSTM can learn from data in the distant past, whereas the RNN cannot. A CNN is a structure that identifies patterns by extracting features from data [20]. The CNN-LSTM model combines the properties of the CNN and LSTM and is often used for multivariate time-series prediction.

This study was conducted using Python 3.9 programming language. The deep learning models were applied based on Keras, a Python deep learning library. The network used is a sample model [21] provided by Keras. Only the intensities derived from multiple optical wavelengths were used as the input data.

3.2 Hyperparameter

Among the hyperparameters, the epoch was set to 50 and the batch size was set to 64. The time step was 30, and of them, five modes of output data were trained as the final output data. The training interval was set to five. Of the 20 experiments, experimental data from 1 to 14 were used as the training dataset, and data from 15 to 20 were used as the test dataset. In this study, the rectified linear unit (ReLU) was used as the activation function and Adam was used as the optimizer. Because classification was the purpose of this study, softmax was used as the activation function of the output layer.
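One possible reading of the windowing and split described above is sketched below. The interpretation is an assumption on several points: the "training interval" is treated as a stride of five steps between consecutive windows, each 30-step window is labeled with the most frequent class among its last five steps, and experiments are split by their index. The function and parameter names are hypothetical.

```python
from statistics import mode

def make_windows(features, labels, time_steps=30, stride=5, label_span=5):
    """Cut one experiment into overlapping windows.

    features: list of per-second tuples (one value per wavelength).
    labels:   list of per-second class labels (0 to 6).
    Assumption: the window label is the mode of its last `label_span` labels.
    """
    X, y = [], []
    for start in range(0, len(features) - time_steps + 1, stride):
        end = start + time_steps
        X.append(features[start:end])
        y.append(mode(labels[end - label_span:end]))
    return X, y

def split_experiments(experiment_ids, last_train_id=14):
    """Experiments 1 to 14 go to training, 15 to 20 to testing (per the text)."""
    train = [i for i in experiment_ids if i <= last_train_id]
    test = [i for i in experiment_ids if i > last_train_id]
    return train, test

# Synthetic 60 s experiment: normal (0) for 30 s, then Class 1 for 30 s.
features = [(float(t),) * 4 for t in range(60)]
labels = [0] * 30 + [1] * 30
X, y = make_windows(features, labels)
train_ids, test_ids = split_experiments(range(1, 21))
print(len(X), y, len(train_ids), len(test_ids))
```

Splitting by experiment rather than by window keeps overlapping windows of the same run from appearing in both the training and test sets.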

We used the grid search [22] method for hyperparameter optimization of each model. We set a range of hyperparameters such as filters, kernel size, LSTM units, and dropout rate, and determined the optimal combination. The optimal hyperparameter settings are listed in Table 2 below.
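The idea behind grid search can be sketched in a few lines: every combination of candidate values is scored and the best one is kept. The sketch below is plain Python, not GridSearchCV itself. The candidate values are taken from the third column group of Table 2. The `toy_score` function is a purely hypothetical stand-in for training a model and measuring validation accuracy, so its output carries no experimental meaning.

```python
from itertools import product

def grid_search(param_grid, score_fn):
    """Exhaustively evaluate every combination in param_grid; return the best."""
    names = sorted(param_grid)
    best_params, best_score = None, float("-inf")
    n_evaluated = 0
    for combo in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, combo))
        score = score_fn(params)
        n_evaluated += 1
        if score > best_score:
            best_params, best_score = params, score
    return best_params, best_score, n_evaluated

# Candidate values as listed in Table 2 (third column group).
grid = {"units": [64, 128], "dropout": [0.2, 0.5], "filters": [32, 64]}

def toy_score(params):
    """Hypothetical placeholder: in practice this would train and validate a model."""
    return (-abs(params["units"] - 128) / 128
            - abs(params["dropout"] - 0.2)
            - abs(params["filters"] - 64) / 64)

best_params, best_score, n_evaluated = grid_search(grid, toy_score)
print(best_params, n_evaluated)
```

With two candidate values for each of three hyperparameters, the search evaluates 2 × 2 × 2 = 8 combinations, which is why the cost of grid search grows quickly as more values are added.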

Optimal Hyperparameter Determined by GridSearchCV

3.3 Model performance

Various evaluation criteria are used to develop and evaluate prediction models. In this study, we verified the performance of the classification models using accuracy, precision, recall, and F1 score. The evaluation criteria are shown in Eqs. (2)-(5).

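(2) $\mathrm{Accuracy}\,[\%] = \dfrac{TP + TN}{TP + FP + TN + FN} \times 100$
(3) $\mathrm{Precision}\,[\%] = \dfrac{TP}{TP + FP} \times 100$
(4) $\mathrm{Recall}\,[\%] = \dfrac{TP}{TP + FN} \times 100$
(5) $\mathrm{F1\text{-}score}\,[\%] = 2 \times \dfrac{\mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}} \times 100$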

True positives (TP) represent the cases in which the model correctly predicts positive samples. True negatives (TN) represent the cases in which the model correctly predicts negative samples. False positives (FP) represent the cases in which the model incorrectly predicts negative samples as positive (the predicted label is positive, but the actual label is negative). Finally, false negatives (FN) represent the cases in which the model incorrectly predicts positive samples as negative (the predicted label is negative, but the actual label is positive).
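For a multi-class problem such as this one, TP, FP, FN, and TN are usually counted per class in a one-vs-rest manner from the confusion matrix. The sketch below shows that computation under the assumption that rows are actual classes and columns are predicted classes. The function name and the small two-class matrix are illustrative only, and precision and recall enter the F1 formula as fractions before the result is converted to percent.

```python
def per_class_metrics(confusion, cls):
    """Return accuracy, precision, recall, and F1 (all in percent) for one class.

    confusion[actual][predicted] holds counts; `cls` is treated as the
    positive class and all other classes as negative (one-vs-rest).
    """
    n = len(confusion)
    tp = confusion[cls][cls]
    fn = sum(confusion[cls]) - tp
    fp = sum(confusion[row][cls] for row in range(n)) - tp
    total = sum(sum(row) for row in confusion)
    tn = total - tp - fn - fp

    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {
        "accuracy": (tp + tn) / total * 100,
        "precision": precision * 100,
        "recall": recall * 100,
        "f1": f1 * 100,
    }

# Synthetic 2-class confusion matrix (rows: actual, columns: predicted).
confusion = [[8, 2],
             [1, 9]]
metrics = per_class_metrics(confusion, cls=0)
print({name: round(value, 2) for name, value in metrics.items()})
```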

4. Results and Discussion

4.1 Prediction results

The results for each model are presented as confusion matrices in Figure 3. The x-axis represents the predicted data, and the y-axis represents the actual data. In a confusion matrix, the more often the actual and predicted data agree (the diagonal entries), the higher the performance of the model.

Figure 3

Results of prediction model.

The classification results of the RNN are shown in Figure 3(a), and the accuracy was determined to be 0.90. The classification results of the LSTM are shown in Figure 3(b), and the accuracy was determined to be 0.92. The classification results of the CNN-LSTM are shown in Figure 3(c), and the accuracy was determined to be 0.93. Table 3 shows the precision, recall, and F1 score for each class. Here, support is the number of data points in each class. The number was set between 2,000 and 3,000 for balance.

Results of Precision, Recall, F1 Score

According to the results of this study, the scattering intensity collected through the experiment could distinguish the types of sources at a certain level or higher. The average accuracy across the three deep learning models was 0.92. These results suggest that the particle scattering characteristics of different sources differ depending on the wavelength; thus, the prediction model can distinguish the characteristics of the particles. These results confirm the possibility of distinguishing the sources of smoke using only the scattering data.

4.2 Analysis of error source

Two error patterns were commonly observed in the results: the filter paper (Class 1) was classified as normal (Class 0), and polyethylene (Class 3) was classified as water vapor (Class 6). Figure 4 shows the classification results for the measured and predicted data. The results represent only one representative experiment. The x-axis represents time [s] and the y-axis represents the class: Class 0 is 0 and Class 1 is 1. The actual data are represented by a solid line, and the predicted data are represented by a dotted line.

Figure 4

Classification results.

The filter paper (Class 1) was classified as normal (Class 0). Figure 4(a) shows that, from approximately 75 s, the model predicted the normal state (Class 0) although the filter paper was still burning. This can be interpreted as a consequence of the characteristics of filter paper shown in Figure 2(a): the intensity varied over a large range and returned to the normal level toward the end of the graph. Based on these characteristics and the prediction results, the state appears to be predicted as normal as the intensity of the smoke particles decreases to 0 toward the end.

Polyethylene (Class 3) was classified as water vapor (Class 6). Figure 4(b) shows a representative graph of the results of the polyethylene data classification. The graph shows that water vapor was predicted in the interval between approximately 10 and 25 s. In the particle characteristics of water vapor shown in Figure 2(f), there is a section in which the intensities of λ1 and λ2 rapidly decrease to negative values, and the section in which the scattering intensity of polyethylene decreases is similar to it; therefore, this similarity likely affects the prediction.

These errors were consistent, and they included mismatches between fire and non-fire (or normal) classes. If a fire is predicted as non-fire, the evacuation time of occupants is affected; similarly, if a non-fire event is predicted as fire, the false alarm causes economic and time losses. Therefore, measures to compensate for these errors should be considered in future studies. For example, this can be done by adding inputs or changing the hyperparameters.
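The two kinds of costly error discussed above can be counted separately by collapsing the seven classes into fire, non-fire, and normal groups. The sketch below uses the class numbering of Table 1. The function names and the example label sequences are made up for illustration and are not results from this study.

```python
FIRE = {1, 2, 3}        # filter paper, kerosene, polyethylene (Table 1)
NON_FIRE = {4, 5, 6}    # dust, patty, water vapor (Table 1)

def is_fire(cls):
    """True only for fire classes; normal (0) and non-fire classes are not fire."""
    return cls in FIRE

def count_critical_errors(actual, predicted):
    """Count missed fires and false alarms in paired label sequences."""
    missed_fire = sum(1 for a, p in zip(actual, predicted)
                      if is_fire(a) and not is_fire(p))
    false_alarm = sum(1 for a, p in zip(actual, predicted)
                      if not is_fire(a) and is_fire(p))
    return {"missed_fire": missed_fire, "false_alarm": false_alarm}

# Made-up labels: two fires are missed, one non-fire event raises an alarm,
# and a patty (5) predicted as normal (0) counts as neither.
actual = [1, 1, 3, 5, 0, 6]
predicted = [0, 1, 6, 0, 0, 2]
errors = count_critical_errors(actual, predicted)
print(errors)
```

Separating the two counts makes it possible to weigh delayed evacuation against nuisance alarms, which a single accuracy figure cannot show.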

Additionally, the LSTM and CNN-LSTM results included cases in which patty smoke (Class 5) was predicted as normal (Class 0). This is because the scattering intensities of the normal state and patties appear similar, as shown in Figure 2(e). Black smoke absorbs light. Because of this property, and because photoelectric detectors rely on emitted light being scattered, the scattering intensity was low compared to that of other sources. Polyethylene (Class 3) showed a similar trend, suggesting that its smoke also absorbs light in the same way as black smoke.

5. Conclusions

This study proposes a method and model that can distinguish between smoke from fire and non-fire sources, with the aim of developing an intelligent smoke detector that prevents false alarms. A previous experimental dataset covering six sources was collected, preprocessed, and applied to the models, and their performance was evaluated. In particular, the proposed method differs from existing fire and non-fire discrimination methods in that it can classify the types of smoke particles in real time, and it can be used as an alternative for preventing false alarms in the future. The results of this study can be summarized as follows.

First, the types of sources could be distinguished with an accuracy of 0.90 or higher through the scattering intensity collected during the experiment. These results indicate that the characteristics of particles depend on the source and that the prediction model can distinguish these characteristics. That is, these results can distinguish between the sources from where the smoke occurs using only data, thus suggesting that it is an effective method for preventing false alarms.

Second, some errors occurred when the data were classified. Most errors depend on the characteristics of the particles. In particular, false alarms predicting fire as non-fire and non-fire as fire occurred. These errors can confuse people. Because these error results are consistent, a method for reducing them is required. For instance, this can be done by adding input data or modifying the hyperparameters.

Third, for sources that generate black smoke and absorb light, no significant features were found in the detection results. The detector used in this study detects light of different wavelengths scattered by the smoke particles. Therefore, in the case of black smoke, which absorbs light, no significant features were observed. Unlike other sources of smoke, patties generate black smoke when cooked. Consequently, patty smoke was predicted to be similar to the normal state. Because patties are classified as non-fire, false alarms do not occur even if they are classified as normal. However, if similar characteristics appear among fire sources, additional solutions are needed.

This study proposes a prediction model that distinguishes between fire and non-fire sources using the scattering intensity derived from multiple optical wavelengths collected in real time. The difference from previous studies is that the predictive performance was secured by data preprocessing with high versatility, considering the characteristics of detectors and reflecting the characteristics of various types of smoke based on actual smoke generation experiments.

However, because this study conducted experiments only with clearly defined sources (e.g., filter paper, kerosene, polyethylene, dust, patties, and water vapor), it is difficult to generalize the results to actual fires caused by different types of sources. Furthermore, experimental data obtained under various fire conditions are required to strengthen the robustness of the model. In future studies, it is necessary to conduct experiments using various sources that reflect actual fire situations.


This study was supported by the Institute for Information & Communications Technology Promotion (IITP) grant funded by the Korean government (MSIT) (No. 2020-0-00012, Development of Intelligent Fire Detection Equipment Based on Smoke Particle Spectrum Analysis).


Conflicts of Interest

The authors declare no conflict of interest.

Author Contributions

Conceptualization: Y.A.; Methodology: Y.A.; Software: Y.A.; Validation: Y.A.; Formal analysis: Y.A.; Investigation: K.H.; Resources: K.H.; Data curation: Y.A.; Writing — Original Draft Preparation: Y.A.; Writing — Review and Editing: H.Y., S.K., J.R., and K.L.; Visualization: Y.A.; Supervision: K.L.; Project Administration: H.Y.; Funding Acquisition: K.L. All authors have read and agreed to the published version of the manuscript.


1. Alarie Y. Toxicity of Fire Smoke. Crit Rev Toxicol 32(4):259–289. 2002.
2. Levin B. C., Kuligowski E. D. Toxicology of Fire and Smoke. Inhal Toxicol 2:205. 2005.
3. NFSC 203. National Fire Safety Code 2022.
4. Reisinger K. S. Smoke Detectors: Reducing Deaths and Injuries Due to Fire. Pediatrics 65(4):718–724. 1980.
5. Meacham B. J. The Use of Artificial Intelligence Techniques for Signal Discrimination in Fire Detection Systems. Journal of Fire Protection Engineering 6(3):125–136. 1994.
6. Hubballi N., Suryanarayanan V. False Alarm Minimization Techniques in Signature-based Intrusion Detection Systems: A Survey. Computer Communications 49:1–17. 2014.
7. Shin H., Na K., Chang J., Uhm T. Multimodal Layer Surveillance Map Based on Anomaly Detection Using Multi-agents for Smart City Security. ETRI Journal 44(2):183–193. 2022.
8. Kim B. J., Kim J. H. Development of a Novel Real-Time Monitoring System Algorithm for Fire Prevention. Journal of the Korean Institute of Fire Science & Engineering 29 Korean Society of Safety 29(5):47–53. 2014.
9. Chan W. S., Chang F.-D., Chen C.-S., Chiu Y.-F., Liu C.-C., Tsai Z.-D. Optimizing The Reliability of The Fire Alarm System in The Taiwan Photon Source. In: 10th Int. Particle Accelerator Conf. (IPAC'19), Melbourne, Australia, 19–24 May 2019. JACOW Publishing, Geneva, Switzerland: 4026–4028. 2019.
10. Alkhatib A. A. A. A Review on Forest Fire Detection Techniques. International Journal of Distributed Sensor Networks 10(3):597368. 2014.
11. Dinaburg J., Gottuk D. Smoke Alarm Nuisance Source Characterization: Review and Recommendations. Fire Technology 52(5):1197–1233. 2016.
12. Ahn Y., Choi H., Kim B. S. Development of Early Fire Detection Model for Buildings Using Computer Vision-based CCTV. Journal of Building Engineering 65:105647. 2023.
13. Liu Z., Kim A. K. Review of Recent Developments in Fire Detection Technologies. Journal of Fire Protection Engineering 13(2):129–151. 2003.
14. Kim S., Park S., Lee K. Method for Aerosol Particle and Gas Analyses Based on Dual-channel Mid-infrared Sensor. International Journal of Fire Science and Engineering 36:1–6. 2022.
15. Park S., Han K. W., Lee K. A Study on Fire Detection Technology Through Spectrum Analysis of Smoke Particles. In: 2020 International Conference on Information and Communication Technology Convergence (ICTC). IEEE: 1563–1565. 2022.
16. Han K., Kim S., Yang H., Cho K. S., Lee K., Han H. S. Statistical Characteristics of Scattering Ratio Based on Three Optical Wavelengths for Smoke Particles. 36(2):40–49. 2022.
18. Buitinck L., et al. RobustScaler: Scikit-Learn Documentation. 2018.
19. Sezer O. B., Gudelek M. U., Ozbayoglu A. M. Financial Time Series Forecasting with Deep Learning: A Systematic Literature Review: 2005–2019. Applied Soft Computing 90:106181. 2020.
20. Simard P. Y., Steinkraus D., Platt J. C. Best Practices for Convolutional Neural Networks Applied to Visual Document Analysis. In: ICDAR, Edinburgh. 2003.
21. Keras. 2021.
22. Scikit Learn. GridSearchCV: Exhaustive Search over Specified Parameter Values for an Estimator in Python. 2022.

Table 1

Fire Sources and Non-fire Alarm Sources

Category          Source         Class
Fire source       Filter paper   Class 1
Fire source       Kerosene       Class 2
Fire source       Polyethylene   Class 3
Non-fire source   Dust           Class 4
Non-fire source   Patty          Class 5
Non-fire source   Water vapor    Class 6

Table 2

Optimal Hyperparameter Determined by GridSearchCV

Model      Hyperparameter   Given Values   Optimal Value
RNN        Units            64, 128        128
RNN        Dropout Rate     0.2, 0.5       0.5
RNN        Learning Rate    0.01, 0.1      0.1
LSTM       Units            64, 128        128
LSTM       Dropout Rate     0.2, 0.5       0.2
LSTM       Learning Rate    0.01, 0.1      0.01
CNN-LSTM   (LSTM) Units     64, 128        128
CNN-LSTM   Dropout Rate     0.2, 0.5       0.2
CNN-LSTM   Filters          32, 64         64

Table 3

Results of Precision, Recall, F1 Score

          Precision                Recall                   F1 score
Class     RNN    LSTM   CNN-LSTM   RNN    LSTM   CNN-LSTM   RNN    LSTM   CNN-LSTM   Support
Class 0   0.85   0.81   0.79       0.93   0.97   0.97       0.89   0.88   0.87       2,610
Class 1   0.91   0.93   0.93       0.78   0.86   0.85       0.84   0.89   0.88       2,401
Class 2   0.99   0.98   0.97       0.96   0.98   0.98       0.97   0.98   0.98       2,586
Class 3   0.92   0.98   0.96       0.88   0.89   0.90       0.90   0.93   0.93       2,895
Class 4   0.95   0.95   0.98       0.90   0.94   0.95       0.92   0.94   0.96       2,638
Class 5   0.84   0.93   0.92       0.91   0.87   0.83       0.88   0.90   0.88       2,926
Class 6   0.87   0.88   0.90       0.91   0.94   0.94       0.89   0.91   0.92       2,953