Open access peer-reviewed article

Classification of Targets by Using Frequency Modulated Continuous Wave Radar Data With Machine Learning Techniques

Emre Can Ertekin

Selda Güney

This article is part of the Artificial Intelligence section


Article Type: Research Paper

Date of acceptance: December 2025

Date of publication: December 2025

DOI: 10.5772/acrt20250109

Copyright: © 2025 The Author(s). Licensee: IntechOpen. License: CC BY 4.0


Table of contents


Introduction
Materials and Methods/Methodology
Experimental Studies
Results and Discussions
Conclusions
Acknowledgments
Author Contributions
Funding
Ethical statement
Data availability statement
Conflict of Interest

Abstract

Object classification in radar signal processing is one of the most challenging problems for targets with similar features. The challenge arises from the limited accuracy of traditional classification methods based on, for example, radar cross section (RCS), speed, and phase differences. In the context of drone-bird classification, the sizes of drones and birds result in similar RCS signatures, so traditional classification algorithms are not effective. This study addresses the issue by analyzing the micro-Doppler signatures of In-phase Quadrature (IQ) data obtained from a Frequency Modulated Continuous Wave (FMCW) radar. The IQ data are fed into pre-trained neural networks, classical machine learning algorithms, and convolutional neural networks (CNNs) with varying architectures to classify drones and birds. High success rates were obtained with CNNs, pre-trained networks (VGG16, VGG19, AlexNet, GoogLeNet, and ResNet50), and combinations of CNNs with traditional machine learning algorithms such as Support Vector Machines and decision trees. The comparisons show that combining CNNs with traditional machine learning algorithms gives the most promising results.

Keywords

  • Bird classification

  • convolutional neural network

  • drone classification

  • FMCW radar

  • support vector machines


Introduction

Nowadays, the use of drones has been rising steadily with advances in drone technology and increased market accessibility. This situation brings security risks in many fields; for instance, drones can pose a threat to secure critical facilities. Additionally, drones originally designed for civilian purposes can be weaponized and used in various forms of warfare, terrorist activities, and illicit operations such as smuggling [1–4]. Such events reveal the critical importance of detecting and classifying these platforms. Since the Radar Cross Section (RCS) signatures of drones are, in some cases, similar to those of biological targets, and the two have similar speed profiles, drones are difficult to classify with traditional radar classification algorithms. The limitations of traditional RCS- and velocity-based classification methods for drones have prompted a growing body of research on alternative detection approaches, including machine learning techniques.

There are several studies on improving the classification accuracy for this kind of target. One of them, conducted by Karlsson et al. [5], uses a convolutional neural network (CNN) on preprocessed 77 GHz Frequency Modulated Continuous Wave (FMCW) radar data for drone classification. The dataset used in that study includes In-phase Quadrature (IQ) radar data, target range information, and time information, and contains a total of 75,868 measurements covering six drone classes, two human activities (walking and running), six bird classes, and a corner reflector. After preprocessing the acquired data, classification was performed with a CNN, achieving 90% accuracy. To raise this rate, the authors aimed to improve the signal-to-noise ratio of the detections. Synthetic data were also generated and included in the study to improve performance. The dataset is openly available [5].

In a study by White et al. [6], the authors work with two datasets collected from an L-band radar. The first dataset includes signature collections for 35 bird classes and 10 drone classes; a CNN trained on it achieved an accuracy of 82% for single-target data. The second dataset includes 64 bird classes and 95 drone classes; with this more complex dataset, the CNN accuracy increased to 89%–90% [6].

In a study by Zhu et al. [7], the authors provide a detailed examination of the development of radar automatic target recognition (RATR) technology and the applications of deep learning algorithms within this field. The paper shows that the classification accuracy for space targets can be improved using the AlexNet and SqueezeNet models, and that CNN algorithms applied to drones and aerial targets achieved an accuracy of 96.86%. The study concludes that deep learning algorithms offer high classification accuracy for RATR technologies and provides a foundation for future research.

Kumawat and Chakraborty [21] focus on the classification of smaller aerial targets (such as drones) using micro-Doppler signatures. Their dataset, called “DIAT-μSAT,” includes 4,849 micro-Doppler signatures across six classes and was collected with a custom CW X-band radar. Pre-trained networks such as VGG16 and VGG19 were used for classification, with accuracies of 95% and 97%, respectively.

In Narayanan et al. [8], the authors primarily focus on classifying birds and drones by analyzing their micro-Doppler spectra. They use a dataset of 2,279 micro-Doppler signatures collected with a custom 10 GHz (X-band) radar. Support Vector Machines (SVMs) were employed for classification, achieving 90% accuracy when classifying drones solely by size and 96% when distinguishing birds from drones. In the final case, five different classes of drones and birds were used, with a classification accuracy of 85%.

In our study, the dataset collected using the SAAB SIRS 1600 FMCW 77 GHz radar [5] was utilized for the classification of six distinct drone classes, two human activity classes, six bird classes, and one corner reflector class, for a total of 75,868 measurements. In preprocessing, the raw IQ data are converted to the frequency domain and then transformed into spectrogram images. These spectrogram images are fed to deep learning and machine learning algorithms to classify the drones and birds.

The main contribution of this study is a classification accuracy higher than that of state-of-the-art studies in the literature. In particular, combining the CNN algorithm with a linear SVM classifier surpasses the accuracies reported in the existing literature.

Materials and Methods/Methodology

Dataset

In this study, the dataset provided by Karlsson et al. [5] is used for the classification. This dataset includes drone, bird, human, and corner reflector measurements, which are collected by SAAB SIRS 1600 FMCW 77 GHz Radar. This dataset is published as open source by KTH Royal Institute of Technology. The parameters of the SAAB SIRS 1600 FMCW radar are described in Table 1.

Radar Type: SAAB SIRS 1600, FMCW
Center Frequency: 77 GHz
Bandwidth: 160 MHz
Range Resolution: 1 m
Pulse Repetition Frequency (PRF): 17 kHz
Azimuth Beamwidth:
Scan Rate: 10 Hz, mechanical
Field of View: ±9°
Output Power: 10 mW

Table 1.

Radar specifications [5].

This dataset was published as a MAT-file, the file format of MATLAB. The MAT-file contains a 130 × 6 cell array, each of whose 130 rows stores one recorded set of measurements. In total, the dataset includes 75,868 measurements. The classes included in the dataset are shown in Table 2 and Figure 1, and the distribution of measurements across classes is shown in Table 3.

Figure 1.

Classes in the dataset. Figure adapted from Karlsson et al. [5].

Class | Description
“D1” | Drone, DJI Matrice 200 v
“D2” | Drone, DJI Mavic 2 Pro
“D3” | Drone, DJI Phantom 3
“D4” | Drone, custom-built FPV drone
“D5” | Drone, custom built with a Tarot 680 Pro frame
“D6” | Drone, Syma X23W
“human_walk” | Different humans walking, ranging from almost still to “normal” walking
“human_run” | Different humans running
“seagull” | Flying seagull
“pigeon” | Flying pigeon
“raven” | Flying raven
“black-headed gull” | Flying black-headed gulls
“seagull and black-headed gull” | A mixture of flying seagulls and black-headed gulls
“heron” | Flying heron
“CR” | Triangular corner reflector, 100 m² RCS at 77 GHz

Table 2.

Classes in the dataset [5].

Class | Number of Measurements
Drone, DJI Matrice 200 v | 6,921
Drone, DJI Mavic 2 Pro | 9,555
Drone, DJI Phantom 3 | 10,212
Drone, custom-built FPV drone | 8,735
Drone, custom built with a Tarot 680 Pro frame | 12,093
Drone, Syma X23W | 12,093
Human walking and running | 6,028
Flying birds, with and without wing flaps | 7,792
Trihedral corner reflectors | 3,280

Table 3.

Distribution of measurements by class.

The second column of the dataset contains the IQ data, which represent the combined values of five scan segments. Since the radar scans mechanically, each measurement includes five scan segments, each a vector of size 1 × 256. When the five segments are combined, each measurement is represented as a vector of size 1 × 1,280. An example of the five scan segments for the “D1” class is presented in Figure 2.

Figure 2.

Five scan segments of the “D1” class.
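The five-segment layout described above can be sketched in a few lines of NumPy. This is an illustrative reconstruction of the data layout only, not the paper's code; the random IQ values stand in for real radar samples:

```python
import numpy as np

# Illustrative layout only: five scan segments of 256 complex IQ samples
# each, concatenated into one 1 x 1280 measurement vector.
rng = np.random.default_rng(0)
segments = rng.standard_normal((5, 256)) + 1j * rng.standard_normal((5, 256))

# Row-major reshape places segment 1 first, followed by segments 2..5.
combined = segments.reshape(1, 5 * 256)
```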

Of these data, 57,868 measurements were used for training, 9,000 for validation, and the remaining 9,000 for testing the model. This split was provided by the dataset creators, and the data were used without any modification to ensure comparability with the literature.

Preprocessing Stage

Before applying machine learning and deep learning techniques, the dataset needs to be rearranged and preprocessed. As described in the dataset section, the dataset contains raw IQ data. First, all IQ data were converted to the frequency domain using the Short-Time Fourier Transform (STFT), following Eq. (1) and Eq. (2):

STFT(t, f) = ∫ x(τ) w(τ − t) e^(−j2πfτ) dτ  (1)

P(t, f) = |STFT(t, f)|²  (2)

Here, x(t) is the input signal, w(t) is the window function, t is the time at which the window is centered, f is the frequency, and STFT(t, f) is the STFT of the input signal. Taking the squared magnitude of STFT(t, f), as in Eq. (2), yields the power spectrogram P(t, f). Since the pulse repetition frequency is 17 kHz according to the radar specifications, the FFT sampling frequency was set to 17 kHz. Because the IQ data consist of 256 samples per scan segment, the number of FFT points was set to 256, and a Hamming window of length 256 was selected for the STFT. After computing the STFT, the resulting spectrogram is plotted and saved in “.jpeg” image format. Examples of spectrograms for different classes are shown in Figure 3.

Figure 3.

Examples of the spectrogram for different classes.
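As a concrete sketch of this preprocessing step, the STFT power spectrogram can be computed directly with NumPy. The window length, FFT size, and sampling frequency follow the text; the hop length is not stated in the paper, so the value below is an illustrative assumption:

```python
import numpy as np

def stft_power_spectrogram(x, win_len=256, hop=64, nfft=256):
    """Power spectrogram |STFT|^2 using a Hamming window.

    win_len = nfft = 256 follow the paper; hop = 64 is an assumption,
    since the frame overlap is not specified in the text.
    """
    w = np.hamming(win_len)
    n_frames = 1 + (len(x) - win_len) // hop
    frames = np.stack([x[i * hop:i * hop + win_len] * w
                       for i in range(n_frames)])
    spec = np.fft.fft(frames, n=nfft, axis=1)
    return np.abs(spec) ** 2          # shape: (n_frames, nfft)

# A combined measurement is a 1 x 1280 complex IQ vector sampled at the
# PRF of 17 kHz; a synthetic 3 kHz tone stands in for real data here.
prf = 17_000
iq = np.exp(2j * np.pi * 3_000 / prf * np.arange(1280))
P = stft_power_spectrogram(iq)        # 17 frames x 256 frequency bins
```

The spectral peak of each frame falls in the FFT bin nearest 3 kHz (bin ≈ 3000/17000 × 256 ≈ 45), mirroring how micro-Doppler energy concentrates in the spectrogram images.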

After the spectrograms were saved, the dataset classes were rearranged following the approach of Karlsson et al. [5] to facilitate comparison with that study. The two human classes were combined into one class, and the six bird classes were likewise combined into a single bird class, as shown in Figure 4.

Figure 4.

Rearrangement of dataset classes.

Classification Stage

In our study, different machine learning algorithms were used to classify nine classes. Machine learning is a field that emerged from the idea of adapting human learning to computers and encompasses a wide range of algorithms and techniques. In this study, deep learning methods, a subfield of machine learning, and classical machine learning methods were applied to determine the model with the best performance on the problem at hand. The classification techniques used for the radar data are CNNs trained from scratch, pre-trained CNNs, SVMs, different Decision Tree (DT) architectures, and k-Nearest Neighbors (KNN).

CNNs are an artificial neural network (ANN) method developed for processing one-, two-, or higher-dimensional data such as images. What distinguishes CNNs from other methods is the inclusion of the mathematical convolution operation in the network [9]. The method yields highly successful results in many image-processing fields, including medical applications and human-computer interfaces [10, 11].

SVMs are powerful supervised learning algorithms used in classification and regression analysis. SVM transforms data into a high-dimensional space and finds the best-separating hyperplane in this space [12]. The goal is to separate data points belonging to different classes with the largest marginal distance. SVM can also perform more complex classification tasks by using kernel functions for nonlinear problems [13].

The DT is a machine learning algorithm that classifies data by separating them according to certain features. In the tree structure, each node represents a feature, branches represent the possible values of these features, and leaf nodes represent the classification result. This structure can work with both categorical and numerical data and makes decisions by separating the data. DTs are a preferred method, especially due to their transparency and easy understandability [14].

The KNN algorithm is a supervised machine learning method used in classification and regression problems. It is based on distance calculations between data points: the variable k denotes the number of nearest neighbors considered, and it can take any value according to need [15].
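As a minimal illustration of the distance-based voting just described (not the MATLAB implementation used in this study), a k-nearest-neighbor classifier fits in a few lines:

```python
import numpy as np

def knn_predict(X_train, y_train, X_test, k=3):
    """Classify each test point by majority vote among its k nearest
    training points (Euclidean distance)."""
    # pairwise squared distances, shape (n_test, n_train)
    d = ((X_test[:, None, :] - X_train[None, :, :]) ** 2).sum(axis=-1)
    nearest = np.argsort(d, axis=1)[:, :k]     # indices of the k neighbors
    votes = y_train[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])

X_train = np.array([[0.0, 0.0], [0.0, 1.0], [5.0, 5.0], [6.0, 5.0]])
y_train = np.array([0, 0, 1, 1])
pred = knn_predict(X_train, y_train, np.array([[0.2, 0.1], [5.5, 5.0]]))
# pred -> array([0, 1])
```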

To evaluate the performance of all methods, commonly used metrics were employed as benchmarks: accuracy, recall, precision, and F1 score [16, 17]. A confusion matrix, representing the counts of predicted versus actual values, was constructed for every evaluation. Each method was executed 10 times, and the standard deviations of the results are given in the tables. All algorithms were created and run in MATLAB R2022b.
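All of these metrics derive from the confusion matrix. A compact sketch of the standard formulas (generic definitions, not the paper's MATLAB code) is:

```python
import numpy as np

def metrics_from_confusion(cm):
    """Accuracy, macro-averaged precision/recall/F1, and Cohen's kappa
    from a confusion matrix (rows = actual, columns = predicted)."""
    cm = np.asarray(cm, dtype=float)
    tp = np.diag(cm)
    precision = tp / cm.sum(axis=0)            # per-class precision
    recall = tp / cm.sum(axis=1)               # per-class recall
    f1 = 2 * precision * recall / (precision + recall)
    accuracy = tp.sum() / cm.sum()
    # Cohen's kappa: agreement corrected for chance
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / cm.sum() ** 2
    kappa = (accuracy - pe) / (1 - pe)
    return accuracy, precision.mean(), recall.mean(), f1.mean(), kappa

# toy two-class example
acc, prec, rec, f1, kappa = metrics_from_confusion([[50, 2], [4, 44]])
```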

Experimental Studies

Convolution Neural Networks From Scratch

Different CNN architectures were designed to identify the model with the highest performance. Many hyperparameters affect the success of a CNN; based on a review of the literature, four parameters considered to influence performance (input data size, filter size, learning rate, and number of convolution layers) were addressed in this study. First, an architecture consisting of five convolutional blocks, presented in Figure 5, was constructed from scratch, and its performance was analyzed as the input data size, filter size, and learning rate were varied.

Figure 5.

Five-layer CNN structure.
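The paper does not state the stride or padding of the convolution blocks. As a rough illustration only, assuming "same"-padded convolutions each followed by 2 × 2 max pooling, the spatial size of the feature maps through the five blocks can be tabulated:

```python
def feature_map_sizes(input_size=100, n_blocks=5, pool=2):
    """Spatial size after each conv block, assuming 'same'-padded
    convolutions (size preserved) followed by 2x2 max pooling
    (integer halving). These details are assumptions, not taken
    from the paper."""
    sizes, s = [], input_size
    for _ in range(n_blocks):
        s //= pool
        sizes.append(s)
    return sizes

print(feature_map_sizes())      # [50, 25, 12, 6, 3] for a 100 x 100 input
```

This kind of calculation also shows why a sixth block is still feasible at 100 × 100 input resolution: the final feature map shrinks to 1 × 1 but does not vanish.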

First, the accuracy and performance achieved by using five scan segments as input were compared to those obtained with a single scan segment. The results in Table 4 show that using five scan segments yielded clearly better outcomes; consequently, five scan segments were used as input throughout the remainder of the study.

Input Data | Training Accuracy | Accuracy ± SD | Precision | Recall | F1-Score | Kappa | Process Time
Five scan segments (100 × 100 pixels) | 97.99 | 97.72 ± 1.32 | 97.74 | 97.72 | 97.73 | 0.9744 | 837 min 50 sec
One scan segment (100 × 100 pixels) | 91.88 | 91.69 ± 0.94 | 92.07 | 91.69 | 91.72 | 0.9065 | 824 min 24 sec

Table 4.

Comparison of the results when five scan segments and one scan segment are used as input.

Following this step, the first technique involves modifying the input image pixel sizes in the five-layer CNN algorithm. In this method, the input sizes are altered to 100 × 100, 300 × 300, 500 × 500, and 600 × 600. Example images with modified input pixel sizes are presented in Figure 6. In this context, a comparison is made between training time and overall accuracy based on different success criteria. Based on the results (Table 5) obtained from the experiments, the input data size yielding the highest model performance was determined. The study continued with this data size.

Figure 6.

Comparison of input sizes of spectrogram images.

Pixel Size | Training Accuracy | Accuracy ± SD | Precision | Recall | F1-Score | Kappa | Process Time
100 × 100 | 97.99 | 97.72 ± 1.32 | 97.74 | 97.72 | 97.73 | 0.9744 | 837 min 50 sec
300 × 300 | 96.84 | 96.9 ± 1.45 | 96.94 | 96.9 | 96.9 | 0.9651 | 857 min 50 sec
500 × 500 | 96.8 | 96.44 ± 1.24 | 96.46 | 96.44 | 96.44 | 0.96 | 2,005 min 5 sec
600 × 600 | 96.92 | 94.53 ± 1.08 | 94.86 | 94.53 | 94.58 | 0.9385 | 3,016 min 8 sec

Table 5.

Comparison of the effect of changing the pixel size of the input data on the results.

The performance of different adaptive learning rate algorithms was evaluated, using the OneCycleLR and ReduceLROnPlateau techniques. For both techniques, two cases were considered. In the first case, the minimum learning rate is set to 10⁻³ and the maximum learning rate to 10⁻². In the second case, the minimum learning rate is set to 10⁻⁴ and the maximum learning rate to 10⁻³. After applying these methods, they were compared based on their training time and overall accuracy according to various success criteria in Table 6.

Adaptive Learning Rate Algorithm | Training Accuracy | Accuracy ± SD | Precision | Recall | F1-Score | Kappa | Process Time
OneCycleLR (0.01–0.001 learning rate) | 97.88 | 97.47 ± 0.97 | 97.56 | 97.47 | 97.48 | 0.9715 | 1,025 min 42 sec
OneCycleLR (0.001–0.0001 learning rate) | 96.34 | 95.96 ± 1.07 | 96.1 | 95.96 | 95.98 | 0.9545 | 1,063 min 29 sec
ReduceLROnPlateau (0.01–0.001 learning rate) | 96.88 | 96.48 ± 1.16 | 96.51 | 96.48 | 96.47 | 0.9604 | 810 min 53 sec
ReduceLROnPlateau (0.001–0.0001 learning rate) | 96.3 | 95.66 ± 1.18 | 95.83 | 95.66 | 95.69 | 0.9511 | 1,053 min 29 sec

Table 6.

Comparison of the results when different adaptive learning rate algorithms are applied.
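The two schedules compared above can be sketched as plain functions. The minimum and maximum rates come from the text; the warm-up fraction, patience, and decay factor are illustrative assumptions:

```python
import math

def one_cycle_lr(step, total_steps, lr_min=1e-3, lr_max=1e-2):
    """OneCycleLR sketch: cosine ramp from lr_min up to lr_max over the
    first 30% of training, then anneal back down (30% is an assumption)."""
    warm = 0.3 * total_steps
    t = step / warm if step < warm else 1 - (step - warm) / (total_steps - warm)
    return lr_min + (lr_max - lr_min) * 0.5 * (1 - math.cos(math.pi * t))

def reduce_lr_on_plateau(lr, val_losses, patience=3, factor=0.5, lr_min=1e-4):
    """ReduceLROnPlateau sketch: halve the rate when the validation loss
    has not improved for `patience` epochs (patience and factor are
    assumptions, not values from the paper)."""
    if len(val_losses) > patience and \
       min(val_losses[-patience:]) >= min(val_losses[:-patience]):
        return max(lr * factor, lr_min)
    return lr
```

OneCycleLR follows a fixed per-step shape independent of the loss, while ReduceLROnPlateau reacts to stagnating validation loss; this difference explains why the two behave differently at the same rate bounds.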

Another experiment involves changing the filter size in the five-layer CNN: filter sizes of 3 × 3, 5 × 5, 7 × 7, and 9 × 9 are applied to all layers, with the objective of improving overall performance. To evaluate the impact of the filter size on the performance of the five-layer CNN, training time and overall accuracy are compared in Table 7.

Filter Size | Training Accuracy | Accuracy ± SD | Precision | Recall | F1-Score | Kappa | Process Time
3 × 3 | 97.99 | 97.72 ± 1.32 | 97.74 | 97.72 | 97.73 | 0.9744 | 837 min 50 sec
5 × 5 | 97.47 | 97.22 ± 1.49 | 97.23 | 97.22 | 97.22 | 0.9688 | 890 min 18 sec
7 × 7 | 96.38 | 96.34 ± 1.46 | 96.45 | 96.34 | 96.35 | 0.9589 | 890 min 36 sec
9 × 9 | 96.98 | 96.98 ± 1.38 | 97 | 96.98 | 96.97 | 0.966 | 910 min 49 sec

Table 7.

Comparison of the results when different filter sizes in the convolution layers are selected.

After the input data size, learning rate, and filter size were determined through experimental studies, the number of convolutional layers was varied using the specified parameters to evaluate its impact on the model’s performance.

Initially, the number of convolutional layers was set to 5; it was then reduced to 4 and increased to 6. The effect of the number of convolutional layers on model performance is shown in Table 8. The block diagrams of the models with four and six convolutional layers are presented in Figures 7 and 8, respectively.

Convolutional Layer Number | Training Accuracy | Accuracy ± SD | Precision | Recall | F1-Score | Kappa | Process Time
Six-layer CNN | 98.26 | 97.77 ± 0.79 | 97.78 | 97.77 | 97.77 | 0.9749 | 1,000 min 41 sec
Five-layer CNN | 97.99 | 97.72 ± 1.02 | 97.74 | 97.72 | 97.73 | 0.9744 | 837 min 50 sec
Four-layer CNN | 97.28 | 97.44 ± 1.21 | 97.54 | 97.44 | 97.46 | 0.9713 | 787 min 29 sec

Table 8.

Comparison of the results when the number of convolution layers is changed.

Figure 7.

Four-layer CNN structure.

Figure 8.

Six-layer CNN structure.

Transfer Learning

In deep networks created from scratch, when the amount of data is insufficient, the model’s performance can be adversely affected. In such cases, transfer learning can be applied, where pre-trained network architectures with previously learned weights are used, and only the weights of the final layer are updated with the training dataset. With the advancement of technology, pre-trained networks with various architectures are available. In this study, the pre-trained networks used include ResNet50, VGG16, VGG19, AlexNet, and GoogleNet. After using pre-trained networks, time and overall accuracy values are compared.

The ResNet50 network is a 50-layer model whose “residual connections” let the output of each layer feed directly into later layers, a structure that has proven to train deep networks more efficiently and effectively [18].

The VGG16 network has a total of 16 layers: 13 convolutional layers and 3 fully connected layers [19]. The VGG19 network has a total of 19 layers: 16 convolutional layers and 3 fully connected layers [19].

The AlexNet network consists of eight layers: five convolutional layers and three fully connected layers [20].

GoogLeNet has a modular architecture called “Inception,” which applies filters of different sizes in parallel within the same layer. It is 22 layers deep, and despite this depth, the number of fully connected layers is kept to a minimum to reduce the number of model parameters. The architecture also makes heavy use of 1 × 1 convolutions to increase accuracy while reducing computational cost and memory usage, which significantly improves the efficiency of deep networks [20].

In this study, these architectures were adopted, and only the final layer of each network was trained using spectrogram images.

Conventional Machine Learning Methods

The model performance was also evaluated using classical machine learning methods applied after feature extraction with a CNN. Since the model with six convolutional layers achieved the highest performance among the CNNs, this architecture was used for feature extraction. After feature extraction with the six-layer CNN, DT, KNN, and SVM algorithms were applied, and their training times and overall accuracies were compared.
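The final stage of this pipeline, a linear SVM trained on CNN features, can be sketched with a tiny hinge-loss trainer. This is a NumPy stand-in for the MATLAB linear SVM used in the study, and the two-cluster "features" below are synthetic placeholders for the six-layer CNN's feature vectors:

```python
import numpy as np

def train_linear_svm(X, y, lr=0.01, lam=0.01, epochs=200):
    """Linear SVM via subgradient descent on the hinge loss.
    Labels y must be +/-1. A minimal sketch, not MATLAB's SVM trainer."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        for xi, yi in zip(X, y):
            if yi * (xi @ w + b) < 1:        # inside the margin: push out
                w += lr * (yi * xi - lam * w)
                b += lr * yi
            else:                            # correct side: only shrink w
                w -= lr * lam * w
    return w, b

# Synthetic stand-in for CNN feature vectors: two separable clusters.
rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 0.5, (20, 2)), rng.normal(2, 0.5, (20, 2))])
y = np.array([-1] * 20 + [1] * 20)

w, b = train_linear_svm(X, y)
pred = np.sign(X @ w + b)
```

On well-separated features like these, the maximum-margin boundary classifies all points correctly, which is the behavior the hybrid CNN-SVM design relies on: the CNN produces separable features, and the SVM draws the widest-margin boundary between them.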

Results and Discussions

Using the methods described in the Methods and Experimental Studies sections, all results are shown in Tables 4–10. In the first analysis, the main objective was to compare the accuracy obtained with five scan segments against that obtained with a single scan segment. As shown in Table 4, using five scan segments significantly outperforms the single-segment approach in classification; therefore, subsequent experiments used five scan segments. In the second analysis, the effect of the input spectrogram pixel size on classification performance was evaluated with the five-layer CNN, with the input sizes adjusted as described in the Methods and Experimental Studies section. As shown in Table 5, a pixel size of 100 × 100 yielded the highest classification accuracy.

Next, adaptive learning rate algorithms were applied to the five-layer CNN. The OneCycleLR and ReduceLROnPlateau algorithms were used with the minimum and maximum learning rates specified in the Methods section. As shown in Table 6, OneCycleLR with a minimum learning rate of 0.001 and a maximum of 0.01 outperformed the other adaptive learning rate configurations. However, since its accuracy was still lower than that of the five-layer CNN with a 100 × 100 pixel input, the next step was to improve the five-layer CNN by modifying the filter sizes. Following the Methods and Experimental Studies section, filter sizes of 5 × 5, 7 × 7, and 9 × 9 were tested; however, changing the filter size did not improve accuracy, and the 3 × 3 filter remained the best, as shown in Table 7. As a result, a new approach was tested: modifying the CNN structure by adding and removing layers.

This experiment showed that a six-layer CNN improved overall classification accuracy, as shown in Table 8, although not to the desired extent. Consequently, the dataset was trained and classified using pre-trained networks, as described in the Methods and Experimental Studies section. The results in Table 9 show that the pre-trained networks delivered good classification performance (over 95% accuracy) but did not significantly improve overall accuracy. Based on these findings, a hybrid method combining a CNN with classical machine learning techniques was explored: a six-layer CNN was used for feature extraction, and classical machine learning algorithms were employed for training and testing. The results in Table 10 indicate that the combination of the six-layer CNN and the linear SVM achieved an accuracy of 98.04%. The confusion matrix for this result is shown in Figure 9.

Transfer Learning Architecture | Training Accuracy | Accuracy ± SD | Precision | Recall | F1-Score | Kappa | Process Time
ResNet50 | 97.53 | 97.24 ± 0.87 | 97.25 | 97.24 | 97.24 | 0.969 | 1,084 min 46 sec
VGG16 | 96.79 | 96.5 ± 0.92 | 96.53 | 96.5 | 96.49 | 0.9606 | 1,020 min 51 sec
VGG19 | 96.39 | 96.16 ± 0.98 | 96.27 | 96.16 | 96.17 | 0.9568 | 1,474 min 1 sec
AlexNet | 97 | 96.96 ± 0.79 | 97.01 | 96.96 | 96.95 | 0.9658 | 471 min 30 sec
GoogLeNet | 97.19 | 96.7 ± 0.83 | 96.81 | 96.7 | 96.71 | 0.9629 | 489 min 38 sec

Table 9.

Comparison of the results when different transfer learning algorithms are used.

Method | Training Accuracy | Accuracy | Precision | Recall | F1-Score | Kappa
Coarse Tree | 51.7 | 51.69 | 37.46 | 39.94 | 35.35 | 0.433
Fine KNN | 92.6 | 92.59 | 93.67 | 92.79 | 93.11 | 0.9149
Fine Tree | 85.7 | 85.69 | 86.62 | 84.45 | 85.32 | 0.8356
Linear SVM | 98.8 | 98.04 | 98.06 | 97.04 | 98.05 | 0.978
Medium Gaussian SVM | 94.4 | 94.42 | 95.67 | 93.51 | 94.46 | 0.9359
Medium KNN | 93.8 | 93.77 | 94.86 | 93.94 | 94.23 | 0.9284
Medium Tree | 76.5 | 76.45 | 70.36 | 69.27 | 68.16 | 0.729

Table 10.

Comparison of the results when different machine learning algorithms are used for classification.

Figure 9.

Test results with confusion matrix in linear SVM method.

The proposed study integrates both conventional machine learning algorithms and a novel methodological perspective to assess radar-based object classification performance. All models were implemented and analyzed within the MATLAB environment to ensure consistent evaluation conditions. Experimental results indicate that the effectiveness of each algorithm may vary depending on the dataset characteristics and radar signal structure. The proposed approach can be further examined by adapting datasets of different scales and radar platforms to evaluate its generalization capability. Given the increasing computational power of modern radar systems, the suggested methods are expected to be practically implementable and computationally efficient for real-time or near-real-time operations.

When the results of all experimental studies obtained in Table 4 through Table 10 are examined, the highest success was achieved when the features were extracted with the six-layer CNN algorithm created from scratch and classified with the SVM method.

The performance of all analyses is summarized in Figure 10. It can be seen that the parameter most affecting the model's accuracy is the number of scan segments in the input data, followed by the pixel size of the input data; the remaining parameters also have an impact. Based on these results, the proposed architecture uses five-scan input data at a resolution of 100 × 100 pixels, the OneCycleLR learning rate schedule, a 3 × 3 filter size, and six convolutional layers. The features extracted from this architecture were applied to different machine learning algorithms, with the results given in Table 10.

Figure 10.

Effect of different parameters on accuracy (OLR: OneCycleLR; RLR: ReduceLROnPlateau).

Compared with the studies in the literature (Table 11), our method achieved higher accuracy. For example, while previous works using CNNs and transfer learning on different radar types reported accuracies between 90% and 97%, our hybrid CNN–SVM model reached 98.8% on the Karlsson et al. [5] dataset. This demonstrates that combining deep feature extraction with SVM classification provides state-of-the-art performance for drone and bird classification.

Reference | Radar Type | Number of Classes | Classes | Method | Accuracy (%)
Karlsson et al. [5] | 77 GHz FMCW | 9 | Six different drones, human, bird, reflector | CNN | 90
White et al. [6] | L-band | 2 | Bird, drone | CNN | 90
Kumawat and Chakraborty [21] | Micro-Doppler | 5 | RC plane, three-short-blade rotor, three-long-blade rotor, quadcopter, bionic bird, and mini-helicopter + bionic bird | VGG16 | 95
 | | | | VGG19 | 97
Narayanan et al. [8] | 10 GHz X-band | 2 | Bird, drone | SVM | 96
This study | 77 GHz FMCW | 9 | Six different drones, human, bird, reflector | CNN–SVM | 98.8

Table 11.

Comparison of this study with other studies in literature.

However, certain limitations must be acknowledged. The study relies exclusively on a single publicly available dataset, which restricts generalizability to other radars and operational environments. Additionally, small differences in accuracy (e.g., 97.7% vs. 98.0%) may not be statistically significant. Statistical analyses could not be performed in this study because the data sets were different. Future work should include statistical validation (confidence intervals and hypothesis testing) to reinforce these findings. Moreover, while training times are reported, the practical deployment of CNN–SVM models in real-time systems requires analysis of computational cost and scalability. Graphics Processing Unit (GPU) acceleration or embedded platforms can enable near real-time inference, while FPGA implementations could further reduce latency and energy consumption.

Conclusions

Nowadays, due to the widespread production of drones and their easy accessibility, their use in different areas has become common. Because of the risks these uses may create, the importance of detecting drones with different sensors is increasing day by day. This study addressed this problem by exploiting micro-Doppler signatures and evaluating multiple machine learning approaches. The results demonstrated that most tested methods achieved accuracies above 90%, confirming the promise of CNN architectures, pre-trained models, and their combination with conventional classifiers for small air target classification.

The main contribution of this work lies in the proposed hybrid CNN–SVM approach, which achieved up to 98.8% accuracy. Beyond accuracy, this hybrid design improved stability on relatively small datasets by reducing overfitting and demonstrated state-of-the-art performance compared to standalone CNN or SVM models. Nevertheless, the study has limitations: it relied exclusively on a single publicly available dataset, and the observed accuracy improvements may not be statistically significant without further validation. Future research should therefore evaluate the approach with multiple radar datasets, different operating frequencies, and real-world collected signals to ensure broader generalizability.

From an application standpoint, the practical deployment of the proposed approach in real-time systems requires careful consideration of hardware efficiency and processing constraints. Recent advances in embedded GPU and FPGA technologies offer promising opportunities for achieving low-latency inference, which is essential for time-critical radar applications such as airport surveillance, infrastructure protection, and defense monitoring. Nevertheless, operational deployment may face challenges arising from noise, clutter interference, multiple moving targets, and adverse environmental conditions. Future work could explore optimized implementations and adaptive signal-processing techniques to enhance system robustness under real-world conditions.

In conclusion, this work provides a strong foundation for radar-based drone and bird classification using micro-Doppler signatures. With validation on diverse datasets and optimization for real-time hardware, the proposed CNN–SVM approach has significant potential for practical use in modern security systems.

Acknowledgments

AI tools have been used to improve the spelling of some sentences.

Author Contributions

Emre Can Ertekin: Conceptualization, Writing – original draft; Selda Güney: Conceptualization, Writing – review & editing.

Funding

This research did not receive external funding from any agency.

Ethical statement

Not applicable.

Data availability statement

Source data presented in this article are available in the study by Karlsson A., "Radar measurements on drones, birds and humans with a 77 GHz FMCW sensor," Sep. 2021. [Online]. Available: https://doi.org/10.5281/zenodo.5845259.

Conflict of Interest

The authors declare no conflict of interest.

References

  1. The Times. Turkey closes Diyarbakir airport for month after drone attack "by PKK"; 2022 [Accessed 2025 Oct 14]. Available from: https://www.thetimes.co.uk/article/turkey-closes-diyarbakir-airport-for-month-after-drone-attack-by-pkk-gmcj8skfz.
  2. Drone attack in Abu Dhabi kills 3, wounds 6 - CBC News; 2022 [Accessed 2025 Oct 14]. Available from: https://www.cbc.ca/news/world/abu-dhabi-drone-attack-1.6317555.
  3. Small drones are giving Ukraine an unprecedented edge - Wired; 2022 [Accessed 2025 Oct 14]. Available from: https://www.wired.com/story/drones-russia-ukraine-war/.
  4. Ukraine: How drones are changing the way of war - DW; 2022 [Accessed 2025 Oct 14]. Available from: https://www.dw.com/en/ukraine-how-drones-are-changing-the-way-of-war/a-61681013.
  5. Karlsson A, Jansson M, Hämäläinen M. Model-aided drone classification using convolutional neural networks. In: Proc. 2022 IEEE Radar Conference (RadarConf22); 2022. p. 1–9. doi:10.1109/RADARCONF2248738.2022.9764194.
  6. White D, Jahangir M, Wayman JP, Reynolds SJ, Sadler JP, Antoniou M. Bird and micro-drone Doppler spectral width and classification. In: Proc. 2023 24th International Radar Symposium (IRS), Berlin, Germany; 2023. p. 1–10. doi:10.23919/IRS57608.2023.10172408.
  7. Jiang W, Wang Y, Li Y, Lin Y, Shen W. Radar target characterization and deep learning in radar automatic target recognition: a review. Remote Sens. 2023;15(15):3742. doi:10.3390/rs15153742.
  8. Narayanan RM, Tsang B, Bharadwaj R. Classification and discrimination of birds and small drones using radar micro-Doppler spectrogram images. Signals. 2023;4(2):337–358. doi:10.3390/signals4020018.
  9. Goodfellow I, Bengio Y, Courville A. Deep learning. Cambridge, MA, USA: MIT Press; 2016.
  10. Rawat P, Kane L, Goswami M, Jindal A, Sehgal S. A review on vision-based hand gesture recognition targeting RGB-depth sensors. Int J Inf Technol Decis Mak. 2023;22(01):115–156. doi:10.1142/S0219622022300026.
  11. Tafsast A, Khelalef A, Ferroudji K, Hadjili ML, Bouakaz A, Benoudjit N. Enhanced ultrasound classification of microemboli using convolutional neural network. Int J Inf Technol Decis Mak. 2023;22(04):1169–1194. doi:10.1142/S0219622022500742.
  12. Cortes C, Vapnik V. Support-vector networks. Mach Learn. 1995;20(3):273–297. doi:10.1007/BF00994018.
  13. Sun M. A multi-class support vector machine: theory and model. Int J Inf Technol Decis Mak. 2013;12(06):1175–1199. doi:10.1142/S0219622013500338.
  14. Skolnik MI. Radar handbook. 3rd ed. New York, NY, USA: McGraw-Hill; 2008.
  15. Hastie T, Tibshirani R, Friedman J. The elements of statistical learning: data mining, inference, and prediction. 2nd ed. New York, NY, USA: Springer; 2009.
  16. Katarya R, Raturi A, Mehndiratta A, Thapper A. Impact of machine learning techniques in precision agriculture. In: Proc. 2020 3rd International Conference on Emerging Technologies in Computer Engineering: Machine Learning and Internet of Things (ICETCE); 2020. p. 1–6. doi:10.1109/ICETCE48199.2020.9091741.
  17. Czodrowski P. Count on kappa. J Comput Aided Mol Des. 2014;28(11):1049–1055. doi:10.1007/s10822-014-9759-6.
  18. He K, Zhang X, Ren S, Sun J. Deep residual learning for image recognition. In: Proc. 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Las Vegas, NV, USA; 2016. p. 770–778. doi:10.1109/CVPR.2016.90.
  19. Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In: Proc. 2015 International Conference on Learning Representations (ICLR), San Diego, CA, USA; 2015 [Accessed 2025 Oct 14]. Available from: https://arxiv.org/abs/1409.1556.
  20. Krizhevsky A, Sutskever I, Hinton GE. ImageNet classification with deep convolutional neural networks. Commun ACM. 2017;60(6):84–90. doi:10.1145/3065386.
  21. Kumawat HC, Chakraborty M, Raj AAB, Dhavale SV. DIAT-μSAT: small aerial targets' micro-Doppler signatures and their classification using CNN. IEEE Geosci Remote Sens Lett. 2021;19:1–5. doi:10.1109/LGRS.2021.3102039.
