Wuhan University Journal of Natural Sciences, Vol. 11, No. 6, 2006, 000-000
Article ID: 1007-1202200606-0000-00
Detecting DDoS Attacks against Web Server Using Time Series Analysis
□ Wu Qingtao Sh

To whom correspondence should be addressed: zqshao@ecust.edu.cn

1 Introduction

With the explosive growth of the Internet, Web services have become indispensable in people's daily life. However, the availability of Web services is threatened by Distributed Denial of Service (DDoS) attacks, which involve a large number of compromised hosts flooding the target server with requests, thus making the service unavailable to legitimate users. DDoS attacks have caused significant economic losses and have become a major security challenge for Web service providers [1].

Various methods have been proposed to detect DDoS attacks, including packet filtering, traffic profiling, and statistical analysis [2]. However, the high traffic volume and the dynamic nature of Web traffic make it difficult to distinguish between legitimate traffic and attack traffic. Furthermore, attackers often employ various evasion techniques, such as IP address spoofing and randomization, to avoid detection [3].

In recent years, time series analysis has been applied to the study of network traffic, due to its ability to capture the bursty and self-similar nature of network traffic [4]. The bursty feature of DDoS attacks, which involves a sudden increase in traffic volume, can be detected by analyzing the abrupt change in the time series data. Previous work has proposed various methods for detecting DDoS attacks based on time series analysis, such as wavelet-based approaches [5], entropy-based approaches [6], and change-point detection algorithms [7].

In this paper, we propose a method for detecting DDoS attacks against Web servers by analyzing abrupt changes in the time series data obtained from Web traffic. The time series is divided into a reference sliding window and a test sliding window, and the traffic is modeled as an Auto-Regressive (AR) process, so that an abrupt change appears as a departure from the model. By comparing two adjacent non-overlapping windows of the time series, attack traffic can be detected at a specific time point. Combined with alarm correlation and location correlation, the method determines not only the presence of a DDoS attack, but also when and where it occurs.

The remainder of this paper is organized as follows: Section 2 presents the related work on time series analysis for DDoS attack detection. Section 3 describes the proposed method in detail. Section 4 presents the experimental results and analysis. Finally, Section 5 concludes the paper and discusses future work.

2 Related Work

Time series analysis has been widely used in the study of network traffic, due to its ability to capture the bursty and self-similar nature of network traffic. Various methods have been proposed for detecting DDoS attacks based on time series analysis.

Wavelet-based approaches have been proposed for detecting DDoS attacks by analyzing the frequency components of the time series data [5]. The wavelet transform is used to decompose the time series data into different frequency components. Anomaly detection is performed on each frequency component, and the results are combined to detect DDoS attacks. However, the computational complexity of the wavelet transform limits the scalability of this approach.

Entropy-based approaches have been proposed for detecting DDoS attacks by analyzing the information content of the time series data [6]. The Shannon entropy is used to measure the randomness of the time series data. Anomaly detection is performed based on the deviation of the entropy value from the normal behavior. However, the entropy-based approach may not be effective for detecting low-rate DDoS attacks.
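As a rough illustration of the quantity involved (not the specific scheme of [6]), the Shannon entropy of the values observed in a window can be computed as below. What is being counted (here, source IP addresses) and the function name `shannon_entropy` are assumptions made for illustration only.

```python
from collections import Counter
from math import log2


def shannon_entropy(symbols):
    """Shannon entropy (in bits) of the empirical distribution of `symbols`."""
    counts = Counter(symbols)
    total = sum(counts.values())
    if total == 0:
        return 0.0
    # H = -sum(p * log2(p)) over the observed relative frequencies p
    return -sum((n / total) * log2(n / total) for n in counts.values())


# Hypothetical windows of source IPs: evenly spread vs. dominated by one source.
spread = ["10.0.0.1", "10.0.0.2", "10.0.0.3", "10.0.0.4"]
concentrated = ["10.0.0.9"] * 4
print(shannon_entropy(spread), shannon_entropy(concentrated))
```

A detector of this kind would compare the entropy of each window against its value under normal traffic; how that baseline is built is outside the scope of this sketch.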

Change-point detection algorithms have been proposed for detecting DDoS attacks by analyzing the abrupt change in the time series data [7]. The change-point detection algorithm is applied to the time series data, and the abrupt change points are identified as potential attack points. However, the change-point detection algorithm may produce false positives due to the noise in the time series data.

In this paper, we propose a method for detecting DDoS attacks against Web servers by analyzing abrupt changes in the time series data obtained from Web traffic. The method is based on the Auto-Regressive (AR) process, which models each value of the time series as a function of past observations, so that an abrupt change appears as a large deviation from the model's prediction.

3 Methodology

The proposed method analyzes the time series data obtained from Web traffic. The series is divided into a reference sliding window and a test sliding window, an Auto-Regressive (AR) model is fitted to describe the traffic, and two adjacent non-overlapping windows are compared to detect attack traffic at a specific time point. Alarm correlation and location correlation are then used to determine when and where the DDoS attack occurs. The following subsections describe each component.

3.1 Time Series Data Collection

The time series data used in this paper are obtained from the Web server logs. The server logs record the time stamp, the source IP address, the requested URL, and the response status code for each request made to the server. The requests are grouped into time intervals of fixed length, and the number of requests in each time interval is counted to obtain the time series data.
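The grouping step can be sketched as follows. This is a minimal illustration, assuming the time stamps have already been parsed from the log into seconds; the function name `requests_per_interval` and its parameters are hypothetical and not taken from the paper.

```python
def requests_per_interval(timestamps, interval_seconds, start=None, end=None):
    """Count requests in consecutive fixed-length intervals.

    timestamps: request times in seconds, in any order.
    Returns one count per interval, including intervals with no requests.
    """
    if not timestamps:
        return []
    start = min(timestamps) if start is None else start
    end = max(timestamps) if end is None else end
    n_bins = int((end - start) // interval_seconds) + 1
    counts = [0] * n_bins
    for t in timestamps:
        if start <= t <= end:
            counts[int((t - start) // interval_seconds)] += 1
    return counts


# Six requests binned into 2-second intervals starting at t = 0.
series = requests_per_interval([0.5, 1.2, 1.9, 4.0, 4.1, 9.7], 2, start=0)
print(series)  # [3, 0, 2, 0, 1]
```

Keeping the empty intervals matters: dropping them would distort the spacing of the series and hide quiet periods from the model.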

The time series is divided into a reference sliding window and a test sliding window. The reference sliding window is used to model the normal behavior of the Web traffic, and the test sliding window is used to detect the attack traffic. The length of the sliding window is a design parameter and should be chosen according to the traffic characteristics and the detection requirements.
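One way to form the two windows is sketched below, assuming both windows have the same length and meet at a candidate time index `t`. The helper names `adjacent_windows` and `slide` are hypothetical.

```python
def adjacent_windows(series, t, window_len):
    """Return (reference, test): two non-overlapping windows that meet at index t."""
    if t - window_len < 0 or t + window_len > len(series):
        raise ValueError("not enough samples around t for two full windows")
    reference = series[t - window_len:t]   # samples just before t
    test = series[t:t + window_len]        # samples from t onward
    return reference, test


def slide(series, window_len, step=1):
    """Yield (t, reference, test) for every position where both windows fit."""
    for t in range(window_len, len(series) - window_len + 1, step):
        reference, test = adjacent_windows(series, t, window_len)
        yield t, reference, test


for t, ref, tst in slide(list(range(10)), window_len=4):
    print(t, ref, tst)
```

Sliding both windows forward together means each new sample is first scored as test data and only later absorbed into the reference, which is what lets the reference window track slow, legitimate changes in load.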

3.2 Auto-Regressive Model

The Auto-Regressive (AR) model is a popular statistical model for time series analysis. The AR model assumes that the current value of the time series data is a linear combination of the past values, with some random noise added. The order of the AR model specifies the number of past values used in the linear combination.

The AR model can be represented as follows:

$$y(t) = c + \sum_{i=1}^{p} a_i \, y(t-i) + \varepsilon(t)$$

where y(t) is the current value of the time series, c is a constant term, a_i (i = 1, ..., p) are the coefficients of the past values, and ε(t) is the random noise. The order p of the AR model is the number of past values used in the linear combination.

The AR model can be estimated from the time series data using the least squares method or the maximum likelihood method. The estimated coefficients can be used to predict the future values of the time series data.
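The least squares estimate can be sketched with the standard library alone by solving the normal equations. This is an illustrative sketch, not the authors' implementation; `solve_linear`, `fit_ar`, and `predict_next` are hypothetical names.

```python
def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: window constant or too short")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for k in range(col, n + 1):
                m[r][k] -= f * m[col][k]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][k] * x[k] for k in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def fit_ar(series, p):
    """Least squares estimate of c and [a_1..a_p] in y(t) = c + sum a_i y(t-i) + e(t)."""
    rows = [[1.0] + [series[t - i] for i in range(1, p + 1)]
            for t in range(p, len(series))]
    targets = [series[t] for t in range(p, len(series))]
    k = p + 1
    # Normal equations: (X^T X) beta = X^T y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, targets)) for i in range(k)]
    beta = solve_linear(xtx, xty)
    return beta[0], beta[1:]


def predict_next(history, c, coeffs):
    """One-step-ahead prediction from the most recent len(coeffs) samples."""
    return c + sum(a * history[-i] for i, a in enumerate(coeffs, start=1))


# Noise-free AR(2) series with known parameters, used as a sanity check.
demo = [0.0, 10.0]
for _ in range(18):
    demo.append(1.0 + 0.5 * demo[-1] + 0.3 * demo[-2])
c_hat, a_hat = fit_ar(demo, 2)
print(c_hat, a_hat)
```

On real traffic a numerically sturdier solver (for example a QR-based least squares routine) would usually be preferable to the normal equations; they are used here only to keep the sketch dependency-free.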

3.3 Detection Algorithm

The proposed detection algorithm consists of the following steps:

Step 1: Initialize the reference sliding window and the test sliding window.

Step 2: Estimate the AR model for the reference sliding window.

Step 3: Compute the predicted values of the test sliding window using the estimated AR model.

Step 4: Compute the residual values of the test sliding window by subtracting the predicted values from the actual values.

Step 5: Compute the mean and standard deviation of the residual values for the reference sliding window.

Step 6: Compute the z-score for each residual value in the test sliding window using the mean and standard deviation computed in Step 5.

Step 7: Identify the time point of the abrupt change in the test sliding window as the point where the maximum z-score is observed.

Step 8: Determine the presence of DDoS attack based on the threshold value of the z-score.

Step 9: Combine the alarm correlation and location correlation to determine the occurring time and location of the DDoS attack.

The threshold value of the z-score is a design parameter and should be chosen according to the noise level and the detection requirements. The alarm correlation and location correlation can be performed by analyzing the traffic patterns of the source IP addresses and the requested URLs.
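Steps 3 to 8 can be sketched as follows, assuming the AR parameters from Step 2 are already available. The function names, the use of the population standard deviation, and the choice to seed the test-window predictions with the last samples of the reference window are assumptions of this sketch rather than details given in the paper.

```python
from statistics import mean, pstdev


def residuals(window, history, c, coeffs):
    """Actual minus one-step-ahead predicted value for each sample in `window`.

    `history` supplies the samples immediately preceding `window`, needed as lags.
    """
    p = len(coeffs)
    if len(history) < p:
        raise ValueError("history must hold at least p samples")
    full = list(history[-p:]) + list(window)
    out = []
    for t in range(p, len(full)):
        predicted = c + sum(a * full[t - i] for i, a in enumerate(coeffs, start=1))
        out.append(full[t] - predicted)
    return out


def detect(reference, test, c, coeffs, threshold=3.0):
    """Return whether the test window contains an abrupt change, and where."""
    p = len(coeffs)
    # Step 5: residual statistics on the reference window (first p samples are lags).
    ref_res = residuals(reference[p:], reference[:p], c, coeffs)
    mu, sigma = mean(ref_res), pstdev(ref_res)
    if sigma == 0:
        raise ValueError("reference residuals have zero spread")
    # Steps 3-4: residuals on the test window, lags taken from the reference tail.
    test_res = residuals(test, reference, c, coeffs)
    # Step 6: z-score of each test residual.
    z = [(r - mu) / sigma for r in test_res]
    # Step 7: location of the maximum z-score.
    k = max(range(len(z)), key=lambda i: z[i])
    # Step 8: compare against the threshold.
    return {"attack": z[k] > threshold, "index": k, "z": z[k]}


# Hypothetical AR(1) parameters and request counts, for illustration only.
ref = [10, 11, 10, 9, 10, 11, 10, 9]
print(detect(ref, [10, 11, 10, 9], c=5.0, coeffs=[0.5]))
print(detect(ref, [10, 11, 60, 58], c=5.0, coeffs=[0.5]))
```

Step 9 is not covered here; it needs the per-request source addresses and URLs rather than the aggregated counts.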

4 Experimental Results and Analysis

In this section, we present the experimental results and analysis of the proposed method for detecting DDoS attacks against Web servers. The experiments are conducted in a test environment using simulated DDoS attack traffic.

4.1 Experimental Setup

The test environment consists of a Web server, a client machine, and an attacker machine. The Web server hosts a simple Web application that generates dynamic Web pages in response to the client requests. The client machine generates the legitimate traffic by sending HTTP requests to the Web server at a fixed rate. The attacker machine generates the attack traffic by sending HTTP requests to the Web server at a high rate, using multiple threads.

The time series data are collected from the Web server logs, and the sliding window length is set to 5 minutes. The AR model is estimated using the least squares method, and the order of the AR model is set to 2. The threshold value of the z-score is set to 3, and the alarm correlation and location correlation are performed by analyzing the source IP addresses and the requested URLs.
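The parameter choices above can be collected in a small configuration object, sketched below. The counting interval is not stated in the text, so `interval_seconds` is purely an assumed value, and the class name `DetectorConfig` is hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DetectorConfig:
    window_seconds: int = 300    # 5-minute sliding window
    ar_order: int = 2            # order p of the AR model
    z_threshold: float = 3.0     # z-score threshold
    interval_seconds: int = 10   # ASSUMPTION: counting interval, not given in the text

    @property
    def samples_per_window(self):
        return self.window_seconds // self.interval_seconds

    def validate(self):
        # Least squares needs more equations (samples - p) than unknowns (p + 1).
        equations = self.samples_per_window - self.ar_order
        unknowns = self.ar_order + 1
        if equations <= unknowns:
            raise ValueError("window too short for the chosen AR order")
        return self


config = DetectorConfig().validate()
print(config.samples_per_window)
```

The check makes explicit that the window length and counting interval jointly bound the AR order that can be estimated from one window.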

4.2 Experimental Results

Figure 1 shows the time series data of the legitimate traffic and the attack traffic. The legitimate traffic has a stable pattern with some minor fluctuations, while the attack traffic has a bursty pattern with a sudden increase in traffic volume.

Figure 2 shows the predicted values and the residual values of the test sliding window, using the estimated AR model. The predicted values follow the normal behavior of the Web traffic, while the residual values show a sudden increase at the time point of the DDoS attack.

Figure 3 shows the z-scores of the residual values in the test sliding window. The z-scores are computed using the mean and standard deviation of the residual values in the reference sliding window. The maximum z-score is observed at the time point of the DDoS attack, indicating the presence of an abnormal event.

Figure 4 shows the alarm correlation and location correlation of the DDoS attack. The source IP addresses and the requested URLs of the attack traffic are analyzed to determine the occurring time and location of the DDoS attack. The results show that the attack traffic originates from a single IP address and targets a specific URL, indicating a targeted attack.
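A simple form of this analysis is to count which source addresses and URLs dominate the requests inside the flagged interval. The sketch below assumes each log record is a dictionary with `time`, `ip`, and `url` keys; these field names, the function `top_contributors`, and the sample records are hypothetical, and the paper does not specify the correlation procedure at this level of detail.

```python
from collections import Counter


def top_contributors(records, start, end, key, n=3):
    """Most frequent values of `key` among records with start <= time < end.

    Returns (value, count, share_of_requests) tuples, most frequent first.
    """
    counts = Counter(r[key] for r in records if start <= r["time"] < end)
    total = sum(counts.values())
    return [(value, count, count / total) for value, count in counts.most_common(n)]


# Made-up records: one source dominates the flagged interval [100, 200).
records = [
    {"time": 101, "ip": "10.0.0.9", "url": "/login"},
    {"time": 102, "ip": "10.0.0.9", "url": "/login"},
    {"time": 103, "ip": "10.0.0.9", "url": "/login"},
    {"time": 104, "ip": "10.0.0.9", "url": "/login"},
    {"time": 150, "ip": "10.0.0.2", "url": "/index"},
    {"time": 250, "ip": "10.0.0.2", "url": "/index"},
]
print(top_contributors(records, 100, 200, "ip"))
print(top_contributors(records, 100, 200, "url"))
```

A source or URL whose share rises sharply relative to the reference window is a candidate origin or target; with spoofed addresses the URL view is likely to be the more reliable of the two.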

4.3 Discussion

The experimental results demonstrate the effectiveness of the proposed method for detecting DDoS attacks against Web servers. The method is able to detect the bursty pattern of the attack traffic by analyzing the abrupt change in the time series data, and to determine the occurring time and location of the DDoS attack by combining the alarm correlation and location correlation.

However, the proposed method has some limitations and challenges. First, the method relies on the assumption that the normal behavior of the Web traffic can be modeled by an AR process, which may not be accurate for all traffic patterns. Second, the method may produce false positives or false negatives due to the noise in the time series data or the evasion techniques employed by the attackers. Third, the method may not be effective for detecting low-rate DDoS attacks, which may be masked by the legitimate traffic.

5 Conclusion

In this paper, we have presented a method for detecting DDoS attacks against Web servers by analyzing abrupt changes in the time series data obtained from Web traffic. The method is based on the Auto-Regressive (AR) process, which models each value of the time series as a function of past observations, so that an abrupt change appears as a large deviation from the model's prediction. The method detects the bursty pattern of the attack traffic and, by combining alarm correlation and location correlation, determines when and where the DDoS attack occurs. The experimental results in a test environment have demonstrated the effectiveness of the proposed method. However, further research is needed to address the limitations and challenges of the method and to improve its performance and scalability.
