A Risk-Based Sensor Management Method for Target Detection in the Presence of Suppressive Jamming

Abstract: In this paper, a risk-based sensor management method for target detection in the presence of suppressive jamming is proposed, in which the available sensors are dynamically scheduled to control the risk incurred while executing the target detection task. Firstly, the sensor detection models in the absence and presence of jamming are established, and the calculation method of the target detection risk is presented based on the target detection probability. Secondly, the sensor radiation model is established, and the calculation method of the radiation risk is given by using a hidden Markov filter. Then, a non-myopic objective function is constructed to minimize the sum of the detection risk and the radiation risk. Furthermore, in order to obtain the optimal solution of the objective function quickly, a decision tree search algorithm combining branch and bound theory with greedy search is proposed. Finally, simulations are conducted, and the results show that the proposed search algorithm reduces search time and memory consumption compared with existing algorithms, and that the proposed management method achieves a lower total risk than existing methods.


INTRODUCTION
Multi-sensor systems have played a major role in civil and military fields in recent years [1][2][3]. In networked warfare, an effective sensor management method is needed to obtain the best combat benefit under an uncertain battlefield environment, diverse forms of information, complex relationships between operational nodes, and the requirement for real-time decision-making. Since Nash used linear programming theory to establish a sensor management objective function in 1977, sensor management methods based on Bayesian theory have evolved into three main kinds: the task-based management method, the information-based management method and the risk-based management method.
The first two methods aim at obtaining good tactical indicators through sensor management. Typical indicators of the task-based method are the posterior Cramér-Rao lower bound [4][5], the covariance matrix of the target state [6][7], and the detection probability [8]. Typical indicators of the information-based method are the Fisher information [9][10], the Kullback-Leibler divergence [11] and the Rényi divergence [12]. However, the first two methods also have some shortcomings. In combat, commanders may not understand the meaning of these indicators and may be unable to relate them to actual operations. Furthermore, the first two methods cannot deal flexibly with the actual battlefield environment. For example, when sensors are used for target detection in a certain area, the average detection probability over the whole area can be maximized by the first two methods, but if the detection probability at specific places in the area (e.g. the places near our important position) needs to be as high as possible, the first two methods cannot meet this need. In other words, the indicators of the first two methods have strict mathematical definitions, which make them unable to adapt to different needs [13]. To solve these problems, the risk-based management method was proposed in [14]. Because every decision is accompanied by risks caused by the uncertain target environment, this method sets a risk function and aims to minimize it, reducing the loss brought by the risk rather than pursuing the optimal value of the above tactical indicators [15]. Meanwhile, the actual battlefield environment is considered in the modeling of the risk-based method. For these reasons, the risk-based management method has practical value and has become a research focus. Current research on the risk-based method focuses mainly on target detection [16], target recognition [17], target threat assessment [13] and target tracking [18][19].
However, a review of the above literature on the risk-based method reveals the following deficiencies: (1) Electromagnetic jamming is not taken into account. The management methods in the above literature are designed for sensors in normal operation. Management models and methods for conventional sensors are no longer applicable when the sensor performance is degraded by electromagnetic jamming from the enemy. (2) The radiation risk of the sensors is not taken into account. Active sensors in service keep emitting electromagnetic waves, so their signals are easily intercepted by enemy detectors. The above literature considers only the risk in the sensors' measurement of the target or the environment, while the radiation risk generated by the sensors themselves is neglected.
To solve these problems, a risk-based sensor management method for target detection in the presence of suppressive jamming is proposed in this paper. The risk is defined as the product of the loss and the probability of the loss, and the sensors are managed with the minimum total risk as the decision-making objective, where the total risk is divided into the detection risk and the radiation risk.
The rest of this paper is organized as follows. In Section 2, a target detection model is established that gives measurement models for the sensors in the absence and presence of jamming, and the calculation method for the detection risk is derived. The sensor radiation is modeled in Section 3, which presents the means of calculating the radiation risk. The objective function under non-myopic management is built in Section 4. In Section 5, an improved decision tree search algorithm is proposed to cope with the high computational complexity of searching for the optimal solution. The effectiveness of the proposed method is verified through simulations in Section 6, and conclusions are drawn in Section 7.

TARGET DETECTION MODEL

Scene of Target Detection Task
It is assumed that the enemy dispatches several electronic jamming aircraft to apply suppressive jamming to the sensors we have deployed, in order to hinder our N sensors from monitoring the area O of 200 × 200 km. The center point of the area O is our defending target, with coordinates $(x_c, y_c)$. The scene diagram is shown in Fig. 1.

Sensor Detection Model in the Absence of Jamming
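In the absence of jamming, the maximum detection distance of sensor n at time k can be described by Eq. (1) [20]:

$$R_k^n = \left[ \frac{P_n G_t^n G_r^n \varepsilon_{rcs} \lambda_n^2}{(4\pi)^3 \sigma K_n B_n F_n L_n (SNR)_{\min}^n} \right]^{1/4} \tag{1}$$

where $P_n$ is the transmitting power of the sensor, $G_t^n$ is the transmitting antenna gain, $G_r^n$ is the receiving antenna gain, $\varepsilon_{rcs}$ is the target RCS, taken as $\varepsilon_{rcs} = 10$ m² in this paper, $\lambda_n$ is the wavelength of the sensor signal, $\sigma$ is the Boltzmann constant, $K_n$ is the noise temperature, $B_n$ is the receiver bandwidth, $F_n$ is the noise factor of the receiver, $L_n$ is the overall loss of the sensor, and $(SNR)_{\min}^n$ is the minimum detectable signal-to-noise ratio. Since the echo power falls off with the fourth power of distance, the SNR at a point o in the area is

$$SNR_{k,o}^n = (SNR)_{\min}^n \left( \frac{R_k^n}{r_o} \right)^4$$

where $r_o$ is the distance of o from sensor n. According to [21], the detection probability is approximately calculated as follows:

$$pd_{k,o}^n \approx 0.5 \, \mathrm{erfc}\left( \sqrt{-\ln pf^n} - \sqrt{SNR_{k,o}^n + 0.5} \right), \qquad \mathrm{erfc}(x) = 1 - \frac{2}{\sqrt{\pi}} \int_0^x e^{-a^2} \, da$$

where $pf^n$ is the false alarm probability. By using N sensors for joint detection, the detection probability at o can be calculated as follows:

$$pd_{k,o} = 1 - \prod_{n=1}^{N} \left( 1 - pd_{k,o}^n \right)$$

Then, the missing alarm probability is

$$pm_{k,o} = 1 - pd_{k,o}$$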
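As a rough illustration of the jamming-free detection model, the sketch below evaluates the standard radar range equation, the SNR at a point, a North-type approximation of the single-sensor detection probability, and the joint detection probability of several sensors. The function names, the loss term, the exact form of the approximation and the choice of a receiving gain equal to the transmitting gain are assumptions made for illustration; they are not taken from [20] or [21].

```python
import math

BOLTZMANN = 1.380649e-23  # Boltzmann constant, J/K


def db_to_linear(value_db):
    """Convert a decibel value to a linear ratio."""
    return 10.0 ** (value_db / 10.0)


def max_detection_range(p_t, g_t, g_r, rcs, wavelength, noise_temp,
                        bandwidth, noise_factor, loss, snr_min):
    """Jamming-free maximum detection range in metres (linear SI inputs)."""
    numerator = p_t * g_t * g_r * rcs * wavelength ** 2
    denominator = ((4.0 * math.pi) ** 3 * BOLTZMANN * noise_temp * bandwidth
                   * noise_factor * loss * snr_min)
    return (numerator / denominator) ** 0.25


def snr_at(distance, r_max, snr_min):
    """SNR at a distance: equals snr_min at r_max and scales as distance**-4."""
    return snr_min * (r_max / distance) ** 4


def detection_probability(snr, p_fa):
    """Assumed North-type approximation of the detection probability."""
    return 0.5 * math.erfc(math.sqrt(-math.log(p_fa)) - math.sqrt(snr + 0.5))


def joint_detection_probability(probabilities):
    """Probability that at least one sensor detects (independent sensors)."""
    miss = 1.0
    for p in probabilities:
        miss *= 1.0 - p
    return 1.0 - miss


# Example with the S1 parameters listed in the simulation section; the
# receiving gain is not listed there, so it is ASSUMED equal to the
# transmitting gain.
snr_min = db_to_linear(10.0)
r_max = max_detection_range(
    p_t=100e3, g_t=db_to_linear(36.0), g_r=db_to_linear(36.0), rcs=10.0,
    wavelength=0.11, noise_temp=290.0, bandwidth=5e6,
    noise_factor=db_to_linear(3.0), loss=db_to_linear(3.0), snr_min=snr_min)
p_single = detection_probability(snr_at(0.8 * r_max, r_max, snr_min), p_fa=1e-6)
p_joint = joint_detection_probability([p_single, p_single])
```

Working in linear units inside the functions and converting from decibels only at the call site keeps the range equation free of mixed-unit mistakes.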

Sensor Detection Model in The Presence of Jamming
A diagram of the suppressive jamming scene is shown in Fig. 2. Our sensor is set to detect potential targets in the monitored area. However, the enemy's jamming aircraft transmits strong noise to apply suppressive jamming to our sensor, which degrades the sensor performance. In the presence of jamming, the maximum detection distance of sensor n at time k can be described by Eq. (6) [22],
where $r_k^n$ is the distance between the jamming aircraft and the sensor at time k, $P_j$ is the transmitting power of the jammer, $\Delta f_j$ is the bandwidth of the jamming signal, $G_j$ is the antenna gain of the jammer, $\kappa_j$ is the overall loss of the jamming signal, $(SJR)_{\min}$ is the minimum detectable signal-to-jamming ratio, $\Delta f_r^n$ is the receiver bandwidth, K is a proportionality constant, and $\theta_{0.5}^n$ is the width of the main lobe of the sensor.
The SJR can be obtained by rearranging Eq. (6).

Target Detection Risk
Whether accurate target information can be obtained is an uncertain event, which causes risk when the sensors execute detection. In this paper, the closer a point is to the defending target, the more important it is and the higher the loss resulting from a missing alarm will be. For a point o in the area O, an importance priority $l_o$ is assigned that decreases with the distance between o and the defending target; $\rho$ is a proportionality coefficient, taken as $\rho = 0.0002$ in this paper, and $x_o$ and $y_o$ are the horizontal and vertical coordinates of o. Fig. 4 shows the distribution of the importance priority in the area O. The detection risk $D_{k,o}$ at o is the product of the importance priority $l_o$ and the missing alarm probability $pm_{k,o}$:

$$D_{k,o} = l_o \, pm_{k,o}$$

The detection risk in the area O at time k is obtained by accumulating $D_{k,o}$ over all points o in O. When the scheduling action of the sensors is taken into account, $pm_{k,o}$ is replaced by $pm_o(A_{k-1})$, the missing alarm probability at o after executing the scheduling action $A_{k-1}$.
When the decision step is H, the cumulative detection risk in the time domain [k + 1, k + H] is the sum of the detection risks at the time steps in that interval.
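The accumulation of detection risk over the area and over the decision horizon can be sketched as follows. The area is assumed here to be discretized into grid points, and the importance priorities are passed in as plain numbers rather than computed; the function names and the sample values are hypothetical.

```python
def detection_risk(priorities, miss_probabilities):
    """Area detection risk at one time step: sum of l_o * pm_o over grid points."""
    if len(priorities) != len(miss_probabilities):
        raise ValueError("one missing-alarm probability is needed per grid point")
    return sum(l_o * pm_o for l_o, pm_o in zip(priorities, miss_probabilities))


def cumulative_detection_risk(priorities, miss_probabilities_per_step):
    """Sum of the per-step detection risks over a horizon of H steps."""
    return sum(detection_risk(priorities, pm) for pm in miss_probabilities_per_step)


# Hypothetical three-point grid and a horizon of H = 2 steps.
priorities = [1.0, 0.5, 0.2]
miss_per_step = [[0.1, 0.4, 0.9], [0.2, 0.2, 0.5]]
risk_now = detection_risk(priorities, miss_per_step[0])
risk_horizon = cumulative_detection_risk(priorities, miss_per_step)
```

In this toy case the third point has the highest missing alarm probability but the lowest priority, so it contributes less to the first-step risk than the second point does.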

SENSOR RADIATION MODEL
In [23,24], the sensor intercepted probability is calculated from parameters such as the transmitting power, the target state, and the window function of the receiver. However, the sensor intercepted probability cannot be calculated directly in target detection, since the parameters of the potential targets in the surveillance area are unknown. To solve this problem, the Emission Level Impact (ELI), which indicates the cumulative amount of radiation received by the enemy, is used in place of the sensor intercepted probability in [25,26]. Its calculation does not require target parameters, so it can be applied to target detection. Therefore, an ELI-based sensor radiation model is built in this paper.

Radiation State
The system radiation state at time k is defined as the collection of the ELI states of the N sensors, where the ELI state of each sensor takes a value in $\{1, \ldots, E_{\max}\}$. The greater the ELI value is, the greater the intercepted probability of sensor n will be [25]. Hence, a corresponding relationship between ELI and the interception cost is established in this paper: when the ELI state of sensor n at time k is i, a corresponding sensor intercepted probability is assigned. The ELI state transition process can be approximated as a Markov process by introducing a state transition matrix $T_n$ to describe the state transitions. If sensor n is in service, its ELI state evolves according to $T_n$; otherwise, $T_n$ is an identity matrix.

Radiation Observation
The radiation observation set at time k is defined as the set of instantaneous threat levels $Z_k^n$ observed for the sensors. The observation probability denotes the probability that the instantaneous threat level is observed to be q when the ELI value of sensor n transfers from i to j.

Radiation Risk
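As the radiation state of a sensor cannot be observed in actual combat [25], the belief state $c_k^n$, a probability distribution over the ELI states of sensor n at time k, is used in its place. After the instantaneous threat level $Z_k^n = q$ is obtained for the scheduled sensor n at time k, the belief state $c_k^n$ is updated by the hidden Markov filter: the previous belief state is propagated through the state transition matrix, weighted by the observation probabilities of q, and normalized. In this update, the symbol $\circ$ represents the Hadamard product and $\mathbf{1}$ is an $E_{\max}$-dimensional vector of ones. If the sensor is not in service, its belief state is left unchanged.

Although $Z_k^n$ cannot be predicted at time k − 1, its probability distribution can be calculated from the belief state at time k − 1, the state transition matrix and the observation probabilities. According to the relationship between ELI and the sensor intercepted probability, the predicted intercepted probability of sensor n at time k is then calculated from the predicted belief state, where $V = [1, \ldots, E_{\max}]$ represents the ELI values. The radiation risk of sensor n at time k is defined as the product of this predicted intercepted probability and a weight indicating the importance of the sensor for combat, which is part of the prior knowledge. Combining with the scheduling action, the radiation risk of the sensor system is obtained from the radiation risks of the scheduled sensors.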
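A minimal sketch of one hidden-Markov-filter step and of the resulting radiation risk is given below. It assumes that the observation probability depends on the transition i to j, as described for the radiation observation, and that each ELI state has a known intercepted probability; the function names, the table of intercepted probabilities per state and the numerical values are hypothetical.

```python
def update_belief(belief, transition, observation_q):
    """One hidden-Markov-filter step for an observed threat level q.

    belief[i]           prior probability of ELI state i
    transition[i][j]    probability of moving from state i to state j
    observation_q[i][j] probability of observing q on the transition i -> j
    Returns the posterior belief and the predicted probability of observing q.
    """
    size = len(belief)
    unnormalized = [
        sum(belief[i] * transition[i][j] * observation_q[i][j] for i in range(size))
        for j in range(size)
    ]
    total = sum(unnormalized)  # predicted probability of the observation q
    if total <= 0.0:
        raise ValueError("observation has zero probability under the model")
    return [value / total for value in unnormalized], total


def predicted_intercept_probability(belief, intercept_by_state):
    """Expected intercepted probability under a belief over ELI states."""
    return sum(c * p for c, p in zip(belief, intercept_by_state))


def radiation_risk(tactical_value, belief, intercept_by_state):
    """Risk = loss (tactical value of the sensor) * probability of the loss."""
    return tactical_value * predicted_intercept_probability(belief, intercept_by_state)


# Hypothetical two-state example.
prior = [1.0, 0.0]
transition = [[0.5, 0.5], [0.0, 1.0]]
observation_q = [[1.0, 0.5], [1.0, 1.0]]
posterior, prob_q = update_belief(prior, transition, observation_q)
risk = radiation_risk(6.0, [0.5, 0.5], [0.1, 0.4])
```

The element-wise product of the transition and observation tables inside `update_belief` plays the role of the Hadamard product, and dividing by `total` is the normalization by the all-ones vector.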

OBJECTIVE FUNCTION
In this paper, the purpose of sensor management is to dynamically decide a reasonable sensor scheduling scheme that controls the total risk, which is the sum of the detection risk and the radiation risk. The sensors to be used for detection must be determined before the start of each decision-making cycle; that is, the optimal sensor scheduling action $A_k$ to be executed at time k + 1 is decided at time k.
The non-myopic management method based on multi-step cumulative benefits outperforms the myopic method based on the one-step benefit [27]. Thus, we establish a non-myopic management model in this paper. Combining Eq. (13) and Eq. (23), when the decision step is H, the non-myopic objective function minimizes the cumulative total risk in the time domain [k + 1, k + H] over the scheduling sequence $A_{k:k+H-1}$.

Figure 5 The process of management method

ALGORITHM DESIGN
The solution space of the objective function grows exponentially with the number of time steps: there are $(2^N - 1)^H$ solutions in total. The computational complexity of solving the problem exhaustively is therefore too high to satisfy the real-time requirement. For this reason, the idea of decision tree search is introduced and a branch-and-bound-based greedy search algorithm is put forward, so as to find the optimal solution in a short time.
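To make the size of the solution space concrete, the sketch below enumerates the non-empty sensor subsets as candidate scheduling actions and searches every sequence exhaustively. It is only a baseline for illustration: the per-step risk function `toy_risk` is invented, and it ignores the dependence of the real risk on earlier actions.

```python
from itertools import combinations, product


def scheduling_actions(n_sensors):
    """All non-empty sensor subsets, i.e. 2**n_sensors - 1 candidate actions."""
    sensors = range(n_sensors)
    return [subset for size in range(1, n_sensors + 1)
            for subset in combinations(sensors, size)]


def solution_count(n_sensors, horizon):
    """Number of scheduling sequences over a horizon of H steps."""
    return (2 ** n_sensors - 1) ** horizon


def exhaustive_search(n_sensors, horizon, step_risk):
    """Evaluate every sequence and return the one with the lowest total risk."""
    actions = scheduling_actions(n_sensors)
    best_sequence, best_risk = None, float("inf")
    for sequence in product(actions, repeat=horizon):
        risk = sum(step_risk(step, action) for step, action in enumerate(sequence))
        if risk < best_risk:
            best_sequence, best_risk = sequence, risk
    return best_sequence, best_risk


def toy_risk(step, action):
    """Invented per-step risk: prefers two low-index sensors at every step."""
    return abs(len(action) - 2) + 0.1 * sum(action)


count_n4_h3 = solution_count(4, 3)
best_sequence, best_risk = exhaustive_search(3, 2, toy_risk)
```

For N = 4 and H = 3 the count is already 15 to the third power, which is why the exhaustive loop above becomes impractical as the horizon grows.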
First, the sensor management problem is transformed into a decision tree. The decision tree for N = 4 and H = 3 is shown in Fig. 6. Each node i in level h is associated with the cumulative total risk of the scheduling sequence that leads to it, and the nodes in level h are obtained by expanding the scheduling action $A_{k+h-1}$ from the nodes in the previous level. Hence, the scheduling sequence contained in the lowest-level node with the minimum cumulative total risk is the optimal solution of the objective function. However, greedy search (GS) alone takes a large amount of time, since the number of nodes in each level increases exponentially. To open as few nodes as possible, branch and bound theory [28] is combined with GS in this paper, so that any node whose lower bound is larger than the current minimum total risk is deleted promptly, which accelerates the search. If the scheduling sequence corresponding to a node in level h is $A_{k:k+h-1}$, the lower bound value of the node is given by Eq. (28). If the lower bound of the node is larger than the minimum cumulative total risk $\psi_{\min}$ obtained so far in the search, the node and its subsequent branches are deleted, and the remaining nodes are expanded in accordance with GS. The specific algorithm process is shown in Tab. 1.

Table 1 Branch-and-bound-based greedy search algorithm
Step 1: Initialize: put the root node into the list and set the minimum cumulative total risk to $\psi_{\min} = +\infty$.
Step 2: While (the list is not null)
    Expand the first node in the list and delete it from the list.
    If (the child nodes are in the lowest level)
        Calculate the cumulative total risk of each child node.
        If (the minimum value is less than $\psi_{\min}$)
            Assign the value to $\psi_{\min}$.
            Record the corresponding node as the optimal node.
        end
    else
        Calculate the lower bound value of each child node.
        Put the child nodes with a lower bound less than $\psi_{\min}$ into the list, and delete the other child nodes.
    end
end
Step 3: Obtain the scheduling sequence from the optimal node; this sequence is the optimal solution of the objective function.
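The search procedure can be sketched in Python as below. This is an illustrative variant, not the exact algorithm of Tab. 1: the lower bound of a node is assumed to be simply its accumulated risk (valid only when every per-step risk is non-negative) instead of Eq. (28), children are expanded cheapest-first from a stack, and the per-step risk function `toy_step_risk` is invented.

```python
import math
from itertools import combinations


def bb_greedy_search(actions, horizon, step_risk):
    """Depth-first, cheapest-child-first search with branch-and-bound pruning.

    step_risk(sequence, action) must return a non-negative risk, so that the
    accumulated risk of a partial sequence is a valid lower bound.
    Returns (best sequence, best total risk, number of nodes opened).
    """
    best_risk, best_sequence, opened = math.inf, None, 0
    stack = [((), 0.0)]  # entries are (partial sequence, accumulated risk)
    while stack:
        sequence, risk = stack.pop()
        if risk >= best_risk:
            continue  # bound is no longer promising: prune this branch
        opened += 1
        children = [(sequence + (a,), risk + step_risk(sequence, a))
                    for a in actions]
        if len(sequence) + 1 == horizon:
            leaf_sequence, leaf_risk = min(children, key=lambda c: c[1])
            if leaf_risk < best_risk:
                best_risk, best_sequence = leaf_risk, leaf_sequence
        else:
            kept = [c for c in children if c[1] < best_risk]
            kept.sort(key=lambda c: c[1], reverse=True)  # cheapest popped first
            stack.extend(kept)
    return best_sequence, best_risk, opened


def toy_step_risk(sequence, action):
    """Invented risk: prefers two low-index sensors, penalizes repetition."""
    repeat_penalty = 0.05 if sequence and sequence[-1] == action else 0.0
    return abs(len(action) - 2) + 0.1 * sum(action) + repeat_penalty


actions = [s for size in range(1, 4) for s in combinations(range(3), size)]
best_sequence, best_risk, opened = bb_greedy_search(actions, 2, toy_step_risk)
```

In this toy problem the full tree has eight internal nodes, while the pruned search opens fewer of them because the first leaf found already bounds most of the remaining branches.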

SIMULATIONS
Four sensors are used, named S1, S2, S3 and S4. Their position coordinates are (50, 50) km, (150, 50) km, (50, 150) km, and (150, 150) km, respectively; their tactical values are 5, 8, 7 and 6, respectively; their transmitting powers are 100 kW, 95 kW, 105 kW and 90 kW, respectively; and their transmitting antenna gains are 36 dB, 34 dB, 38 dB, and 32 dB, respectively. The remaining parameters are the same for all sensors: the noise temperature is 290 K, the receiver bandwidth is 5 MHz, the noise factor is 3 dB, the overall loss of the sensor is 3 dB, the minimum detectable SNR is 10 dB, the wavelength of the sensor signal is 0.11 m and the false alarm probability is $10^{-6}$.
Two jamming aircraft are used, and their jamming direction is the same as their flight direction throughout the simulation. The jammers on the two aircraft have the same parameters: the transmitting power of the jammer is 100 W, the antenna gain of the jammer is 10 dB, the bandwidth of the jamming signal is 10 MHz, the overall loss of the jamming signal is 4 dB, the minimum detectable SJR is 10 dB, the main lobe of the jammer beam is 45° and the maximum jamming distance is 150 km. For convenience, it is assumed that both aircraft move in uniform straight lines; their initial positions are (200, 80) km and (200, 100) km, respectively, and their initial velocities are (−550, −300) m/s and (−500, 300) m/s, respectively. The detection interval is 1 s and the simulation duration is 100 s.
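The uniform straight-line motion assumed for the jamming aircraft can be checked with a few lines of Python. The sketch only covers the kinematics and the 150 km maximum jamming distance; it does not model the jammer main lobe, and the function names are hypothetical.

```python
import math


def position_km(initial_km, velocity_mps, t_s):
    """Position in km after t_s seconds of uniform straight-line motion."""
    return tuple(p + v * t_s / 1000.0 for p, v in zip(initial_km, velocity_mps))


def distance_km(a, b):
    """Euclidean distance between two points given in km."""
    return math.hypot(a[0] - b[0], a[1] - b[1])


def within_jamming_distance(jammer_km, sensor_km, max_distance_km=150.0):
    """True when the sensor lies within the maximum jamming distance."""
    return distance_km(jammer_km, sensor_km) <= max_distance_km


# First aircraft at the end of the 100 s simulation, relative to sensor S2.
sensor_s2 = (150.0, 50.0)
aircraft_1_end = position_km((200.0, 80.0), (-550.0, -300.0), 100.0)
range_to_s2 = distance_km(aircraft_1_end, sensor_s2)
```

With the stated initial state, the first aircraft ends the simulation at (145, 50) km, only 5 km from S2.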
The ELI value is quantized as {1, 2, 3, 4}, where 1 represents the low-emission grade, 2 the medium-emission grade, 3 the high-emission grade, and 4 the extremely-high-emission grade. Each sensor is assigned an ELI value transfer matrix.

It can be seen that the greater the weight ω, the greater the detection risk and the smaller the radiation risk. This is because, as ω increases, the impact of the radiation risk on the total risk becomes greater and more attention is paid to controlling the radiation risk in decision-making. When ω = 0.8, 0.9, 0.91 and 0.93, the detection risk is less than the radiation risk, while for ω = 0.97 the detection risk is far greater than the radiation risk. Only when ω = 0.95 are the two close, indicating that the detection risk and the radiation risk have reached a relative balance. Therefore, we choose ω = 0.95 in the following simulations.

Comparison of Search Algorithms
In order to illustrate the advantages of the branch-and-bound-based greedy search algorithm (BB-GS), we compare it with three existing algorithms, namely uniform cost search (UCS), greedy search (GS) and branch-and-bound-based uniform cost search (BB-UCS) [29]. The percentage of nodes opened and the maximum number of nodes stored are selected as evaluation indexes, which represent the performance of an algorithm in terms of search time and memory consumption, respectively [29]. Fig. 8 shows a comparison of the percentage of nodes opened and the maximum number of nodes stored for each algorithm. It can be seen clearly that BB-GS greatly reduces the search time and memory consumption compared with the other algorithms for every decision step considered.

Fig. 9 shows how the cumulative total risk varies with the decision step. It can be seen that the total risk decreases gradually as the decision step increases, but each successive decrease is smaller. The reason is that the predicted information becomes less accurate as the decision step increases, so the calculation error grows. Furthermore, considering that the computational complexity increases exponentially with the decision step, we choose H = 4 in the following simulations.

2) When the maximum detection distance of a sensor changes abruptly, the jammed state of the sensor changes (from being jammed to not being jammed, or vice versa). Combining Fig. 10 (a)-(c), it can be seen that when the jammed state of a sensor changes (i.e. at 5 s, 15 s and 47 s), the detection risk, the radiation risk and the total risk all change abruptly, and the scheduling action also changes. The reason is that a sensor whose jammed state has changed can be regarded as a completely different sensor, so the scheduling system changes the scheduling action to cope with this change.

Fig. 11 shows the curves of the predicted and actual cumulative radiation risk. It can be seen that they are almost identical throughout the simulation duration, which indicates that the method of predicting radiation risk proposed in this paper is accurate and reasonable.

Comparison of Sensor Management Methods
In order to fully illustrate the effectiveness of the proposed method (PM), we compare it with three existing methods:
(1) Rule-based management method (BRM), which takes the minimum sum of the average missing alarm probability and the sensor intercepted probability as the management objective.
(2) Static management method (SM), which schedules a fixed sensor combination for target detection throughout the simulation duration. Two schemes are set in this simulation: scheduling sensors S1 and S4 (SM1), and scheduling sensors S2 and S3 (SM2).
(3) Random management method (RM), which schedules a random sensor combination at each time step of the simulation.

Fig. 12 (a) compares the total risk under each method during the simulation. Fig. 12 (b) compares the cumulative risks under each method. We can clearly see that the total risk under PM is lower than that under the other methods at most times. Although the radiation risk under PM is higher than that under SM, PM is clearly better than the other methods in controlling the total risk and the detection risk. That is to say, PM can reasonably balance the detection risk and the radiation risk in order to control the total risk more effectively. As for the other methods, the risk control under BRM is inferior to that under PM, even though BRM may obtain a lower missing alarm probability and sensor intercepted probability, so BRM is not practical in combat. Compared with PM, SM and RM are simple to implement, but their total risks are higher, which is not conducive to combat.

Furthermore, in order to illustrate the universality of the proposed method, we set the initial state of the jamming aircraft randomly in the following simulations. It is assumed that the initial position of the aircraft is randomly distributed on the boundary of the area O and that its initial velocity is also set randomly. It can be seen that the cumulative radiation risk under PM is higher than that under SM when the aircraft state is random, but the cumulative total risk and the cumulative detection risk under PM are the lowest. This further shows that the proposed method can reasonably balance the detection risk and the radiation risk to effectively control the total risk and ensure higher operational benefits.

CONCLUSIONS
In this paper, the multi-sensor management problem for target detection in the presence of suppressive jamming is studied, and a risk-based sensor management method is proposed, which takes the minimum total risk (including the detection risk and the radiation risk) as the objective of sensor management. Firstly, the sensor detection models in the absence and presence of jamming are established, and the calculation method of the detection risk is presented. Secondly, the sensor radiation model is established based on ELI, and the calculation method of the radiation risk is given. Then, a non-myopic objective function is constructed to control the total risk. Furthermore, a branch-and-bound-based greedy search algorithm is proposed to obtain the optimal solution quickly. Finally, the simulation results show that the proposed algorithm effectively reduces search time and memory consumption compared with the existing algorithms, and that the proposed method significantly reduces the total risk compared with the existing methods.