Improving retrieval accuracy for aerosol optical depth by fusion of MODIS and CALIOP data
Bo Han

Original scientific paper
MODIS and CALIOP are two independent observation instruments in the A-train satellite constellation. They both provide aerosol optical depth (AOD) retrievals and scan the same points on the Earth's surface within a two-minute interval. Because they follow different design principles, the MODIS and CALIOP instruments achieve different AOD retrieval accuracies under different conditions. In this paper, we propose a two-stage fusion approach, consisting of an analysis stage and an integration stage, to improve AOD retrieval accuracy. In the analysis stage, we systematically analyse the conditions under which MODIS retrieves well while CALIOP does not, and vice versa. In the integration stage, we combine the AOD retrievals from both instruments so that the strengths of one compensate for the weaknesses of the other. We test the fusion approach on two years of collocated data from MODIS, CALIOP and AERONET. The fusion result is significantly more accurate than the AOD retrievals from any single observation facility.


Introduction
Aerosol distribution has a profound impact on the global climate system. Many environmental observation facilities on satellites currently derive aerosol optical depth (AOD) globally for aerosol monitoring and research. Some of them belong to the same satellite constellation and scan the same points on the Earth's surface in the same orbit within a very small time interval. This makes it possible to obtain multiple aerosol observations for a specific location from different satellites at almost the same time. Because the design principles of each observation facility are different, previous validation results based on AERONET observations showed that the AOD retrieval accuracies of different facilities vary under different conditions [1÷10]. These results raise two questions: is there a systematic, data-driven approach that can relate observing conditions to the AOD retrieval accuracy of an instrument, and is it possible to combine the AOD retrievals from two observation instruments so as to exploit the relative advantages of each sensor and obtain more accurate retrievals?
We have explored these questions in the context of AOD retrieved from the Moderate Resolution Imaging Spectroradiometer (MODIS) and the Cloud-Aerosol Lidar with Orthogonal Polarization (CALIOP) aboard the Cloud-Aerosol Lidar and Infrared Pathfinder Satellite Observations (CALIPSO) satellite. MODIS and CALIOP are two independent observation instruments on the Aqua and CALIPSO satellites respectively, both in the A-train satellite constellation. MODIS observes the Earth from polar orbit in 36 wavelength bands ranging from 0.415 μm to 14.5 μm at different spatial resolutions. It derives AOD by matching the top-of-atmosphere (TOA) reflectance to simulated values stored in lookup tables, which are built from domain knowledge about aerosol models and a forward simulation algorithm [1]. Its AOD retrieval algorithm has been validated and is constantly improving. The C005 AOD products, produced by the second-generation retrieval algorithm, are significantly more accurate than the previous C004 products. The expected error for most land AODs is within ±(0.05 + 0.15 × AOD), and errors for ocean AODs are even lower [11][12][13][14]. Validation results based on AERONET sites show that C005 AODs are accurate under certain conditions but deviate systematically under others. For example, under low aerosol loading the AOD retrieval is more accurate at sites in the eastern United States, Western Europe and Southern Africa (because their spectrum is closer to the middle "dark" and moderate "green" regimes), whereas MODIS often overestimates AOD at sites in the western United States and Central Asia because of their relatively bright and less green surfaces [15]. In another example, Jethva et al. found that the errors in the fine-dominated AOD over the Indian region were large for most AOD retrievals in the C005 data [16].
The orbit of the CALIPSO satellite is carefully controlled to ensure the acquisition of collocated, near-simultaneous measurements with other A-train satellites. The primary instrument on the CALIPSO satellite for global profiling of aerosols and clouds is CALIOP. This instrument estimates the aerosol layer optical depths and the total column optical depth from backscattered lidar signals [17÷20]. Validation studies show that CALIOP achieved an average AOD retrieval bias of -13% relative to AERONET over 147 global sites from June 2006 to May 2009 [9]. The results show that CALIOP is biased 20% and 12% below the AERONET AODs at the Northern African sites and in the Middle Eastern region, respectively. When these global AOD comparisons are analyzed by aerosol type, CALIOP presents the most significant biases, at a very high confidence level, for dust and marine aerosol. In addition, Omar et al. found that CALIOP AODs are generally lower than AERONET AODs, especially at small optical depths, possibly because tenuous aerosol layers go undetected at low signal-to-noise ratio [8]. These comparisons illustrate that MODIS and CALIOP retrieve AOD with different accuracy under different conditions. However, the conditions that determine the retrieval accuracy of the instruments are not fully understood in those studies, especially from the aspects of aerosol modelling and spatial-temporal analysis.
In this paper, we propose a two-stage fusion approach, consisting of an analysis stage and an integration stage, to improve AOD retrieval accuracy. The analysis stage is based on a data-driven decision-tree algorithm. It systematically analyses the relationship between different observing conditions and the accuracy of the AOD retrievals of each instrument. The integration stage combines AOD retrievals of different accuracy (categorized as accurate and inaccurate retrievals in this paper) from both instruments according to a fusion plan. The fusion approach is tested on the collocated data among MODIS, CALIOP and AERONET from April 2, 2009 to April 1, 2011.

Data sets
To identify the conditions that determine the AOD retrieval accuracy of MODIS and CALIOP, and to evaluate the performance of our fusion approach, we analyze the collocated data among MODIS, CALIOP and the AErosol RObotic NETwork (AERONET) from April 2, 2009 to April 1, 2011. As in previous validation research and in our previous studies, the AOD retrievals from MODIS, CALIOP and AERONET are compared at the visible wavelength of 550 nm [9, 15, 21÷23]. The data products and the spatial and temporal collocation requirements are outlined in the following paragraphs.

AERONET data
The AERONET measurements of AOD are relatively accurate (uncertainty on the order of 0.02) compared with satellite retrievals. As a result, they are widely used as "ground truth" for the validation of satellite AOD retrievals [11,24]. In this paper, Level 2.0 cloud-screened and quality-assured AERONET data are obtained for 199 sites globally between April 2, 2009 and April 1, 2011. These sites cover land, coast, desert and marine surface types. AERONET provides AOD retrievals in seven spectral bands (340, 380, 440, 500, 670, 870, and 1020 nm). As the AOD at 550 nm is not directly provided, it is interpolated log-linearly from the AOD values at 440 and 870 nm, as in our previous research [21÷23].
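The log-linear interpolation can be sketched as follows. This is a minimal illustration, assuming that ln(AOD) varies linearly with ln(wavelength) between the two reference bands; the function name and the sample AOD values are hypothetical and are not taken from the study.

```python
import math


def interpolate_aod(aod_1, wl_1, aod_2, wl_2, wl_target=550.0):
    """Interpolate AOD to wl_target, assuming ln(AOD) is linear in ln(wavelength).

    Wavelengths are in nm; both AOD values must be positive.
    """
    # Slope of ln(AOD) versus ln(wavelength) between the two reference bands
    # (its negative is the Angstrom exponent).
    slope = (math.log(aod_2) - math.log(aod_1)) / (math.log(wl_2) - math.log(wl_1))
    return math.exp(math.log(aod_1) + slope * (math.log(wl_target) - math.log(wl_1)))


# Hypothetical AERONET values at 440 nm and 870 nm.
aod_550 = interpolate_aod(0.30, 440.0, 0.15, 870.0)
```

The same function would apply to any other pair of reference wavelengths, provided both AOD values are positive.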

MODIS data
MODIS/Aqua Collection 005 product suites are used in this study because Aqua is in the A-train constellation. The suites include the level 2 aerosol product MYD04_L2 at a 10-km resolution, the level-1B sub-sampled calibrated radiance data MYD02SSH at a 5-km resolution and the level 2 cloud mask product MYD35 at a 1-km resolution. The three data sets, which have different spatial resolutions, are synchronized in a square spatial coincidence region with an area of 40 × 40 km² surrounding an AERONET site [11,13]. The region is required to contain at least one non-cloud pixel from MYD35 and at least one MODIS land AOD retrieval with quality flag QA ≥ 3 or at least one MODIS ocean AOD retrieval with QA ≥ 1 from MYD04_L2.

CALIOP data
CALIOP level 2 version 3 cloud-free aerosol layer products at a 5-km resolution around AERONET sites are used in this study, with the following two additional settings: CALIOP cloud-aerosol discrimination (CAD) score < −20 and extinction quality control flag QC = 0 for all layers [9,19]. The two settings require the absence of clouds in the CALIOP data columns and retain only quality-controlled aerosol data. To match the wavelength used for MODIS and AERONET, we interpolated the CALIOP AOD retrievals to 550 nm log-linearly from the AOD values at 532 and 1064 nm.

Collocated data
We collocated MODIS and AERONET data from April 2, 2009 to April 1, 2011, using spatial-temporal coincidence criteria similar to those of Ichoku et al. [11]. The spatial coincidence box is approximately 40 km by 40 km, with the AERONET station at the centre, and covers a grid of 4 by 4 MODIS aerosol retrieval pixels. Spatial mean values of the MODIS attributes are calculated and synchronized with the temporal mean values of the AERONET observations taken within ±30 minutes of the MODIS overpass. Under these coincidence criteria, the collocated data include 6351 overpasses covering 197 out of 199 AERONET sites globally. The spatial distribution of the MODIS-AERONET collocated data is illustrated as red circles in Fig. 1a.
Using very similar spatial-temporal coincidence criteria (a 40 km by 40 km square box and observations within ±30 minutes of the CALIOP overpass), we collocated CALIOP and AERONET data from April 2, 2009 to April 1, 2011. A total of 486 collocated data points are found, covering 82 AERONET sites globally. Their spatial distribution is shown in Fig. 1b.
The MODIS-AERONET collocated data are matched with the CALIOP-AERONET collocated data using the above spatial-temporal coincidence criteria. We obtained 322 MODIS-CALIOP-AERONET collocated data points covering 65 AERONET sites globally. Their spatial distribution is presented in Fig. 1c.
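The coincidence criteria can be illustrated with a short sketch. It assumes a flat-Earth approximation for the 40 km by 40 km box and simple tuples as input; the function name, the constant `KM_PER_DEG` and the data layout are hypothetical and do not describe the processing chain actually used in the study.

```python
import math
from datetime import datetime, timedelta
from statistics import mean

KM_PER_DEG = 111.2  # approximate length of one degree of latitude (assumption)


def collocate(site_lat, site_lon, overpass_time, pixels, aeronet_obs,
              half_box_km=20.0, window=timedelta(minutes=30)):
    """Return (spatial mean satellite AOD, temporal mean AERONET AOD) or None.

    pixels: iterable of (lat, lon, aod) satellite retrievals.
    aeronet_obs: iterable of (datetime, aod) ground observations.
    """
    in_box = []
    for lat, lon, aod in pixels:
        # Flat-Earth offsets from the site, adequate for a 40 km box.
        dy = (lat - site_lat) * KM_PER_DEG
        dx = (lon - site_lon) * KM_PER_DEG * math.cos(math.radians(site_lat))
        if abs(dx) <= half_box_km and abs(dy) <= half_box_km:
            in_box.append(aod)
    # Keep ground observations within +/- 30 minutes of the overpass.
    in_window = [aod for t, aod in aeronet_obs if abs(t - overpass_time) <= window]
    if not in_box or not in_window:
        return None  # no coincidence for this overpass
    return mean(in_box), mean(in_window)
```

An overpass yields a collocated record only when both the spatial box and the time window contain at least one value; otherwise the sketch returns `None`.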

Methodology
To integrate AOD retrievals from MODIS and CALIOP, we propose a two-stage fusion approach, including an analysis stage and an integration stage.
The goal of the analysis stage is to develop a systematic, data-driven approach for analyzing the correlation between observing conditions and the AOD retrieval accuracy of the MODIS and CALIOP instruments. In this study, retrievals are simply categorized into two classes, accurate and inaccurate, by checking them against an expected error (EE) envelope,
$$EE = \pm\,(0.05 + 0.15 \times \mathrm{AOD}_{\mathrm{AERONET}}) \qquad (1)$$
This is a common standard used in AOD validation, as shown in many studies [11,13,15]. An accurate retrieval is defined as an instrument AOD retrieval that falls within the EE envelope,
$$\mathrm{AOD}_{\mathrm{AERONET}} - |EE| \le \mathrm{AOD}_{\mathrm{instrument}} \le \mathrm{AOD}_{\mathrm{AERONET}} + |EE| \qquad (2)$$
Conversely, AOD retrievals that fall outside the envelope are defined as inaccurate retrievals. In this way, each collocated instrument AOD retrieval record receives one of two labels: inside the EE envelope (class $EE_1$) or outside the EE envelope (class $EE_0$). The data analysis can therefore be treated as a classification problem: given the collocated AOD retrievals and their class labels, identify the conditions that determine whether an instrument retrieves AOD accurately or inaccurately.
Classification techniques can solve this problem. The output of a classification algorithm is the class label defined by Eq. (2). Its inputs are the attributes collected from a collocated data set.
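A minimal sketch of the labelling step follows. It assumes that the envelope is the ±(0.05 + 0.15 × AOD) expected error quoted in the Introduction, evaluated at the AERONET AOD; the function names are hypothetical.

```python
def expected_error(aod_aeronet):
    """Half-width of the assumed EE envelope, 0.05 + 0.15 * AERONET AOD."""
    return 0.05 + 0.15 * aod_aeronet


def ee_label(aod_instrument, aod_aeronet):
    """Return 'EE1' if the retrieval lies inside the envelope, else 'EE0'."""
    inside = abs(aod_instrument - aod_aeronet) <= expected_error(aod_aeronet)
    return "EE1" if inside else "EE0"
```

For an AERONET AOD of 0.20 the assumed half-width is 0.08, so a retrieval of 0.25 would be labelled EE1 and a retrieval of 0.40 would be labelled EE0.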
For the MODIS-AERONET collocated data, our previous research indicates that the following five types of attributes are potentially correlated with AOD retrievals, and they are used as input-attribute candidates: 1) mean value and standard deviation of MODIS AOD, its detected altitude and Ångström exponent (AE) in the spatial box; 2) mean value and standard deviation of the surface reflectance at wavelengths of 470 nm, 550 nm, 660 nm, 860 nm, 1240 nm, 1640 nm and 2120 nm, and the derived NDVI_swir, which is defined as
$$\mathrm{NDVI\_swir} = \frac{\rho_{1240} - \rho_{2120}}{\rho_{1240} + \rho_{2120}}$$
where $\rho_{\lambda}$ denotes the reflectance at wavelength λ (in nm); 3) observation geometry information, including solar zenith angle, solar azimuth angle, sensor zenith angle, sensor azimuth angle and scattering angle; 4) percentage of cloud-free pixels in the spatial box and percentage of pixels of each surface type (water, coastal, desert, land) in the spatial box; 5) observing temporal information (day, hour, minute) and spatial information (latitude, longitude). In addition, previous studies showed that MODIS AOD retrieval accuracy varies between areas [1]. Using the latitude and longitude information, we clustered the 6351 MODIS-AERONET collocated points around 17 centres with the expectation maximization algorithm. The distances from each point to these 17 cluster centres are also used as spatial attributes.
For the CALIOP-AERONET collocated data, the following five types of attributes are similarly used as input-attribute candidates: 1) mean value and standard deviation of CALIOP AOD, its detected altitude and Ångström exponent (AE) in the spatial box; 2) majority aerosol type across layers in the scanning columns; 3) altitude, pressure and humidity at the base and top of the layers with the thickest and thinnest AOD; 4) majority aerosol type (marine, desert dust, polluted dust, clean continental, polluted continental, biomass burning) and cloud optical depth; 5) observing temporal information (day, hour, minute) and spatial information (latitude, longitude).
Both the MODIS-AERONET and the CALIOP-AERONET collocated data sets provide a large number of attributes. With a limited amount of training data, this leads to over-fitting in decision tree classification. Therefore, we applied the information gain measure to select the most informative attributes. The information gain of an attribute A in a data set S is defined as
$$\mathrm{Gain}(A) = \mathrm{Infor}(S) - \mathrm{Infor}(A)$$
where
$$\mathrm{Infor}(S) = -\sum_{i=0}^{1} p_i \log_2 p_i, \qquad \mathrm{Infor}(A) = \sum_{j=1}^{m} \frac{|S_j|}{|S|}\,\mathrm{Infor}(S_j)$$
Here, $p_i$ is the probability that a data record in S is labeled as class $EE_i$ (i = 0 or 1), computed as $|EE_{i,S}|/|S|$. Infor(A) denotes the average amount of information required to identify the class label of a record once S is partitioned by A, m is the number of discrete values of attribute A, and $S_j$ is the subset of S in which A takes the value $a_j$.
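The information gain measure can be sketched in a few lines. The sketch assumes a discrete attribute (numerical attributes would first need to be split into value ranges); the function names are illustrative only.

```python
import math
from collections import Counter


def entropy(labels):
    """Infor(S): information needed to identify the class of a record in S."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(values, labels):
    """Gain(A) = Infor(S) - Infor(A) for one discrete attribute A."""
    n = len(labels)
    subsets = {}
    for value, label in zip(values, labels):
        subsets.setdefault(value, []).append(label)
    # Infor(A): entropy of each subset S_j weighted by |S_j| / |S|.
    infor_a = sum(len(s) / n * entropy(s) for s in subsets.values())
    return entropy(labels) - infor_a
```

An attribute that separates the two classes perfectly attains the maximum gain of 1 bit for a balanced data set, while an attribute unrelated to the class label attains a gain of 0.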
In the data mining domain, there are many classification techniques, such as support vector machines, neural networks and logistic regression. Compared with these classifiers, a decision tree has the distinct advantage of constructing classification rules that are easy to interpret. A decision tree is also a competitive classifier that achieves high accuracy in many applications. Therefore, we select decision tree techniques as classifiers to discover the conditions under which MODIS or CALIOP retrieves AOD well.
Multiple decision tree algorithms are available. In this research we apply the C4.5 algorithm, a widely used decision tree method. It can deal with both nominal and numerical attributes.
Using five-fold cross validation, we iteratively explore combinations of attributes and apply the C4.5 decision tree algorithm to classify the data. The combination of attributes that achieves the highest classification accuracy is retained.
The resulting C4.5 decision tree has a flowchart-like structure (illustrated in Fig. 2), where each internal node (rectangles) denotes a test on an attribute, each branch represents a pathway split by the test, and each leaf node (ovals) holds a class label. Each path from the root to a leaf node constitutes a decision rule. A rule can be written as an "IF" part and a "Then" part. The "IF" part is a combined condition in which the splitting criteria along the path (such as a data attribute lying in a value range) are connected by logical "AND". The "Then" part declares the class label for the records that satisfy the combined condition in the "IF" part. In the context of this study, the class label is either ACCURATE or INACCURATE, indicating accurate AOD retrievals within the EE envelope (class $EE_1$) or inaccurate AOD retrievals outside of the EE envelope (class $EE_0$), respectively. Ideally, each rule for a given class label covers many data records from a single class rather than from two classes. In other words, a pure rule, which classifies its records exactly, is preferable. In practice, however, pure rules are difficult to obtain from data of a complex nature. Nevertheless, it is possible to obtain rules that cover records mostly from one class $EE_i$ and just a few from the other class $EE_j$ (i, j = 0 or 1, i ≠ j). We use the measure "confidence" to evaluate the accuracy of a rule,
$$\mathrm{confidence} = \frac{|EE_i|}{|EE_i| + |EE_j|}$$
where $|EE_i|$ (i = 0 or 1) denotes the number of records that satisfy the combined condition in the "IF" part and whose class label is $EE_i$.
In addition, we compute the measure "support" of a rule, which is the percentage of records in the whole data set D that satisfy the rule,
$$\mathrm{support} = \frac{|EE_i| + |EE_j|}{|D|}$$
Rules that satisfy both a minimum confidence threshold and a minimum support threshold are defined as strong rules.
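The two rule measures can be illustrated as follows. The sketch assumes that support counts every record satisfying the "IF" part, whatever its label, and uses as defaults the 80% confidence and 0.5% support thresholds reported in the experiments; the record layout and function names are hypothetical.

```python
def rule_stats(records, condition, target_label):
    """Return (confidence, support) of the rule 'IF condition Then target_label'."""
    covered = [r for r in records if condition(r)]
    if not covered:
        return 0.0, 0.0
    hits = sum(1 for r in covered if r["label"] == target_label)
    return hits / len(covered), len(covered) / len(records)


def is_strong(confidence, support, min_confidence=0.80, min_support=0.005):
    """A strong rule meets both the confidence and the support threshold."""
    return confidence >= min_confidence and support >= min_support


# Hypothetical records; "label" is the EE class of each retrieval.
records = [
    {"M_AOD": 0.05, "label": "EE1"},
    {"M_AOD": 0.10, "label": "EE1"},
    {"M_AOD": 0.12, "label": "EE0"},
    {"M_AOD": 0.30, "label": "EE0"},
]
confidence, support = rule_stats(records, lambda r: r["M_AOD"] <= 0.145, "EE1")
```

In this toy data set the condition covers three of the four records, two of which are in class EE1, so the rule would not qualify as strong under the 80% confidence threshold.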
In this way, we obtain four sets of strong condition rules: rules for accurate retrievals and rules for inaccurate retrievals, for MODIS and for CALIOP respectively. In the MODIS-CALIOP-AERONET collocated data set, each record has both a MODIS AOD retrieval and a CALIOP AOD retrieval. Each retrieval falls into one of three categories: accurate (supported by a strong rule for accurate retrievals), inaccurate (supported by a strong rule for inaccurate retrievals) or unknown (supported by no strong rule).
The MODIS AOD retrieval and the CALIOP AOD retrieval are regarded as being at the same level if they are in the same category, because they are measured against the same expected error envelope and supported by the same category of rules. In this case, averaging is used as the fusion method to obtain a robust predictor. For example, for retrievals in MI and CI (inaccurate MODIS and inaccurate CALIOP retrievals), the two instruments follow entirely different design principles and therefore show different errors under different conditions, especially for outliers. We therefore take their average as the final fusion result, which smooths the effect of outlier errors.
If the MODIS AOD retrieval is accurate (MA) while the CALIOP AOD retrieval is inaccurate (CI) or unknown (CU), we use the MODIS retrieval as the fusion result. Similarly, if the CALIOP AOD retrieval is inaccurate (CI) and the MODIS AOD retrieval is unknown (MU), we use the MODIS retrieval as the fusion result, because the MODIS retrieval might still be accurate.
In the same way, the CALIOP retrieval is used as the fusion result when the CALIOP AOD retrieval is accurate while the MODIS AOD retrieval is not, or when the CALIOP AOD retrieval is unknown while the MODIS AOD retrieval is inaccurate.
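The fusion decisions described above can be summarized in a short sketch. It assumes that the three categories can be ranked accurate > unknown > inaccurate, which reproduces every case stated in the text; the function name and the string labels are hypothetical.

```python
# Ranking assumed from the text: a higher rank means a more trusted retrieval.
RANK = {"accurate": 2, "unknown": 1, "inaccurate": 0}


def fuse(modis_aod, modis_category, caliop_aod, caliop_category):
    """Return the fused AOD for one MODIS-CALIOP collocated record."""
    modis_rank, caliop_rank = RANK[modis_category], RANK[caliop_category]
    if modis_rank == caliop_rank:
        # Same category (MA/CA, MU/CU or MI/CI): average the two retrievals.
        return (modis_aod + caliop_aod) / 2.0
    # Otherwise keep the retrieval from the more trusted category.
    return modis_aod if modis_rank > caliop_rank else caliop_aod
```

For instance, an unknown MODIS retrieval paired with an inaccurate CALIOP retrieval yields the MODIS value, while two inaccurate retrievals are averaged.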
Based on the above two-stage approach, we obtain the final fusion results, Fusion_AOD. The results are evaluated against the collocated AERONET AOD with four measures: mean absolute error (MAE), relative absolute bias (RAB), root mean square error (RMSE) and R-square ($R^2$).
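A sketch of the four measures follows, under stated assumptions. MAE and RMSE use their standard definitions, and $R^2$ is taken as the squared Pearson correlation coefficient, in line with the correspondence between $R^2$ and R quoted in the discussion of results. The RAB formula (total absolute error divided by total AERONET AOD) is only an assumption, since its exact definition is not recoverable from the text.

```python
import math


def evaluate(pred, truth):
    """Compare retrieved AOD (pred) with AERONET AOD (truth)."""
    n = len(truth)
    errors = [p - t for p, t in zip(pred, truth)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)
    # ASSUMPTION: RAB as total absolute error relative to total reference AOD.
    rab = sum(abs(e) for e in errors) / sum(truth)
    mean_p, mean_t = sum(pred) / n, sum(truth) / n
    cov = sum((p - mean_p) * (t - mean_t) for p, t in zip(pred, truth))
    var_p = sum((p - mean_p) ** 2 for p in pred)
    var_t = sum((t - mean_t) ** 2 for t in truth)
    # Squared Pearson correlation; needs non-constant pred and truth.
    r2 = cov * cov / (var_p * var_t)
    return {"MAE": mae, "RAB": rab, "RMSE": rmse, "R2": r2}
```

The function would be applied in the same way to the MODIS retrievals, the CALIOP retrievals and the fusion results, so that the three can be compared on equal terms.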

Experimental results and discussion
In the analysis stage of our experiments, we use the J48 classifier, a Java implementation of the C4.5 decision tree algorithm in the open-source software Weka [25]. The confidence threshold is set to 80% and the support threshold is set to 0.5%. The parameter minNumObj of the J48 classifier controls the minimum number of instances per leaf in the decision tree. It is set to 50.
For the MODIS-AERONET collocated data set, 14 attributes are finally selected for C4.5 decision tree classification: the mean value of MODIS AOD retrievals (M_AOD), the standard deviation of M_AOD (STD_M_AOD), NDVIswir, the surface reflectance at wavelength 2.1 μm (Ref_7), the scattering angle (SA), the percentage of cloud-free pixels (cloud_free%), the percentage of coastal pixels (coastal%), the percentage of desert pixels (desert%), the observation altitude (M_Alt), the standard deviation of altitude (STD_M_Alt), longitude (Lon), latitude (Lat), the distance to a Northern Europe point with Lat=57.70 and Lon=21.47 (DNEP), and the distance to a Western Europe point with Lat=43.74 and Lon=6.54 (DWEP). The MAE of all MODIS-AERONET collocated records is 0.066. Fig. 2 shows a portion of the decision tree flowchart obtained from the MODIS-AERONET collocated data. It illustrates one accurate retrieval condition and two inaccurate retrieval conditions. The full decision tree flowchart is too complex to be shown clearly in the paper. The five strong condition rules for accurate MODIS AOD retrievals extracted from the full decision tree are listed in Tab. 2.
Each rule is analysed below.
MA1: If DWEP<=8.384 and desert%<=66.3% and M_AOD<=0.145 and M_AOD>-0.020 and STD_M_AOD<=0.089 and Lat<=74.733 Then ACCURATE (confidence=94.7%, support=15.8%, MAE=0.027)
The numbers in brackets show the effectiveness of rule MA1 according to three measures. The support value shows that 15.8% of the records in the MODIS-AERONET collocated data set satisfy the requirements of rule MA1. The confidence value shows that 94.7% of the MODIS AOD retrievals supported by rule MA1 are accurate retrievals. MAE is the mean absolute error of MODIS AOD over all records supported by rule MA1.
The core combined condition of MA1 is DWEP<=8.384 and M_AOD<=0.145 and M_AOD>-0.020, because it covers almost the same set of records as the full rule MA1.
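A rule of this form can be encoded as a list of AND-connected tests and checked against a record, as in the sketch below. The thresholds are those of rule MA1; the record layout (a dictionary keyed by attribute name, with desert% stored as a percentage) and the sample record are assumptions made for illustration.

```python
import operator

# Rule MA1 as (attribute, comparison, threshold) tests connected by logical AND.
MA1 = [
    ("DWEP", operator.le, 8.384),
    ("desert%", operator.le, 66.3),
    ("M_AOD", operator.le, 0.145),
    ("M_AOD", operator.gt, -0.020),
    ("STD_M_AOD", operator.le, 0.089),
    ("Lat", operator.le, 74.733),
]


def satisfies(record, rule):
    """True if the record passes every test in the rule's IF part."""
    return all(compare(record[attr], threshold) for attr, compare, threshold in rule)


# Hypothetical collocated record.
record = {"DWEP": 3.2, "desert%": 0.0, "M_AOD": 0.08, "STD_M_AOD": 0.02, "Lat": 45.0}
is_ma1 = satisfies(record, MA1)
```

Dropping tests from the list gives the core combined condition, whose coverage could then be compared with that of the full rule.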
Since DWEP represents the distance to a Western Europe point with Lat=43.74 and Lon=6.54, the rule MA1 suggests that in Western Europe (with the clustering ...
The rule MA3 suggests that if surfaces are not unusually dark (Ref_7>0.0792), the scattering angle plays an important role in MODIS retrievals for areas outside of deserts (desert%<=66.3%). In the combined condition, small or medium scattering angles (SA<=140.93) correspond to light aerosol loading (M_AOD<=0.145). In the validation of the C005 algorithm [15], Levy et al. show that larger AOD is associated with larger scattering angle, which supports rule MA3.
The AERONET cloud screening algorithm follows a different design strategy from the MODIS cloud screening algorithm. Therefore, a clear sky identified at an AERONET site may contain cloud pixels detected by MODIS within the spatial box [15]. The condition cloud_free%>98.5% indicates a clear sky detected by both MODIS and AERONET.
The rule MA4 suggests that under clear sky (cloud_free%>98.5%), moderate or heavy aerosol loading (M_AOD>0.145) and green surfaces (NDVIswir>0.323), MODIS retrievals are accurate within homogeneous areas (STD_M_AOD<=0.089) where Longitude>-104.7 and Latitude<=61.846. This coincides with Levy et al.'s validation result that the difference between MODIS and AERONET AODs is small when NDVIswir is around 0.4 [15]. MA4 further lists the detailed combined condition for accurate retrievals over green surfaces.
The condition cloud_free%<=98.5% selects retrieval records in which MODIS detects some cloud. This rule suggests that under partly cloudy conditions, over extremely dark surfaces (Ref_7<=0.008) and with moderate or heavy aerosol loading (M_AOD>0.145), MODIS AOD retrievals are accurate. This supports the assumptions about surface optical characteristics made in the MODIS second-generation AOD retrieval algorithm.
In the decision tree, the five extracted strong condition rules for inaccurate MODIS AOD retrievals are listed in Tab. 3.
This rule shows that MODIS retrieves AOD inaccurately over heterogeneous areas (STD_M_AOD>0.089) that lie at high altitude (M_Alt>1170.5) and outside of deserts (desert%<=66.3%). Levy et al.'s validation results [15] also showed that MODIS compares poorly with AERONET over elevated targets, such as plateaus, which have relatively brighter surfaces.
MI3: If desert%<=66.3% and M_Alt<=1170.5 ...
This rule shows that in heterogeneous areas of extremely complex nature (STD_M_AOD>0.191), at either low or high altitude, AOD varies greatly within the spatial box, which leads to a large standard deviation of MODIS AOD. In this case, the MODIS AOD cannot match the AERONET AOD well.
MI5: If Lat>74.733 and desert%<=66.3% and STD_M_AOD<=0.089 Then INACCURATE (confidence=100%, support=0.6%, MAE=0.180)
This rule shows that MODIS retrieves AOD inaccurately at high latitudes (Lat>74.733). These areas are near the poles and mostly covered by ice, which gives them a brighter surface. A bright surface makes the aerosol signal comparatively weak for MODIS retrieval.
For the CALIOP-AERONET collocated data set, 7 attributes are selected for C4.5 decision tree classification: the cloud flag (C_Cloud), the mean value of CALIOP AOD retrievals (C_AOD), the standard deviation of C_AOD (STD_C_AOD), the type of the thickest aerosol layer in a scanning column (MAX_AOD_TYPE), the type of the thinnest aerosol layer in a scanning column (MIN_AOD_TYPE), the altitude of the base of the thickest aerosol layer (MAX_AOD_LBA), and the temperature at the middle level of the thickest aerosol layer (MAX_AOD_LMT). The parameter minNumObj in J48 is set to 15. The MAE of all CALIOP-AERONET collocated records is 0.106.
In the decision tree, the two extracted strong condition rules for accurate CALIOP AOD retrievals are listed in Tab. 4.
Based on the above rules, we categorize the MODIS retrievals in the MODIS-CALIOP-AERONET collocated data into three subsets, MA, MU and MI, and the CALIOP retrievals in the same data into three subsets as well, CA, CU and CI.
Next, we apply the fusion plan presented in Tab. 1 to the MODIS-CALIOP-AERONET collocated data. The final fusion results are presented in Tab. 6.
Tab. 6 shows that on all four measures (MAE, RAB, RMSE and $R^2$) the fusion results are significantly better than either the MODIS or the CALIOP AOD retrievals. For example, the root mean square errors (RMSE) for MODIS and CALIOP are 0.138 and 0.153 respectively, whereas the RMSE of the fusion result decreases to 0.087. The $R^2$ of the fusion result reaches 0.802, compared with 0.647 for MODIS retrievals and 0.517 for CALIOP retrievals. To compare the MODIS AOD retrievals, the CALIOP AOD retrievals and our fusion results intuitively, we plot the collocated data points in Fig. 3a ÷ 3c.
Fig. 3a ÷ 3c show that, compared with the points from the MODIS or CALIOP AOD retrievals, the points from the fusion results cluster more closely around the 1-1 line, and their linear regression line has the smallest deviation from the 1-1 line. This illustrates that, measured against AERONET AODs, the fusion results are more accurate than the MODIS or CALIOP retrievals, which demonstrates the effectiveness of our fusion approach.
We also compared the MODIS and CALIOP AOD retrievals in our experiment with results from previous studies. Levy et al. compared MODIS C005 AODs over dark land with AERONET collocations at the 550 nm wavelength and obtained RMSE = 0.116 and $R^2$ = 0.778 (corresponding to a correlation coefficient R = 0.882) [15]. However, their collocation points cover only dark targets, and they noted that MODIS is comparatively weak over brighter surfaces. Our experiments include collocation points on bright deserts, which explains why our result, RMSE = 0.138 and $R^2$ = 0.647, is somewhat worse than theirs. Omar et al. analysed the correlations between cloud-cleared CALIOP 532 nm AOD and AERONET 500 nm AOD over 92 land sites and 57 coastal sites [8]. The 613 land collocations achieve $R^2$ = 0.348 (corresponding to a correlation coefficient R = 0.59) and the 468 coastal collocations achieve $R^2$ = 0.292 (corresponding to a correlation coefficient R = 0.54). They compared CALIOP and AERONET AOD retrievals at different wavelengths, and their selected collocated sites differ from ours. These reasons may explain the difference from our results. In particular, their result shows that the differences between the CALIOP AOD retrievals and the AERONET AOD measurements are independent of the surface type. This helps to explain why, over some surface types (such as desert), CALIOP retrieves AOD well but MODIS does not.

Conclusion
MODIS and CALIOP are two independent observation instruments in the A-train satellite constellation. They both provide aerosol retrievals at nearly the same locations on the Earth's surface within a two-minute interval. Because the two instruments follow different design principles, the accuracy of their AOD retrievals varies under different conditions. In this work, a two-stage fusion approach, consisting of an analysis stage and an integration stage, is proposed to combine the AOD retrievals of different accuracy from each instrument in order to improve AOD retrieval accuracy. In the analysis stage, a data-driven decision-tree algorithm is used to systematically analyze the relationship between the observation conditions and the accuracy of AOD retrievals for MODIS and CALIOP respectively. In the integration stage, based on the discovered condition rules, we combine the AOD retrievals from both instruments according to a fusion plan. We test the fusion approach on the collocated data among MODIS, CALIOP and AERONET from April 2, 2009 to April 1, 2011. From the decision-tree analysis, we obtain five accurate retrieval conditions and five inaccurate retrieval conditions for MODIS, and two accurate retrieval conditions and three inaccurate retrieval conditions for CALIOP. The final fusion result achieves $R^2$ = 0.802 and RMSE = 0.087, which is significantly more accurate than the AOD retrievals from any single observation facility (for MODIS retrievals, $R^2$ = 0.647 and RMSE = 0.138; for CALIOP retrievals, $R^2$ = 0.517 and RMSE = 0.133).
In future research, we will investigate further topics based on the fusion of MODIS and CALIOP data. MODIS and CALIOP are designed on different principles, and the information they observe can complement each other. We will test whether this complementary information brings new insights for AOD retrievals over specific areas, such as urban areas, high-altitude areas, ice surfaces or cloud fields. In addition, it will be interesting to explore whether the fused information from MODIS and CALIOP is helpful for air quality forecast applications.
Meanwhile, as a next step, we would like to apply our fusion approach to other combinations of remote sensors. Although other sensors collect observation attributes different from those of MODIS and CALIOP and require different data pre-processing, the same decision tree techniques can be applied to discover the accurate retrieval conditions of each sensor in the analysis stage and, similarly, to form a fusion plan in the integration stage.

Figure 2 A portion of decision tree flowchart by analysis of MODIS-AERONET collocated data

Figure 3 (a) Comparison of MODIS AOD retrievals with AERONET AOD retrievals ($R^2$ = 0.647); (b) comparison of CALIOP AOD retrievals with AERONET AOD retrievals ($R^2$ = 0.517); (c) comparison of fusion results with AERONET AOD retrievals ($R^2$ = 0.802)
The four sets of strong condition rules are: 1) rule set R1, where MODIS AOD retrievals are in class $EE_1$; 2) rule set R2, where MODIS AOD retrievals are in class $EE_0$; 3) rule set R3, where CALIOP AOD retrievals are in class $EE_1$; 4) rule set R4, where CALIOP AOD retrievals are in class $EE_0$. Based on R1 and R2, MODIS AOD retrievals are categorized into three subsets: MA, MI and MU. MA includes the accurate retrievals supported by the strong rules in R1. MI includes the inaccurate retrievals supported by the strong rules in R2. The remaining MODIS AOD retrievals, which no strong rule supports, are unknown instances and form the subset MU. In a similar way, CALIOP AOD retrievals are categorized into the subsets CA, CI and CU: CA covers accurate retrievals, CI covers inaccurate retrievals and CU covers the remaining unknown retrievals. In the fusion stage, a fusion plan is established according to the definitions of MA, MU, MI and CA, CU, CI, as shown in Tab. 1.

Table 2 Condition rules deriving accurate MODIS AOD retrievals

Table 3 Condition rules deriving inaccurate MODIS AOD retrievals
Rule MI1 suggests that MODIS retrieves AOD inaccurately over desert (desert%>66.3%). Desert has a relatively brighter surface than dark targets, so the aerosol signal is comparatively weak for MODIS retrieval. We investigated the details and found that, under this condition, 90.8% of the supported records overestimate AOD. This agrees with Levy et al.'s validation result [15] that MODIS tends to overestimate AOD over bright surfaces.

Table 6 Comparison of fusion results with MODIS and CALIOP AOD retrievals