Identification of Shadowed Areas to Improve Ragweed Leaf Segmentation

Abstract: As part of a project targeting geometrical structure analysis and identification of ragweed leaves, sample images were created. Even though the images were taken under near-optimal conditions, the samples still contain noise from cast shadows. The proposed method improves the efficiency of chromaticity based primary shape segmentation by identifying and re-classifying the shadowed areas. The primary classification of each point is generally done by thresholding the Hue channel of the Hue/Saturation/Value color space. In this work, the primary classification is enhanced by thresholding an intra-class normalized weight computed from the class specific Value channel. The corrective step is the removal of areas marked as shadow from the object class. The idea is based on the assumption that the image contains a single, flat leaf in front of a homogeneous background, but there are no color and illumination restrictions. Thus, the parameters of the imaging system and the light sources influence the homogeneity of image parts; however, vague shadows differ only in intensity, and hard shadows can only be cast on the background.


INTRODUCTION
Common ragweed (Ambrosia artemisiifolia L.) is one of the most troublesome weeds in Hungary as well as throughout Europe, even though this invasive species has been present there only since the First World War. It causes significant economic damage, since it dramatically reduces crop yields [1]; moreover, it is also harmful to human health, as its pollen is recognized as a significant cause of allergic symptoms. Therefore, it is the focus of intensive research. [2] presented an exhaustive study on most of the biological aspects within the framework of the British Isles. [3] investigated the scale and nature of environmental factors, for example the importance of the local site, weather and soil type, across a large number of populations covering the European continent. [4] studied the spatial distribution of ragweed pollen concentration in Europe, focusing on the modifying functions of geographical coordinates. The influence of climate change on the spread of ragweed was also studied in [5]. Some research focuses on methods to identify larger areas infested by ragweed, mainly based on satellite images [6,7].
The central task of our workgroup is the identification of ragweed by processing digital images taken in natural fields in order to specify field segments where preventive and intervening action is required, see [8] for example. In order to fully exploit the advantages of the known botanical features of ragweed leaves, most of the methods require perfect training sample images. For the production of these samples some initial conditions were stated. For example, the image should contain a single flat object over a homogeneous background. The camera axis should be perpendicular to the object plane, and the use of reflective surfaces should be avoided.
The samples prepared by botanical experts have excellent botanical properties, but unfortunately they are lightly affected by structured noise resulting from the physical properties of the materials and the illumination. In other words, the samples were created with very similar chromatic and illumination properties, therefore the directions, colors and intensities of the shadows are also very similar. As a result, during image analysis, these properties will appear to be properties of the leaf. This is partially correct, because the shadows are cast by the leaf with all its unique properties, and can therefore serve as one of the source properties of the classification. It would be fully correct only if the same illumination circumstances were guaranteed during all independent experiments. Since this is mostly not possible, handling shadows as object properties results in false object classification. In other words, should another leaf sample series be given, where shadowed areas appear elsewhere as a result of different illumination, their unique properties would differ significantly and classification errors would occur immediately. Hence, the influence of illumination should be decreased as far as possible.

RELATED WORKS
In the last decade intensive research has focused on shadow detection, segmentation, and the reduction or complete removal of illumination effects. One approach is to detect invisible radiation, e.g. [9] used near-infrared information. Another is to use information visible to the human eye, sampled in red, green and blue (RGB). The consensus in these works is that the source is a single color image, converted to a color space other than RGB that represents chromaticity and illumination better [10,11]. Because the scene setup of the currently examined ragweed images and that of satellite images share some important characteristics, the preliminary study was a review of satellite image processing methods. These common features are: the distance between the background and the object is much shorter than the distance between the object and the camera; there is no cast shadow from an exterior object or from the background; the camera axis is nearly perpendicular to the object plane.
Another work [12] considered shadows in high resolution satellite images, proposing morphological filtering for shadow detection and an example-based learning method for shadow reconstruction. This recent work focused on attenuating the problems caused by the loss of radiometric information in shadowed areas. [13] also worked on satellite images, trying to locate clouds and cloud shadows separately. The first step of their method was the computation of shadow indices from image bands; then, after finding cloud pixels, shadowed areas were predicted from the geometric relationships among the Sun, the cloud pixels and the Earth. As the final step, shadowed zones were confirmed by analysing time series of shadow regions. In summary, this method uses deep knowledge about the scene in which the images were acquired. The assumptions concern the distance from the object, the angle of the illumination source and the relative angle of the illumination and the ground. [14] also analysed satellite images, comparing the union of shadow areas identified separately by twelve experts with the result of an automated shadow detection method they proposed. They used panchromatic and high resolution multispectral imagery, and created thresholds for both from a set of training data. To apply a single threshold on multispectral data, they reduced the dimensionality by principal component analysis. The intersection of these two labelings resulted in sharper details, and the first principal component allowed shadow detection on vegetated lands. [15] proposed a shadow detection method for cloud-free high resolution satellite images, which means that shadows can only be cast by objects appearing in the image. They created four indices from R, G, B channels, NIR and PAN images, and then applied Otsu's thresholding method to find proper thresholds. The difficulty of this method is the need for NIR and PAN images, and the trigonometrical indexing.
[16] analysed some of the state-of-the-art methods to select the appropriate one for their specific purposes. According to their analysis, the normalised saturation-value difference (NSVD) and the water body index (WBI) were the most effective methods. Accuracy measurements showed that a more efficient solution was required that uses RGB space without the support of data from the infrared band. They proposed a method applying Otsu's thresholding to three parameters of RGB space (WBI, color invariance and first principal component). [17] worked on shadow detection and removal in aerial motion imagery. They obtained the shadow mask by calculating a Specthem ratio and applying multilevel thresholding on an image converted from RGB to CIE LCh. After this, they used morphological operations to reduce noise on the shadow mask. [18] proposed a user-aided method for shadow removal. This method produces a robust removal of shadows by applying a bilateral filter and piecewise curve fitting on edges. It also offers flexible user interaction, but for shadow detection it requires shadow features set by a user, i.e. deep a priori knowledge or assumptions about the examined scene are needed.
A partially active method was presented by [19]. They detected the backscattered light with several photodiodes as single-pixel detectors at different spatial locations. The shading profile could be determined by the measured reflections and the positions of detectors. This profile was then applied to obtain an image where shadows are complementary to each other. This technique performed quite well but uses structured illumination generated by a projector and scene preparation is required with illumination sensors. [20] used inconsistencies in shadow geometry for image forgery detection. They computed the correspondence of an object and its cast shadow, but the object and shadowed points were set manually, by the user.
In a comparative evaluation, [21] listed and compared recent techniques for moving cast shadow detection of the following categories: chromaticity, physical property, geometry and texture. They stated that all approaches make different contributions and all have individual strengths and weaknesses. Geometry based models cannot be generalised because of strict assumptions, physical models can improve other results by adding local shadow models. Small region texture based methods perform well on textured images, but require more implementational and computational resources. Chromaticity based methods are faster to implement and run, but more sensitive to noise and less effective on under-saturated regions. [22] created an occlusion map gathered from improved BCP (bright channel prior) and joint models, with luminance and chromaticity to refine the shadow mask. In [23], Gau et al. apply a region based approach, in which besides considering individual regions separately, they predict relative illumination conditions of the segmented areas and perform a pairwise classification. [24] proposed a method combining intensity information and geometric features in grayscale images. Normally illuminated areas and shadow candidate areas are distinguished by grey level thresholds. Areas are formed from corresponding pixels by morphological steps. A particle swarm optimization is used for feature extraction and shadow-non-shadow region pairs are created by Kolmogorov test of these features.
Based on the works cited above, the identification of shadowed areas based on global statistical methods can only be applied when the chromaticity values are measured correctly. Should the chromaticity be insufficient, i.e. without enough energy, the chromaticity based segmentation fails. In this case one has to rely on illumination data. Moreover, intensity based global segmentation also performs poorly when the intensity of the shadow lies between the intensity of the object and that of the background. In such cases, the shadowed area would be identified as part of the dark object.
In [25] Otsu published a global statistical method applied on intensity histograms to threshold grayscale images. [26,27] combined color and intensity based methods and published a basic statistical method to separate background and foreground independently from cast shadows.
The above solutions generally used principal component analysis, Fourier transform and other computationally expensive methods. In contrast, our main goal was to create a simple pre-processing method with noise filtering and segmentation, which can be easily implemented, does not contain unknown parameters, and can prepare any field samples for further evaluation processes with acceptable efficiency and speed.
Based on the above results, the presented method combines hue and intensity based evaluations with Otsu's inter-class variance in hue data and weighted intra-class variance in intensity for enhanced shadow thresholding.
Segmentation of shadowed regions based on global chromaticity or intensity statistical data can be applied only with very strict limitations. Nevertheless, from the perspective of the current study, identification of shadows is a very important issue. Therefore, applying as many limitations as required is acceptable to produce useful output, as a part of a preprocessing step of a complex process. These limitations are defined in the initial conditions: single, flat leaf over a homogeneous background. In such images shadows can only appear on the background.
The shadow detection of 2D images can be extended by gathering 3D depth information about the scene.
Shadows are the result of occlusion in the illumination setup, but they appear as independent object components on existing surfaces. Such shadow edges can be used to extract 3D information by processing visual perception results about the surface they appear on. With the help of a 3D depth map of the scene, shadow edges could be eliminated from coherent, monolithic surfaces.
Depth perception may need extra sensors or cameras, and at the least extra time to acquire the additional information. In some cases, the extra time is worth spending on gathering 3D data, but in many cases fast, 2D-only methods are needed. The present paper deals only with 2D methods and shadow extraction.

THE PROPOSED METHOD

Chromaticity Based Primary Segmentation
In theory, the homogeneous object could easily be separated from homogeneous background based on chromaticity data. The segmentation is done by the bipolar histogram segmentation method, developed by Otsu. The result is presented in Fig. 1.
In Eq. (1) through Eq. (5), the subscripts 0 and 1 refer to the classes separated by thresholding, $w_0(t)$ and $w_1(t)$ are the weights of the classes for the threshold index $t$, $\mu_0(t)$ and $\mu_1(t)$ are the weighted averages of the classes, and $\mu_T$ is the weighted average of the full histogram. $T$ denotes the histogram of $L$ bins without thresholding, in which $p_i$ is the probability of the $i$-th bin.
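For reference, the quantities named above appear in the textbook formulation of Otsu's method, which can be written as follows. This is a sketch of the standard form only: the layout and numbering of Eq. (1) to Eq. (5) in the original are not asserted here, and the between-class variance symbol $\sigma_B^2(t)$ and the optimum $t^*$ are notational assumptions.

```latex
\begin{align*}
w_0(t) &= \sum_{i=0}^{t} p_i, &
w_1(t) &= \sum_{i=t+1}^{L-1} p_i = 1 - w_0(t), \\
\mu_0(t) &= \frac{1}{w_0(t)} \sum_{i=0}^{t} i\,p_i, &
\mu_1(t) &= \frac{1}{w_1(t)} \sum_{i=t+1}^{L-1} i\,p_i, \\
\mu_T &= \sum_{i=0}^{L-1} i\,p_i = w_0(t)\,\mu_0(t) + w_1(t)\,\mu_1(t), \\
\sigma_B^2(t) &= w_0(t)\bigl(\mu_0(t) - \mu_T\bigr)^2 + w_1(t)\bigl(\mu_1(t) - \mu_T\bigr)^2
              = w_0(t)\,w_1(t)\bigl(\mu_1(t) - \mu_0(t)\bigr)^2, \\
t^* &= \arg\max_{0 \le t < L-1} \sigma_B^2(t).
\end{align*}
```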
The chromaticity transformation changes the color specified by the two dominant RGB channels to a hue value and projects it onto a color circle. For numerical representation, the rotation angle measured from red as the base is used. Thus, the values of the hue histogram lie between 0° and 360°. With this representation, the colors at the histogram edges (0° and 359°) represent almost the same red color. The human eye cannot even sense the difference, and it is just at the perception limit of the digital camera. As a result, the projection of a bipolar histogram containing red becomes tripolar. In such a case, as shown in Fig. 2, Otsu's thresholding method cannot be applied directly.

Figure 2 Otsu's segmentation of a tripolar histogram
To define the optimal threshold, Otsu's thresholding method is used with increasing circular shifts of the histogram. Circularity means that values shifted out on one side of the histogram re-enter on the other side in the same order. However, merely maximizing the inter-class variance across all shifts would not find the optimal shift and threshold, because a pole could be cut.
To avoid this, the best shift ($s_{opt}$) is found by minimizing the distance between the class expected values and the class maximums, as described in Eq. (6) to Eq. (8). This is appropriate because the two subsets sought have distributions very similar to a bell curve.
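The shift search described above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names are hypothetical, the per-class distances are simply summed, and class 0 is taken to be the bins at or below the relative threshold after shifting.

```python
import numpy as np

def otsu_threshold(hist):
    """Return the bin index t that maximizes the between-class variance."""
    p = np.asarray(hist, dtype=float)
    p = p / p.sum()
    bins = np.arange(p.size)
    w0 = np.cumsum(p)                  # weight of class 0 for every candidate t
    w1 = 1.0 - w0                      # weight of class 1
    m = np.cumsum(bins * p)            # cumulative first moment
    mu_t = m[-1]                       # mean of the full histogram
    with np.errstate(divide="ignore", invalid="ignore"):
        sigma_b = (mu_t * w0 - m) ** 2 / (w0 * w1)
    sigma_b[~np.isfinite(sigma_b)] = 0.0   # empty classes carry no variance
    return int(np.argmax(sigma_b))

def peak_to_mean_distance(hist, t):
    """Sum over both classes of |index of class maximum - class expected value|."""
    total = 0.0
    for lo, hi in ((0, t + 1), (t + 1, hist.size)):
        part = np.asarray(hist[lo:hi], dtype=float)
        if part.sum() == 0:
            return np.inf              # degenerate split, never selected
        idx = np.arange(lo, hi)
        mean = (idx * part).sum() / part.sum()
        peak = lo + int(np.argmax(part))
        total += abs(peak - mean)
    return total

def circular_otsu(hist):
    """Try every circular shift; keep the one whose classes look most bell-shaped."""
    hist = np.asarray(hist, dtype=float)
    best = None
    for s in range(hist.size):
        shifted = np.roll(hist, -s)    # original bin s becomes bin 0
        t = otsu_threshold(shifted)
        d = peak_to_mean_distance(shifted, t)
        if best is None or d < best[0]:
            best = (d, s, t)
    _, s_opt, t_rel = best
    return s_opt, t_rel

def class_of_bins(size, s_opt, t_rel):
    """Boolean array: True where an original bin falls into class 0."""
    return ((np.arange(size) - s_opt) % size) <= t_rel
```

With a histogram whose red pole straddles 0°, the selected shift moves the cut into an empty region, so hue bins on both sides of 0° end up in the same class.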
In Eq. (6) through Eq. (9) (see also Fig. 3), $hist_s$ is the histogram circularly shifted by $s$, $I(s)$ is the index of the class intensity maximum, $d(s)$ is the distance between the index of the class maximum and the class expected value, computed for each class and combined for both, and $s_{opt}$ is the optimal shift.
Hue values should be taken into consideration only above a given saturation and intensity, because without sufficient intensity the relevance of the hue value becomes unacceptably low, or the value is completely invalid.
These cases appear with increased frequency in connection with shadows because their main property is the reduced intensity caused by occlusion of illumination. Therefore, hue based classification of shadowed areas is problematic.
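The restriction to sufficiently saturated and bright pixels can be illustrated with a short sketch. The limits `min_sat` and `min_val` are hypothetical placeholders (the text does not state the values used), and saturation and value are assumed to be scaled to the range 0 to 1.

```python
import numpy as np

def reliable_hue_histogram(hue_deg, sat, val, min_sat=0.2, min_val=0.2, bins=360):
    """Histogram of hue (in degrees) over pixels with enough saturation and value.

    min_sat and min_val are assumed limits chosen for illustration only.
    """
    hue_deg = np.asarray(hue_deg, dtype=float)
    reliable = (np.asarray(sat) >= min_sat) & (np.asarray(val) >= min_val)
    hist, _ = np.histogram(hue_deg[reliable] % 360.0, bins=bins, range=(0.0, 360.0))
    return hist, reliable
```

Pixels excluded here, which include many shadowed ones, carry no dependable hue and have to be judged from intensity data instead.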
When the binary image resulting from hue-based thresholding, shown in Fig. 5a, is compared to the grayscale representations of Fig. 6a to Fig. 6c or to a binary segmentation created by a human expert (using much more complex algorithms for segmentation and classification, and a priori information), shown in Fig. 5b, a classification error of shadowed areas can be noticed, particularly on the right side of the leaf. This is because the illumination came from the left, so the shadows were cast onto the right side. These dark shadowed areas were classified the same as the dark leaf body, even though their intensities were different. This classification error changes the identified structure of the leaf.
In this paper, the focus is on identification of shadowed areas, and their removal from object/leaf pixels.

Shadow Detection
Besides hue values, intensity data also provide information about shadowed areas, so they are used to increase the efficiency of the segmentation method. For this, an intensity representation is required that contains the necessary information and is computationally inexpensive. The following widely used intensity representations can be applied; examples are shown in Fig. 6a to Fig. 6c:
• Value: Value data of the HSV color space, the intensity of the dominant color channel (Fig. 6a): $\max(R, G, B)$
• Lightness: the intensity average of the most and least dominant color channels (Fig. 6b): $(\max(R, G, B) + \min(R, G, B)) / 2$
• Intensity: the intensity average of all color channels (Fig. 6c): $(R + G + B) / 3$
• GrayScale: the weighted intensity average of all color channels: $I_{gs} = 0.2126 R + 0.7152 G + 0.0722 B$
The last, grayscale representation differs from intensity only in the channel weights; it does not contain additional information, and as such its examination is not necessary.
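The four representations listed above can be computed directly from an RGB array, as the following sketch shows. The function name and the dictionary keys are illustrative choices; the input is assumed to be an array whose last axis holds the R, G and B channels.

```python
import numpy as np

def intensity_representations(rgb):
    """Return the four intensity representations of an (..., 3) RGB array."""
    rgb = np.asarray(rgb, dtype=float)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    cmax = rgb.max(axis=-1)            # most dominant channel
    cmin = rgb.min(axis=-1)            # least dominant channel
    return {
        "value": cmax,
        "lightness": (cmax + cmin) / 2.0,
        "intensity": (r + g + b) / 3.0,
        "grayscale": 0.2126 * r + 0.7152 * g + 0.0722 * b,
    }
```

Value needs only a per-pixel maximum and is already a by-product of the RGB-to-HSV conversion, which makes it the cheapest of the four when hue is computed anyway.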
The outputs of the listed representation methods look the same, but obviously they are not equal. Fig. 7a to Fig. 7b show the differences between them. In the difference images two things are noticeable. First, the difference between intensity and lightness is the lowest. Second, the difference is largest in the examined shadow areas, as can be seen in Fig. 7. Statistics of the differences are summarized in Tab. 1. Based on Fig. 6 and Tab. 1, it can be stated that all representations (Eq. (10) to Eq. (13)) contain enough information about shadowed areas, so the representation to use can be selected based on computational requirements. Since the RGB-to-HSV transformation computes both hue and value data, the value representation is used. With this representation, the intensity based segmentation is first done by applying the original Otsu's thresholding method to the value channel. Its result is shown in Fig. 8a. As mentioned earlier, this method handles histograms as bipolar functions, so it classifies low intensity shadowed areas into the class with the closer intensity. The reliability of the classification decreases with the intensity, and the classification could change within a theoretically closed spatial region under different illumination. Fig. 8a presents the binary segmented leaf, and Fig. 8b and Fig. 8c show that the black-and-white binary segmentation of the grayscale leaf contains a part of the shadowed background. This segmentation error, like the hue classification error, could significantly change the final leaf structure. Thus, this segmentation must be improved.
To support the improvement, a decision uncertainty is introduced. This function describes the distance of the histogram item from the closer classification threshold, normalized by half of the class width, as described in Eq. (14). Two thresholds must be applied because of the circular segmentation: one is the shift of the histogram, the other is the relative threshold selected by Otsu's method. As shown in Fig. 9, this method marks the shadowed regions with higher decision uncertainty.

Figure 9 Distance from closer threshold
Unfortunately, there is a high chance that the histogram will be similar to that in Fig. 2, where the pole does not appear in the centre of the class. In this case, the object points will get a higher uncertainty value instead of the anomaly points.
In Eq. (14), $p_{uc}(i)$ is the uncertainty of the decision about the class of the $i$-th element of the histogram, $t_s$ is the start index of the class (the shift of the circular thresholding method), and $t_e$ is the end index of the class ($t_e = s + t(s)$).
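The quantity described in words above, the distance of a histogram index from the closer class boundary divided by half of the class width, can be sketched as follows. The function name is hypothetical, and how this normalized distance enters the uncertainty of Eq. (14) is not reproduced here; the sketch only computes the distance itself.

```python
import numpy as np

def threshold_distance(indices, t_s, t_e):
    """Distance from the closer class boundary, normalized by half the class width.

    Returns 0 at either boundary (t_s or t_e) and 1 at the centre of the class.
    """
    i = np.asarray(indices, dtype=float)
    half_width = (t_e - t_s) / 2.0
    dist = np.minimum(np.abs(i - t_s), np.abs(t_e - i))
    return dist / half_width
```

Because the measure depends only on the class boundaries, a pole that does not lie at the centre of its class is scored by its position rather than by its frequency, which is the weakness noted in the text.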
Based on the analysis of the image creation process, it can be presumed that both the object and the background have more pixels with normal illumination than with shadows. The underlying reason is that, with almost flat surfaces, their distance will be small, and the illumination of the full surface will be consistent, with an illumination angle greater than 45°.
Based on this assumption, the segmentation method can be enhanced by reformulating the uncertainty. In the new formula, shown by Eq. (16) and Eq. (18), the uncertainty is the absolute distance between the class expected value and the examined intensity index, normalized with the variance. The class indicator used in these formulas indicates that the pixel at location (x, y) is classified as an object.
The enhanced uncertainty values are calculated by applying Eq. (14) to Eq. (17) and are presented in Fig. 10. By thresholding these values, the areas whose illumination is highly likely to differ from the class average can be marked.
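A minimal sketch of this class-relative measure is given below. Several choices in it are assumptions rather than details taken from the text: the deviation is divided by the class standard deviation (the text says it is normalized with the variance), the cut-off `limit` is a hypothetical value, and only pixels darker than the class mean are marked, on the reading that shadows show up as under-illumination.

```python
import numpy as np

def class_deviation(value, class_mask):
    """Absolute deviation of Value from the class mean, scaled by the class spread."""
    v = np.asarray(value, dtype=float)
    m = np.asarray(class_mask, dtype=bool)
    mu = v[m].mean()                   # expected Value of the class
    sigma = v[m].std()                 # assumed normalizer (standard deviation)
    u = np.zeros_like(v)
    if sigma > 0:
        u[m] = np.abs(v[m] - mu) / sigma
    return u, mu

def shadow_candidates(value, class_mask, limit=2.0):
    """Pixels of the class that are unusually dark for that class."""
    u, mu = class_deviation(value, class_mask)
    v = np.asarray(value, dtype=float)
    return np.asarray(class_mask, dtype=bool) & (u > limit) & (v < mu)
```

Since normally illuminated pixels are assumed to dominate each class, the class mean stays close to the normal illumination level and shadowed pixels stand out as outliers.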
The proposed pixel-based method may generate pixel labels with false classification. Isolated pixels could be removed by an area-based thresholding, so that only regions with size greater than a threshold can be considered as shadow regions.
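The area-based filtering mentioned above can be sketched with a simple flood fill. The function name, the use of 4-connectivity and the parameter `area_threshold` are assumptions made for illustration; the text does not specify the connectivity or the threshold value.

```python
from collections import deque
import numpy as np

def remove_small_regions(mask, area_threshold):
    """Keep only 4-connected regions whose pixel count exceeds area_threshold."""
    mask = np.asarray(mask, dtype=bool)
    kept = np.zeros_like(mask)
    seen = np.zeros_like(mask)
    rows, cols = mask.shape
    for r0, c0 in zip(*np.nonzero(mask)):
        if seen[r0, c0]:
            continue
        # Collect one connected region with a breadth-first flood fill.
        region = [(r0, c0)]
        seen[r0, c0] = True
        queue = deque(region)
        while queue:
            r, c = queue.popleft()
            for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols and mask[rr, cc] and not seen[rr, cc]:
                    seen[rr, cc] = True
                    region.append((rr, cc))
                    queue.append((rr, cc))
        if len(region) > area_threshold:
            for r, c in region:
                kept[r, c] = True
    return kept
```

For large images a library routine for connected-component labelling would be faster; the explicit loop is used here only to keep the sketch self-contained.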

RESULTS
As the last step of the proposed process, the binary image of Fig. 5a, which was created by circular hue based thresholding, is modified by marking the shadowed areas of Fig. 10b as part of the background. The applied mask computed from the uncertainty is shown in Fig. 11a to Fig. 11b. The result of the full process is an enhanced, general statistics based binary segmentation, with an output that better fits the ground truth created manually by a human expert. In Fig. 11a, the original grayscale image is shown. Fig. 12b to Fig. 12d show the binary images created by the original hue based thresholding and by the enhanced thresholding, and the ground truth. A magnification of the (middle-left) leaf part of the original grayscale image and its binary representations are shown in Fig. 12a to Fig. 12d. It is clearly visible that the enhanced binary image fits the original leaf shape better than the result of the primary intensity based segmentation.
Human segmentation also uses data other than pixel intensity and color statistics, such as local gradient size and direction, and edge direction estimated from its partial identification. Because of these, the human-made segmentation was generally stricter. This caused false positive predictions along the contour of the leaf. To adjust for this segmentation mismatch, the ground truth was dilated morphologically. By applying the methods above, the number of false positive predictions was reduced by 57% compared to the hue based segmentation. The adjusted prediction errors appeared only in areas with extremely low illumination, affecting at most one third of the leaf. Therefore, partial recognition could still take place based on the leaf structure. Apart from the prediction errors at the border, the expected probability of false negative predictions was 0%.
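The corrective step and the false positive comparison can be expressed compactly, as in the sketch below. The function names are hypothetical and the arrays in the usage note are toy data; the sketch does not reproduce the reported figures.

```python
import numpy as np

def remove_shadow(object_mask, shadow_mask):
    """Re-classify shadow pixels of the object class as background."""
    return np.asarray(object_mask, dtype=bool) & ~np.asarray(shadow_mask, dtype=bool)

def false_positives(predicted, ground_truth):
    """Count pixels predicted as object that the ground truth marks as background."""
    p = np.asarray(predicted, dtype=bool)
    g = np.asarray(ground_truth, dtype=bool)
    return int(np.count_nonzero(p & ~g))

def relative_reduction(before, after):
    """Fraction by which a count decreased, e.g. 0.5 for a 50% reduction."""
    return (before - after) / before
```

For example, with a ground truth of `[1, 1, 0, 0, 0]`, a hue based mask of `[1, 1, 1, 1, 0]` and a shadow mask of `[0, 0, 1, 0, 0]`, the refined mask is `[1, 1, 0, 1, 0]`, so the false positive count drops from 2 to 1.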
For the ragweed leaf structural recognition experiment, more than 217 images were created, and 38% of them contained shadow that modified the structure after binary segmentation. Thus, the proposed method was used on 83 images to improve binary segmentation.

CONCLUSION
Without a-priori knowledge about the scene, identification of shadowed areas of a single image is a very complex and hard task. Global statistical calculations concerning the whole image have the advantage of simplicity, and shorter run times, but they are sensitive to noise and do not utilise local gradient information.
The presented method is well suited to the segmentation of images in which the chromaticity and illumination properties of the leaf and the background dominate the histograms. The primary rough segmentation can be refined by detecting under-illumination appearing at the bottom of the classes in the illumination histogram; these values can be treated as background. A priori information on the samples can be utilised in the chromaticity subspace, since vague self-shadows are represented with enough radiation energy. A maximized circular Otsu's segmentation in the chromatic histogram roughly separates the leaf from the background. This can be further improved by local semantic connections between the background and the object: namely, the standard deviation of pixel intensities from the expected value of the primarily selected class can be exploited to improve the segmentation.
Application of the proposed method allows a more precise segmentation of the available ragweed leaf samples, approximating the real shape better than previous methods; thus, a more precise structural analysis of the samples is possible. It is important to note that the method is specially designed for leaf-sample processing and that, because of its strict limitations, its efficiency may decrease significantly in general image processing tasks.