A Novel Tomato Volume Measurement Method based on Machine Vision

Abstract: Density is one of the auxiliary indicators for judging the internal quality of tomatoes. However, when density is measured, it is often difficult to measure the volume of the tomatoes accurately. To solve this problem, this study proposes a novel tomato volume measurement method based on machine vision. The proposed method uses machine vision to measure the geometric feature parameters of tomatoes and inputs them into the LabVIEW software, converting the calculation of irregular tomato volume into a BP neural network (BPNN) model that works on the plane pixel area and pixel volume, thereby realizing the modeling, analysis, design and simulation of tomato volume. An experimental platform was then constructed to compare the results of the proposed method with those predicted by a 3D wireframe model. When the number of photos taken was n = 5, the average error of the tomato volume predicted by the 3D wireframe model was 8.22% and its highest accuracy was 92.93%, while the average error of the BPNN was 4.60% and its highest accuracy was 95.60%. Increasing the number of orthographic projections improves the accuracy of the model, but beyond 7 photos the improvement was not significant. Increasing the number of nodes in the hidden layer also improves accuracy; however, because more nodes increase the operating cost of the host, a node number of 12 is suggested for tomato volume measurement. Overall, the experimental results showed that the proposed method achieved better measurement results. However, the volume measured by both models is larger than the real volume of the tomatoes. For this reason, a correction coefficient was added to the BPNN model, which increased its highest accuracy by 1.3%.


INTRODUCTION
Automatic grading of fruits and vegetables can reduce work intensity, improve grading accuracy, and increase the value of agricultural products. It is an important link in the commercialization of agricultural products [1]. Tomatoes are among the world's most widely produced and consumed agricultural products [2]. In China, approximately 60 million tons of tomatoes are produced each year. Since consumers often judge the quality of tomatoes by surface characteristics such as color, size, and shape [3][4], many experts and scholars have carried out research on automatic tomato grading, and most of their studies focus on using machine vision to detect the ripeness, color, size, and surface defects of tomatoes [1,5]. Among these characteristics, size is an important factor in consumers' decisions to buy tomatoes, and volume is the most intuitive way to describe the size of a tomato. Traditional volume measurement methods include scanning the target with a 3D scanner, the water drainage method, wireframe models, and machine vision. Each of these methods has its own advantages and disadvantages.
Lu et al. [6] used a BPNN to optimize and predict the volume shrinkage and distortion of thin-wall components. They used a method combining a support vector machine with a BPNN optimized by a genetic algorithm to accurately predict the optimization target. Jadhav successfully applied 3D imaging technology to estimate the volume of apples, oranges, mangos, strawberries, and pomegranates [7]. The measurement accuracy was 98.5%, but the use of multiple cameras led to high cost, complex platform construction, and long measurement time. Hang Zhou used 3D laser radar to estimate the volume of tree canopies [8]; the prediction accuracy was 86.1%, which is low, and the equipment used is expensive. Anderson, NT adopted machine vision to estimate the fruit load of mango trees [9]; the accuracy was about 90%, but the use of wireless remote sensing satellites led to high cost. Vivek Venkatesh employed machine vision and image processing techniques to estimate the volume and quality of axisymmetric fruits such as apples, limes, lemons and oranges [10]; the accuracy was greater than 90%, but the method is only suitable for measuring the volume of axisymmetric fruit. Although the above studies have demonstrated the feasibility of using machine vision to determine the volume of tomatoes, the actual volume prediction accuracy is generally low and the methods cannot be applied in practice.
Research based on neural networks has been widely promoted in the field of machine vision. The research of Hsueh-Chien proved the feasibility of neural networks in volume measurement. To predict tomato volume with higher accuracy, this paper compares the performance of the 3D wireframe model and the proposed BPNN in tomato volume measurement, in the hope of providing a low-cost and non-destructive method for measuring tomato volume. The input of the BPNN model is the tomato projection pixel area and the output is the real tomato volume.

Tomato Sample Selection
In the experiment, 200 ripe tomato samples of the variety Jinpeng#1 were picked from the greenhouse of Yangling Modern Agriculture Demonstration and Innovation Park. To ensure that the sample volumes were random, the ripe tomatoes were picked at random, and the volumes of the sample tomatoes were observed to roughly follow a normal distribution. To facilitate data processing and statistics, each sample was numbered: tomato samples numbered 1-100 were used to train the BPNN, and tomato samples numbered 101-200 were used to compare the performance of the two methods. Moreover, to eliminate the influence of temperature on the prediction accuracy, the samples were kept at room temperature (20 °C) for 12 hours before the experiment [11].

Volume Calculation Method based on Wireframe Model
The 3D wireframe model is a model that uses the edges and vertices of an object to represent its geometric shape [12]. It uses the projected edges of an object to generate more complicated curves, namely the projections of intersecting lines of two surfaces, outlines of surfaces, and curves on a curved surface. According to this principle, the front view of the tomato contour yields a frame consisting of a collection of pixels on the 2D plane. While the sample object rotates on a rotating table, n photos of it are taken at uniform intervals; these photos, namely the front views of the sample object, are the data source of the model.
In this study, to construct the 3D wireframe model for tomato volume measurement, a rotating table was first set up so that the sample tomatoes rotated by a fixed angle at regular time intervals, and a photo was taken at each interval; the tomato contour in each photo was then extracted via image segmentation. The image acquisition principle is shown in Fig. 1. With the center of the rotating table as the origin and the vertical axis as the Z axis, the Cartesian coordinates of the contour points were recorded. Starting from the first image, the rotation angles of the photos taken at each time interval were accumulated sequentially.
For each image, a Cartesian coordinate system was constructed with the vertical center axis of the image as the Z axis and the intersection of the Z axis with the lower edge of the photo as the origin. Then, with the Z axis as the rotation axis, the first, second, ..., and n-th images were rotated by π/n, 2π/n, ..., π, respectively. (x_ij, y_ij) was defined as the coordinate of the j-th point on the tomato contour of the i-th image, and (x'_ij, y'_ij) as its coordinate after the rotation transformation. In this coordinate system, the maximum row value of the tomato edge was detected as h_max; for the same point after rotation, its z' coordinate was h_max - x. The 2D coordinate of the image edge after rotating the i-th image around the Z axis is given by Eq. (1).
Because a realistic rendering of a tomato would make the figure hard to read, the tomato is drawn as an ellipsoid when the wireframe model is illustrated in this article. The tomato wireframe model established according to the above method is shown in Fig. 2. The model was divided equally into m slices along the vertical axis, and the volume of each slice was calculated and summed to obtain the final tomato volume.
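The rotation step described above can be sketched in Python. This is a minimal illustration under stated assumptions, not the LabVIEW implementation used in the study: each contour point is assumed to be given as a signed horizontal offset x from the Z axis together with its image row, the i-th image is assumed to be rotated by i·π/n about the Z axis, and the height is assumed to be h_max minus the row. The function name `rotate_contour` and the sample values are hypothetical.

```python
import math

def rotate_contour(points, i, n, h_max):
    """Place the contour of the i-th image (1-based) in 3D space.

    points : list of (x, row) pairs; x is the signed horizontal offset of a
             contour pixel from the Z axis, row is its image row (assumption).
    Returns a list of (x', y', z') tuples after rotating by i*pi/n about Z.
    """
    theta = i * math.pi / n          # accumulated rotation angle of image i
    rotated = []
    for x, row in points:
        x_rot = x * math.cos(theta)  # component along the X axis
        y_rot = x * math.sin(theta)  # component along the Y axis
        z_rot = h_max - row          # flip rows so that z grows upward
        rotated.append((x_rot, y_rot, z_rot))
    return rotated

# Hypothetical contour: two edge points on the same image row.
contour = [(-30.0, 40), (30.0, 40)]
print(rotate_contour(contour, i=5, n=10, h_max=100))
```

Applying the function to every image and collecting the points at equal heights gives the cross-sections from which the slices of the wireframe model can be built.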
When solving for the volume of the slices, each slice was treated as a frustum of a cone; x and y were the 2D coordinates of each slice, and h was the thickness of each slice. S_i and S_(i+1) were the bottom areas of the i-th and (i+1)-th slices, respectively. The bottom area of each slice was computed from its coordinates x and y, the volume of each slice was calculated using Eq. (2), and the total volume of the model was then obtained.
Each time an image of a sample was collected, the distance between the center of the tomato and the camera film was set to 140 mm. A coefficient V_s = 0.1246 was obtained; if the total pixel volume of a sample is V, then the actual volume of the tomato is V × V_s. The total pixel volume V, the sum of the slice volumes, is given by Eq. (3).
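The slice summation and the pixel-to-actual conversion can be illustrated with a short Python sketch. It rests on several assumptions: the cross-section area is computed with the shoelace formula (the text does not say how the bottom area was obtained), each slice uses the standard frustum volume h/3 · (S_i + S_(i+1) + sqrt(S_i · S_(i+1))), and the coefficient V_s multiplies the pixel volume. The function names and the toy cross-sections are hypothetical.

```python
import math

def polygon_area(vertices):
    """Area of a closed polygon from its (x, y) vertices (shoelace formula)."""
    total = 0.0
    for k in range(len(vertices)):
        x1, y1 = vertices[k]
        x2, y2 = vertices[(k + 1) % len(vertices)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0

def frustum_volume(s_lower, s_upper, h):
    """Volume of a frustum with end-face areas s_lower, s_upper and height h."""
    return h / 3.0 * (s_lower + s_upper + math.sqrt(s_lower * s_upper))

def pixel_volume(section_areas, h):
    """Sum the frustum volumes between consecutive cross-section areas."""
    return sum(frustum_volume(a, b, h)
               for a, b in zip(section_areas, section_areas[1:]))

V_S = 0.1246  # coefficient reported in the text for the 140 mm distance

# Hypothetical cross-sections: squares with side 0, 2, 4, 2, 0 pixels.
sections = [polygon_area([(0, 0), (s, 0), (s, s), (0, s)])
            for s in (0, 2, 4, 2, 0)]
v_pixels = pixel_volume(sections, h=1.0)
print(v_pixels, v_pixels * V_S)
```

A finer slicing (larger m, smaller h) brings the frustum sum closer to the volume enclosed by the wireframe, at the cost of more area evaluations.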

Image Acquisition Platform
A dark-box image acquisition platform was built, as shown in Fig. 3; the distance between the camera film and the center of the tomato is 140 mm. The platform is simple to build, low in cost, and easy to promote.

Image Processing
Based on the Vision Assistant of LabVIEW, a basic vision processing program [13][14][15][16] was written to detect and extract tomato contours from the images. The basic steps were image segmentation, image graying, image binarization, particle filtering, and IMAQ particle analysis to obtain the pixel area. In the experiment, the HSV color space was adopted and channel V was selected.
To prevent large errors during particle analysis, the Vision Assistant sub-function was called with the pixel area as the filter condition and the minimum value of the filter parameter set to 0.01. Particle filtering was employed in image pre-processing to remove particles with small pixel areas [17]. The image acquisition and processing steps are shown in Fig. 4. During the experiment, the control program written in LabVIEW was used to start the platform. Each time the platform rotated, the computer captured a tomato contour image and processed it. To ensure the correctness of the results, the number of captured images (namely the photos) n was set to 5. The interface of the tomato volume measurement software built on LabVIEW is shown in Fig. 5. The software can output the tomato's wireframe model to predict the volume.
During image acquisition, NI Vision Assistant was used to configure the camera, the Color Plane Extraction sub-function to extract the S-plane image in HSV color space, the Threshold function to separate the tomato image, the Particle Filter function for particle filtering, and the Particle Analysis function to calculate the pixel area. After the tomato images were extracted, the tomato edge coordinates were obtained by turning on the "Edge Information" function, and the tomato pixel volume was obtained according to Eqs. (1), (2) and (3). The Particle pixel area function of LabVIEW output the pixel area of the tomato projection, and the file IO function was used to store it. MATLAB was then used to build the BPNN model, taking the n pixel areas of each tomato as input and the real volume of each tomato as output.
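The processing chain described above (color plane extraction, thresholding, particle filtering, particle area) can be mimicked in plain Python to show what each step does. This is only a sketch under assumptions, not the NI Vision implementation: the HSV channel is left as a parameter, the threshold of 0.5 and the minimum particle area of 2 pixels are hypothetical values, and particles are taken to be 4-connected regions.

```python
import colorsys
from collections import deque

def channel_mask(rgb_image, channel=2, threshold=0.5):
    """Binarize an RGB image on one HSV channel (0 = H, 1 = S, 2 = V)."""
    return [[colorsys.rgb_to_hsv(r / 255, g / 255, b / 255)[channel] > threshold
             for (r, g, b) in row] for row in rgb_image]

def particle_areas(mask):
    """Label 4-connected foreground particles and return their pixel areas."""
    rows, cols = len(mask), len(mask[0])
    seen = [[False] * cols for _ in range(rows)]
    areas = []
    for r0 in range(rows):
        for c0 in range(cols):
            if not mask[r0][c0] or seen[r0][c0]:
                continue
            queue, area = deque([(r0, c0)]), 0
            seen[r0][c0] = True
            while queue:                      # breadth-first flood fill
                r, c = queue.popleft()
                area += 1
                for dr, dc in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                    rr, cc = r + dr, c + dc
                    if (0 <= rr < rows and 0 <= cc < cols
                            and mask[rr][cc] and not seen[rr][cc]):
                        seen[rr][cc] = True
                        queue.append((rr, cc))
            areas.append(area)
    return areas

def projection_area(rgb_image, min_area=2, channel=2, threshold=0.5):
    """Pixel area of the largest particle after removing small particles."""
    mask = channel_mask(rgb_image, channel, threshold)
    kept = [a for a in particle_areas(mask) if a >= min_area]
    return max(kept, default=0)

# Hypothetical 4x5 image: a bright red 2x3 blob plus one isolated noise pixel.
RED, BLACK = (220, 30, 30), (0, 0, 0)
image = [
    [BLACK, BLACK, BLACK, BLACK, RED],
    [BLACK, RED,   RED,   RED,   BLACK],
    [BLACK, RED,   RED,   RED,   BLACK],
    [BLACK, BLACK, BLACK, BLACK, BLACK],
]
print(projection_area(image))
```

Running `projection_area` on each of the n photos of one tomato would give the n pixel areas that serve as the network input.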

Tomato Volume Prediction Model based on BPNN
There is a clear non-linear relationship between the pixel area of a tomato's orthographic projection and its actual volume; however, the irregular shape of the tomato makes it difficult to describe this relationship with a mathematical expression. Therefore, a BPNN model was constructed to relate each tomato's orthographic projection pixel area to its actual volume [18][19][20]. The BPNN constructed in this paper had an input layer, a hidden layer and an output layer. The input layer took the n pixel areas obtained after processing, the number of hidden-layer nodes was obtained from the empirical Eq. (4), and the output layer gave the predicted volume of the tomato.
where S is the number of nodes in the hidden layer, a is the number of nodes in the input layer, b is the number of nodes in the output layer, and c is a constant between 1 and 10. Since the input was an n × 1 vector, the number of input layer nodes was 1. One of the models is shown in Fig. 6. The tomato projection data numbered 1~100 were used to train the model, and the tomato projection data numbered 101~200 were used to compare the accuracy of the two models. MATLAB was used to train the model on the obtained projection data: the PREMNMX function was used to normalize the data, the purelin function to apply linear activation, and the traingd function to perform gradient descent BP training. The formulas of the three functions are given in Eqs. (5) to (7).
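The three operations named above have simple closed forms, which the following Python sketch restates: min-max normalization to [-1, 1] (the mapping performed by MATLAB's premnmx), the identity activation of purelin, and the plain gradient descent update that traingd applies to each weight. The Python function names and the sample pixel areas are hypothetical, and the sketch is not a substitute for the MATLAB toolbox functions.

```python
def minmax_normalize(values):
    """Map values linearly to [-1, 1]; return the scaled values and (min, max).

    Assumes the values are not all equal, otherwise the range is zero.
    """
    lo, hi = min(values), max(values)
    scaled = [2.0 * (v - lo) / (hi - lo) - 1.0 for v in values]
    return scaled, (lo, hi)

def minmax_denormalize(scaled, bounds):
    """Invert minmax_normalize using the stored (min, max) pair."""
    lo, hi = bounds
    return [(s + 1.0) * (hi - lo) / 2.0 + lo for s in scaled]

def purelin(x):
    """Linear activation: the output equals the net input."""
    return x

def gradient_step(weight, gradient, learning_rate=0.01):
    """One gradient descent update: move against the error gradient."""
    return weight - learning_rate * gradient

# Hypothetical projection pixel areas of three tomatoes.
areas = [1000.0, 1500.0, 2000.0]
scaled, bounds = minmax_normalize(areas)
print(scaled, minmax_denormalize(scaled, bounds))
```

Storing the (min, max) pair of the training data matters because the same mapping has to be applied to new inputs and inverted on the predicted volumes.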
Since the numbers of input layer and output layer nodes of the BPNN model constructed in this article were both 1, the number of hidden layer nodes should be between 3 and 8. The model training data and prediction data, that is, the tomato projection data numbered 1~100 and the real tomato volume data, were read in with input and output functions and normalized with the mapminmax function. After the network structure was initialized, the network was trained for 500 epochs with a learning rate of 0.01. The prediction data were normalized with the mapminmax function before being fed to the network; the predicted output was then de-normalized, the results were analyzed, and the prediction error was output.
The backpropagation algorithm of the BPNN model constructed in this article feeds the training set, that is, the tomato projection data numbered 1~100 and the real tomato volume data, to the input layer and obtains the result at the output layer. After the error was calculated, it was propagated backward from the output layer through the hidden layer. During error backpropagation, the value of each parameter was adjusted according to the error until convergence.
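The training procedure can be sketched as a small one-hidden-layer network trained by batch gradient descent. The sketch below is written in plain Python rather than MATLAB and makes several assumptions that are not stated in the text: the hidden units use tanh, each sample supplies its n normalized pixel areas as separate inputs, the hidden layer has 6 nodes, and the training data are synthetic. Only the linear output, the 500 epochs and the learning rate of 0.01 follow the description above.

```python
import math
import random

def train_bpnn(inputs, targets, hidden=6, epochs=500, lr=0.01, seed=0):
    """Train a one-hidden-layer network with batch gradient descent.

    Hidden units use tanh (assumption); the output unit is linear.
    Returns the parameters and the mean squared error before each update.
    """
    rng = random.Random(seed)
    n_in, m = len(inputs[0]), len(inputs)
    w1 = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(hidden)]
    b1 = [0.0] * hidden
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b2 = 0.0
    history = []
    for _ in range(epochs):
        g_w1 = [[0.0] * n_in for _ in range(hidden)]
        g_b1, g_w2, g_b2, sq_err = [0.0] * hidden, [0.0] * hidden, 0.0, 0.0
        for x, t in zip(inputs, targets):
            # Forward pass: hidden activations, then linear output.
            h = [math.tanh(sum(w * xi for w, xi in zip(w1[j], x)) + b1[j])
                 for j in range(hidden)]
            y = sum(w2[j] * h[j] for j in range(hidden)) + b2
            err = y - t
            sq_err += err * err
            # Backward pass: gradients of half the squared error.
            g_b2 += err
            for j in range(hidden):
                g_w2[j] += err * h[j]
                delta = err * w2[j] * (1.0 - h[j] ** 2)
                g_b1[j] += delta
                for k in range(n_in):
                    g_w1[j][k] += delta * x[k]
        history.append(sq_err / m)
        # Gradient descent update with gradients averaged over the batch.
        b2 -= lr * g_b2 / m
        for j in range(hidden):
            w2[j] -= lr * g_w2[j] / m
            b1[j] -= lr * g_b1[j] / m
            for k in range(n_in):
                w1[j][k] -= lr * g_w1[j][k] / m
    return (w1, b1, w2, b2), history

def predict(params, x):
    """Forward pass of the trained network for one input vector."""
    w1, b1, w2, b2 = params
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w1, b1)]
    return sum(w * hj for w, hj in zip(w2, h)) + b2

# Synthetic stand-in data: 20 "tomatoes", 5 noisy normalized areas each.
data_rng = random.Random(1)
inputs, targets = [], []
for _ in range(20):
    size = data_rng.uniform(-1.0, 1.0)   # hypothetical normalized volume
    inputs.append([size + data_rng.uniform(-0.05, 0.05) for _ in range(5)])
    targets.append(size)

params, history = train_bpnn(inputs, targets)
print(history[0], history[-1], predict(params, inputs[0]))
```

With a learning rate this small, 500 epochs only partly fit the data; the point of the sketch is the order of the forward pass, error calculation, backward pass and parameter update.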

Water Drainage Method for Tomato Volume Measurement
The water drainage method was also used to measure the actual tomato volume, and the results served as the reference for comparing the performance of the 3D wireframe model and the BPNN model. The procedure was as follows: water was poured into a beaker with a drain hole in its side wall until it overflowed; the test object was then placed in the beaker, the water it displaced was collected, and the volume of the displaced water was measured with a 500 mL graduated cylinder to give the actual volume of each tomato. When the water drainage method is applied, it must be ensured that none of the calyxes of the 200 tomatoes is cracked, so that water does not enter the tomatoes, and the tomato surface must be kept dry. The histogram of the volume frequency distribution of the measurement results is shown in Fig. 7.

Figure 7 Tomato volume frequency distribution histogram
The actual volume values of the samples were close to a normal distribution, which conforms to general statistical laws and indicates that the measured samples were representative. The statistics of the volume data of all tomatoes are shown in Tab. 1.

EXPERIMENT AND ANALYSIS

Volume Measurement Results of 3D Wireframe Model and BPNN Model
To evaluate the volume measurement accuracy of the two models, the volume of each tomato was measured with the camera's photographic plate plane 140 mm away from the center of the tomato. Tomatoes numbered 101~200 were used to compare the performance of the two models; each of the 100 tomatoes was measured 10 times and the mean value was taken. The obtained data are shown in Tab. 2. To compare the tomato volumes measured by the different methods intuitively, a line chart was drawn from the data in Tab. 2, as shown in Fig. 8. In view of the influence of photo number on the results, the number of photos taken was set to n = 5.
According to Tab. 2, the volumes obtained by the wireframe model and the BPNN model were significantly larger than those measured by the water drainage method. This is because the sunken part of the tomato calyx was hidden when the photos were taken, resulting in greater predicted volume values.

Figure 8 Volume results of tomatoes numbered 101~110 measured by the two models when photo number n = 5
Figure 9 Comparison of deviation rates of the two models
To better compare the prediction results of the two models, their deviation rates were calculated, as shown in Fig. 9. The figure shows that the tomato volume values obtained by the BPNN were more accurate, with deviations within 5.5%. The deviation of the results obtained by the wireframe model was significantly greater than that of the BPNN.
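The deviation rate used in this comparison is not defined explicitly; a common choice, assumed here, is the absolute difference between the predicted volume and the water drainage reference, divided by the reference and expressed in percent. The Python sketch below applies that assumed definition to hypothetical volumes that are not taken from Tab. 2.

```python
def deviation_rate(predicted, reference):
    """Relative deviation of a predicted volume from the reference, in percent."""
    return abs(predicted - reference) / reference * 100.0

def mean_deviation(predicted_list, reference_list):
    """Average deviation rate over paired predictions and references."""
    rates = [deviation_rate(p, r) for p, r in zip(predicted_list, reference_list)]
    return sum(rates) / len(rates)

# Hypothetical volumes in mL (illustrative only, not measured data).
reference = [200.0, 150.0, 180.0]
bpnn_pred = [210.0, 156.0, 189.0]

print([deviation_rate(p, r) for p, r in zip(bpnn_pred, reference)])
print(mean_deviation(bpnn_pred, reference))
```

Under this definition, one minus the deviation rate can be read as the accuracy of a single prediction, which is how the accuracy figures in this section are interpreted in the sketch.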

Influence of Photo Number on the Prediction Results of the Two Models
In theory, increasing the photo number improves the accuracy of the wireframe model; in practice, however, n cannot be made arbitrarily large. Therefore, this paper also studied the relationship between photo number n and model accuracy, hoping to obtain satisfactory results with a smaller n. The volumes of tomato samples numbered 101~200 were measured again. Owing to the particularity of the wireframe model, n was set to n ∈ [5, 15]; the data source of the BPNN was tomato samples numbered 1~90 with n ∈ [1, 15]. The resulting model accuracy is shown in Fig. 10. The accuracy of the BPNN model obtained by increasing the number of hidden-layer nodes is shown in Fig. 11.

Figure 10 Relationship between photo number and measurement accuracy
According to Fig. 10, as the number of photos increased, the tomato volume measurement accuracy improved, and the improvement was more pronounced for the wireframe model. However, once the photo number reached n = 9, the accuracy improvement of the wireframe model was limited, while the highest accuracy of the BPNN reached 95.6%. Increasing the photo number therefore improves the prediction accuracy, but the improvement is limited and consumes additional GPU resources. One reason that the accuracy does not approach 100% is that the sunken part of the tomato calyx was hidden when the photos were taken, so the predicted volume was too large. With the number of images and hidden nodes fixed, a correction coefficient was applied to the linear output of the BPNN model; the learning rate remained 0.01, tomatoes numbered 1 to 100 were still used as the training set, and tomatoes numbered 101 to 200 were used for verification. The resulting highest accuracy was 1.3% higher than before. After comprehensively considering the relationship between measurement accuracy and measurement cost, using the BPNN with 7 photos was found to give the best overall result for tomato volume calculation.
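How the correction coefficient was chosen is not described. One simple possibility, shown here purely as an assumption, is a single multiplicative factor k fitted by least squares so that k times the predicted volume best matches the reference volume on the training tomatoes; the same k would then be applied to the verification tomatoes. The function name and the numbers below are hypothetical.

```python
def fit_correction_coefficient(predicted, reference):
    """Least-squares scalar k that minimizes sum((k * p - r) ** 2)."""
    numerator = sum(p * r for p, r in zip(predicted, reference))
    denominator = sum(p * p for p in predicted)
    return numerator / denominator

# Hypothetical training volumes in mL: predictions run 5% too high.
reference = [200.0, 150.0, 180.0]
predicted = [210.0, 157.5, 189.0]

k = fit_correction_coefficient(predicted, reference)
corrected = [k * p for p in predicted]
print(k, corrected)
```

Because both models overestimate the volume, a fitted factor of this kind would be expected to fall below 1.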

Figure 11 Influence of node number on measurement accuracy
According to Fig. 11, when the number of hidden nodes was in the interval [6, 15] and the number of photos was 5, increasing the number of hidden nodes improved model performance to some extent; however, with 13 nodes the model performance decreased, which might be related to overfitting. As the node number increased, the training cost of the model rose sharply while the accuracy improvement was not significant. It is therefore suggested to use 12 nodes for model training.

CONCLUSIONS
This paper first introduced the significance and research status of tomato volume measurement in detail. On this basis, it proposed a novel tomato volume measurement method based on LabVIEW and machine vision and gave the specific implementation scheme. The toolkits VAS, VDM and VISA of LabVIEW were used to realize image acquisition, image processing, volume prediction and communication with the lower computer. The performance of the wireframe model and the BPNN was then compared, and both methods had good accuracy; when the number of photos taken was 5, their accuracy reached 92.93% and 95.60%, respectively, with the BPNN being more accurate. For tomato volume calculation, the overall performance was best with 7 photos.
The authors are grateful to the reviewers for their helpful comments and recommendations, which improved the presentation.