AUTHOR CONTRIBUTION

This author is the only contributor.

The aim of this study is to emphasize the importance of artificial intelligence (AI) and causality modelling of food quality and analysis with ’big data’. AI with structural causal modelling (SCM), based on Bayesian networks and deep learning, enables the integration of theoretical field knowledge in food technology with process production, physicochemical analytics and consumer organoleptic assessments. Food products are complex in nature and their data are high-dimensional, with intricate interrelations (correlations) that are difficult to relate to consumer sensory perception of food quality. Standard regression modelling techniques such as multiple ordinary least squares (OLS) and partial least squares (PLS) are effective for prediction by linear interpolation of observed data under cross-sectional stationary conditions. Upgrading linear regression models by machine learning (ML) accounts for nonlinear relations and reveals functional patterns, but remains prone to confounding and failed predictions under unobserved nonstationary conditions. Confounding of data variables is the main obstacle to applying regression models in food innovations under conditions outside the training data. Hence, this manuscript focuses on applying causal graphical models with Bayesian networks to infer causal relationships and intervention effects between process variables and consumer sensory assessment of food quality.

This study is based on the data available in the literature on the process of wheat bread baking quality, consumer sensory quality assessments of fermented milk products, and professional wine tasting data. The data for wheat baking quality were regularized by the least absolute shrinkage and selection operator (LASSO elastic net). Bayesian statistics was applied for the evaluation of the model joint probability function for inferring the network structure and parameters. The obtained SCMs are presented as directed acyclic graphs (DAG). D-separation criteria were applied to block confounding effects in estimating direct and total causal effects of process variables and consumer perception on food quality. Probability distributions of causal effects of the intervention of individual process variables on quality are presented as partial dependency plots determined by Bayesian neural networks. In the case of wine quality causality, the total causal effects determined by SCMs are positively validated by the double machine learning (DML) algorithm.

A data set of 45 continuous chemical, physical and biochemical variables of wheat properties from seven Croatian cultivars during two years of controlled cultivation was analysed. LASSO regularization of the data set yielded ten key predictors, accounting for 98 % of the variance of the baking quality data. Based on the key variables, a random forest model predictive of quality was derived, with 75 % cross-validation accuracy. Causal analysis between the quality and key predictors was based on the Bayesian model shown as a DAG. Protein content showed the most important direct causal effect, with a path coefficient of 0.71, while THMM (total high-molecular-mass glutenin subunits) content was an indirect cause with a path coefficient of 0.42. The total average causal effect (ACE) of protein was 0.65.

The large data set of the quality of fermented milk products included binary consumer sensory data (taste, odour, turbidity), continuous physical variables (temperature, fat, pH, colour) and three grade classes of products by consumer quality assessment. A random forest model was derived for the prediction of the quality classification, with an out-of-bag (OOB) error of 0.28 %. The Bayesian network model predicts that the direct causes of the taste classification are temperature, colour and fat content, while the direct causes of the quality classification are temperature, turbidity, odour and fat content. The key ACEs on quality grade were estimated as –0.04 grade/°C for temperature and 0.3 quality grade/fat content for fat. The temperature ACE dependence is nonlinear, showing negative saturation with the ’breaking’ point at 60 °C, while the fat ACE had a positive linear trend.

Causal quality analysis of red and white wine was based on the large data set of eleven continuous variables of physical and chemical properties and quality assessments classified in ten classes, from 1 to 10. Each classification was obtained in triplicate by a panel of professional wine tasters. A non-structural double machine learning (DML) algorithm was applied for total ACE quality assessment. The alcohol content of red and white wine had the key positive ACE relative factor of 0.35 quality/alcohol, while volatile acidity had the key negative ACE of –0.2 quality/acidity. The ACE predictions obtained by the unstructured DML algorithm are in close agreement with the ACE obtained by the SCM.

Novel methodologies and results for the application of causal artificial intelligence models in the analysis of consumer assessment of the quality of food products are presented. The application of Bayesian network structural causal models (SCM) enables the d-separation of pronounced effects of confounding between parameters in noncausal regression models. Based on the SCM, inference of ACE provides substantiated and validated research hypotheses for new products and support for decisions of potential interventions for improvement in product design, new process introduction, process control, management and marketing.

According to the EU Commission report by the Knowledge Centre for Food Fraud and Quality (KC-FFQ), based on 30 000 respondents, 65 % of them perceived food quality as ’very important’ when deciding what to buy, compared to food price, which is important to 54 % of consumers (

System view of causal AI model application for process and market decision making, management and innovations of food quality by do(x) inference

The baking quality of seven winter wheat cultivars from the Slavonia region in eastern Croatia was analysed. The volume of a bread loaf under the standard baking protocol was used as the baking quality test. The cultivars were grown for a period of three years under controlled conditions at the experimental field of the Agricultural Institute Osijek, Croatia. Their quality properties were evaluated by 45 physical, chemical and biochemical variables. Each parameter was determined in triplicate during three consecutive years of cultivation. The measured variables were grouped as 6 indirect quality parameters, 7 farinographic parameters, 5 extensographic parameters and 25 variables from reversed-phase high-performance liquid chromatography (RP-HPLC) of gluten proteins. The experimental methodology and the data are available in the published manuscripts (

This dairy dataset contained 1059 samples of consumer quality assessments of fermented dairy products (

The wine quality dataset was large, comprising 1599 red and 4898 white samples of the Portuguese Vinho Verde wine, characterized by 12 physical and chemical composition variables and quality assessments provided by a panel of professional wine tasters (

The basic principles of causal AI modelling are based on the concepts of Bayesian statistics and Bayesian networks (BN). Bayesian statistics combines prior knowledge (the old model) with new experimental observations (data) to predict an updated model. Prior knowledge in modelling is both deductive (known theoretical knowledge) and inductive (empirical structures and model parameters known from previous studies). Knowledge of a causal AI model was expressed as a joint probability density function P of the model conditioned on new data. Causal AI modelling is a two-stage process: the first stage determines the structure of a BN graph G, and the second stage determines the functional causal dependencies between variables, followed by estimation of the model parameters θ.

The two-stage process of structural causal modelling (SCM) was expressed as a product of the corresponding probability density functions:
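In the notation of this section, with X denoting the observed data (as in π(θ|X) used later in the text), the product presumably takes the standard form for Bayesian network learning. The expression below is a reconstruction under that assumption and not necessarily the exact published equation:

```latex
P(G, \theta \mid X) \;=\; P(G \mid X)\; P(\theta \mid G, X)
```

The first factor corresponds to structure learning and the second to parameter learning given the structure.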

With the inferred causal structure G and parameters θ, the model posterior distribution was expressed by the basic Bayesian relationship:
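Written out for a fixed structure G, Bayes' theorem for the parameter posterior has the generic form below. The symbols L (likelihood of the data) and π(θ) (prior) are assumed names introduced for illustration; only π(θ|X) appears in the text:

```latex
\pi(\theta \mid X) \;=\;
  \frac{L(X \mid \theta)\,\pi(\theta)}
       {\int L(X \mid \theta')\,\pi(\theta')\,\mathrm{d}\theta'}
```

The denominator is a normalizing constant that does not depend on θ, which is why sampling methods that need the posterior only up to proportionality are commonly used.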

In case of a model with continuous random variables (Gaussian), it is explicitly expressed in a functional form as:

Extensive sampling by the Markov chain Monte Carlo (MCMC) algorithm was applied for statistical inference from the multivariable posterior probability distribution π(θ|X) of the model.
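The idea of posterior sampling can be illustrated with a minimal random-walk Metropolis sketch in Python. It is not the sampler used in this study: it assumes a single parameter (a mean mu), a Gaussian likelihood with known standard deviation, a wide Gaussian prior and synthetic data, and all function and variable names are hypothetical.

```python
import math
import random


def log_posterior(mu, data, sigma=1.0, prior_mean=0.0, prior_sd=10.0):
    """Unnormalized log posterior: Gaussian prior on mu plus Gaussian likelihood."""
    log_prior = -0.5 * ((mu - prior_mean) / prior_sd) ** 2
    log_lik = -0.5 * sum(((x - mu) / sigma) ** 2 for x in data)
    return log_prior + log_lik


def metropolis(data, n_samples=5000, step=0.5, burn_in=1000, seed=0):
    """Random-walk Metropolis sampler for the posterior of mu."""
    rng = random.Random(seed)
    mu = 0.0
    current = log_posterior(mu, data)
    samples = []
    for i in range(n_samples + burn_in):
        proposal = mu + rng.gauss(0.0, step)
        candidate = log_posterior(proposal, data)
        # Accept with probability min(1, posterior ratio).
        if rng.random() < math.exp(min(0.0, candidate - current)):
            mu, current = proposal, candidate
        if i >= burn_in:
            samples.append(mu)
    return samples


# Synthetic observations (assumption: true mean 2.0, unit variance).
data_rng = random.Random(1)
data = [data_rng.gauss(2.0, 1.0) for _ in range(50)]

samples = metropolis(data)
posterior_mean = sum(samples) / len(samples)
print(f"posterior mean of mu: {posterior_mean:.3f}")
```

For this conjugate toy case the posterior mean is also available in closed form, which makes it a convenient check on the sampler before moving to multivariable posteriors where no closed form exists.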

Commonly, the basic modelling presumes that all considered causal effects are directional, so that the model is a directed acyclic graph (DAG) G defined by a set V of vertices (variables x_{i}) connected with a set E of oriented edges (arrows), G={V,E}. It is a Bayesian network (BN) with the Markov property, which enables decomposition of the joint probability density function P as a product of individual node (variables x_{k}) probabilities p conditioned on their parent variables Pa. The parent variables are those variables x_{i} (vertices) pointing directly to x_{k}.
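The decomposition described here is the standard Markov factorization of a Bayesian network. With n denoting the number of variables (an assumed symbol), it can be written as:

```latex
P(x_1, \ldots, x_n) \;=\; \prod_{k=1}^{n} p\bigl(x_k \mid \mathrm{Pa}(x_k)\bigr)
```

A variable without parents contributes its marginal probability p(x_k) to the product.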

Causal dependencies, direct and total, depend on a set of network paths between the cause-and-effect variables. To infer causality, confounding of interfering variables must be blocked by directed d-separation, which implies conditional independence in the probability distribution (
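When a set of variables Z blocks all confounding (backdoor) paths between a cause x and an effect y, the interventional distribution can be computed from observational quantities by the standard backdoor adjustment formula. The symbols x, y and Z are generic and assumed here for illustration:

```latex
P\bigl(y \mid do(x)\bigr) \;=\; \sum_{z} P(y \mid x, z)\, P(z)
```

For continuous Z the sum becomes an integral. The formula is valid only if Z contains no descendants of x and d-separates x from y in the graph with the arrows leaving x removed.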

The wheat data were regularized by an elastic net, which combines the L1 norm penalty of the least absolute shrinkage and selection operator (LASSO) with an L2 norm penalty (
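The combined L1 and L2 penalty can be written explicitly. In the parametrization used by the glmnet software for a Gaussian response, the objective has the form below, where N is the number of samples, λ is the overall penalty strength and α in [0, 1] is the mixing weight (α=1 gives pure LASSO, α=0 gives ridge regression). The values of λ and α used for the wheat data are not stated in this section:

```latex
\min_{\beta_0,\,\beta}\;
  \frac{1}{2N}\sum_{i=1}^{N}\bigl(y_i - \beta_0 - x_i^{\mathsf T}\beta\bigr)^2
  \;+\; \lambda\Bigl[\tfrac{1-\alpha}{2}\,\lVert\beta\rVert_2^2
  \;+\; \alpha\,\lVert\beta\rVert_1\Bigr]
```

The L1 term sets the coefficients of weak predictors exactly to zero, which is what reduces a large variable space to a small set of key features.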

The initial space of 45 wheat chemical, physical and biochemical variables was reduced to a space of 10 features obtained by the optimisation algorithm provided with the glmnet software (

The model was an ensemble of 500 trees, each grown with a random selection of 3 candidate variables per split. Validation of the prediction model showed that with the untrained out-of-bag samples it accounted for 75 % of variance (

Prediction of the wheat baking quality as volume of product with 10 key features by the random forest model

Causal Bayesian network model of the wheat key features and bread baking quality as volume. The path coefficients are the direct causal strengths evaluated with the standardized variables. P=^{3}, WA=water absorption/%, R=dough resistance/min, R/Ext=resistance/extensibility ratio, TGT=_{total}/%, THMM=total high molecular mass/%, a=

The causal inferences of the SCM were compared (validated) using an unstructured causal model with the double machine learning (DML) algorithm for estimation of the average causal effect (ACE) (_{k} predicted by the corresponding random forest (RF) model:
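A common form of the DML estimate, for a partially linear model, regresses the out-of-fold residuals of the outcome on the out-of-fold residuals of the treatment. The following is a minimal Python sketch of that form, not the procedure used in this study. It assumes synthetic data with one confounder z, a known true effect of 0.5, two-fold cross-fitting, and a simple binned-mean regressor standing in for the random forest; all names are hypothetical.

```python
import random


def binned_mean_learner(z_train, t_train, n_bins=40, lo=-2.0, hi=2.0):
    """Piecewise-constant regression of t on z (stand-in for a random forest)."""
    width = (hi - lo) / n_bins
    sums, counts = [0.0] * n_bins, [0] * n_bins

    def bin_of(z):
        return min(n_bins - 1, max(0, int((z - lo) / width)))

    for z, t in zip(z_train, t_train):
        sums[bin_of(z)] += t
        counts[bin_of(z)] += 1
    overall = sum(t_train) / len(t_train)
    # Empty bins fall back to the overall training mean.
    means = [s / c if c else overall for s, c in zip(sums, counts)]
    return lambda z: means[bin_of(z)]


def dml_ace(z, x, y, n_folds=2):
    """Cross-fitted residual-on-residual estimate of the effect of x on y."""
    n = len(z)
    num = den = 0.0
    for fold in range(n_folds):
        train = [i for i in range(n) if i % n_folds != fold]
        test = [i for i in range(n) if i % n_folds == fold]
        predict_x = binned_mean_learner([z[i] for i in train], [x[i] for i in train])
        predict_y = binned_mean_learner([z[i] for i in train], [y[i] for i in train])
        for i in test:
            rx = x[i] - predict_x(z[i])  # treatment residual
            ry = y[i] - predict_y(z[i])  # outcome residual
            num += rx * ry
            den += rx * rx
    return num / den


# Synthetic data: z confounds x and y nonlinearly (assumption for illustration).
rng = random.Random(0)
n, true_effect = 4000, 0.5
z = [rng.uniform(-2.0, 2.0) for _ in range(n)]
x = [zi ** 2 + rng.gauss(0.0, 0.5) for zi in z]
y = [true_effect * xi + 1.5 * zi ** 2 + rng.gauss(0.0, 0.5) for xi, zi in zip(x, z)]

mean_x, mean_y = sum(x) / n, sum(y) / n
naive = (sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
         / sum((a - mean_x) ** 2 for a in x))
ace = dml_ace(z, x, y)
print(f"naive slope: {naive:.2f}, DML estimate: {ace:.2f}, true effect: {true_effect}")
```

The naive regression slope absorbs the confounding through z, while the cross-fitted residual estimate should land close to the true effect. The binned learner leaves a small within-bin bias, which a more flexible learner would reduce.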

The ACE estimates with standardized data are shown as a bar chart (

Direct average causal effects (ACE) of the wheat key features on bread baking quality. The ACE values were evaluated with the standardized variables. DS=high degree of softening, a=_{total}/%

The main technological benefit is the application of the SCM to predict unconfounded effects of intervention action, _{k} with preselected deterministic value x_{k} and d-separation of confounding variables which simultaneously interfere with the intervention (treatment) and effect (outcome). To account for nonlinearity and probability in uncertainty of do(x) effects, Bayesian neural networks (BNN) were developed (
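The difference between observing x and intervening on x can be illustrated with a small simulation. The Python sketch below assumes a toy linear SCM with one confounder z (z → x, z → y, x → y, true effect 0.5); it is not one of the models of this study, and all names and coefficients are hypothetical. It compares the naive regression slope, the slope after adjusting for z, and the effect obtained by simulating do(x) directly.

```python
import random


def simulate(n, rng, do_x=None):
    """Draw (z, x, y) from a toy SCM; do_x replaces the mechanism that generates x."""
    rows = []
    for _ in range(n):
        z = rng.gauss(0.0, 1.0)                      # confounder
        x = do_x if do_x is not None else 0.8 * z + rng.gauss(0.0, 1.0)
        y = 0.5 * x + 1.0 * z + rng.gauss(0.0, 1.0)  # true causal effect of x is 0.5
        rows.append((z, x, y))
    return rows


def mean(v):
    return sum(v) / len(v)


def slope(u, v):
    """Least-squares slope of v regressed on u."""
    mu, mv = mean(u), mean(v)
    return (sum((a - mu) * (b - mv) for a, b in zip(u, v))
            / sum((a - mu) ** 2 for a in u))


def residuals(u, v):
    """Residuals of v after least-squares regression on u."""
    b, mu, mv = slope(u, v), mean(u), mean(v)
    return [vi - mv - b * (ui - mu) for ui, vi in zip(u, v)]


rng = random.Random(42)
z, x, y = (list(col) for col in zip(*simulate(20000, rng)))

naive = slope(x, y)                                  # confounded association
adjusted = slope(residuals(z, x), residuals(z, y))   # adjustment set {z}
y_do1 = mean([row[2] for row in simulate(20000, rng, do_x=1.0)])
y_do0 = mean([row[2] for row in simulate(20000, rng, do_x=0.0)])
ace_do = y_do1 - y_do0                               # effect per unit of x under do(x)

print(f"naive: {naive:.2f}, adjusted: {adjusted:.2f}, do(x): {ace_do:.2f}")
```

In this toy case the adjusted slope and the simulated intervention should agree with each other and with the true effect, while the naive slope is inflated by the confounder. With real data only the observational sample is available, so the adjustment set has to be read from the causal graph.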

Distributions of a bread loaf volume (do(x)) caused by the intervention of do(x) on the content of: a) total high-molecular-mass (THMM) gliadins and b) protein

Causal analysis of the dairy product quality data was based on the SCM. The causal network structure was learned by the hill-climbing (HC) algorithm, a greedy search over the DAG space of association structures and causal directions that optimizes the Bayesian information criterion (BIC) (
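The greedy search can be sketched in a few lines of Python. The example below is a simplified illustration under stated assumptions, not the software used in this study: it handles only binary variables, scores graphs with a multinomial BIC, and runs on synthetic data generated from a known structure a → c ← b. All names are hypothetical.

```python
import math
import random
from collections import Counter
from itertools import permutations


def node_bic(data, child, parents, n_states=2):
    """BIC contribution of one binary node given its parents."""
    joint = Counter((tuple(r[p] for p in parents), r[child]) for r in data)
    marg = Counter(tuple(r[p] for p in parents) for r in data)
    loglik = sum(c * math.log(c / marg[cfg]) for (cfg, _), c in joint.items())
    n_params = (n_states - 1) * n_states ** len(parents)
    return loglik - 0.5 * math.log(len(data)) * n_params


def bic(data, parents_of):
    return sum(node_bic(data, v, sorted(ps)) for v, ps in parents_of.items())


def is_acyclic(parents_of):
    """Repeatedly strip nodes without parents; a leftover means a cycle."""
    remaining = {v: set(ps) for v, ps in parents_of.items()}
    while remaining:
        roots = [v for v, ps in remaining.items() if not ps]
        if not roots:
            return False
        for r in roots:
            del remaining[r]
        for ps in remaining.values():
            ps.difference_update(roots)
    return True


def neighbours(parents_of):
    """Graphs reachable by adding, deleting or reversing a single edge."""
    for a, b in permutations(parents_of, 2):
        cand = {v: set(ps) for v, ps in parents_of.items()}
        if a in cand[b]:
            cand[b].discard(a)                      # delete a -> b
            yield cand
            rev = {v: set(ps) for v, ps in cand.items()}
            rev[a].add(b)                           # reverse to b -> a
            if is_acyclic(rev):
                yield rev
        elif b not in cand[a]:
            cand[b].add(a)                          # add a -> b
            if is_acyclic(cand):
                yield cand


def hill_climb(data, variables):
    """Greedy ascent in BIC starting from the empty graph."""
    current = {v: set() for v in variables}
    best = bic(data, current)
    while True:
        scored = [(bic(data, g), g) for g in neighbours(current)]
        top_score, top_graph = max(scored, key=lambda s: s[0])
        if top_score <= best + 1e-9:
            return current, best
        current, best = top_graph, top_score


# Synthetic binary data from a -> c <- b (assumption for illustration).
rng = random.Random(0)
data = []
for _ in range(2000):
    a, b = rng.randint(0, 1), rng.randint(0, 1)
    c = int(rng.random() < 0.1 + 0.4 * a + 0.4 * b)
    data.append({"a": a, "b": b, "c": c})

graph, score = hill_climb(data, ["a", "b", "c"])
print({child: sorted(ps) for child, ps in graph.items()}, round(score, 1))
```

Because BIC assigns the same score to graphs in the same Markov equivalence class, a greedy search can recover the associations while leaving some edge directions undetermined or ending in a local optimum. In practice this is usually addressed with restarts and with prior knowledge that forbids or requires certain edges.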

Directed acyclic graph (DAG) of causal effects of milk composition and process parameters on consumer assessment of dairy quality. Temp.=temperature, Turb.=turbidity

Probability distribution of quality(do(x)) of consumer assessment of dairy quality caused by a change in: a) pretreatment temperature (°C) and b) relative fat content

For the wine quality, a detailed description of the SCM and causal analysis is given by Kurtanjek (

The average causal effect (ACE) of wine quality caused by the change of the standardized values of physical and chemical parameters

This manuscript provides methodologies of causal AI modelling applied to the complex problem of integrating objective (instrumental) and subjective (human) food quality data. The obtained causal network models help food engineers with intervention decisions for existing technologies and with the innovation of new ones. The methodologies are illustrated by the models of bread baking quality, fermented dairy products and wine.

Machine learning models based on neural networks and random forests of decision trees were applied. The key research objective is the discovery of causal relations between the objective physicochemical data and consumer perception of quality. To find causal relationships between complex data of wheat biochemical and physical properties and bread baking quality, a Bayesian statistical model with Markov chain Monte Carlo (MCMC) sampling of the posterior distribution was applied. Structural causal learning and analysis of dairy products was achieved by hill-climbing optimization of the Bayesian information criterion (BIC). Besides the structural causal models, the unstructured double machine learning (DML) algorithm with random forest decision trees was applied to the wine quality data.

The main technological application of the presented causal artificial intelligence models is to evaluate the effects of interventions (’do’, do(x) operator) as improvements of production process parameters and compositions of food ingredients. The causal models help find process control patterns and support technological decisions outside the range of the available regression data. Here, for each presented model, average causal effects (ACE) were evaluated based on d-separation criteria and selection of the corresponding unconfounding adjustment sets. For wine quality, the ACE estimates based on the structural models agree with the estimates by the unstructured DML algorithm. Nonlinear causal effects are modelled by Bayesian neural networks with d-separated minimal adjustment sets and shown as partial dependency plots.

FUNDING

This research did not receive any financial support.

CONFLICT OF INTEREST

The author declares that there is no conflict of interest.