Machine Learning Classification of Cervical Tissue Liquid Based Cytology Smear Images by Optomagnetic Imaging Spectroscopy

A semi-automated system for classification of cervical smear images based on Optomagnetic Imaging Spectroscopy (OMIS) and machine learning is proposed. OMIS has been applied to screen 700 cervical samples prepared according to Liquid Based Cytology (LBC) principles and to record spectra of the samples. Peak intensities and peak shift frequencies from the spectra have been used as features in classification models. Several machine learning algorithms have been tested and their classification results compared. The results suggest that the presented approach can be used to improve standard LBC screening tests for cervical cancer detection. The developed system detects pre-cancerous and cancerous states with a sensitivity of 79% and a specificity of 83%, along with an AUC (ROC) of 88%, and could be used as an improved alternative procedure for cervical cancer screening. Moreover, this can be achieved with a portable apparatus and with immediately available results.


INTRODUCTION
Cervical cancer is the fourth most common cancer in women worldwide, with 528,000 new cases, and the second most common cancer in less developed regions (445,000 cases). Approximately 87% of the estimated 266,000 cervical cancer deaths in 2012 occurred in less developed countries [1]. With regular appointments and access to accurate screening tests, precancerous lesions can be detected early and treated successfully. The first cytology-based test implemented in national screening programs in developed countries was the Papanicolaou test. Despite its low sensitivity, the Papanicolaou test has succeeded in reducing cervical cancer incidence and mortality over the past 50 years. In the meantime, persistent Human Papillomavirus (HPV) infection was recognized as a necessary cause of cervical cancer, and the FDA (Food and Drug Administration) therefore acknowledged the HPV DNA test as a primary screening test. However, the HPV DNA test cannot determine whether an infection is persistent or transient, and could lead to overtreatment in women younger than 30 years. For this reason, the HPV DNA test is recommended for use in co-testing with cytology [2,3]. Since 2006, the HPV vaccine has been introduced as part of national immunization programs in many countries, mostly in high and upper-middle income settings. Still, developing countries, which have the most to gain from HPV vaccination programs, do not have access to HPV vaccines [4]. Over the past decades, there have been many attempts to develop automated screening systems that would lower the cost and improve the accuracy of existing screening tests for cervical cancer detection [5][6][7][8][9][10]. These systems are mainly based on automated inspection of cytology samples and classification of cervical smear images into healthy and abnormal groups.
Problems associated with proper segmentation of the sample image and extraction of highly significant features such as cytoplasm and nucleus size, shape, nucleus/cytoplasm ratio, color and texture are still under consideration in many scientific studies [11][12][13][14]. Automated cancer detection aims to reduce the workload of pathologists and to exclude human error in diagnosis that arises from the labor-intensive task of manual screening. In this study, cervical sample analysis by spectroscopy and machine learning classification methods are combined in order to classify cervical samples in a semi-automatic manner. Sample spectra were obtained by Optomagnetic Imaging Spectroscopy (OMIS), which is easy to use, fast and inexpensive [15]. Previous studies reported high accuracy of OMIS in detecting abnormal cervical samples prepared according to the conventional Papanicolaou procedure, as well as in detecting colon cancer [16][17][18][19].
Here, classification results for LBC samples obtained by selected supervised learning algorithms are compared in order to investigate whether an improved sample preparation method, i.e. LBC, affects the accuracy of cervical cancer detection by OMIS.

Materials
In cooperation with Tumour Trace Ltd, UK, and Tumour Trace d.o.o, Serbia, a total of 700 LBC smears were collected in two separate studies. The first study, "Classification of cervical samples", was conducted at Southend University Hospital, UK, in October 2015, while the second study, "Testing three different devices on cervical samples", was conducted at Oncquest Laboratories in New Delhi, India, in February 2016. The Liquid Based Cytology procedure requires that samples from both the endocervix and the exocervix are collected and placed into a vial with preserving liquid, then transported to the laboratory, where the sample is processed, placed on a microscope slide and stained. All samples were examined by the standard histopathology procedure and classified into five categories: negative, moderate dysplasia (MD), high grade squamous intraepithelial lesion (HSIL), severe dysplasia (SD) and cancer. The LBC samples were then analyzed with OMIS, and an optomagnetic spectrum was obtained for every sample. For the binary classification problem considered in this paper, the LBC samples were divided into two groups: the first group consisted of 354 negative cases, and the second group consisted of 346 abnormal cases (202 MD cases, 54 HSIL cases, 63 SD cases and 27 cancer cases).

Optomagnetic Imaging Spectroscopy
Optomagnetic Imaging Spectroscopy (OMIS) is a technique used to observe differences in tissue properties based on unpaired and paired electrons and hydrogen bonds. Based on the light-matter interaction between diffuse visible and reflected polarized light and the sample, OMIS identifies the average energy state of valence electrons and hydrogen bonds within the sample material. The approach starts from the fact that the magnetic force of matter is four orders of magnitude closer to the quantum state of matter than the electrical force, which enables detection of conformational states and changes in matter at the nanoscale level [15]. Scanning a sample with OMIS involves shining white diffuse light on the sample, which interacts with the valence electrons to produce a measure of the molecules' electrical and magnetic forces. The sample is first exposed to white diffuse light perpendicular to the sample and then to white diffuse light at the Brewster angle (Fig 1). By capturing and analyzing digital images of the sample in these two modes, changes in the spectral fingerprint of the sample can be detected.

Classification
We compared the results obtained by several machine learning algorithms implemented in R, a free programming language and environment for statistical computing and graphics [20], using the following methods: "glmnet" (i.e. generalized linear model fitted via penalized maximum likelihood), "rf" (Random Forests), "gbm" (gradient boosting machines), "adaboost" (adaptive boosting), "svm" (support vector machines) and "xgboost" (extreme gradient boosting). This was done through R's "caret" (short for Classification And REgression Training) package, which acts as a unifying framework for more than 200 algorithms implemented in different R packages [21]. Such a framework allows easy construction of a plethora of models, simplified tuning of their parameters, and consistent testing and comparison of their performance. A brief, general overview of the algorithms and libraries used is presented below.
"Glmnet" is a package authored by Jerome Friedman, Trevor Hastie, Rob Tibshirani and Noah Simon, while the R package itself is maintained by Trevor Hastie [22]. It fits a generalized linear model via penalized maximum likelihood. "Glmnet" exploits so-called elastic net models, which combine the ridge and lasso penalties. In essence, "glmnet" solves the following problem:
$$\min_{\beta_0,\beta} \; \frac{1}{N}\sum_{i=1}^{N} w_i \, l\left(y_i, \beta_0 + \beta^T x_i\right) + \lambda\left[\frac{1-\alpha}{2}\lVert\beta\rVert_2^2 + \alpha\lVert\beta\rVert_1\right],$$
where $l(y_i, \eta_i)$ is the negative log-likelihood contribution of observation $i$, $w_i$ are the observation weights, and $\lambda \geq 0$ is a tuning parameter which takes values over a grid covering the entire range and controls the overall strength of the penalty. The argument $\alpha$ determines what type of model is fit. If α = 0 a ridge regression model is fit, and if α = 1 a lasso model is fit; in other words, the elastic-net penalty is controlled by α, and variation of α bridges the gap between ridge and lasso. The "glmnet" implementation is extremely fast, partially because it exploits sparsity, if present, in the input matrix x.
Support vector machine (SVM) is a well-known and established supervised learning approach, primarily used for classification, developed in the early 1990s [23,24]. SVM is a generalization of a relatively simple classifier known as the maximal margin classifier. The maximal margin classifier can be used only in cases where the classes are separable by a linear boundary. By means of the so-called kernel trick (implicitly mapping the inputs into a high-dimensional feature space), SVM also enables successful classification in the case of non-linear class boundaries.
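To make the role of α in the elastic-net penalty concrete, the following is a minimal sketch in Python (the analysis in this paper was carried out in R, so this is an illustration and not the code used here). It assumes the penalty form documented for "glmnet", λ[(1 − α)/2 · ‖β‖₂² + α · ‖β‖₁]; the function name and the coefficient values are hypothetical.

```python
def elastic_net_penalty(beta, lam, alpha):
    """Return lam * ((1 - alpha) / 2 * ||beta||_2^2 + alpha * ||beta||_1)."""
    if lam < 0 or not 0.0 <= alpha <= 1.0:
        raise ValueError("lam must be >= 0 and alpha must lie in [0, 1]")
    l1_norm = sum(abs(b) for b in beta)
    l2_norm_squared = sum(b * b for b in beta)
    return lam * ((1.0 - alpha) / 2.0 * l2_norm_squared + alpha * l1_norm)


# Hypothetical coefficient vector, used only for illustration.
beta = [0.5, -2.0, 0.0, 1.5]

ridge = elastic_net_penalty(beta, lam=0.1, alpha=0.0)  # pure ridge penalty
lasso = elastic_net_penalty(beta, lam=0.1, alpha=1.0)  # pure lasso penalty
mixed = elastic_net_penalty(beta, lam=0.1, alpha=0.5)  # halfway between the two
```

With α = 0 only the squared L2 norm contributes, with α = 1 only the L1 norm contributes, and intermediate values blend the two, which is the sense in which α bridges ridge and lasso.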
Several of the applied methods are ensemble methods, namely Random Forests, gradient boosting machine, AdaBoost and XGBoost. They are based on building a large number of "weak" learners, i.e. decision trees, in conjunction with bagging (bootstrap aggregating) or boosting techniques. Random Forests exploit bagging, while the rest, as implied by their names, utilize boosting. In a nutshell, both bagging and boosting combine a large set of weak learners into a strong learner that performs better than any single one. The main sources of error in machine learning are noise, bias and variance. Ensemble techniques help to minimize the influence of these factors, especially variance. They improve the stability and accuracy of the base learning algorithms [25,26].
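The bagging idea described above can be sketched in a few lines of Python. This is a toy illustration under stated assumptions, not the Random Forests implementation used in this study: the weak learner is a one-feature decision stump, the data are hypothetical, and the function names are invented for the example.

```python
import random


def fit_stump(xs, ys):
    """Fit a decision stump that predicts 1 when x > threshold.

    Candidate thresholds are one unit below the minimum, the midpoints
    between neighbouring distinct values, and one unit above the maximum;
    the candidate with the fewest training errors is kept.
    """
    values = sorted(set(xs))
    candidates = [values[0] - 1.0]
    candidates += [(a + b) / 2.0 for a, b in zip(values, values[1:])]
    candidates.append(values[-1] + 1.0)
    best_threshold, best_errors = candidates[0], len(xs) + 1
    for t in candidates:
        errors = sum(int((x > t) != (y == 1)) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_threshold, best_errors = t, errors
    return best_threshold


def bagged_stumps(xs, ys, n_learners=25, seed=0):
    """Train n_learners stumps, each on a bootstrap resample of the data."""
    rng = random.Random(seed)
    n = len(xs)
    thresholds = []
    for _ in range(n_learners):
        idx = [rng.randrange(n) for _ in range(n)]  # sample with replacement
        thresholds.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return thresholds


def predict(thresholds, x):
    """Aggregate the stumps by majority vote."""
    votes = sum(1 for t in thresholds if x > t)
    return int(2 * votes > len(thresholds))


# Hypothetical one-feature data with two noisy labels near the class boundary.
xs = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0]
ys = [0, 0, 0, 1, 0, 1, 1, 1, 1, 1]
ensemble = bagged_stumps(xs, ys)
```

Each stump sees a slightly different resample and therefore picks a slightly different threshold; the majority vote averages out that variation, which is the variance reduction attributed to bagging above.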
All of the above-mentioned algorithms were used for binary classification of the cervical smear samples, with one class marked as "Negative", indicating a negative test result, and the other class comprising the abnormal cases. In order to provide comparability of results and overall reproducibility of the analysis, a number of settings were fixed for all models, i.e. the number of cross-validation folds, the resampling method and the performance summaries. Evaluation of the models was performed by 10-fold cross-validation repeated 10 times, utilizing a bootstrap resampling scheme. The performance scores used as the basis for model evaluation and selection, as well as for inter-model comparison, are the area under the ROC (receiver operating characteristic) curve (AUC), sensitivity and specificity.
Further, an attempt has been made to improve classification performance by making explicit use of ensembles. Based on the results of several trials with different combinations of the aforementioned algorithms, a simple ensemble of the gbm and Random Forest models has been selected, using linear greedy optimization on AUC. For this purpose, the "caretEnsemble" R package was used [27].
To speed up the training process, parallel processing was enabled through R's "doParallel" package; more precisely, we used 6 "workers" on a MacBook Pro with a 2,2 GHz Intel Core i7 and 16 GB of 1600 MHz DDR3 memory.
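For readers less familiar with the three performance scores, the sketch below shows one way to compute them in plain Python. It is an illustration only and not the "caret" code used in the study; it assumes that label 1 denotes an abnormal sample and label 0 a negative one, and the labels, scores and 0,5 decision threshold are hypothetical.

```python
def sensitivity_specificity(y_true, y_pred):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    return tp / (tp + fn), tn / (tn + fp)


def roc_auc(y_true, scores):
    """AUC as the probability that a random positive case scores higher
    than a random negative case (ties count as one half)."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels (1 = abnormal, 0 = negative) and classifier scores.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.35, 0.7, 0.3, 0.2, 0.1]
y_pred = [1 if s >= 0.5 else 0 for s in scores]  # assumed threshold

sens, spec = sensitivity_specificity(y_true, y_pred)
auc = roc_auc(y_true, scores)
```

Sensitivity and specificity depend on the chosen threshold, whereas AUC summarizes the ranking of the scores over all thresholds, which is why it is convenient for comparing models.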
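One common way to build such a linear blend is greedy forward selection with replacement, in which the model that most improves the AUC of the averaged predictions is added repeatedly. The Python sketch below illustrates that idea under stated assumptions; it is not the source code of the R package used here, and the function names, the number of iterations and the model scores are hypothetical.

```python
def roc_auc(y_true, scores):
    """AUC as the fraction of positive/negative pairs ranked correctly."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def greedy_ensemble_weights(model_scores, y_true, n_iter=10):
    """Greedy forward selection with replacement, optimizing AUC.

    At every step the model whose addition gives the best AUC of the
    averaged scores is added; a model's final weight is the share of
    steps in which it was chosen.
    """
    names = sorted(model_scores)
    n = len(y_true)
    counts = {name: 0 for name in names}
    running_sum = [0.0] * n
    for step in range(1, n_iter + 1):
        best_name, best_auc = None, -1.0
        for name in names:
            blended = [(running_sum[i] + model_scores[name][i]) / step
                       for i in range(n)]
            auc = roc_auc(y_true, blended)
            if auc > best_auc:
                best_name, best_auc = name, auc
        counts[best_name] += 1
        running_sum = [running_sum[i] + model_scores[best_name][i]
                       for i in range(n)]
    return {name: counts[name] / n_iter for name in names}


# Hypothetical held-out labels and predicted probabilities of two models.
y_true = [1, 1, 1, 0, 0, 0]
model_scores = {
    "gbm": [0.9, 0.6, 0.4, 0.5, 0.2, 0.1],
    "rf": [0.8, 0.7, 0.55, 0.6, 0.3, 0.2],
}
weights = greedy_ensemble_weights(model_scores, y_true, n_iter=10)
```

The resulting weights are non-negative and sum to one, so the ensemble prediction is a weighted average of the individual model predictions. In practice the scores fed to such a procedure should come from held-out resamples rather than from the training fit.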

RESULTS
Cervical cytological samples were first screened with Optomagnetic Imaging Spectroscopy and, as a result, optomagnetic spectra were gathered for all considered cervical samples. The following values from the spectra were used as feature sets in the classification problems: maximum peak intensity values, wavelengths where the maximum peaks occur, minimum peak intensity values, wavelengths where the minimum peaks occur, area under the positive peaks, area under the negative peaks, and the ratio of the areas under the positive and negative peaks. Spectra obtained for the LBC sample groups HSIL, MD and SD differ in terms of the positive and negative peak intensities, as well as in terms of the wavelength differences where the peaks occur (Fig. 2). A significant shift in the position of the positive peaks can be observed: positive peaks are detected at 111,117 nm, 113,614 nm and 115,341 nm in the spectra obtained for HSIL, MD and SD, respectively.
Spectra of the normal and cancer samples differ in terms of the maximum positive and negative peak intensities: the absolute values of the maximum positive and negative peaks are lower in the spectrum of the cancer sample than in the normal cell spectrum (the maximum positive peak is 14,48 at 114,38 nm for the normal sample and 11,84 at 114,38 nm for the cancer sample, while the maximum negative peak is −9,75 at 117,99 nm for the normal sample and −4,42 at 117,52 nm for the cancer sample). Also, the spectrum of the cancer sample has a characteristic local minimum of intensity 4,52 at 112,46 nm (Fig. 3).
Specificity is somewhat higher than sensitivity for LBC sample classification (Fig. 5). It ranges from a mean (median) value of 0,77 in the case of the "adaboost" model to 0,87 in the case of the "glmnet" model.
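The feature set listed above can be sketched as a small Python function. This is a minimal illustration, assuming that a spectrum is available as two equally long lists (wavelength in nm and intensity) and that peak areas are approximated by trapezoidal integration of the positive and negative parts of the curve; the exact definitions used in the study are not given here, and the function names and sample values are hypothetical.

```python
def trapezoid_area(x, y):
    """Trapezoidal-rule integral of y over x."""
    return sum((x[i + 1] - x[i]) * (y[i] + y[i + 1]) / 2.0
               for i in range(len(x) - 1))


def spectrum_features(wavelengths, intensities):
    """Extract peak and area features from a single spectrum."""
    i_max = max(range(len(intensities)), key=lambda i: intensities[i])
    i_min = min(range(len(intensities)), key=lambda i: intensities[i])
    # Clip the curve at zero to separate positive and negative parts
    # (an approximation near zero crossings).
    positive = [max(v, 0.0) for v in intensities]
    negative = [min(v, 0.0) for v in intensities]
    area_pos = trapezoid_area(wavelengths, positive)
    area_neg = abs(trapezoid_area(wavelengths, negative))
    return {
        "max_intensity": intensities[i_max],
        "max_wavelength": wavelengths[i_max],
        "min_intensity": intensities[i_min],
        "min_wavelength": wavelengths[i_min],
        "area_positive": area_pos,
        "area_negative": area_neg,
        "area_ratio": area_pos / area_neg if area_neg > 0 else float("inf"),
    }


# Hypothetical, coarsely sampled spectrum used only for illustration.
wavelengths = [110.0, 112.0, 114.0, 116.0, 118.0, 120.0]
intensities = [0.0, 4.0, 14.0, 0.0, -10.0, 0.0]
features = spectrum_features(wavelengths, intensities)
```

Each sample then contributes one such feature vector, and the collection of vectors with their cytological labels forms the input to the classification models.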

DISCUSSION
Optomagnetic Imaging Spectroscopy is a relatively new technology, based on the interaction of visible light with the sample, that provides a spectral signature of the sample. Based on the spectral characteristics of cervical cells, OMIS can differentiate normal from abnormal cervical samples. The intensities of the characteristic peaks in the optomagnetic spectra, the wavelengths where the peaks occur and the areas under the peaks are therefore used as features in the classification of cervical samples. The goal was to compare detection rates between conventionally prepared Papanicolaou smears and liquid based cytology samples scanned with OMIS. The results show high sensitivity and specificity for classification of cervical samples in binary classification problems, where the two classes consist of "Normal" cases and "Abnormal" cases. Six different classification models were tested for classification of LBC samples.
In our previous work, classification of unstained fresh samples by OMIS into normal/abnormal classes (II Papanicolaou group vs. III, IV and V Papanicolaou groups) with a Naive Bayes classifier gave a mean sensitivity of 0,73 and a mean specificity of 0,82, while classification of stained samples into Cancer/Non-cancer classes (II Papanicolaou group, Normal, vs. V Papanicolaou group, Cancer) achieved a mean sensitivity of 0,78 and a mean specificity of 0,98 [16,19]. In this work, we tested different classification models on LBC samples for binary classification (Normal cases vs. Abnormal cases) and obtained the best classification results with the Random Forest model (mean sensitivity of 0,79, mean specificity of 0,83 and AUC = 0,88).
Classification of LBC samples based on the Random Forest model presented in this paper demonstrates superior performance in terms of sensitivity compared to the models tested in our previous research for conventionally prepared cervical cytology samples.

CONCLUSION
Application of Machine Learning (ML) in disease diagnosis is reaching its full potential nowadays. In the era of big data, constantly improving classification methods are becoming valuable assisting tools in medicine. In cervical cancer detection, ML algorithms are mainly used for image-based classification, either of images of single cervical cells or of whole-smear images of Papanicolaou smears.
In our previous work, we proposed a new system for semi-automated classification of cervical samples by combining Optomagnetic Imaging Spectroscopy with machine learning algorithms [16,17,19]. In this paper, we have expanded our research to LBC samples. Spectral properties of cervical cells were obtained with Optomagnetic Imaging Spectroscopy and used for cervical sample classification. The mean sensitivity produced by the Random Forest classification model was 0,79 and the mean specificity was 0,83, with an AUC (ROC) of 0,88.
Optomagnetic Imaging Spectroscopy has proven to be efficient, fast and cost effective. A system that combines sample screening by Optomagnetic Imaging Spectroscopy with sample classification enables semi-automatic detection of abnormal cervical samples and can be used as an alternative screening system to separate normal cases and refer abnormal cases to further testing with cervical cytology and HPV tests. This can be achieved with a portable apparatus and with immediately available results. The quality of the signal detected by Optomagnetic Imaging Spectroscopy depends on the thickness of the cervical cell sample and the quality of the staining procedure, i.e. on the human factor; the efficacy of the machine learning algorithms could therefore be improved if the quality of the prepared sample were better controlled.
Technical Gazette 26, 6(2019), 1694-1699

Ethical Approval
Experiments were approved by the institutional Ethics Committee and carried out after patient approval.