Deep Learning for Cervical Cancer Detection
September 3, 2019.
Digital Object Identifier 10.1109/ACCESS.2019.2936017
ABSTRACT Cervical cancer is the fourth most common cancer in women. Accurate and timely cancer detection can save lives. Automatic and reliable cervical cancer detection methods can be devised through the accurate segmentation and classification of Pap smear cell images. This paper presents an approach that segments whole cervical cells using a mask regional convolutional neural network (Mask R-CNN) and classifies the segmented cells using a smaller Visual Geometry Group-like Network (VGG-like Net). ResNet10 serves as the backbone of the Mask R-CNN to make full use of spatial information and prior knowledge. We evaluate our proposed method on the Herlev Pap Smear dataset. In the segmentation phase, when Mask R-CNN is applied on the whole cell, it outperforms the previous segmentation method in precision (0.92±0.06), recall (0.91±0.05) and ZSI (0.91±0.04). In the classification phase, VGG-like Net is applied on the whole segmented cell and yields a sensitivity of more than 96% with a low standard deviation (±2.8%) for the binary classification problem, and results of more than 95% with a low standard deviation (maximum 4.2%, in the accuracy measurement) for the 7-class problem in terms of sensitivity, specificity, accuracy, h-mean, and F1 score.
INDEX TERMS Mask R-CNN, VGG-like Net, cell segmentation, cell classification, pap smear.
This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see [Link]
VOLUME 7, 2019 116925
Kurnianingsih et al.: Segmentation and Classification of Cervical Cells Using Deep Learning
method [3]–[6]. Such systems automatically classify normal and abnormal cervical cells. However, an automatic two-level cascade classification system proposed in [7] produced both a false negative rate and false positive rate of 1.44%.
The aim of this paper is to develop a better system for the automatic detection of cancer cells using a deep learning approach on Pap smear images. Deep learning techniques can be used to identify patterns in complex big data starting with preprocessing the data, training the model and testing it [8].
The primary contributions of this paper are as follows.
(1) As far as we know, this work is the first to implement Mask R-CNN and the transfer learning technique to segment the whole cervical cell.
(2) As far as we are aware, this work is the first to implement a VGG-like Net in which whole cervical cells are classified.
We evaluate the accuracy, sensitivity, specificity, and Zijdenbos similarity index (ZSI) of the models.
The rest of this paper is structured as follows. Section 2 overviews the related works on cervical cell segmentation and classification; Section 3 discusses the materials and describes the methods used to segment and classify cervical cells. The experiment analysis and evaluation of segmentation and classification is given in Section 4 and a discussion is presented in Section 5. Finally, this study concludes in Section 6.

II. RELATED WORKS
Research on the automated screening of Pap smears has moved from cytology to histology over recent years. The combination of information from a multitude of computerized histology and cytology documents was used on the Brazilian Cervical Cancer Information System (SISCOLO) for sensitivities above 90% [9]. However, cytology testing continues to be used in most countries because of its affordability and efficiency in identifying cervical cancer in routine testing.
In almost all imaging system analysis, image segmentation is an important and demanding task. It is difficult for individuals to precisely analyze the segmentation of all parts of cervical cells (nuclei and cytoplasm) in Pap smears. Poor cell segmentation can lead to poor analysis results. Accurate and automatic computer-assisted segmentation on the whole cervical cell is necessary for cervical cancer screening and diagnosis. A set of 50 images was screened for the segmentation of cervical cells using mean-shift and median filtering, and for the further processing of the segmentation result using morphological operators [10].
Three SVM-based approaches (standard SVM, SVM combined with the RFE algorithm, and SVM combined with the PCA algorithm) were used to classify the cervical cancer dataset from the repository of the University of California at Irvine [11]. Nucleus and cytoplasm segmentation and classification using multi-class SVM classifiers such as polynomial SVMs, quadratic SVMs, Gaussian RBF SVMs, and linear SVMs resulted in 95% accuracy [12]. SVMs were also used to separate the nucleus from the cervical smear model with 95.134% precision for adaptive segmentation based on the GVF Snake model [13]. In order to improve the classification performance, the artifacts were removed from the cytology images in the Bethesda System dataset using an SVM, resulting in a true classification of normal and abnormal cells of 85.19% and 88.1% respectively [14]. Using ultra-large cervical histological digital images, a combination of SVMs and the block-based segmentation technique utilizes robust texture feature vectors to enhance classification efficiency for cervical intraepithelial neoplasia (CIN) diagnosis [15].
A segmentation method was applied to separate the cell nuclei from the cytoplasm, and the cells were then classified using K-Nearest Neighbor (KNN), which resulted in an 84.3% classification accuracy with no validation and 82.9% classification accuracy with 5-fold cross validation [16], [17]. A KNN method was also used to classify normal and cancerous cells on microscopic biopsy images after the segmentation process using k-means [18].
A clustering technique using fuzzy C-means (FCM) was used to segment Pap smear images [19]. One of the drawbacks of FCM clustering is that it fails to detect all the valid clusters in a colour image segmentation. William et al. [20] presented a Trainable Weka Segmentation classifier for cell segmentation and an enhanced fuzzy C-means algorithm to classify cervical cells.
Deep learning has achieved enormous success in many applications, including cancer research. Deep learning was used to segment abnormal cells from conventional Pap smear digital images [21], [22]. Song et al. [23] proposed cervical cytoplasm and nuclei segmentation using superpixels and convolutional neural networks (CNNs). The automatic segmentation of cervical nuclei using Mask R-CNN in combination with the local fully connected conditional random field (LFCCRF) is presented by Liu et al. [24].
Several research studies on segmenting and classifying the nucleus have been overviewed in this section. However, it might not be possible to classify cervical cells with only nucleus data. The segmentation of the whole cell is therefore more suitable [25]–[28]. Each cell is then classified using specific classifiers after the segmentation step. Su et al. [7] created a two-level cascade classifier to automatically detect cervical cancer cells from thin liquid-based cytology slides. A feedforward MLP neural network trained with the Levenberg-Marquardt algorithm was used to classify the cervical images of 100 patients [29]. Classification of cervical cell images has also been done with deep learning [30], [31]. The performance of this type of classification, however, is not very high [32].
In this study, the whole Pap smear cell is segmented and classified using deep learning. The evaluation was carried out on the Herlev Pap smear dataset [2], [33]. Mask R-CNN was used in the segmentation process. A cell image is segmented into cell (a combination of nucleus and cytoplasm) and background. Mask R-CNN, an extension of Faster R-CNN, is a well-known method for tackling the issue of instance segmentation by predicting a segmentation mask
III. METHOD
A. DATASET
The Herlev Pap smear dataset collected by Herlev University
Hospital (Denmark) and the Technical University of Den-
mark [2], [33] was used to evaluate the proposed framework.
The dataset consists of 917 images, categorized manually
by qualified cytotechnicians and physicians into 7 classes as
outlined in Table 1.
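For orientation, the relationship between the 7-class and 2-class problems can be written down as a small lookup table. The sketch below is not taken from Table 1: the class names and per-class image counts are an assumption based on the commonly cited description of the public Herlev release and should be checked against the dataset itself. Only the totals (917 images, 7 classes) are stated in this paper.

```python
# Assumed class names and image counts for the Herlev dataset. They are not
# restated in this text; only the totals (917 images, 7 classes) come from it.
HERLEV_CLASSES = [
    # (class name, image count, binary group)
    ("superficial squamous epithelial", 74, "normal"),
    ("intermediate squamous epithelial", 70, "normal"),
    ("columnar epithelial", 98, "normal"),
    ("mild dysplasia", 182, "abnormal"),
    ("moderate dysplasia", 146, "abnormal"),
    ("severe dysplasia", 197, "abnormal"),
    ("carcinoma in situ", 150, "abnormal"),
]


def binary_label(class_name: str) -> str:
    """Map one of the seven class names to its 2-class label."""
    for name, _count, group in HERLEV_CLASSES:
        if name == class_name:
            return group
    raise KeyError(f"unknown class: {class_name}")


def images_per_group() -> dict:
    """Sum the per-class image counts for each binary group."""
    totals = {}
    for _name, count, group in HERLEV_CLASSES:
        totals[group] = totals.get(group, 0) + count
    return totals
```

Under these assumed counts the binary problem is imbalanced, with roughly one normal image for every three abnormal ones, which is worth keeping in mind when reading the 2-class results.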
B. PROPOSED METHOD
The objective of this work is to develop a method to seg-
ment whole cervical cells, both single and overlapping, from
conventional Pap smear images, and then classify them to
identify normal and abnormal cells. The proposed method
comprises two steps. The first stage partitions the cell regions
using Mask R-CNN segmentation. The second stage defines
the whole cell area (nucleus and cytoplasm) by classifying
the segments from the initial stage. The classification in
the second phase includes a training and testing phase as
shown in Figure 1. We employ Mask R-CNN in the proposed
segmentation process and use ResNet10 to fully utilize the
spatial information and prior knowledge as the backbone of
the Mask R-CNN. The primary concept of Mask R-CNN is to segment and build pixel masks for each image item automatically. We employ a smaller VGG-like Net to classify the segmentation results, which is inspired by the family of VGG networks.

FIGURE 1. Proposed method for the automatic detection of cervical cells.

In the segmentation training stage as shown in Figure 1(a), transfer learning is applied on Mask R-CNN weights trained using the COCO dataset. The COCO dataset has 2,500,000 labeled instances in 328,000 images and contains 91 common object categories with 82 of these having more than 5,000 labeled instances [34]. In this proposed method, the purpose
of segmentation is to isolate the cervical cell area from its surroundings. The segmented area of the cervical cell covers both nuclei and cytoplasm. The cytoplasm can influence how a cervical cell is classified.
The results of the segmentation are then applied on the original image dataset before being handed over to the classification training algorithm, as illustrated in Figure 1(b). The input image (image source) for classification is the segmented cervical cell with the background colored black, as shown in Figure 2. In the classification stage, we employ the VGG-like network, which is a more compact version of the VGG network for faster training.
Figure 1(c) illustrates the testing process during classification. The cervical cell images are segmented using Mask R-CNN to isolate the cervical cells and then they are processed in the trained VGG-like network. Based on the final score for each class (the 2-class or 7-class classification problem), the system determines to which class the cervical image belongs.

C. DATA PREPROCESSING
The training phases in segmentation and classification have different preprocessing schemes. In the segmentation stage, preprocessing begins by separating the image data of the cervical cell from its mask. In the case of the Herlev dataset that we use, the original image and mask data are still mixed in one folder which corresponds to the cancer class name. This collection of images is read based on the file name pattern and is then separated into only two types of images, namely the original image of the cervical cell and the mask. When the preprocessing application finds that what is being read is a mask image, the image will be converted into a binary image, that is, white for pixels which are a part of the
cervical cells (a combination of cell nuclei and cytoplasm) and black for the other pixels. The original image of the cervical cell and its binary mask image are then resized to a width of 200 pixels, with the height adjusted proportionally to the new width. The two groups of images are then ready for further processing, namely network training using Mask R-CNN.
In the classification section, the application will read all the images in the Herlev dataset based on the cancer class. The images used at the classification stage are only the cervical cell regions. Before the image of the cervical cell is copied and grouped according to the cancer class, the binary mask image is applied to the original image so that the new image consists of only two parts, namely the cell part of the cervix and the background (colored black). This new image is then resized to a width of 200 pixels, with the height adjusted proportionally. The resized image is then copied into a specific folder according to the classification case that we want to train, namely two folders for binary classification cases and seven folders for 7-class classification cases. The dataset is then ready to be trained by the VGG-like network.

D. DATA AUGMENTATION
Data augmentation increases the effective dataset size, which improves the generalizability and classification accuracy of the model while helping to prevent overfitting [35]. In this study, data augmentation is used both in the segmentation training phase and the classification training phase. We used several geometric transformation methods on the Herlev dataset for data augmentation, i.e. top-down translation, left-right translation, horizontal reflection, vertical reflection and rotation. For each training data image, the application will select randomly what kind of geometric transformations will be applied to the image.
Figure 2 shows the augmented data results for classification using 30-degree rotations and 5 pixels of translation applied on the Herlev dataset.

E. SEGMENTATION
There are three primary goals of object detection [36], i.e., given an input image, to obtain 1) a list of bounding boxes for each object in the image, 2) a class label associated with each bounding box and 3) the confidence score associated with each bounding box and class label. Instance segmentation takes object detection a step further. Instead of predicting a bounding box for each object in an image, we now want to predict a mask for each object, giving us a pixel-wise segmentation of the object rather than a coarse, perhaps even unreliable bounding box.
Semantic segmentation algorithms attempt to partition the image into meaningful parts and associate every pixel in an input image with a class label (e.g., person, road, car, bus) [37]. While object detection produces a bounding box, instance segmentation produces a pixel-wise mask for each individual object. However, instance segmentation does not require every pixel in an image to be associated with a label. Instance segmentation can be solved using two steps, i.e., performing object detection to draw bounding boxes around each instance of a class and then performing semantic segmentation on each of the bounding boxes [37].
The Mask R-CNN algorithm was first introduced by He et al. [38]. Mask R-CNN is based on the previous object detection work of R-CNN, Fast R-CNN, and Faster R-CNN by Girshick et al. The first R-CNN paper, Rich Feature Hierarchies for Accurate Object Detection and Semantic Segmentation, was published in 2014 by Girshick et al. [39]. In the first step, we input an image to the R-CNN algorithm. We then run a region proposal algorithm such as Selective Search (or equivalent). The Selective Search algorithm takes the place of sliding windows and image pyramids, intelligently examining the input image at various scales and locations, thereby dramatically reducing the total number of proposed ROIs that will be sent to the network for classification. We can thus think of Selective Search as a smart sliding window and image pyramid algorithm.
Once we have our proposed locations, we crop each of them individually from the input image and apply transfer learning via feature extraction. R-CNN utilizes feature extraction to enable a downstream classifier to learn more discriminating patterns from these CNN features. The fourth and final step is to train a series of SVMs on top of these extracted features for each class.
The problem with the original R-CNN approach is that it is still incredibly slow. Furthermore, we are not actually learning to localize via a deep neural network; instead, we are leaving the localization to the Selective Search algorithm. R-CNN only classifies the ROI once it has been determined as “interesting” and “worth examining” by the region proposal algorithm, which is Selective Search.
Similar to the original R-CNN, the Fast R-CNN algorithm [40] still utilizes Selective Search to obtain region proposals, but a novel contribution, Region of Interest (ROI) Pooling, is made. In this new approach, Fast R-CNN applies the CNN to the entire input image and extracts a feature map from it. ROI Pooling works by extracting a fixed-size window from the feature map and then passing it into a set of fully-connected layers to obtain the output label for the ROI. The network of the Fast R-CNN comprises the following phases: (1) use an image and its bounding box as the inputs; (2) extract the feature map; (3) obtain the ROI feature vector by applying ROI Pooling; (4) for each region proposal, calculate the bounding box location and the class label prediction using two fully connected layers.
Although the network is now end-to-end trainable, the inference time performance (i.e. at prediction) declines dramatically because of the dependency on Selective Search (or equivalent) for the region proposal algorithm. Ren et al. [41] added a further component to the R-CNN architecture, a Region Proposal Network (RPN). As the name suggests, the goal of the RPN is to remove the requirement of running Selective Search prior to
inference and instead build the region proposal step directly into the R-CNN architecture.
An input image is presented to the network and its features are extracted via the pre-trained CNN (i.e., the base network). These features, in parallel, are sent to two different components of the Faster R-CNN architecture. The first component, the RPN, is used to determine where in an image a potential object could be. At this point, we do not know what the object is, just that there is potentially an object at a certain location in the image. The proposed bounding box ROIs are based on the ROI Pooling module of the network along with the features extracted in the previous step. ROI Pooling is used to extract fixed-size windows of features which are then passed into two fully connected layers (one for the class labels and one for the bounding box coordinates) to obtain our final localizations. In essence, we now place anchors spaced uniformly across the whole image at varying scales and aspect ratios. The RPN will then examine these anchors and output a set of proposals as to where it is possible an object exists. In Faster R-CNN, the complete object detection pipeline which takes place inside the network is: (1) region proposal; (2) feature extraction; (3) computing the bounding box coordinates of the objects; and (4) providing class labels for each bounding box.
The Mask R-CNN approach builds on the Faster R-CNN and makes two significant contributions: (1) it replaces the ROI Pooling module with a more accurate ROI Align module; and (2) it adds a branch for each ROI, as shown in Figure 3. This additional branch is responsible for predicting the actual mask of an object/class. The masking branch splits off from the ROI Align module prior to our FC layers and then consists of two CONV layers responsible for creating the mask predictions themselves. The Mask R-CNN output has three kinds of prediction, i.e. class/label prediction, bounding box prediction and mask prediction. Mask R-CNN can leverage different architectures such as ResNet, VGG, SqueezeNet, and MobileNet as its backend/backbone, making it possible to decrease the size of the model produced by the segmentation training stage, making it feasible for deployment on a mobile device and potentially increasing frames-per-second (FPS) throughput as well. In our study, the Mask R-CNN backbone is applied as a ResNet-based Feature Pyramid Network (FPN) with the refined extraction layers of features and the reduced subsequent extraction layers of features according to all the cervical cell images.
Mask R-CNN utilizes a region proposal network (RPN) to generate image regions that possibly contain an object. Each region is ranked based on its "objectness score" (i.e. the probability of an object being present in a specified area) and then the top N most probable object regions are maintained. The value 2000 was used as the N-value in the original Faster R-CNN [41]. In practice, a much lower N, such as N = 10, 100, 200 and 300 can be used to obtain reasonable results. In this paper, we use the same N value as He et al. [38], which is 300. Each of these 300 ROIs passes through three separate network sections to predict the label, the bounding box and the image mask itself.

F. CLASSIFICATION
We employ a VGG-like network as the basis of our deep learning training for the classification stage. The idea of VGGNet introduced by Simonyan and Zisserman [42] is to improve the recognition performance by increasing the depth of the CNN. The network has deep architectures from 11 to 19 weight layers and only uses small filters (3×3 convolution layer filters). The deeper the network used, the larger the number of filters learned by each convolution layer. To reduce the volume size, max pooling layers (2 × 2) are applied every time the number of convolutional filters doubles. Another characteristic of VGGNet is that there are several fully connected layers at the end of the network before the last layer, which uses the softmax activation function as a classifier. This framework achieves state-of-the-art results which
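The proposal-ranking step described for the RPN in Section III-E (rank regions by objectness score, keep the top N, with N = 300 in this paper) can be illustrated with a minimal sketch. The proposal layout and the function name `keep_top_n` are hypothetical and chosen only for illustration; a real RPN also applies non-maximum suppression before this cut, which is omitted here.

```python
from typing import List, Tuple

# Hypothetical proposal layout: ((x1, y1, x2, y2), objectness_score).
Proposal = Tuple[Tuple[float, float, float, float], float]


def keep_top_n(proposals: List[Proposal], n: int = 300) -> List[Proposal]:
    """Sort proposals by objectness score, highest first, and keep the first n."""
    ranked = sorted(proposals, key=lambda p: p[1], reverse=True)
    return ranked[:n]


if __name__ == "__main__":
    demo = [
        ((0, 0, 10, 10), 0.20),
        ((5, 5, 40, 40), 0.95),
        ((8, 2, 30, 25), 0.60),
    ]
    # Keep only the two most object-like regions.
    for box, score in keep_top_n(demo, n=2):
        print(box, score)
```

Lowering N trades a small risk of dropping a true cell region for fewer ROIs passing through the label, bounding box and mask branches.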
IV. RESULTS
A. CERVICAL CELL SEGMENTATION
The objective of cervical segmentation is to divide a cell into two areas, i.e., the whole cell which consists of the cytoplasm and nucleus, and the background. Sample images from the Herlev data set at every phase in the Mask R-CNN segmentation are shown in Figure 6. The original images were masked with a white color for cytoplasm and nuclei, and black for the background.
As seen in Figure 5, the image source column is the cervical cell images taken from the Herlev dataset as is, without any image processing. The original mask is converted from the
TABLE 4. Two-class classification performance (normal and abnormal) using 30 degree rotation and 5 pixel translation.
TABLE 7. Confusion matrix of the testing dataset (181 data) in 7-class classification.
TABLE 9. Confusion matrix of all datasets (917 data) in the 7-class problem.
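The scores reported in this section follow standard confusion-matrix definitions. The sketch below is an assumption about how they are computed, since the formulas are not restated in this text: ZSI is taken to be the Dice-style overlap 2TP / (2TP + FP + FN) over mask pixels, and h-mean is taken to be the harmonic mean of sensitivity and specificity. The function names are illustrative only.

```python
def precision(tp: int, fp: int) -> float:
    """Fraction of predicted mask pixels that truly belong to the mask."""
    return tp / (tp + fp)


def recall(tp: int, fn: int) -> float:
    """Fraction of true mask pixels that the prediction recovers."""
    return tp / (tp + fn)


def zsi(tp: int, fp: int, fn: int) -> float:
    """Zijdenbos similarity index, assumed here to be 2TP / (2TP + FP + FN)."""
    return 2 * tp / (2 * tp + fp + fn)


def h_mean(sensitivity: float, specificity: float) -> float:
    """Harmonic mean of sensitivity and specificity (assumed definition)."""
    return 2 * sensitivity * specificity / (sensitivity + specificity)


if __name__ == "__main__":
    # Toy pixel counts, not taken from the paper.
    tp, fp, fn = 90, 10, 10
    print(precision(tp, fp), recall(tp, fn), zsi(tp, fp, fn))
    # Sensitivity and specificity as reported for the 2-class problem.
    print(round(h_mean(96.7, 98.6), 2))
```

Under the assumed h-mean definition, the reported sensitivity and specificity pairs (96.7% and 98.6% for the 2-class problem, 96.2% and 99.3% for the 7-class problem) give values within about 0.1 percentage point of the reported 97.7% h-mean, which suggests the definition is at least consistent with the published numbers.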
whereas Chankong et al. [25] achieves 0.80±0.12 for recall average and 0.86±0.08 ZSI average, Bezdek [45] achieves 0.79±0.13 recall average and 0.85±0.09 for ZSI average. Our study results are higher compared to the Watershed method [46].
The approach used by Chankong et al. [25] applied feature extraction from the nucleus and cytoplasm in each image, whereas feature extraction in our work is conducted by a deep CNN, which simplifies the pre-processing steps. Chankong’s approach also involves manual selection to choose the best threshold that gives the minimum error both when applying the median filter and the FCM result to build the mask of an object. Our approach using Mask R-CNN inserts an additional branch to predict automatically the actual mask of an object so Mask R-CNN is fast to train. Chankong’s approach using FCM involves the manual selection of threshold and produces a precision that is slightly higher, by 0.03, than the approach using Mask R-CNN. A higher precision indicates that a larger share of the pixels the approach assigns to a mask truly belong to that mask. A lower recall, on the other hand, shows that the approach misses more of the pixels that truly belong to the mask. Chankong’s and Bezdek’s approaches show a larger difference of 0.15 and 0.16 between precision and recall, respectively. Our approach using Mask R-CNN has a very slight difference of only 0.01 between precision and recall, which is almost a balance between both of them, as well as a ZSI measure which achieves 0.91, which is the same result as recall. The value of ZSI in our study is greater than 0.9 which shows that the detected segmentation boundary is extremely well matched with the ground truth.
Devi et al. [47] demonstrate that their segmentation method using NGCS achieves higher average precision and recall compared to Mask R-CNN. Devi’s approach has more complex segmentation steps, consisting of 6 layers and each layer has a different algorithm. Devi’s approach and our approach are not significantly different, with only a slight difference of 0.03 in precision and 0.04 in recall.

TABLE 11. Performance results for 7-classification problem using 30 degree rotation and pixel translation.

Table 13 and Table 14 compare the performance results of previous classifiers and our method in terms of sensitivity, specificity, accuracy, h-mean, and F1 score for the 2-class problem and 7-class problem, respectively. A higher result for the precision of segmentation will lead to a higher sensitivity of the classification result, whereas a higher result for recall of segmentation will lead to the higher specificity of the classification results. Most of the existing classification algorithms for both the 2-class problem and 7-class problem result in an accuracy of above 90%, except for KNN and Bayesian, which when applied on the nucleus, yield an accuracy of below 90% for the 2-class problem, while the SVM classifier with watershed segmentation achieved a lower performance with an accuracy below 80%.
The classification performance of our method on the 2-class problem is: 96.7% sensitivity, 98.6% specificity, 98.1% accuracy, 97.7% h-mean and 96.5% F1 score. Similarly, the classification performance of our method on the 7-class problem is: 96.2% sensitivity, 99.3% specificity, 95.9% accuracy, 97.7% h-mean and 99% F1 score. The results show that our method, when applied on the whole cell, achieves a higher accuracy of 95.9% and the best specificity of 99.3% for the 7-class problem, and a specificity of 98.6% for the 2-class problem, compared to the method presented by Chankong et al. [25].

TABLE 13. Performance comparison of classification method on Herlev dataset for 2-class problem.
TABLE 14. Performance comparison of classification method on Herlev dataset for 7-class problem.

Whole cell segmentation is a more difficult problem than nucleus segmentation. Accurate whole cell segmentation is paramount to achieving high accuracy in classification performance. The advantages of applying Mask R-CNN as our segmentation method compared to the other aforementioned methods are: (1) it is simple, flexible, and fast to train and does not need complex algorithms or parameter tuning; (2) it selects the features automatically; (3) it is conceptually simple and does not need complex pre-processing steps; and (4) it is flexible and can leverage different architectures such as ResNet, VGG, SqueezeNet, and MobileNet as its backbone. The advantages of applying VGG-like Net are: (1) our network is deep enough to obtain high accuracy; (2) it is faster for training; (3) it is possible to decrease the size of the model produced by the segmentation training stage; and (4) it is feasible for deployment on a mobile device and potentially increases frames-per-second (FPS) throughput as well. In general, the results show that our work using Mask R-CNN and a deep CNN classifier with a smaller VGGNet is effective.

VI. CONCLUSION AND FUTURE WORKS
This work proposes a method of cervical cell segmentation and classification. The Herlev Pap smear dataset was used for testing. First, we employed the Mask R-CNN segmentation algorithm to partition the cell regions. Second, by classifying the segments detected from the first phase with a smaller Visual Geometry Group Network, we identified the whole cell areas. To fully utilize the spatial information and prior knowledge in Mask R-CNN, we use ResNet10 as the network backbone. In this paper, we use two types of performance measures, i.e., segmentation and classification performance. We summarize the performance of our segmentation using precision, recall, ZSI and specificity, whereas the classification performance is evaluated using F1 score, accuracy, sensitivity, specificity and h-mean.
Our proposed segmentation using Mask R-CNN produces the best average performance, i.e. 0.92±0.06 precision, 0.91±0.05 recall and 0.91±0.04 ZSI and 0.83±0.10
specificity for all cell types with a low standard deviation. [13] J. W. Zhang, S. S. Zhang, G. H. Yang, D. C. Huang, L. Zhu, and D. F. Gao,
Only the normal columnar type produces a lower perfor- ‘‘Adaptive segmentation of cervical smear image based on GVF Snake
model,’’ in Proc. ICMLC, Jul. 2013, pp. 890–895.
mance result of below 0.90. [14] R. R. Kumar, V. A. Kumar, P. N. S. Kumar, S. Sudhamony, and
We implemented two classification scenarios i.e. 2-class R. Ravindrakumar, ‘‘Detection and removal of artifacts in cervical cytol-
and 7-class classification problems. Our proposed method ogy images using Support Vector Machine,’’ in Proc. ITiME, Dec. 2012,
pp. 717–721.
for the binary classification problem (normal and abnormal) [15] Y. Wang, D. Crookes, O. S. Eldin, S. Wang, P. Hamilton, and J. Diamond,
yields high performance results with a low standard deviation ‘‘Assisted diagnosis of cervical intraepithelial neoplasia (CIN),’’ IEEE J.
for all metrics for 250 epochs, i.e. 96.5% F1 score, 98.1% Sel. Topics Signal Process., vol. 3, no. 1, pp. 112–121, Feb. 2009.
[16] J. Jantzen, J. Norup, G. Dounias, and B. Bjerregaard, ‘‘Pap-smear bench-
accuracy, 96.7% sensitivity, 98.6% specificity, and 97.7% h- mark data for pattern classification,’’ in Proc. NiSIS, Oct. 2005, pp. 1–9.
mean, whereas the classification performance for the 7-class [17] M. Sharma, S. Kumar Singh, P. Agrawal, and V. Madaan, ‘‘Classification
problem yields a high accuracy of 95.9%, high sensitivity of clinical dataset of cervical cancer using KNN,’’ Indian J. Sci. Technol,
vol. 9, no. 28, pp. 1–5, Jul. 2016.
of 96.2%, high specificity of 99.3%, and high h-mean of [18] R. Kumar, R. Srivastava, and S. Srivastava, ‘‘Detection and classification
97.7%. of cancer from microscopic biopsy images using clinically significant and
The advantage of our method is that we do not need com- biologically interpretable features,’’ J. Med. Eng., vol. 2015, pp. 1–14,
Apr. 2015.
plex pre-processing steps since feature selection is conducted [19] J. Talukdar, C. Kr Nath, and P. H. Talukdar, ‘‘Fuzzy clustering based
by the Mask R-CNN algorithm. The limitation of our work is image segmentation of Pap smear images of cervical cancer cell using
the need for higher processing power compared to the other FCM Algorithm,’’ Int. J. Eng. Innov. Technol., vol. 3, no. 1, pp. 460–462,
Jul. 2013.
methods. Future study should focus on the use of a deeper [20] W. William, A. Ware, A. H. Basaza-Ejiri, and J. Obungoloch, ‘‘Cervical
network to improve the performance results. cancer classification from Pap-smears using an enhanced fuzzy C-means
algorithm,’’ Informat. Med. Unlocked, vol. 14, pp. 23–33, Feb. 2019.
[21] P. Liang, G. Sun, and S. and Wei, ‘‘Application of deep learning algorithm
ACKNOWLEDGMENT
in cervical cancer MRI image segmentation based on wireless sensor,’’
The authors would like to thank for the Herlev Pap smear J. Med. Syst., vol. 43, no. 156, pp. 1–7, Jun. 2019.
dataset collected by Herlev University Hospital (Denmark) [22] F. H. D. Araújo, R. R. V. Silva, D. M. Ushizima, M. T. Rezende,
C. M. Carneiro, A. G. C. Bianchi, and F. N. S. Medeiros, ‘‘Deep learning for
and the Technical University of Denmark.
cell image segmentation and ranking,’’ Computerized Med. Imag. Graph.,
vol. 72, pp. 13–21, Mar. 2019.
REFERENCES [23] Y. Song, L. Zhang, S. Chen, D. Ni, B. Li, Y. Zhou, B. Lei, and T. Wang,
[1] F. Bray, J. Ferlay, I. Soerjomataram, R. L. Siegel, L. A. Torre, and A. Jemal, ‘‘A deep learning based framework for accurate segmentation of cervical
‘‘Global cancer statistics 2018: GLOBOCAN estimates of incidence and cytoplasm and nuclei,’’ in Proc. EMBC, Aug. 2014, pp. 2903–2906.
mortality worldwide for 36 cancers in 185 countries,’’ CA, Cancer J. Clin., [24] Y. Liu, P. Zhang, Q. Song, A. Li, P. Zhang, and Z. Gui, ‘‘Automatic
vol. 68, no. 6, pp. 394–424, 2018. segmentation of cervical nuclei based on deep learning and a conditional
[2] E. Martin, ‘‘Pap-smear classification,’’ M.S. thesis, Dept. Automat., Tech. random field,’’ IEEE Access, vol. 6, pp. 53709–53721, 2018.
Univ. Denmark, Lyngby, Denmark, 2003. [25] T. Chankong, N. Theera-Umpon, and S. Auephanwiriyakul, ‘‘Automatic
[3] D. L. Rosenthal, ‘‘Computerized scanning devices for pap smear screening: cervical cell segmentation and classification in pap smears,’’ Comput.
Current status and critical review,’’ Clinics Lab. Med., vol. 17, no. 2, Methods Programs Biomed., vol. 113, no. 2, pp. 539–556, 2014.
pp. 263–284, Jun. 1997. [26] K. Li, Z. Lu, W. Liu, and J. Yin, ‘‘Cytoplasm and nucleus segmentation
[4] E. Bengtsson and P. Malm, ‘‘Screening for cervical cancer using auto- in cervical smear images using radiating GVF snake,’’ Pattern Recognit.,
mated analysis of PAP-smears,’’ Comput. Math. Methods Med., vol. 2014, vol. 45, no. 4, pp. 1255–1264, 2012.
pp. 1–12, Mar. 2014. [27] P. Y. Pai, C. C. Chang, and Y. K. Chan, ‘‘Nucleus and cytoplast contour
[5] M. E. Plissiti, C. Nikou, and A. Charchanti, ‘‘Automated detection of detector of cervical smear image,’’ Expert Syst. Appl., vol. 39, no. 1,
cell nuclei in Pap smear images using morphological reconstruction pp. 154–161, Jul. 2012.
and clustering,’’ IEEE Trans. Inf. Technol. Biomed., vol. 15, no. 2, [28] M. H. Tsai, Y. K. Chan, Z. Z. Lin, S. F. Yang-Mao, and P. C. Huang,
pp. 233–241, Mar. 2011. ‘‘Nucleus and cytoplast contour detector of cervical smear image,’’ Pattern
[6] Y.-F. Chen, P. C. Huang, K. C. Lin, H. H. Lin, L. E. Wang, C. C. Cheng, Recognit. Lett., vol. 29, no. 9, pp. 1441–1453, Jul. 2008.
T. P. Chen, Y. K. Chan, and J. Y. Chiang, ‘‘Semi-automatic segmentation [29] B. Sokouti, S. Haghipour, and A. D. Tabrizi, ‘‘A framework for diagnosing
and classification of pap smear cells,’’ IEEE J. Biomed. Health Informat., cervical cancer disease based on feedforward MLP neural network and
vol. 18, no. 1, pp. 94–108, Jan. 2014. ThinPrep histopathological cell image features,’’ Neural Comput. Appl.,
[7] J. Su, X. Xu, Y. He, and J. Song, ‘‘Automatic detection of cervical cancer vol. 24, no. 1, pp. 221–232, Jan. 2014.
cells by a two-level cascade classification system,’’ Anal. Cellular Pathol., [30] L. Zhang, L. Lu, I. Nogues, R. M. Summers, S. Liu, and J. Yao, ‘‘Deep-
vol. 2016, pp. 1–11, Apr. 2016. Pap: Deep convolutional networks for cervical cell classification,’’ IEEE
[8] B. Jan, H. Farman, M. Khan, M. Imran, I. U. Islam, A. Ahmad, S. Ali, J. Biomed. Health Inform., vol. 21, no. 6, pp. 1633–1643, Nov. 2017.
and G. Jeon, ‘‘Deep learning in big data Analytics: A comparative study,’’ [31] M. Wu, C. Yan, H. Liu, Q. Liu, and Y. Yin, ‘‘Automatic classification
Comput. Electr. Eng., vol. 75, pp. 275–287, May 2019. of cervical cancer from cytological images by using convolutional neural
[9] R. F. Costa, A. Longatto-Filho, C. Pinheiro, L. C. Zeferino, and network,’’ Biosci. Rep., vol. 38, no. 6, pp. 1–9, Nov. 2008.
J. H. Fregnani, ‘‘Historical analysis of the Brazilian cervical cancer screen- [32] A. Gençtav, S. Aksoy, and S. Önder, ‘‘Unsupervised segmentation and
ing program from 2006 to 2013: A time for reflection,’’ PLoS One, vol. 10, classification of cervical cell images,’’ Pattern Recognit., vol. 45, no. 12,
no. 9, Sep. 2015, Art. no. e0138945. pp. 4151–4168, Sep. 2012.
[10] C. Bergmeir, M. G. Silvente, J. E. López-Cuervo, and J. M. Benítez, [33] J. Norup, ‘‘Classification of pap-smear data by transductive neuro-fuzzy
‘‘Segmentation of cervical cell images using mean-shift filtering and mor- methods,’’ M. S. thesis, Dept. Automat., Tech. Univ. Denmark, Lyngby,
phological operators,’’ Proc. SPIE, Med. Imag., Image Process., vol. 7623, Denmark, 2005.
Mar. 2010, Art. no. 76234C. doi: 10.1117/12.845587. [34] T. Y. Lin, M. Maire, S. Belongie, L. Bourdev, R. Girshick, J. Hays,
[11] W. Wu and H. Zhou, ‘‘Data-driven diagnosis of cervical cancer P. Perona, D. Ramanan, C. L. Zitnick, and P. Dollar, ‘‘Microsoft COCO:
with support vector machine-based approaches,’’ IEEE Access, vol. 5, Common Objects in Context,’’ 2015, arXiv:1405.0312. [Online]. Avail-
pp. 25189–25195, 2017. able: [Link]
[12] D. Kashyap, A. Somani, J. Shekhar, A. Bhan, M. K. Dutta, R. Burget, and [35] L. Taylor and G. Nitschke, ‘‘Improving deep learning using generic
K. Riha, ‘‘Cervical cancer detection and classification using independent data augmentation,’’ Aug. 2017, arXiv:1708.06020. [Online]. Available:
level sets and multi SVMs,’’ in Proc. 39th TSP, Jun. 2016, pp. 523–528. [Link]
KHALID HAMED S. ALLEHAIBI received the [Link]. degree from King Abdulaziz University, Jeddah, Saudi Arabia, the [Link]. degree from the University of Tulsa, OK, USA, and the Ph.D. degree from De Montfort University, U.K., in 2014, all in computer science. He was the Chairman of the Information Technology Department, Faculty of Computing and Information Technology in Rabigh. He is currently an Assistant Professor with the Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University.

LUKITO EDI NUGROHO received the [Link]. degree from James Cook University, Australia, in 1995, and the Ph.D. degree from Monash University, Australia, in 2001. He is currently an Associate Professor with the Department of Electrical and Information Engineering, Faculty of Engineering, Universitas Gadjah Mada, Indonesia, where he was appointed as an Academic Staff Member after completing his undergraduate degree. His research interests include pervasive and mobile computing, software engineering, and applications of ICT in education. He is also a member of ACM.

WIDYAWAN received the [Link]. degree in electrical engineering from Universitas Gadjah Mada, the [Link]. degree from Erasmus University, The Netherlands, and the Ph.D. degree in electronic engineering from the Cork Institute of Technology, Ireland. He is currently an Assistant Professor with the Department of Electrical and Information Engineering, Universitas Gadjah Mada, Indonesia, where he has been serving as the Director of the Centre for Information Systems and Resources. He is also on the Board of Trustees, Gamatechno Indonesia. He co-founded Ubiaware, a software systems company specializing in WLAN design and location systems, and is also a Researcher with the Centre for Adaptive Wireless Systems, Ireland. His research interests include wireless sensor networks, machine learning, location technology, and ubiquitous computing to allow computing to fade quietly into the background of everyday life.

KURNIANINGSIH received the [Link]. degree in informatics engineering from Telkom University, Indonesia, the [Link]. degree in electrical engineering from North Sumatera University, Indonesia, and the Ph.D. degree in electrical engineering from Universitas Gadjah Mada, Indonesia. She is currently an Assistant Professor with the Department of Electrical Engineering, Politeknik Negeri Semarang, Indonesia. Her current research interests include sensor networks, machine learning, and computational intelligence. She has been an Executive Committee Member in the IEEE Region 10 (Asia-Pacific Region), since 2018, appointed as the Information Management Committee Chair. She is also a member of the IEEE Computational Intelligence Society (IEEE CIS) and the IEEE Systems, Man, and Cybernetics Society (IEEE SMCS). She was a recipient of the IEEE Region 10 Young Professionals Award in Academician, in 2018. She also serves as the Vice-Chair for the IEEE Indonesia Section.

LUTFAN LAZUARDI received the Medical Doctor and Master of Public Health degrees from Universitas Gadjah Mada, and the Ph.D. degree from Innsbruck Medical University, in 2006. He is currently an Assistant Professor of public health with the Faculty of Medicine, Public Health and Nursing, Universitas Gadjah Mada, Indonesia. His main research interest includes public health informatics. He has been actively involved in the development of a cancer registry in the Yogyakarta region under the support of the Indonesian Ministry of Health.

ANTON SATRIA PRABUWONO started his academic career at the Institute of Electronics, National Chiao Tung University, Taiwan, and the Faculty of Information and Communication Technology, Universiti Teknikal Malaysia Melaka (UTeM), in 2006 and 2007, respectively. He joined the Faculty of Information Science and Technology, Universiti Kebangsaan Malaysia (UKM), in 2009. He then joined the Faculty of Computing and Information Technology, King Abdulaziz University, Rabigh, Saudi Arabia, in 2013. He was an Erasmus Mundus Visiting Professor with the Department of Mechanical Engineering and Mechatronics, Karlsruhe University of Applied Sciences, Germany. He is currently a Professor with the Faculty of Computing and Information Technology, King Abdulaziz University. His research interests include computer vision, intelligent robotics, and autonomous systems. He is also a Senior Member of ACM.

TEDDY MANTORO received the [Link]., [Link]., and Ph.D. degrees in computer science and the Ph.D. degree from the School of Computer Science, The Australian National University (ANU), Canberra, Australia. He is currently a Computer Science Professor with Sampoerna University, Jakarta, Indonesia. He has conducted intensive work in the intelligent environment that uses computational intelligence. He developed the concept and theory of context-aware computing for the intelligent environment, and as a proof of concept, he and his lab developed many prototypes that led to many awards. His research interests include information security, pervasive computing, and intelligent environment/IoT. He received five Gold, nine Silver, and eleven Bronze medals from the National and International IT Innovation Competitions, since 2009.