arXiv - 2023 - Open Access

Self-Supervised Learning for Cervical Cytology in HPV-Positive Women Using Digital Pathology

This article summarizes key observations from the preprint study "Self-supervised learning-based cervical cytology for the triage of HPV-positive women in resource-limited settings and low-data regime".

Authors: Thomas Stegmüller, Christian Abbet, Behzad Bozorgtabar, Holly Clarke, Patrick Petignat, Pierre Vassilakos, Jean-Philippe Thiran
Institutions: École Polytechnique Fédérale de Lausanne, Hôpitaux Universitaires de Genève, Centre Hospitalier Universitaire Vaudois
DOI: https://doi.org/10.48550/arXiv.2302.05195

Digital Pathology and Cervical Cancer Screening Challenges

Cervical cancer remains a leading cause of cancer mortality among women despite the availability of effective screening tests such as Pap tests and human papillomavirus (HPV) testing. While these methods have reduced mortality in high-income settings, access to cervical cancer screening remains limited in many regions.

Primary HPV testing offers high sensitivity for detecting cervical precancer and high-risk HPV types, but HPV-positive women require additional triage through cervical cytology to identify abnormal cells. In low-resource settings, this process is often constrained by limited access to trained cytopathologists, delayed test results, and the time-consuming nature of cytological analysis.

Dataset and Pap Smear Digital Imaging Workflow

The researchers developed an in-house dataset consisting of 307 Pap smear slides from HPV-positive women. Among these, 69 slides were cytology-positive and 238 negative, creating a challenging dataset where even negative samples may contain infection-related cellular changes.

Slide preparation followed the SurePath™ procedure, a liquid-based cervical cytology method that produces a small, concentrated cell-deposit area. This reduces scanning time and data volume, which is important for scalable slide imaging in digital pathology workflows. The slides were digitized using a portable Grundium Ocus®40 scanner with a 40× objective, producing high-resolution whole slide images with Z-stacking.

From these whole slide images, the authors generated multiple datasets. A total of 1,228 annotated positive cervical cells were extracted, along with corresponding image tiles. In addition, approximately 1.5 million unlabeled tiles were sampled from the slides.

Self-Supervised Learning for Medical Imaging Tasks

To address the limited availability of labeled data in cervical cytology, the study applies self-supervised learning methods to whole slide image tiles. Specifically, the authors use DINO, a self-supervised framework based on self-distillation: a student network is trained to match the outputs of a momentum teacher network across different transformations of the same image, without labels or explicit negative pairs.
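DINO trains a student network to match the output distribution of a momentum teacher on different views of the same image. As a rough illustration, the sketch below computes a DINO-style loss for one pair of views in plain Python. The function names, the temperatures (0.1 for the student, 0.04 for the teacher), the centering momentum, and the toy logits are illustrative assumptions, not settings reported in the preprint.

```python
import math

def log_softmax(logits, temperature):
    """Temperature-scaled log-softmax, computed stably via the log-sum-exp trick."""
    scaled = [x / temperature for x in logits]
    m = max(scaled)
    log_total = m + math.log(sum(math.exp(x - m) for x in scaled))
    return [x - log_total for x in scaled]

def dino_loss(student_logits, teacher_logits, center,
              student_temp=0.1, teacher_temp=0.04):
    """Cross-entropy between the centered, sharpened teacher distribution
    and the student distribution for two views of the same tile."""
    centered = [t - c for t, c in zip(teacher_logits, center)]
    # The teacher output acts as a fixed target (no gradient flows through it).
    p_teacher = [math.exp(lp) for lp in log_softmax(centered, teacher_temp)]
    log_p_student = log_softmax(student_logits, student_temp)
    return -sum(pt * lps for pt, lps in zip(p_teacher, log_p_student))

def update_center(center, teacher_logits_batch, momentum=0.9):
    """Exponential moving average of teacher outputs, used for centering."""
    n = len(teacher_logits_batch)
    batch_mean = [sum(col) / n for col in zip(*teacher_logits_batch)]
    return [momentum * c + (1 - momentum) * b for c, b in zip(center, batch_mean)]

# Toy logits (hypothetical): the loss is small when the student agrees
# with the teacher and large when it does not.
center = [0.0, 0.0, 0.0]
agree = dino_loss([2.0, 0.1, -1.0], [2.0, 0.1, -1.0], center)
disagree = dino_loss([-1.0, 0.1, 2.0], [2.0, 0.1, -1.0], center)
print(agree < disagree)
```

In a full training loop the teacher weights would themselves be an exponential moving average of the student weights; that part is omitted here.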

Two backbone architectures were evaluated: ResNet-50 and Vision Transformer (ViT-S/16). These models were pretrained on unlabeled tile data and later evaluated on downstream medical image classification tasks, including cell-level and tile-level classification.

Results show that self-supervised models achieve performance comparable to or better than models pretrained on ImageNet for cervical cytology tasks. This suggests that domain-specific self-supervision using unlabeled data from Pap smear images can improve feature representation, particularly for medical image analysis tasks where labeled data are scarce.

Cervical Cell Copy-Pasting for Data Augmentation

A key challenge identified in the study is the limited transferability of models trained on publicly available single-cell datasets to real-world Pap smear tiles containing multiple cells. To address this, the authors introduce a data augmentation strategy called Cervical Cell Copy-Pasting (C3P).

C3P involves placing labeled cervical cells onto tile backgrounds to simulate realistic multi-cell images. Several techniques were evaluated, including direct pasting, blending, and Poisson-based image reconstruction. Among these, Poisson blending provided more consistent improvements in aligning cell-level and tile-level representations.
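The two simpler pasting variants can be sketched with NumPy as follows. This is a minimal illustration and not the authors' implementation: the function names, the `alpha` parameter, and the random placement strategy are assumptions. Poisson blending is not shown; OpenCV's `cv2.seamlessClone` is one existing option for it.

```python
import numpy as np

def paste_cell(tile, cell, mask, top, left, alpha=1.0):
    """Paste a cell crop onto a background tile.

    tile:  (H, W, 3) float array, background tile
    cell:  (h, w, 3) float array, labeled cell crop
    mask:  (h, w) float array in [0, 1], 1 inside the cell
    alpha: 1.0 gives direct pasting; values below 1.0 blend with the background
    """
    h, w = mask.shape
    if top < 0 or left < 0 or top + h > tile.shape[0] or left + w > tile.shape[1]:
        raise ValueError("cell does not fit inside the tile at this position")
    out = tile.copy()                       # leave the input tile untouched
    weight = (alpha * mask)[..., None]      # per-pixel blend weight, shape (h, w, 1)
    region = out[top:top + h, left:left + w]
    out[top:top + h, left:left + w] = weight * cell + (1.0 - weight) * region
    return out

def paste_at_random(tile, cell, mask, rng, alpha=1.0):
    """Paste the cell at a uniformly sampled position where it fits entirely."""
    h, w = mask.shape
    top = int(rng.integers(0, tile.shape[0] - h + 1))
    left = int(rng.integers(0, tile.shape[1] - w + 1))
    return paste_cell(tile, cell, mask, top, left, alpha), (top, left)

# Toy example (hypothetical data): a 2x2 white "cell" on an 8x8 black tile.
rng = np.random.default_rng(0)
tile = np.zeros((8, 8, 3))
cell = np.ones((2, 2, 3))
mask = np.ones((2, 2))
augmented, position = paste_at_random(tile, cell, mask, rng)
```

Direct pasting tends to leave visible seams at the cell boundary, which is one plausible reason gradient-domain methods such as Poisson blending align pasted cells with real tiles more consistently.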

This approach enables better use of labeled data from external datasets while improving performance on tile-level classification tasks. The results demonstrate that combining self-supervised learning with targeted augmentation can improve generalization across different cytology data distributions.

Whole Slide Image Classification with Multiple Instance Learning

For Pap smear analysis at the whole slide level, the study uses multiple instance learning (MIL), where each slide is treated as a collection of image tiles. The authors evaluated several MIL methods, including AbMIL, TransMIL, and CLAM.
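To make the MIL setup concrete, the sketch below shows attention-based pooling in the style of AbMIL: each tile embedding receives a learned score, the scores are normalized with a softmax, and the slide representation is the attention-weighted sum of tile embeddings. The parameter names `V` and `w`, their values, and the toy embeddings are hypothetical; in practice these parameters are learned, and the preprint's exact configuration may differ.

```python
import math

def attention_pool(tile_embeddings, V, w):
    """Attention-based MIL pooling.

    tile_embeddings: list of tile feature vectors (one slide = one bag)
    V: hidden projection, list of rows, each the same length as a tile vector
    w: scoring vector with one entry per row of V
    Returns the slide-level embedding and the per-tile attention weights.
    """
    scores = []
    for h in tile_embeddings:
        hidden = [math.tanh(sum(v * x for v, x in zip(row, h))) for row in V]
        scores.append(sum(wj * aj for wj, aj in zip(w, hidden)))
    # Softmax over tiles, shifted by the max score for numerical stability.
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    attention = [e / total for e in exps]
    dim = len(tile_embeddings[0])
    slide = [sum(a * h[k] for a, h in zip(attention, tile_embeddings))
             for k in range(dim)]
    return slide, attention

# Toy bag (hypothetical): two "normal" tiles and one tile with a strong signal.
embeddings = [[0.0, 0.0], [0.0, 0.0], [3.0, 3.0]]
V = [[1.0, 0.0], [0.0, 1.0]]
w = [5.0, 5.0]
slide_embedding, attention = attention_pool(embeddings, V, w)
```

A slide-level classifier would then be applied to `slide_embedding`, and the attention weights give a rough indication of which tiles drove the prediction.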

Initial experiments showed that standard MIL approaches struggle with Pap smear images. This is because abnormal or precancerous cells may be sparse, while normal or non-diagnostic patterns dominate most of the slide. As a result, capturing meaningful signals for classification is challenging.

To improve performance, the authors introduced several modifications:

  • Top-k tile selection: focusing on the most suspicious tiles rather than the full slide

  • Tile-level objective: improving detection of abnormal cells during training

  • C3P integration: augmenting tiles with pasted cells to reinforce learning signals

These adjustments improved both tile-level and slide-level classification performance, although the task remains complex due to the nature of HPV-positive cohorts.
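The top-k idea can be sketched in a few lines. This is an illustrative assumption about how tile scores might be aggregated (a mean over the k highest-scoring tiles); the function names and the example scores are hypothetical and the preprint's exact aggregation may differ.

```python
def top_k_tiles(tile_scores, k):
    """Return the indices of the k highest-scoring tiles, most suspicious first."""
    if k <= 0:
        raise ValueError("k must be positive")
    order = sorted(range(len(tile_scores)),
                   key=lambda i: tile_scores[i], reverse=True)
    return order[:k]

def slide_score(tile_scores, k):
    """Aggregate a slide-level score as the mean over the top-k tile scores."""
    selected = top_k_tiles(tile_scores, k)
    if not selected:
        raise ValueError("slide has no tiles")
    return sum(tile_scores[i] for i in selected) / len(selected)

# Hypothetical per-tile abnormality scores for one slide.
scores = [0.1, 0.9, 0.3, 0.8]
suspicious = top_k_tiles(scores, 2)
aggregate = slide_score(scores, 2)
```

Restricting the aggregate to a few tiles keeps the many normal tiles on a slide from diluting the signal of sparse abnormal cells.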

Relevance for Digital Pathology and Cervical Screening

The study highlights how digital pathology combined with deep learning can support cervical cancer screening workflows, particularly in low-resource environments. By applying self-supervised learning to unlabeled data, the need for large annotated datasets can be reduced.

Importantly, the work focuses on methodological development rather than clinical deployment. While the results support the feasibility of deep learning-based cervical cytology analysis, the study does not replace established cervical cancer screening recommendations or clinical diagnostic processes.

Instead, it demonstrates how advances in computer vision, pattern recognition, and medical imaging can contribute to future digital cytology systems, including potential telecytology applications.

Limitations of the Study

Several limitations are acknowledged. The study evaluates only one self-supervised learning framework (DINO), and results may differ with other methods. The dataset includes only HPV-positive women, which increases classification difficulty and may limit generalizability.

In addition, whole slide image classification using only slide-level labels remains challenging, particularly in detecting rare abnormal cells. The models were also limited to relatively lightweight architectures, chosen to reflect realistic constraints in low-resource settings.

Concluding Observations

This study demonstrates that self-supervised learning can effectively leverage unlabeled Pap smear whole slide images for cervical cytology tasks. The proposed C3P augmentation method improves knowledge transfer from labeled cell datasets, while modifications to multiple instance learning enhance slide-level analysis.

Overall, the findings support the potential of digital pathology and artificial intelligence to assist cervical cancer screening and HPV triage workflows where expert cytology is limited. Further validation on larger and more diverse datasets is required before clinical integration.

A curated collection of digital pathology studies and references is available on Grundium’s website.
