Executive Summary
At the beginning of this year, we initiated a joint research project with Seoul National University Hospital and GC Biopharma to explore whether artificial intelligence could assist in the rapid and scalable interpretation of sputum microscopy slides.
The goal was simple:
detect white blood cells (WBCs) and identify potential pathogenic organisms across an entire slide automatically.
However, once we began operating on real clinical material rather than curated academic datasets, the true difficulty became apparent.
Tiny bacteria, gigapixel images, and poor-quality annotations quickly turned this into a systems engineering challenge rather than a simple modeling task.
This post explains what we attempted, what failed, how we adapted, and why an end-to-end platform such as Deep Block became essential.
Why Sputum Microscopy Matters
In many pneumonia workflows, sputum examination remains a frontline diagnostic method.
Clinicians typically evaluate:
These observations help determine:
The challenge is time and labor.
Manual microscopy requires trained personnel and becomes a bottleneck as case volume increases.
Automation is therefore highly attractive, but it is far more complex than it first appears.
Imaging Reality vs Research Datasets
For this project, we used the Roche Ventana slide scanner to digitize entire slides.
Each slide image file ranged from 9 GB to 34 GB.
Unlike single field-of-view microscope images commonly used in academic AI papers, whole slide imaging introduces several difficulties:
In other words, classical object detection recipes do not directly transfer.
The problem becomes one of hardware orchestration, data annotation, and graphical interface design.
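To make the scale concrete, here is a minimal illustration of why a whole-slide image cannot simply be loaded as a single array. It uses the open-source OpenSlide library rather than Deep Block's internals, and the file name, tile coordinates, and sizes are placeholders.

```python
# Rough illustration of the scale problem: a whole-slide image is exposed as a
# tiled pyramid and must be read region by region, never materialized at once.
import openslide

slide = openslide.OpenSlide("example_slide.tiff")   # hypothetical file path
w, h = slide.dimensions                             # full-resolution (level-0) pixel size

# Uncompressed RGB footprint if we naively loaded the whole image into memory:
print(f"{w} x {h} px  ->  ~{w * h * 3 / 1e9:.1f} GB uncompressed")

# Instead, read one region at a time; coordinates are always in level-0 pixels,
# so every patch can be traced back to its position on the slide.
patch = slide.read_region((10_000, 20_000), 0, (1024, 1024)).convert("RGB")
slide.close()
```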
Dataset Construction
Because this was an early-stage feasibility effort, the dataset was necessarily limited.
We prepared:
For each class, approximately 400 instances were annotated.
The effective magnification corresponds to roughly 800×, which is lower than what a dedicated 1000× digital microscope can provide.
As a result, some microorganisms were only weakly visible.
All annotation, training, and inference workflows were executed inside Deep Block.
Where Things Became Difficult
The most severe challenge appeared with Gram-positive rods.
They were:
During labeling, boundaries were often drawn coarsely — sometimes closer to a loose region than a true biological contour.
The annotation boundary was not rendered correctly, as shown in the lower-left corner.
This detail has enormous consequences for learning.
Because modern segmentation networks optimize pixel-level agreement, systematic boundary inflation becomes a form of structured noise.
If the model predicts the true boundary, it can still be penalized.
If it predicts the inaccurate training boundary, it is rewarded.
Therefore, better models may actually learn worse shapes.
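A toy numerical example, with made-up mask sizes rather than our real annotations, makes the penalty concrete: a prediction that matches the true bacterium scores far worse against an inflated label than a prediction that simply copies the inflation.

```python
# Toy illustration (not our real masks): why inflated labels penalize correct
# predictions. A rod truly covers a 4x20 px region, but the annotation is a
# loose 12x28 px box drawn around it.
import numpy as np

def dice(a: np.ndarray, b: np.ndarray) -> float:
    inter = np.logical_and(a, b).sum()
    return 2 * inter / (a.sum() + b.sum())

canvas = (64, 64)
true_mask = np.zeros(canvas, dtype=bool)
true_mask[30:34, 20:40] = True          # the actual 4x20 px bacterium

label = np.zeros(canvas, dtype=bool)
label[26:38, 16:44] = True              # coarse, inflated annotation

# A model that predicts the true shape scores poorly against the label...
print(f"true shape vs. label:     Dice = {dice(true_mask, label):.2f}")
# ...while a model that reproduces the inflation is "rewarded".
print(f"inflated shape vs. label: Dice = {dice(label, label):.2f}")
```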
Why Straightforward Segmentation Was Not Reliable
In controlled datasets, segmentation supervision is powerful.
In our case:
Under these conditions, pixel supervision degraded into uncertainty propagation.
We observed unstable convergence and inconsistent validation behavior, even when architectures were improved.
The bottleneck was data quality, not network capacity.
The Deep Block Workflow
To handle whole-slide AI development, we relied on an integrated pipeline rather than isolated scripts.
1. Slide Ingestion
WSI files were registered and indexed inside the platform.
2. Intelligent Tiling
Gigapixel images were divided into training tiles while preserving coordinate consistency (a coordinate-mapping sketch follows after this list).
3. Annotation
Experts labeled directly on tiles with class management and dataset correction.
4. Training
Experiments were reproducible and traceable across configurations.
5. Validation & Visualization
Predictions could be reviewed instantly, both at the data level (JSON files) and on the whole-slide image.
6. Export
Results were structured into machine-readable formats for further clinical analysis.
Without this orchestration, iteration speed would have been drastically slower.
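As a rough illustration of the coordinate bookkeeping behind steps 2 and 5, the sketch below shifts a tile-local bounding box back into whole-slide coordinates. The data structures and box format are simplified stand-ins, not Deep Block's actual schema.

```python
# Minimal sketch of the tiling/remapping idea: each tile keeps its level-0
# origin, so a detection predicted inside a tile can be projected back onto
# the whole slide for review and export.
from dataclasses import dataclass

@dataclass
class Tile:
    x0: int  # tile origin in slide (level-0) coordinates
    y0: int

@dataclass
class Detection:
    label: str
    x_min: float  # coordinates relative to the tile
    y_min: float
    x_max: float
    y_max: float

def to_slide_coords(tile: Tile, det: Detection) -> dict:
    """Shift a tile-local bounding box back into slide coordinates."""
    return {
        "label": det.label,
        "x_min": tile.x0 + det.x_min,
        "y_min": tile.y0 + det.y_min,
        "x_max": tile.x0 + det.x_max,
        "y_max": tile.y0 + det.y_max,
    }

# Example: a detection at (100, 150)-(130, 180) inside the tile whose origin is
# (40960, 20480) lands at (41060, 20630)-(41090, 20660) on the slide.
print(to_slide_coords(Tile(40960, 20480), Detection("WBC", 100, 150, 130, 180)))
```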
Strategic Pivot: From Segmentation to Detection
Given the annotation characteristics, we changed strategy.
Instead of forcing pixel accuracy, we reframed the task as bounding-box object detection.
This approach compresses noisy supervision into a more tolerant representation.
Detection tolerates spatial looseness.
Segmentation does not.
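The conversion itself is simple. The sketch below collapses a (possibly loose) polygon annotation into an axis-aligned bounding box; the polygon format is illustrative rather than our actual export format.

```python
# Minimal sketch: collapse a coarsely drawn polygon annotation into an
# axis-aligned bounding box, which absorbs most of the boundary looseness.
from typing import List, Tuple

Point = Tuple[float, float]

def polygon_to_bbox(points: List[Point]) -> Tuple[float, float, float, float]:
    """Return (x_min, y_min, x_max, y_max) for a polygon's vertices."""
    xs = [p[0] for p in points]
    ys = [p[1] for p in points]
    return min(xs), min(ys), max(xs), max(ys)

# A loose contour and a tight contour around the same rod produce very similar
# boxes, which is exactly the tolerance we wanted from detection.
loose = [(18.0, 27.0), (42.0, 26.0), (43.0, 36.0), (17.0, 37.0)]
print(polygon_to_bbox(loose))   # (17.0, 26.0, 43.0, 37.0)
```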
Human-in-the-Loop Improvement
The detector helped surface:
Experts could then correct them more efficiently than annotating from scratch.
Over time, dataset reliability increased.
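One way to picture this loop, with a made-up record format and threshold, is a simple confidence-based triage: high-confidence detections are accepted, and the rest are queued for expert review.

```python
# Illustrative human-in-the-loop triage: detections below a chosen confidence
# threshold go to the expert queue instead of straight into the dataset.
# The record format, labels, and threshold are assumptions for this sketch.
detections = [
    {"label": "WBC", "score": 0.96},
    {"label": "gram_positive_rod", "score": 0.41},
    {"label": "gram_positive_rod", "score": 0.88},
]

REVIEW_THRESHOLD = 0.60

auto_accept = [d for d in detections if d["score"] >= REVIEW_THRESHOLD]
needs_review = [d for d in detections if d["score"] < REVIEW_THRESHOLD]

print(f"{len(auto_accept)} accepted, {len(needs_review)} queued for expert review")
```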
Inference result of DEEP BLOCK after training
Throughput Considerations
Beyond accuracy, we also measured:
Even at this early stage, automation significantly reduced the search burden for specialists.
Key Lessons
1. Resolution limits dominate algorithm choice
When bacteria approach pixel scale, labeling error becomes unavoidable.
2. Data quality outweighs model complexity
More sophisticated architectures did not compensate for coarse annotation.
3. Platforms matter
Managing WSI, annotation, training, and review as a single system is critical.
4. Detection can be a bridge
Bounding boxes allow progress while improving the dataset.
What Comes Next
Future phases will include:
The present work establishes feasibility and reveals the roadmap.
Closing Thoughts
AI in clinical microscopy is not merely a modeling problem.
It is an integration problem across imaging, human labeling, data engineering, and deployment constraints.
In this study, we were able to significantly reduce the number of false positives.
Although some false negatives remained, high-speed inference across the entire slide still allowed us to reliably determine which pathogens were present, because the abundance of specific organisms was clearly evident.
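A simplified sketch of that slide-level reasoning, with invented counts and a placeholder threshold, is shown below: detections from every tile are pooled, and an organism is reported as present when its count is overwhelming despite individual misses.

```python
# Illustrative slide-level call: even with some per-tile false negatives, the
# sheer abundance of one organism class across the slide makes the overall
# finding clear. Counts and the threshold here are made up.
from collections import Counter

# Hypothetical detection labels pooled from every tile of one slide
all_labels = ["WBC"] * 5200 + ["gram_positive_rod"] * 1840

PRESENCE_THRESHOLD = 50  # minimum detection count to call an organism present

counts = Counter(all_labels)
organisms_present = [label for label, n in counts.items()
                     if label != "WBC" and n >= PRESENCE_THRESHOLD]
print(counts)
print("Organisms present:", organisms_present)
```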
This project reinforced why Deep Block was designed as a full-stack platform:
to allow teams to move forward even when data is imperfect.
We are continuing to collaborate with medical and industry partners to push this boundary further.