Digital pathology relies on digital slide files, also known as whole-slide images (WSIs), that hold high-resolution representations of entire microscope slides. These files enable remote diagnosis, computational analysis, education, and archiving at a scale and with a flexibility that glass slides cannot offer. The technical design of a file format affects interoperability, performance, long-term data stewardship, and downstream analytical workflows. Broadly, digital slide file formats fall into two categories: proprietary formats developed by hardware vendors for their scanners, and interoperable formats engineered for cross-platform compatibility and open data exchange.
=== Proprietary Formats ===
Proprietary digital slide formats are developed by hardware vendors to optimize performance and functionality for their own scanning systems. They typically extend standard image containers with custom metadata structures, compression schemes, and organizational paradigms tailored to each manufacturer's technology. Because vendor formats are designed around the feature sets of their native ecosystems, they present challenges for long-term data preservation, cross-platform compatibility, and vendor-neutral analysis workflows. These challenges include vendor lock-in, where institutions become dependent on specific hardware and software ecosystems; difficulty migrating data between platforms; and the increased cost of maintaining multiple proprietary toolchains.
==== SVS (Aperio) ====
The SVS format (ScanScope Virtual Slide), developed by Aperio (now part of Leica Biosystems), is one of the most widely used digital slide formats in clinical and research pathology. SVS files are based on the TIFF image standard, extended to support multi-resolution image pyramids. A single file holds several image resolutions, with each pyramid level stored as a tiled image. The base (first) image is always the full-resolution capture. Subsidiary images hold downsampled overviews, a thumbnail, and optionally a macro image or a scanned label of the glass slide.
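To make the "several images in one TIFF" layout concrete, the following minimal sketch walks the chain of image file directories (IFDs) in a classic TIFF and reports the dimensions recorded in each one. It assumes a classic 32-bit TIFF held in memory (large slides are often written as BigTIFF, which uses a different header) and reads only single-value SHORT or LONG tags. The helper `build_demo_tiff` is a hypothetical stand-in that fabricates two header-only directories so the example runs without a real slide; it is not an SVS writer.

```python
import struct

# Tag IDs from the baseline TIFF 6.0 specification.
TAG_IMAGE_WIDTH = 256
TAG_IMAGE_LENGTH = 257


def list_tiff_directories(data: bytes) -> list[dict[int, int]]:
    """Walk the IFD chain of a classic (non-BigTIFF) TIFF held in memory.

    Returns one {tag: value} dict per directory, keeping only tags whose
    value is a single SHORT or LONG stored inline in the 12-byte entry.
    """
    order = {b"II": "<", b"MM": ">"}.get(data[:2])   # byte-order mark
    if order is None:
        raise ValueError("missing TIFF byte-order mark")
    magic, offset = struct.unpack_from(order + "HI", data, 2)
    if magic != 42:
        raise ValueError("not a classic TIFF (BigTIFF uses magic 43)")

    directories = []
    while offset:                                    # offset 0 ends the chain
        (count,) = struct.unpack_from(order + "H", data, offset)
        tags = {}
        for i in range(count):
            entry = offset + 2 + 12 * i
            tag, ftype, n = struct.unpack_from(order + "HHI", data, entry)
            if n == 1 and ftype == 3:                # one SHORT
                (tags[tag],) = struct.unpack_from(order + "H", data, entry + 8)
            elif n == 1 and ftype == 4:              # one LONG
                (tags[tag],) = struct.unpack_from(order + "I", data, entry + 8)
        directories.append(tags)
        # The 4 bytes after the last entry point to the next directory.
        (offset,) = struct.unpack_from(order + "I", data, offset + 2 + 12 * count)
    return directories


def build_demo_tiff() -> bytes:
    """Hypothetical helper: a two-directory little-endian TIFF, headers only."""
    def ifd(width, height, next_offset):
        entries = [(TAG_IMAGE_WIDTH, 4, 1, width), (TAG_IMAGE_LENGTH, 4, 1, height)]
        body = struct.pack("<H", len(entries))
        for entry in entries:
            body += struct.pack("<HHII", *entry)
        return body + struct.pack("<I", next_offset)

    header = b"II" + struct.pack("<HI", 42, 8)       # first IFD at byte 8
    # Each demo IFD is 2 + 2 * 12 + 4 = 30 bytes, so the second starts at 38.
    return header + ifd(4000, 3000, 38) + ifd(1000, 750, 0)


for index, tags in enumerate(list_tiff_directories(build_demo_tiff())):
    print(index, tags[TAG_IMAGE_WIDTH], tags[TAG_IMAGE_LENGTH])
```

On a real slide, a maintained reader such as OpenSlide or a TIFF library is the better choice; the sketch only shows how one file can carry the base image and its subsidiary images as successive directories.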
==== NDPI (Hamamatsu) ====
NDPI is Hamamatsu’s proprietary TIFF-based whole-slide imaging format. It combines standard multi-directory TIFF pyramids with custom extensions for random-access viewing and metadata handling. The format embeds JPEG-compressed strips within TIFF image file directories (IFDs), uses private tag ranges for offset catalogs and restart markers, and places the macro overview in the final directory, all without separate index files.
• Multi-resolution TIFF pyramid: separate IFDs represent each zoom level; the lowest-resolution (macro) overview resides in the last directory.
• JPEG-compressed strips with restart markers: image data is stored as JPEG-compressed strips, and restart markers enable robust, random-access decoding of individual strips.
• Private TIFF tags (65420–65449+): Hamamatsu reserves custom tags to record strip offsets, high-order offset bits, restart-marker catalogs, and slide-specific metadata such as scan parameters.
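Two of the mechanisms described above can be illustrated with generic code. The sketch below assumes nothing about the actual NDPI tag layout: `combine_offset` shows how a 32-bit TIFF offset and separately stored high-order bits can be joined into one 64-bit file position, and `find_restart_markers` locates JPEG restart markers (RST0 to RST7, bytes 0xFFD0 to 0xFFD7) in an entropy-coded scan. Both function names are hypothetical, and a real reader would take marker positions from the format's catalog rather than scanning for them.

```python
def combine_offset(low32: int, high32: int) -> int:
    """Join a 32-bit offset with separately stored high-order bits."""
    return (high32 << 32) | (low32 & 0xFFFFFFFF)


def find_restart_markers(scan: bytes) -> list[tuple[int, int]]:
    """Return (byte_offset, n) for every RSTn marker found in `scan`.

    Inside JPEG entropy-coded data a literal 0xFF byte is always followed
    by a stuffed 0x00, so 0xFF followed by 0xD0..0xD7 can only be a marker.
    """
    markers = []
    i = scan.find(b"\xff")
    while i != -1 and i + 1 < len(scan):
        code = scan[i + 1]
        if 0xD0 <= code <= 0xD7:
            markers.append((i, code - 0xD0))
        # After 0xFF 0xFF (fill bytes) the second 0xFF may start a marker,
        # so step by one byte in that case and by two otherwise.
        i = scan.find(b"\xff", i + 1 if code == 0xFF else i + 2)
    return markers


# Synthetic scan data: a stuffed 0xFF00, then RST0 and RST1.
demo_scan = b"\x12\x34\xff\x00\x56\xff\xd0\x78\xff\xd1\x9a"
print(find_restart_markers(demo_scan))
print(hex(combine_offset(0x00000010, 0x1)))
```

Because the decoder state resets at each restart marker, a viewer that knows the marker positions can begin decoding in the middle of a strip instead of from its start.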
==== Philips iSyntax ====
The iSyntax format, developed by Philips for its IntelliSite Pathology Solution and Ultra Fast Scanner systems, is a proprietary whole-slide imaging format designed to combine the medical-grade image quality of JPEG 2000 with the speed and responsiveness of JPEG. Unlike pyramid-based TIFF formats, iSyntax uses a wavelet-based compression scheme that is inherently multi-resolution. The format is optimized for real-time encoding and decoding: its simplified entropy coding stage, based on local correlation and an arithmetic coder, is faster than JPEG 2000’s EBCOT at the cost of a roughly 10% increase in file size. Metadata is stored in a proprietary structure accessible through the Philips Pathology SDK. Because the specification is closed, vendor-neutral libraries such as OpenSlide do not support iSyntax natively, and third-party access typically requires the official SDK or conversion tools.
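The phrase "inherently multi-resolution" can be illustrated with a toy wavelet transform. The sketch below uses a reversible integer Haar transform on a single row of pixel values; this is a stand-in chosen for brevity and is not the wavelet or the entropy coder that iSyntax actually uses. Each decomposition level yields a half-length low-pass band that is itself a usable lower-resolution signal, plus the detail coefficients needed to restore the finer level exactly. All function names are hypothetical.

```python
def haar_forward(samples: list[int]) -> tuple[list[int], list[int]]:
    """One level of the integer Haar (S) transform on an even-length row."""
    pairs = list(zip(samples[0::2], samples[1::2]))
    low = [(a + b) >> 1 for a, b in pairs]    # floor of the pair average
    high = [a - b for a, b in pairs]          # pair difference
    return low, high


def haar_inverse(low: list[int], high: list[int]) -> list[int]:
    """Exactly undo haar_forward."""
    out = []
    for s, d in zip(low, high):
        a = s + ((d + 1) >> 1)
        out += [a, a - d]
    return out


def decompose(samples: list[int], levels: int) -> tuple[list[int], list[list[int]]]:
    """Apply haar_forward repeatedly; details are ordered finest first."""
    if len(samples) % (1 << levels):
        raise ValueError("length must be divisible by 2**levels")
    low, details = list(samples), []
    for _ in range(levels):
        low, high = haar_forward(low)
        details.append(high)
    return low, details


def reconstruct(low: list[int], details: list[list[int]]) -> list[int]:
    """Rebuild the original row from the coarsest band and all details."""
    for high in reversed(details):
        low = haar_inverse(low, high)
    return low


row = [5, 2, 4, 2, 7, 7, 0, 9]
coarse, details = decompose(row, levels=2)
print(coarse, details)
print(reconstruct(coarse, details) == row)
```

In this arrangement no separate downsampled copies are stored: a viewer that needs a quarter-resolution preview reads only the coarsest band, and zooming in means reading additional detail bands.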
==== PHN (Pharmanest Inc) ====
PHN is an open file format designed to store and share digital pathology data at the level of individual fibers (fiber objects), with a particular focus on collagen fibers and fibrosis extracted from whole-slide histology images. Fiber objects can also be derived from image analysis of other fibrillar proteins such as elastin, laminins, and fibronectin. For each digital pathology image, a PHN file contains spatial coordinates, measurements, pixel-level segmentation masks, fiber texture analysis data, and, where applicable, regions of interest. PHN files are intended to be interoperable with other spatial biology datasets, allowing (a) novel spatial analysis of biological fibrillar structures or (b) fusion of spatial datasets to bridge histology and other biological modalities such as single-cell analysis.

Technically, PHN files use the .phn extension and are built as standard ZIP archives. Inside, they combine human-readable JSON files (for metadata and measurements) with NumPy .npz files (for segmentation masks). Each fiber is stored as its own object, including its geometry, bounding box, and quantitative features, so that the full spatial context across the tissue can be reconstructed. The design aims for transparency and interoperability, so that single-fiber fibrosis data can be shared, reproduced, and analyzed across platforms without relying on proprietary systems.
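Because a PHN file is described as a standard ZIP archive of JSON and .npz members, it can in principle be inspected with general-purpose tools. The sketch below uses only the Python standard library. The member names (`metadata.json`, `fibers.json`, `masks.npz`) and the JSON fields are hypothetical placeholders, since the text does not specify the internal layout, and the demo archive is built in memory so the example runs without a real file.

```python
import io
import json
import zipfile


def summarize_phn(source) -> dict:
    """Parse every JSON member of a ZIP-based archive and list .npz members.

    `source` is a path or a binary file-like object. The .npz members are
    only listed here; loading them would typically use numpy.load.
    """
    with zipfile.ZipFile(source) as archive:
        names = archive.namelist()
        parsed = {n: json.loads(archive.read(n)) for n in names if n.endswith(".json")}
        masks = [n for n in names if n.endswith(".npz")]
    return {"json": parsed, "masks": masks}


def build_demo_archive() -> bytes:
    """Hypothetical helper: an in-memory archive with made-up member names."""
    fibers = [{"id": 1, "bbox": [0, 0, 10, 4], "length_um": 12.5}]
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("metadata.json", json.dumps({"slide": "demo"}))
        archive.writestr("fibers.json", json.dumps(fibers))
        archive.writestr("masks.npz", b"")   # empty placeholder, not a real mask
    return buffer.getvalue()


summary = summarize_phn(io.BytesIO(build_demo_archive()))
print(sorted(summary["json"]), summary["masks"])
```

Reading a real file would follow the same pattern with the path to the .phn file passed as `source`.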
==== Other Proprietary Formats ====
Additional examples include BIF (BioImagene Image File, Roche), MRXS (3DHISTECH), SCN (older Leica), and VMS/VMU (other Hamamatsu types). Most of these follow variants of the TIFF or BigTIFF structural paradigms, add proprietary tags, and embed unique metadata content. The diversity of proprietary formats, the lack of public documentation, and evolving vendor SDKs all make universal accessibility and comprehensive tool compatibility difficult.
=== Interoperable Formats ===
In response to the proliferation of proprietary formats and the growing demand for large-scale, multi-center, and AI-driven digital pathology, the community has advanced a set of interoperable image formats engineered for both long-term preservation and high-performance analysis. Chief among these are DICOM for whole-slide imaging, OME-TIFF, and the Iris File Extension, each offering advantages for cross-platform data sharing, metadata standardization, and toolchain integration.
==== Digital Imaging and Communications in Medicine (DICOM) ====
The DICOM standard, originally developed for radiology, has been extended to support whole-slide imaging, with definitions for tiled image pyramids, cross-reference metadata, and complex imaging workflows. DICOM Supplement 145 introduced the VL Whole Slide Microscopy Image IOD (Information Object Definition), which stores large, multi-resolution pathology images as collections of frames (tiles) within a single DICOM series. Key features include:
• Multi-resolution pyramids, with each resolution level stored as a separate image object file and the files for a slide kept in a single directory
• Compression support for JPEG and JPEG 2000 (J2K)
• Z-planes and multi-channel imaging for depth imaging, with associated metadata
• Structured metadata encoded with standardized DICOM attribute tags for reproducible specimen information and acquisition parameters
• A coordinate reference system with slide-based (X, Y, Z) spatial positioning and Frame of Reference tags
• Full and sparse tiling for flexible data organization

Many clinical workflows are adopting DICOM WSI as the long-term reference format, in part because of regulatory requirements and the need for standardized interoperability across institutions and platforms. Nevertheless, several challenges remain. DICOM WSI encoding can introduce computational overhead, due in part to its multipart structure, which makes real-time display in viewers more challenging than with proprietary or performance-focused formats. DICOM has also been criticized as a monolithic file specification that imposes architectural constraints and restricts technological choices. Conversion tools such as wsidicomizer, the Orthanc WSI server, and PixelMed’s TIFFToDicom facilitate migration from legacy formats to DICOM-compliant archives.
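With full tiling, a tile's position in the total pixel matrix determines which frame holds it, so no per-frame position table is needed. The sketch below computes the 1-based frame number for a tile under the assumption of a single focal plane and a single optical path, with frames ordered left to right and then top to bottom. The parameter names mirror the DICOM attributes Total Pixel Matrix Columns/Rows and Columns/Rows, but the function itself is a hypothetical helper, not part of any DICOM toolkit.

```python
import math


def tiled_full_frame_number(
    tile_col: int,
    tile_row: int,
    total_pixel_matrix_columns: int,
    total_pixel_matrix_rows: int,
    columns: int,
    rows: int,
) -> int:
    """Return the 1-based frame number of tile (tile_col, tile_row).

    `columns` and `rows` are the tile (frame) width and height in pixels.
    Edge tiles are padded to full size, hence the ceiling division.
    """
    tiles_per_row = math.ceil(total_pixel_matrix_columns / columns)
    tiles_per_column = math.ceil(total_pixel_matrix_rows / rows)
    if not (0 <= tile_col < tiles_per_row and 0 <= tile_row < tiles_per_column):
        raise IndexError("tile position outside the pixel matrix")
    return tile_row * tiles_per_row + tile_col + 1


# A 1000 x 600 pixel level cut into 256 x 256 tiles gives a 4 x 3 tile grid.
print(tiled_full_frame_number(0, 0, 1000, 600, 256, 256))
print(tiled_full_frame_number(3, 2, 1000, 600, 256, 256))
```

With sparse tiling the position of each frame is recorded explicitly in per-frame metadata, so a lookup table built from those entries replaces the arithmetic above.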
==== OME-TIFF (Open Microscopy Environment TIFF) ====
OME-TIFF is an open, extensible format developed by the Open Microscopy Environment (OME) consortium to address both the data and metadata needs of modern bioimaging. It extends the classic TIFF structure, which has widespread library and tool support, by embedding structured OME-XML metadata in the ImageDescription tag of the first Image File Directory (IFD). The file specification is published by the OME consortium. Key features include:
• Pyramidal multi-resolution support: OME-TIFF uses TIFF’s SubIFD mechanism (tag 330) to represent image pyramids, supporting rapid image navigation. Each level may use its own compression (JPEG, JPEG 2000, etc.), and BigTIFF extensions are supported for large files.
• Metadata extensibility: OME-XML is a schema-based metadata language that can represent imaging provenance (e.g., microscope, lens, detector), acquisition modality, coordinate mapping, and experiment details. Attributes such as channel wavelengths, Z index, timepoint, and objective are standardized for interoperability.
• Multi-dimensionality: supports Z-stacks, time series, multichannel imaging, and 3D/4D data organization.
• OME ecosystem: the Bio-Formats Java library and OMERO server provide read/write and management capabilities for OME-TIFF images, with desktop analysis supported by QuPath, Fiji/ImageJ, and others.
• Validation and archiving: the format is openly specified, making it suitable for long-term research data stewardship, regulatory submission, and reproducible AI workflows.

OME-TIFF is widely adopted in research and academic pathology, where comprehensive metadata and analysis pipeline integration are prioritized. Open-source code provides access to OME-TIFF files across platforms.
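Since the OME-XML block is ordinary XML, the dimensional metadata described above can be read with a standard XML parser once the ImageDescription string has been extracted from the TIFF. The sketch below parses a small hand-written OME-XML fragment; the sample values are invented for illustration, and it assumes the 2016-06 schema namespace. The function name `read_pixels_metadata` is hypothetical.

```python
import xml.etree.ElementTree as ET

# Namespace of the 2016-06 OME schema (an assumption about the file's version).
OME_NS = {"ome": "http://www.openmicroscopy.org/Schemas/OME/2016-06"}

SAMPLE_OME_XML = """\
<OME xmlns="http://www.openmicroscopy.org/Schemas/OME/2016-06">
  <Image ID="Image:0" Name="demo-slide">
    <Pixels ID="Pixels:0" DimensionOrder="XYCZT" Type="uint8"
            SizeX="4000" SizeY="3000" SizeZ="1" SizeC="3" SizeT="1"
            PhysicalSizeX="0.25" PhysicalSizeY="0.25">
      <Channel ID="Channel:0:0" Name="R"/>
      <Channel ID="Channel:0:1" Name="G"/>
      <Channel ID="Channel:0:2" Name="B"/>
    </Pixels>
  </Image>
</OME>
"""


def read_pixels_metadata(ome_xml: str) -> list[dict]:
    """Extract sizes, pixel spacing, and channel names for each Image."""
    root = ET.fromstring(ome_xml)
    images = []
    for image in root.findall("ome:Image", OME_NS):
        pixels = image.find("ome:Pixels", OME_NS)
        if pixels is None:
            continue
        spacing = pixels.get("PhysicalSizeX")       # optional attribute
        images.append({
            "name": image.get("Name"),
            "size": {axis: int(pixels.get("Size" + axis)) for axis in "XYZCT"},
            "physical_size_x": float(spacing) if spacing is not None else None,
            "channels": [c.get("Name") for c in pixels.findall("ome:Channel", OME_NS)],
        })
    return images


for entry in read_pixels_metadata(SAMPLE_OME_XML):
    print(entry)
```

In practice the XML string would come from a TIFF reader or from Bio-Formats rather than a literal, and files written against other schema versions would need the matching namespace.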
==== Iris File Extension (IFE) ====
The Iris File Extension (IFE) is a modern binary container format for whole-slide images developed at the University of Michigan. Built on contemporary performance-oriented serialization technology and incorporating familiar TIFF concepts, IFE addresses performance limitations of existing formats with an architecture designed for high-speed file operations and efficient local slide rendering. The format enables rapid random-access reads and massively multithreaded encoding writes while maintaining compatibility through validation routines. The specification is fully open under a Creative Commons Attribution-NoDerivatives 4.0 license. Key features include:
• Memory-mapped binary tile offset tables enabling direct random access without additional indexing
• Compression support for legacy JPEG and contemporary AVIF image formats
• Binary-encoded metadata segments storing slide descriptors, acquisition parameters, and spatial coordinates
• File-level and section-level integrity validation with early corruption detection
• Embedded annotation blocks for native serialization of regions of interest
• Multi-threaded parallel write architecture allowing simultaneous tile encoding and flushing
• Open-source ecosystem: Iris Codec (with a WebAssembly module) and the Iris RESTful server support high-performance network deployments and provide tools for format conversion and file updates
• Versioned headers with feature flags for backward and forward compatibility

IFE embeds structured metadata and annotations alongside pixel data, enabling integrated validation and region-of-interest handling within a single file. The architecture decouples metadata parsing from pixel retrieval and uses memory-mapped regions to reduce tile access times relative to traditional WSI formats. Cross-platform implementations are available in C++, Python, and JavaScript, with specification validation tools to support adoption.

== Challenges ==