In this paper, the researchers present HistoGPT, a vision-language model that generates pathology reports from multiple full-resolution histology images of a single patient.
Given multiple tissue sections from the same patient at up to 20× magnification, HistoGPT uses a vision foundation model to extract meaningful features from the images and combines them with a large language model (LLM) via cross-attention mechanisms to generate a pathology report.
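This pairing of a frozen vision encoder with a text decoder via cross-attention can be illustrated with a compact sketch. The PyTorch snippet below is a minimal, illustrative approximation of that idea rather than the authors' implementation: the module choices, dimensions, tokenizer size, and the omission of causal masking are all simplifying assumptions.

```python
import torch
import torch.nn as nn


class CrossAttentionBlock(nn.Module):
    """Lets report tokens attend to image patch features (illustrative)."""

    def __init__(self, dim: int, num_heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, num_heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, text_tokens: torch.Tensor, image_feats: torch.Tensor) -> torch.Tensor:
        # Query: text tokens; key/value: patch features pooled from all slides.
        attended, _ = self.attn(text_tokens, image_feats, image_feats)
        return self.norm(text_tokens + attended)


class ReportGenerator(nn.Module):
    """Toy vision-language decoder: visual evidence injected via cross-attention."""

    def __init__(self, vocab_size: int = 32000, dim: int = 512):
        super().__init__()
        self.token_emb = nn.Embedding(vocab_size, dim)
        self.cross_attn = CrossAttentionBlock(dim)
        self.decoder = nn.TransformerEncoderLayer(dim, nhead=8, batch_first=True)
        self.lm_head = nn.Linear(dim, vocab_size)

    def forward(self, token_ids: torch.Tensor, image_feats: torch.Tensor) -> torch.Tensor:
        x = self.token_emb(token_ids)
        x = self.cross_attn(x, image_feats)   # condition text on the slides
        x = self.decoder(x)                   # causal masking omitted for brevity
        return self.lm_head(x)                # next-token logits


# Features for, say, 3 tissue sections of one patient with 196 patches each,
# as they might come out of a pretrained vision foundation model.
image_feats = torch.randn(1, 3 * 196, 512)
token_ids = torch.randint(0, 32000, (1, 64))  # partially generated report
logits = ReportGenerator()(token_ids, image_feats)
print(logits.shape)  # torch.Size([1, 64, 32000])
```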
Each generated report describes the tissue composition, cellular subtypes, and potential diagnosis.
In addition, users can interact with the model through various prompts to extract additional information such as tumor subtypes, tumor thickness, and tumor margins.
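Prompt-based querying of this kind can be sketched as seeding the decoder with different text prompts over the same image features. The snippet below is purely hypothetical: the prompt wording, the `generate` helper, and the greedy decoding loop are illustrative assumptions and do not reflect the released HistoGPT interface.

```python
import torch


def generate(model, image_feats, prompt_ids, max_new_tokens=32, eos_id=2):
    """Greedy decoding: repeatedly append the most likely next token."""
    ids = prompt_ids.clone()
    for _ in range(max_new_tokens):
        logits = model(ids, image_feats)                   # (1, seq_len, vocab)
        next_id = logits[:, -1].argmax(dim=-1, keepdim=True)
        ids = torch.cat([ids, next_id], dim=1)
        if next_id.item() == eos_id:
            break
    return ids


# Different prompts pull out different report fields (wording is hypothetical).
prompts = {
    "subtype":   "Final diagnosis and tumor subtype:",
    "thickness": "The maximum tumor thickness is",
    "margins":   "The lateral and deep resection margins are",
}
# For each field: answer_ids = generate(report_model, image_feats, tokenizer.encode(prompt))
```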
HistoGPT is trained on 15,129 whole slide images from 6,705 dermatology patients with corresponding pathology reports.
The researchers evaluate HistoGPT in an international, multi-center clinical study and show that it can accurately predict tumor subtypes, tumor thickness, and tumor margins in a zero-shot fashion.
HistoGPT also outperforms state-of-the-art multiple instance learning (MIL) classifiers and general-purpose models such as GPT-4 Vision and BioGPT-1B on key reporting and prediction metrics.
HistoGPT demonstrates the potential of artificial intelligence to assist pathologists in evaluating, reporting, and understanding routine dermatopathology cases.