Anika_Kowalczyk ✓ PACS Admin
Clinical Informatics · Warsaw
Mar 2026
A worrying number of people building medical imaging AI pipelines don't properly understand DICOM metadata, and the result is silent bugs that are very hard to track down. The most common one: using pixel data from a CT series without applying the RescaleSlope (0028,1053) and RescaleIntercept (0028,1052) tags, which means you're working with raw stored values rather than Hounsfield Units. The conversion is always HU = pixel_value * RescaleSlope + RescaleIntercept. Many CT series have a slope of 1 and an intercept of -1024, but not all: scanner manufacturers and reconstruction kernels vary, so read both values from every series instead of hard-coding them. Similarly, always check ImageOrientationPatient (0020,0037) before assuming an axial, coronal, or sagittal orientation. Some PACS systems export series in non-standard orientations, and your model will then receive flipped or transposed volumes without any error being raised. Pydicom's documentation at pydicom.github.io covers these tags thoroughly, and SimpleITK handles most of this correctly if you use its DICOM reader rather than rolling your own. Use SimpleITK where you can.
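For anyone who does have to handle the raw arrays themselves, here is a minimal sketch of both checks in plain NumPy. The function names `to_hounsfield` and `dominant_plane` and the 0.9 tolerance are my own illustrative choices, not part of pydicom or SimpleITK, and the orientation check assumes the standard DICOM patient coordinate system (x toward the patient's left, y toward posterior, z toward the head).

```python
import numpy as np


def to_hounsfield(stored, slope, intercept):
    """Apply the DICOM rescale transform: HU = stored * slope + intercept."""
    # Cast to float first so unsigned stored values cannot wrap around
    # when a negative intercept is added.
    return stored.astype(np.float32) * float(slope) + float(intercept)


def dominant_plane(image_orientation_patient, tol=0.9):
    """Classify a slice plane from the six ImageOrientationPatient cosines.

    `tol` is a hypothetical threshold: slices whose normal is not closely
    aligned with any patient axis are reported as "oblique".
    """
    iop = np.asarray(image_orientation_patient, dtype=float)
    row, col = iop[:3], iop[3:]
    # The slice normal is the cross product of the row and column directions.
    normal = np.cross(row, col)
    axis = int(np.argmax(np.abs(normal)))
    if abs(normal[axis]) < tol:
        return "oblique"
    # Normal along x -> sagittal, along y -> coronal, along z -> axial.
    return ("sagittal", "coronal", "axial")[axis]


if __name__ == "__main__":
    raw = np.array([0, 1024, 2024], dtype=np.uint16)
    print(to_hounsfield(raw, slope=1, intercept=-1024))
    print(dominant_plane([1, 0, 0, 0, 1, 0]))
```

With pydicom, the inputs would typically come from `ds.pixel_array`, `ds.RescaleSlope`, `ds.RescaleIntercept`, and `ds.ImageOrientationPatient` on a dataset loaded with `pydicom.dcmread`. Note that the sign of the normal (which tells you about flips) is discarded here, so this only catches transposed planes, not reversed slice order.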
Nadia_Bassett
Research Physicist · UCSF
Apr 2026
If you're building a pipeline that needs to handle MRI alongside CT, be very aware that MRI pixel values have no absolute physical meaning: they're scanner-dependent, protocol-dependent, and even session-dependent for the same patient on the same machine. There's no MRI equivalent of Hounsfield Units. This means you can't apply one fixed intensity range across a population the way you can with CT; the normalization has to be derived from the data. The standard approaches for MRI are z-score normalization per volume, percentile clipping (clip to the 1st and 99th percentiles, then scale to [0,1]), or histogram matching to a reference template. For brain MRI specifically, the Nyul & Udupa (2000) histogram standardization method, implemented in the intensity-normalization library (github.com/jcreinhold/intensity-normalization), is still widely used and works well. Missing this distinction between CT and MRI normalization requirements is a root cause of a lot of poor model performance that people incorrectly attribute to the model architecture.
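As a rough sketch of the first two approaches (not the Nyul & Udupa method, and not the intensity-normalization library's API), something like the following works per volume with plain NumPy. The function names and the optional foreground `mask` argument are my own assumptions for illustration.

```python
import numpy as np


def zscore_normalize(volume, mask=None):
    """Per-volume z-score: subtract the mean, divide by the standard deviation.

    `mask` is an optional, hypothetical boolean foreground mask; when given,
    the statistics are computed over masked voxels only.
    """
    vol = volume.astype(np.float32)
    voxels = vol[mask] if mask is not None else vol
    std = voxels.std()
    if std == 0:
        raise ValueError("volume has zero variance; cannot z-score")
    return (vol - voxels.mean()) / std


def percentile_clip_scale(volume, lower=1.0, upper=99.0):
    """Clip to the [lower, upper] percentiles, then rescale to [0, 1]."""
    vol = volume.astype(np.float32)
    lo, hi = np.percentile(vol, [lower, upper])
    if hi <= lo:
        raise ValueError("percentile range is empty; cannot rescale")
    return (np.clip(vol, lo, hi) - lo) / (hi - lo)


if __name__ == "__main__":
    # Synthetic stand-in for an MRI volume with arbitrary intensity units.
    rng = np.random.default_rng(0)
    fake_mri = rng.gamma(2.0, 100.0, size=(8, 16, 16))
    z = zscore_normalize(fake_mri)
    p = percentile_clip_scale(fake_mri)
    print(float(z.mean()), float(z.std()), float(p.min()), float(p.max()))
```

Both functions compute their statistics from the volume they are given, which is the point: nothing is carried over from one scan to the next.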