Details
- Identification: DOI 10.3011/ESARDA.IJNSNP.2026.1
- Publication date: 18 May 2026
- Author: Joint Research Centre
Description
Volume: 68, December 2026, pages 2 - 8
Authors: Andrew Puyleart a, Thomas Grimes a, Benjamin Wilson a, Jennifer Hart a, Taissa Sobolev a, Ty Otto a
a Pacific Northwest National Laboratory, 902 Battelle Blvd, Richland, WA 99354
Abstract: The International Atomic Energy Agency (IAEA) faces a growing volume of surveillance data, and processing it is expected to continue straining resources. Tools that expedite the review of surveillance data could save significant labor hours. Rapid advances in machine learning promise to automate parts of the image analysis workflow, multiplying the productivity of human analysts and allowing the IAEA to systematically review a larger volume of imagery. This research discusses the use of multimodal foundation models paired with human-in-the-loop workflows to accelerate image adjudication. This work also developed multiple forensic review tools that use multimodal embeddings. The utility of embeddings was demonstrated in an exploratory toolkit deployed during a training exercise to capture safeguards-representative imagery. The toolkit’s object detection results were processed overnight and presented to training exercise participants on the final day of the exercise. The text prompt interactions were well received by the participants who attended the presentation.
Keywords: Multimodal Foundation Models, Deep Learning, Surveillance, Nuclear Safeguards, Multi-Modal
Reference guideline: Puyleart, A., Grimes, T., Wilson, B., Hart, J., Sobolev, T., Otto, T. (2026). FatCat: Utilizing Multi-Modal Foundational Models for Surveillance Image Review, ESARDA Bulletin - The International Journal of Nuclear Safeguards and Non-proliferation, 68, 2-8. https://doi.org/10.3011/ESARDA.IJNSNP.2026.1
