One-Prompt Segmentation combines the strengths of one-shot and interactive segmentation methods to enable zero-shot generalization across diverse medical imaging tasks, requiring only a single prompted sample during inference.
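A minimal sketch of that inference pattern is shown below in PyTorch: a shared encoder embeds both the single prompted template and each query image, and cross-attention conditions the query features on the template-plus-prompt features. All names here (`PromptFusionSegmenter` and its submodules) are hypothetical placeholders meant to illustrate the idea, not the paper's actual architecture.

```python
import torch
import torch.nn as nn

class PromptFusionSegmenter(nn.Module):
    """Hypothetical sketch: segment queries conditioned on one prompted template."""

    def __init__(self, dim: int = 64):
        super().__init__()
        # Shared CNN encoder for template and query images (1-channel medical scans).
        self.encoder = nn.Sequential(
            nn.Conv2d(1, dim, 3, padding=1), nn.ReLU(),
            nn.Conv2d(dim, dim, 3, padding=1), nn.ReLU(),
        )
        # Encodes the template's prompt mask into the same feature space.
        self.prompt_encoder = nn.Conv2d(1, dim, 3, padding=1)
        # Cross-attention: query tokens attend to prompt-conditioned template tokens.
        self.cross_attn = nn.MultiheadAttention(dim, num_heads=4, batch_first=True)
        self.mask_head = nn.Conv2d(dim, 1, 1)

    def forward(self, query, template, prompt_mask):
        b, _, h, w = query.shape
        q_feat = self.encoder(query)                       # (B, C, H, W)
        t_feat = self.encoder(template) + self.prompt_encoder(prompt_mask)
        q_seq = q_feat.flatten(2).transpose(1, 2)          # (B, H*W, C)
        t_seq = t_feat.flatten(2).transpose(1, 2)
        fused, _ = self.cross_attn(q_seq, t_seq, t_seq)    # condition query on template
        fused = fused.transpose(1, 2).reshape(b, -1, h, w)
        return torch.sigmoid(self.mask_head(fused))        # per-pixel foreground prob.

# One prompted sample at inference: the same (template, prompt) pair is reused
# for every query from the new task -- no fine-tuning, no per-query interaction.
model = PromptFusionSegmenter()
template = torch.randn(1, 1, 64, 64)   # the single annotated/prompted example
prompt   = torch.rand(1, 1, 64, 64)    # e.g. a rasterized click/box/mask prompt
query    = torch.randn(1, 1, 64, 64)   # unseen image from the new task
pred = model(query, template, prompt)  # (1, 1, 64, 64)
```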
SaLIP is a unified framework that leverages the combined capabilities of the Segment Anything Model (SAM) and Contrastive Language-Image Pre-Training (CLIP) to perform zero-shot organ segmentation in medical images, without relying on domain expertise or annotated data for prompt engineering.
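Concretely, the idea can be sketched with the public `segment-anything` and `clip` (openai/CLIP) packages: SAM proposes class-agnostic regions in "segment everything" mode, and CLIP retrieves the proposal that best matches a text description of the target organ. The checkpoint path, prompt template, and ranking loop below are illustrative assumptions rather than the paper's exact pipeline.

```python
import numpy as np
import torch
import clip
from PIL import Image
from segment_anything import sam_model_registry, SamAutomaticMaskGenerator

device = "cuda" if torch.cuda.is_available() else "cpu"
sam = sam_model_registry["vit_b"](checkpoint="sam_vit_b.pth").to(device)  # placeholder path
mask_generator = SamAutomaticMaskGenerator(sam)
clip_model, preprocess = clip.load("ViT-B/32", device=device)

def salip_style_segment(image: np.ndarray, organ: str) -> np.ndarray:
    """image: HxWx3 uint8 slice; returns the proposal CLIP ranks best for `organ`."""
    # 1) SAM in "segment everything" mode: class-agnostic region proposals.
    proposals = mask_generator.generate(image)

    # 2) CLIP retrieval: score each proposal's crop against the organ prompt.
    text = clip.tokenize([f"a medical image of the {organ}"]).to(device)
    with torch.no_grad():
        text_feat = clip_model.encode_text(text)
        text_feat /= text_feat.norm(dim=-1, keepdim=True)

        best_score, best_mask = -1.0, None
        for p in proposals:
            x, y, w, h = (int(v) for v in p["bbox"])      # XYWH box of the region
            crop = Image.fromarray(image[y:y + h, x:x + w])
            img_feat = clip_model.encode_image(preprocess(crop).unsqueeze(0).to(device))
            img_feat /= img_feat.norm(dim=-1, keepdim=True)
            score = (img_feat @ text_feat.T).item()       # cosine similarity
            if score > best_score:
                best_score, best_mask = score, p["segmentation"]

    # 3) The CLIP-selected region serves as the organ mask; no hand-crafted
    #    point/box prompts and no annotated data are needed.
    return best_mask
```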
AgileFormer is a spatially agile transformer UNet that systematically incorporates deformable patch embedding, spatially dynamic self-attention, and multi-scale deformable positional encoding to effectively capture diverse target objects in medical image segmentation tasks.
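To give a feel for the deformable patch embedding component, here is a minimal sketch built on torchvision's `DeformConv2d`: a small convolution predicts per-patch sampling offsets so the patchify step can bend its sampling grid toward irregularly shaped targets instead of cutting a rigid grid. This is an illustration under stated assumptions, not AgileFormer's released implementation.

```python
import torch
import torch.nn as nn
from torchvision.ops import DeformConv2d

class DeformablePatchEmbed(nn.Module):
    """Illustrative deformable patch embedding: a ViT-style strided-conv
    patchify whose sampling grid is warped by learned offsets."""

    def __init__(self, in_ch: int = 1, embed_dim: int = 96, patch: int = 4):
        super().__init__()
        # Predicts a 2D offset for each of the patch*patch sampling points.
        self.offset_pred = nn.Conv2d(in_ch, 2 * patch * patch,
                                     kernel_size=patch, stride=patch)
        # Deformable projection: samples the input at the offset positions,
        # then projects each (deformed) patch to the embedding dimension.
        self.proj = DeformConv2d(in_ch, embed_dim,
                                 kernel_size=patch, stride=patch)

    def forward(self, x):
        offsets = self.offset_pred(x)        # (B, 2*P*P, H/P, W/P)
        return self.proj(x, offsets)         # (B, embed_dim, H/P, W/P)

x = torch.randn(2, 1, 64, 64)                # e.g. a batch of CT slices
tokens = DeformablePatchEmbed()(x)           # (2, 96, 16, 16)
```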
State space models like Mamba offer efficient long-range interaction modeling at linear (rather than quadratic) complexity in sequence length, inspiring the development of VM-UNetV2, a Vision Mamba UNet that achieves competitive medical image segmentation.
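The linear-complexity claim follows from the recurrent form of a state space layer: each token triggers one constant-cost state update, so total cost grows as O(L) in sequence length rather than self-attention's O(L^2). Below is a minimal, unoptimized diagonal SSM scan in PyTorch purely to illustrate that recurrence; Mamba additionally makes the parameters input-dependent ("selective") and uses a fused parallel scan, and VM-UNetV2's blocks are more elaborate than this sketch.

```python
import torch
import torch.nn as nn

class DiagonalSSM(nn.Module):
    """Minimal diagonal state space layer: h_t = A*h_{t-1} + B*x_t, y_t = C.h_t.
    One O(N) state update per token => O(L*N) total, linear in sequence length L."""

    def __init__(self, dim: int, state: int = 16):
        super().__init__()
        # Stable diagonal transition: A in (0, 1) via a sigmoid of a learned logit.
        self.a_logit = nn.Parameter(torch.randn(dim, state))
        self.B = nn.Parameter(torch.randn(dim, state) / state ** 0.5)
        self.C = nn.Parameter(torch.randn(dim, state) / state ** 0.5)

    def forward(self, x):                      # x: (batch, length, dim)
        A = torch.sigmoid(self.a_logit)        # (dim, state)
        h = x.new_zeros(x.shape[0], x.shape[2], A.shape[1])   # (B, dim, state)
        ys = []
        for t in range(x.shape[1]):            # sequential scan: one step per token
            h = A * h + self.B * x[:, t, :, None]   # (B, dim, state)
            ys.append((h * self.C).sum(-1))         # y_t = C . h_t -> (B, dim)
        return torch.stack(ys, dim=1)          # (B, L, dim)

# For images, patches are flattened into a 1D token sequence before the scan,
# which is how Mamba-style blocks slot into a UNet encoder/decoder.
seq = torch.randn(2, 256, 64)                  # 16x16 patches, 64-dim tokens
out = DiagonalSSM(dim=64)(seq)                 # (2, 256, 64)
```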