The FOCUS framework leverages pathology foundation models and language-guided prompts to prioritize diagnostically relevant regions in whole slide images, significantly improving few-shot classification accuracy in computational pathology.
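MagicDriveDiT is a framework that leverages the DiT architecture and a novel spatial-temporal conditional encoding to generate realistic long videos for autonomous driving, with substantially higher resolution and frame counts than previous methods.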
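BelHouse3D, a new synthetic dataset that mimics real-world occlusions, provides out-of-distribution (OOD) evaluation to support the development of more generalizable models for 3D point cloud semantic segmentation of indoor scenes.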
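This paper proposes a methodology for statistically analyzing how various object- and environment-related factors affect the performance of LiDAR- and camera-based 3D object detection.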
This research paper introduces DuoLift-GAN and DuoLift-CNN, novel deep learning models that reconstruct 3D chest CT volumes from single or biplanar X-ray images, outperforming existing methods in accuracy and visual realism while offering a detailed analysis of evaluation metrics for chest CT reconstruction quality.
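Edify Image is a family of pixel-space diffusion models that uses a Laplacian diffusion model to generate high-resolution images from text and offers a range of capabilities, including image upsampling, style control, panorama generation, and fine-tuning for customization.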
This research paper introduces novel methods for detecting and analyzing visual artifacts specific to JPEG AI, a learning-based image compression approach, and presents a dataset of such artifacts to help improve these codecs.
Images generated by Latent Diffusion Models (LDMs) can be effectively detected by identifying artifacts introduced by their autoencoders, eliminating the need for training on synthetic data and reducing computational costs.
NeuReg, a novel neuro-inspired deep learning architecture, achieves state-of-the-art performance in domain-invariant 3D brain image registration, effectively handling variations in human and mouse brain images across different imaging modalities and developmental stages.
This paper introduces Text2CAD, a novel framework leveraging stable diffusion models to automate the creation of 3D CAD models from textual descriptions, bridging the gap between user intent and engineering output.