LLaSA is a novel framework that improves Large Language Models' ability to process and utilize structured data (tables, graphs, databases) by representing them as hypergraphs, enabling a unified encoding method and enhancing performance on various knowledge-intensive tasks.
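This paper provides a comprehensive overview of spoken dialogue models, particularly cascaded and end-to-end models, and examines in depth core technologies such as speech representations, training paradigms, streaming, duplex, and interaction capabilities, as well as related datasets, evaluation metrics, and benchmarks.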
WORLDREP is a new dataset that leverages the power of large language models to predict future international events from text, addressing limitations of existing datasets by capturing complex multilateral relations and providing high-quality, expert-validated labels.
AddrLLM, a novel framework leveraging retrieval-augmented large language models, effectively rewrites inaccurate addresses, significantly improving logistics efficiency.
Large language models (LLMs) can be effectively specialized for specific domains using a multi-stage training approach that involves knowledge distillation, iterative refinement with expert feedback, and self-evolution through inference strategy optimization.
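To improve the code generation ability of large language models (LLMs), this paper proposes SRA-MCTS, a self-driven reasoning augmentation method based on Monte Carlo Tree Search (MCTS), which generates diverse intermediate reasoning paths to encourage the model's autonomous thinking and substantially improves performance, especially for small models.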
SRA-MCTS, a self-driven reasoning augmentation process built on Monte Carlo Tree Search, significantly improves the code generation capabilities of large language models, particularly on complex problems, by enabling the models to autonomously generate and evaluate diverse reasoning paths.
BlueLM-V-3B is an efficient algorithm and system co-design approach that enables the deployment of powerful and fast multimodal large language models (MLLMs) on mobile devices by addressing the challenges of limited memory and computational resources.
AmoebaLLM is a novel framework that enables the efficient deployment of large language models (LLMs) by allowing for the instant derivation of compressed subnets with arbitrary shapes, achieving a balance between accuracy and efficiency without the need for individual fine-tuning.
This paper proposes KPC-cF, a framework that uses translated benchmark data and unlabeled data to perform effective aspect-based sentiment analysis (ABSA) in low-resource languages such as Korean.