Robust and Adaptive Speech Large Language Model: WavLLM
WavLLM is a robust and adaptive speech large language model. It uses dual encoders, Whisper to capture semantic information and WavLM to capture acoustic information, and employs a prompt-aware LoRA weight adapter to improve generalization across complex multi-task instructions.
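To make the two ideas in the description above concrete, here is a minimal pure-Python sketch, not the WavLLM implementation. It assumes that one frame from each encoder is concatenated and linearly projected to the language model width, and that the prompt embedding is mapped through a sigmoid to one scaling factor per LoRA rank. All names, dimensions, and the per-rank gating scheme are hypothetical choices made for illustration; the actual adapter design may differ.

```python
import math
import random


def matvec(matrix, vector):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vector)) for row in matrix]


def random_matrix(rows, cols, rng, scale=0.1):
    """Small random matrix standing in for learned weights."""
    return [[rng.uniform(-scale, scale) for _ in range(cols)] for _ in range(rows)]


def fuse_encoders(semantic_frame, acoustic_frame, projection):
    """Concatenate one frame from each encoder, then project to the LLM width.

    Assumption: fusion by concatenation plus a linear layer.
    """
    return matvec(projection, semantic_frame + acoustic_frame)


def prompt_gate(prompt_embedding, gate_weights):
    """Map a prompt embedding to one factor in (0, 1) per LoRA rank.

    Assumption: a sigmoid gate; this is a hypothetical stand-in for the
    prompt-aware weight adapter.
    """
    logits = matvec(gate_weights, prompt_embedding)
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]


def adaptive_lora_forward(x, frozen_w, lora_a, lora_b, gate):
    """Compute W x + B (gate * (A x)): frozen path plus a prompt-scaled update."""
    low_rank = matvec(lora_a, x)
    scaled = [g * v for g, v in zip(gate, low_rank)]
    update = matvec(lora_b, scaled)
    base = matvec(frozen_w, x)
    return [b + d for b, d in zip(base, update)]


# Toy dimensions, chosen only for the demonstration.
SEM_DIM, ACO_DIM, HIDDEN, RANK, PROMPT_DIM = 4, 3, 5, 2, 6
rng = random.Random(0)

projection = random_matrix(HIDDEN, SEM_DIM + ACO_DIM, rng)
frozen_w = random_matrix(HIDDEN, HIDDEN, rng)
lora_a = random_matrix(RANK, HIDDEN, rng)
lora_b = random_matrix(HIDDEN, RANK, rng)
gate_weights = random_matrix(RANK, PROMPT_DIM, rng)

# Random vectors standing in for encoder outputs and a prompt representation.
semantic_frame = [rng.uniform(-1, 1) for _ in range(SEM_DIM)]
acoustic_frame = [rng.uniform(-1, 1) for _ in range(ACO_DIM)]
prompt_embedding = [rng.uniform(-1, 1) for _ in range(PROMPT_DIM)]

speech_token = fuse_encoders(semantic_frame, acoustic_frame, projection)
gate = prompt_gate(prompt_embedding, gate_weights)
output = adaptive_lora_forward(speech_token, frozen_w, lora_a, lora_b, gate)
print(len(speech_token), len(gate), len(output))
```

In this sketch a different prompt embedding yields a different gate, so the same LoRA matrices contribute a different update for each instruction. Setting the gate to zero recovers the frozen layer exactly.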