Key concepts
The proposed A-LLMRec framework efficiently combines the collaborative knowledge from a pre-trained state-of-the-art collaborative filtering recommender system with the textual knowledge from a large language model, enabling superior performance in both cold and warm scenarios.
Summary
The paper proposes A-LLMRec, an efficient all-round recommender system that leverages large language models (LLMs) and collaborative filtering (CF) techniques. The key idea is to align the collaborative knowledge from a pre-trained state-of-the-art CF recommender system with the token space of an LLM, allowing the LLM to directly utilize the collaborative knowledge for recommendation tasks.
The approach involves two stages:
- Alignment between Collaborative and Textual Knowledge: The item embeddings from the pre-trained CF recommender are aligned with the text embeddings from a pre-trained Sentence-BERT model, enabling the model to capture both collaborative and textual knowledge.
- Alignment between Joint Collaborative-Text Embedding and LLM: The aligned collaborative and textual knowledge is projected onto the token space of the LLM, allowing the LLM to leverage this joint knowledge for recommendation.
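The two stages above can be illustrated with a minimal sketch. It is not the paper's implementation: the plain linear layers, the toy dimensions, the mean-squared matching loss, and every name below (`item_encoder`, `text_encoder`, `to_llm`, and so on) are assumptions chosen only to show how the data flows from the CF and text embeddings to the LLM token space.

```python
import random

def linear(weights, vec):
    """Apply a bias-free dense layer: one output value per weight row."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weights]

def init_weights(out_dim, in_dim, rng):
    """Small random weights; a stand-in for a learned alignment layer."""
    return [[rng.uniform(-0.1, 0.1) for _ in range(in_dim)] for _ in range(out_dim)]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)

rng = random.Random(0)

# Toy sizes (assumptions); real embedding sizes are far larger.
CF_DIM, TEXT_DIM, JOINT_DIM, LLM_DIM = 4, 6, 3, 8

# Hypothetical alignment layers. In the described setup these would be the
# only trained components; the CF model and the LLM stay frozen.
item_encoder = init_weights(JOINT_DIM, CF_DIM, rng)
text_encoder = init_weights(JOINT_DIM, TEXT_DIM, rng)
to_llm = init_weights(LLM_DIM, JOINT_DIM, rng)

# Stand-ins for one item's frozen CF embedding and its text embedding
# (the latter would come from Sentence-BERT in the summary above).
cf_item_embedding = [rng.gauss(0, 1) for _ in range(CF_DIM)]
text_embedding = [rng.gauss(0, 1) for _ in range(TEXT_DIM)]

# Stage 1: map both views of the item into a shared space and measure how
# far apart they are; training would minimise this gap.
joint_from_cf = linear(item_encoder, cf_item_embedding)
joint_from_text = linear(text_encoder, text_embedding)
alignment_loss = mse(joint_from_cf, joint_from_text)

# Stage 2: project the joint embedding to the LLM's token-embedding width so
# it can sit in a prompt alongside ordinary token embeddings.
llm_soft_token = linear(to_llm, joint_from_cf)

print(len(joint_from_cf), len(llm_soft_token), alignment_loss >= 0.0)
```

The vector produced in stage 2 has the same width as a token embedding, which is what lets a frozen LLM consume it without changes to its vocabulary or weights.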
The proposed A-LLMRec has two key advantages: 1) it is model-agnostic, allowing integration with various existing CF recommender systems, and 2) it is efficient, as only the alignment network is trained, while the CF recommender and the LLM remain frozen.
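The efficiency claim rests on updating only the alignment parameters. The sketch below shows that idea on a toy problem, assuming a single linear alignment matrix `W` trained by plain gradient descent on a squared error; the values, the learning rate, and the function names are hypothetical and are not taken from the paper.

```python
def matvec(W, x):
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def mse(a, b):
    """Mean squared error between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b)) / len(a)

# Frozen inputs (assumed values): a CF item embedding and the target it
# should align to. Tuples make it explicit that they are never updated.
cf_embedding = (0.5, -1.0, 2.0)
target = (1.0, 0.0)

# The only trainable parameters: a 2x3 alignment matrix, initialised to zero.
W = [[0.0] * len(cf_embedding) for _ in range(len(target))]

def train_step(W, x, y, lr=0.05):
    """One gradient-descent step on MSE(Wx, y); returns the pre-update loss."""
    pred = matvec(W, x)
    n = len(y)
    for i in range(n):
        grad_scale = 2.0 * (pred[i] - y[i]) / n  # d(MSE)/d(pred[i])
        for j in range(len(x)):
            W[i][j] -= lr * grad_scale * x[j]    # chain rule: d(pred[i])/dW[i][j] = x[j]
    return mse(pred, y)

losses = [train_step(W, cf_embedding, target) for _ in range(50)]
print(losses[0], losses[-1] < losses[0])
```

Because gradients flow only into `W`, the cost of a training step scales with the size of the alignment network rather than with the size of the CF model or the LLM.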
Extensive experiments on various real-world datasets show that A-LLMRec outperforms both traditional CF models and LLM-based models in cold and warm scenarios, as well as in few-shot, cold-user, and cross-domain settings. The paper also shows that A-LLMRec can generate natural language outputs based on its understanding of users and items, which it gains through the aligned collaborative knowledge.
Statistics
The experiments use four user-item interaction datasets from Amazon: Movies and TV, Video Games, Beauty, and Toys.
Across these datasets, the number of users ranges from 10K to 200K and the number of items from 10K to 60K.
Quotes
"Although modality-aware and LLM-based recommender systems have proven effective in cold scenarios with limited user-item interactions, we argue that these methods suffer from the lack of collaborative knowledge due to their heavy reliance on textual information."
"Our main idea is to enable an LLM to directly leverage the collaborative knowledge contained in a pre-trained state-of-the-art collaborative filtering recommender system (CF-RecSys) so that the emergent ability of the LLM as well as the high-quality user/item embeddings that are already trained by the state-of-the-art CF-RecSys can be jointly exploited."