TalkPlay-Tools:
Conversational Music Recommendation with LLM Tool Calling

Seungheon Doh* (1, 2), Keunwoo Choi* (2), Juhan Nam (1)
(1) KAIST, (2) talkpl.ai
* Equal contribution

Abstract

We propose an LLM-based music recommendation system that uses tool calling to serve as a unified retrieval-ranking pipeline. Recent developments in large language models (LLMs) have enabled generative recommenders with natural language interaction, but their recommendation behavior remains limited: simpler yet crucial components, such as metadata or attribute filtering, are underutilized. Our system positions an LLM as an end-to-end recommendation system that interprets user intent, plans tool invocations, and orchestrates specialized components: boolean filters (SQL), sparse retrieval (BM25), dense retrieval (embedding similarity), and generative retrieval (semantic IDs). Through tool planning, the LLM predicts (1) which tools to use, (2) the order in which to call them, and (3) the arguments passed to each, in order to find music that matches user preferences. This supports diverse modalities while integrating multiple database filtering methods. We demonstrate that this unified tool-calling framework achieves competitive performance across diverse recommendation scenarios by selectively employing the retrieval methods appropriate to each user query, pointing toward a new paradigm for conversational music recommendation systems.
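To make the idea of tool planning concrete, the following is a minimal sketch of how a predicted plan (tool type, order, and arguments) might be executed over a candidate set. It is not the authors' implementation: the catalog, the `ToolCall` structure, the function names, and the argument names are all hypothetical. The plan is hard-coded where the system would have the LLM predict it, the sparse retriever uses simple term overlap as a stand-in for BM25, and the dense and generative retrieval tools are omitted for brevity.

```python
from dataclasses import dataclass

# Toy catalog; every field and value here is an illustrative assumption.
CATALOG = [
    {"id": "t1", "genre": "synthwave", "year": 2019, "tags": "night drive retro synth"},
    {"id": "t2", "genre": "jazz", "year": 2021, "tags": "calm morning piano"},
    {"id": "t3", "genre": "synthwave", "year": 2022, "tags": "rain city synth night"},
    {"id": "t4", "genre": "classical", "year": 2015, "tags": "calm piano study"},
]


@dataclass
class ToolCall:
    """One step of a tool plan: which tool to call and with what arguments."""
    name: str
    args: dict


def boolean_filter(candidates, genre=None, min_year=None):
    """Stand-in for the SQL boolean filter: keep tracks that satisfy metadata constraints."""
    kept = []
    for track in candidates:
        if genre is not None and track["genre"] != genre:
            continue
        if min_year is not None and track["year"] < min_year:
            continue
        kept.append(track)
    return kept


def sparse_retrieval(candidates, query, top_k=10):
    """Stand-in for BM25: score by query-term overlap with tags (not real BM25)."""
    terms = query.lower().split()
    scored = []
    for track in candidates:
        tag_words = track["tags"].split()
        score = sum(tag_words.count(term) for term in terms)
        if score > 0:
            scored.append((score, track))
    # Stable sort on the score only, so ties keep their catalog order.
    scored.sort(key=lambda pair: -pair[0])
    return [track for _, track in scored[:top_k]]


# Hypothetical tool registry; a full system would also register dense and generative retrieval.
TOOLS = {
    "boolean_filter": boolean_filter,
    "sparse_retrieval": sparse_retrieval,
}


def execute_plan(plan, catalog):
    """Run the tool calls in order, each one narrowing or re-ranking the candidates."""
    candidates = list(catalog)
    for call in plan:
        candidates = TOOLS[call.name](candidates, **call.args)
    return candidates


# A plan an LLM might predict for "recent synthwave for a rainy night" (hard-coded here).
plan = [
    ToolCall("boolean_filter", {"genre": "synthwave", "min_year": 2020}),
    ToolCall("sparse_retrieval", {"query": "rain night", "top_k": 5}),
]
result_ids = [track["id"] for track in execute_plan(plan, CATALOG)]
print(result_ids)
```

In this sketch the order of the plan matters: filtering first shrinks the candidate set before ranking, which mirrors the retrieval-then-ranking framing of the abstract. Changing the order or the arguments changes the result without changing any tool code.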

TalkPlay system overview

Demo Examples


Input: Chat History