Multi-armed Bandit

Learning Multi-Objective Rewards and User Utility Function in Contextual Bandits for Personalized Ranking

This paper tackles the problem of providing users with ranked lists of relevant search results by incorporating contextual features of users and search results and by learning how a user values multiple objectives. For example, to recommend a …

MOR-LinUCB: A Multi-Objective and Context-Aware Ranking Approach

Contextual information about users and web resources can improve the understanding of users' search intents on the web. Providing users with relevant search results requires balancing multiple objectives, such as users' explicitly stated preferences …