Weng, Paul; Busa-Fekete, Róbert; Hüllermeier, Eyke: Interactive Q-learning with ordinal rewards and unreliable tutor.
In: ECML Workshop on Reinforcement Learning with Generalized Feedback: Beyond Numeric Rewards.