DP15738 Algorithmic collusion with imperfect monitoring

Author(s): Emilio Calvano, Giacomo Calzolari, Vincenzo Denicolò, Sergio Pastorello
Publication Date: January 2021
Keyword(s): Artificial Intelligence, Collusion, Imperfect Monitoring, Q-Learning
JEL(s): D43, D83, L13, L41
Programme Areas: Industrial Organization
Link to this Page: cepr.org/active/publications/discussion_papers/dp.php?dpno=15738

We show that Q-learning algorithms, if allowed enough time to complete their learning, can learn to collude in an environment with imperfect monitoring adapted from Green and Porter (1984), without having been instructed to do so and without communicating with one another. Collusion is sustained by punishments that take the form of "price wars" triggered by the observation of low prices. The punishments have a finite duration: they are harsher initially and then gradually fade away. Such punishments are triggered both by deviations and by adverse demand shocks.
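
For readers unfamiliar with the setting, the following is a minimal illustrative sketch of two tabular Q-learners interacting under imperfect monitoring. It is not the authors' specification. The linear demand curve, the quantity grid, the price bins, the Gaussian shock, the exploration schedule, and all function and parameter names are assumptions chosen only for illustration. Each agent sees only the noisy market price from the previous period, never its rival's action, so it cannot tell a deviation apart from an adverse demand shock.

```python
import math
import random


def make_q_table(n_states, n_actions):
    """Create a Q-table of zeros, one independent row per state."""
    return [[0.0] * n_actions for _ in range(n_states)]


def q_update(q, state, action, reward, next_state, alpha=0.1, gamma=0.95):
    """Apply the standard tabular Q-learning update and return the new value."""
    target = reward + gamma * max(q[next_state])
    q[state][action] += alpha * (target - q[state][action])
    return q[state][action]


def choose_action(q, state, epsilon, rng):
    """Epsilon-greedy choice: explore with probability epsilon, else exploit."""
    if rng.random() < epsilon:
        return rng.randrange(len(q[state]))
    row = q[state]
    return row.index(max(row))


def price_bin(price, edges):
    """Map a price to a discrete state: the number of bin edges it reaches."""
    return sum(price >= e for e in edges)


def simulate(steps=1000, seed=0):
    """Run two Q-learners that observe only the last noisy market price."""
    rng = random.Random(seed)
    quantities = [1.0, 2.0, 3.0]   # hypothetical action grid
    edges = [5.0, 6.0, 7.0]        # hypothetical price bins, giving 4 states
    q_tables = [make_q_table(len(edges) + 1, len(quantities)) for _ in range(2)]
    state = 0
    for t in range(steps):
        epsilon = math.exp(-0.005 * t)          # assumed decaying exploration
        actions = [choose_action(q, state, epsilon, rng) for q in q_tables]
        total = sum(quantities[a] for a in actions)
        shock = rng.gauss(0.0, 1.0)             # unobserved demand shock
        price = max(0.0, 10.0 - total + shock)  # assumed linear demand
        next_state = price_bin(price, edges)
        for q, a in zip(q_tables, actions):
            # Each firm's reward is its own revenue at the realized price.
            q_update(q, state, a, price * quantities[a], next_state)
        state = next_state
    return q_tables
```

Calling `simulate()` returns the two learned Q-tables. A sketch this small is not expected to reproduce the collusive outcomes or the price-war dynamics reported in the paper; it only shows the information structure and the learning rule.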