Author
Listed:
- Shidi Deng
- Maximilian Schiffer
- Martin Bichler
Abstract
Nowadays, a significant share of the business-to-consumer sector relies on online platforms such as Amazon and Alibaba and uses AI for pricing strategies. This has sparked debate on whether pricing algorithms may tacitly collude to set supra-competitive prices without being explicitly designed to do so. Our study addresses these concerns by examining the risk of collusion when Reinforcement Learning (RL) algorithms are used to decide on pricing strategies in competitive markets. Prior research in this field focused on Tabular Q-learning (TQL) and led to opposing views on whether learning-based algorithms can result in supra-competitive prices. Building on this, our work contributes to the ongoing discussion by providing a more nuanced numerical study that goes beyond TQL, additionally capturing off-policy and on-policy Deep Reinforcement Learning (DRL) algorithms, two distinct families that have recently gained attention for algorithmic pricing. We study multiple Bertrand oligopoly variants and show that algorithmic collusion depends on the algorithm used. In our experiments, we observed that TQL tends to exhibit higher collusion and price dispersion. Moreover, it suffers from instability and disparity, as agents with higher learning rates consistently achieve higher profits, and it lacks robustness in state representation, with pricing dynamics varying significantly depending on the information agents can access. In contrast, DRL algorithms such as PPO and DQN generally converge to lower prices closer to the Nash equilibrium. Additionally, we show that when pre-trained TQL agents interact with DRL agents, the latter quickly outperform the former, highlighting the advantages of DRL in pricing competition. Lastly, we find that competition between heterogeneous DRL algorithms, such as PPO and DQN, tends to reduce the likelihood of supra-competitive pricing.
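The TQL setting the abstract refers to can be sketched as follows. This is an illustrative toy, not the paper's environment: the price grid, learning rate, discount factor, exploration rate, and winner-take-all Bertrand demand rule are all assumptions chosen for brevity.

```python
import numpy as np

rng = np.random.default_rng(0)
prices = np.linspace(0.0, 1.0, 6)   # assumed discrete price grid, zero marginal cost
n = len(prices)

def profits(i, j):
    """Toy Bertrand payoffs: the lower-priced firm captures the market; ties split it."""
    pi, pj = prices[i], prices[j]
    if pi < pj:
        return pi, 0.0
    if pi > pj:
        return 0.0, pj
    return pi / 2, pj / 2

# Each agent's state is the rival's last price index; one Q-table per agent.
Q = [np.zeros((n, n)), np.zeros((n, n))]
alpha, gamma, eps = 0.1, 0.95, 0.1   # assumed hyperparameters
state = [0, 0]

for t in range(50_000):
    # Epsilon-greedy action selection for both agents.
    acts = [int(rng.integers(n)) if rng.random() < eps
            else int(Q[k][state[k]].argmax()) for k in range(2)]
    r = profits(acts[0], acts[1])
    for k in range(2):
        s, a = state[k], acts[k]
        s_next = acts[1 - k]            # rival's current price becomes the next state
        # Standard tabular Q-learning update.
        Q[k][s, a] += alpha * (r[k] + gamma * Q[k][s_next].max() - Q[k][s, a])
        state[k] = s_next

# Greedy prices after learning; prices above the (near-zero) Bertrand-Nash price
# on this grid would count as supra-competitive.
learned = [prices[int(Q[k][state[k]].argmax())] for k in range(2)]
print(learned)
```

Replacing each Q-table with a neural network approximator (DQN) or a policy-gradient learner (PPO) in this loop yields the DRL counterparts the study compares against.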
Suggested Citation
Shidi Deng & Maximilian Schiffer & Martin Bichler, 2025. "Exploring Competitive and Collusive Behaviors in Algorithmic Pricing with Deep Reinforcement Learning," Papers 2503.11270, arXiv.org.
Handle: RePEc:arx:papers:2503.11270