Deep Reinforcement Learning for Tehran Stock Trading

Authors

  • Neda Yousefi, Allameh Tabataba’i University, Tehran, Iran

DOI:

https://doi.org/10.56741/jnest.v1i02.171

Keywords:

machine learning, deep learning, reinforcement learning, Deep Deterministic Policy Gradient (DDPG), Advantage Actor-Critic (A2C), stock trading

Abstract

Stock trading is an appealing topic both for research and for making a profit, and artificial intelligence has had a strong influence on the field. A large body of work has investigated machine learning and deep learning methods for stock trading, yet despite the extensive research on prediction and automated trading, framing stock trading as a deep reinforcement learning problem remains an open research area. Recent progress in reinforcement learning, together with its intrinsic properties, makes it, in theory, a suitable method for market trading. In this paper, single-stock trading models are presented based on fine-tuned state-of-the-art deep reinforcement learning algorithms: Deep Deterministic Policy Gradient (DDPG) and Advantage Actor-Critic (A2C). These algorithms are able to interact with the trading market and capture financial market dynamics. The proposed models are compared, evaluated, and verified on historical stock trading data, using annualized return and Sharpe ratio as performance criteria. The results show that agents based on both algorithms make intelligent decisions on historical data, and that the DDPG strategy outperforms A2C, achieving better results in terms of convergence, stability, and the evaluation criteria.
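The page itself contains no code, but the study cites OpenAI Gym and Stable Baselines, so the kind of single-stock experiment the abstract describes can be sketched as below. This is a minimal, illustrative sketch, not the author's implementation: it uses Gymnasium and stable-baselines3 (maintained successors of the cited libraries), a hypothetical toy environment `SingleStockEnv`, a synthetic random-walk price series in place of the Tehran market data, and arbitrary hyperparameters.

```python
# Illustrative sketch only: toy environment, synthetic prices, and arbitrary
# hyperparameters stand in for the paper's actual data and configuration.
import numpy as np
import gymnasium as gym
from gymnasium import spaces
from stable_baselines3 import A2C, DDPG


class SingleStockEnv(gym.Env):
    """Toy single-stock environment: observation = (cash, shares, price);
    action in [-1, 1] is the fraction of cash to spend (a > 0) or the
    fraction of holdings to sell (a < 0)."""

    def __init__(self, prices, initial_cash=1e6):
        super().__init__()
        self.prices = np.asarray(prices, dtype=np.float32)
        self.initial_cash = initial_cash
        self.action_space = spaces.Box(-1.0, 1.0, shape=(1,), dtype=np.float32)
        self.observation_space = spaces.Box(
            -np.inf, np.inf, shape=(3,), dtype=np.float32)

    def _obs(self):
        return np.array(
            [self.cash, self.shares, self.prices[self.t]], dtype=np.float32)

    def reset(self, seed=None, options=None):
        super().reset(seed=seed)
        self.t, self.cash, self.shares = 0, float(self.initial_cash), 0.0
        return self._obs(), {}

    def step(self, action):
        price = float(self.prices[self.t])
        a = float(np.clip(action[0], -1.0, 1.0))
        if a > 0:                       # buy: spend a fraction of cash
            spend = a * self.cash
            self.shares += spend / price
            self.cash -= spend
        else:                           # sell: liquidate a fraction of shares
            sold = -a * self.shares
            self.cash += sold * price
            self.shares -= sold
        value_before = self.cash + self.shares * price
        self.t += 1
        terminated = self.t >= len(self.prices) - 1
        value_after = self.cash + self.shares * float(self.prices[self.t])
        reward = value_after - value_before   # change in portfolio value
        return self._obs(), reward, terminated, False, {}


def evaluate(model, env):
    """Annualized return and Sharpe ratio of one deterministic rollout,
    assuming the conventional 252 trading days per year."""
    obs, _ = env.reset()
    values, done = [env.cash], False
    while not done:
        action, _ = model.predict(obs, deterministic=True)
        obs, _, done, _, _ = env.step(action)
        values.append(env.cash + env.shares * float(env.prices[env.t]))
    values = np.array(values)
    daily = np.diff(values) / values[:-1]
    annual_return = (values[-1] / values[0]) ** (252 / len(daily)) - 1
    sharpe = np.sqrt(252) * daily.mean() / (daily.std() + 1e-12)
    return annual_return, sharpe


# Synthetic random-walk prices stand in for historical stock data.
rng = np.random.default_rng(0)
prices = 100 * np.exp(np.cumsum(rng.normal(0.0, 0.01, 500)))

env = SingleStockEnv(prices)
for algo in (DDPG, A2C):
    model = algo("MlpPolicy", env, verbose=0)
    model.learn(total_timesteps=10_000)
    ret, sharpe = evaluate(model, env)
    print(f"{algo.__name__}: annualized return={ret:.2%}, Sharpe={sharpe:.2f}")
```

In the paper's setting, the synthetic series would be replaced by historical Tehran Stock Exchange prices, and the two trained agents would be compared on held-out data using the same annualized-return and Sharpe-ratio criteria.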

Author Biography

Neda Yousefi, Allameh Tabataba’i University, Tehran, Iran

Neda Yousefi received her bachelor’s degree in Applied Mathematics and her first master’s degree in Industrial Engineering (Economic and Social Systems Engineering) from Amirkabir University of Technology (Tehran Polytechnic), Iran. In 2021, she received her second master’s degree in Computer Science (Soft Computing and Artificial Intelligence) from Allameh Tabataba’i University, Tehran, Iran. Her research interests are in Machine Learning, Deep Learning, Computer Vision, and Image Processing.

Published

2022-11-16

How to Cite

Yousefi, N. (2022). Deep Reinforcement Learning for Tehran Stock Trading. Journal of Novel Engineering Science and Technology, 1(02), 37–42. https://doi.org/10.56741/jnest.v1i02.171
