Citation: WANG X Z, WANG M, LUO W. Intelligent decision technology in combat deduction based on soft actor-critic algorithm[J]. Chinese Journal of Ship Research, 2021, 16(6): 99–108. DOI: 10.19693/j.issn.1673-3185.02099
[1] HU H, WU Z Q. Research on the current application and development trend of artificial intelligence technology in US military intelligence work[J]. National Defense Science & Technology, 2020, 41(2): 15–20 (in Chinese).
[2] FU C J, ZHENG W M, GE L, et al. Application of artificial intelligence in combat simulation[J]. Radio Engineering, 2020, 50(4): 257–261 (in Chinese). doi: 10.3969/j.issn.1003-3106.2020.04.001
[3] SUN P, TAN Y X, LI L Y. Research on external decision model of army operational simulation based on situation description[J]. Command Control & Simulation, 2016, 38(2): 15–19 (in Chinese). doi: 10.3969/j.issn.1673-3819.2016.02.004
[4] DONG Q, JI M Q, ZHU Y F, et al. Behavioral tree modeling and simulation for air operations decision[J]. Command Control & Simulation, 2019, 41(1): 12–19 (in Chinese). doi: 10.3969/j.issn.1673-3819.2019.01.003
[5] PENG X L, WANG J K, ZHANG C, et al. The technology of wargame based on intelligent decision[C]//Proceedings of the 7th China Command and Control Conference. Beijing: Chinese Institute of Command and Control, 2019: 193–198 (in Chinese).
[6] LIAO X, SUN Z H. Exploration on application of intelligent decision-making in battle deduction simulation[C]//Proceedings of the 20th China Annual Conference on System Simulation Technology and its Application. Urumqi: System Simulation Committee of China Automation Society, 2019: 368–374 (in Chinese).
[7] CUI W H, LI D, TANG Y B, et al. Framework of wargaming decision-making methods based on deep reinforcement learning[J]. National Defense Science & Technology, 2020, 41(2): 113–121 (in Chinese).
[8] HAARNOJA T, ZHOU A, ABBEEL P, et al. Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor[C]//Proceedings of the 35th International Conference on Machine Learning. Stockholm, Sweden: PMLR, 2018: 1861–1870.
[9] SUTTON R S, BARTO A G. Reinforcement learning: an introduction[M]. Cambridge, MA: MIT Press, 1998.
[10] SPIELBERG S, GOPALUNI R, LOEWEN P. Deep reinforcement learning approaches for process control[C]//2017 6th International Symposium on Advanced Control of Industrial Processes (AdCONIP). [S.l.]: IEEE, 2017: 201–203.
[11] HAARNOJA T, ZHOU A, HARTIKAINEN K, et al. Soft actor-critic algorithms and applications[EB/OL]. arXiv: 1812.05905 (2018-12-13)[2020-08-30]. https://arxiv.org/abs/1812.05905.
[12] MNIH V, KAVUKCUOGLU K, SILVER D, et al. Human-level control through deep reinforcement learning[J]. Nature, 2015, 518(7540): 529–533. doi: 10.1038/nature14236
[13] SCHULMAN J, CHEN X, ABBEEL P. Equivalence between policy gradients and soft Q-learning[EB/OL]. arXiv: 1704.06440 (2017-04-21)[2020-08-30]. https://arxiv.org/pdf/1704.06440.pdf.
[14] HAARNOJA T, TANG H, ABBEEL P, et al. Reinforcement learning with deep energy-based policies[C]//Proceedings of the 34th International Conference on Machine Learning. Sydney, Australia: PMLR, 2017: 1352–1361.
[15] LILLICRAP T P, HUNT J J, PRITZEL A, et al. Continuous control with deep reinforcement learning[C]//Proceedings of the 4th International Conference on Learning Representations. San Juan, Puerto Rico, 2016.