阿法狗布局实战（阿法狗-第二期-蒙特卡洛模拟）

神兽 2023-08-26 06:59:09 741

阿法狗布局实战（阿法狗-第二期-蒙特卡洛模拟）在模拟结果之前，我们需要先计算庄家的赢面。而庄家的赢面代表着赌场的平均利润。假设有一个游戏，而我们的玩家是Jack，摇一个骰子得出一个随机数1～100。如果Jack 得出1～51，那么赌场赢；但是如果Jack得出52～100，那么Jack赢如果有对金融有一些了解的朋友，可能知道在期权Option定价的过程中有用到蒙特卡洛模拟。期权的定价就是典型的面对不确定性和随机性的定价。而面对随机性最简单的方式就是抽样或者实验。一个简单的例子，假设我们要知道骰子摇到 6 的概率，最简单的方式就是摇很多次骰子，然后最后统计下。蒙特卡洛本质就这么简单，就是摇骰子。下面将通过翻译 Toward DataScience的一个关于蒙特卡洛模拟的文章及代码来进一步解释该问题。

背景

本系列上一期已经涵盖了 97年IBM深蓝打败国际象棋大师的Minimax算法。而从上一期大家能看出，Minimax 属于穷举的方法，虽然有Alpga-Beta Pruning等优化，但是其本质还是属于穷举法之一。而也是因为其属于穷举法范畴，Minimax无法击败围棋大师。

而大家众所周知的阿法狗已经成功打败了人类围棋大师。而阿法狗采用了 MCTS（蒙特卡洛树搜索）和深度机器学习。其中非常重要的一个不同点就是，解决如何面对不确定性和随机性的问题。Minimax是使用穷举法来面对棋局变化的随机性，而阿法狗是通过MCTS，这其中的不同本系列将会覆盖。

而本期将给大家介绍蒙特卡洛模拟，Monte Carlo方法是应对随机性的一个有效方法。而了解蒙特卡洛模拟也是搞懂 MCTS（蒙特卡洛树搜索）的前提。

本期的文章除了开头及算法理解部分，其他均由翻译Toward DataScience文章而来。

蒙特卡洛模拟

如果有对金融有一些了解的朋友，可能知道在期权Option定价的过程中有用到蒙特卡洛模拟。期权的定价就是典型的面对不确定性和随机性的定价。而面对随机性最简单的方式就是抽样或者实验。

一个简单的例子，假设我们要知道骰子摇到 6 的概率，最简单的方式就是摇很多次骰子，然后最后统计下。蒙特卡洛本质就这么简单，就是摇骰子。

下面将通过翻译 Toward DataScience的一个关于蒙特卡洛模拟的文章及代码来进一步解释该问题。

蒙特卡洛模拟

假设有一个游戏，而我们的玩家是Jack，摇一个骰子得出一个随机数1～100。如果Jack 得出1～51，那么赌场赢；但是如果Jack得出52～100，那么Jack赢

阿法狗布局实战（阿法狗-第二期-蒙特卡洛模拟）(1)

在模拟结果之前，我们需要先计算庄家的赢面。而庄家的赢面代表着赌场的平均利润。

假设Jack下注一元在此游戏：

Jack赢的概率=49/100

赌场赢的概率=51/100

Jack的期望利润= 1* （ 49/100） - 1 * （51/100） = -0.02 = - 2%

因此，赌场的期望收益率是2%

现在，用Python来模拟不同的场景下，可视化玩家一直下注的结果

引入相关的Python Lib

#Import libraries import random import matplotlib.pyplot as plt

我们需要一个骰子，骰子随机产生1～100的平均概率分布。并且如果玩家获胜，骰子 return “True”；如果赌场获胜，return “False”

#Create function for simulating die roll #The die can take values from 1 to 100. If the number is between 1 #and 51 the house wins. #If the number is between 52 and 100 the player wins. def rolldice(): dice = random.randint(1 100) if dice <=51: return False elif dice >51 & dice <=100: return True

创建一个公式来模拟下注的过程。我们需要提供三个变量：

总资金金额：玩家初始的资金数（¥10000）
下注金额：每局的下注金额（Y100）
总局数：玩家参与的总局数

#Define a function for the play which takes 3 arguments : #1. total_funds = total money in hand the player is starting with #2. wager_amount = the betting amount each time the player plays #3. total_plays = the number of times the player bets on this game def play(total_funds wager_amount total_plays): #Create empty lists for : # 1.Play_number and # 2.Funds available # 3.Final Fund Play_num = [] Funds = [] #Start with play number 1 play = 1 #If number of plays is less than the max number of plays we have set while play < total_plays: #If we win if rolldice(): #Add the money to our funds total_funds = total_funds wager_amount #append the play number Play_num.append(play) #Append the new fund amount Funds.append(total_funds) #If the house wins else: #Add the money to our funds total_funds = total_funds - wager_amount #Append the play number Play_num.append(play) #Append the new fund amount Funds.append(total_funds) #Increase the play number by 1 play = play 1 #Line plot of funds over time plt.plot(Play_num Funds) Final_funds.append(Funds[-1]) return(Final_funds)

最后，循环上述代码，模拟不同局数下的结果

#Call the function to simulate the plays and calculate the remaining #funds of the player after all the bets #Intialize the scenario number to 1 x=1 #Create a list for calculating final funds Final_funds= [] while x<=100: ending_fund = play(10000 100 5) x=x 1 #Plot the line plot of "Account Value" vs "The number of plays" plt.ylabel('Player Money in $') plt.xlabel('Number of bets') plt.show() #Print the money the player ends with print("The player starts the game with $10 000 and ends with $" str(sum(ending_fund)/len(ending_fund)))