Python Machine Learning Basics Tutorial (3 Simple Hands-On Exercises with Python Libraries for Machine Learning)

Translator | 婉清

Editor | 姗姗

Produced by | 人工智能头条

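[Introduction] This article introduces some excellent and interesting Python libraries for machine learning and deep learning, together with hands-on code walkthroughs for each. Where theory or academic material comes up, links to the relevant papers and blog posts are included for reference.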

01

sg2im: Generating Images from Scene Graphs

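This open-source project uses graph convolution to process an input scene graph, computes a scene layout by predicting bounding boxes and segmentation masks for the objects, and converts the layout into an image with a cascaded refinement network.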

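The input scene graph is processed by a graph convolution network, which passes information along the edges of the graph to compute an embedding vector for every object. These vectors are used to predict a bounding box and a segmentation mask for each object, and the boxes and masks are combined into a coarse scene layout. The layout is passed to a cascaded refinement network, which generates the output image at increasing spatial scales. The model is trained adversarially against a pair of discriminator networks so that the output images look realistic.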
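To make "passing information along the edges" more concrete, here is a minimal NumPy sketch of one message-passing step over a scene graph. It is an illustration under simplifying assumptions, not the sg2im implementation: the function name message_passing_step, the single weight matrix W, and the averaging rule are hypothetical choices made for this example, and the relationship labels are ignored.

```python
import numpy as np

def message_passing_step(obj_vecs, edges, W):
    """One simplified message-passing step (illustrative, not sg2im's layer).

    obj_vecs: (num_objects, dim) array of object embeddings.
    edges:    list of (subject_index, object_index) pairs.
    W:        (dim, dim) weight matrix applied to every message.
    """
    num_objects, dim = obj_vecs.shape
    pooled = np.zeros((num_objects, dim))
    counts = np.zeros(num_objects)
    for s, o in edges:
        # Each edge sends a message in both directions.
        pooled[s] += obj_vecs[o] @ W
        pooled[o] += obj_vecs[s] @ W
        counts[s] += 1
        counts[o] += 1
    # Average the incoming messages; isolated objects receive a zero message.
    pooled /= np.maximum(counts, 1)[:, None]
    # Combine each object's own vector with its pooled messages, then apply ReLU.
    return np.maximum(obj_vecs + pooled, 0.0)

# Toy graph: sky (0) above grass (1); zebra (2) standing on grass (1).
vecs = np.eye(3)
edges = [(0, 1), (2, 1)]
out = message_passing_step(vecs, edges, W=np.eye(3))
print(out)
```

After one step, the embedding of grass mixes in information from both sky and zebra, because it is connected to both. Stacking several such steps lets information travel along longer paths in the graph.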

Paper:

https://arxiv.org/abs/1804.01622

GitHub:

https://github.com/google/sg2im

For the cascaded refinement network, see the paper:

Photographic Image Synthesis with Cascaded Refinement Networks

https://arxiv.org/abs/1707.09405

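▌How do you run and test the code?

First, clone the repository:

git clone https://github.com/google/sg2im.git

The original code was developed and tested on Ubuntu 16.04 with Python 3.5 and PyTorch 0.4. It is recommended to run it inside a virtual environment, which can be set up as follows: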

python3 -m venv env                   # Create a virtual environment
source env/bin/activate               # Activate virtual environment
pip install -r requirements.txt       # Install dependencies
echo $PWD > env/lib/python3.5/site-packages/sg2im.pth  # Add current directory to python path
# Work for a while ...
deactivate                            # Exit virtual environment

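Note: python-venv must be installed. The following variant, which adds the --without-pip flag and targets Python 3.6, can be used as a reference: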

python3 -m venv --without-pip env     # Added the --without-pip
source env/bin/activate               # Activate virtual environment
pip install -r requirements.txt       # Install dependencies
echo $PWD > env/lib/python3.6/site-packages/sg2im.pth  # Add current directory to python path
# Work for a while ...
deactivate                            # Exit virtual environment
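Both listings hard-code the interpreter version in the site-packages path (python3.5 in one, python3.6 in the other), so the echo line has to be adjusted by hand to match the environment. As a hedged alternative, the path can be looked up from inside the activated environment with the standard-library sysconfig module. The helper name write_pth_file is hypothetical and not part of sg2im.

```python
import os
import sysconfig

def write_pth_file(project_dir, site_packages=None, name="sg2im.pth"):
    """Write a .pth file so that project_dir is added to sys.path at startup.

    write_pth_file is a hypothetical helper for this article.
    """
    if site_packages is None:
        # "purelib" is the site-packages directory of the running interpreter,
        # so no Python version needs to be spelled out in the path.
        site_packages = sysconfig.get_paths()["purelib"]
    pth_path = os.path.join(site_packages, name)
    with open(pth_path, "w") as f:
        f.write(os.path.abspath(project_dir) + "\n")
    return pth_path

# Usage, from the repository root with the virtual environment activated:
#     write_pth_file(os.getcwd())
```

Because the lookup happens in the interpreter that runs the script, the same call works whichever Python version the environment was created with.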

You also need to delete the line pkg-resources==0.0.0 from requirements.txt; otherwise an error occurs. For why this entry exists and why it should be removed, see the linked answer.

Reference:

https://stackoverflow.com/questions/39577984/what-is-pkg-resources-0-0-0-in-output-of-pip-freeze-command/39638060
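Removing the entry by hand is easy, but the step can also be scripted. The sketch below filters the pkg-resources pin out of a requirements file. The function names are hypothetical, and the package versions in the example list are made up for illustration rather than copied from the real requirements.txt.

```python
def strip_pkg_resources(lines):
    """Return the requirement lines without any pkg-resources pin."""
    kept = []
    for line in lines:
        # Normalise underscores so that both "pkg-resources" and
        # "pkg_resources" spellings are matched.
        name = line.split("==")[0].strip().lower().replace("_", "-")
        if name != "pkg-resources":
            kept.append(line)
    return kept

def clean_requirements(path="requirements.txt"):
    """Rewrite the file at path in place without the pkg-resources line."""
    with open(path) as f:
        lines = f.readlines()
    with open(path, "w") as f:
        f.writelines(strip_pkg_resources(lines))

# Illustrative input; these version numbers are invented for the example.
example = ["numpy==1.14.0\n", "pkg-resources==0.0.0\n", "torch==0.4.0\n"]
print(strip_pkg_resources(example))
```

Calling clean_requirements() from the repository root before pip install -r requirements.txt has the same effect as deleting the line manually.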

Next, run the pretrained models.

First run the script bash scripts/download_models.sh to download the models; this takes about 355 MB of disk space.

  • sg2im-models/coco64.pt: a model trained on the COCO-Stuff dataset that generates 64x64 images.

  • sg2im-models/vg64.pt: a model trained on the Visual Genome dataset that generates 64x64 images.

  • sg2im-models/vg128.pt: a model trained on the Visual Genome dataset that generates 128x128 images.

Reference paper:

Image Generation from Scene Graphs

https://arxiv.org/pdf/1804.01622.pdf

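With the script scripts/run_model.py, you can easily run any of the pretrained models on new scene graphs written in a simple, human-readable JSON format. To recreate the sheep images, run the following command: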

python scripts/run_model.py \
  --checkpoint sg2im-models/vg128.pt \
  --scene_graphs scene_graphs/figure_6_sheep.json \
  --output_dir outputs

The generated images, one for each scene graph in the file, are written to the outputs directory.

Now let's take a look at the scene graph file itself:

[
  {
    "objects": ["sky", "grass", "zebra"],
    "relationships": [
      [0, "above", 1],
      [2, "standing on", 1]
    ]
  },
  {
    "objects": ["sky", "grass", "sheep"],
    "relationships": [
      [0, "above", 1],
      [2, "standing on", 1]
    ]
  },
  {
    "objects": ["sky", "grass", "sheep", "sheep"],
    "relationships": [
      [0, "above", 1],
      [2, "standing on", 1],
      [3, "by", 2]
    ]
  },
  {
    "objects": ["sky", "grass", "sheep", "sheep", "tree"],
    "relationships": [
      [0, "above", 1],
      [2, "standing on", 1],
      [3, "by", 2],
      [4, "behind", 2]
    ]
  },
  {
    "objects": ["sky", "grass", "sheep", "sheep", "tree", "ocean"],
    "relationships": [
      [0, "above", 1],
      [2, "standing on", 1],
      [3, "by", 2],
      [4, "behind", 2],
      [5, "by", 4]
    ]
  },
  {
    "objects": ["sky", "grass", "sheep", "sheep", "tree", "ocean", "boat"],
    "relationships": [
      [0, "above", 1],
      [2, "standing on", 1],
      [3, "by", 2],
      [4, "behind", 2],
      [5, "by", 4],
      [6, "in", 5]
    ]
  },
  {
    "objects": ["sky", "grass", "sheep", "sheep", "tree", "ocean", "boat"],
    "relationships": [
      [0, "above", 1],
      [2, "standing on", 1],
      [3, "by", 2],
      [4, "behind", 2],
      [5, "by", 4],
      [6, "on", 1]
    ]
  }
]

Start with the first scene graph:

{
  "objects": ["sky", "grass", "zebra"],
  "relationships": [
    [0, "above", 1],
    [2, "standing on", 1]
  ]
}

Objects: sky [0], grass [1], zebra [2]

Relationships: sky [0] is above grass [1] ("above")

zebra [2] is standing on grass [1] ("standing on")
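The same reading can be done programmatically: each relationship is a triple of subject index, predicate, and object index, where the indices point into the objects list. The helper below is a hypothetical illustration of that decoding and is not part of sg2im.

```python
import json

def describe_scene_graph(graph):
    """Turn [subject, predicate, object] index triples into readable phrases."""
    objects = graph["objects"]
    return [
        "{} [{}] {} {} [{}]".format(objects[s], s, predicate, objects[o], o)
        for s, predicate, o in graph["relationships"]
    ]

text = '''
{
  "objects": ["sky", "grass", "zebra"],
  "relationships": [
    [0, "above", 1],
    [2, "standing on", 1]
  ]
}
'''
for phrase in describe_scene_graph(json.loads(text)):
    print(phrase)
# sky [0] above grass [1]
# zebra [2] standing on grass [1]
```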

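You can also write a similar scene graph of your own to try the model out (saved here as scene_graphs/figure_blog.json, the path used in the command below):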

[{
  "objects": ["sky", "grass", "dog", "cat", "tree", "ocean", "boat"],
  "relationships": [
    [0, "above", 1],
    [2, "standing on", 1],
    [3, "by", 2],
    [4, "behind", 2],
    [5, "by", 4],
    [6, "on", 1]
  ]
}]
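When writing scene graphs by hand it is easy to mistype an index, so a small sanity check before running the model can save time. The sketch below validates the index ranges and writes the file as JSON. Both function names are hypothetical. The check covers structure only; it does not verify that the object or predicate names are known to a particular checkpoint.

```python
import json
import os

def validate_scene_graph(graph):
    """Check that every relationship refers to existing object indices."""
    n = len(graph["objects"])
    for s, predicate, o in graph["relationships"]:
        if not isinstance(predicate, str):
            raise ValueError("predicate must be a string: {!r}".format(predicate))
        if not (0 <= s < n and 0 <= o < n):
            raise ValueError("index out of range in {!r}".format([s, predicate, o]))
    return True

def save_scene_graphs(graphs, path):
    """Validate a list of scene graphs and write them to path as JSON."""
    for g in graphs:
        validate_scene_graph(g)
    os.makedirs(os.path.dirname(path) or ".", exist_ok=True)
    with open(path, "w") as f:
        json.dump(graphs, f, indent=2)

graph = {
    "objects": ["sky", "grass", "dog", "cat", "tree", "ocean", "boat"],
    "relationships": [
        [0, "above", 1],
        [2, "standing on", 1],
        [3, "by", 2],
        [4, "behind", 2],
        [5, "by", 4],
        [6, "on", 1],
    ],
}
print(validate_scene_graph(graph))

# Usage, from the repository root:
#     save_scene_graphs([graph], "scene_graphs/figure_blog.json")
```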

Run:

python scripts/run_model.py \
  --checkpoint sg2im-models/vg128.pt \
  --scene_graphs scene_graphs/figure_blog.json \
  --output_dir outputs

The resulting image looks a little strange, but the process is still a lot of fun.

02

TheAlgorithms/Python:

All Algorithms Implemented in Python

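Programming is an essential skill in data science, and this great repository collects implementations of many important algorithms. They are intended for demonstration only: for performance reasons, there are often better implementations in the Python standard library.

In this repository you can find machine learning code, neural networks, dynamic programming, sorting, hashing, and more. The code below shows how to build K-means from scratch in Python with NumPy (it also uses pairwise_distances from scikit-learn for the distance computation).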

'''README, Author - Anurag Kumar(mailto:anuragkumarak95@gmail.com)

Requirements:
  - sklearn
  - numpy
  - matplotlib

Python:
  - 3.5

Inputs:
  - X , a 2D numpy array of features.
  - k , number of clusters to create.
  - initial_centroids , initial centroid values generated by utility function(mentioned in usage).
  - maxiter , maximum number of iterations to process.
  - heterogeneity , empty list that will be filled with heterogeneity values if passed to kmeans func.

Usage:
  1. define 'k' value, 'X' features array and 'heterogeneity' empty list

  2. create initial_centroids,
        initial_centroids = get_initial_centroids(
            X,
            k,
            seed=0 # seed value for initial centroid generation, None for randomness(default=None)
            )

  3. find centroids and clusters using kmeans function.

        centroids, cluster_assignment = kmeans(
            X,
            k,
            initial_centroids,
            maxiter=400,
            record_heterogeneity=heterogeneity,
            verbose=True # whether to print logs in console or not.(default=False)
            )


  4. Plot the loss function, heterogeneity values for every iteration saved in heterogeneity list.
        plot_heterogeneity(
            heterogeneity,
            k
        )

  5. Have fun..

'''
from __future__ import print_function
from sklearn.metrics import pairwise_distances
import numpy as np

TAG = 'K-MEANS-CLUST/'

def get_initial_centroids(data, k, seed=None):
    '''Randomly choose k data points as initial centroids'''
    if seed is not None:  # useful for obtaining consistent results
        np.random.seed(seed)
    n = data.shape[0]  # number of data points

    # Pick K indices from range [0, N).
    rand_indices = np.random.randint(0, n, k)

    # Keep centroids as dense format, as many entries will be nonzero due to averaging.
    # As long as at least one document in a cluster contains a word,
    # it will carry a nonzero weight in the TF-IDF vector of the centroid.
    centroids = data[rand_indices, :]

    return centroids

def centroid_pairwise_dist(X, centroids):
    return pairwise_distances(X, centroids, metric='euclidean')

def assign_clusters(data, centroids):

    # Compute distances between each data point and the set of centroids:
    # Fill in the blank (RHS only)
    distances_from_centroids = centroid_pairwise_dist(data, centroids)

    # Compute cluster assignments for each data point:
    # Fill in the blank (RHS only)
    cluster_assignment = np.argmin(distances_from_centroids, axis=1)

    return cluster_assignment

def revise_centroids(data, k, cluster_assignment):
    new_centroids = []
    for i in range(k):
        # Select all data points that belong to cluster i. Fill in the blank (RHS only)
        member_data_points = data[cluster_assignment == i]
        # Compute the mean of the data points. Fill in the blank (RHS only)
        centroid = member_data_points.mean(axis=0)
        new_centroids.append(centroid)
    new_centroids = np.array(new_centroids)

    return new_centroids

def compute_heterogeneity(data, k, centroids, cluster_assignment):

    heterogeneity = 0.0
    for i in range(k):

        # Select all data points that belong to cluster i. Fill in the blank (RHS only)
        member_data_points = data[cluster_assignment == i, :]

        if member_data_points.shape[0] > 0:  # check if i-th cluster is non-empty
            # Compute distances from centroid to data points (RHS only)
            distances = pairwise_distances(member_data_points, [centroids[i]], metric='euclidean')
            squared_distances = distances**2
            # Accumulate the sum of squared distances over all clusters
            heterogeneity += np.sum(squared_distances)

    return heterogeneity

from matplotlib import pyplot as plt
def plot_heterogeneity(heterogeneity, k):
    plt.figure(figsize=(7, 4))
    plt.plot(heterogeneity, linewidth=4)
    plt.xlabel('# Iterations')
    plt.ylabel('Heterogeneity')
    plt.title('Heterogeneity of clustering over time, K={0:d}'.format(k))
    plt.rcParams.update({'font.size': 16})
    plt.show()

def kmeans(data, k, initial_centroids, maxiter=500, record_heterogeneity=None, verbose=False):
    '''This function runs k-means on given data and initial set of centroids.
       maxiter: maximum number of iterations to run.(default=500)
       record_heterogeneity: (optional) a list, to store the history of heterogeneity as function of iterations
                             if None, do not store the history.
       verbose: if True, print how many data points changed their cluster labels in each iteration'''
    centroids = initial_centroids[:]
    prev_cluster_assignment = None

    for itr in range(maxiter):
        if verbose:
            print(itr, end='')

        # 1. Make cluster assignments using nearest centroids
        cluster_assignment = assign_clusters(data, centroids)

        # 2. Compute a new centroid for each of the k clusters, averaging all data points assigned to that cluster.
        centroids = revise_centroids(data, k, cluster_assignment)

        # Check for convergence: if none of the assignments changed, stop
        if prev_cluster_assignment is not None and \
                (prev_cluster_assignment == cluster_assignment).all():
            break

        # Print number of new assignments
        if prev_cluster_assignment is not None:
            num_changed = np.sum(prev_cluster_assignment != cluster_assignment)
            if verbose:
                print('{0:5d} elements changed their cluster assignment.'.format(num_changed))

        # Record heterogeneity convergence metric
        if record_heterogeneity is not None:
            # YOUR CODE HERE
            score = compute_heterogeneity(data, k, centroids, cluster_assignment)
            record_heterogeneity.append(score)

        prev_cluster_assignment = cluster_assignment[:]

    return centroids, cluster_assignment

# Mock test below
if False:  # change to true to run this test case.
    import sklearn.datasets as ds
    dataset = ds.load_iris()
    k = 3
    heterogeneity = []
    initial_centroids = get_initial_centroids(dataset['data'], k, seed=0)
    centroids, cluster_assignment = kmeans(dataset['data'], k, initial_centroids, maxiter=400,
                                           record_heterogeneity=heterogeneity, verbose=True)
    plot_heterogeneity(heterogeneity, k)
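The listing above relies on scikit-learn for the distance computation. For comparison, the same assign-then-update loop can be condensed into a NumPy-only sketch. This is an independent illustration, not code from TheAlgorithms repository: the function name kmeans_numpy and the synthetic two-blob data are assumptions made for this example. It also differs from the listing in two deliberate ways. Initial centroids are sampled without replacement (np.random.randint in the listing can pick the same point twice), and an empty cluster keeps its previous centroid instead of becoming NaN from the mean of an empty array.

```python
import numpy as np

def kmeans_numpy(X, k, maxiter=100, seed=0):
    """Lloyd's algorithm with NumPy only (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    # Sample k distinct rows so that no two initial centroids coincide.
    centroids = X[rng.choice(X.shape[0], size=k, replace=False)].astype(float)
    labels = None
    for _ in range(maxiter):
        # Squared Euclidean distance from every point to every centroid.
        d2 = ((X[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        new_labels = d2.argmin(axis=1)
        # Converged when no point changes its cluster.
        if labels is not None and np.array_equal(new_labels, labels):
            break
        labels = new_labels
        for i in range(k):
            members = X[labels == i]
            if len(members) > 0:  # an empty cluster keeps its old centroid
                centroids[i] = members.mean(axis=0)
    # Heterogeneity: sum of squared distances to the assigned centroids.
    heterogeneity = ((X - centroids[labels]) ** 2).sum()
    return centroids, labels, heterogeneity

# Synthetic data: two well-separated blobs of 50 points each.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0.0, 0.5, (50, 2)), rng.normal(5.0, 0.5, (50, 2))])
centroids, labels, h = kmeans_numpy(X, k=2)
print(centroids)
print(h)
```

On data like this, the two centroids should end up near (0, 0) and (5, 5). The broadcasting expression for d2 builds an array of shape (n, k, dim), which is fine for small inputs but is one reason a library routine is preferable on large data.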

GitHub: https://github.com/TheAlgorithms
