


Here Is the Windows Deep Learning Software Installation Guide You Need


Reposted with permission from Machine Heart (WeChat official account: almosthuman2014); further reposting is prohibited.

從零開(kāi)始:深度學(xué)習(xí)軟件環(huán)境安裝指南(Ubuntu)本文gitHub地址:https://github.com/philferriere/dlwin

This configuration was last updated in July of this year; the update enables three different GPU-accelerated backends locally and adds support for the MKL BLAS library.

There are many guides to building a deep learning (DL) environment on Linux or Mac OS, but few articles fully describe how to configure a deep learning development environment efficiently on Windows 10. Moreover, many developers set up a Windows/Ubuntu dual-boot system or run a virtual machine on Windows for deep learning, but for beginners it is preferable to configure the environment directly on Windows. The author, Phil Ferriere, therefore published this tutorial on GitHub, hoping to build a Keras deep learning development environment step by step, starting from the most basic environment-variable configuration.

If you want to set up a deep learning environment on Windows 10, this article provides plenty of useful information.

01 依賴(lài)項(xiàng)

Below are the tools and packages we need to configure a deep learning environment on Windows 10 (Version 1607, OS Build 14393.222):

- Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0: used for its C/C++ compiler (not the IDE) and SDK; this exact version is chosen because it is the Windows compiler supported by CUDA 8.0.61.
- Anaconda (64-bit) w. Python 3.6 (Anaconda3-4.4.0) [for TensorFlow support] or Python 2.7 (Anaconda2-4.4.0) [no TensorFlow support] with MKL: Anaconda is an open-source Python distribution that bundles conda, Python, and more than 180 scientific packages with their dependencies, such as NumPy and SciPy; MKL uses the CPU to accelerate many linear-algebra operations.
- CUDA 8.0.61 (64-bit): CUDA is a general-purpose parallel computing architecture from NVIDIA that lets the GPU tackle complex computational problems; the package provides GPU math libraries, the display driver, the CUDA compiler, and more.
- cuDNN v5.1 (Jan 20, 2017) for CUDA 8.0: accelerates the computations of convolutional neural networks.
- Keras 2.0.5 with three different backends: Theano 0.9.0, TensorFlow-gpu 1.2.0, and CNTK 2.0: Keras uses Theano, TensorFlow, or CNTK as a backend and provides a high-level deep learning API. Different backends behave differently in tensor math and elsewhere.

02 Hardware

Dell Precision T7900, 64GB RAM: Intel Xeon E5-2630 v4 @ 2.20 GHz (1 processor, 10 cores, 20 logical processors)
NVIDIA GeForce Titan X, 12GB RAM: driver version 372.90 / Win 10 64

03 Installation Steps

We may prefer to keep all toolkits and packages under a single root directory (e.g. e:\toolkits.win), so whenever you see a path beginning with e:\toolkits.win below, take care not to overwrite or casually change the required package directories.

Visual Studio 2015 Community Edition Update 3 w. Windows Kit 10.0.10240.0
Download: https://www.visualstudio.com/vs/older-downloads

運(yùn)行下載的軟件包以安裝 Visual Studio,可能我們還需要做一些額外的配置:


- Based on where we installed VS 2015, add C:\Program Files (x86)\Microsoft Visual Studio 14.0\VC\bin to PATH.
- Define the system environment variable (sysenv variable) INCLUDE with the value C:\Program Files (x86)\Windows Kits\10\Include\10.0.10240.0\ucrt
- Define the system environment variable (sysenv variable) LIB with the value C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\um\x64;C:\Program Files (x86)\Windows Kits\10\Lib\10.0.10240.0\ucrt\x64
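These three settings are easy to get wrong, and a missing one only surfaces later as a confusing compiler error during a Theano build. The helper below is a hypothetical sketch, not part of the guide; missing_build_vars and REQUIRED_VARS are invented names.

```python
import os

# Variables the build toolchain expects, per the steps above.
REQUIRED_VARS = ("INCLUDE", "LIB")

def missing_build_vars(env=None):
    """Return the names of required sysenv variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in REQUIRED_VARS if not env.get(name)]
```

Running missing_build_vars() in the same shell you compile from tells you immediately whether the configuration above took effect.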

Anaconda 4.4.0 (64-bit) (Python 3.6 with TF support / Python 2.7 without TF support)

This tutorial originally used Python 2.7, but with TensorFlow now available as a Keras backend we decided to make Python 3.6 the default configuration. Depending on your preferred configuration, set the Anaconda installation folder to e:\toolkits.win\anaconda3-4.4.0 or e:\toolkits.win\anaconda2-4.4.0.

Anaconda for Python 3.6: https://repo.continuum.io/archive/Anaconda3-4.4.0-Windows-x86_64.exe
Anaconda for Python 2.7: https://repo.continuum.io/archive/Anaconda2-4.4.0-Windows-x86_64.exe


運(yùn)行安裝程序完成安裝:


As shown above, this tutorial chose the second option, though it is not necessarily the best choice.

Define the following variables and update PATH:

- Define the system environment variable (sysenv variable) PYTHON_HOME with the value e:\toolkits.win\anaconda3-4.4.0
- Add %PYTHON_HOME%, %PYTHON_HOME%\Scripts and %PYTHON_HOME%\Library\bin to PATH
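The PATH update above can also be scripted. The sketch below is illustrative only (anaconda_path_entries and prepend_to_path are invented names); on a real system you would persist the change through the System Properties dialog or setx rather than os.environ.

```python
import os

def anaconda_path_entries(python_home):
    """The three directories the guide adds to PATH for Anaconda:
    %PYTHON_HOME%, %PYTHON_HOME%\\Scripts and %PYTHON_HOME%\\Library\\bin."""
    return [
        python_home,
        os.path.join(python_home, "Scripts"),
        os.path.join(python_home, "Library", "bin"),
    ]

def prepend_to_path(entries, path=None):
    """Prepend entries to a PATH-style string, skipping ones already present."""
    path = os.environ.get("PATH", "") if path is None else path
    existing = path.split(os.pathsep) if path else []
    new = [e for e in entries if e not in existing]
    return os.pathsep.join(new + existing)
```

Prepending (rather than appending) matters here: it ensures the Anaconda interpreter shadows any other Python on the machine.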

創(chuàng)建 dlwin36 conda 環(huán)境

After installing Anaconda, open a Windows command prompt and run:

# Create the environment with:
$ conda create --yes -n dlwin36 numpy scipy mkl-service m2w64-toolchain libpython jupyter
# Activate the environment with:
# > activate dlwin36
#
# Deactivate the environment with:
# > deactivate dlwin36
#
# * for power-users using bash, you must source
#

As shown above, activate the new environment with the activate dlwin36 command. If an older dlwin36 environment already exists, remove it first with conda env remove -n dlwin36. Since we plan to use a GPU, why install a CPU-optimized linear-algebra library like MKL at all? In our setup, most of the deep learning work is indeed done by the GPU, but the CPU is not idle. An important part of image-based Kaggle competitions is data augmentation: the process of obtaining additional input samples (i.e. more training images) by transforming the original training samples with image-processing operators. Basic transforms such as downsampling and mean-centering normalization are also necessary. If that feels risky, you can try additional preprocessing augmentation (noise removal, histogram equalization, and so on). You could of course do all of this on the GPU and save the results to files, but in practice these computations usually run in parallel on the CPU while the GPU is busy learning the weights of the deep neural network, and the augmented data is discarded after use. This is why we strongly recommend installing MKL, and why Theano benefits from a good BLAS library.
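The CPU-side augmentation described above can be sketched in plain Python. This is an illustrative sketch, not code from the guide; the helper names (hflip, shift_right, augment) are invented, and a real pipeline would use NumPy or an image library for speed.

```python
import random

def hflip(image):
    """Horizontally flip an image given as a list of pixel rows."""
    return [list(reversed(row)) for row in image]

def shift_right(image, pixels, fill=0):
    """Shift an image right by `pixels` columns, padding with `fill`."""
    return [[fill] * pixels + row[:len(row) - pixels] for row in image]

def augment(image, rng=random):
    """Apply one randomly chosen basic transform (or leave the image unchanged)."""
    op = rng.choice([hflip, lambda im: shift_right(im, 1), lambda im: im])
    return op(image)
```

Because each transformed image is derived on the fly and discarded after the batch, this work naturally falls to the CPU while the GPU trains, which is exactly why an MKL-accelerated CPU stack still pays off.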

CUDA 8.0.61 (64-bit)

Download CUDA 8.0 (64-bit) from the NVIDIA website: https://developer.nvidia.com/cuda-downloads

Select the appropriate operating system.


Download the installer.


運(yùn)行安裝包,安裝文件到 e:toolkits.wincuda-8.0.61 中:


After installation, the installer should have created a system environment variable (sysenv variable) named CUDA_PATH and added %CUDA_PATH%\bin and %CUDA_PATH%\libnvvp to PATH. Check that this really happened; if the CUDA environment variables are missing for any reason, perform the following two steps:

- Define a system environment variable named CUDA_PATH with the value e:\toolkits.win\cuda-8.0.61
- Add %CUDA_PATH%\bin and %CUDA_PATH%\libnvvp to PATH

cuDNN v5.1 (Jan 20, 2017) for CUDA 8.0

根據(jù)英偉達(dá)官網(wǎng)「cuDNN 為標(biāo)準(zhǔn)的運(yùn)算如前向和反向卷積、池化、歸一化和激活層等提供高度調(diào)優(yōu)的實(shí)現(xiàn)」,它是為卷積神經(jīng)網(wǎng)絡(luò)和深度學(xué)習(xí)設(shè)計(jì)的一款加速方案。

Download cuDNN from: https://developer.nvidia.com/rdp/cudnn-download

We need the cuDNN package that matches our CUDA version and the Windows 10 compiler; in general, cuDNN 5.1 supports CUDA 8.0 and Windows 10.


The downloaded ZIP file contains three directories (bin, include, lib); extract these three folders into %CUDA_PATH%.
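Copying the three folders can be scripted as well. This is a sketch under the assumption that the zip has already been extracted to a local folder with bin, include, and lib at its top level (real cuDNN zips may nest them under a cuda directory); install_cudnn is an invented name.

```python
import os
import shutil

def install_cudnn(extracted_dir, cuda_path):
    """Merge the bin/include/lib trees from an extracted cuDNN zip into CUDA_PATH."""
    for sub in ("bin", "include", "lib"):
        src = os.path.join(extracted_dir, sub)
        if os.path.isdir(src):
            # dirs_exist_ok merges into the existing CUDA toolkit directories.
            shutil.copytree(src, os.path.join(cuda_path, sub), dirs_exist_ok=True)
```

Merging (rather than replacing) the directories is important, since the CUDA toolkit's own files already live under %CUDA_PATH%\bin and %CUDA_PATH%\lib.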

Install Keras 2.0.5 and Theano 0.9.0 with libgpuarray

運(yùn)行以下命令安裝 libgpuarray 0.6.2,即 Theano 0.9.0 唯一的穩(wěn)定版:

(dlwin36) $ conda install pygpu==0.6.2 nose
# Output of the command:
Fetching package metadata ...........
Solving package specifications: .
Package plan for installation in environment e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36:
The following NEW packages will be INSTALLED:
    libgpuarray: 0.6.2-vc14_0 [vc14]
    nose:        1.3.7-py36_1
    pygpu:       0.6.2-py36_0
Proceed ([y]/n)? y

Run the following command to install Keras and Theano:

(dlwin36) $ pip install keras==2.0.5
# Output of the command:
Collecting keras==2.0.5
Requirement already satisfied: six in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages (from keras==2.0.5)
Collecting pyyaml (from keras==2.0.5)
Collecting theano (from keras==2.0.5)
Requirement already satisfied: scipy>=0.14 in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages (from theano->keras==2.0.5)
Requirement already satisfied: numpy>=1.9.1 in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages (from theano->keras==2.0.5)
Installing collected packages: pyyaml, theano, keras
Successfully installed keras-2.0.5 pyyaml-3.12 theano-0.9.0

Install the CNTK 2.0 backend

根據(jù) CNTK 安裝文檔,我們可以使用以下 pip 命令行安裝 CNTK:

(dlwin36) $ pip install https://cntk.ai/PythonWheel/GPU/cntk-2.0-cp36-cp36m-win_amd64.whl
# Output of the command:
Collecting cntk==2.0 from https://cntk.ai/PythonWheel/GPU/cntk-2.0-cp36-cp36m-win_amd64.whl
 Using cached https://cntk.ai/PythonWheel/GPU/cntk-2.0-cp36-cp36m-win_amd64.whl
Requirement already satisfied: numpy>=1.11 in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages (from cntk==2.0)
Requirement already satisfied: scipy>=0.17 in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages (from cntk==2.0)
Installing collected packages: cntk
Successfully installed cntk-2.0

This installation also places extra CUDA and cuDNN DLLs in the conda environment directory:

(dlwin36) $ cd E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36
(dlwin36) $ dir cu*.dll
# Output of the command:
Volume in drive E is datasets
Volume Serial Number is 1ED0-657B
Directory of E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36
06/30/2017 02:47 PM  40,744,896 cublas64_80.dll
06/30/2017 02:47 PM     366,016 cudart64_80.dll
06/30/2017 02:47 PM  78,389,760 cudnn64_5.dll
06/30/2017 02:47 PM  47,985,208 curand64_80.dll
06/30/2017 02:47 PM  41,780,280 cusparse64_80.dll
 5 File(s) 209,266,160 bytes
 0 Dir(s) 400,471,019,520 bytes free

這個(gè)問(wèn)題并不是因?yàn)槔速M(fèi)硬盤(pán)空間,而是安裝的 cuDNN 版本和我們安裝在 c:toolkitscuda-8.0.61 下的 cuDNN 版本不同,因?yàn)樵?conda 環(huán)境目錄下的 DLL 將首先加載,所以我們需要這些 DLL 移除出%PATH% 目錄:

(dlwin36) $ md discard & move cu*.dll discard
# Output of the command:
E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\cublas64_80.dll
E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\cudart64_80.dll
E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\cudnn64_5.dll
E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\curand64_80.dll
E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\cusparse64_80.dll
 5 file(s) moved.
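The md/move commands have a straightforward Python equivalent, useful if you script the whole environment setup. This is an illustrative sketch (quarantine_cuda_dlls is an invented name):

```python
import glob
import os
import shutil

def quarantine_cuda_dlls(env_dir, discard_name="discard"):
    """Move cu*.dll out of a conda env dir so the CUDA toolkit copies load first."""
    discard = os.path.join(env_dir, discard_name)
    os.makedirs(discard, exist_ok=True)
    moved = []
    for dll in glob.glob(os.path.join(env_dir, "cu*.dll")):
        shutil.move(dll, os.path.join(discard, os.path.basename(dll)))
        moved.append(os.path.basename(dll))
    return sorted(moved)
```

Moving the files into a sibling folder (rather than deleting them) keeps an easy rollback path if the toolkit copies turn out to be missing.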

Install the TensorFlow-GPU 1.2.0 backend

運(yùn)行以下命令行使用 pip 安裝 TensorFlow:

(dlwin36) $ pip install tensorflow-gpu==1.2.0
# Output of the command:
Collecting tensorflow-gpu==1.2.0
 Using cached tensorflow_gpu-1.2.0-cp36-cp36m-win_amd64.whl
Requirement already satisfied: bleach==1.5.0 in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages (from tensorflow-gpu==1.2.0)
Requirement already satisfied: numpy>=1.11.0 in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages (from tensorflow-gpu==1.2.0)
Collecting markdown==2.2.0 (from tensorflow-gpu==1.2.0)
Requirement already satisfied: wheel>=0.26 in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages (from tensorflow-gpu==1.2.0)
Collecting protobuf>=3.2.0 (from tensorflow-gpu==1.2.0)
Collecting backports.weakref==1.0rc1 (from tensorflow-gpu==1.2.0)
 Using cached backports.weakref-1.0rc1-py3-none-any.whl
Collecting html5lib==0.9999999 (from tensorflow-gpu==1.2.0)
Collecting werkzeug>=0.11.10 (from tensorflow-gpu==1.2.0)
 Using cached Werkzeug-0.12.2-py2.py3-none-any.whl
Requirement already satisfied: six>=1.10.0 in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages (from tensorflow-gpu==1.2.0)
Requirement already satisfied: setuptools in e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages\setuptools-27.2.0-py3.6.egg (from protobuf>=3.2.0->tensorflow-gpu==1.2.0)
Installing collected packages: markdown, protobuf, backports.weakref, html5lib, werkzeug, tensorflow-gpu
 Found existing installation: html5lib 0.999
 DEPRECATION: Uninstalling a distutils installed project (html5lib) has been deprecated and will be removed in a future version. This is due to the fact that uninstalling a distutils project will only partially uninstall the project.
 Uninstalling html5lib-0.999:
 Successfully uninstalled html5lib-0.999
Successfully installed backports.weakref-1.0rc1 html5lib-0.9999999 markdown-2.2.0 protobuf-3.3.0 tensorflow-gpu-1.2.0 werkzeug-0.12.2

Check the installed packages with conda

After completing the installation and configuration above, run conda list to check the packages installed in the dlwin36 conda environment; the original article shows the full expected list.


To quickly check the three backends, run the following commands in turn to verify that Theano, TensorFlow, and CNTK import correctly:

(dlwin36) $ python -c "import theano; print('theano: %s, %s' % (theano.__version__, theano.__file__))"
theano: 0.9.0, E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages\theano\__init__.py
(dlwin36) $ python -c "import pygpu; print('pygpu: %s, %s' % (pygpu.__version__, pygpu.__file__))"
pygpu: 0.6.2, e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages\pygpu\__init__.py
(dlwin36) $ python -c "import tensorflow; print('tensorflow: %s, %s' % (tensorflow.__version__, tensorflow.__file__))"
tensorflow: 1.2.0, E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages\tensorflow\__init__.py
(dlwin36) $ python -c "import cntk; print('cntk: %s, %s' % (cntk.__version__, cntk.__file__))"
cntk: 2.0, E:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages\cntk\__init__.py

驗(yàn)證 Theano 的安裝

因?yàn)?Theano 是安裝 Keras 時(shí)自動(dòng)安裝的,為了快速地在 CPU 模式、GPU 模式和帶 cuDNN 的 GPU 模式之間轉(zhuǎn)換,我們需要?jiǎng)?chuàng)建以下三個(gè)系統(tǒng)環(huán)境變量(sysenv variable):

系統(tǒng)環(huán)境變量 THEANO_FLAGS_CPU 的值定義為:floatX=float32,device=cpu系統(tǒng)環(huán)境變量 THEANO_FLAGS_GPU 的值定義為:floatX=float32,device=cuda0,dnn.enabled=False,gpuarray.preallocate=0.8系統(tǒng)環(huán)境變量 THEANO_FLAGS_GPU_DNN 的值定義為:floatX=float32,device=cuda0,optimizer_including=cudnn,gpuarray.preallocate=0.8,dnn.conv.algo_bwd_filter=deterministic,dnn.conv.algo_bwd_data=deterministic,dnn.include_path=e:/toolkits.win/cuda-8.0.61/include,dnn.library_path=e:/toolkits.win/cuda-8.0.61/lib/x64

現(xiàn)在,我們能直接使用 THEANO_FLAGS_CPU、THEANO_FLAGS_GPU 或 THEANO_FLAGS_GPU_DNN 直接設(shè)置 Theano 使用 CPU、GPU 還是 GPU+cuDNN。我們可以使用以下命令行驗(yàn)證這些變量是否成功加入環(huán)境中:

(dlwin36) $ set KERAS_BACKEND=theano
(dlwin36) $ set | findstr /i theano
KERAS_BACKEND=theano
THEANO_FLAGS=floatX=float32,device=cuda0,optimizer_including=cudnn,gpuarray.preallocate=0.8,dnn.conv.algo_bwd_filter=deterministic,dnn.conv.algo_bwd_data=deterministic,dnn.include_path=e:/toolkits.win/cuda-8.0.61/include,dnn.library_path=e:/toolkits.win/cuda-8.0.61/lib/x64
THEANO_FLAGS_CPU=floatX=float32,device=cpu
THEANO_FLAGS_GPU=floatX=float32,device=cuda0,dnn.enabled=False,gpuarray.preallocate=0.8
THEANO_FLAGS_GPU_DNN=floatX=float32,device=cuda0,optimizer_including=cudnn,gpuarray.preallocate=0.8,dnn.conv.algo_bwd_filter=deterministic,dnn.conv.algo_bwd_data=deterministic,dnn.include_path=e:/toolkits.win/cuda-8.0.61/include,dnn.library_path=e:/toolkits.win/cuda-8.0.61/lib/x64

For more detailed Theano verification code and commands, see the original article.

Check the system environment variables

現(xiàn)在,不論 dlwin36 conda 環(huán)境什么時(shí)候激活,PATH 環(huán)境變量應(yīng)該需要看起來(lái)如下面列表一樣:


Verify the GPU + cuDNN installation with Keras

We can verify that the GPU and cuDNN are correctly installed by using Keras to train a simple convolutional neural network (convnet) on the MNIST dataset. The script, mnist_cnn.py, can be found among the Keras examples:

Keras example: https://github.com/fchollet/keras/blob/2.0.5/examples/mnist_cnn.py


1. Using Keras with the Theano backend

To have a baseline for comparison, we first train the simple convnet on the CPU using the Theano backend:

(dlwin36) $ set KERAS_BACKEND=theano
(dlwin36) $ set THEANO_FLAGS=%THEANO_FLAGS_CPU%
(dlwin36) $ python mnist_cnn.py
# Training process and results:
Using Theano backend.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
60000/60000 [==============================] - 233s - loss: 0.3344 - acc: 0.8972 - val_loss: 0.0743 - val_acc: 0.9777
Epoch 2/12
60000/60000 [==============================] - 234s - loss: 0.1106 - acc: 0.9674 - val_loss: 0.0504 - val_acc: 0.9837
Epoch 3/12
60000/60000 [==============================] - 237s - loss: 0.0865 - acc: 0.9741 - val_loss: 0.0402 - val_acc: 0.9865
Epoch 4/12
60000/60000 [==============================] - 238s - loss: 0.0692 - acc: 0.9792 - val_loss: 0.0362 - val_acc: 0.9874
Epoch 5/12
60000/60000 [==============================] - 241s - loss: 0.0614 - acc: 0.9821 - val_loss: 0.0370 - val_acc: 0.9879
Epoch 6/12
60000/60000 [==============================] - 245s - loss: 0.0547 - acc: 0.9839 - val_loss: 0.0319 - val_acc: 0.9885
Epoch 7/12
60000/60000 [==============================] - 248s - loss: 0.0517 - acc: 0.9840 - val_loss: 0.0293 - val_acc: 0.9900
Epoch 8/12
60000/60000 [==============================] - 256s - loss: 0.0465 - acc: 0.9863 - val_loss: 0.0294 - val_acc: 0.9905
Epoch 9/12
60000/60000 [==============================] - 264s - loss: 0.0422 - acc: 0.9870 - val_loss: 0.0276 - val_acc: 0.9902
Epoch 10/12
60000/60000 [==============================] - 263s - loss: 0.0423 - acc: 0.9875 - val_loss: 0.0287 - val_acc: 0.9902
Epoch 11/12
60000/60000 [==============================] - 262s - loss: 0.0389 - acc: 0.9884 - val_loss: 0.0291 - val_acc: 0.9898
Epoch 12/12
60000/60000 [==============================] - 270s - loss: 0.0377 - acc: 0.9885 - val_loss: 0.0272 - val_acc: 0.9910
Test loss: 0.0271551907005
Test accuracy: 0.991

我們現(xiàn)在使用以下命令行利用帶 Theano 的后端的 Keras 在 GPU 和 cuDNN 環(huán)境下訓(xùn)練卷積神經(jīng)網(wǎng)絡(luò):

(dlwin36) $ set THEANO_FLAGS=%THEANO_FLAGS_GPU_DNN%
(dlwin36) $ python mnist_cnn.py
Using Theano backend.
Using cuDNN version 5110 on context None
Preallocating 9830/12288 Mb (0.800000) on cuda0
Mapped name None to device cuda0: GeForce GTX TITAN X (0000:03:00.0)
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
60000/60000 [==============================] - 17s - loss: 0.3219 - acc: 0.9003 - val_loss: 0.0774 - val_acc: 0.9743
Epoch 2/12
60000/60000 [==============================] - 16s - loss: 0.1108 - acc: 0.9674 - val_loss: 0.0536 - val_acc: 0.9822
Epoch 3/12
60000/60000 [==============================] - 16s - loss: 0.0832 - acc: 0.9766 - val_loss: 0.0434 - val_acc: 0.9862
Epoch 4/12
60000/60000 [==============================] - 16s - loss: 0.0694 - acc: 0.9795 - val_loss: 0.0382 - val_acc: 0.9876
Epoch 5/12
60000/60000 [==============================] - 16s - loss: 0.0605 - acc: 0.9819 - val_loss: 0.0353 - val_acc: 0.9884
Epoch 6/12
60000/60000 [==============================] - 16s - loss: 0.0533 - acc: 0.9836 - val_loss: 0.0360 - val_acc: 0.9883
Epoch 7/12
60000/60000 [==============================] - 16s - loss: 0.0482 - acc: 0.9859 - val_loss: 0.0305 - val_acc: 0.9897
Epoch 8/12
60000/60000 [==============================] - 16s - loss: 0.0452 - acc: 0.9865 - val_loss: 0.0295 - val_acc: 0.9911
Epoch 9/12
60000/60000 [==============================] - 16s - loss: 0.0414 - acc: 0.9878 - val_loss: 0.0315 - val_acc: 0.9898
Epoch 10/12
60000/60000 [==============================] - 16s - loss: 0.0386 - acc: 0.9886 - val_loss: 0.0282 - val_acc: 0.9911
Epoch 11/12
60000/60000 [==============================] - 16s - loss: 0.0378 - acc: 0.9887 - val_loss: 0.0306 - val_acc: 0.9904
Epoch 12/12
60000/60000 [==============================] - 16s - loss: 0.0354 - acc: 0.9893 - val_loss: 0.0296 - val_acc: 0.9898
Test loss: 0.0296215178292
Test accuracy: 0.9898

Each epoch now takes only about 16 seconds, a large improvement over the roughly 250 seconds per epoch on the CPU (with the same batch size).

2. Using Keras with the TensorFlow backend

為了激活和測(cè)試 TensorFlow 后端,我們需要使用以下命令行:

(dlwin36) $ set KERAS_BACKEND=tensorflow
(dlwin36) $ python mnist_cnn.py
Using TensorFlow backend.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
2017-06-30 12:49:22.005585: W c:\tf_jenkins\home\workspace\release-win\m\windows-gpu\py\36\tensorflow\core\platform\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE instructions, but these are available on your machine and could speed up CPU computations.
2017-06-30 12:49:22.005767: W [...]\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-30 12:49:22.005996: W [...]\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE3 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-30 12:49:22.006181: W [...]\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.1 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-30 12:49:22.006361: W [...]\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use SSE4.2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-30 12:49:22.006539: W [...]\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX instructions, but these are available on your machine and could speed up CPU computations.
2017-06-30 12:49:22.006717: W [...]\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use AVX2 instructions, but these are available on your machine and could speed up CPU computations.
2017-06-30 12:49:22.006897: W [...]\cpu_feature_guard.cc:45] The TensorFlow library wasn't compiled to use FMA instructions, but these are available on your machine and could speed up CPU computations.
2017-06-30 12:49:22.453483: I [...]\gpu_device.cc:940] Found device 0 with properties:
name: GeForce GTX TITAN X
major: 5 minor: 2 memoryClockRate (GHz) 1.076
pciBusID 0000:03:00.0
Total memory: 12.00GiB
Free memory: 10.06GiB
2017-06-30 12:49:22.454375: I [...]\gpu_device.cc:961] DMA: 0
2017-06-30 12:49:22.454489: I [...]\gpu_device.cc:971] 0: Y
2017-06-30 12:49:22.454624: I [...]\gpu_device.cc:1030] Creating TensorFlow device (/gpu:0) -> (device: 0, name: GeForce GTX TITAN X, pci bus id: 0000:03:00.0)
60000/60000 [==============================] - 8s - loss: 0.3355 - acc: 0.8979 - val_loss: 0.0749 - val_acc: 0.9760
Epoch 2/12
60000/60000 [==============================] - 5s - loss: 0.1134 - acc: 0.9667 - val_loss: 0.0521 - val_acc: 0.9825
Epoch 3/12
60000/60000 [==============================] - 5s - loss: 0.0863 - acc: 0.9745 - val_loss: 0.0436 - val_acc: 0.9854
Epoch 4/12
60000/60000 [==============================] - 5s - loss: 0.0722 - acc: 0.9787 - val_loss: 0.0381 - val_acc: 0.9872
Epoch 5/12
60000/60000 [==============================] - 5s - loss: 0.0636 - acc: 0.9811 - val_loss: 0.0339 - val_acc: 0.9880
Epoch 6/12
60000/60000 [==============================] - 5s - loss: 0.0552 - acc: 0.9838 - val_loss: 0.0328 - val_acc: 0.9888
Epoch 7/12
60000/60000 [==============================] - 5s - loss: 0.0515 - acc: 0.9851 - val_loss: 0.0318 - val_acc: 0.9893
Epoch 8/12
60000/60000 [==============================] - 5s - loss: 0.0479 - acc: 0.9862 - val_loss: 0.0311 - val_acc: 0.9891
Epoch 9/12
60000/60000 [==============================] - 5s - loss: 0.0441 - acc: 0.9870 - val_loss: 0.0310 - val_acc: 0.9898
Epoch 10/12
60000/60000 [==============================] - 5s - loss: 0.0407 - acc: 0.9871 - val_loss: 0.0302 - val_acc: 0.9903
Epoch 11/12
60000/60000 [==============================] - 5s - loss: 0.0405 - acc: 0.9877 - val_loss: 0.0309 - val_acc: 0.9892
Epoch 12/12
60000/60000 [==============================] - 5s - loss: 0.0373 - acc: 0.9886 - val_loss: 0.0309 - val_acc: 0.9898
Test loss: 0.0308696583555
Test accuracy: 0.9898

With GPU and cuDNN acceleration, the TensorFlow backend is roughly three times faster than the Theano backend on this task. The two frameworks actually differ in their default channel ordering, but the script uses the same ordering for both in this test, which may force the Theano backend to reorder the data and explain the performance gap. Even so, TensorFlow's GPU load never exceeded 70% in this run.
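The channel-ordering difference is concrete: Theano's default is channels-first (NCHW) while TensorFlow's is channels-last (NHWC), so a batch of MNIST images is (60000, 1, 28, 28) in one convention and (60000, 28, 28, 1) in the other. A minimal sketch of the shape conversion (function names invented):

```python
def to_channels_first(shape):
    """Convert an NHWC shape tuple (TensorFlow default) to NCHW (Theano default)."""
    n, h, w, c = shape
    return (n, c, h, w)

def to_channels_last(shape):
    """Convert an NCHW shape tuple back to NHWC."""
    n, c, h, w = shape
    return (n, h, w, c)
```

Transposing the actual arrays (not just the shapes) on every batch is exactly the kind of overhead that could account for the timing gap noted above.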


3. Using Keras with the CNTK backend

為了激活和測(cè)試 CNTK 后算,我們需要使用以下命令行:

(dlwin36) $ set KERAS_BACKEND=cntk
(dlwin36) $ python mnist_cnn.py
Using CNTK backend
Selected GPU[0] GeForce GTX TITAN X as the process wide default device.
x_train shape: (60000, 28, 28, 1)
60000 train samples
10000 test samples
Train on 60000 samples, validate on 10000 samples
Epoch 1/12
e:\toolkits.win\anaconda3-4.4.0\envs\dlwin36\lib\site-packages\cntk\core.py:351: UserWarning: your data is of type "float64", but your input variable (uid "Input113") expects "<class 'numpy.float32'>". Please convert your data beforehand to speed up training.
 (sample.dtype, var.uid, str(var.dtype)))
60000/60000 [==============================] - 8s - loss: 0.3275 - acc: 0.8991 - val_loss: 0.0754 - val_acc: 0.9749
Epoch 2/12
60000/60000 [==============================] - 7s - loss: 0.1114 - acc: 0.9662 - val_loss: 0.0513 - val_acc: 0.9841
Epoch 3/12
60000/60000 [==============================] - 7s - loss: 0.0862 - acc: 0.9750 - val_loss: 0.0429 - val_acc: 0.9859
Epoch 4/12
60000/60000 [==============================] - 7s - loss: 0.0721 - acc: 0.9784 - val_loss: 0.0373 - val_acc: 0.9868
Epoch 5/12
60000/60000 [==============================] - 7s - loss: 0.0649 - acc: 0.9803 - val_loss: 0.0339 - val_acc: 0.9878
Epoch 6/12
60000/60000 [==============================] - 8s - loss: 0.0580 - acc: 0.9831 - val_loss: 0.0337 - val_acc: 0.9890
Epoch 7/12
60000/60000 [==============================] - 8s - loss: 0.0529 - acc: 0.9846 - val_loss: 0.0326 - val_acc: 0.9895
Epoch 8/12
60000/60000 [==============================] - 8s - loss: 0.0483 - acc: 0.9858 - val_loss: 0.0307 - val_acc: 0.9897
Epoch 9/12
60000/60000 [==============================] - 8s - loss: 0.0456 - acc: 0.9864 - val_loss: 0.0299 - val_acc: 0.9898
Epoch 10/12
60000/60000 [==============================] - 8s - loss: 0.0407 - acc: 0.9875 - val_loss: 0.0274 - val_acc: 0.9906
Epoch 11/12
60000/60000 [==============================] - 8s - loss: 0.0405 - acc: 0.9883 - val_loss: 0.0276 - val_acc: 0.9904
Epoch 12/12
60000/60000 [==============================] - 8s - loss: 0.0372 - acc: 0.9889 - val_loss: 0.0274 - val_acc: 0.9906
Test loss: 0.0274011099327
Test accuracy: 0.9906

In this experiment, CNTK was also very fast, and its GPU load reached 80%.


END

For submissions and feedback, please email hzzy@hzbook.com. To repost articles from this Big Data official account, please request authorization from the original author; any copyright disputes arising otherwise are unrelated to Big Data.

相關(guān)閱讀
