
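ltp is a natural language processing toolkit from Harbin Institute of Technology (HIT), and pyltp is the Python wrapper around ltp (C++).
On Linux, pyltp is easy to install because the build tools are readily available. On Windows, however, you have to install Visual Studio and do some extra configuration. The people I support all work on Windows and need to be able to use ltp there, which is why I wrote these notes. I have two approaches:
Install ltp under bash on Windows 10, start the ltp server, and call ltp from Python on Windows over HTTP.
Install a prebuilt wheel (currently only for Python 3.6/3.5, amd64). This is the approach I recommend.
At the bottom of the post I also reference a third method: use the official precompiled exe and call it directly from a command line such as cmd.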
Option 1: Install under bash
Basic environment
windows 10
bash for windows
python 3.6
Install bash on ubuntu on windows
Look this up yourself (for example on Baidu); the installation is simple.
Install the build environment
sudo apt install cmake
sudo apt install g++
Installation takes roughly 10 to 20 minutes.
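Before moving on to the build, it can help to confirm the tools are actually on PATH. The following is a minimal sketch, not part of the original post: the helper name missing_tools is hypothetical, and including make in the default list is an assumption based on the make command used later in the build step.

```python
import shutil


def missing_tools(tools=("cmake", "g++", "make")):
    """Return the names in `tools` that cannot be found on PATH."""
    # shutil.which returns None when an executable is not found.
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Missing build tools:", ", ".join(missing))
    else:
        print("All build tools found.")
```

Run it inside the bash environment, since that is where the build happens.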
Download the ltp source
Download the source code from the project's GitHub page.
Extract it to a location you will remember.
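Build
cd into the source directory, for example mine:
cd /mnt/d/bash-sites/ltp-3.4.0
Run the build commands:
./configure
make
The build takes roughly 10 to 20 minutes. My directory now contains a new bin folder: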
drwxrwxrwx 0 root root 512 Jan 31 15:42 ./
drwxrwxrwx 0 root root 512 Jan 31 15:30 ../
-rwxrwxrwx 1 root root 800 Jan 31 15:30 appveyor.yml*
-rwxrwxrwx 1 root root 0 Jan 31 15:30 AUTHORS*
drwxrwxrwx 0 root root 512 Jan 31 15:53 bin/
drwxrwxrwx 0 root root 512 Jan 31 15:42 build/
-rwxrwxrwx 1 root root 29301 Jan 31 15:30 ChangeLog.md*
drwxrwxrwx 0 root root 512 Jan 31 15:30 cmake/
-rwxrwxrwx 1 root root 1439 Jan 31 15:30 CMakeLists.txt*
drwxrwxrwx 0 root root 512 Jan 31 15:30 conf/
-rwxrwxrwx 1 root root 131 Jan 31 15:30 configure*
-rwxrwxrwx 1 root root 902 Jan 31 15:30 COPYING*
drwxrwxrwx 0 root root 512 Jan 31 15:30 doc/
-rwxrwxrwx 1 root root 79976 Jan 31 15:30 Doxyfile*
drwxrwxrwx 0 root root 512 Jan 31 15:30 examples/
-rwxrwxrwx 1 root root 1028 Jan 31 15:30 .gitignore*
drwxrwxrwx 0 root root 512 Jan 31 15:42 include/
-rwxrwxrwx 1 root root 85 Jan 31 15:30 INSTALL*
drwxrwxrwx 0 root root 512 Jan 31 15:53 lib/
-rwxrwxrwx 1 root root 965 Jan 31 15:30 Makefile*
-rwxrwxrwx 1 root root 6639 Jan 31 15:30 NEWS.md*
-rwxrwxrwx 1 root root 4750 Jan 31 15:30 README.md*
drwxrwxrwx 0 root root 512 Jan 31 15:30 src/
-rwxrwxrwx 1 root root 3048 Jan 31 15:30 subproject.d.json*
drwxrwxrwx 0 root root 512 Jan 31 15:31 thirdparty/
drwxrwxrwx 0 root root 512 Jan 31 15:31 tools/
-rwxrwxrwx 1 root root 1372 Jan 31 15:30 .travis.yml*
Configure the server
The first time I started the server, I hit this error:
[INFO] 2018-01-31 15:54:39 Loading segmentor model from "ltp_data/cws.model" ...
[ERROR] 2018-01-31 15:54:39 /mnt/d/bash-sites/ltp-3.4.0/src/ltp/LTPResource.cpp: line 50: LoadSegmentorResource(): Failed to load segmentor model
[ERROR] 2018-01-31 15:54:39 /mnt/d/bash-sites/ltp-3.4.0/src/ltp/Ltp.cpp: line 78: load(): in LTP::wordseg, failed to load segmentor resource
[ERROR] 2018-01-31 15:54:39 /mnt/d/bash-sites/ltp-3.4.0/src/server/ltp_server.cpp: line 172: main(): Failed to setup LTP engine.
The cause was missing model files, so download the latest model files.
Extract them into /mnt/d/bash-sites/ltp-3.4.0/ltp_data/, which is ltp's default location for model data.
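A quick way to see whether the extraction landed in the right place is to check for the model files by name. This is only a sketch: the helper missing_models is hypothetical, and the file list is simply the set of names that appear in the server logs in this post (LTP 3.4.0), so other versions may use different names.

```python
from pathlib import Path

# File names as they appear in the ltp_server log output in this post.
EXPECTED_MODELS = (
    "cws.model",
    "pos.model",
    "ner.model",
    "parser.model",
    "pisrl.model",
)


def missing_models(data_dir, names=EXPECTED_MODELS):
    """Return the model file names that are not present in data_dir."""
    data_dir = Path(data_dir)
    return [name for name in names if not (data_dir / name).is_file()]


if __name__ == "__main__":
    # Path used in this post; adjust it to your own source directory.
    print(missing_models("/mnt/d/bash-sites/ltp-3.4.0/ltp_data"))
```

An empty list means every expected file was found.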
After that, the server starts without problems.
syd@DESKTOP-J02R2VJ:/mnt/d/bash-sites/ltp-3.4.0$ ./bin/ltp_server --port 9090
[INFO] 2018-01-31 15:56:36 Loading segmentor model from "ltp_data/cws.model" ...
[INFO] 2018-01-31 15:56:36 segmentor model is loaded.
[INFO] 2018-01-31 15:56:36 Loading postagger model from "ltp_data/pos.model" ...
[INFO] 2018-01-31 15:56:36 postagger model is loaded
[INFO] 2018-01-31 15:56:36 Loading NER resource from "ltp_data/ner.model"
[INFO] 2018-01-31 15:56:36 NER resource is loaded.
[INFO] 2018-01-31 15:56:36 Loading parser resource from "ltp_data/parser.model"
[INFO] 2018-01-31 15:56:37 parser is loaded.
[INFO] 2018-01-31 15:56:37 Loading srl resource from "ltp_data/pisrl.model"
[dynet] random seed: 493907432
[dynet] allocating memory: 2000MB
[dynet] memory allocation done.
[INFO] 2018-01-31 15:56:39 srl resource is loaded.
[INFO] 2018-01-31 15:56:39 Resources loading finished.
[INFO] 2018-01-31 15:56:39 Start listening on port [9090]...
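Once the log shows the server listening, you can confirm from the Windows side that the port is reachable before sending real requests. The sketch below uses only the standard library; the function name is_listening is hypothetical, and the default port 9090 just mirrors the --port value used above.

```python
import socket


def is_listening(host="127.0.0.1", port=9090, timeout=1.0):
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection raises OSError (e.g. connection refused, timeout)
        # when nothing accepts connections on that address.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


if __name__ == "__main__":
    print("ltp_server reachable:", is_listening())
```

This only checks that something accepts TCP connections on the port, not that it is actually ltp_server.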
Test
Write a quick request to see the result:
import json

import requests

uri_base = "http://127.0.0.1:9090/ltp"
# s: input text, x: whether the input is XML ('n' = no), t: analysis task
data = {'s': '我認為他叫湯姆去拿外衣和鞋子。', 'x': 'n', 't': 'srl'}
response = requests.get(uri_base, data=data)
rdata = response.json()
print(json.dumps(rdata, indent=4, ensure_ascii=False))
Output (whitespace condensed, one word per line):
[
  [
    [
      {"arg": [], "cont": "我", "id": 0, "ne": "O", "parent": 1, "pos": "r", "relate": "SBV"},
      {"arg": [{"beg": 0, "end": 0, "id": 0, "type": "A0"}, {"beg": 2, "end": 9, "id": 1, "type": "A1"}], "cont": "認為", "id": 1, "ne": "O", "parent": -1, "pos": "v", "relate": "HED"},
      {"arg": [], "cont": "他", "id": 2, "ne": "O", "parent": 3, "pos": "r", "relate": "SBV"},
      {"arg": [{"beg": 2, "end": 2, "id": 0, "type": "A0"}, {"beg": 4, "end": 4, "id": 1, "type": "A1"}, {"beg": 5, "end": 9, "id": 2, "type": "A2"}], "cont": "叫", "id": 3, "ne": "O", "parent": 1, "pos": "v", "relate": "VOB"},
      {"arg": [], "cont": "湯姆", "id": 4, "ne": "S-Nh", "parent": 3, "pos": "nh", "relate": "DBL"},
      {"arg": [], "cont": "去", "id": 5, "ne": "O", "parent": 6, "pos": "v", "relate": "ADV"},
      {"arg": [{"beg": 7, "end": 9, "id": 0, "type": "A1"}], "cont": "拿", "id": 6, "ne": "O", "parent": 3, "pos": "v", "relate": "VOB"},
      {"arg": [], "cont": "外衣", "id": 7, "ne": "O", "parent": 6, "pos": "n", "relate": "VOB"},
      {"arg": [], "cont": "和", "id": 8, "ne": "O", "parent": 9, "pos": "c", "relate": "LAD"},
      {"arg": [], "cont": "鞋子", "id": 9, "ne": "O", "parent": 7, "pos": "n", "relate": "COO"},
      {"arg": [], "cont": "。", "id": 10, "ne": "O", "parent": 1, "pos": "wp", "relate": "WP"}
    ]
  ]
]
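The nested result is easier to read once the semantic role arguments are resolved back to text. The following is a rough sketch under assumptions drawn from the output above: the response is a list of paragraphs, each a list of sentences, each a list of word dicts, and the beg/end values in arg are inclusive word indices within the sentence. The helper srl_frames and the trimmed sample are made up for illustration.

```python
def srl_frames(result):
    """Collect (predicate, [(role, text), ...]) for each word with SRL arguments."""
    frames = []
    for paragraph in result:
        for sentence in paragraph:
            words = [word["cont"] for word in sentence]
            for word in sentence:
                if not word["arg"]:
                    continue  # this word is not a predicate
                roles = [
                    # beg and end are treated as inclusive word indices
                    (arg["type"], "".join(words[arg["beg"]:arg["end"] + 1]))
                    for arg in word["arg"]
                ]
                frames.append((word["cont"], roles))
    return frames


# Hand-trimmed sample in the same shape as the server response (not real output).
sample = [[[
    {"cont": "他", "arg": []},
    {"cont": "叫", "arg": [
        {"beg": 0, "end": 0, "id": 0, "type": "A0"},
        {"beg": 2, "end": 2, "id": 1, "type": "A1"},
    ]},
    {"cont": "湯姆", "arg": []},
]]]

for predicate, roles in srl_frames(sample):
    print(predicate, roles)
```

With the real response you would pass rdata from the request above instead of sample.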
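Option 2: Install a wheel
Download the wheels
Download one of the two files below, depending on your Python version. I built them on my own machine (win10), so I can't be sure they work on your system, though any 64-bit Windows should be fine. Leave a comment below if you run into problems.
Note: the only difference between the two files is the Python version.
Install the file
Once the download finishes, open a command line, cd to the directory containing the wheel file, and install it with pip install <wheel filename>.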
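Since the wheels only cover Python 3.5/3.6 on amd64, it is worth checking the interpreter before running pip. This is a minimal sketch, assuming that a 64-bit pointer size is a good enough proxy for an amd64 build; the names SUPPORTED and wheel_compatible are hypothetical.

```python
import struct
import sys

# Python versions the post says prebuilt wheels exist for.
SUPPORTED = {(3, 5), (3, 6)}


def wheel_compatible(version=None, pointer_bits=None):
    """Return True if the interpreter matches the prebuilt wheels (3.5/3.6, 64-bit)."""
    if version is None:
        version = sys.version_info[:2]
    if pointer_bits is None:
        # Size of a C pointer in bits: 64 on a 64-bit Python build.
        pointer_bits = struct.calcsize("P") * 8
    return tuple(version) in SUPPORTED and pointer_bits == 64


if __name__ == "__main__":
    print("Prebuilt pyltp wheel should fit this Python:", wheel_compatible())
```

If it prints False, pip will most likely refuse the wheel as unsupported on this platform.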
Test
After installation, open a Python shell and try it out.
from pyltp import SentenceSplitter
sents = SentenceSplitter.split('元芳你怎麼看?我就趴窗口上看唄!')  # split into sentences
print('\n'.join(sents))
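Download the model data
Option 3: Call the precompiled ltp executable directly
You can refer to this article. However, it did not work for me on version 3.4 (loading the srl resource failed), while it did work on version 3.3.1.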

