Extract Word from Synset using Wordnet in NLTK 3.0

Some time ago, someone on SO asked how to retrieve a list of words for a given synset using NLTK's wordnet wrapper. Here is one of the suggested responses:

for synset in wn.synsets('dog'):
    print synset.lemmas[0].name

Running this code with NLTK 3.0 yields TypeError: 'instancemethod' object is not subscriptable.

I tried each of the solutions proposed in that earlier question, but every one of them throws an error. So my question is: is it possible to print the words for a list of synsets with NLTK 3.0? I would be thankful for any advice.


WordNet works fine in NLTK 3.0. You are just accessing the lemmas (and names) in the wrong way. Try this instead:

>>> import nltk
>>> nltk.__version__
'3.0.0'
>>> from nltk.corpus import wordnet as wn
>>> for synset in wn.synsets('dog'):
...     for lemma in synset.lemmas():
...         print(lemma.name())
...
dog
domestic_dog
Canis_familiaris
frump
dog
dog
cad
bounder
blackguard
...

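In NLTK 3.0, synset.lemmas is a method, and a method object has no __getitem__(), so it is not subscriptable. Call it first (synset.lemmas()) and then index or iterate over the list it returns. The same applies to lemma.name, which is now lemma.name().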
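To see why the old code fails without installing anything, here is a minimal sketch that reproduces the error with a stand-in class. FakeSynset is a hypothetical name made up for this illustration and is not part of NLTK; it only mimics the fact that lemmas is a method. Note that Python 3 words the message as 'method' object is not subscriptable, while Python 2 says 'instancemethod'.

```python
class FakeSynset(object):
    """Hypothetical stand-in for an NLTK 3.0 Synset; not part of NLTK."""

    def lemmas(self):
        # In NLTK 3.0 lemmas is a method, so it must be called to get the list.
        return ["dog", "domestic_dog", "Canis_familiaris"]


synset = FakeSynset()

try:
    synset.lemmas[0]              # old style: subscripting the method itself
    subscript_error = None
except TypeError as exc:
    subscript_error = exc         # a bound method has no __getitem__

first_lemma = synset.lemmas()[0]  # new style: call first, then index

print(type(subscript_error).__name__)
print(first_lemma)
```

The only difference between the failing line and the working one is the pair of parentheses after lemmas.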


You can also go directly to the lemma names with lemma_names() :

>>> from nltk.corpus import wordnet
>>> wordnet.synset('dog.n.1').lemma_names()
['dog', 'domestic_dog', 'Canis_familiaris']

And it works for multiple languages (assuming the Open Multilingual WordNet data is installed alongside WordNet):

>>> wordnet.synset('dog.n.1').lemma_names(lang='jpn')
['イヌ', 'ドッグ', '洋犬', '犬', '飼犬', '飼い犬']
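If what you want is one flat list of words for all synsets of a term, lemma_names() can be combined with a small helper. This is only a sketch: unique_lemma_names and StubSynset are names invented here, and the stub exists just so the example runs without NLTK or the WordNet data. The helper assumes nothing more than that each object has a lemma_names() method, as NLTK 3.0 synsets do.

```python
def unique_lemma_names(synsets):
    """Collect lemma names across synsets, keeping first-seen order."""
    seen = set()
    names = []
    for synset in synsets:
        for name in synset.lemma_names():
            if name not in seen:      # skip words already collected
                seen.add(name)
                names.append(name)
    return names


class StubSynset(object):
    """Hypothetical stand-in exposing lemma_names() like an NLTK 3.0 Synset."""

    def __init__(self, names):
        self._names = list(names)

    def lemma_names(self):
        return list(self._names)


demo = unique_lemma_names([
    StubSynset(["dog", "domestic_dog", "Canis_familiaris"]),
    StubSynset(["frump", "dog"]),
    StubSynset(["cad", "bounder", "blackguard", "dog"]),
])
print(demo)
```

With NLTK installed, the same helper should accept wn.synsets('dog') directly in place of the stub list.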

Use:

wn.synset('dog.n.1').name() 

instead of:

wn.synset('dog.n.1').name 

because NLTK 3.0 changed Synset properties such as name into methods. See https://github.com/nltk/nltk/commit/ba8ab7e23ea2b8d61029484098fd62d5986acd9c
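If the same code has to run against both the old attribute style and the new method style, one possible workaround is a small accessor that calls the value only when it is callable. This is a sketch, not something NLTK provides: get_value, OldStyleSynset and NewStyleSynset are hypothetical names used only to illustrate the idea.

```python
def get_value(obj, attr):
    """Return obj.attr, calling it first if it turns out to be a method."""
    value = getattr(obj, attr)
    return value() if callable(value) else value


class OldStyleSynset(object):
    """Hypothetical stand-in: name is a plain attribute (pre-3.0 style)."""
    name = "dog.n.01"


class NewStyleSynset(object):
    """Hypothetical stand-in: name is a method (NLTK 3.0 style)."""

    def name(self):
        return "dog.n.01"


old_name = get_value(OldStyleSynset(), "name")
new_name = get_value(NewStyleSynset(), "name")
print(old_name)
print(new_name)
```

The catch is that this guesses from callable(), so it would misbehave for an attribute whose value is itself meant to be a callable. If you only target NLTK 3.0, simply adding the parentheses is the cleaner fix.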

This page gives a good list of the changes made to NLTK's API to suit Python 3.x: https://github.com/nltk/nltk/wiki/Porting-your-code-to-NLTK-3.0
