NLP Basics of using Stanford corenlp server

19 May 2017

Stanford CoreNLP provides a set of natural language analysis tools written in Java

CoreNLP source code can be downloaded from https://github.com/stanfordnlp/CoreNLP

CoreNLP 3.7.0 package can be downloaded from link https://stanfordnlp.github.io/CoreNLP/index.html#download Also download the model files from the page

CoreNLP can be hosted as a service by running the following command

java -mx4g -cp stanford-corenlp-3.7.0-models.jar:stanford-corenlp-3.7.0.jar:slf4j-api.jar:slf4j-simple.jar edu.stanford.nlp.pipeline.StanfordCoreNLPServer 9000 10

This starts the coreNLP server at port 9000

Accessing the port 9000 from the browser will give a interface similar to one available on stanford coreNLP server website http://corenlp.run/

We can call the stanford CoreNLP server using client API from client application.

In this article we will be using python wrapper for coreNLP server

download and install the python package https://github.com/smilli/py-corenlp

import pycorenlp
from pycorenlp import StanfordCoreNLP
nlp = StanfordCoreNLP('http://localhost:9000')

CoreNLP works by using Annotators to build Annotations over a stream of text using a CoreNLP pipeline

There are a lot of different annotators available,

tokenize - Tokenizes the words ssplit - Splits a sequence of tokens into sentences. pos - Labels tokens with their POS tag. lemma- computes the LEMMA for the token ner - perform entity recognition for the token sentiment - sentiment analysis annotator

Now we send the request to NLP server with inputs as text NLP pipeline properties to be executed on the server

                    output = nlp.annotate(line, properties={
                    'annotators': 'tokenize,ssplit,pos,lemma,ner,sentiment',
                    'outputFormat': 'json'
                    })
										

For example the input no i dont want to tell you my name has the output

{u'sentences': [{u'tokens': [{u'index': 1, u'word': u'no', u'after': u' ', u'pos': u'DT', u'characterOffsetEnd': 2, u'characterOffsetBegin': 0, u'originalText': u'no', u'before': u''}, {u'index': 2, u'word': u'i', u'after': u' ', u'pos': u'FW', u'characterOffsetEnd': 4, u'characterOffsetBegin': 3, u'originalText': u'i', u'before': u' '}, {u'index': 3, u'word': u'dont', u'after': u' ', u'pos': u'FW', u'characterOffsetEnd': 9, u'characterOffsetBegin': 5, u'originalText': u'dont', u'before': u' '}, {u'index': 4, u'word': u'want', u'after': u' ', u'pos': u'VBP', u'characterOffsetEnd': 14, u'characterOffsetBegin': 10, u'originalText': u'want', u'before': u' '}, {u'index': 5, u'word': u'to', u'after': u' ', u'pos': u'TO', u'characterOffsetEnd': 17, u'characterOffsetBegin': 15, u'originalText': u'to', u'before': u' '}, {u'index': 6, u'word': u'tell', u'after': u' ', u'pos': u'VB', u'characterOffsetEnd': 22, u'characterOffsetBegin': 18, u'originalText': u'tell', u'before': u' '}, {u'index': 7, u'word': u'you', u'after': u' ', u'pos': u'PRP', u'characterOffsetEnd': 26, u'characterOffsetBegin': 23, u'originalText': u'you', u'before': u' '}, {u'index': 8, u'word': u'my', u'after': u' ', u'pos': u'PRP$', u'characterOffsetEnd': 29, u'characterOffsetBegin': 27, u'originalText': u'my', u'before': u' '}, {u'index': 9, u'word': u'name', u'after': u'', u'pos': u'NN', u'characterOffsetEnd': 34, u'characterOffsetBegin': 30, u'originalText': u'name', u'before': u' '}], u'index': 0, u'sentiment': u'Negative', u'sentimentValue': u'1', u'enhancedDependencies': [{u'dep': u'ROOT', u'dependent': 3, u'governorGloss': u'ROOT', u'governor': 0, u'dependentGloss': u'dont'}, {u'dep': u'neg', u'dependent': 1, u'governorGloss': u'dont', u'governor': 3, u'dependentGloss': u'no'}, {u'dep': u'compound', u'dependent': 2, u'governorGloss': u'dont', u'governor': 3, u'dependentGloss': u'i'}, {u'dep': u'acl:relcl', u'dependent': 4, u'governorGloss': u'dont', u'governor': 3, u'dependentGloss': u'want'}, {u'dep': u'mark', u'dependent': 5, u'governorGloss': u'tell', u'governor': 6, u'dependentGloss': u'to'}, {u'dep': u'xcomp', u'dependent': 6, u'governorGloss': u'want', u'governor': 4, u'dependentGloss': u'tell'}, {u'dep': u'iobj', u'dependent': 7, u'governorGloss': u'tell', u'governor': 6, u'dependentGloss': u'you'}, {u'dep': u'nmod:poss', u'dependent': 8, u'governorGloss': u'name', u'governor': 9, u'dependentGloss': u'my'}, {u'dep': u'dobj', u'dependent': 9, u'governorGloss': u'tell', u'governor': 6, u'dependentGloss': u'name'}], u'basicDependencies': [{u'dep': u'ROOT', u'dependent': 3, u'governorGloss': u'ROOT', u'governor': 0, u'dependentGloss': u'dont'}, {u'dep': u'neg', u'dependent': 1, u'governorGloss': u'dont', u'governor': 3, u'dependentGloss': u'no'}, {u'dep': u'compound', u'dependent': 2, u'governorGloss': u'dont', u'governor': 3, u'dependentGloss': u'i'}, {u'dep': u'acl:relcl', u'dependent': 4, u'governorGloss': u'dont', u'governor': 3, u'dependentGloss': u'want'}, {u'dep': u'mark', u'dependent': 5, u'governorGloss': u'tell', u'governor': 6, u'dependentGloss': u'to'}, {u'dep': u'xcomp', u'dependent': 6, u'governorGloss': u'want', u'governor': 4, u'dependentGloss': u'tell'}, {u'dep': u'iobj', u'dependent': 7, u'governorGloss': u'tell', u'governor': 6, u'dependentGloss': u'you'}, {u'dep': u'nmod:poss', u'dependent': 8, u'governorGloss': u'name', u'governor': 9, u'dependentGloss': u'my'}, {u'dep': u'dobj', u'dependent': 9, u'governorGloss': u'tell', u'governor': 6, u'dependentGloss': u'name'}], u'parse': u'(ROOT\n  (FRAG\n    (NP\n      (NP (DT no) (FW i) (FW dont))\n      (SBAR\n        (S\n          (VP (VBP want)\n            (S\n              (VP (TO to)\n                (VP (VB tell)\n                  (NP (PRP you))\n                  (NP (PRP$ my) (NN name)))))))))))', u'enhancedPlusPlusDependencies': [{u'dep': u'ROOT', u'dependent': 3, u'governorGloss': u'ROOT', u'governor': 0, u'dependentGloss': u'dont'}, {u'dep': u'neg', u'dependent': 1, u'governorGloss': u'dont', u'governor': 3, u'dependentGloss': u'no'}, {u'dep': u'compound', u'dependent': 2, u'governorGloss': u'dont', u'governor': 3, u'dependentGloss': u'i'}, {u'dep': u'acl:relcl', u'dependent': 4, u'governorGloss': u'dont', u'governor': 3, u'dependentGloss': u'want'}, {u'dep': u'mark', u'dependent': 5, u'governorGloss': u'tell', u'governor': 6, u'dependentGloss': u'to'}, {u'dep': u'xcomp', u'dependent': 6, u'governorGloss': u'want', u'governor': 4, u'dependentGloss': u'tell'}, {u'dep': u'iobj', u'dependent': 7, u'governorGloss': u'tell', u'governor': 6, u'dependentGloss': u'you'}, {u'dep': u'nmod:poss', u'dependent': 8, u'governorGloss': u'name', u'governor': 9, u'dependentGloss': u'my'}, {u'dep': u'dobj', u'dependent': 9, u'governorGloss': u'tell', u'governor': 6, u'dependentGloss': u'name'}]}]}

We can see that the sentiment Value is negative and sentiment value is 1

References

https://github.com/stanfordnlp/CoreNLP
https://stanfordnlp.github.io/CoreNLP/index.html

pyVision A Machine Learning and Signal Processing toolbox

NLP Basics of using Stanford corenlp server

pyVision
A Machine Learning and Signal Processing toolbox