Skip to content

elitcloud/elit-java

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

65 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ELIT for Java

This project provides the Java SDK and components for the Evolution of Language and Information Technology (ELIT) platform. It is under the Apache 2 license and currently led by the Emory NLP research group.

Installation

Add the following dependency to the pom.xml in your maven project.

<dependency>
    <groupId>cloud.elit</groupId>
    <artifactId>elit-sdk</artifactId>
    <version>0.0.2</version>
</dependency>

Web API

The following code makes a HTTP request to retrieve NLP output for the input string from all components in spaCy.

  • Replace Fields.ALL with Fields.TOK, Fields.LEM, Fields.POS, Fields.NER, or Fields.DEP if you wish to perform the NLP pipeline only up to tokenization, lemmatization, part-of-speech tagging, named entity recognition, or dependency parsing, respectively.
  • Replace Tools.SPACY with Tools.ELIT or Tools.NLP4J if you want to use components provided by ELIT or NLP4J instead.
import cloud.elit.sdk.api.Client;
import cloud.elit.sdk.api.TaskRequest;
import Document;
import Tools;
import Fields;

public class DecodeWebAPITest {
    static public void main(String[] args) {
        Client api = new Client();
        String input = "Hello World! Welcome to ELIT.";
        TaskRequest r = new TaskRequest(input, Fields.ALL, Tools.SPACY);

        String output = api.decode(r);
        System.out.println(output);
    }
}

The web-API then retrieves the NLP output in JSON as follows:

{
  "output": [{
    "sid": 0,
    "tok": ["Hello", "World", "!"],
    "lem": ["hello", "world", "!"],
    "pos": ["UH", "NN", "."],
    "ner": [],
    "dep": [[1, "intj"], [-1, "ROOT"], [1, "punct"]]
  },
  {
    "sid": 1,
    "tok": ["Welcome", "to", "ELIT", "."],
    "lem": ["welcome", "to", "elit", "."],
    "pos": ["VBP", "IN", "NN", "."],
    "ner": [[2, 3, "ORG"]],
    "dep": [[-1, "ROOT"], [2, "aux"], [0, "xcomp"], [0, "punct"]]
  }],
  "pipeline": {
    "dep": "spacy",
    "lem": "spacy",
    "ner": "spacy",
    "pos": "spacy",
    "tok": "spacy"}
}

Our SDK provides a convenient wrapper class to read the JSON output and convert it into a structure (see the Javadoc for more details).

import cloud.elit.sdk.api.Client;
import cloud.elit.sdk.api.TaskRequest;
import Document;
import Tools;
import Fields;
import Document;
import Sentence;
import NLPNode;

public class DecodeWebAPITest {
    static public void main(String[] args) {
        Client api = new Client();
        String input = "Hello World! Welcome to ELIT.";
        TaskRequest r = new TaskRequest(input, Fields.ALL, Tools.SPACY);

        String output = api.decode(r);
        Document doc = new Document(output);

        for (Sentence sen : doc) {
            for (NLPNode node : sen)
                System.out.println(String.format("%s(%s, %s)", 
                        node.getDependencyLabel(), 
                        node.getToken(), 
                        node.getParent().getToken()));
            System.out.println();
        }
    }
}

The above code generates the following output:

intj(Hello, World)
ROOT(World, @#r$%)
punct(!, World)

ROOT(Welcome, @#r$%)
aux(to, ELIT)
xcomp(ELIT, Welcome)
punct(., Welcome)

Our SDK also allows you to create an NLP pipeline consisting of multiple tools. The following code makes a request specifying ELIT for tokenization, NLP4J for part-of-speech tagging and spaCy for dependency parsing.

TaskRequest r = new TaskRequest(input, Fields.DEP, Tools.SPACY);
r.setDependencies(new TaskDependency(Fields.TOK, Tools.ELIT), new TaskDependency(Fields.POS, Tools.NLP4J));