opennlp.tools.lang.english
Class TreebankChunker

java.lang.Object
  extended by opennlp.tools.chunker.ChunkerME
      extended by opennlp.tools.lang.english.TreebankChunker
All Implemented Interfaces:
Chunker

public class TreebankChunker
extends ChunkerME

A chunker based on the CoNLL-2000 chunking task, which uses Penn Treebank constituents as the basis for its chunks. See http://cnts.uia.ac.be/conll2000/chunking/ for the data and task definition.
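
For orientation, the CoNLL-2000 task labels each token with a tag of the form B-X (begins a chunk of type X), I-X (continues a chunk of type X), or O (outside any chunk). The following is a minimal, self-contained sketch of how such tags group tokens into chunks. The class `BioChunkSketch` and its method `toChunks` are hypothetical names used only for illustration; they are not part of OpenNLP, and the bracketed output format is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of OpenNLP.
public class BioChunkSketch {

    /**
     * Groups tokens into bracketed chunks from B-X / I-X / O tags.
     * A stray I-X that does not continue an open chunk of type X is
     * treated as starting a new chunk (an assumption of this sketch).
     */
    public static List<String> toChunks(String[] tokens, String[] tags) {
        if (tokens.length != tags.length) {
            throw new IllegalArgumentException("tokens and tags must have the same length");
        }
        List<String> chunks = new ArrayList<String>();
        StringBuilder current = null; // text of the chunk being built, if any
        String currentType = null;    // type of the chunk being built, e.g. "NP"
        for (int i = 0; i < tokens.length; i++) {
            String tag = tags[i];
            boolean continues = tag.startsWith("I-")
                    && current != null
                    && tag.substring(2).equals(currentType);
            if (continues) {
                current.append(' ').append(tokens[i]);
                continue;
            }
            // Close the open chunk, if there is one.
            if (current != null) {
                chunks.add(current.append(" ]").toString());
                current = null;
                currentType = null;
            }
            if (tag.startsWith("B-") || tag.startsWith("I-")) {
                currentType = tag.substring(2);
                current = new StringBuilder("[" + currentType + " " + tokens[i]);
            } else {
                // "O": the token stands outside any chunk.
                chunks.add(tokens[i]);
            }
        }
        if (current != null) {
            chunks.add(current.append(" ]").toString());
        }
        return chunks;
    }

    public static void main(String[] args) {
        String[] tokens = {"He", "reckons", "the", "current", "account", "deficit", "will", "narrow", "."};
        String[] tags = {"B-NP", "B-VP", "B-NP", "I-NP", "I-NP", "I-NP", "B-VP", "I-VP", "O"};
        // prints: [NP He ] [VP reckons ] [NP the current account deficit ] [VP will narrow ] .
        System.out.println(String.join(" ", toChunks(tokens, tags)));
    }
}
```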

Author:
Tom Morton

Field Summary
 
Fields inherited from class opennlp.tools.chunker.ChunkerME
beam, model
 
Constructor Summary
TreebankChunker(opennlp.maxent.MaxentModel mod)
          Creates an English Treebank Chunker which uses the specified model.
TreebankChunker(opennlp.maxent.MaxentModel mod, ChunkerContextGenerator cg)
          Creates an English Treebank Chunker which uses the specified model and context generator.
TreebankChunker(opennlp.maxent.MaxentModel mod, ChunkerContextGenerator cg, int beamSize)
          Creates an English Treebank Chunker which uses the specified model and context generator, and decodes using the specified beam size.
TreebankChunker(java.lang.String modelFile)
          Creates an English Treebank Chunker which uses the specified model file.
 
Method Summary
static void main(java.lang.String[] args)
          Chunks tokenized input from stdin.
protected  boolean validOutcome(java.lang.String outcome, java.lang.String[] sequence)
          Determines whether the outcome is valid for the preceding sequence.
 
Methods inherited from class opennlp.tools.chunker.ChunkerME
chunk, chunk, probs, probs, train
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Constructor Detail

TreebankChunker

public TreebankChunker(java.lang.String modelFile)
                throws java.io.IOException
Creates an English Treebank Chunker which uses the specified model file.

Parameters:
modelFile - The name of the maxent model to be used.
Throws:
java.io.IOException - When the model file can't be opened or read.

TreebankChunker

public TreebankChunker(opennlp.maxent.MaxentModel mod)
Creates an English Treebank Chunker which uses the specified model.

Parameters:
mod - The maxent model to be used.

TreebankChunker

public TreebankChunker(opennlp.maxent.MaxentModel mod,
                       ChunkerContextGenerator cg)
Creates an English Treebank Chunker which uses the specified model and context generator.

Parameters:
mod - The maxent model to be used.
cg - The context generator to be used.

TreebankChunker

public TreebankChunker(opennlp.maxent.MaxentModel mod,
                       ChunkerContextGenerator cg,
                       int beamSize)
Creates an English Treebank Chunker which uses the specified model and context generator, and decodes using the specified beam size.

Parameters:
mod - The maxent model to be used.
cg - The context generator to be used.
beamSize - The size of the beam used for decoding.
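
To illustrate what a beam size controls in general, here is a generic beam-search sketch over outcome sequences. It is not OpenNLP's implementation: the class `BeamSearchSketch`, the `Scorer` interface, and the toy probabilities are all hypothetical and stand in for the maxent model and context generator. With a beam of 1 the search is greedy; a wider beam keeps more partial sequences alive and can recover a better overall sequence.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of OpenNLP.
public class BeamSearchSketch {

    /** Hypothetical scoring hook: log-probability of an outcome given the outcomes chosen so far. */
    public interface Scorer {
        double logProb(int position, String outcome, List<String> previous);
    }

    /** A partial outcome sequence together with its accumulated log-probability. */
    private static final class Hypothesis {
        final List<String> outcomes;
        final double score;

        Hypothesis(List<String> outcomes, double score) {
            this.outcomes = outcomes;
            this.score = score;
        }
    }

    /** Returns the best-scoring sequence found while keeping at most beamSize hypotheses per step. */
    public static List<String> decode(int length, String[] outcomes, Scorer scorer, int beamSize) {
        if (beamSize < 1 || outcomes.length == 0) {
            throw new IllegalArgumentException("need beamSize >= 1 and at least one outcome");
        }
        List<Hypothesis> beam = new ArrayList<Hypothesis>();
        beam.add(new Hypothesis(new ArrayList<String>(), 0.0));
        for (int pos = 0; pos < length; pos++) {
            // Extend every surviving hypothesis with every possible outcome.
            List<Hypothesis> next = new ArrayList<Hypothesis>();
            for (Hypothesis h : beam) {
                for (String outcome : outcomes) {
                    List<String> extended = new ArrayList<String>(h.outcomes);
                    extended.add(outcome);
                    next.add(new Hypothesis(extended, h.score + scorer.logProb(pos, outcome, h.outcomes)));
                }
            }
            // Keep only the beamSize highest-scoring hypotheses.
            next.sort((a, b) -> Double.compare(b.score, a.score));
            beam = new ArrayList<Hypothesis>(next.subList(0, Math.min(beamSize, next.size())));
        }
        return beam.get(0).outcomes;
    }

    /** Toy scorer: "A" looks best first, but "B" followed by "A" is the best pair overall. */
    public static Scorer toyScorer() {
        return (pos, outcome, prev) -> {
            if (pos == 0) {
                return Math.log(outcome.equals("A") ? 0.6 : 0.4);
            }
            boolean afterA = prev.get(prev.size() - 1).equals("A");
            double pA = afterA ? 0.55 : 0.9;
            return Math.log(outcome.equals("A") ? pA : 1.0 - pA);
        };
    }

    public static void main(String[] args) {
        String[] outcomes = {"A", "B"};
        System.out.println(decode(2, outcomes, toyScorer(), 1)); // prints [A, A] (probability 0.33)
        System.out.println(decode(2, outcomes, toyScorer(), 2)); // prints [B, A] (probability 0.36)
    }
}
```
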
Method Detail

validOutcome

protected boolean validOutcome(java.lang.String outcome,
                               java.lang.String[] sequence)
Description copied from class: ChunkerME
Determines whether the outcome is valid for the preceding sequence. This can be used to implement constraints on which sequences are valid.

Overrides:
validOutcome in class ChunkerME
Parameters:
outcome - The outcome.
sequence - The preceding sequence of outcome assignments.
Returns:
true if the outcome is valid for the sequence, false otherwise.
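
As an illustration of the kind of constraint such a method can express, the sketch below rejects an I-X outcome unless the previous outcome belongs to a chunk of the same type X. It is a standalone example written for CoNLL-style B-X / I-X / O tags; the class name `BioConstraintSketch` is hypothetical, and the exact rule applied by this class is not stated in this documentation, so treat the logic as an assumption.

```java
// Hypothetical illustration class; not part of OpenNLP.
public class BioConstraintSketch {

    /**
     * Returns true if the outcome may follow the given sequence of earlier outcomes.
     * Assumed rule: I-X is valid only directly after B-X or I-X of the same type X;
     * B-X and O are always valid.
     */
    public static boolean validOutcome(String outcome, String[] sequence) {
        if (!outcome.startsWith("I-")) {
            return true;
        }
        if (sequence.length == 0) {
            return false; // a chunk cannot be continued at the start of a sentence
        }
        String previous = sequence[sequence.length - 1];
        if (previous.equals("O")) {
            return false; // nothing to continue after an outside tag
        }
        // Compare chunk types, i.e. the part after the "B-" or "I-" prefix.
        return previous.length() > 2 && previous.substring(2).equals(outcome.substring(2));
    }

    public static void main(String[] args) {
        System.out.println(validOutcome("I-NP", new String[] {"B-NP"})); // prints true
        System.out.println(validOutcome("I-NP", new String[] {"B-VP"})); // prints false
        System.out.println(validOutcome("I-NP", new String[] {}));       // prints false
    }
}
```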

main

public static void main(java.lang.String[] args)
                 throws java.io.IOException
Chunks tokenized input from stdin.
Usage: java opennlp.tools.lang.english.TreebankChunker model < tokenized_sentences

Throws:
java.io.IOException


Copyright 2008 Jason Baldridge, Gann Bierner, and Thomas Morton. All Rights Reserved.