opennlp.tools.tokenize
Class WhitespaceTokenizer

java.lang.Object
  extended by opennlp.tools.tokenize.WhitespaceTokenizer
All Implemented Interfaces:
Tokenizer

public class WhitespaceTokenizer
extends java.lang.Object

This tokenizer splits the input text into tokens at whitespace.


Field Summary
static WhitespaceTokenizer INSTANCE
          Use this static reference to retrieve an instance of the WhitespaceTokenizer.
 
Method Summary
 java.lang.String[] tokenize(java.lang.String s)
          Tokenize a string.
 Span[] tokenizePos(java.lang.String d)
          Tokenize a string.
 
Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
 

Field Detail

INSTANCE

public static final WhitespaceTokenizer INSTANCE
Use this static reference to retrieve an instance of the WhitespaceTokenizer.

Method Detail

tokenizePos

public Span[] tokenizePos(java.lang.String d)
Description copied from interface: Tokenizer
Tokenize a string.

Specified by:
tokenizePos in interface Tokenizer
Parameters:
d - The string to be tokenized.
Returns:
The Span[] with the spans (offsets into d) for each token as the individual array elements.
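
Example:
The following is a minimal sketch of how whitespace-delimited token offsets could be computed. It is not the library's implementation. The class WhitespaceSpanSketch and the method whitespaceSpans are hypothetical names, plain int pairs stand in for Span, the end offset is assumed to be exclusive, and Character.isWhitespace is assumed as the whitespace test.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of opennlp.tools.tokenize.
final class WhitespaceSpanSketch {

    // Returns one {start, end} pair per token.
    // Assumption of this sketch: end is exclusive, so d.substring(start, end) is the token.
    static int[][] whitespaceSpans(String d) {
        List<int[]> spans = new ArrayList<>();
        int tokenStart = -1; // -1 means "not currently inside a token"
        for (int i = 0; i < d.length(); i++) {
            if (Character.isWhitespace(d.charAt(i))) {
                if (tokenStart >= 0) {
                    // Whitespace closes the token that was open.
                    spans.add(new int[] {tokenStart, i});
                    tokenStart = -1;
                }
            } else if (tokenStart < 0) {
                // First non-whitespace character opens a new token.
                tokenStart = i;
            }
        }
        if (tokenStart >= 0) {
            // The input ended while a token was still open.
            spans.add(new int[] {tokenStart, d.length()});
        }
        return spans.toArray(new int[0][]);
    }

    public static void main(String[] args) {
        String d = "  Hello   world ";
        for (int[] span : whitespaceSpans(d)) {
            // Prints "2..7 Hello" and then "10..15 world".
            System.out.println(span[0] + ".." + span[1] + " " + d.substring(span[0], span[1]));
        }
    }
}
```

Because the offsets refer to the original string d, leading whitespace and runs of whitespace shift the reported positions but never appear inside a span.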

tokenize

public java.lang.String[] tokenize(java.lang.String s)
Description copied from interface: Tokenizer
Tokenize a string.

Specified by:
tokenize in interface Tokenizer
Parameters:
s - The string to be tokenized.
Returns:
The String[] with the individual tokens as the array elements.
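
Example:
The following is a minimal sketch of whitespace tokenization that returns the token strings. It is not the library's implementation. The class WhitespaceTokenizeSketch is a hypothetical name, and Character.isWhitespace is assumed as the whitespace test. With the library on the classpath, the documented entry point would be WhitespaceTokenizer.INSTANCE.tokenize(s).

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration class; not part of opennlp.tools.tokenize.
final class WhitespaceTokenizeSketch {

    // Splits s at whitespace and returns the non-empty pieces in order.
    static String[] tokenize(String s) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < s.length(); i++) {
            char c = s.charAt(i);
            if (Character.isWhitespace(c)) {
                // Whitespace ends the current token, if one has been started.
                if (current.length() > 0) {
                    tokens.add(current.toString());
                    current.setLength(0);
                }
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            // Flush the last token when the input does not end in whitespace.
            tokens.add(current.toString());
        }
        return tokens.toArray(new String[0]);
    }

    public static void main(String[] args) {
        // Prints "Hello,", "world!" and "foo" on separate lines.
        for (String token : tokenize("Hello, world!\tfoo\n")) {
            System.out.println(token);
        }
    }
}
```

Splitting only at whitespace leaves punctuation attached to the neighboring word, so "Hello," is a single token in this sketch.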


Copyright 2008 Jason Baldridge, Gann Bierner, and Thomas Morton. All Rights Reserved.