java.lang.Object
  net.java.sen.dictionary.Tokenizer
public abstract class Tokenizer
A String Tokenizer. The Tokenizer uses a Dictionary to assist the decomposition of strings into potential morphemes.
Field Summary | |
---|---|
protected Node | bosNode: A Node representing a beginning-of-string |
protected Dictionary | dictionary: The Dictionary used to find possible morphemes |
protected Node | eosNode: A Node representing an end-of-string |
protected CToken | unknownCToken: A CToken representing an unknown morpheme |
protected java.lang.String | unknownPartOfSpeechDescription: The part-of-speech code to use for unknown tokens |
Constructor Summary | |
---|---|
Tokenizer(Dictionary dictionary, java.lang.String unknownPartOfSpeechDescription): Constructs a new Tokenizer that uses the specified Dictionary to find possible morphemes within a given string |
Method Summary | |
---|---|
Node | getBOSNode(): Creates a unique beginning-of-string Node. |
Dictionary | getDictionary() |
Node | getEOSNode(): Creates a unique end-of-string Node. |
Node | getUnknownNode(char[] surface, int start, int length, int span): Creates an "unknown morpheme" Node with the specified characteristics. |
abstract Node | lookup(SentenceIterator iterator, char[] surface): Searches for possible morphemes from the given SentenceIterator. |
Methods inherited from class java.lang.Object |
---|
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait |
Field Detail |
---|
protected Dictionary dictionary
The Dictionary used to find possible morphemes
protected CToken unknownCToken
A CToken representing an unknown morpheme
protected Node bosNode
A Node representing a beginning-of-string
protected Node eosNode
A Node representing an end-of-string
protected java.lang.String unknownPartOfSpeechDescription
The part-of-speech code to use for unknown tokens
Constructor Detail |
---|
public Tokenizer(Dictionary dictionary, java.lang.String unknownPartOfSpeechDescription)
Constructs a new Tokenizer that uses the specified Dictionary to find possible morphemes within a given string.
Parameters:
dictionary - The Dictionary to search within
unknownPartOfSpeechDescription - The part-of-speech code to use for unknown tokens
Method Detail |
---|
public Dictionary getDictionary()
public Node getBOSNode()
Creates a unique beginning-of-string Node. The Node returned by this method is freshly cloned and not an alias of any other Node.
Returns:
A beginning-of-string Node
public Node getEOSNode()
Creates a unique end-of-string Node. The Node returned by this method is freshly cloned and not an alias of any other Node.
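
The "freshly cloned" guarantee means a caller can modify a returned Node without affecting the Node handed out by a later call. The following is a minimal sketch of that prototype-and-clone pattern. `SketchNode`, `SketchTokenizer`, and the `label` field are hypothetical stand-ins invented for illustration; they are not Sen's `Node` or `Tokenizer`, whose internals are not shown on this page.

```java
// Hypothetical stand-in for Node; only what the sketch needs.
class SketchNode implements Cloneable {
    String label; // illustrative field, not part of the documented API
    int start;

    SketchNode(String label) {
        this.label = label;
    }

    @Override
    public SketchNode clone() {
        try {
            return (SketchNode) super.clone();
        } catch (CloneNotSupportedException e) {
            throw new AssertionError(e); // unreachable: the class is Cloneable
        }
    }
}

// Hypothetical stand-in for Tokenizer: keeps prototype nodes, hands out copies.
class SketchTokenizer {
    protected final SketchNode bosNode = new SketchNode("BOS");
    protected final SketchNode eosNode = new SketchNode("EOS");

    SketchNode getBOSNode() {
        return bosNode.clone();
    }

    SketchNode getEOSNode() {
        return eosNode.clone();
    }
}

class CloneDemo {
    public static void main(String[] args) {
        SketchTokenizer tokenizer = new SketchTokenizer();
        SketchNode first = tokenizer.getBOSNode();
        SketchNode second = tokenizer.getBOSNode();

        first.start = 42; // mutate one copy only
        System.out.println(first != second); // the two calls returned distinct objects
        System.out.println(second.start);    // the second copy keeps its original value
    }
}
```

Returning copies rather than the shared prototype trades a small allocation per call for freedom from aliasing bugs when nodes are later linked into a lattice.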
public Node getUnknownNode(char[] surface, int start, int length, int span)
Creates an "unknown morpheme" Node with the specified characteristics. The Node returned by this method is freshly cloned and not an alias of any other Node.
Parameters:
surface - The underlying surface of which the Node is part
start - The index within the surface of the first character of the Node
length - The length of the Node
span - The span of the Node
Returns:
The "unknown morpheme" Node
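
To make the `surface`, `start`, and `length` parameters concrete, here is a small sketch of how a node could refer to a slice of a shared character array. `UnknownNodeSketch` and its `text()` helper are hypothetical and exist only for this illustration. The sketch assumes the node's text is the slice `surface[start, start + length)`; the meaning of `span` is not spelled out on this page, so the value is stored without interpretation.

```java
// Hypothetical stand-in for Node, carrying only the four documented characteristics.
class UnknownNodeSketch {
    final char[] surface; // shared backing array, not copied
    final int start;
    final int length;
    final int span;       // stored as given; semantics not assumed here

    UnknownNodeSketch(char[] surface, int start, int length, int span) {
        if (start < 0 || length < 0 || start + length > surface.length) {
            throw new IllegalArgumentException("start/length outside surface");
        }
        this.surface = surface;
        this.start = start;
        this.length = length;
        this.span = span;
    }

    // Assumption: the node's text is the slice surface[start, start + length).
    String text() {
        return new String(surface, start, length);
    }
}

class UnknownNodeDemo {
    public static void main(String[] args) {
        char[] surface = "parse xyzzy now".toCharArray();
        // "xyzzy" begins at index 6 and is 5 characters long.
        UnknownNodeSketch node = new UnknownNodeSketch(surface, 6, 5, 5);
        System.out.println(node.text());
    }
}
```

Keeping a reference to the shared array plus offsets avoids allocating a new string for every candidate morpheme.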
public abstract Node lookup(SentenceIterator iterator, char[] surface) throws java.io.IOException
Searches for possible morphemes from the given SentenceIterator. The Node that is returned links through Node.rnext to a list of matches which may be of varying lengths.
Parameters:
iterator - The iterator to search from
surface - The underlying character surface
Returns:
Nodes representing the possible morphemes beginning at the given index
Throws:
java.io.IOException
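
The shape of the result, a head Node whose `rnext` link leads to further matches of varying lengths, can be sketched with a toy example. Everything here is an assumption made for illustration: `MatchNode` and `ToyTokenizer` are hypothetical, a plain word list stands in for a Dictionary, and an `int position` replaces the `SentenceIterator` argument. It is not Sen's lookup algorithm.

```java
// Hypothetical stand-in for Node: one candidate morpheme plus the rnext link.
class MatchNode {
    final int start;
    final int length;
    MatchNode rnext; // next candidate beginning at the same position

    MatchNode(int start, int length) {
        this.start = start;
        this.length = length;
    }
}

// Hypothetical toy tokenizer; the word list stands in for a Dictionary.
class ToyTokenizer {
    private final String[] words;

    ToyTokenizer(String... words) {
        this.words = words;
    }

    // Returns the head of an rnext-linked chain of all words matching at
    // 'position', or null when nothing matches.
    MatchNode lookup(char[] surface, int position) {
        MatchNode head = null;
        for (String word : words) {
            if (matchesAt(surface, position, word)) {
                MatchNode node = new MatchNode(position, word.length());
                node.rnext = head; // prepend to the chain
                head = node;
            }
        }
        return head;
    }

    private static boolean matchesAt(char[] surface, int position, String word) {
        if (position + word.length() > surface.length) {
            return false;
        }
        for (int i = 0; i < word.length(); i++) {
            if (surface[position + i] != word.charAt(i)) {
                return false;
            }
        }
        return true;
    }
}

class LookupDemo {
    public static void main(String[] args) {
        char[] surface = "tokyoto".toCharArray();
        ToyTokenizer tokenizer = new ToyTokenizer("to", "tokyo", "kyoto");
        // Walk the chain of candidates that begin at index 0.
        for (MatchNode n = tokenizer.lookup(surface, 0); n != null; n = n.rnext) {
            System.out.println(new String(surface, n.start, n.length));
        }
    }
}
```

A caller walks the chain until `rnext` is null; each candidate can then be weighed against the others when choosing a segmentation.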