Package | Description |
---|---|
org.apache.lucene.analysis.cn.smart | Analyzer for Simplified Chinese, which indexes words. |
org.apache.lucene.analysis.core | Basic, general-purpose analysis components. |
org.apache.lucene.analysis.custom | A general-purpose Analyzer that can be created with a builder-style API. |
org.apache.lucene.analysis.icu.segmentation | Tokenizer that breaks text into words with the Unicode Text Segmentation algorithm. |
org.apache.lucene.analysis.ja | Analyzer for Japanese. |
org.apache.lucene.analysis.ko | Analyzer for Korean. |
org.apache.lucene.analysis.ngram | Character n-gram tokenizers and filters. |
org.apache.lucene.analysis.path | Analysis components for path-like strings such as filenames. |
org.apache.lucene.analysis.pattern | Set of components for pattern-based (regex) analysis. |
org.apache.lucene.analysis.standard | Fast, general-purpose grammar-based tokenizer. StandardTokenizer implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
org.apache.lucene.analysis.th | Analyzer for Thai. |
org.apache.lucene.analysis.util | Utility functions for text analysis. |
org.apache.lucene.analysis.wikipedia | Tokenizer that is aware of Wikipedia syntax. |
org.apache.lucene.benchmark.byTask.utils | Utilities used for the benchmark, and for the reports. |

Modifier and Type | Class and Description |
---|---|
class | HMMChineseTokenizerFactory: Factory for HMMChineseTokenizer. |

Modifier and Type | Class and Description |
---|---|
class | KeywordTokenizerFactory: Factory for KeywordTokenizer. |
class | LetterTokenizerFactory: Factory for LetterTokenizer. |
class | LowerCaseTokenizerFactory: Deprecated. Use LetterTokenizerFactory followed by LowerCaseFilterFactory. |
class | WhitespaceTokenizerFactory: Factory for WhitespaceTokenizer. |
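
The deprecation note for LowerCaseTokenizerFactory recommends two separate steps: letter tokenization, then lowercasing. The following is a minimal plain-Java sketch of that two-step shape. It does not use the Lucene classes; `LetterThenLowerCaseSketch`, `letterTokens`, and `lowerCase` are hypothetical names introduced only for illustration, and the letter test is assumed to be `Character.isLetter`.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Illustrative only: mimics "letter tokenizer followed by lowercase filter" without Lucene.
class LetterThenLowerCaseSketch {

    // Step 1: emit maximal runs of letters as tokens; everything else is a separator.
    static List<String> letterTokens(String text) {
        List<String> tokens = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (int i = 0; i < text.length(); ) {
            int cp = text.codePointAt(i);
            if (Character.isLetter(cp)) {
                current.appendCodePoint(cp);
            } else if (current.length() > 0) {
                tokens.add(current.toString());
                current.setLength(0);
            }
            i += Character.charCount(cp);
        }
        if (current.length() > 0) {
            tokens.add(current.toString());
        }
        return tokens;
    }

    // Step 2: lowercase each token in a separate pass, as a filter would.
    static List<String> lowerCase(List<String> tokens) {
        List<String> out = new ArrayList<>();
        for (String token : tokens) {
            out.add(token.toLowerCase(Locale.ROOT));
        }
        return out;
    }

    public static void main(String[] args) {
        // prints [hello, world, foobar]
        System.out.println(lowerCase(letterTokens("Hello, World-2019 FooBar")));
    }
}
```

Keeping the two steps separate means the lowercasing pass can be reused after any other tokenizer, which is presumably the motivation for the deprecation.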

Modifier and Type | Method and Description |
---|---|
TokenizerFactory | CustomAnalyzer.getTokenizerFactory(): Returns the tokenizer factory that is used in this analyzer. |

Modifier and Type | Method and Description |
---|---|
CustomAnalyzer.Builder | CustomAnalyzer.Builder.withTokenizer(java.lang.Class<? extends TokenizerFactory> factory, java.util.Map<java.lang.String,java.lang.String> params): Uses the given tokenizer. |
CustomAnalyzer.Builder | CustomAnalyzer.Builder.withTokenizer(java.lang.Class<? extends TokenizerFactory> factory, java.lang.String... params): Uses the given tokenizer. |
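
The two withTokenizer overloads differ only in how factory parameters are supplied: as a `Map<String,String>` or as `String...`. Assuming the varargs form carries alternating keys and values (this page does not spell that out, so treat it as an assumption), the conversion between the two forms can be sketched as below. `TokenizerParamsSketch` and `toMap` are hypothetical names, not Lucene API, and `maxTokenLength` is used purely as an example key.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative only: converts alternating key/value varargs into a parameter map.
class TokenizerParamsSketch {

    // Hypothetical helper: ("k1", "v1", "k2", "v2") becomes {k1=v1, k2=v2}.
    static Map<String, String> toMap(String... params) {
        if (params.length % 2 != 0) {
            throw new IllegalArgumentException(
                "Expected key/value pairs, but got an odd number of params: " + params.length);
        }
        Map<String, String> map = new LinkedHashMap<>();
        for (int i = 0; i < params.length; i += 2) {
            map.put(params[i], params[i + 1]);
        }
        return map;
    }

    public static void main(String[] args) {
        // Example key chosen for illustration.
        System.out.println(toMap("maxTokenLength", "255"));
    }
}
```

Rejecting an odd number of arguments up front surfaces a dangling key at build time rather than as a confusing factory error later.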

Modifier and Type | Class and Description |
---|---|
class | ICUTokenizerFactory: Factory for ICUTokenizer. |

Modifier and Type | Class and Description |
---|---|
class | JapaneseTokenizerFactory: Factory for JapaneseTokenizer. |

Modifier and Type | Class and Description |
---|---|
class | KoreanTokenizerFactory: Factory for KoreanTokenizer. |

Modifier and Type | Class and Description |
---|---|
class | EdgeNGramTokenizerFactory: Creates new instances of EdgeNGramTokenizer. |
class | NGramTokenizerFactory: Factory for NGramTokenizer. |

Modifier and Type | Class and Description |
---|---|
class | PathHierarchyTokenizerFactory: Factory for PathHierarchyTokenizer. |

Modifier and Type | Class and Description |
---|---|
class | PatternTokenizerFactory: Factory for PatternTokenizer. |
class | SimplePatternSplitTokenizerFactory: Factory for SimplePatternSplitTokenizer, for producing tokens by splitting according to the provided regexp. |
class | SimplePatternTokenizerFactory: Factory for SimplePatternTokenizer, for matching tokens based on the provided regexp. |

Modifier and Type | Class and Description |
---|---|
class | ClassicTokenizerFactory: Factory for ClassicTokenizer. |
class | StandardTokenizerFactory: Factory for StandardTokenizer. |
class | UAX29URLEmailTokenizerFactory: Factory for UAX29URLEmailTokenizer. |

Modifier and Type | Class and Description |
---|---|
class | ThaiTokenizerFactory: Factory for ThaiTokenizer. |

Modifier and Type | Method and Description |
---|---|
static TokenizerFactory | TokenizerFactory.forName(java.lang.String name, java.util.Map<java.lang.String,java.lang.String> args): Looks up a tokenizer factory by name from the context classpath. |

Modifier and Type | Method and Description |
---|---|
static java.lang.Class<? extends TokenizerFactory> | TokenizerFactory.lookupClass(java.lang.String name): Looks up a tokenizer factory class by name from the context classpath. |
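
The general shape of a name-based lookup such as forName(name, args) can be sketched in plain Java as a registry from short names to constructors that accept an argument map. This is an assumption-laden illustration, not Lucene's mechanism: the real lookup discovers factories on the context classpath, whereas the registry here is filled by hand. `NamedFactoryLookupSketch`, `SplitterFactory`, the registered names, and the case-insensitive matching are all hypothetical choices made for this example.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;
import java.util.function.Function;

// Illustrative only: a hand-filled registry standing in for classpath-based discovery.
class NamedFactoryLookupSketch {

    // Hypothetical stand-in for a tokenizer factory: splits text by a configured rule.
    interface SplitterFactory {
        String[] split(String text);
    }

    // Maps a short name to a constructor that receives the argument map.
    private static final Map<String, Function<Map<String, String>, SplitterFactory>> REGISTRY =
        new HashMap<>();

    static {
        REGISTRY.put("whitespace", args -> text -> text.trim().split("\\s+"));
        REGISTRY.put("pattern", args -> {
            // "pattern" is an example argument name; defaults to splitting on commas.
            String regex = args.getOrDefault("pattern", ",");
            return text -> text.split(regex);
        });
    }

    // Resolves the name (case-insensitively, by choice of this sketch) and builds the factory.
    static SplitterFactory forName(String name, Map<String, String> args) {
        Function<Map<String, String>, SplitterFactory> ctor =
            REGISTRY.get(name.toLowerCase(Locale.ROOT));
        if (ctor == null) {
            throw new IllegalArgumentException("No factory registered under name '" + name + "'");
        }
        return ctor.apply(args);
    }

    public static void main(String[] args) {
        Map<String, String> params = new HashMap<>();
        params.put("pattern", ";");
        System.out.println(Arrays.toString(forName("pattern", params).split("x;y;z")));
        System.out.println(Arrays.toString(forName("whitespace", new HashMap<>()).split("a b  c")));
    }
}
```

Failing fast on an unknown name keeps configuration typos from silently falling back to a default tokenizer.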

Modifier and Type | Class and Description |
---|---|
class | WikipediaTokenizerFactory: Factory for WikipediaTokenizer. |

Constructor and Description |
---|
AnalyzerFactory(java.util.List<CharFilterFactory> charFilterFactories, TokenizerFactory tokenizerFactory, java.util.List<TokenFilterFactory> tokenFilterFactories) |

Copyright © 2000–2019 The Apache Software Foundation. All rights reserved.