Package | Description |
---|---|
org.apache.lucene.analysis | Text analysis. |
org.apache.lucene.analysis.ar | Analyzer for Arabic. |
org.apache.lucene.analysis.bg | Analyzer for Bulgarian. |
org.apache.lucene.analysis.bn | Analyzer for Bengali. |
org.apache.lucene.analysis.br | Analyzer for Brazilian Portuguese. |
org.apache.lucene.analysis.ca | Analyzer for Catalan. |
org.apache.lucene.analysis.cjk | Analyzer for Chinese, Japanese, and Korean, which indexes bigrams. |
org.apache.lucene.analysis.ckb | Analyzer for Sorani Kurdish. |
org.apache.lucene.analysis.cn.smart | Analyzer for Simplified Chinese, which indexes words. |
org.apache.lucene.analysis.core | Basic, general-purpose analysis components. |
org.apache.lucene.analysis.custom | A general-purpose Analyzer that can be created with a builder-style API. |
org.apache.lucene.analysis.cz | Analyzer for Czech. |
org.apache.lucene.analysis.da | Analyzer for Danish. |
org.apache.lucene.analysis.de | Analyzer for German. |
org.apache.lucene.analysis.el | Analyzer for Greek. |
org.apache.lucene.analysis.en | Analyzer for English. |
org.apache.lucene.analysis.es | Analyzer for Spanish. |
org.apache.lucene.analysis.eu | Analyzer for Basque. |
org.apache.lucene.analysis.fa | Analyzer for Persian. |
org.apache.lucene.analysis.fi | Analyzer for Finnish. |
org.apache.lucene.analysis.fr | Analyzer for French. |
org.apache.lucene.analysis.ga | Analyzer for Irish. |
org.apache.lucene.analysis.gl | Analyzer for Galician. |
org.apache.lucene.analysis.hi | Analyzer for Hindi. |
org.apache.lucene.analysis.hu | Analyzer for Hungarian. |
org.apache.lucene.analysis.hy | Analyzer for Armenian. |
org.apache.lucene.analysis.id | Analyzer for Indonesian. |
org.apache.lucene.analysis.it | Analyzer for Italian. |
org.apache.lucene.analysis.ja | Analyzer for Japanese. |
org.apache.lucene.analysis.ko | Analyzer for Korean. |
org.apache.lucene.analysis.lt | Analyzer for Lithuanian. |
org.apache.lucene.analysis.lv | Analyzer for Latvian. |
org.apache.lucene.analysis.miscellaneous | Miscellaneous TokenStreams. |
org.apache.lucene.analysis.morfologik | Dictionary-driven lemmatization ("accurate stemming") filter and analyzer for Polish, driven by the Morfologik library developed by Dawid Weiss and Marcin Miłkowski. |
org.apache.lucene.analysis.nl | Analyzer for Dutch. |
org.apache.lucene.analysis.no | Analyzer for Norwegian. |
org.apache.lucene.analysis.pl | Analyzer for Polish. |
org.apache.lucene.analysis.pt | Analyzer for Portuguese. |
org.apache.lucene.analysis.query | Automatically filters high-frequency stopwords. |
org.apache.lucene.analysis.ro | Analyzer for Romanian. |
org.apache.lucene.analysis.ru | Analyzer for Russian. |
org.apache.lucene.analysis.shingle | Word n-gram filters. |
org.apache.lucene.analysis.standard | Fast, general-purpose, grammar-based tokenizer StandardTokenizer, which implements the Word Break rules from the Unicode Text Segmentation algorithm, as specified in Unicode Standard Annex #29. |
org.apache.lucene.analysis.sv | Analyzer for Swedish. |
org.apache.lucene.analysis.synonym | Analysis components for synonyms. |
org.apache.lucene.analysis.th | Analyzer for Thai. |
org.apache.lucene.analysis.tr | Analyzer for Turkish. |
org.apache.lucene.analysis.uk | Analyzer for Ukrainian. |
org.apache.lucene.benchmark.byTask | Benchmarking Lucene by tasks. |
org.apache.lucene.benchmark.byTask.tasks | Extendable benchmark tasks. |
org.apache.lucene.benchmark.byTask.utils | Utilities used for the benchmark and for the reports. |
org.apache.lucene.classification | Uses already-seen data (the indexed documents) to classify an input (either simple text or a structured document). |
org.apache.lucene.classification.document | Uses already-seen data (the indexed documents) to classify new documents. |
org.apache.lucene.classification.utils | Utilities for evaluation, data preparation, etc. |
org.apache.lucene.codecs | Codecs API: API for customizing the encoding and structure of the index. |
org.apache.lucene.collation | Unicode collation support. |
org.apache.lucene.document | The logical representation of a Document for indexing and searching. |
org.apache.lucene.index | Code to maintain and access indices. |
org.apache.lucene.index.memory | High-performance, single-document, in-memory Apache Lucene full-text search index. |
org.apache.lucene.queries.mlt | Document similarity query generators. |
org.apache.lucene.queryparser.classic | A simple query parser implemented with JavaCC. |
org.apache.lucene.queryparser.complexPhrase | QueryParser which permits complex phrase query syntax, e.g. "(john jon jonathan~) peters*". |
org.apache.lucene.queryparser.ext | Extendable QueryParser that provides a simple and flexible extension mechanism by overloading query field names. |
org.apache.lucene.queryparser.flexible.precedence | Precedence query parser implementation. |
org.apache.lucene.queryparser.flexible.standard | Implementation of the Lucene classic query parser using the flexible query parser frameworks. |
org.apache.lucene.queryparser.flexible.standard.config | Standard Lucene query configuration. |
org.apache.lucene.queryparser.simple | A simple query parser for human-entered queries. |
org.apache.lucene.queryparser.xml | Parser that produces Lucene Query objects from XML streams. |
org.apache.lucene.queryparser.xml.builders | XML parser factories for different Lucene Query/Filters. |
org.apache.lucene.sandbox.queries | Additional queries (some may have caveats or limitations). |
org.apache.lucene.search | Code to search indices. |
org.apache.lucene.search.highlight | Highlighting search terms. |
org.apache.lucene.search.suggest.analyzing | Analyzer-based autosuggest. |
org.apache.lucene.search.suggest.document | Support for document suggestion. |
org.apache.lucene.search.uhighlight | The UnifiedHighlighter: a flexible highlighter that can get offsets from postings, term vectors, or analysis. |
org.apache.lucene.util | Some utility classes. |
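Every entry in the tables that follow ultimately consumes an Analyzer the same way: obtain a TokenStream, reset it, iterate its tokens, then end and close it. A minimal sketch of that workflow, assuming lucene-core and the analyzers-common module are on the classpath (the field name and text are arbitrary):

```java
import java.io.IOException;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.analysis.tokenattributes.CharTermAttribute;

public class TokenStreamDemo {
  public static void main(String[] args) throws IOException {
    try (Analyzer analyzer = new StandardAnalyzer()) {
      // tokenStream() may reuse components per thread; the
      // reset/incrementToken/end/close contract is mandatory.
      try (TokenStream ts = analyzer.tokenStream("body", "The Quick Brown Fox")) {
        CharTermAttribute term = ts.addAttribute(CharTermAttribute.class);
        ts.reset();
        while (ts.incrementToken()) {
          System.out.println(term.toString()); // one lowercased token per line
        }
        ts.end();
      }
    }
  }
}
```

Skipping reset() or close() is the most common consumer bug; Analyzer caches its components, so a stream left unclosed poisons the next call for that thread.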
Modifier and Type | Class and Description |
---|---|
class | AnalyzerWrapper: Extension to Analyzer suitable for Analyzers which wrap other Analyzers. |
class | DelegatingAnalyzerWrapper: An analyzer wrapper that does not allow wrapping of components or readers. |
class | MockAnalyzer: Analyzer for testing. |
class | MockBytesAnalyzer: Analyzer for testing that encodes terms as UTF-16 bytes. |
class | MockPayloadAnalyzer: Wraps a whitespace tokenizer with a filter that sets the first token and odd tokens to posinc=1 and all others to 0, encoding the position as "pos: XXX" in the payload. |
class | MockSynonymAnalyzer: Adds a synonym of "dog" for "dogs", and a synonym of "cavy" for "guinea pig". |
class | StopwordAnalyzerBase: Base class for Analyzers that need to make use of stopword sets. |
Modifier and Type | Method and Description |
---|---|
protected abstract Analyzer | AnalyzerWrapper.getWrappedAnalyzer(java.lang.String fieldName): Retrieves the wrapped Analyzer appropriate for analyzing the field with the given name. |
Modifier and Type | Method and Description |
---|---|
static void | BaseTokenStreamTestCase.assertAnalyzesTo(Analyzer a, java.lang.String input, java.lang.String[] output) |
static void | BaseTokenStreamTestCase.assertAnalyzesTo(Analyzer a, java.lang.String input, java.lang.String[] output, int[] posIncrements) |
static void | BaseTokenStreamTestCase.assertAnalyzesTo(Analyzer a, java.lang.String input, java.lang.String[] output, int[] startOffsets, int[] endOffsets) |
static void | BaseTokenStreamTestCase.assertAnalyzesTo(Analyzer a, java.lang.String input, java.lang.String[] output, int[] startOffsets, int[] endOffsets, int[] posIncrements) |
static void | BaseTokenStreamTestCase.assertAnalyzesTo(Analyzer a, java.lang.String input, java.lang.String[] output, int[] startOffsets, int[] endOffsets, java.lang.String[] types, int[] posIncrements) |
static void | BaseTokenStreamTestCase.assertAnalyzesTo(Analyzer a, java.lang.String input, java.lang.String[] output, int[] startOffsets, int[] endOffsets, java.lang.String[] types, int[] posIncrements, int[] posLengths) |
static void | BaseTokenStreamTestCase.assertAnalyzesTo(Analyzer a, java.lang.String input, java.lang.String[] output, int[] startOffsets, int[] endOffsets, java.lang.String[] types, int[] posIncrements, int[] posLengths, boolean graphOffsetsAreCorrect) |
static void | BaseTokenStreamTestCase.assertAnalyzesTo(Analyzer a, java.lang.String input, java.lang.String[] output, int[] startOffsets, int[] endOffsets, java.lang.String[] types, int[] posIncrements, int[] posLengths, boolean graphOffsetsAreCorrect, byte[][] payloads) |
static void | BaseTokenStreamTestCase.assertAnalyzesTo(Analyzer a, java.lang.String input, java.lang.String[] output, java.lang.String[] types) |
static void | BaseTokenStreamTestCase.assertAnalyzesToPositions(Analyzer a, java.lang.String input, java.lang.String[] output, int[] posIncrements, int[] posLengths) |
static void | BaseTokenStreamTestCase.assertAnalyzesToPositions(Analyzer a, java.lang.String input, java.lang.String[] output, java.lang.String[] types, int[] posIncrements, int[] posLengths) |
static void | BaseTokenStreamTestCase.assertGraphStrings(Analyzer analyzer, java.lang.String text, java.lang.String... expectedStrings): Enumerates all accepted strings in the token graph created by the analyzer on the provided text, and then asserts that they are equal to the expected strings. |
void | CollationTestBase.assertThreadSafe(Analyzer analyzer) |
static void | VocabularyAssert.assertVocabulary(Analyzer a, java.io.InputStream vocOut): Runs a vocabulary test against one tab-separated file. |
static void | VocabularyAssert.assertVocabulary(Analyzer a, java.io.InputStream voc, java.io.InputStream out): Runs a vocabulary test against two data files. |
static void | VocabularyAssert.assertVocabulary(Analyzer a, java.nio.file.Path zipFile, java.lang.String vocOut): Runs a vocabulary test against a tab-separated data file inside a zip file. |
static void | VocabularyAssert.assertVocabulary(Analyzer a, java.nio.file.Path zipFile, java.lang.String voc, java.lang.String out): Runs a vocabulary test against two data files inside a zip file. |
static void | BaseTokenStreamTestCase.checkAnalysisConsistency(java.util.Random random, Analyzer a, boolean useCharFilter, java.lang.String text) |
static void | BaseTokenStreamTestCase.checkAnalysisConsistency(java.util.Random random, Analyzer a, boolean useCharFilter, java.lang.String text, boolean graphOffsetsAreCorrect) |
static void | BaseTokenStreamTestCase.checkOneTerm(Analyzer a, java.lang.String input, java.lang.String expected) |
static void | BaseTokenStreamTestCase.checkRandomData(java.util.Random random, Analyzer a, int iterations): Utility method for blasting TokenStreams with data to make sure they don't do anything crazy. |
static void | BaseTokenStreamTestCase.checkRandomData(java.util.Random random, Analyzer a, int iterations, boolean simple): Utility method for blasting TokenStreams with data to make sure they don't do anything crazy. |
static void | BaseTokenStreamTestCase.checkRandomData(java.util.Random random, Analyzer a, int iterations, int maxWordLength): Utility method for blasting TokenStreams with data to make sure they don't do anything crazy. |
static void | BaseTokenStreamTestCase.checkRandomData(java.util.Random random, Analyzer a, int iterations, int maxWordLength, boolean simple) |
static void | BaseTokenStreamTestCase.checkRandomData(java.util.Random random, Analyzer a, int iterations, int maxWordLength, boolean simple, boolean graphOffsetsAreCorrect) |
static java.util.Set<java.lang.String> | BaseTokenStreamTestCase.getGraphStrings(Analyzer analyzer, java.lang.String text): Returns all paths accepted by the token stream graph produced by analyzing text with the provided analyzer. |
abstract Analyzer.TokenStreamComponents | Analyzer.ReuseStrategy.getReusableComponents(Analyzer analyzer, java.lang.String fieldName): Gets the reusable TokenStreamComponents for the field with the given name. |
protected java.lang.Object | Analyzer.ReuseStrategy.getStoredValue(Analyzer analyzer): Returns the currently stored value. |
abstract void | Analyzer.ReuseStrategy.setReusableComponents(Analyzer analyzer, java.lang.String fieldName, Analyzer.TokenStreamComponents components): Stores the given TokenStreamComponents as the reusable components for the field with the given name. |
protected void | Analyzer.ReuseStrategy.setStoredValue(Analyzer analyzer, java.lang.Object storedValue): Sets the stored value. |
void | CollationTestBase.testFarsiRangeFilterCollating(Analyzer analyzer, BytesRef firstBeg, BytesRef firstEnd, BytesRef secondBeg, BytesRef secondEnd) |
void | CollationTestBase.testFarsiRangeQueryCollating(Analyzer analyzer, BytesRef firstBeg, BytesRef firstEnd, BytesRef secondBeg, BytesRef secondEnd) |
void | CollationTestBase.testFarsiTermRangeQuery(Analyzer analyzer, BytesRef firstBeg, BytesRef firstEnd, BytesRef secondBeg, BytesRef secondEnd) |
protected java.lang.String | BaseTokenStreamTestCase.toDot(Analyzer a, java.lang.String inputText) |
protected void | BaseTokenStreamTestCase.toDotFile(Analyzer a, java.lang.String inputText, java.lang.String localFileName) |
static java.lang.String | BaseTokenStreamTestCase.toString(Analyzer analyzer, java.lang.String text): Returns a String summary of the tokens this analyzer produces on this text. |
Modifier and Type | Method and Description |
---|---|
protected static CharArraySet | StopwordAnalyzerBase.loadStopwordSet(boolean ignoreCase, java.lang.Class<? extends Analyzer> aClass, java.lang.String resource, java.lang.String comment): Creates a CharArraySet from a file resource associated with a class. |
Modifier and Type | Class and Description |
---|---|
class | ArabicAnalyzer: Analyzer for Arabic. |

Modifier and Type | Class and Description |
---|---|
class | BulgarianAnalyzer: Analyzer for Bulgarian. |

Modifier and Type | Class and Description |
---|---|
class | BengaliAnalyzer: Analyzer for Bengali. |

Modifier and Type | Class and Description |
---|---|
class | BrazilianAnalyzer: Analyzer for the Brazilian Portuguese language. |

Modifier and Type | Class and Description |
---|---|
class | CatalanAnalyzer: Analyzer for Catalan. |

Modifier and Type | Class and Description |
---|---|
class | CJKAnalyzer: An Analyzer that tokenizes text with StandardTokenizer, normalizes content with CJKWidthFilter, folds case with LowerCaseFilter, forms bigrams of CJK with CJKBigramFilter, and filters stopwords with StopFilter. |

Modifier and Type | Class and Description |
---|---|
class | SoraniAnalyzer: Analyzer for Sorani Kurdish. |

Modifier and Type | Class and Description |
---|---|
class | SmartChineseAnalyzer: An analyzer for Chinese or mixed Chinese-English text. |

Modifier and Type | Class and Description |
---|---|
class | KeywordAnalyzer: "Tokenizes" the entire stream as a single token. |
class | SimpleAnalyzer |
class | StopAnalyzer |
class | UnicodeWhitespaceAnalyzer: An Analyzer that uses UnicodeWhitespaceTokenizer. |
class | WhitespaceAnalyzer: An Analyzer that uses WhitespaceTokenizer. |

Modifier and Type | Class and Description |
---|---|
class | CustomAnalyzer: A general-purpose Analyzer that can be created with a builder-style API. |
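CustomAnalyzer assembles an Analyzer from named tokenizer and token filter factories rather than requiring an Analyzer subclass. A minimal sketch, assuming the analyzers-common module (which registers the "standard" and "lowercase" SPI names) is on the classpath:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.custom.CustomAnalyzer;

public class CustomAnalyzerDemo {
  public static void main(String[] args) throws Exception {
    // Standard tokenization followed by lowercasing, declared by
    // factory name instead of code.
    Analyzer analyzer = CustomAnalyzer.builder()
        .withTokenizer("standard")
        .addTokenFilter("lowercase")
        .build();
    analyzer.close();
  }
}
```

Char filters and further token filters can be chained with addCharFilter/addTokenFilter in the same builder call sequence.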
Modifier and Type | Class and Description |
---|---|
class | CzechAnalyzer: Analyzer for the Czech language. |

Modifier and Type | Class and Description |
---|---|
class | DanishAnalyzer: Analyzer for Danish. |

Modifier and Type | Class and Description |
---|---|
class | GermanAnalyzer: Analyzer for the German language. |

Modifier and Type | Class and Description |
---|---|
class | GreekAnalyzer: Analyzer for the Greek language. |

Modifier and Type | Class and Description |
---|---|
class | EnglishAnalyzer: Analyzer for English. |

Modifier and Type | Class and Description |
---|---|
class | SpanishAnalyzer: Analyzer for Spanish. |

Modifier and Type | Class and Description |
---|---|
class | BasqueAnalyzer: Analyzer for Basque. |

Modifier and Type | Class and Description |
---|---|
class | PersianAnalyzer: Analyzer for Persian. |

Modifier and Type | Class and Description |
---|---|
class | FinnishAnalyzer: Analyzer for Finnish. |

Modifier and Type | Class and Description |
---|---|
class | FrenchAnalyzer: Analyzer for the French language. |

Modifier and Type | Class and Description |
---|---|
class | IrishAnalyzer: Analyzer for Irish. |

Modifier and Type | Class and Description |
---|---|
class | GalicianAnalyzer: Analyzer for Galician. |

Modifier and Type | Class and Description |
---|---|
class | HindiAnalyzer: Analyzer for Hindi. |

Modifier and Type | Class and Description |
---|---|
class | HungarianAnalyzer: Analyzer for Hungarian. |

Modifier and Type | Class and Description |
---|---|
class | ArmenianAnalyzer: Analyzer for Armenian. |

Modifier and Type | Class and Description |
---|---|
class | IndonesianAnalyzer: Analyzer for Indonesian (Bahasa). |

Modifier and Type | Class and Description |
---|---|
class | ItalianAnalyzer: Analyzer for Italian. |

Modifier and Type | Class and Description |
---|---|
class | JapaneseAnalyzer: Analyzer for Japanese that uses morphological analysis. |

Modifier and Type | Class and Description |
---|---|
class | KoreanAnalyzer: Analyzer for Korean that uses morphological analysis. |

Modifier and Type | Class and Description |
---|---|
class | LithuanianAnalyzer: Analyzer for Lithuanian. |

Modifier and Type | Class and Description |
---|---|
class | LatvianAnalyzer: Analyzer for Latvian. |

Modifier and Type | Class and Description |
---|---|
class | LimitTokenCountAnalyzer: This Analyzer limits the number of tokens while indexing. |
class | PerFieldAnalyzerWrapper: This analyzer is used to facilitate scenarios where different fields require different analysis techniques. |

Modifier and Type | Method and Description |
---|---|
protected Analyzer | LimitTokenCountAnalyzer.getWrappedAnalyzer(java.lang.String fieldName) |
protected Analyzer | PerFieldAnalyzerWrapper.getWrappedAnalyzer(java.lang.String fieldName) |

Constructor and Description |
---|
LimitTokenCountAnalyzer(Analyzer delegate, int maxTokenCount): Builds an analyzer that limits the maximum number of tokens per field. |
LimitTokenCountAnalyzer(Analyzer delegate, int maxTokenCount, boolean consumeAllTokens): Builds an analyzer that limits the maximum number of tokens per field. |
PerFieldAnalyzerWrapper(Analyzer defaultAnalyzer): Constructs with default analyzer. |
PerFieldAnalyzerWrapper(Analyzer defaultAnalyzer, java.util.Map<java.lang.String,Analyzer> fieldAnalyzers): Constructs with default analyzer and a map of analyzers to use for specific fields. |

Constructor and Description |
---|
PerFieldAnalyzerWrapper(Analyzer defaultAnalyzer, java.util.Map<java.lang.String,Analyzer> fieldAnalyzers): Constructs with default analyzer and a map of analyzers to use for specific fields. |
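The PerFieldAnalyzerWrapper constructors above take a default Analyzer plus an optional per-field map. A sketch of the typical setup (the "id" field name and the choice of analyzers are illustrative):

```java
import java.util.HashMap;
import java.util.Map;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.core.KeywordAnalyzer;
import org.apache.lucene.analysis.miscellaneous.PerFieldAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class PerFieldDemo {
  public static void main(String[] args) {
    Map<String, Analyzer> fieldAnalyzers = new HashMap<>();
    // Keep identifiers as single, untokenized terms.
    fieldAnalyzers.put("id", new KeywordAnalyzer());
    // Every field not present in the map falls back to the default.
    Analyzer analyzer =
        new PerFieldAnalyzerWrapper(new StandardAnalyzer(), fieldAnalyzers);
    analyzer.close();
  }
}
```

The wrapper is handed to IndexWriterConfig (or a query parser) exactly like a plain Analyzer; the per-field dispatch happens inside getWrappedAnalyzer.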
Modifier and Type | Class and Description |
---|---|
class | MorfologikAnalyzer: Analyzer using the Morfologik library. |

Modifier and Type | Class and Description |
---|---|
class | DutchAnalyzer: Analyzer for the Dutch language. |

Modifier and Type | Class and Description |
---|---|
class | NorwegianAnalyzer: Analyzer for Norwegian. |

Modifier and Type | Class and Description |
---|---|
class | PolishAnalyzer: Analyzer for Polish. |

Modifier and Type | Class and Description |
---|---|
class | PortugueseAnalyzer: Analyzer for Portuguese. |

Modifier and Type | Class and Description |
---|---|
class | QueryAutoStopWordAnalyzer: An Analyzer used primarily at query time to wrap another analyzer and provide a layer of protection which prevents very common words from being passed into queries. |

Modifier and Type | Method and Description |
---|---|
protected Analyzer | QueryAutoStopWordAnalyzer.getWrappedAnalyzer(java.lang.String fieldName) |

Constructor and Description |
---|
QueryAutoStopWordAnalyzer(Analyzer delegate, IndexReader indexReader): Creates a new QueryAutoStopWordAnalyzer with stopwords calculated for all indexed fields from terms with a document frequency percentage greater than QueryAutoStopWordAnalyzer.defaultMaxDocFreqPercent. |
QueryAutoStopWordAnalyzer(Analyzer delegate, IndexReader indexReader, java.util.Collection<java.lang.String> fields, float maxPercentDocs): Creates a new QueryAutoStopWordAnalyzer with stopwords calculated for the given selection of fields from terms with a document frequency percentage greater than the given maxPercentDocs. |
QueryAutoStopWordAnalyzer(Analyzer delegate, IndexReader indexReader, java.util.Collection<java.lang.String> fields, int maxDocFreq): Creates a new QueryAutoStopWordAnalyzer with stopwords calculated for the given selection of fields from terms with a document frequency greater than the given maxDocFreq. |
QueryAutoStopWordAnalyzer(Analyzer delegate, IndexReader indexReader, float maxPercentDocs): Creates a new QueryAutoStopWordAnalyzer with stopwords calculated for all indexed fields from terms with a document frequency percentage greater than the given maxPercentDocs. |
QueryAutoStopWordAnalyzer(Analyzer delegate, IndexReader indexReader, int maxDocFreq): Creates a new QueryAutoStopWordAnalyzer with stopwords calculated for all indexed fields from terms with a document frequency greater than the given maxDocFreq. |
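The (Analyzer, IndexReader, float) constructor above derives the stopword list from the index itself rather than from a fixed word list. A sketch, assuming an existing index under a hypothetical "index" directory:

```java
import java.nio.file.Paths;

import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.query.QueryAutoStopWordAnalyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class AutoStopWordDemo {
  public static void main(String[] args) throws Exception {
    try (Directory dir = FSDirectory.open(Paths.get("index"));
         IndexReader reader = DirectoryReader.open(dir)) {
      // Any term occurring in more than 40% of documents is dropped
      // at query time, for every indexed field.
      Analyzer queryAnalyzer =
          new QueryAutoStopWordAnalyzer(new StandardAnalyzer(), reader, 0.4f);
      queryAnalyzer.close();
    }
  }
}
```

Because the stopwords are computed per index, this analyzer belongs on the query side only; indexing with it would silently drop terms.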
Modifier and Type | Class and Description |
---|---|
class | RomanianAnalyzer: Analyzer for Romanian. |

Modifier and Type | Class and Description |
---|---|
class | RussianAnalyzer: Analyzer for the Russian language. |

Modifier and Type | Class and Description |
---|---|
class | ShingleAnalyzerWrapper: A ShingleAnalyzerWrapper wraps a ShingleFilter around another Analyzer. |

Modifier and Type | Method and Description |
---|---|
Analyzer | ShingleAnalyzerWrapper.getWrappedAnalyzer(java.lang.String fieldName) |

Constructor and Description |
---|
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer) |
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int maxShingleSize) |
ShingleAnalyzerWrapper(Analyzer defaultAnalyzer, int minShingleSize, int maxShingleSize) |
ShingleAnalyzerWrapper(Analyzer delegate, int minShingleSize, int maxShingleSize, java.lang.String tokenSeparator, boolean outputUnigrams, boolean outputUnigramsIfNoShingles, java.lang.String fillerToken): Creates a new ShingleAnalyzerWrapper. |
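The minShingleSize/maxShingleSize constructor above controls which word n-grams ("shingles") are emitted alongside the base tokens. A sketch:

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.shingle.ShingleAnalyzerWrapper;
import org.apache.lucene.analysis.standard.StandardAnalyzer;

public class ShingleDemo {
  public static void main(String[] args) {
    // Emit 2-grams and 3-grams in addition to single tokens, so
    // "please divide this" also yields the shingles
    // "please divide", "divide this", and "please divide this".
    Analyzer shingles =
        new ShingleAnalyzerWrapper(new StandardAnalyzer(), 2, 3);
    shingles.close();
  }
}
```

Shingling at index time trades index size for fast phrase-like matching without positional queries.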
Modifier and Type | Class and Description |
---|---|
class | ClassicAnalyzer: Filters ClassicTokenizer with ClassicFilter, LowerCaseFilter and StopFilter, using a list of English stop words. |
class | StandardAnalyzer: Filters StandardTokenizer with LowerCaseFilter and StopFilter, using a configurable list of stop words. |
class | UAX29URLEmailAnalyzer: Filters UAX29URLEmailTokenizer with LowerCaseFilter and StopFilter, using a list of English stop words. |

Modifier and Type | Method and Description |
---|---|
void | WordBreakTestUnicode_9_0_0.test(Analyzer analyzer) |
void | EmojiTokenizationTestUnicode_11_0.test(Analyzer analyzer) |

Modifier and Type | Class and Description |
---|---|
class | SwedishAnalyzer: Analyzer for Swedish. |

Modifier and Type | Method and Description |
---|---|
protected SynonymMap | SynonymGraphFilterFactory.loadSynonyms(ResourceLoader loader, java.lang.String cname, boolean dedup, Analyzer analyzer): Loads synonyms with the given SynonymMap.Parser class. |
protected SynonymMap | SynonymFilterFactory.loadSynonyms(ResourceLoader loader, java.lang.String cname, boolean dedup, Analyzer analyzer): Deprecated. Loads synonyms with the given SynonymMap.Parser class. |

Constructor and Description |
---|
Parser(boolean dedup, Analyzer analyzer) |
SolrSynonymParser(boolean dedup, boolean expand, Analyzer analyzer) |
WordnetSynonymParser(boolean dedup, boolean expand, Analyzer analyzer) |

Modifier and Type | Class and Description |
---|---|
class | ThaiAnalyzer: Analyzer for the Thai language. |

Modifier and Type | Class and Description |
---|---|
class | TurkishAnalyzer: Analyzer for Turkish. |

Modifier and Type | Class and Description |
---|---|
class | UkrainianMorfologikAnalyzer: A dictionary-based Analyzer for Ukrainian. |

Modifier and Type | Method and Description |
---|---|
Analyzer | PerfRunData.getAnalyzer() |

Modifier and Type | Method and Description |
---|---|
void | PerfRunData.setAnalyzer(Analyzer analyzer) |

Modifier and Type | Method and Description |
---|---|
static Analyzer | NewAnalyzerTask.createAnalyzer(java.lang.String className) |

Modifier and Type | Method and Description |
---|---|
Analyzer | AnalyzerFactory.create() |

Modifier and Type | Field and Description |
---|---|
protected Analyzer | SimpleNaiveBayesClassifier.analyzer: Analyzer to be used for tokenizing unseen input text. |

Constructor and Description |
---|
BM25NBClassifier(IndexReader indexReader, Analyzer analyzer, Query query, java.lang.String classFieldName, java.lang.String... textFieldNames): Creates a new NaiveBayes classifier. |
BooleanPerceptronClassifier(IndexReader indexReader, Analyzer analyzer, Query query, java.lang.Integer batchSize, java.lang.Double bias, java.lang.String classFieldName, java.lang.String textFieldName): Creates a BooleanPerceptronClassifier. |
CachingNaiveBayesClassifier(IndexReader indexReader, Analyzer analyzer, Query query, java.lang.String classFieldName, java.lang.String... textFieldNames): Creates a new NaiveBayes classifier with inside caching. |
KNearestFuzzyClassifier(IndexReader indexReader, Similarity similarity, Analyzer analyzer, Query query, int k, java.lang.String classFieldName, java.lang.String... textFieldNames): Creates a KNearestFuzzyClassifier. |
KNearestNeighborClassifier(IndexReader indexReader, Similarity similarity, Analyzer analyzer, Query query, int k, int minDocsFreq, int minTermFreq, java.lang.String classFieldName, java.lang.String... textFieldNames): Creates a KNearestNeighborClassifier. |
SimpleNaiveBayesClassifier(IndexReader indexReader, Analyzer analyzer, Query query, java.lang.String classFieldName, java.lang.String... textFieldNames): Creates a new NaiveBayes classifier. |

Modifier and Type | Field and Description |
---|---|
protected java.util.Map<java.lang.String,Analyzer> | KNearestNeighborDocumentClassifier.field2analyzer: Map of per-field analyzers. |
protected java.util.Map<java.lang.String,Analyzer> | SimpleNaiveBayesDocumentClassifier.field2analyzer: Analyzers to be used for tokenizing document fields. |

Constructor and Description |
---|
KNearestNeighborDocumentClassifier(IndexReader indexReader, Similarity similarity, Query query, int k, int minDocsFreq, int minTermFreq, java.lang.String classFieldName, java.util.Map<java.lang.String,Analyzer> field2analyzer, java.lang.String... textFieldNames): Creates a KNearestNeighborDocumentClassifier. |
SimpleNaiveBayesDocumentClassifier(IndexReader indexReader, Query query, java.lang.String classFieldName, java.util.Map<java.lang.String,Analyzer> field2analyzer, java.lang.String... textFieldNames): Creates a new NaiveBayes classifier. |

Modifier and Type | Method and Description |
---|---|
void | DatasetSplitter.split(IndexReader originalIndex, Directory trainingIndex, Directory testIndex, Directory crossValidationIndex, Analyzer analyzer, boolean termVectors, java.lang.String classFieldName, java.lang.String... fieldNames): Splits a given index into three indexes for training, test, and cross-validation tasks, respectively. |

Constructor and Description |
---|
NearestFuzzyQuery(Analyzer analyzer): Default constructor. |

Modifier and Type | Method and Description |
---|---|
TokenStream | StoredFieldsWriter.MergeVisitor.tokenStream(Analyzer analyzer, TokenStream reuse) |

Modifier and Type | Class and Description |
---|---|
class | CollationKeyAnalyzer: Configures KeywordTokenizer with CollationAttributeFactory. |
class | ICUCollationKeyAnalyzer: Configures KeywordTokenizer with ICUCollationAttributeFactory. |

Modifier and Type | Method and Description |
---|---|
TokenStream | LazyDocument.LazyField.tokenStream(Analyzer analyzer, TokenStream reuse) |
TokenStream | Field.tokenStream(Analyzer analyzer, TokenStream reuse) |
TokenStream | FeatureField.tokenStream(Analyzer analyzer, TokenStream reuse) |

Modifier and Type | Method and Description |
---|---|
Analyzer | IndexWriterConfig.getAnalyzer() |
Analyzer | IndexWriter.getAnalyzer(): Returns the analyzer used by this index. |
Analyzer | LiveIndexWriterConfig.getAnalyzer(): Returns the default analyzer to use for indexing documents. |

Modifier and Type | Method and Description |
---|---|
TokenStream | IndexableField.tokenStream(Analyzer analyzer, TokenStream reuse): Creates the TokenStream used for indexing this field. |

Constructor and Description |
---|
IndexWriterConfig(Analyzer analyzer): Creates a new config with the provided Analyzer. |
RandomIndexWriter(java.util.Random r, Directory dir, Analyzer a): Creates a RandomIndexWriter with a random config. |
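The IndexWriterConfig(Analyzer) constructor above is the usual way an Analyzer enters the indexing pipeline. A sketch that writes one document to a hypothetical "index" directory:

```java
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.document.Document;
import org.apache.lucene.document.Field;
import org.apache.lucene.document.TextField;
import org.apache.lucene.index.IndexWriter;
import org.apache.lucene.index.IndexWriterConfig;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class IndexingDemo {
  public static void main(String[] args) throws Exception {
    IndexWriterConfig config = new IndexWriterConfig(new StandardAnalyzer());
    try (Directory dir = FSDirectory.open(Paths.get("index"));
         IndexWriter writer = new IndexWriter(dir, config)) {
      Document doc = new Document();
      // TextField contents are run through the config's analyzer.
      doc.add(new TextField("body", "analyzed at index time", Field.Store.YES));
      writer.addDocument(doc);
    }
  }
}
```

The same analyzer (or a compatible one) should be used at query time so that query terms and indexed terms agree.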
Modifier and Type | Method and Description |
---|---|
void | MemoryIndex.addField(IndexableField field, Analyzer analyzer): Adds a Lucene IndexableField to the MemoryIndex using the provided analyzer. |
void | MemoryIndex.addField(java.lang.String fieldName, java.lang.String text, Analyzer analyzer): Convenience method; tokenizes the given field text and adds the resulting terms to the index; equivalent to adding an indexed, non-keyword Lucene Field that is tokenized, not stored, and has term vectors stored with positions (or with positions and offsets). |
static MemoryIndex | MemoryIndex.fromDocument(java.lang.Iterable<? extends IndexableField> document, Analyzer analyzer): Builds a MemoryIndex from a Lucene Document using an analyzer. |
static MemoryIndex | MemoryIndex.fromDocument(java.lang.Iterable<? extends IndexableField> document, Analyzer analyzer, boolean storeOffsets, boolean storePayloads): Builds a MemoryIndex from a Lucene Document using an analyzer. |
static MemoryIndex | MemoryIndex.fromDocument(java.lang.Iterable<? extends IndexableField> document, Analyzer analyzer, boolean storeOffsets, boolean storePayloads, long maxReusedBytes): Builds a MemoryIndex from a Lucene Document using an analyzer. |
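The addField overloads above make MemoryIndex useful for matching a single transient document against a query without ever touching a Directory. A sketch (the field name and text are arbitrary):

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.memory.MemoryIndex;
import org.apache.lucene.queryparser.classic.QueryParser;

public class MemoryIndexDemo {
  public static void main(String[] args) throws Exception {
    Analyzer analyzer = new StandardAnalyzer();
    MemoryIndex index = new MemoryIndex();
    index.addField("content", "readings about salmon and other fish", analyzer);
    // search() returns a relevance score; 0.0f means no match.
    float score = index.search(
        new QueryParser("content", analyzer).parse("fish"));
    System.out.println(score > 0.0f); // whether the document matched
    analyzer.close();
  }
}
```

This pattern (one document, many queries) is the classic "prospective search" or alerting use case the package was built for.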
Modifier and Type | Method and Description |
---|---|
Analyzer | MoreLikeThis.getAnalyzer(): Returns the analyzer that will be used to parse the source doc. |
Analyzer | MoreLikeThisQuery.getAnalyzer() |

Modifier and Type | Method and Description |
---|---|
void | MoreLikeThis.setAnalyzer(Analyzer analyzer): Sets the analyzer to use. |
void | MoreLikeThisQuery.setAnalyzer(Analyzer analyzer) |

Constructor and Description |
---|
MoreLikeThisQuery(java.lang.String likeText, java.lang.String[] moreLikeFields, Analyzer analyzer, java.lang.String fieldName) |
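MoreLikeThis.setAnalyzer above supplies the analyzer used when the source document's text must be re-tokenized. A sketch, assuming an existing index under a hypothetical "index" directory and an arbitrary document id:

```java
import java.nio.file.Paths;

import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.index.DirectoryReader;
import org.apache.lucene.index.IndexReader;
import org.apache.lucene.queries.mlt.MoreLikeThis;
import org.apache.lucene.search.Query;
import org.apache.lucene.store.Directory;
import org.apache.lucene.store.FSDirectory;

public class MoreLikeThisDemo {
  public static void main(String[] args) throws Exception {
    try (Directory dir = FSDirectory.open(Paths.get("index"));
         IndexReader reader = DirectoryReader.open(dir)) {
      MoreLikeThis mlt = new MoreLikeThis(reader);
      mlt.setAnalyzer(new StandardAnalyzer());
      mlt.setFieldNames(new String[] {"body"});
      Query like = mlt.like(0); // similarity query built from doc id 0
      System.out.println(like);
    }
  }
}
```

When the source fields store term vectors, MoreLikeThis can read terms directly from them and the analyzer is not consulted.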
Modifier and Type | Method and Description |
---|---|
void | QueryParserBase.init(java.lang.String f, Analyzer a): Initializes a query parser. |
protected Query | QueryParserBase.newFieldQuery(Analyzer analyzer, java.lang.String field, java.lang.String queryText, boolean quoted) |
static Query | MultiFieldQueryParser.parse(java.lang.String[] queries, java.lang.String[] fields, Analyzer analyzer): Parses a query which searches on the fields specified. |
static Query | MultiFieldQueryParser.parse(java.lang.String[] queries, java.lang.String[] fields, BooleanClause.Occur[] flags, Analyzer analyzer): Parses a query, searching on the fields specified. |
static Query | MultiFieldQueryParser.parse(java.lang.String query, java.lang.String[] fields, BooleanClause.Occur[] flags, Analyzer analyzer): Parses a query, searching on the fields specified. |

Constructor and Description |
---|
MultiFieldQueryParser(java.lang.String[] fields, Analyzer analyzer): Creates a MultiFieldQueryParser. |
MultiFieldQueryParser(java.lang.String[] fields, Analyzer analyzer, java.util.Map<java.lang.String,java.lang.Float> boosts): Creates a MultiFieldQueryParser. |
QueryParser(java.lang.String f, Analyzer a): Creates a query parser. |
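The MultiFieldQueryParser entries above pair query strings with field names in two ways: one string expanded across all fields, or one string per field. A sketch of both forms (the field names are illustrative):

```java
import org.apache.lucene.analysis.Analyzer;
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.MultiFieldQueryParser;
import org.apache.lucene.search.Query;

public class MultiFieldDemo {
  public static void main(String[] args) throws Exception {
    String[] fields = {"title", "body"};
    Analyzer analyzer = new StandardAnalyzer();
    // One query string expanded across both fields:
    Query q1 = new MultiFieldQueryParser(fields, analyzer).parse("lucene");
    // One query string per field, matched positionally:
    Query q2 = MultiFieldQueryParser.parse(
        new String[] {"lucene", "search"}, fields, analyzer);
    System.out.println(q1);
    System.out.println(q2);
    analyzer.close();
  }
}
```

Both forms throw ParseException on malformed syntax, like the single-field classic QueryParser.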
Constructor and Description |
---|
ComplexPhraseQueryParser(java.lang.String f,
Analyzer a) |
Constructor and Description |
---|
ExtendableQueryParser(java.lang.String f,
Analyzer a)
Creates a new
ExtendableQueryParser instance |
ExtendableQueryParser(java.lang.String f,
Analyzer a,
Extensions ext)
Creates a new
ExtendableQueryParser instance |
Constructor and Description |
---|
PrecedenceQueryParser(Analyzer analyzer) |
Modifier and Type | Method and Description |
---|---|
Analyzer |
CommonQueryParserConfiguration.getAnalyzer() |
Analyzer |
StandardQueryParser.getAnalyzer() |
Modifier and Type | Method and Description |
---|---|
static Query |
QueryParserUtil.parse(java.lang.String[] queries,
java.lang.String[] fields,
Analyzer analyzer)
Parses a query which searches on the fields specified.
|
static Query |
QueryParserUtil.parse(java.lang.String[] queries,
java.lang.String[] fields,
BooleanClause.Occur[] flags,
Analyzer analyzer)
Parses a query, searching on the fields specified.
|
static Query |
QueryParserUtil.parse(java.lang.String query,
java.lang.String[] fields,
BooleanClause.Occur[] flags,
Analyzer analyzer)
Parses a query, searching on the fields specified.
|
void |
StandardQueryParser.setAnalyzer(Analyzer analyzer) |
Constructor and Description |
---|
StandardQueryParser(Analyzer analyzer)
Constructs a StandardQueryParser object and sets the given Analyzer on it. |
Modifier and Type | Field and Description |
---|---|
static ConfigurationKey<Analyzer> |
StandardQueryConfigHandler.ConfigurationKeys.ANALYZER
Key used to set the Analyzer used for terms found in the query. |
Constructor and Description |
---|
SimpleQueryParser(Analyzer analyzer,
java.util.Map<java.lang.String,java.lang.Float> weights)
Creates a new parser searching over multiple fields with different weights.
|
SimpleQueryParser(Analyzer analyzer,
java.util.Map<java.lang.String,java.lang.Float> weights,
int flags)
Creates a new parser with custom flags used to enable/disable certain features.
|
SimpleQueryParser(Analyzer analyzer,
java.lang.String field)
Creates a new parser searching over a single field.
|
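The single-field constructor above is the simplest way to use `SimpleQueryParser`. A minimal sketch, assuming `lucene-core` and `lucene-queryparser` on the classpath; the field name is illustrative:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.simple.SimpleQueryParser;
import org.apache.lucene.search.Query;

public class SimpleParseExample {
    public static String parse(String text) {
        // Unlike the classic QueryParser, SimpleQueryParser never throws
        // on bad syntax; unparsable fragments are silently ignored.
        SimpleQueryParser parser = new SimpleQueryParser(new StandardAnalyzer(), "title");
        Query q = parser.parse(text);
        return q.toString();
    }
}
```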
Modifier and Type | Field and Description |
---|---|
protected Analyzer |
CoreParser.analyzer |
Constructor and Description |
---|
CoreParser(Analyzer analyzer,
QueryParser parser)
Constructs an XML parser that uses a single QueryParser instance for handling
UserQuery tags; all parse operations are synchronized on this parser
|
CoreParser(java.lang.String defaultField,
Analyzer analyzer)
Constructs an XML parser that creates a QueryParser for each UserQuery request.
|
CoreParser(java.lang.String defaultField,
Analyzer analyzer,
QueryParser parser) |
CorePlusExtensionsParser(Analyzer analyzer,
QueryParser parser)
Constructs an XML parser that uses a single QueryParser instance for handling
UserQuery tags; all parse operations are synchronized on this parser
|
CorePlusExtensionsParser(java.lang.String defaultField,
Analyzer analyzer)
Constructs an XML parser that creates a QueryParser for each UserQuery request.
|
CorePlusQueriesParser(Analyzer analyzer,
QueryParser parser)
Constructs an XML parser that uses a single QueryParser instance for handling
UserQuery tags; all parse operations are synchronized on this parser
|
CorePlusQueriesParser(java.lang.String defaultField,
Analyzer analyzer)
Constructs an XML parser that creates a QueryParser for each UserQuery request.
|
CorePlusQueriesParser(java.lang.String defaultField,
Analyzer analyzer,
QueryParser parser) |
Modifier and Type | Method and Description |
---|---|
protected QueryParser |
UserInputQueryBuilder.createQueryParser(java.lang.String fieldName,
Analyzer analyzer)
Factory method to create a QueryParser; designed to be overridden.
|
Constructor and Description |
---|
FuzzyLikeThisQueryBuilder(Analyzer analyzer) |
LikeThisQueryBuilder(Analyzer analyzer,
java.lang.String[] defaultFieldNames) |
SpanOrTermsBuilder(Analyzer analyzer) |
TermsQueryBuilder(Analyzer analyzer) |
UserInputQueryBuilder(java.lang.String defaultField,
Analyzer analyzer) |
Constructor and Description |
---|
FuzzyLikeThisQuery(int maxNumTerms,
Analyzer analyzer) |
Modifier and Type | Field and Description |
---|---|
protected static Analyzer |
SearchEquivalenceTestBase.analyzer |
protected static Analyzer |
BaseExplanationTestCase.analyzer |
Modifier and Type | Method and Description |
---|---|
static TokenStream |
TokenSources.getAnyTokenStream(IndexReader reader,
int docId,
java.lang.String field,
Analyzer analyzer)
Deprecated.
|
static TokenStream |
TokenSources.getAnyTokenStream(IndexReader reader,
int docId,
java.lang.String field,
Document document,
Analyzer analyzer)
Deprecated.
|
java.lang.String |
Highlighter.getBestFragment(Analyzer analyzer,
java.lang.String fieldName,
java.lang.String text)
Highlights chosen terms in a text, extracting the most relevant section.
|
java.lang.String[] |
Highlighter.getBestFragments(Analyzer analyzer,
java.lang.String fieldName,
java.lang.String text,
int maxNumFragments)
Highlights chosen terms in a text, extracting the most relevant sections.
|
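The `getBestFragment` entry above takes an analyzer so the highlighter can re-tokenize raw text and locate term offsets. A minimal sketch, assuming `lucene-core`, `lucene-queryparser`, and `lucene-highlighter` on the classpath; the field name and sample text are illustrative:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.queryparser.classic.QueryParser;
import org.apache.lucene.search.Query;
import org.apache.lucene.search.highlight.Highlighter;
import org.apache.lucene.search.highlight.QueryScorer;

public class HighlightExample {
    public static String highlight() throws Exception {
        StandardAnalyzer analyzer = new StandardAnalyzer();
        Query query = new QueryParser("content", analyzer).parse("lucene");
        // The default formatter wraps matched terms in <B>...</B>.
        Highlighter highlighter = new Highlighter(new QueryScorer(query));
        // The analyzer re-tokenizes the text so query terms can be located.
        return highlighter.getBestFragment(analyzer, "content",
                "Apache Lucene is a high-performance search library.");
    }
}
```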
static TokenStream |
TokenSources.getTokenStream(Document doc,
java.lang.String field,
Analyzer analyzer)
Deprecated.
|
static TokenStream |
TokenSources.getTokenStream(IndexReader reader,
int docId,
java.lang.String field,
Analyzer analyzer)
Deprecated.
|
static TokenStream |
TokenSources.getTokenStream(java.lang.String field,
Fields tvFields,
java.lang.String text,
Analyzer analyzer,
int maxStartOffset)
Get a token stream from either un-inverting a term vector if possible, or by analyzing the text.
|
static TokenStream |
TokenSources.getTokenStream(java.lang.String field,
java.lang.String contents,
Analyzer analyzer)
Deprecated.
|
Modifier and Type | Field and Description |
---|---|
protected Analyzer |
AnalyzingInfixSuggester.indexAnalyzer
Analyzer used at index time
|
protected Analyzer |
AnalyzingInfixSuggester.queryAnalyzer
Analyzer used at search time
|
Modifier and Type | Method and Description |
---|---|
protected IndexWriterConfig |
AnalyzingInfixSuggester.getIndexWriterConfig(Analyzer indexAnalyzer,
IndexWriterConfig.OpenMode openMode)
Override this to customize index settings, e.g.
|
Constructor and Description |
---|
AnalyzingInfixSuggester(Directory dir,
Analyzer analyzer)
Create a new instance, loading from a previously built
AnalyzingInfixSuggester directory, if it exists.
|
AnalyzingInfixSuggester(Directory dir,
Analyzer indexAnalyzer,
Analyzer queryAnalyzer,
int minPrefixChars,
boolean commitOnBuild)
Create a new instance, loading from a previously built
AnalyzingInfixSuggester directory, if it exists.
|
AnalyzingInfixSuggester(Directory dir,
Analyzer indexAnalyzer,
Analyzer queryAnalyzer,
int minPrefixChars,
boolean commitOnBuild,
boolean allTermsRequired,
boolean highlight)
Create a new instance, loading from a previously built
AnalyzingInfixSuggester directory, if it exists.
|
AnalyzingInfixSuggester(Directory dir,
Analyzer indexAnalyzer,
Analyzer queryAnalyzer,
int minPrefixChars,
boolean commitOnBuild,
boolean allTermsRequired,
boolean highlight,
boolean closeIndexWriterOnBuild)
Create a new instance, loading from a previously built
AnalyzingInfixSuggester directory, if it exists.
|
AnalyzingSuggester(Directory tempDir,
java.lang.String tempFileNamePrefix,
Analyzer analyzer)
|
AnalyzingSuggester(Directory tempDir,
java.lang.String tempFileNamePrefix,
Analyzer indexAnalyzer,
Analyzer queryAnalyzer)
|
AnalyzingSuggester(Directory tempDir,
java.lang.String tempFileNamePrefix,
Analyzer indexAnalyzer,
Analyzer queryAnalyzer,
int options,
int maxSurfaceFormsPerAnalyzedForm,
int maxGraphExpansions,
boolean preservePositionIncrements)
Creates a new suggester.
|
BlendedInfixSuggester(Directory dir,
Analyzer analyzer)
Create a new instance, loading from a previously built
directory, if it exists.
|
BlendedInfixSuggester(Directory dir,
Analyzer indexAnalyzer,
Analyzer queryAnalyzer,
int minPrefixChars,
BlendedInfixSuggester.BlenderType blenderType,
int numFactor,
boolean commitOnBuild)
Create a new instance, loading from a previously built
directory, if it exists.
|
BlendedInfixSuggester(Directory dir,
Analyzer indexAnalyzer,
Analyzer queryAnalyzer,
int minPrefixChars,
BlendedInfixSuggester.BlenderType blenderType,
int numFactor,
java.lang.Double exponent,
boolean commitOnBuild,
boolean allTermsRequired,
boolean highlight)
Create a new instance, loading from a previously built
directory, if it exists.
|
FreeTextSuggester(Analyzer analyzer)
Instantiate, using the provided analyzer for both
indexing and lookup, with a bigram model by default.
|
FreeTextSuggester(Analyzer indexAnalyzer,
Analyzer queryAnalyzer)
Instantiate, using the provided indexing and lookup
analyzers, with a bigram model by default.
|
FreeTextSuggester(Analyzer indexAnalyzer,
Analyzer queryAnalyzer,
int grams)
Instantiate, using the provided indexing and lookup
analyzers, with the specified model (2
= bigram, 3 = trigram, etc.).
|
FreeTextSuggester(Analyzer indexAnalyzer,
Analyzer queryAnalyzer,
int grams,
byte separator)
Instantiate, using the provided indexing and lookup
analyzers, with the specified model (2 = bigram, 3 =
trigram, etc.).
|
FuzzySuggester(Directory tempDir,
java.lang.String tempFileNamePrefix,
Analyzer analyzer)
Creates a
FuzzySuggester instance initialized with default values. |
FuzzySuggester(Directory tempDir,
java.lang.String tempFileNamePrefix,
Analyzer indexAnalyzer,
Analyzer queryAnalyzer)
Creates a
FuzzySuggester instance with an index and query analyzer initialized with default values. |
FuzzySuggester(Directory tempDir,
java.lang.String tempFileNamePrefix,
Analyzer indexAnalyzer,
Analyzer queryAnalyzer,
int options,
int maxSurfaceFormsPerAnalyzedForm,
int maxGraphExpansions,
boolean preservePositionIncrements,
int maxEdits,
boolean transpositions,
int nonFuzzyPrefix,
int minFuzzyLength,
boolean unicodeAware)
Creates a
FuzzySuggester instance. |
Modifier and Type | Class and Description |
---|---|
class |
CompletionAnalyzer
Wraps an
Analyzer
to provide additional completion-only tuning
(e.g. |
Modifier and Type | Method and Description |
---|---|
Analyzer |
PrefixCompletionQuery.getAnalyzer()
Gets the analyzer used to analyze the prefix.
|
protected Analyzer |
CompletionAnalyzer.getWrappedAnalyzer(java.lang.String fieldName) |
Modifier and Type | Method and Description |
---|---|
TokenStream |
SuggestField.tokenStream(Analyzer analyzer,
TokenStream reuse) |
Modifier and Type | Field and Description |
---|---|
protected Analyzer |
AnalysisOffsetStrategy.analyzer |
protected Analyzer |
UnifiedHighlighter.indexAnalyzer |
Modifier and Type | Method and Description |
---|---|
Analyzer |
UnifiedHighlighter.getIndexAnalyzer()
...
|
Constructor and Description |
---|
AnalysisOffsetStrategy(UHComponents components,
Analyzer analyzer) |
MemoryIndexOffsetStrategy(UHComponents components,
Analyzer analyzer,
java.util.function.Function<Query,java.util.Collection<Query>> multiTermQueryRewrite) |
TokenStreamOffsetStrategy(UHComponents components,
Analyzer indexAnalyzer) |
UnifiedHighlighter(IndexSearcher indexSearcher,
Analyzer indexAnalyzer)
Constructs the highlighter with the given index searcher and analyzer.
|
Modifier and Type | Field and Description |
---|---|
protected Analyzer |
QueryBuilder.analyzer |
Modifier and Type | Method and Description |
---|---|
Analyzer |
QueryBuilder.getAnalyzer()
Returns the analyzer.
|
Modifier and Type | Method and Description |
---|---|
protected Query |
QueryBuilder.createFieldQuery(Analyzer analyzer,
BooleanClause.Occur operator,
java.lang.String field,
java.lang.String queryText,
boolean quoted,
int phraseSlop)
Creates a query from the analysis chain.
|
static IndexWriterConfig |
LuceneTestCase.newIndexWriterConfig(Analyzer a)
Create a new index writer config with random defaults.
|
static IndexWriterConfig |
LuceneTestCase.newIndexWriterConfig(java.util.Random r,
Analyzer a)
Create a new index writer config with random defaults, using the specified random.
|
void |
QueryBuilder.setAnalyzer(Analyzer analyzer)
Sets the analyzer used to tokenize text.
|
Constructor and Description |
---|
QueryBuilder(Analyzer analyzer)
Creates a new QueryBuilder using the given analyzer.
|
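The `QueryBuilder(Analyzer)` constructor above is the common base for the query parsers listed earlier. A minimal sketch, assuming `lucene-core` on the classpath; the field name and text are illustrative:

```java
import org.apache.lucene.analysis.standard.StandardAnalyzer;
import org.apache.lucene.search.Query;
import org.apache.lucene.util.QueryBuilder;

public class QueryBuilderExample {
    public static String booleanQuery() {
        QueryBuilder builder = new QueryBuilder(new StandardAnalyzer());
        // Runs the text through the analysis chain and combines the
        // resulting terms into a boolean query over the given field.
        Query q = builder.createBooleanQuery("body", "fast search");
        return q.toString();
    }
}
```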
Copyright © 2000–2019 The Apache Software Foundation. All rights reserved.