Package org.apache.lucene.analysis.hunspell
Stemming TokenFilter using a Java implementation of the
Hunspell stemming algorithm.
Dictionaries can be found on OpenOffice's wiki.
Class Summary
Class: Description
Dictionary: In-memory structure for the dictionary (.dic) and affix (.aff) data of a hunspell dictionary.
Dictionary.DoubleASCIIFlagParsingStrategy: Implementation of Dictionary.FlagParsingStrategy that assumes each flag is encoded as two ASCII characters whose codes must be combined into a single character.
Dictionary.FlagParsingStrategy: Abstraction of the process of parsing flags taken from the affix and dic files.
Dictionary.NumFlagParsingStrategy: Implementation of Dictionary.FlagParsingStrategy that assumes each flag is encoded in its numerical form.
Dictionary.SimpleFlagParsingStrategy: Simple implementation of Dictionary.FlagParsingStrategy that treats each char in a String as an individual flag.
HunspellStemFilter: TokenFilter that uses hunspell affix rules and words to stem tokens.
HunspellStemFilterFactory: TokenFilterFactory that creates instances of HunspellStemFilter.
ISO8859_14Decoder
Stemmer: Uses the affix rules declared in the Dictionary to generate one or more stems for a word.
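
The three flag parsing strategies in the class summary differ only in how a raw flag string from the .aff or .dic file is turned into flag characters. The following is a minimal standalone sketch of that idea, not the Lucene implementation. The class name FlagParsingSketch and its method names are hypothetical. How the two ASCII codes are combined (high byte, low byte) and the comma-separated format for numeric flags are assumptions made for illustration.

```java
import java.util.Arrays;

/** Hypothetical illustration only; these are not the Lucene Dictionary classes. */
final class FlagParsingSketch {

    /** Simple strategy: every char of the raw string is one flag. */
    static char[] parseSimple(String rawFlags) {
        return rawFlags.toCharArray();
    }

    /** Double-ASCII strategy: each pair of ASCII chars is combined into one flag char. */
    static char[] parseDoubleAscii(String rawFlags) {
        if (rawFlags.length() % 2 != 0) {
            throw new IllegalArgumentException("expected an even number of chars: " + rawFlags);
        }
        char[] flags = new char[rawFlags.length() / 2];
        for (int i = 0; i < flags.length; i++) {
            char first = rawFlags.charAt(2 * i);
            char second = rawFlags.charAt(2 * i + 1);
            // Assumption: first code goes into the high byte, second into the low byte.
            flags[i] = (char) ((first << 8) | second);
        }
        return flags;
    }

    /** Numeric strategy: decimal numbers (assumed comma-separated), one flag per number. */
    static char[] parseNumeric(String rawFlags) {
        String[] parts = rawFlags.split(",");
        char[] flags = new char[parts.length];
        for (int i = 0; i < parts.length; i++) {
            // Each number becomes the char with that code.
            flags[i] = (char) Integer.parseInt(parts[i].trim());
        }
        return flags;
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(parseSimple("AB")));
        System.out.println(parseDoubleAscii("AaBb").length);
        System.out.println((int) parseNumeric("101,202")[1]);
    }
}
```

Whatever the encoding, each strategy returns the same kind of result, an array of flag chars. That is what lets the rest of a dictionary loader treat flags uniformly behind one abstraction such as Dictionary.FlagParsingStrategy.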