Class WFSTCompletionLookup

java.lang.Object
org.apache.lucene.search.suggest.Lookup
org.apache.lucene.search.suggest.fst.WFSTCompletionLookup
All Implemented Interfaces:
Accountable

public class WFSTCompletionLookup extends Lookup
Suggester based on a weighted FST: it first traverses the prefix, then walks the n shortest paths to retrieve top-ranked suggestions.

NOTE: Input weights must be between 0 and Integer.MAX_VALUE; any other value will be rejected.
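The weight range above follows from how weights are stored: as costs of the form (Integer.MAX_VALUE - weight), so the highest-weighted suggestion has the lowest cost and is found first by a shortest-path search. A minimal sketch of that mapping follows. The class name WeightCostSketch and the choice of IllegalArgumentException are assumptions made for illustration and are not taken from the Lucene source.

```java
/** Hypothetical helper illustrating the weight <-> cost mapping; not Lucene's own code. */
final class WeightCostSketch {
    private WeightCostSketch() {}

    /** weight -> cost: larger weights map to smaller costs. */
    static int encodeWeight(long weight) {
        // Assumption: out-of-range weights are rejected with an exception.
        if (weight < 0 || weight > Integer.MAX_VALUE) {
            throw new IllegalArgumentException("weight out of range: " + weight);
        }
        return Integer.MAX_VALUE - (int) weight;
    }

    /** cost -> weight: the inverse of encodeWeight. */
    static int decodeWeight(long cost) {
        return (int) (Integer.MAX_VALUE - cost);
    }

    public static void main(String[] args) {
        int cost = encodeWeight(10);
        System.out.println("cost for weight 10: " + cost);
        System.out.println("decoded weight: " + decodeWeight(cost));
    }
}
```

Because the mapping reverses order, comparing two costs in ascending order is equivalent to comparing the original weights in descending order.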

  • Field Details

    • fst

      private FST<Long> fst
      The FST. Weights are encoded as costs: (Integer.MAX_VALUE - weight).
    • exactFirst

      private final boolean exactFirst
      True if exact match suggestions should always be returned first.
    • count

      private volatile long count
      Number of entries the lookup was built with.
    • tempDir

      private final Directory tempDir
    • tempFileNamePrefix

      private final String tempFileNamePrefix
    • weightComparator

      static final Comparator<Long> weightComparator
  • Constructor Details

    • WFSTCompletionLookup

      public WFSTCompletionLookup(Directory tempDir, String tempFileNamePrefix)
    • WFSTCompletionLookup

      public WFSTCompletionLookup(Directory tempDir, String tempFileNamePrefix, boolean exactFirst)
      Creates a new suggester.
      Parameters:
      exactFirst - true if suggestions that match the prefix exactly should always be returned first, regardless of score. This has no performance impact, but could result in low-quality suggestions.
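To make the effect of exactFirst concrete, the sketch below mimics the observable behaviour (prefix match, highest weight first, optional exact match on top) using a plain sorted map. It is a stand-in for illustration only: it does not use an FST or an n-shortest-paths search, and the class PrefixTopNSketch and its methods are hypothetical names, not part of Lucene.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical stand-in showing exactFirst semantics with a sorted map instead of an FST. */
final class PrefixTopNSketch {
    private final TreeMap<String, Long> entries = new TreeMap<>();
    private final boolean exactFirst;

    PrefixTopNSketch(boolean exactFirst) {
        this.exactFirst = exactFirst;
    }

    void add(String key, long weight) {
        entries.put(key, weight);
    }

    /** Returns up to num keys starting with prefix, highest weight first. */
    List<String> lookup(String prefix, int num) {
        List<String> out = new ArrayList<>();
        if (num <= 0) {
            return out;
        }
        // Keys sharing a prefix form a contiguous range in a sorted map.
        // Simplification: keys containing '\uffff' right after the prefix are not handled.
        List<Map.Entry<String, Long>> matches = new ArrayList<>(
            entries.subMap(prefix, true, prefix + Character.MAX_VALUE, false).entrySet());
        // Sort by weight, descending; the sort is stable, so ties keep key order.
        matches.sort(Map.Entry.<String, Long>comparingByValue(Comparator.reverseOrder()));

        // With exactFirst, an entry equal to the prefix goes first regardless of weight.
        if (exactFirst && entries.containsKey(prefix)) {
            out.add(prefix);
        }
        for (Map.Entry<String, Long> e : matches) {
            if (out.size() >= num) {
                break;
            }
            if (exactFirst && e.getKey().equals(prefix)) {
                continue; // already placed at the top
            }
            out.add(e.getKey());
        }
        return out;
    }
}
```

With entries "wit" (weight 4), "with" (30) and "without" (50), a lookup for "wit" with num = 2 yields "without", "with" when exactFirst is false, and "wit", "without" when it is true. The second result shows how a low-weight exact match can displace a more popular completion.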
  • Method Details

    • build

      public void build(InputIterator iterator) throws IOException
      Description copied from class: Lookup
      Builds up a new internal Lookup representation based on the given InputIterator. The implementation might re-sort the data internally.
      Specified by:
      build in class Lookup
      Throws:
      IOException
    • store

      public boolean store(DataOutput output) throws IOException
      Description copied from class: Lookup
      Persist the constructed lookup data to a directory. Optional operation.
      Specified by:
      store in class Lookup
      Parameters:
      output - DataOutput to write the data to.
      Returns:
      true if successful, false if unsuccessful or not supported.
      Throws:
      IOException - when fatal IO error occurs.
    • load

      public boolean load(DataInput input) throws IOException
      Description copied from class: Lookup
      Discard current lookup data and load it from a previously saved copy. Optional operation.
      Specified by:
      load in class Lookup
      Parameters:
      input - the DataInput to load the lookup data.
      Returns:
      true if completed successfully, false if unsuccessful or not supported.
      Throws:
      IOException - when fatal IO error occurs.
    • lookup

      public List<Lookup.LookupResult> lookup(CharSequence key, Set<BytesRef> contexts, boolean onlyMorePopular, int num)
      Description copied from class: Lookup
      Looks up a key and returns possible completions for it.
      Specified by:
      lookup in class Lookup
      Parameters:
      key - lookup key. Depending on the implementation this may be a prefix, misspelling, or even infix.
      contexts - contexts to filter the lookup by, or null if all contexts are allowed; if the suggestion contains any of the contexts, it's a match
      onlyMorePopular - return only more popular results
      num - maximum number of results to return
      Returns:
      a list of possible completions, with their relative weight (e.g. popularity)
    • lookupPrefix

      private Long lookupPrefix(BytesRef scratch, FST.Arc<Long> arc) throws IOException
      Throws:
      IOException
    • get

      public Object get(CharSequence key)
      Returns the weight associated with an input string, or null if it does not exist.
    • decodeWeight

      private static int decodeWeight(long encoded)
      Decodes a stored cost back into a weight (cost -> weight).
    • encodeWeight

      private static int encodeWeight(long value)
      Encodes a weight as a cost (weight -> cost).
    • ramBytesUsed

      public long ramBytesUsed()
      Returns byte size of the underlying FST.
    • getChildResources

      public Collection<Accountable> getChildResources()
      Description copied from interface: Accountable
      Returns nested resources of this class. The result should be a point-in-time snapshot (to avoid race conditions).
    • getCount

      public long getCount()
      Description copied from class: Lookup
      Gets the number of entries the lookup was built with.
      Specified by:
      getCount in class Lookup
      Returns:
      total number of suggester entries