Class AutomatonTermsEnum

  • All Implemented Interfaces:
    BytesRefIterator

    public class AutomatonTermsEnum
    extends FilteredTermsEnum
    A FilteredTermsEnum that enumerates terms based upon what is accepted by a DFA.

    The algorithm is such:

    1. As long as matches are successful, keep reading sequentially.
    2. When a match fails, skip to the next string in lexicographic order that does not enter a reject state.

    The algorithm does not attempt to actually skip to the next string that is completely accepted. This is not possible when the language accepted by the FSM is not finite (i.e. * operator).

    • Field Detail

      • commonSuffixRef

        private final BytesRef commonSuffixRef
      • finite

        private final boolean finite
      • automaton

        private final Automaton automaton
      • visited

        private final long[] visited
      • curGen

        private long curGen
      • linear

        private boolean linear
      • linearUpperBound

        private final BytesRef linearUpperBound
    • Constructor Detail

      • AutomatonTermsEnum

        public AutomatonTermsEnum​(TermsEnum tenum,
                                  CompiledAutomaton compiled)
        Construct an enumerator based upon an automaton, enumerating the specified field, working on a supplied TermsEnum
        Parameters:
        compiled - CompiledAutomaton
    • Method Detail

      • setLinear

        private void setLinear​(int position)
        Sets the enum to operate in linear fashion, as we have found a looping transition at position: we set an upper bound and act like a TermRangeQuery for this portion of the term space.
      • nextString

        private boolean nextString()
        Increments the byte buffer to the next String in binary order after s that will not put the machine into a reject state. If such a string does not exist, returns false. The correctness of this method depends upon the automaton being deterministic, and having no transitions to dead states.
        Returns:
        true if more possible solutions exist for the DFA
      • nextString

        private boolean nextString​(int state,
                                   int position)
        Returns the next String in lexicographic order that will not put the machine into a reject state. This method traverses the DFA from the given position in the String, starting at the given state. If this cannot satisfy the machine, returns false. This method will walk the minimal path, in lexicographic order, as long as possible. If this method returns false, then there might still be more solutions, it is necessary to backtrack to find out.
        Parameters:
        state - current non-reject state
        position - useful portion of the string
        Returns:
        true if more possible solutions exist for the DFA from this position
      • backtrack

        private int backtrack​(int position)
        Attempts to backtrack thru the string after encountering a dead end at some given position. Returns false if no more possible strings can match.
        Parameters:
        position - current position in the input String
        Returns:
        position >= 0 if more possible solutions exist for the DFA