Class LCSubstringSolver.ConcurrentSuffixTreeImpl<V>
- java.lang.Object
-
- com.googlecode.concurrenttrees.radix.ConcurrentRadixTree<V>
-
- com.googlecode.concurrenttrees.solver.LCSubstringSolver.ConcurrentSuffixTreeImpl<V>
-
- All Implemented Interfaces:
PrettyPrintable
,RadixTree<V>
,java.io.Serializable
- Enclosing class:
- LCSubstringSolver
class LCSubstringSolver.ConcurrentSuffixTreeImpl<V> extends ConcurrentRadixTree<V>
-
-
Nested Class Summary
-
Nested classes/interfaces inherited from class com.googlecode.concurrenttrees.radix.ConcurrentRadixTree
ConcurrentRadixTree.KeyValuePairImpl<O>, ConcurrentRadixTree.NodeKeyPair
-
-
Field Summary
-
Fields inherited from class com.googlecode.concurrenttrees.radix.ConcurrentRadixTree
root
-
-
Constructor Summary
Constructors Constructor Description ConcurrentSuffixTreeImpl(NodeFactory nodeFactory)
-
Method Summary
All Methods Instance Methods Concrete Methods Modifier and Type Method Description protected void
acquireWriteLock()
(package private) java.lang.CharSequence
getLongestCommonSubstring()
The main algorithm to find the longest common substring.protected java.lang.Iterable<ConcurrentRadixTree.NodeKeyPair>
lazyTraverseDescendants(java.lang.CharSequence startKey, Node startNode)
Traverses the tree using depth-first, preordered traversal, starting at the given node, using lazy evaluation such that the next node is only determined when next() is called on the iterator returned.protected void
releaseWriteLock()
(package private) boolean
subTreeReferencesAllOriginalDocuments(java.lang.CharSequence startKey, Node startNode)
Returns true if the given node and its descendants collectively reference all original documents added to the solver.-
Methods inherited from class com.googlecode.concurrenttrees.radix.ConcurrentRadixTree
getClosestKeys, getKeysStartingWith, getKeyValuePairsForClosestKeys, getKeyValuePairsForKeysStartingWith, getNode, getValueForExactKey, getValuesForClosestKeys, getValuesForKeysStartingWith, put, putIfAbsent, remove, size, transformKeyForResult
-
-
-
-
Constructor Detail
-
ConcurrentSuffixTreeImpl
public ConcurrentSuffixTreeImpl(NodeFactory nodeFactory)
-
-
Method Detail
-
acquireWriteLock
protected void acquireWriteLock()
- Overrides:
acquireWriteLock
in classConcurrentRadixTree<V>
-
releaseWriteLock
protected void releaseWriteLock()
- Overrides:
releaseWriteLock
in classConcurrentRadixTree<V>
-
lazyTraverseDescendants
protected java.lang.Iterable<ConcurrentRadixTree.NodeKeyPair> lazyTraverseDescendants(java.lang.CharSequence startKey, Node startNode)
Description copied from class:ConcurrentRadixTree
Traverses the tree using depth-first, preordered traversal, starting at the given node, using lazy evaluation such that the next node is only determined when next() is called on the iterator returned. The traversal algorithm uses iteration instead of recursion to allow deep trees to be traversed without requiring large JVM stack sizes. Each node that is encountered is returned from the iterator along with a key associated with that node, in a NodeKeyPair object. The key will be prefixed by the given start key, and will be generated by appending to the start key the edges traversed along the path to that node from the start node.- Overrides:
lazyTraverseDescendants
in classConcurrentRadixTree<V>
- Parameters:
startKey
- The key which matches the given start nodestartNode
- The start node- Returns:
- An iterator which when iterated traverses the tree using depth-first, preordered traversal, starting at the given start node
-
getLongestCommonSubstring
java.lang.CharSequence getLongestCommonSubstring()
The main algorithm to find the longest common substring.- Traverses all nodes in the suffix tree
- For each node checks if the path from the root via edges to that node is longer than the longest common substring encountered so far (and so is a candidate)
-
Calls helper method
subTreeReferencesAllOriginalDocuments(CharSequence, Node)
, supplying the candidate node. That method returns true if nodes in the sub-tree descending from that node collectively references all of the original documents added to the solver - If the nodes in the sub-tree do collectively reference all documents, then the path from root to the current node is a substring of all documents
- Updates the longest common substring encountered so far if the conditions above hold for the current node
- Continues traversing the tree until all nodes have been checked
subTreeReferencesAllOriginalDocuments(CharSequence, Node)
will stop traversal early if it finds all original documents early. This method currently does not apply a similar optimization, and will actually descend into and apply the same tests to branches which the helper method already indicated are dead-ends(!). Future work might be to use this knowledge, skip dead-end branches, but it would involve not using any of the traversal logic from superclasses and overriding it all here for this one use case.- Returns:
- The longest common substring
-
subTreeReferencesAllOriginalDocuments
boolean subTreeReferencesAllOriginalDocuments(java.lang.CharSequence startKey, Node startNode)
Returns true if the given node and its descendants collectively reference all original documents added to the solver. This method will traverse the entire sub-tree until it has encountered all of the original documents. If it encounters all of the original documents early, before exhausting all nodes, returns early.- Parameters:
startKey
- The key associated with the start node (concatenation of edges from root leading to it)startNode
- The root of the sub-tree to traverse- Returns:
- True if the given node and its descendants collectively reference all original documents added to the solver, false if the sub-tree does not reference all documents added to the solver
-
-