Class SegmentInfos

java.lang.Object
org.apache.lucene.index.SegmentInfos
All Implemented Interfaces:
Cloneable, Iterable<SegmentCommitInfo>

public final class SegmentInfos extends Object implements Cloneable, Iterable<SegmentCommitInfo>
A collection of segmentInfo objects with methods for operating on those segments in relation to the file system.

The active segments in the index are stored in the segment info file, segments_N. There may be one or more segments_N files in the index; however, the one with the largest generation is the active one (when older segments_N files are present it's because they temporarily cannot be deleted, or a custom IndexDeletionPolicy is in use). This file lists each segment by name and has details about the codec and generation of deletes.

Files:

  • segments_N: Header, LuceneVersion, Version, NameCounter, SegCount, MinSegmentLuceneVersion, <SegName, SegID, SegCodec, DelGen, DeletionCount, FieldInfosGen, DocValuesGen, UpdatesFiles>^SegCount, CommitUserData, Footer
Data types:
  • Header --> IndexHeader
  • LuceneVersion --> Which Lucene code Version was used for this commit, written as three vInt: major, minor, bugfix
  • MinSegmentLuceneVersion --> Lucene code Version of the oldest segment, written as three vInt: major, minor, bugfix; this is written only if there's at least one segment
  • NameCounter, SegCount, DeletionCount --> Int32
  • Generation, Version, DelGen, Checksum, FieldInfosGen, DocValuesGen --> Int64
  • SegID --> Int8^ID_LENGTH
  • SegName, SegCodec --> String
  • CommitUserData --> Map<String,String>
  • UpdatesFiles --> Map<Int32, Set<String>>
  • Footer --> CodecFooter
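The version triples above are written as vInts. As a rough illustration of that encoding (seven data bits per byte, low-order group first, high bit set on every byte except the last), here is a standalone sketch that uses only the JDK. `VIntSketch` and its methods are hypothetical names chosen for this example and are not part of SegmentInfos; in Lucene itself the store-layer classes DataOutput and DataInput do this work.

```java
import java.io.ByteArrayOutputStream;
import java.nio.ByteBuffer;

/** Illustrative only: mirrors the vInt byte layout, not a Lucene class. */
final class VIntSketch {

  /** Writes an int as 1 to 5 bytes, 7 bits per byte, low-order bits first. */
  static void writeVInt(ByteArrayOutputStream out, int value) {
    while ((value & ~0x7F) != 0) {
      out.write((value & 0x7F) | 0x80); // high bit set: more bytes follow
      value >>>= 7;
    }
    out.write(value); // last byte has the high bit clear
  }

  /** Reads one vInt starting at the buffer's current position. */
  static int readVInt(ByteBuffer in) {
    int result = 0;
    for (int shift = 0; shift < 35; shift += 7) {
      byte b = in.get();
      result |= (b & 0x7F) << shift;
      if (b >= 0) {
        return result; // high bit clear: this was the last byte
      }
    }
    throw new IllegalStateException("vInt is longer than 5 bytes");
  }

  /** Encodes major, minor, bugfix in the order the version fields are described. */
  static byte[] encodeVersion(int major, int minor, int bugfix) {
    ByteArrayOutputStream out = new ByteArrayOutputStream();
    writeVInt(out, major);
    writeVInt(out, minor);
    writeVInt(out, bugfix);
    return out.toByteArray();
  }
}
```

Small version components each fit in a single byte, so a version triple typically occupies three bytes on disk.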
Field Descriptions:
  • Version counts how often the index has been changed by adding or deleting documents.
  • NameCounter is used to generate names for new segment files.
  • SegName is the name of the segment, and is used as the file name prefix for all of the files that compose the segment's index.
  • DelGen is the generation count of the deletes file. If this is -1, there are no deletes. Anything above zero means there are deletes stored by LiveDocsFormat.
  • DeletionCount records the number of deleted documents in this segment.
  • SegCodec is the name of the Codec that encoded this segment.
  • SegID is the unique identifier of this segment.
  • CommitUserData stores an optional user-supplied opaque Map<String,String> that was passed to IndexWriter.setLiveCommitData(Iterable).
  • FieldInfosGen is the generation count of the fieldInfos file. If this is -1, there are no updates to the fieldInfos in that segment. Anything above zero means there are updates to fieldInfos stored by FieldInfosFormat.
  • DocValuesGen is the generation count of the updatable DocValues. If this is -1, there are no updates to DocValues in that segment. Anything above zero means there are updates to DocValues stored by DocValuesFormat.
  • UpdatesFiles stores the set of files that were updated in that segment per field.
  • Field Details

    • VERSION_70

      public static final int VERSION_70
      The version that added information about the Lucene version in use when the index was created.
    • VERSION_72

      public static final int VERSION_72
      The version that updated segment name counter to be long instead of int.
    • VERSION_74

      public static final int VERSION_74
      The version that recorded softDelCount.
    • VERSION_86

      public static final int VERSION_86
      The version that recorded SegmentCommitInfo IDs.
    • VERSION_CURRENT

      static final int VERSION_CURRENT
    • OLD_SEGMENTS_GEN

      private static final String OLD_SEGMENTS_GEN
      Name of the generation reference file.
    • counter

      public long counter
      Used to name new segments.
    • version

      public long version
      Counts how often the index has been changed.
    • generation

      private long generation
    • lastGeneration

      private long lastGeneration
    • userData

      public Map<String,String> userData
      Opaque Map<String,String> that the user can specify during IndexWriter.commit.
    • segments

      private List<SegmentCommitInfo> segments
    • infoStream

      private static PrintStream infoStream
      If non-null, information about loading segments_N files will be printed here.
    • id

      private byte[] id
      Id for this commit; only written starting with Lucene 5.0
    • luceneVersion

      private Version luceneVersion
      Which Lucene version wrote this commit.
    • minSegmentLuceneVersion

      private Version minSegmentLuceneVersion
      Version of the oldest segment in the index, or null if there are no segments.
    • indexCreatedVersionMajor

      private final int indexCreatedVersionMajor
      The Lucene version major that was used to create the index.
    • pendingCommit

      boolean pendingCommit
  • Constructor Details

    • SegmentInfos

      public SegmentInfos(int indexCreatedVersionMajor)
      Sole constructor.
      Parameters:
      indexCreatedVersionMajor - the Lucene version major at index creation time, or 6 if the index was created before 7.0
  • Method Details

    • info

      public SegmentCommitInfo info(int i)
      Returns SegmentCommitInfo at the provided index.
    • getLastCommitGeneration

      public static long getLastCommitGeneration(String[] files)
      Get the generation of the most recent commit to the list of index files (N in the segments_N file).
      Parameters:
      files - array of file names to check
    • getLastCommitGeneration

      public static long getLastCommitGeneration(Directory directory) throws IOException
      Get the generation of the most recent commit to the index in this directory (N in the segments_N file).
      Parameters:
      directory - directory to search for the latest segments_N file
      Throws:
      IOException
    • getLastCommitSegmentsFileName

      public static String getLastCommitSegmentsFileName(String[] files)
      Get the filename of the segments_N file for the most recent commit in the list of index files.
      Parameters:
      files - array of file names to check
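The lookup described above can be pictured as a scan over the file names that keeps the largest N. The sketch below is a JDK-only approximation, not the method's actual source: `LastCommitSketch` is a hypothetical class, and it assumes that the N suffix is written in base 36 (Character.MAX_RADIX), that -1 signals "no commit found", and that a null file name is returned in that case.

```java
/** Illustrative only: approximates the "latest segments_N" lookup with plain strings. */
final class LastCommitSketch {
  private static final String PREFIX = "segments_";

  /** Returns the largest generation among segments_N names, or -1 if there is none. */
  static long lastCommitGeneration(String[] files) {
    long max = -1;
    for (String file : files) {
      if (file.startsWith(PREFIX)) {
        // Assumption: the suffix after the underscore is a base-36 number.
        long gen = Long.parseLong(file.substring(PREFIX.length()), Character.MAX_RADIX);
        max = Math.max(max, gen);
      }
    }
    return max;
  }

  /** Returns the segments_N name for the largest generation, or null if there is none. */
  static String lastCommitSegmentsFileName(String[] files) {
    long gen = lastCommitGeneration(files);
    return gen == -1 ? null : PREFIX + Long.toString(gen, Character.MAX_RADIX);
  }
}
```

Comparing parsed generations rather than raw strings matters under this assumption: "segments_z" (35) sorts after "segments_10" (36) lexicographically, yet the latter is the newer commit.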
    • getLastCommitSegmentsFileName

      public static String getLastCommitSegmentsFileName(Directory directory) throws IOException
      Get the filename of the segments_N file for the most recent commit to the index in this Directory.
      Parameters:
      directory - directory to search for the latest segments_N file
      Throws:
      IOException
    • getSegmentsFileName

      public String getSegmentsFileName()
      Get the segments_N filename in use by this segment infos.
    • generationFromSegmentsFileName

      public static long generationFromSegmentsFileName(String fileName)
      Parse the generation off the segments file name and return it.
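As a hedged illustration of what "parsing the generation off the file name" involves, the following JDK-only sketch converts between a generation and a segments_N name. `SegmentsGenerationSketch` is a hypothetical helper, and the base-36 suffix (Character.MAX_RADIX) is an assumption stated here rather than something this page specifies; names that do not match are rejected with an IllegalArgumentException in this sketch.

```java
/** Illustrative only: round-trips a generation through a segments_N style name. */
final class SegmentsGenerationSketch {
  private static final String PREFIX = "segments_";

  /** Extracts N from "segments_N", treating the suffix as base 36 (an assumption). */
  static long generationOf(String fileName) {
    if (!fileName.startsWith(PREFIX)) {
      throw new IllegalArgumentException("not a segments file: " + fileName);
    }
    // NumberFormatException is a subclass of IllegalArgumentException,
    // so a malformed suffix is reported the same way as a wrong prefix.
    return Long.parseLong(fileName.substring(PREFIX.length()), Character.MAX_RADIX);
  }

  /** Builds the file name for a generation; the inverse of generationOf. */
  static String fileNameOf(long generation) {
    return PREFIX + Long.toString(generation, Character.MAX_RADIX);
  }
}
```

With these assumptions, generation 35 maps to "segments_z" and generation 36 to "segments_10".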
    • getNextPendingGeneration

      private long getNextPendingGeneration()
      Returns the generation of the next pending_segments_N that will be written.
    • getId

      public byte[] getId()
      Since Lucene 5.0, every commit (segments_N) writes a unique id. This method returns that id.
    • readCommit

      public static final SegmentInfos readCommit(Directory directory, String segmentFileName) throws IOException
      Read a particular segmentFileName. Note that this may throw an IOException if a commit is in progress.
      Parameters:
      directory - directory containing the segments file
      segmentFileName - segment file to load
      Throws:
      CorruptIndexException - if the index is corrupt
      IOException - if there is a low-level IO error
    • readCommit

      static final SegmentInfos readCommit(Directory directory, String segmentFileName, int minSupportedMajorVersion) throws IOException
      Throws:
      IOException
    • readCommit

      public static final SegmentInfos readCommit(Directory directory, ChecksumIndexInput input, long generation) throws IOException
      Read the commit from the provided ChecksumIndexInput.
      Throws:
      IOException
    • readCommit

      static final SegmentInfos readCommit(Directory directory, ChecksumIndexInput input, long generation, int minSupportedMajorVersion) throws IOException
      Read the commit from the provided ChecksumIndexInput.
      Throws:
      IOException
    • parseSegmentInfos

      private static void parseSegmentInfos(Directory directory, DataInput input, SegmentInfos infos, int format) throws IOException
      Throws:
      IOException
    • readCodec

      private static Codec readCodec(DataInput input) throws IOException
      Throws:
      IOException
    • readLatestCommit

      public static final SegmentInfos readLatestCommit(Directory directory) throws IOException
      Find the latest commit (segments_N file) and load all SegmentCommitInfos.
      Throws:
      IOException
    • readLatestCommit

      static final SegmentInfos readLatestCommit(Directory directory, int minSupportedMajorVersion) throws IOException
      Throws:
      IOException
    • write

      private void write(Directory directory) throws IOException
      Throws:
      IOException
    • write

      public void write(IndexOutput out) throws IOException
      Writes this SegmentInfos to the provided IndexOutput.
      Throws:
      IOException
    • clone

      public SegmentInfos clone()
      Returns a copy of this instance, also copying each SegmentInfo.
      Overrides:
      clone in class Object
    • getVersion

      public long getVersion()
      Returns the version number when this SegmentInfos was generated.
    • getGeneration

      public long getGeneration()
      Returns current generation.
    • getLastGeneration

      public long getLastGeneration()
      Returns the last successfully read or written generation.
    • setInfoStream

      public static void setInfoStream(PrintStream infoStream)
      If non-null, information about retries when loading the segments file will be printed to this.
    • getInfoStream

      public static PrintStream getInfoStream()
      Returns infoStream.
    • message

      private static void message(String message)
      Prints the given message to the infoStream. Note, this method does not check for null infoStream. It assumes this check has been performed by the caller, which is recommended to avoid the (usually) expensive message creation.
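The calling convention described for message (the caller checks for a null infoStream first, so the message string is never built when logging is off) can be sketched with plain JDK types. `InfoStreamGuardSketch`, `describe`, `load`, and the `messagesBuilt` counter are hypothetical names used only to make the saved work observable.

```java
import java.io.PrintStream;

/** Illustrative only: shows the "check before building the message" convention. */
final class InfoStreamGuardSketch {
  private static PrintStream infoStream; // null means diagnostics are off
  static int messagesBuilt = 0; // counts how often a message string was built

  static void setInfoStream(PrintStream stream) {
    infoStream = stream;
  }

  /** Does not check for null; callers are expected to have done so. */
  private static void message(String message) {
    infoStream.println(message);
  }

  /** Stands in for a message that is costly to build. */
  private static String describe(long generation) {
    messagesBuilt++;
    return "loading generation " + generation;
  }

  static void load(long generation) {
    if (infoStream != null) { // guard first, so describe() is skipped when off
      message(describe(generation));
    }
  }
}
```

Placing the null check in the caller rather than inside message is what avoids the string construction; a check inside message would come too late to save that cost.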
    • updateGeneration

      public void updateGeneration(SegmentInfos other)
      Carries over generation numbers from another SegmentInfos.
    • updateGenerationVersionAndCounter

      void updateGenerationVersionAndCounter(SegmentInfos other)
    • setNextWriteGeneration

      public void setNextWriteGeneration(long generation)
      Sets the generation to be used for the next commit.
    • rollbackCommit

      final void rollbackCommit(Directory dir)
    • prepareCommit

      final void prepareCommit(Directory dir) throws IOException
      Call this to start a commit. This writes the new segments file, but writes an invalid checksum at the end, so that it is not visible to readers. Once this is called you must call finishCommit(org.apache.lucene.store.Directory) to complete the commit or rollbackCommit(org.apache.lucene.store.Directory) to abort it.

      Note: changed() should be called prior to this method if changes have been made to this SegmentInfos instance.

      Throws:
      IOException
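The prepare / finish / rollback contract is a two-phase commit: the first phase writes data that readers do not yet see, and the second phase either publishes it or discards it. The sketch below shows that general pattern with JDK file APIs only, hiding the unfinished commit under a pending_segments_N name and publishing it with an atomic rename. This is an assumption made for illustration and is not a transcription of how this class hides the file; `TwoPhaseCommitSketch` and `newTempDir` are hypothetical names.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.StandardCopyOption;

/** Illustrative only: a generic two-phase commit over files, not Lucene's code. */
final class TwoPhaseCommitSketch {
  private final Path dir;
  private final String suffix;
  private boolean pendingCommit;

  TwoPhaseCommitSketch(Path dir, long generation) {
    this.dir = dir;
    this.suffix = Long.toString(generation, Character.MAX_RADIX);
  }

  private Path pending() {
    return dir.resolve("pending_segments_" + suffix);
  }

  private Path committed() {
    return dir.resolve("segments_" + suffix);
  }

  /** Phase one: write under a name that readers do not look for. */
  void prepareCommit(String payload) {
    if (pendingCommit) {
      throw new IllegalStateException("prepareCommit was already called");
    }
    try {
      Files.write(pending(), payload.getBytes(StandardCharsets.UTF_8));
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    pendingCommit = true;
  }

  /** Phase two: publish with an atomic rename and return the committed file name. */
  String finishCommit() {
    if (!pendingCommit) {
      throw new IllegalStateException("prepareCommit was not called");
    }
    try {
      Files.move(pending(), committed(), StandardCopyOption.ATOMIC_MOVE);
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    pendingCommit = false;
    return committed().getFileName().toString();
  }

  /** Abort: discard the pending file so no trace of the commit remains. */
  void rollbackCommit() {
    if (pendingCommit) {
      try {
        Files.deleteIfExists(pending());
      } catch (IOException e) {
        throw new UncheckedIOException(e);
      }
      pendingCommit = false;
    }
  }

  /** Helper for trying the sketch out in a scratch directory. */
  static Path newTempDir() {
    try {
      return Files.createTempDirectory("two-phase-sketch");
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
  }
}
```

Whatever mechanism hides the unfinished file, the caller-facing rule is the same: every prepareCommit must be followed by exactly one of finishCommit or rollbackCommit.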
    • files

      public Collection<String> files(boolean includeSegmentsFile) throws IOException
      Returns all file names referenced by SegmentInfo. The returned collection is recomputed on each invocation.
      Throws:
      IOException
    • finishCommit

      final String finishCommit(Directory dir) throws IOException
      Returns the committed segments_N filename.
      Throws:
      IOException
    • commit

      public final void commit(Directory dir) throws IOException
      Writes and syncs to the Directory dir, taking care to remove the segments file on exception.

      Note: changed() should be called prior to this method if changes have been made to this SegmentInfos instance.

      Throws:
      IOException
    • toString

      public String toString()
      Returns readable description of this segment.
      Overrides:
      toString in class Object
    • getUserData

      public Map<String,String> getUserData()
      Return userData saved with this commit.
    • setUserData

      public void setUserData(Map<String,String> data, boolean doIncrementVersion)
      Sets the commit data.
    • replace

      void replace(SegmentInfos other)
      Replaces all segments in this instance, but keeps generation, version, counter so that future commits remain write once.
    • totalMaxDoc

      public int totalMaxDoc()
      Returns the sum of all segments' maxDocs. Note that this does not account for deletions.
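To make the distinction concrete: maxDoc counts every document slot in a segment, including documents that have since been deleted, so the total can exceed the number of live documents. The sketch below uses a hypothetical `Segment` holder with `maxDoc` and `delCount` fields; it is a JDK-only illustration under those assumptions and does not use SegmentCommitInfo.

```java
import java.util.List;

/** Illustrative only: contrasts a maxDoc total with a live-document total. */
final class TotalMaxDocSketch {

  /** Hypothetical stand-in for one segment's counters. */
  static final class Segment {
    final int maxDoc; // all document slots, deleted or not
    final int delCount; // how many of those slots are deleted

    Segment(int maxDoc, int delCount) {
      this.maxDoc = maxDoc;
      this.delCount = delCount;
    }
  }

  /** Sums maxDoc over all segments; deletions are not subtracted. */
  static int totalMaxDoc(List<Segment> segments) {
    long count = 0;
    for (Segment segment : segments) {
      count += segment.maxDoc;
    }
    return Math.toIntExact(count); // fail loudly instead of overflowing silently
  }

  /** Sums live documents, that is maxDoc minus deletions, for comparison. */
  static int totalLiveDocs(List<Segment> segments) {
    long count = 0;
    for (Segment segment : segments) {
      count += segment.maxDoc - segment.delCount;
    }
    return Math.toIntExact(count);
  }
}
```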
    • changed

      public void changed()
      Call this before committing if changes have been made to the segments.
    • setVersion

      void setVersion(long newVersion)
    • applyMergeChanges

      void applyMergeChanges(MergePolicy.OneMerge merge, boolean dropSegment)
      Applies all changes caused by committing a merge to this SegmentInfos.
    • createBackupSegmentInfos

      List<SegmentCommitInfo> createBackupSegmentInfos()
    • rollbackSegmentInfos

      void rollbackSegmentInfos(List<SegmentCommitInfo> infos)
    • iterator

      public Iterator<SegmentCommitInfo> iterator()
      Returns an unmodifiable Iterator of contained segments in order.
      Specified by:
      iterator in interface Iterable<SegmentCommitInfo>
    • asList

      public List<SegmentCommitInfo> asList()
      Returns all contained segments as an unmodifiable List view.
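An "unmodifiable List view" differs from a copy in two ways: it rejects mutation through the view, and it still reflects later changes made through the owning object. The sketch below demonstrates both properties with JDK collections. `UnmodifiableViewSketch` is a hypothetical holder that stores segment names as strings, and wrapping the backing list with Collections.unmodifiableList is an assumption about how such a view might be produced.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

/** Illustrative only: a read-only view over a private, mutable list. */
final class UnmodifiableViewSketch {
  private final List<String> segments = new ArrayList<>();

  /** Mutation goes through the owner. */
  void add(String segmentName) {
    segments.add(segmentName);
  }

  /** Returns a view: reads pass through to the backing list, writes are rejected. */
  List<String> asList() {
    return Collections.unmodifiableList(segments);
  }
}
```

A caller that needs a stable snapshot rather than a live view would copy the result, for example with `new ArrayList<>(view)`.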
    • size

      public int size()
      Returns number of SegmentCommitInfos.
    • add

      public void add(SegmentCommitInfo si)
      Appends the provided SegmentCommitInfo.
    • addAll

      public void addAll(Iterable<SegmentCommitInfo> sis)
      Appends the provided SegmentCommitInfos.
    • clear

      public void clear()
      Clear all SegmentCommitInfos.
    • remove

      public boolean remove(SegmentCommitInfo si)
      Remove the provided SegmentCommitInfo.

      WARNING: O(N) cost

    • remove

      void remove(int index)
      Remove the SegmentCommitInfo at the provided index.

      WARNING: O(N) cost

    • contains

      boolean contains(SegmentCommitInfo si)
      Return true if the provided SegmentCommitInfo is contained.

      WARNING: O(N) cost

    • indexOf

      int indexOf(SegmentCommitInfo si)
      Returns index of the provided SegmentCommitInfo.

      WARNING: O(N) cost

    • getCommitLuceneVersion

      public Version getCommitLuceneVersion()
      Returns which Lucene Version wrote this commit, or null if the version this index was written with did not directly record the version.
    • getMinSegmentLuceneVersion

      public Version getMinSegmentLuceneVersion()
      Returns the version of the oldest segment, or null if there are no segments.
    • getIndexCreatedVersionMajor

      public int getIndexCreatedVersionMajor()
      Returns the major version that was used to initially create the index. This version is set when the index is first created and then never changes. This information was added as of version 7.0, so older indices report 6 as their creation version.