Class PackWriter

  • All Implemented Interfaces:
    java.lang.AutoCloseable

    public class PackWriter
    extends java.lang.Object
    implements java.lang.AutoCloseable

    The PackWriter class is responsible for generating pack files from a specified set of objects in a repository. This implementation produces pack files in format version 2.

    The source of objects may be specified in two ways:

    • (usually) by providing sets of interesting and uninteresting objects in the repository: all interesting objects and their ancestors, except uninteresting objects and their ancestors, will be included in the pack, or
    • by providing an iterator of RevObject that specifies the exact list and order of objects in the pack

    Typical usage consists of creating an instance, configuring options, preparing the list of objects by calling preparePack(Iterator) or preparePack(ProgressMonitor, Set, Set), and streaming with writePack(ProgressMonitor, ProgressMonitor, OutputStream). If the pack is being stored as a file, the matching index can be written out after the pack with writeIndex(OutputStream). An optional bitmap index can be made by calling prepareBitmapIndex(ProgressMonitor) followed by writeBitmapIndex(OutputStream).

    The class provides a set of configurable options and ProgressMonitor support, as operations may take a long time for big repositories. Existing deltas and objects are reused where possible, and a delta search is performed during writePack(ProgressMonitor, ProgressMonitor, OutputStream) if enabled.

    This class is not thread safe. It is intended to be used in one thread as a single pass to produce one pack. Invoking methods multiple times or out of order is not supported as internal data structures are destroyed during certain phases to save memory when packing large repositories.

    • Field Detail

      • PACK_VERSION_GENERATED

        private static final int PACK_VERSION_GENERATED
        See Also:
        Constant Field Values
      • NONE

        public static final java.util.Set<ObjectId> NONE
        Empty set of objects for preparePack().
      • instances

        private static final java.util.Map<java.lang.ref.WeakReference<PackWriter>,​java.lang.Boolean> instances
      • instancesIterable

        private static final java.lang.Iterable<PackWriter> instancesIterable
      • edgeObjects

        private java.util.List<ObjectToPack> edgeObjects
      • cachedPacks

        private java.util.List<CachedPack> cachedPacks
      • tagTargets

        private java.util.Set<ObjectId> tagTargets
      • excludeFromBitmapSelection

        private java.util.Set<? extends ObjectId> excludeFromBitmapSelection
      • excludeInPackLast

        private ObjectIdSet excludeInPackLast
      • myDeflater

        private java.util.zip.Deflater myDeflater
      • reuseSupport

        private final ObjectReuseAsIs reuseSupport
        reader recast to the reuse interface, if it supports it.
      • selfRef

        private final java.lang.ref.WeakReference<PackWriter> selfRef
      • sortedByName

        private java.util.List<ObjectToPack> sortedByName
      • packcsum

        private byte[] packcsum
      • deltaBaseAsOffset

        private boolean deltaBaseAsOffset
      • reuseDeltas

        private boolean reuseDeltas
      • reuseDeltaCommits

        private boolean reuseDeltaCommits
      • reuseValidate

        private boolean reuseValidate
      • thin

        private boolean thin
      • useCachedPacks

        private boolean useCachedPacks
      • useBitmaps

        private boolean useBitmaps
      • ignoreMissingUninteresting

        private boolean ignoreMissingUninteresting
      • pruneCurrentObjectList

        private boolean pruneCurrentObjectList
      • shallowPack

        private boolean shallowPack
      • canBuildBitmaps

        private boolean canBuildBitmaps
      • indexDisabled

        private boolean indexDisabled
      • depth

        private int depth
      • unshallowObjects

        private java.util.Collection<? extends ObjectId> unshallowObjects
      • crc32

        private java.util.zip.CRC32 crc32
    • Method Detail

      • getInstances

        public static java.lang.Iterable<PackWriter> getInstances()
        Get all allocated, non-released PackWriter instances.
        Returns:
        all allocated, non-released PackWriter instances.
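
        The field types listed above (a map keyed by WeakReference plus a per-instance selfRef) suggest a weak-reference registry. The following is a minimal sketch of that general pattern using only the JDK; TrackedWriter and liveInstances are hypothetical names, and this is not PackWriter's actual code.

```java
import java.lang.ref.WeakReference;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

/** Toy stand-in (hypothetical) for a writer that registers itself while it is open. */
final class TrackedWriter implements AutoCloseable {
    // Registry of open writers. Weak references let abandoned writers be collected.
    private static final Map<WeakReference<TrackedWriter>, Boolean> INSTANCES =
            new ConcurrentHashMap<>();

    // WeakReference uses identity equality, so it is usable as a map key.
    private final WeakReference<TrackedWriter> selfRef = new WeakReference<>(this);

    TrackedWriter() {
        INSTANCES.put(selfRef, Boolean.TRUE);
    }

    /** Snapshot of writers that are still registered and still strongly reachable. */
    static List<TrackedWriter> liveInstances() {
        List<TrackedWriter> live = new ArrayList<>();
        for (WeakReference<TrackedWriter> ref : INSTANCES.keySet()) {
            TrackedWriter writer = ref.get();
            if (writer != null) {
                live.add(writer);
            } else {
                INSTANCES.remove(ref); // referent was collected; drop the stale key
            }
        }
        return live;
    }

    @Override
    public void close() {
        INSTANCES.remove(selfRef); // released writers disappear from the registry
    }
}
```

        With this shape, a monitoring thread can list writers in flight without keeping them alive, and close() removes a writer from the listing immediately.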
      • setClientShallowCommits

        public void setClientShallowCommits​(java.util.Set<ObjectId> clientShallowCommits)
        Records the set of shallow commits in the client.
        Parameters:
        clientShallowCommits - the shallow commits in the client
      • isDeltaBaseAsOffset

        public boolean isDeltaBaseAsOffset()
        Check whether the writer can store a delta base as an offset (new style, reducing pack size) or should store it as an object id (legacy style, compatible with old readers). Default setting: false.
        Returns:
        true if delta base is stored as an offset; false if it is stored as an object id.
      • setDeltaBaseAsOffset

        public void setDeltaBaseAsOffset​(boolean deltaBaseAsOffset)
        Set the writer's delta base format. A delta base can be written as an offset in the pack file (new approach, reducing file size) or as an object id (legacy approach, compatible with old readers). Default setting: false.
        Parameters:
        deltaBaseAsOffset - boolean indicating whether delta base can be stored as an offset.
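
        To illustrate why the offset form is smaller: in the git pack format an object-id base costs 20 bytes (SHA-1), while an offset base is a variable-length number giving the distance back to the base. The sketch below implements that variable-length encoding as the pack format defines it; DeltaBaseOffsetSketch and its method names are hypothetical, and this is not taken from PackWriter itself.

```java
import java.util.Arrays;

/** Sketch (hypothetical class) of the pack format's variable-length delta base offset. */
final class DeltaBaseOffsetSketch {
    /** Encodes the distance from a delta back to its base, most significant group first. */
    static byte[] encodeOffset(long offset) {
        if (offset < 0) {
            throw new IllegalArgumentException("offset must not be negative: " + offset);
        }
        byte[] buf = new byte[10]; // ample room for a 63-bit value in 7-bit groups
        int n = buf.length;
        buf[--n] = (byte) (offset & 0x7f); // last byte has no continuation bit
        while ((offset >>>= 7) != 0) {
            // Each earlier byte sets the continuation bit and stores (group - 1),
            // which avoids redundant encodings of the same value.
            buf[--n] = (byte) (0x80 | (--offset & 0x7f));
        }
        return Arrays.copyOfRange(buf, n, buf.length);
    }

    /** Decodes a value produced by encodeOffset. */
    static long decodeOffset(byte[] encoded) {
        int p = 0;
        int c = encoded[p++] & 0xff;
        long offset = c & 0x7f;
        while ((c & 0x80) != 0) {
            offset += 1; // undo the "minus one" applied by the encoder
            c = encoded[p++] & 0xff;
            offset = (offset << 7) + (c & 0x7f);
        }
        return offset;
    }
}
```

        A base 128 bytes back encodes in two bytes instead of a 20-byte object id, which is where the size reduction comes from. Older readers that only understand object-id bases cannot parse such entries, hence the compatibility caveat.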
      • isReuseDeltaCommits

        public boolean isReuseDeltaCommits()
        Check if the writer will reuse commits that are already stored as deltas.
        Returns:
        true if the writer would reuse commits stored as deltas, assuming delta reuse is already enabled.
      • setReuseDeltaCommits

        public void setReuseDeltaCommits​(boolean reuse)
        Set the writer to reuse existing delta versions of commits.
        Parameters:
        reuse - if true, the writer will reuse any commits stored as deltas. By default the writer does not reuse delta commits.
      • isReuseValidatingObjects

        public boolean isReuseValidatingObjects()
        Check if the writer validates objects before copying them.
        Returns:
        true if validation is enabled; false if the reader will handle object validation as a side-effect of it consuming the output.
      • setReuseValidatingObjects

        public void setReuseValidatingObjects​(boolean validate)
        Enable (or disable) object validation during packing.
        Parameters:
        validate - if true the pack writer will validate an object before it is put into the output. This additional validation work may be necessary to avoid propagating corruption from one local pack file to another local pack file.
      • isThin

        public boolean isThin()
        Whether this writer is producing a thin pack.
        Returns:
        true if this writer is producing a thin pack.
      • setThin

        public void setThin​(boolean packthin)
        Set whether the writer may pack objects whose delta base is not within the set of objects to pack.
        Parameters:
        packthin - true to produce a thin pack, false otherwise. In a thin pack, an object may be stored as a delta against a base that is not in the set of objects to pack but belongs to the other party's repository (the uninteresting/boundary objects, as determined by the uninteresting set). This kind of pack is used only for transport.
      • isUseCachedPacks

        public boolean isUseCachedPacks()
        Whether to reuse cached packs.
        Returns:
        true to reuse cached packs. If true, index creation isn't available.
      • setUseCachedPacks

        public void setUseCachedPacks​(boolean useCached)
        Whether to use cached packs
        Parameters:
        useCached - if set to true and a cached pack is present, it will be appended onto the end of a thin pack, reducing the amount of working set space and CPU used by PackWriter. Enabling this feature prevents PackWriter from creating an index for the newly created pack, so it is only suitable for writing to a network client, where the client will make the index.
      • isUseBitmaps

        public boolean isUseBitmaps()
        Whether to use bitmaps
        Returns:
        true to use bitmaps for ObjectWalks, if available.
      • setUseBitmaps

        public void setUseBitmaps​(boolean useBitmaps)
        Whether to use bitmaps
        Parameters:
        useBitmaps - if set to true, bitmaps will be used when preparing a pack.
      • isIndexDisabled

        public boolean isIndexDisabled()
        Whether the index file cannot be created by this PackWriter.
        Returns:
        true if the index file cannot be created by this PackWriter.
      • setIndexDisabled

        public void setIndexDisabled​(boolean noIndex)
        Whether to disable creation of the index file.
        Parameters:
        noIndex - true to disable creation of the index file.
      • isIgnoreMissingUninteresting

        public boolean isIgnoreMissingUninteresting()
        Whether to ignore missing uninteresting objects
        Returns:
        true to ignore objects that are uninteresting and also not found on local disk; false to throw a MissingObjectException out of preparePack(ProgressMonitor, Set, Set) if an uninteresting object is not in the source repository. By default, true, so that missing uninteresting objects are ignored gracefully.
      • setIgnoreMissingUninteresting

        public void setIgnoreMissingUninteresting​(boolean ignore)
        Set whether the writer should ignore nonexistent uninteresting objects.
        Parameters:
        ignore - true if the writer should ignore nonexistent uninteresting objects while constructing the set of objects to pack; false otherwise, in which case nonexistent uninteresting objects may cause a MissingObjectException.
      • setTagTargets

        public void setTagTargets​(java.util.Set<ObjectId> objects)
        Set the tag targets that should be hoisted earlier during packing.

        Callers may put objects into this set before invoking any of the preparePack methods to influence where an annotated tag's target is stored within the resulting pack. Typically these will be clustered together, and hoisted earlier in the file even if they are ancient revisions, allowing readers to find tag targets with better locality.

        Parameters:
        objects - objects that annotated tags point at.
      • setShallowPack

        public void setShallowPack​(int depth,
                                   java.util.Collection<? extends ObjectId> unshallow)
        Configure this pack for a shallow clone.
        Parameters:
        depth - maximum depth of history to return. 1 means return only the "wants".
        unshallow - objects which used to be shallow on the client, but are being extended as part of this fetch
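
        As an illustration of the depth parameter only, here is a toy model of a depth-limited walk over a commit graph held in a plain map. ShallowDepthSketch, commitsWithinDepth and the string ids are hypothetical; the real walk is done by the library, and this sketch merely mirrors the stated rule that depth 1 returns only the "wants".

```java
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical) of limiting history to a maximum depth from the wants. */
final class ShallowDepthSketch {
    /**
     * Returns the commits within the given depth of the wants.
     * Level 1 is the wants themselves, level 2 adds their parents, and so on.
     */
    static Set<String> commitsWithinDepth(Map<String, List<String>> parents,
                                          Collection<String> wants,
                                          int depth) {
        Set<String> included = new LinkedHashSet<>();
        List<String> frontier = new ArrayList<>(wants);
        for (int level = 1; level <= depth && !frontier.isEmpty(); level++) {
            List<String> next = new ArrayList<>();
            for (String commit : frontier) {
                if (included.add(commit)) { // visit each commit once
                    next.addAll(parents.getOrDefault(commit, Collections.emptyList()));
                }
            }
            frontier = next;
        }
        return included;
    }
}
```

        For a linear history c3 -> c2 -> c1 with c3 wanted, depth 1 selects only c3 and depth 2 selects c3 and c2.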
      • setFilterSpec

        public void setFilterSpec​(@NonNull
                                  FilterSpec filter)
        Parameters:
        filter - the filter which indicates what and what not this writer should include
      • setPackfileUriConfig

        public void setPackfileUriConfig​(PackWriter.PackfileUriConfig config)
        Parameters:
        config - configuration related to packfile URIs
        Since:
        5.5
      • getObjectCount

        public long getObjectCount()
                            throws java.io.IOException
        Returns the number of objects in the pack file created by this writer.
        Returns:
        number of objects in pack.
        Throws:
        java.io.IOException - a cached pack cannot supply its object count.
      • getUnoffloadedObjectCount

        private long getUnoffloadedObjectCount()
                                        throws java.io.IOException
        Throws:
        java.io.IOException
      • excludeObjects

        public void excludeObjects​(ObjectIdSet idx)
        Add a pack index whose contents should be excluded from the result.
        Parameters:
        idx - objects in this index will not be in the output pack.
      • preparePack

        public void preparePack​(@NonNull
                                java.util.Iterator<RevObject> objectsSource)
                         throws java.io.IOException
        Prepare the list of objects to be written to the pack stream.

        The iterator determines exactly which objects are included in the pack and the order in which they appear (except that ordering objects by type is not needed at input). This order should conform to the general rules for ordering objects in git, by recency and path (type and delta-base first is secured internally); guaranteeing this order is the caller's responsibility. The iterator must return the id of each object to write exactly once.

        Parameters:
        objectsSource - iterator of objects to store in a pack; the order of objects within each type is important, ordering by type is not needed; allowed types for objects are Constants.OBJ_COMMIT, Constants.OBJ_TREE, Constants.OBJ_BLOB and Constants.OBJ_TAG; objects returned by the iterator may later be reused by the caller, as the object id and type are copied internally in each iteration.
        Throws:
        java.io.IOException - an I/O problem occurred while reading objects.
      • preparePack

        public void preparePack​(ProgressMonitor countingMonitor,
                                @NonNull
                                java.util.Set<? extends ObjectId> want,
                                @NonNull
                                java.util.Set<? extends ObjectId> have)
                         throws java.io.IOException
        Prepare the list of objects to be written to the pack stream.

        Based on these two sets, another set of objects to put in the pack file is created: it consists of all objects reachable from (ancestors of) the interesting objects, except uninteresting objects and their ancestors. This method uses class ObjectWalk extensively to find the appropriate set of output objects and their optimal order in the output pack. The order is consistent with general git in-pack rules: sort by object type, recency, path and delta-base first.

        Parameters:
        countingMonitor - progress during object enumeration.
        want - collection of objects to be marked as interesting (start points of graph traversal). Must not be null.
        have - collection of objects to be marked as uninteresting (end points of graph traversal). Pass NONE if all objects reachable from want are desired, such as when serving a clone.
        Throws:
        java.io.IOException - an I/O problem occurred while reading objects.
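
        The selection rule described for want and have can be modelled as a set difference of two reachability closures. The sketch below does this over a toy object graph stored in a map; WantHaveSketch, its methods and the string ids are hypothetical, and it ignores ordering, which the real method also determines.

```java
import java.util.ArrayDeque;
import java.util.Collection;
import java.util.Collections;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Toy model (hypothetical) of "reachable from want, minus reachable from have". */
final class WantHaveSketch {
    /** All objects reachable from the start points, including the start points. */
    static Set<String> reachable(Map<String, List<String>> edges, Collection<String> starts) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> todo = new ArrayDeque<>(starts);
        while (!todo.isEmpty()) {
            String id = todo.pop();
            if (seen.add(id)) { // expand each object only once
                todo.addAll(edges.getOrDefault(id, Collections.emptyList()));
            }
        }
        return seen;
    }

    /** Objects the pack needs: the interesting closure without the uninteresting closure. */
    static Set<String> objectsToPack(Map<String, List<String>> edges,
                                     Collection<String> want,
                                     Collection<String> have) {
        Set<String> result = reachable(edges, want);
        result.removeAll(reachable(edges, have));
        return result;
    }
}
```

        Passing an empty have collection corresponds to the NONE case: everything reachable from want is selected, as when serving a clone.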
      • preparePack

        public void preparePack​(ProgressMonitor countingMonitor,
                                @NonNull
                                java.util.Set<? extends ObjectId> want,
                                @NonNull
                                java.util.Set<? extends ObjectId> have,
                                @NonNull
                                java.util.Set<? extends ObjectId> shallow)
                         throws java.io.IOException
        Prepare the list of objects to be written to the pack stream.

        Like preparePack(ProgressMonitor, Set, Set) but also allows specifying commits that should not be walked past ("shallow" commits). The caller is responsible for filtering out commits that should not be shallow any more ("unshallow" commits as in setShallowPack(int, Collection)) from the shallow set.

        Parameters:
        countingMonitor - progress during object enumeration.
        want - objects of interest, ancestors of which will be included in the pack. Must not be null.
        have - objects whose ancestors (up to and including shallow commits) do not need to be included in the pack because they are already available from elsewhere. Must not be null.
        shallow - commits indicating the boundary of the history marked with have. Shallow commits have parents but those parents are considered not to be already available. Parents of shallow commits and earlier generations will be included in the pack if requested by want. Must not be null.
        Throws:
        java.io.IOException - an I/O problem occurred while reading objects.
      • preparePack

        public void preparePack​(ProgressMonitor countingMonitor,
                                @NonNull
                                java.util.Set<? extends ObjectId> want,
                                @NonNull
                                java.util.Set<? extends ObjectId> have,
                                @NonNull
                                java.util.Set<? extends ObjectId> shallow,
                                @NonNull
                                java.util.Set<? extends ObjectId> noBitmaps)
                         throws java.io.IOException
        Prepare the list of objects to be written to the pack stream.

        Like preparePack(ProgressMonitor, Set, Set) but also allows specifying commits that should not be walked past ("shallow" commits). The caller is responsible for filtering out commits that should not be shallow any more ("unshallow" commits as in setShallowPack(int, Collection)) from the shallow set.

        Parameters:
        countingMonitor - progress during object enumeration.
        want - objects of interest, ancestors of which will be included in the pack. Must not be null.
        have - objects whose ancestors (up to and including shallow commits) do not need to be included in the pack because they are already available from elsewhere. Must not be null.
        shallow - commits indicating the boundary of the history marked with have. Shallow commits have parents but those parents are considered not to be already available. Parents of shallow commits and earlier generations will be included in the pack if requested by want. Must not be null.
        noBitmaps - collection of objects to be excluded from bitmap commit selection.
        Throws:
        java.io.IOException - an I/O problem occurred while reading objects.
      • getObjectWalk

        private ObjectWalk getObjectWalk()
      • preparePack

        public void preparePack​(ProgressMonitor countingMonitor,
                                @NonNull
                                ObjectWalk walk,
                                @NonNull
                                java.util.Set<? extends ObjectId> interestingObjects,
                                @NonNull
                                java.util.Set<? extends ObjectId> uninterestingObjects,
                                @NonNull
                                java.util.Set<? extends ObjectId> noBitmaps)
                         throws java.io.IOException
        Prepare the list of objects to be written to the pack stream.

        Based on these two sets, another set of objects to put in the pack file is created: it consists of all objects reachable from (ancestors of) the interesting objects, except uninteresting objects and their ancestors. This method uses class ObjectWalk extensively to find the appropriate set of output objects and their optimal order in the output pack. The order is consistent with general git in-pack rules: sort by object type, recency, path and delta-base first.

        Parameters:
        countingMonitor - progress during object enumeration.
        walk - ObjectWalk to perform enumeration.
        interestingObjects - collection of objects to be marked as interesting (start points of graph traversal). Must not be null.
        uninterestingObjects - collection of objects to be marked as uninteresting (end points of graph traversal). Pass NONE if all objects reachable from interestingObjects are desired, such as when serving a clone.
        noBitmaps - collection of objects to be excluded from bitmap commit selection.
        Throws:
        java.io.IOException - an I/O problem occurred while reading objects.
      • willInclude

        public boolean willInclude​(AnyObjectId id)
                            throws java.io.IOException
        Determine if the pack file will contain the requested object.
        Parameters:
        id - the object to test the existence of.
        Returns:
        true if the object will appear in the output pack file.
        Throws:
        java.io.IOException - a cached pack cannot be examined.
      • get

        public ObjectToPack get​(AnyObjectId id)
        Lookup the ObjectToPack object for a given ObjectId.
        Parameters:
        id - the object to find in the pack.
        Returns:
        the object we are packing, or null.
      • computeName

        public ObjectId computeName()
        Computes the SHA-1 of the lexicographically sorted object ids written in this pack, as used to name a pack file in the repository.
        Returns:
        ObjectId representing SHA-1 name of a pack that was created.
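
        A minimal sketch of the naming rule just described, using only the JDK. It assumes the digest is taken over the raw 20-byte form of each id in sorted order, which the text above does not spell out; PackNameSketch and its helpers are hypothetical names, and hex strings stand in for ObjectId.

```java
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.ArrayList;
import java.util.Collection;
import java.util.Collections;
import java.util.List;
import java.util.Locale;

/** Sketch (hypothetical) of naming a pack by the SHA-1 of its sorted object ids. */
final class PackNameSketch {
    /** Returns the 40-digit hex SHA-1 over the sorted ids; input order does not matter. */
    static String computeName(Collection<String> hexObjectIds) {
        List<String> sorted = new ArrayList<>();
        for (String id : hexObjectIds) {
            sorted.add(id.toLowerCase(Locale.ROOT));
        }
        // For equal-length lowercase hex, string order equals raw byte order.
        Collections.sort(sorted);
        MessageDigest sha1;
        try {
            sha1 = MessageDigest.getInstance("SHA-1");
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-1 is not available", e);
        }
        for (String id : sorted) {
            sha1.update(fromHex(id)); // assumption: raw 20-byte ids are hashed
        }
        return toHex(sha1.digest());
    }

    static byte[] fromHex(String hex) {
        if (hex.length() != 40) {
            throw new IllegalArgumentException("expected 40 hex digits: " + hex);
        }
        byte[] raw = new byte[20];
        for (int i = 0; i < raw.length; i++) {
            int hi = Character.digit(hex.charAt(2 * i), 16);
            int lo = Character.digit(hex.charAt(2 * i + 1), 16);
            if (hi < 0 || lo < 0) {
                throw new IllegalArgumentException("not a hex string: " + hex);
            }
            raw[i] = (byte) ((hi << 4) | lo);
        }
        return raw;
    }

    static String toHex(byte[] raw) {
        StringBuilder sb = new StringBuilder(raw.length * 2);
        for (byte b : raw) {
            sb.append(Character.forDigit((b >> 4) & 0xf, 16));
            sb.append(Character.forDigit(b & 0xf, 16));
        }
        return sb.toString();
    }
}
```

        Because the ids are sorted before hashing, two packs holding the same objects in a different order get the same name.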
      • writeIndex

        public void writeIndex​(java.io.OutputStream indexStream)
                        throws java.io.IOException
        Create an index file to match the pack file just written.

        Called after writePack(ProgressMonitor, ProgressMonitor, OutputStream).

        Writing an index is only required for local pack storage. Packs sent on the network do not need to create an index.

        Parameters:
        indexStream - output for the index data. Caller is responsible for closing this stream.
        Throws:
        java.io.IOException - the index data could not be written to the supplied stream.
      • writeBitmapIndex

        public void writeBitmapIndex​(java.io.OutputStream bitmapIndexStream)
                              throws java.io.IOException
        Create a bitmap index file to match the pack file just written.

        Called after prepareBitmapIndex(ProgressMonitor).

        Parameters:
        bitmapIndexStream - output for the bitmap index data. Caller is responsible for closing this stream.
        Throws:
        java.io.IOException - the index data could not be written to the supplied stream.
      • sortByName

        private java.util.List<ObjectToPack> sortByName()
      • writePack

        public void writePack​(ProgressMonitor compressMonitor,
                              ProgressMonitor writeMonitor,
                              java.io.OutputStream packStream)
                       throws java.io.IOException
        Write the prepared pack to the supplied stream.

        Called after preparePack(ProgressMonitor, ObjectWalk, Set, Set, Set) or preparePack(ProgressMonitor, Set, Set).

        Performs delta search if enabled and writes the pack stream.

        For every reused object, the data checksum (Adler32/CRC32) is computed and validated against the existing checksum.

        Parameters:
        compressMonitor - progress monitor to report object compression work.
        writeMonitor - progress monitor to report the number of objects written.
        packStream - output stream of pack data. The stream should be buffered by the caller. The caller is responsible for closing the stream.
        Throws:
        java.io.IOException - an error occurred reading a local object's data to include in the pack, or writing compressed object data to the output stream.
        WriteAbortedException - the write operation is aborted by ObjectCountCallback.
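
        The checksum validation mentioned for reused objects can be pictured as "recompute, compare, then copy". The sketch below shows that idea with java.util.zip.CRC32; ReuseChecksumSketch and copyIfValid are hypothetical names, and where the expected value comes from in a real pack is outside this example.

```java
import java.io.IOException;
import java.io.OutputStream;
import java.util.zip.CRC32;

/** Sketch (hypothetical) of validating stored bytes against a known CRC32 before reuse. */
final class ReuseChecksumSketch {
    /** CRC32 of a byte range, as an unsigned 32-bit value held in a long. */
    static long crc32Of(byte[] data, int offset, int length) {
        CRC32 crc = new CRC32();
        crc.update(data, offset, length);
        return crc.getValue();
    }

    /** Copies the stored bytes to out only if their CRC32 matches the expected value. */
    static void copyIfValid(byte[] stored, long expectedCrc, OutputStream out)
            throws IOException {
        long actual = crc32Of(stored, 0, stored.length);
        if (actual != expectedCrc) {
            // Refuse to propagate data that no longer matches its recorded checksum.
            throw new IOException("checksum mismatch: expected "
                    + Long.toHexString(expectedCrc) + ", got " + Long.toHexString(actual));
        }
        out.write(stored);
    }
}
```

        Checking before writing means corrupt stored data is rejected rather than copied into the new pack.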
      • getStatistics

        public PackStatistics getStatistics()
        Get statistics of what this PackWriter did in order to create the final pack stream.
        Returns:
        description of what this PackWriter did in order to create the final pack stream. This should only be invoked after the calls to create the pack/index/bitmap have completed.
      • getState

        public PackWriter.State getState()
        Get snapshot of the current state of this PackWriter.
        Returns:
        snapshot of the current state of this PackWriter.
      • close

        public void close()

        Release all resources used by this writer.

        Specified by:
        close in interface java.lang.AutoCloseable
      • searchForReuse

        private void searchForReuse​(ProgressMonitor monitor)
                             throws java.io.IOException
        Throws:
        java.io.IOException
      • cutDeltaChains

        private void cutDeltaChains​(BlockList<ObjectToPack> list)
                             throws java.io.IOException
        Throws:
        java.io.IOException
      • findObjectsNeedingDelta

        private int findObjectsNeedingDelta​(ObjectToPack[] list,
                                            int cnt,
                                            int type)
      • reselectNonDelta

        private void reselectNonDelta​(ObjectToPack otp)
                               throws java.io.IOException
        Throws:
        java.io.IOException
      • singleThreadDeltaSearch

        private void singleThreadDeltaSearch​(ProgressMonitor monitor,
                                             ObjectToPack[] list,
                                             int cnt)
                                      throws java.io.IOException
        Throws:
        java.io.IOException
      • parallelDeltaSearch

        private void parallelDeltaSearch​(ProgressMonitor monitor,
                                         ObjectToPack[] list,
                                         int cnt,
                                         int threads)
                                  throws java.io.IOException
        Throws:
        java.io.IOException
      • runTasks

        private static void runTasks​(java.util.concurrent.ExecutorService pool,
                                     ThreadSafeProgressMonitor pm,
                                     DeltaTask.Block tb,
                                     java.util.List<java.lang.Throwable> errors)
                              throws java.io.IOException
        Throws:
        java.io.IOException
      • writeObjects

        private void writeObjects​(PackOutputStream out)
                           throws java.io.IOException
        Throws:
        java.io.IOException
      • writeObjects

        private void writeObjects​(PackOutputStream out,
                                  java.util.List<ObjectToPack> list)
                           throws java.io.IOException
        Throws:
        java.io.IOException
      • writeObjectImpl

        private void writeObjectImpl​(PackOutputStream out,
                                     ObjectToPack otp)
                              throws java.io.IOException
        Throws:
        java.io.IOException
      • writeWholeObjectDeflate

        private void writeWholeObjectDeflate​(PackOutputStream out,
                                             ObjectToPack otp)
                                      throws java.io.IOException
        Throws:
        java.io.IOException
      • writeDeltaObjectDeflate

        private void writeDeltaObjectDeflate​(PackOutputStream out,
                                             ObjectToPack otp)
                                      throws java.io.IOException
        Throws:
        java.io.IOException
      • buffer

        private byte[] buffer​(AnyObjectId objId)
                       throws java.io.IOException
        Throws:
        java.io.IOException
      • deflater

        private java.util.zip.Deflater deflater()
      • writeChecksum

        private void writeChecksum​(PackOutputStream out)
                            throws java.io.IOException
        Throws:
        java.io.IOException
      • pruneEdgesFromObjectList

        private static void pruneEdgesFromObjectList​(java.util.List<ObjectToPack> list)
      • addObject

        public void addObject​(RevObject object)
                       throws IncorrectObjectTypeException
        Include one object in the output file.

        Objects are written in the order they are added. If the same object is added twice, it may be written twice, creating a larger than necessary file.

        Parameters:
        object - the object to add.
        Throws:
        IncorrectObjectTypeException - the object is an unsupported type.
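
        Since a repeated object may be written twice, a caller can filter duplicates before adding while keeping the intended order. A minimal sketch with a LinkedHashSet follows; AddOrderSketch and firstOccurrences are hypothetical names, and plain strings stand in for the objects being added.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Sketch (hypothetical) of caller-side de-duplication that preserves add order. */
final class AddOrderSketch {
    /** Keeps the first occurrence of each element, in the order first seen. */
    static <T> List<T> firstOccurrences(Iterable<T> candidates) {
        Set<T> unique = new LinkedHashSet<>(); // remembers insertion order
        for (T candidate : candidates) {
            unique.add(candidate); // later duplicates are ignored
        }
        return new ArrayList<>(unique);
    }
}
```

        The caller would then pass each element of the returned list to addObject(RevObject) once, in list order.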
      • addObject

        private void addObject​(RevObject object,
                               int pathHashCode)
      • addObject

        private void addObject​(AnyObjectId src,
                               int type,
                               int pathHashCode)
      • depthSkip

        private boolean depthSkip​(@NonNull
                                  RevObject obj,
                                  ObjectWalk walker)
        Determines if the object should be omitted from the pack as a result of its depth (probably because of the tree: filter).

        Causes walker to skip traversing the current tree, which ought to have just started traversal, assuming this method is called as soon as a new depth is reached.

        This method increments the treesTraversed statistic.

        Parameters:
        obj - the object to check whether it should be omitted.
        walker - the walker being used for traversal.
        Returns:
        whether the given object should be skipped.
      • filterAndAddObject

        private void filterAndAddObject​(@NonNull
                                        AnyObjectId src,
                                        int type,
                                        int pathHashCode,
                                        @NonNull
                                        java.util.Set<? extends AnyObjectId> want)
                                 throws java.io.IOException
        Throws:
        java.io.IOException
      • exclude

        private boolean exclude​(AnyObjectId objectId)
      • select

        public void select​(ObjectToPack otp,
                           StoredObjectRepresentation next)
        Select an object representation for this writer.

        An ObjectReader implementation should invoke this method once for each representation available for an object, to allow the writer to find the most suitable one for the output.

        Parameters:
        otp - the object being packed.
        next - the next available representation from the repository.
      • prepareBitmapIndex

        public boolean prepareBitmapIndex​(ProgressMonitor pm)
                                   throws java.io.IOException
        Prepares the bitmaps to be written to the bitmap index file.

        Bitmaps can be used to speed up fetches and clones by storing the entire object graph at selected commits. Writing a bitmap index is an optional feature that not all pack users may require.

        Called after writeIndex(OutputStream).

        To reduce memory use, internal state is cleared during this method, rendering the PackWriter instance useless for anything further than a call to write out the new bitmaps with writeBitmapIndex(OutputStream).

        Parameters:
        pm - progress monitor to report bitmap building work.
        Returns:
        whether a bitmap index may be written.
        Throws:
        java.io.IOException - an I/O problem occurred while reading objects.
      • reuseDeltaFor

        private boolean reuseDeltaFor​(ObjectToPack otp)