Class GC


  • public class GC
    extends java.lang.Object
    A garbage collector for git FileRepository. Instances of this class are not thread-safe. Don't use the same instance from multiple threads. This class started as a copy of DfsGarbageCollector from Shawn O. Pearce adapted to FileRepositories.
    • Nested Class Summary

      Nested Classes 
      Modifier and Type Class Description
      static class  GC.RepoStatistics
      A class holding statistical data for a FileRepository regarding how many objects are stored as loose or packed objects
    • Constructor Summary

      Constructors 
      Constructor Description
      GC​(FileRepository repo)
      Creates a new garbage collector with default values.
    • Method Summary

      All Methods Static Methods Instance Methods Concrete Methods 
      Modifier and Type Method Description
      private void addRepackAllOption()  
      private boolean canBeSafelyDeleted​(java.nio.file.Path path, java.time.Instant threshold)  
      private void checkCancelled()  
      private void delete​(java.nio.file.Path d)  
      private void deleteDir​(java.nio.file.Path dir)  
      private void deleteEmptyRefsFolders()  
      private void deleteOldPacks​(java.util.Collection<PackFile> oldPacks, java.util.Collection<PackFile> newPacks)
      Delete old pack files.
      private void deleteOrphans()
      Deletes orphans
      private void deleteTempPacksIdx()  
      private java.util.Collection<PackFile> doGc()  
      private static boolean equals​(Ref r1, Ref r2)  
      private java.util.concurrent.ExecutorService executor()  
      java.util.Collection<PackFile> gc()
      Runs a garbage collector on a FileRepository.
      private java.util.Collection<Ref> getAllRefs()
      Returns a collection of all refs and additional refs.
      private long getExpireDate()  
      private int getLooseObjectLimit()  
      private long getPackExpireDate()  
      private java.lang.String getPruneExpireStr()  
      GC.RepoStatistics getStatistics()
      Returns information about objects and pack files for a FileRepository.
      private boolean isDirectory​(java.nio.file.Path p)  
      private static boolean isHead​(Ref ref)  
      private static boolean isTag​(Ref ref)  
      private java.util.Set<ObjectId> listNonHEADIndexObjects()
      Return a list of those objects in the index which differ from what's in HEAD
      private java.util.Set<ObjectId> listRefLogObjects​(Ref ref, long minTime)  
      private void loosen​(ObjectDirectoryInserter inserter, ObjectReader reader, PackFile pack, java.util.HashSet<ObjectId> existing)
      Loosen objects in a pack file which are not also in the newly-created pack files.
      private java.io.File nameFor​(java.lang.String name, java.lang.String ext)  
      private boolean needGc()  
      void packRefs()
      Pack ref storage.
      void prune​(java.util.Set<ObjectId> objectsToKeep)
      Like "git prune" this method tries to prune all loose objects which are unreferenced.
      private void prunePack​(java.lang.String packName)
      Delete files associated with a single pack file.
      void prunePacked()
      Like "git prune-packed" this method tries to prune all loose objects which can be found in packs.
      private void prunePreserved()
      Delete the preserved directory including all pack files within
      private void removeOldPack​(java.io.File packFile, java.lang.String packName, PackExt ext, int deleteOptions)
      Deletes old pack file, unless 'preserve-oldpacks' is set, in which case it moves the pack file to the preserved directory
      private void removeReferenced​(java.util.Map<ObjectId,​java.io.File> id2File, ObjectWalk w)
      Remove all entries from a map whose key is the id of an object referenced by the given ObjectWalk
      java.util.Collection<PackFile> repack()
      Packs all objects which are reachable from any of the heads into one pack file.
      void setAuto​(boolean auto)
      Set the gc --auto option.
      (package private) void setBackground​(boolean background)  
      static void setExecutor​(java.util.concurrent.ExecutorService e)
      Set the executor for running auto-gc in the background.
      void setExpire​(java.util.Date expire)
      During gc() or prune() each unreferenced, loose object which has been created or modified after or at expire will not be pruned.
      void setExpireAgeMillis​(long expireAgeMillis)
      During gc() or prune() each unreferenced, loose object which has been created or modified in the last expireAgeMillis milliseconds will not be pruned.
      void setPackConfig​(PackConfig pconfig)
      Set the PackConfig used when (re-)writing packfiles.
      void setPackExpire​(java.util.Date packExpire)
      During gc() or prune() packfiles which are created or modified after or at packExpire will not be deleted.
      void setPackExpireAgeMillis​(long packExpireAgeMillis)
      During gc() or prune() packfiles which are created or modified in the last packExpireAgeMillis milliseconds will not be deleted.
      GC setProgressMonitor​(ProgressMonitor pm)
      Set the progress monitor used for garbage collection methods.
      (package private) boolean tooManyLooseObjects()
      Quickly estimate the number of loose objects. SHA-1 hashes are distributed evenly, so counting objects in one directory (bucket 17) is sufficient
      (package private) boolean tooManyPacks()  
      private PackFile writePack​(java.util.Set<? extends ObjectId> want, java.util.Set<? extends ObjectId> have, java.util.Set<ObjectId> tags, java.util.Set<ObjectId> tagTargets, java.util.List<ObjectIdSet> excludeObjects)  
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Field Detail

      • LOG

        private static final org.slf4j.Logger LOG
      • PRUNE_EXPIRE_DEFAULT

        private static final java.lang.String PRUNE_EXPIRE_DEFAULT
        See Also:
        Constant Field Values
      • PRUNE_PACK_EXPIRE_DEFAULT

        private static final java.lang.String PRUNE_PACK_EXPIRE_DEFAULT
        See Also:
        Constant Field Values
      • PATTERN_LOOSE_OBJECT

        private static final java.util.regex.Pattern PATTERN_LOOSE_OBJECT
      • PACK_EXT

        private static final java.lang.String PACK_EXT
      • BITMAP_EXT

        private static final java.lang.String BITMAP_EXT
      • INDEX_EXT

        private static final java.lang.String INDEX_EXT
      • KEEP_EXT

        private static final java.lang.String KEEP_EXT
      • executor

        private static volatile java.util.concurrent.ExecutorService executor
      • expireAgeMillis

        private long expireAgeMillis
      • expire

        private java.util.Date expire
      • packExpireAgeMillis

        private long packExpireAgeMillis
      • packExpire

        private java.util.Date packExpire
      • lastPackedRefs

        private java.util.Collection<Ref> lastPackedRefs
        The refs which existed during the last call to repack(). This is needed during prune(Set), where we can optimize by looking at the difference between the current refs and the refs which existed during the last repack().
      • lastRepackTime

        private long lastRepackTime
        Holds the starting time of the last repack() execution. This is needed in prune() to inspect only those reflog entries which have been added since last repack().
      • automatic

        private boolean automatic
        Whether gc should do automatic housekeeping
      • background

        private boolean background
        Whether to run gc in a background thread
    • Constructor Detail

      • GC

        public GC​(FileRepository repo)
        Creates a new garbage collector with default values. An expirationTime of two weeks and null as progress monitor will be used.
        Parameters:
        repo - the repo to work on
    • Method Detail

      • setExecutor

        public static void setExecutor​(java.util.concurrent.ExecutorService e)
        Set the executor for running auto-gc in the background. If no executor is set JGit's own WorkQueue will be used.
        Parameters:
        e - the executor to be used for running auto-gc
      • gc

        public java.util.Collection<PackFile> gc()
                                          throws java.io.IOException,
                                                 java.text.ParseException
        Runs a garbage collector on a FileRepository. It will
        • pack loose references into packed-refs
        • repack all reachable objects into new pack files and delete the old pack files
        • prune all loose objects which are now reachable by packs
        If setAuto(boolean) was set to true, gc will first check whether any housekeeping is required; if not, it exits without performing any work. If setBackground(boolean) was set to true, gc() will start the garbage collection in the background and then return immediately. In this case, errors will not be reported except in gc.log.
        Returns:
        the collection of newly created PackFiles
        Throws:
        java.io.IOException
        java.text.ParseException - If the configuration parameter "gc.pruneexpire" couldn't be parsed
      • executor

        private java.util.concurrent.ExecutorService executor()
      • doGc

        private java.util.Collection<PackFile> doGc()
                                             throws java.io.IOException,
                                                    java.text.ParseException
        Throws:
        java.io.IOException
        java.text.ParseException
      • loosen

        private void loosen​(ObjectDirectoryInserter inserter,
                            ObjectReader reader,
                            PackFile pack,
                            java.util.HashSet<ObjectId> existing)
                     throws java.io.IOException
        Loosen objects in a pack file which are not also in the newly-created pack files.
        Parameters:
        inserter -
        reader -
        pack -
        existing -
        Throws:
        java.io.IOException
      • deleteOldPacks

        private void deleteOldPacks​(java.util.Collection<PackFile> oldPacks,
                                    java.util.Collection<PackFile> newPacks)
                             throws java.text.ParseException,
                                    java.io.IOException
        Delete old pack files. What is 'old' is defined by specifying a set of old pack files and a set of new pack files. Each pack file contained in the old pack files but not in the new pack files will be deleted. If preserveOldPacks is set, a copy of the pack file is kept in the preserved directory. If an expirationDate is set, pack files which are younger than the expirationDate will be neither deleted nor preserved.

        If we're not immediately expiring loose objects, loosen any objects in the old pack files which aren't in the new pack files.

        Parameters:
        oldPacks -
        newPacks -
        Throws:
        java.text.ParseException
        java.io.IOException
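        The selection rule described above can be sketched in plain Java. This is only an illustration under stated assumptions, not JGit's implementation: OldPackSelection and selectDeletable are hypothetical names, packs are modeled as a map from pack name to last-modified time, and a null expiration stands for "no expiration date set". Preserving and loosening are left out.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of JGit.
final class OldPackSelection {
    /**
     * Returns the names of old packs that are not among the new packs and
     * whose last-modified time lies before packExpireMillis.
     * A null packExpireMillis means every superseded pack is selected.
     */
    static Set<String> selectDeletable(Map<String, Long> oldPacks,
                                       Set<String> newPackNames,
                                       Long packExpireMillis) {
        Set<String> deletable = new TreeSet<>();
        for (Map.Entry<String, Long> e : oldPacks.entrySet()) {
            // "Old" means: present before the repack, absent from its result.
            boolean superseded = !newPackNames.contains(e.getKey());
            // Packs modified at or after the expiration instant are kept.
            boolean expired = packExpireMillis == null
                    || e.getValue() < packExpireMillis;
            if (superseded && expired) {
                deletable.add(e.getKey());
            }
        }
        return deletable;
    }
}
```

        A pack that also appears in the new set is never selected, regardless of its age.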
      • removeOldPack

        private void removeOldPack​(java.io.File packFile,
                                   java.lang.String packName,
                                   PackExt ext,
                                   int deleteOptions)
                            throws java.io.IOException
        Deletes old pack file, unless 'preserve-oldpacks' is set, in which case it moves the pack file to the preserved directory
        Parameters:
        packFile -
        packName -
        ext -
        deleteOptions -
        Throws:
        java.io.IOException
      • prunePreserved

        private void prunePreserved()
        Delete the preserved directory including all pack files within
      • prunePack

        private void prunePack​(java.lang.String packName)
        Delete files associated with a single pack file. First try to delete the ".pack" file because on some platforms the ".pack" file may be locked and can't be deleted. In such a case it is better to detect this early and give up on deleting files for this packfile. Otherwise we may delete the ".index" file and when failing to delete the ".pack" file we are left with a ".pack" file without a ".index" file.
        Parameters:
        packName -
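        The ordering argument above can be illustrated with a small sketch. It is not JGit's code: PrunePackOrder is a hypothetical class, the tryDelete predicate stands in for a filesystem delete that returns true on success, and the list of companion extensions is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Predicate;

// Hypothetical helper for illustration; not part of JGit.
final class PrunePackOrder {
    // Companion files tried after the ".pack" file (assumed list).
    private static final String[] OTHER_EXTS = { ".idx", ".bitmap" };

    /**
     * Tries to delete base + ".pack" first and gives up immediately if that
     * fails, so a pack file is never left behind without its index.
     * Returns the names that were actually deleted.
     */
    static List<String> prune(String base, Predicate<String> tryDelete) {
        List<String> deleted = new ArrayList<>();
        String pack = base + ".pack";
        if (!tryDelete.test(pack)) {
            return deleted; // pack could not be deleted: keep the companions
        }
        deleted.add(pack);
        for (String ext : OTHER_EXTS) {
            String name = base + ext;
            if (tryDelete.test(name)) {
                deleted.add(name);
            }
        }
        return deleted;
    }
}
```

        With a predicate that refuses every name ending in ".pack", the method returns an empty list and no companion file is touched.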
      • prunePacked

        public void prunePacked()
                         throws java.io.IOException
        Like "git prune-packed" this method tries to prune all loose objects which can be found in packs. If certain objects can't be pruned (e.g. because the filesystem delete operation fails) this is silently ignored.
        Throws:
        java.io.IOException
      • prune

        public void prune​(java.util.Set<ObjectId> objectsToKeep)
                   throws java.io.IOException,
                          java.text.ParseException
        Like "git prune" this method tries to prune all loose objects which are unreferenced. If certain objects can't be pruned (e.g. because the filesystem delete operation fails) this is silently ignored.
        Parameters:
        objectsToKeep - a set of objects which should explicitly not be pruned
        Throws:
        java.io.IOException
        java.text.ParseException - If the configuration parameter "gc.pruneexpire" couldn't be parsed
      • getExpireDate

        private long getExpireDate()
                            throws java.text.ParseException
        Throws:
        java.text.ParseException
      • getPruneExpireStr

        private java.lang.String getPruneExpireStr()
      • getPackExpireDate

        private long getPackExpireDate()
                                throws java.text.ParseException
        Throws:
        java.text.ParseException
      • equals

        private static boolean equals​(Ref r1,
                                      Ref r2)
      • packRefs

        public void packRefs()
                      throws java.io.IOException
        Pack ref storage. For a RefDirectory database, this packs all non-symbolic, loose refs into packed-refs. For Reftable, all of the data is compacted into a single table.
        Throws:
        java.io.IOException
      • repack

        public java.util.Collection<PackFile> repack()
                                              throws java.io.IOException
        Packs all objects which are reachable from any of the heads into one pack file. Additionally, all objects which are not reachable from any head but which are reachable from any of the other refs (e.g. tags), special refs (e.g. FETCH_HEAD) or the index are packed into a separate pack file. Objects included in pack files which have an associated .keep file are never repacked. All old pack files which existed before are deleted.
        Returns:
        a collection of the newly created pack files
        Throws:
        java.io.IOException - if an IOException occurs while reading refs, index, packfiles, objects or reflog entries, or while writing to the packfiles
      • isHead

        private static boolean isHead​(Ref ref)
      • isTag

        private static boolean isTag​(Ref ref)
      • deleteEmptyRefsFolders

        private void deleteEmptyRefsFolders()
                                     throws java.io.IOException
        Throws:
        java.io.IOException
      • canBeSafelyDeleted

        private boolean canBeSafelyDeleted​(java.nio.file.Path path,
                                           java.time.Instant threshold)
      • deleteDir

        private void deleteDir​(java.nio.file.Path dir)
      • isDirectory

        private boolean isDirectory​(java.nio.file.Path p)
      • delete

        private void delete​(java.nio.file.Path d)
      • deleteOrphans

        private void deleteOrphans()
        Deletes orphans

        A file is considered an orphan if it is either a "bitmap" or an index file, and its corresponding pack file is missing in the list.
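        The orphan definition above can be sketched over a plain list of file names. This is an illustration only, not JGit's implementation: OrphanFinder is a hypothetical class, and the ".bitmap", ".idx" and ".pack" suffixes are assumed file extensions.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration; not part of JGit.
final class OrphanFinder {
    private static final String PACK = ".pack";
    private static final String[] COMPANIONS = { ".bitmap", ".idx" };

    /** Returns bitmap and index file names whose pack file is not listed. */
    static Set<String> findOrphans(List<String> fileNames) {
        // First pass: collect the base name of every pack file present.
        Set<String> packBases = new HashSet<>();
        for (String n : fileNames) {
            if (n.endsWith(PACK)) {
                packBases.add(n.substring(0, n.length() - PACK.length()));
            }
        }
        // Second pass: a companion file without a matching base is an orphan.
        Set<String> orphans = new TreeSet<>();
        for (String n : fileNames) {
            for (String ext : COMPANIONS) {
                if (n.endsWith(ext)) {
                    String base = n.substring(0, n.length() - ext.length());
                    if (!packBases.contains(base)) {
                        orphans.add(n);
                    }
                }
            }
        }
        return orphans;
    }
}
```

        Files with other extensions, such as a ".keep" file, are ignored by this sketch.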

      • deleteTempPacksIdx

        private void deleteTempPacksIdx()
      • listRefLogObjects

        private java.util.Set<ObjectId> listRefLogObjects​(Ref ref,
                                                          long minTime)
                                                   throws java.io.IOException
        Parameters:
        ref - the ref whose log should be inspected
        minTime - only reflog entries not older than this time are processed
        Returns:
        the ObjectIds contained in the reflog
        Throws:
        java.io.IOException
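        The minTime filter can be sketched as follows. The sketch makes several assumptions that the documentation above does not state: ReflogScan and its Entry class are hypothetical stand-ins (not JGit types), object ids are modeled as strings, both the old and the new id of each entry are collected, and an all-zero old id is skipped.

```java
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

// Hypothetical helper for illustration; not part of JGit.
final class ReflogScan {
    /** Stand-in for one reflog entry: old id, new id and timestamp. */
    static final class Entry {
        final String oldId;
        final String newId;
        final long whenMillis;

        Entry(String oldId, String newId, long whenMillis) {
            this.oldId = oldId;
            this.newId = newId;
            this.whenMillis = whenMillis;
        }
    }

    static final String ZERO_ID = "0000000000000000000000000000000000000000";

    /** Collects the ids of all entries that are not older than minTime. */
    static Set<String> idsSince(List<Entry> entries, long minTime) {
        Set<String> ids = new LinkedHashSet<>();
        for (Entry e : entries) {
            if (e.whenMillis < minTime) {
                continue; // older than minTime: not processed
            }
            ids.add(e.newId);
            if (!ZERO_ID.equals(e.oldId)) {
                ids.add(e.oldId);
            }
        }
        return ids;
    }
}
```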
      • getAllRefs

        private java.util.Collection<Ref> getAllRefs()
                                              throws java.io.IOException
        Returns a collection of all refs and additional refs. Additional refs which don't start with "refs/" are not returned because they should not save objects from being garbage collected. Examples for such references are ORIG_HEAD, MERGE_HEAD, FETCH_HEAD and CHERRY_PICK_HEAD.
        Returns:
        a collection of refs pointing to live objects.
        Throws:
        java.io.IOException
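        The filtering of additional refs can be sketched on ref names alone. This is a simplified illustration, not JGit's code: RefFilter is a hypothetical class and refs are modeled as plain name strings rather than Ref objects.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration; not part of JGit.
final class RefFilter {
    /**
     * Returns all regular refs plus those additional refs whose name starts
     * with "refs/". Names such as ORIG_HEAD or FETCH_HEAD are dropped, so
     * they do not keep objects alive.
     */
    static List<String> liveRefNames(List<String> refs,
                                     List<String> additionalRefs) {
        List<String> live = new ArrayList<>(refs);
        for (String name : additionalRefs) {
            if (name.startsWith("refs/")) {
                live.add(name);
            }
        }
        return live;
    }
}
```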
      • nameFor

        private java.io.File nameFor​(java.lang.String name,
                                     java.lang.String ext)
      • getStatistics

        public GC.RepoStatistics getStatistics()
                                        throws java.io.IOException
        Returns information about objects and pack files for a FileRepository.
        Returns:
        information about objects and pack files for a FileRepository
        Throws:
        java.io.IOException
      • setProgressMonitor

        public GC setProgressMonitor​(ProgressMonitor pm)
        Set the progress monitor used for garbage collection methods.
        Parameters:
        pm - a ProgressMonitor object.
        Returns:
        this
      • setExpireAgeMillis

        public void setExpireAgeMillis​(long expireAgeMillis)
        During gc() or prune() each unreferenced, loose object which has been created or modified in the last expireAgeMillis milliseconds will not be pruned. Only older objects may be pruned. If set to 0 then every object is a candidate for pruning.
        Parameters:
        expireAgeMillis - minimal age of objects to be pruned in milliseconds.
      • setPackExpireAgeMillis

        public void setPackExpireAgeMillis​(long packExpireAgeMillis)
        During gc() or prune() packfiles which are created or modified in the last packExpireAgeMillis milliseconds will not be deleted. Only older packfiles may be deleted. If set to 0 then every packfile is a candidate for deletion.
        Parameters:
        packExpireAgeMillis - minimal age of packfiles to be deleted in milliseconds.
      • setPackConfig

        public void setPackConfig​(@NonNull
                                  PackConfig pconfig)
        Set the PackConfig used when (re-)writing packfiles. This makes it possible to influence how packs are written and to implement something similar to "git gc --aggressive".
        Parameters:
        pconfig - the PackConfig used when writing packs
      • setExpire

        public void setExpire​(java.util.Date expire)
        During gc() or prune() each unreferenced, loose object which has been created or modified after or at expire will not be pruned. Only older objects may be pruned. If set to null then every object is a candidate for pruning.
        Parameters:
        expire - instant in time which defines object expiration. Objects with a modification time before this instant are expired; objects with a modification time newer than or equal to this instant are not expired.
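        The boundary behaviour of the expiration instant can be sketched as a predicate. This is an illustration under assumptions, not JGit's implementation: ExpireRule is a hypothetical class, and expressing the age-based setting of setExpireAgeMillis as "now minus age" is an assumption about how the two settings relate.

```java
import java.util.Date;

// Hypothetical helper for illustration; not part of JGit.
final class ExpireRule {
    /**
     * Date-based rule: an object is a pruning candidate only if it was last
     * modified strictly before expire. A null expire makes every object a
     * candidate.
     */
    static boolean mayPrune(long lastModifiedMillis, Date expire) {
        return expire == null || lastModifiedMillis < expire.getTime();
    }

    /**
     * Age-based rule, expressed through the date-based one by using
     * nowMillis - expireAgeMillis as the expiration instant (assumed).
     */
    static boolean mayPruneByAge(long lastModifiedMillis,
                                 long nowMillis,
                                 long expireAgeMillis) {
        return mayPrune(lastModifiedMillis,
                new Date(nowMillis - expireAgeMillis));
    }
}
```

        For example, ExpireRule.mayPrune(100L, new Date(100L)) returns false because an object modified exactly at the expiration instant is not expired, while ExpireRule.mayPrune(99L, new Date(100L)) returns true.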
      • setPackExpire

        public void setPackExpire​(java.util.Date packExpire)
        During gc() or prune() packfiles which are created or modified after or at packExpire will not be deleted. Only older packfiles may be deleted. If set to null then every packfile is a candidate for deletion.
        Parameters:
        packExpire - instant in time which defines packfile expiration
      • setAuto

        public void setAuto​(boolean auto)
        Set the gc --auto option. With this option, gc checks whether any housekeeping is required; if not, it exits without performing any work. Some JGit commands run gc --auto after performing operations that could create many loose objects.

        Housekeeping is required if there are too many loose objects or too many packs in the repository. If the number of loose objects exceeds the value of the gc.auto option, JGit GC consolidates all existing packs into a single pack (equivalent to the -A option), whereas git-core would combine all loose objects into a single pack using repack -d -l. Setting the value of gc.auto to 0 disables automatic packing of loose objects.

        If the number of packs exceeds the value of gc.autoPackLimit, then existing packs (except those marked with a .keep file) are consolidated into a single pack by using the -A option of repack. Setting gc.autoPackLimit to 0 disables automatic consolidation of packs.

        Like git, the following JGit commands run auto gc:

        • fetch
        • merge
        • rebase
        • receive-pack
        The auto gc for receive-pack can be suppressed by setting the config option receive.autogc = false
        Parameters:
        auto - defines whether gc should do automatic housekeeping
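        The housekeeping check described above can be sketched as two independent triggers. This is a simplified illustration, not JGit's implementation: AutoGcDecision is a hypothetical class, and the object and pack counts are passed in directly instead of being read from a repository.

```java
// Hypothetical helper for illustration; not part of JGit.
final class AutoGcDecision {
    /** True if packCount exceeds gc.autoPackLimit; a limit of 0 disables the check. */
    static boolean tooManyPacks(int packCount, int autoPackLimit) {
        return autoPackLimit > 0 && packCount > autoPackLimit;
    }

    /** True if looseCount exceeds gc.auto; a value of 0 disables the check. */
    static boolean tooManyLooseObjects(int looseCount, int gcAuto) {
        return gcAuto > 0 && looseCount > gcAuto;
    }

    /** Housekeeping is required if either trigger fires. */
    static boolean needGc(int looseCount, int packCount,
                          int gcAuto, int autoPackLimit) {
        return tooManyPacks(packCount, autoPackLimit)
                || tooManyLooseObjects(looseCount, gcAuto);
    }
}
```

        With the defaults mentioned in this class (gc.auto 6700, gc.autopacklimit 50), AutoGcDecision.needGc(100, 10, 6700, 50) returns false and AutoGcDecision.needGc(100, 51, 6700, 50) returns true.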
      • setBackground

        void setBackground​(boolean background)
        Parameters:
        background - whether to run the gc in a background thread.
      • needGc

        private boolean needGc()
      • addRepackAllOption

        private void addRepackAllOption()
      • tooManyPacks

        boolean tooManyPacks()
        Returns:
        true if number of packs > gc.autopacklimit (default 50)
      • tooManyLooseObjects

        boolean tooManyLooseObjects()
        Quickly estimate the number of loose objects. SHA-1 hashes are distributed evenly, so counting objects in one directory (bucket 17) is sufficient
        Returns:
        true if number of loose objects > gc.auto (default 6700)
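        The single-bucket estimate can be sketched as follows. Loose objects are spread over 256 directories named after the first two hex digits of their id, so one bucket holds roughly 1/256 of them. The class name LooseObjectEstimate and the rounding of the per-bucket threshold are assumptions for illustration, not a statement about JGit's code.

```java
// Hypothetical helper for illustration; not part of JGit.
final class LooseObjectEstimate {
    /**
     * Compares the number of objects found in a single bucket against the
     * per-bucket share of gc.auto, rounded up. A gc.auto of 0 or less
     * disables the check.
     */
    static boolean tooManyLooseObjects(int objectsInBucket, int gcAuto) {
        if (gcAuto <= 0) {
            return false;
        }
        int perBucketThreshold = (gcAuto + 255) / 256;
        return objectsInBucket > perBucketThreshold;
    }
}
```

        With the default gc.auto of 6700 the per-bucket threshold in this sketch is 27, so 28 objects in the sampled bucket are enough to report too many loose objects.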
      • getLooseObjectLimit

        private int getLooseObjectLimit()