Git and libgit2

Updating Trees of Git Repositories

Published 2023-03-11. Last modified 2025-09-12.
Time to read: 2 minutes.

This page is part of the git collection.

I need to keep several hundred git repositories up to date. I have a directory tree of website repos, and a directory tree of code repos. Updating these trees was tedious until I wrote the initial version of the update script back in 2008.

Environment Variables

/etc/environment is a system-wide configuration file that is read every time a user logs in (on Linux this is typically done by the pam_env PAM module rather than by a shell sourcing it). It is owned by root, so you need sudo privileges to modify it.

The /etc/environment file in all of my systems defines two environment variables:

sites
    Points to the root of the website directory tree.
work
    Points to the root of the code project tree.

/etc/environment
export sites=/var/www
export work=/var/work

Now $sites and $work will be defined for all users every time they log in.
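A quick way to confirm that the variables are visible is to check them from Ruby, the language of the faster update script later on this page. The following is only a minimal sketch: the helper name `missing_roots` is hypothetical and not part of the update scripts, and the default list assumes just the two variables described above.

```ruby
# Hypothetical helper, not part of the update scripts.
# Returns the names in `names` that are unset or blank in `env`.
# `env` defaults to the real environment, but any Hash-like object works.
def missing_roots(names = %w[sites work], env = ENV)
  names.select { |name| env[name].to_s.strip.empty? }
end

missing = missing_roots
if missing.empty?
  puts 'All root variables are defined.'
else
  warn "Not defined: #{missing.map { |n| "$#{n}" }.join(' ')}"
end
```

Running this in a fresh login shell should report both variables as defined if /etc/environment was picked up.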

In addition, I define subordinate environment variables for each project in a file called $work/.evars:

$work/.evars
export cadenzaHome=$work/cadenzaHome
export cadenzaCode=$cadenzaHome/cadenzaCode
export cadenzaDependencies=$cadenzaCode/cadenzaDependencies
export awslib_scala=$cadenzaDependencies/awslib_scala
export shoppingcart=$cadenzaDependencies/shoppingcart
export clients=$work/clients
export django=$work/django
export msp=$sites/www.mslinn.com
... 

$work/.evars is included by ~/.bashrc.

~/.bashrc
source $work/.evars

Switching Directories

The above environment variables allow me to easily move to a git project directory without having to remember where it resides on the computer that I am currently using:

Shell
$ cd $clients
$ pwd
/var/work/clients

Updating Git Directory Trees

I first wrote a Bash version of a command I called update. Years later I wrote a multithreaded Ruby version that runs orders of magnitude faster for large directory trees. I also called this version update; note that it requires a properly set up Ruby development environment, including the colorize gem.

The sites and work environment variables are used by the update scripts.

#!/bin/bash

# Update all git repositories below the specified directories,
# or below $sites and $work if no directories are given
# Skips directories that contain a file called .ignore
# See https://stackoverflow.com/a/61207488/553865

if [ "$( curl -sL -w "%{http_code}\n" https://www.github.com -o /dev/null )" != 200 ]; then
  echo "Cannot connect to GitHub"
  exit 2
fi

HIGHLIGHT="\e[01;34m"
NORMAL='\e[00m'

export PATH=${PATH/':./:'/:}
export PATH=${PATH/':./bin:'/:}
#echo "$PATH"

if [ -z "$1" ]; then
  ROOTS="$sites $work"
else
  ROOTS="$*"
fi

echo "Updating $ROOTS"
DIRS="$( find -L $ROOTS -type d \( -execdir test -e {}/.ignore \; -prune \) -o \( -execdir test -d {}/.git \; -prune -print \) )"

echo -e "${HIGHLIGHT}Scanning ${PWD}${NORMAL}"
# Relies on word splitting, so paths containing whitespace are not supported
for d in $DIRS; do
  cd "$d" > /dev/null || exit 2
  echo -e "\n${HIGHLIGHT}Updating ${PWD}${NORMAL}"
  git pull
  cd - > /dev/null || exit 3
done

The Ruby version of update is waaaay faster than the Bash version! 💕💕💕

#!/usr/bin/env ruby
# Multithreaded Ruby script to update all git directories below specified roots.
require 'English'
require 'colorize'
require 'etc'
require 'set'
require 'shellwords'
require 'optparse'
require 'timeout'

MAX_THREADS = [1, (Etc.nprocessors * 0.75).to_i].max
GIT_TIMEOUT = 300 # 5 minutes per git pull

QUIET = 0
NORMAL = 1
VERBOSE = 2
DEBUG = 3

$verbosity = NORMAL

Signal.trap('INT') { exit!(-1) }

def log(level, msg)
  puts msg if $verbosity >= level
end

def help
  puts <<~HELP
    Usage: #{$PROGRAM_NAME} [OPTIONS] [DIRECTORY ...]
    Recursively updates all git repositories under the specified DIRECTORY roots.
    If no directories are given, uses the environment variables 'sites', 'sitesUbuntu' and 'work' as roots.
    Skips directories containing a .ignore file.
    Options:
      -h, --help       Show this help message and exit
      -q, --quiet      Suppress normal output, only show errors
      -v, --verbose    Increase verbosity (can be repeated: -vv for debug)
    Example:
      #{$PROGRAM_NAME} $sites $work
  HELP
  exit
end

begin
  OptionParser.new do |opts|
    opts.on('-h', '--help') { help }
    opts.on('-q', '--quiet') { $verbosity = QUIET }
    opts.on('-v', '--verbose') { $verbosity += 1 }
  end.parse!
rescue OptionParser::InvalidOption => e
  puts "Error: #{e.message}".red
  puts
  help
  exit!(-2)
end

ROOTS = %w[sites sitesUbuntu work].freeze

# Determine roots and their display names
if ARGV.empty?
  @roots = ROOTS.each_with_object({}) do |r, h|
    if env_val = ENV[r]
      h[r] = env_val.split.map { |p| File.expand_path(p) }
    end
  end
  @display_roots = @roots.keys.map { |r| "$#{r}" }
else
  @roots = ARGV.each_with_object({}) do |arg, h|
    if arg.start_with?('$')
      root_name = arg[1..]
      root_path = ENV[root_name] || arg
    else
      root_name = arg
      root_path = arg
    end
    h[root_name] = [File.expand_path(root_path)]
  end
  @display_roots = ARGV.dup
end

log NORMAL, "Updating #{@display_roots.join(' ')}".green

work_queue = Queue.new
visited = Set.new
processed = Set.new
threads = []

def find_git_repos(root_path, visited, processed, work_queue)
  log DEBUG, "Scanning #{root_path}".yellow
  Dir.foreach(root_path) do |entry|
    next if ['.', '..'].include?(entry)

    path = File.join(root_path, entry)

    if File.directory?(path)
      if File.exist?(File.join(path, '.ignore'))
        log DEBUG, "Skipping #{path} (has .ignore)".yellow
        next
      end

      if File.exist?(File.join(path, '.git'))
        unless visited.include?(path) || processed.include?(path)
          visited.add(path)
          processed.add(path)
          log DEBUG, "Enqueueing repo: #{path}".yellow
          work_queue << path
        end
      else
        find_git_repos(path, visited, processed, work_queue)
      end
    end
  end
rescue SystemCallError => e
  log NORMAL, "Error scanning #{root_path}: #{e.message}".red
end

# Scan directories and enqueue git repos
@roots.each_value do |paths|
  paths.each do |root_path|
    find_git_repos(root_path, visited, processed, work_queue)
  end
end

log(VERBOSE, "Queue has #{work_queue.size} entries.".green)
log(VERBOSE, "#{MAX_THREADS} threads will be used to process the queue.".green)

MAX_THREADS.times do |i| # Start worker threads
  threads << Thread.new do
    until work_queue.empty?
      dir = begin
        work_queue.pop(true)
      rescue StandardError
        nil
      end
      break if dir.nil?

      abbrev_dir = dir
      @roots.each do |root_name, paths|
        paths.each do |expanded|
          next unless dir.start_with?(expanded)

          rel_path = dir[expanded.length..]
          prefix = @display_roots.find { |d| d == "$#{root_name}" || d == root_name } || "$#{root_name}"
          abbrev_dir = prefix + rel_path
          break
        end
      end

      log NORMAL, "Updating #{abbrev_dir}".green
      log VERBOSE, "Thread #{i}: git -C #{dir} pull".yellow

      output = nil
      status = nil
      begin
        Timeout.timeout(GIT_TIMEOUT) do
          output = `git -C #{Shellwords.escape(dir)} pull 2>&1`
          status = $CHILD_STATUS.exitstatus
        end
      rescue Timeout::Error
        log NORMAL, "[TIMEOUT] Thread #{i}: git pull timed out in #{abbrev_dir}".red
        status = -1
      rescue StandardError => e
        log NORMAL, "[ERROR] Thread #{i}: Failed in #{abbrev_dir}: #{e}".red
        status = -1
      end

      if status != 0
        puts "[ERROR] git pull failed in #{abbrev_dir} (exit code #{status}):\n#{output}".red
      elsif $verbosity >= VERBOSE
        puts output.strip.green
      end
    end
  end
end

# Wait for all threads to finish
threads.each(&:join)
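One possible refinement, offered as a sketch rather than as part of the script above: as far as I can tell, wrapping backticks in Timeout.timeout interrupts the waiting Ruby thread but does not stop the git child process, which may keep running after the timeout is reported. The hypothetical helper `run_with_timeout` below starts the command in its own process group with Process.spawn and sends TERM to that group when the timeout expires. It assumes a POSIX system, and it assumes the child exits when it receives TERM.

```ruby
require 'timeout'

# Hypothetical helper, not part of the original script.
# Runs argv (an array such as ['git', '-C', dir, 'pull']) as a child process.
# Returns [output, exit_status]; exit_status is nil if the command timed out.
def run_with_timeout(argv, seconds)
  reader, writer = IO.pipe
  # pgroup: true puts the child in a new process group whose id is its pid
  pid = Process.spawn(*argv, out: writer, err: writer, pgroup: true)
  writer.close # the parent only reads
  output = +''
  begin
    status = nil
    Timeout.timeout(seconds) do
      output << reader.read        # blocks until the child closes its output
      _, status = Process.wait2(pid)
    end
    [output, status.exitstatus]
  rescue Timeout::Error
    begin
      Process.kill('-TERM', pid)   # a leading minus signals the process group
      Process.wait(pid)            # reap the child so no zombie is left behind
    rescue Errno::ESRCH, Errno::ECHILD
      # The child had already exited or been reaped
    end
    [output, nil]
  ensure
    reader.close
  end
end

output, status = run_with_timeout(%w[git --version], 10)
puts status.nil? ? 'timed out' : "exit #{status}: #{output}"
```

Inside a worker thread, the backtick call could then be replaced by something like `run_with_timeout(['git', '-C', dir, 'pull'], GIT_TIMEOUT)`. Passing an array avoids the shell, so Shellwords.escape would no longer be needed. Output read before a timeout is discarded in this sketch.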

Here is the help message for the Ruby version:

Shell
$ update -h
Usage: /mnt/d/work/mslinn_bin/git/update [OPTIONS] [DIRECTORY ...]
Recursively updates all git repositories under the specified DIRECTORY roots.
If no directories are given, uses the environment variables 'sites', 'sitesUbuntu' and 'work' as roots.
Skips directories containing a .ignore file.
Options:
  -h, --help       Show this help message and exit
  -q, --quiet      Suppress normal output, only show errors
  -v, --verbose    Increase verbosity (can be repeated: -vv for debug)
Example:
  /mnt/d/work/mslinn_bin/git/update $sites $work 

Most of the time I want to update everything in both directory trees, so for that no arguments are required:

Shell
$ update
Updating /var/www /var/work
Updating /var/work/cadenzaHome/cadenzaCode/cadenzaDependencies/awslib_scala
Already up to date.
Updating /var/work/cadenzaHome/cadenzaCode/cadenzaDependencies/shoppingcart
Already up to date.
...

It is also possible to specify the roots of one or more directory trees of git repositories:

Shell
$ update /path/to/another/tree $my_gems $my_plugins
Updating /path/to/another/tree /mnt/f/work/ruby/my_gems /mnt/f/work/jekyll/my_plugins 
😁

I hope you find the update scripts to be as useful as I have!
