If you do a lot of Chef development, you are not only writing cookbooks and recipes, but also converging, testing and destroying virtual machines with Test Kitchen, which uses Vagrant. In this process, virtual machines are often destroyed and rebuilt from the ground up. For every new build, all of the yum or Debian packages must be downloaded and installed again. If your internet connection is not blazing fast, this will slow down your work considerably.

Install and configure Squid

But no worries, we can fix that with the wonderful vagrant-proxyconf plugin and a locally installed caching proxy like Squid! The proxy runs on your host machine and can be used, for example, for all connections from Vagrant boxes. Special rules can be defined to cache only .rpm or .deb files if desired.

On Mac OS X, the Squid proxy is very easy to install and configure with SquidMan.

I set a 16 GB cache size, a 256 MB maximum object size and a 90-day cache period for rpm and deb files.
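These values end up in squid.conf in different units: cache_dir takes its size in megabytes, maximum_object_size is written here with a KB suffix, and refresh_pattern takes minutes. A small Ruby sketch of the conversion, in case you want different values (the method names are hypothetical, for illustration only):

```ruby
# Convert human-friendly settings into the units used in squid.conf.
# The constant and method names are illustrative; they are not part of
# Squid or SquidMan.
MINUTES_PER_DAY = 24 * 60

def cache_dir_mb(gigabytes)
  gigabytes * 1024        # cache_dir size is given in megabytes
end

def max_object_kb(megabytes)
  megabytes * 1024        # maximum_object_size, written with a KB suffix
end

def refresh_minutes(days)
  days * MINUTES_PER_DAY  # refresh_pattern min/max are given in minutes
end

puts "cache_dir size:      #{cache_dir_mb(16)}"
puts "maximum_object_size: #{max_object_kb(256)} KB"
puts "refresh period:      #{refresh_minutes(90)}"
```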

My complete configuration (generated from the default SquidMan template, with some additional refresh_pattern lines):

~/Library/Preferences/squid.conf
# ----------------------------------------------------------------------
# WARNING - do not edit this template unless you know what you are doing
# ----------------------------------------------------------------------

# the parent cache



# disk and memory cache settings
cache_dir ufs /Users/pieter/Library/Caches/squid 16384 16 256
maximum_object_size 262144 KB


# store coredumps in the first cache dir
coredump_dir /Users/pieter/Library/Caches/squid


# the hostname squid displays in error messages
visible_hostname localhost


# log & process ID file details
cache_access_log stdio:/Users/pieter/Library/Logs/squid/squid-access.log
cache_store_log stdio:/Users/pieter/Library/Logs/squid/squid-store.log
cache_log /Users/pieter/Library/Logs/squid/squid-cache.log
pid_filename /tmp/squid.pid


# Squid listening port
http_port 8080


# Access Control lists
acl SSL_ports port 443
acl Safe_ports port 80      # http
acl Safe_ports port 21      # ftp
acl Safe_ports port 443     # https
acl Safe_ports port 70      # gopher
acl Safe_ports port 210     # wais
acl Safe_ports port 1025-65535  # unregistered ports
acl Safe_ports port 280     # http-mgmt
acl Safe_ports port 488     # gss-http
acl Safe_ports port 591     # filemaker
acl Safe_ports port 777     # multiling http
acl CONNECT method CONNECT




# Only allow cachemgr access from localhost
http_access allow localhost manager
http_access deny manager


# Deny requests to certain unsafe ports
http_access deny !Safe_ports


# Deny CONNECT to other than secure SSL ports
http_access deny CONNECT !SSL_ports


# protect web apps running on the proxy host from external users
http_access deny to_localhost


# rules for client access go here
http_access allow localhost



# after allowed hosts, deny all other access to this proxy
# don't list any other access settings below this point
http_access deny all


# specify which hosts have direct access (bypassing the parent proxy)

always_direct deny all


# hierarchy stop list (squid-recommended)
hierarchy_stoplist cgi-bin ?



# refresh patterns (squid-recommended)
#
# The rpm, deb, iso and tar.gz files are cached for 129600 minutes,
# which is 90 days. The refresh-ims and override-expire options are
# described in the configuration here:
# http://www.squid-cache.org/Doc/config/refresh_pattern/.
# But basically, refresh-ims makes squid check with the backend
# server when someone does a conditional get, to be cautious. The
# override-expire option lets us override the specified expiry time.
# This is illegal according to the RFC, but works for our specific purposes.
# The dots are escaped so that they only match a literal "." in the URL.
refresh_pattern -i \.rpm$ 129600 100% 129600 refresh-ims override-expire
refresh_pattern -i \.deb$ 129600 100% 129600 refresh-ims override-expire
refresh_pattern -i \.iso$ 129600 100% 129600 refresh-ims override-expire
refresh_pattern -i \.tar\.gz$ 129600 100% 129600 refresh-ims override-expire
refresh_pattern ^ftp:       1440    20% 10080
refresh_pattern ^gopher:    1440    0%  1440
refresh_pattern -i (/cgi-bin/|\?) 0 0%  0
refresh_pattern .       0   20% 4320
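The refresh_pattern rules are regular expressions matched against the request URL, and -i makes them case-insensitive. In a regular expression an unescaped dot matches any character, so escaping it keeps a rule limited to real file extensions. A rough way to see the difference is to try both forms in Ruby. This is only a sketch: Ruby's Regexp is standing in for Squid's regex engine, and the sample URLs are invented.

```ruby
# Ruby's Regexp stands in for Squid's regex matching here. The engines are
# not identical, but "." and "$" behave the same for these simple patterns.
UNESCAPED = /.rpm$/i   # "." matches any single character
ESCAPED   = /\.rpm$/i  # "\." matches only a literal dot

# Made-up URLs, for illustration only.
urls = [
  "http://mirror.example.com/Packages/bash-4.1.2.rpm",
  "http://mirror.example.com/Packages/BASH-4.1.2.RPM",
  "http://mirror.example.com/docs/librpm",
  "http://mirror.example.com/pool/main/b/bash.deb",
]

urls.each do |url|
  puts format("%-52s unescaped=%-5s escaped=%s",
              url, UNESCAPED.match?(url), ESCAPED.match?(url))
end
```

Only the escaped form rejects the URL ending in "librpm", while both forms accept the upper-case ".RPM" file because of the case-insensitive flag.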

Install and configure vagrant-proxyconf

Now that the proxy is set up, we need to tell Vagrant to route its traffic through it. Install the Vagrant plugin “vagrant-proxyconf” for that.

vagrant plugin install vagrant-proxyconf

Find the gateway IP on the guest VM. This is the address at which the guest can reach the host machine.

netstat -rn

In most setups the IP will just be 10.0.2.2.
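If you would rather not read the routing table by eye, the default gateway can be picked out of the netstat output with a few lines of Ruby. This is a minimal sketch that assumes Linux-style output, where the default route has the destination 0.0.0.0 (or the word "default" on some systems). The method name and the sample output are illustrative.

```ruby
# Return the default gateway found in `netstat -rn` output, or nil.
# Assumes the first column is the destination and the second the gateway.
def default_gateway(netstat_output)
  netstat_output.each_line do |line|
    destination, gateway = line.split
    return gateway if destination == "0.0.0.0" || destination == "default"
  end
  nil
end

# Sample output as it might look on a guest behind VirtualBox NAT.
SAMPLE_ROUTES = <<~ROUTES
  Kernel IP routing table
  Destination     Gateway         Genmask         Flags   MSS Window  irtt Iface
  0.0.0.0         10.0.2.2        0.0.0.0         UG        0 0          0 eth0
  10.0.2.0        0.0.0.0         255.255.255.0   U         0 0          0 eth0
ROUTES

puts default_gateway(SAMPLE_ROUTES)

# On the guest itself, the real output could be passed in instead:
#   puts default_gateway(`netstat -rn`)
```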

Tell Vagrant to use the proxy server with a Vagrantfile in your home directory. Its settings are merged with the Vagrantfile in your project or the one used by Test Kitchen.

nano ~/.vagrant.d/Vagrantfile
Vagrant.configure("2") do |config|
  if Vagrant.has_plugin?("vagrant-proxyconf")
    config.proxy.http     = "http://10.0.2.2:8080/"
    config.proxy.https    = "http://10.0.2.2:8080/"
    config.proxy.no_proxy = "localhost,127.0.0.1,.example.com"
  end
end
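Before converging a box, it can save some head-scratching to confirm that something is actually listening on the proxy port. The following is a small sketch using Ruby's standard socket library. The method name is hypothetical, and the address assumes you run it on the host, where Squid listens on port 8080 as configured above. From inside a guest you would check 10.0.2.2 instead.

```ruby
require "socket"

# Return true if something accepts TCP connections on host:port.
# This only checks that the port is open, not that it is really Squid.
def proxy_listening?(host, port, timeout: 2)
  Socket.tcp(host, port, connect_timeout: timeout) { true }
rescue SystemCallError, SocketError, IOError
  false
end

puts proxy_listening?("127.0.0.1", 8080)
```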

Disable Yum mirrors (optional, but recommended!)

To get more cache hits from the proxy, it’s important to disable the Yum fastestmirror plugin, for example on CentOS. Also use the baseurl in repositories instead of the mirrorlist. This way rpm packages will always be downloaded from the same URL instead of from a different mirror each time.

An example of how to do this in a Chef recipe:

if node['dev_mode']
  # Improve the cache hit rate for a proxy by disabling the Yum fastestmirror plugin
  ruby_block "disable yum fastestmirror plugin" do
    block do

      Chef::Log.info "[YUM]: Disable fastestmirror plugin"

      rc = Chef::Util::FileEdit.new('/etc/yum/pluginconf.d/fastestmirror.conf')
      rc.search_file_replace_line(/^enabled=/, 'enabled=0')
      rc.write_file
    end
  end

  # Improve the cache hit rate for a proxy by disabling the mirrorlist in repositories
  ruby_block "disable yum mirrorlist in repositories" do
    block do
      Chef::Log.info "[YUM]: Disable mirrorlist in repositories"

      repo_dir = '/etc/yum.repos.d'

      if Dir.exist?(repo_dir)
        Dir.foreach(repo_dir) do |repo|
          path = File.join(repo_dir, repo)

          # Skip '.', '..' and any subdirectories
          next if File.directory?(path)

          content = File.read(path)

          if content.include?('baseurl=')
            # Enable the commented-out baseurl and comment out the mirrorlist;
            # lines that are already commented out are left alone
            content = content.gsub(/^#baseurl=/, 'baseurl=')
            content = content.gsub(/^mirrorlist=/, '#mirrorlist=')

            File.write(path, content)
          end
        end
      end
    end
  end
end
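To see whether these changes pay off, you can count hits and misses in the Squid access log. The sketch below assumes Squid's default (native) log format, where the fourth whitespace-separated field looks like TCP_HIT/200 or TCP_MISS/200. The method name is hypothetical and the sample lines are invented.

```ruby
# Fraction of requests answered from the cache, between 0.0 and 1.0.
# Assumes the native Squid log format: the fourth field is "CODE/status".
def hit_rate(log_lines)
  codes = log_lines.map { |line| line.split[3] }.compact
  return 0.0 if codes.empty?

  hits = codes.count { |code| code.include?("HIT") }
  hits.to_f / codes.size
end

# Invented sample lines, for illustration only.
SAMPLE_LOG = <<~LOG
  1400000000.123    512 127.0.0.1 TCP_MISS/200 1048576 GET http://mirror.example.com/Packages/bash.rpm - HIER_DIRECT/192.0.2.10 application/x-rpm
  1400000100.456      3 127.0.0.1 TCP_HIT/200 1048576 GET http://mirror.example.com/Packages/bash.rpm - HIER_NONE/- application/x-rpm
  1400000200.789      2 127.0.0.1 TCP_MEM_HIT/200 2048 GET http://mirror.example.com/Packages/zlib.rpm - HIER_NONE/- application/x-rpm
  1400000300.000    230 127.0.0.1 TCP_MISS/200 4096 GET http://mirror.example.com/repodata/repomd.xml - HIER_DIRECT/192.0.2.10 text/xml
LOG

puts format("hit rate: %.0f%%", hit_rate(SAMPLE_LOG.lines) * 100)
```

For real numbers, pass in the lines of the access log configured in squid.conf, for example with File.readlines on the squid-access.log path.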

That’s it! Enjoy working faster, and probably drinking a little less coffee!