Ryan Rampersad
Thoughts, opinions, ideas and now links

wget caching

May 27, 2012

On a recent podcast, the studio experienced some major Internet connectivity issues. We couldn’t use any direct quotes or reference any of our material, so we were kind of stuck. I thought that, in the future, I could rig my server to prefetch all of our show note pages before the show. I only needed single pages with mostly complete assets. I found a great Superuser thread on this topic via user35651.

From the wget manual (1.12):

“Actually, to download a single page and all its requisites (even if they exist on separate websites), and make sure the lot displays properly locally, this author likes to use a few options in addition to ‘-p’:”

wget -E -H -k -K -p url

This solution works great, but I also add -nd -w 3. The -nd switch tells wget not to create directories, which prevents thousands of folders from piling up. The -w 3 switch waits three seconds between requests, so I don’t saturate the Internet connection in the house or hammer the remote servers.

Tagged: cache, wget

© 2013 Ryan Rampersad.