I often find myself needing to copy an immense number of files, or an immense amount of data, from one place to another. A good example is backup. Until now, I've been writing my own software to do this. But, as is so often the case, this is a problem that a lot of people have faced, and Andrew Tridgell (the man who made Samba, and if you don't know Samba, then you've never needed to network Windows and Linux computers together) has written a most excellent solution called rsync.
It almost looks to me as if I've been wasting my time writing this backup software. Almost, but not quite, as I'll explain in a minute. But first, rsync.
The only thing that isn't obvious is how to copy only the files created in the last few days, ignoring the great mass of files created years ago. But there's a way to do that, using find. Once you correct the slight error (+3 should be -3) it works a treat. I'm going to be using rsync in conjunction with find a lot in future.
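To make the idea concrete, here is a minimal Python sketch of the same approach: pick out the recently changed files, then hand rsync the list. It is not the recipe I followed. The function names `recent_files` and `rsync_command` are made up for illustration, and it tests modification time (which is what find's -mtime looks at) rather than true creation time. The only rsync options it relies on are -a and --files-from.

```python
import os
import time


def recent_files(root, days, now=None):
    """Return paths, relative to root, of regular files modified in the last `days` days.

    Roughly what `find -mtime -N` selects: the minus sign means "newer than".
    """
    now = time.time() if now is None else now
    cutoff = now - days * 86400  # 86400 seconds in a day
    found = []
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            if os.path.isfile(full) and os.path.getmtime(full) > cutoff:
                found.append(os.path.relpath(full, root))
    return sorted(found)


def rsync_command(root, dest, list_file):
    """Build (but do not run) an rsync command that copies only the listed files.

    list_file is a text file with one path per line, relative to root.
    """
    return ["rsync", "-a", "--files-from=" + list_file, root, dest]
```

You would write the result of `recent_files("/data", 3)` to a file, one path per line, and then pass the list from `rsync_command` to `subprocess.run`. The sign matters here just as it does with find: a "+3" test would pick everything older than three days, which is exactly the great mass of files I want to skip.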
But I hadn't been completely wasting my time. The way that this method works means that you're pushing the files from the source to the destination. So, you're logged into the source, and you're copying the files from there. And I want to do it the other way round; I want to pull the files from the source. I want to be logged into the destination, and copy files from the source.
This sounds like the same thing; what's the difference? The difference is bandwidth. I use three broadband connections for this. Broadband is cheap (but, I'm told, unreliable) bandwidth: in my area, you get 6 Mbps for £15/month, while my puny 2 Mbps line costs me £400/month. If I shopped around, I could get broadband even cheaper; when I see the ads on TV, they're almost giving it away. Plusnet offer "up to 17 Mbit" (meaning 6 Mbit where I am) for £2.50/month, if I switch to them for my phone. And when fibre eventually reaches the remote region in which I live, Plusnet offer "up to 76 Mbit" (which means that they guarantee that it will never be better than that, not really a useful guarantee) for £20/month (plus £16 line rental). I'll have some of that as soon as it comes to the wild and woolly area I inhabit.
Because I'm pulling the files, I have a choice of three channels to pull them down, and, of course, I use all three. I have a little routine that means I can use all three at once. If I were logged on to the source computer and pushing the files, I couldn't do that.
Well, I probably could, by writing some clever software to run on the destination and issue commands to the source. But that's what I did in the first place.
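For anyone wondering what sharing the work between three channels might look like, here is a rough Python sketch. It is not my actual routine. The function name `split_across_channels` and the greedy largest-first strategy are assumptions chosen for illustration. It only decides which file goes down which line; how each batch then gets routed over a particular connection is left out entirely.

```python
import heapq


def split_across_channels(sizes, channels=3):
    """Greedily assign files to channels so each carries a similar number of bytes.

    sizes maps file name -> size in bytes. Files are taken largest first and
    each one goes to whichever channel currently has the least queued.
    """
    heap = [(0, i) for i in range(channels)]  # (bytes queued so far, channel index)
    heapq.heapify(heap)
    batches = [[] for _ in range(channels)]
    # Sort by size descending, then by name so the result is deterministic.
    for name, size in sorted(sizes.items(), key=lambda kv: (-kv[1], kv[0])):
        load, idx = heapq.heappop(heap)
        batches[idx].append(name)
        heapq.heappush(heap, (load + size, idx))
    return batches
```

Each of the three batches could then be handed to its own transfer process, so that all three lines are busy at once. With equally fast lines, balancing by size keeps them finishing at roughly the same time; with unequal lines you would want to weight the loads by line speed.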
In the longer run, I'm planning to replace my puny 2 Mbps line (and three broadband DSLs) with a mighty 100 Mbps line. This, I think, might arrive within a month or so, and I've been reorganising my infrastructure to get ready for it. And because there's been a falling trend in bandwidth costs over the last couple of decades, that 100 Mbps line is going to be cheaper than using a colocation service.