Sunday 12 November 2017

Virtran

In 1991, I realised that Findvirus, my antivirus scanner, had gotten really slow. When I first wrote it in 1989, it looked for three file viruses: Vienna, Cascade and Jerusalem. By the time it was scanning for several dozen, it was much too slow, so I decided to rewrite it.

My first decision was to make it a single point scanner. If you've analysed the virus, then you know *exactly* where in the file the virus would be if it's there at all. My second decision was to include repair, so that if it found a file with a virus, it would (optionally) reverse the infection process and put the file back the way it was before the virus came along. My third big decision was to make it upgradable, so that I could add detection and repair for viruses that came along in future. How many would that be? I thought about it, and realised that I couldn't put an upper limit on it.
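To illustrate the single point idea, here is a minimal Python sketch (not the original code, which is not shown here). The function name `matches_at`, the convention that a negative offset counts back from the end of the file, and the sample bytes are all assumptions made up for the example.

```python
def matches_at(data: bytes, offset: int, signature: bytes) -> bool:
    """Check for `signature` at exactly one position in `data`.

    Assumption for this sketch: a negative offset counts back from the
    end of the file, since many file viruses append themselves there.
    """
    if offset < 0:
        offset += len(data)
    if offset < 0:
        return False  # the file is too short to hold the virus there
    return data[offset:offset + len(signature)] == signature


# Made-up file contents and signature, purely for illustration.
sample = b"AAAAVIRUSBB"
print(matches_at(sample, 4, b"VIRUS"))   # look at one fixed place
print(matches_at(sample, -7, b"VIRUS"))  # same place, counted from the end
print(matches_at(sample, 0, b"VIRUS"))   # wrong place, so no match
```

The point of the design is that the cost of each check depends on the length of the signature, not on the size of the file, which is what a search through the whole file would cost.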

To make it upgradable, I decided that as each virus appeared, I didn't want to change the scanning engine. So I would have a separate file. Other antivirus products called this file "signatures", but what I wanted wasn't just the sequence of bytes to look for. I also wanted it to say where in the file to look for it, how to do a checksum of the virus so I'd have an *exact* identification of the virus (by that time we'd begun to see families of viruses), and how to do a repair. I called that a driver file; each virus would have its own driver, and it wouldn't matter which order they appeared in. And because it worked like that, I'd be able to have umpteen people working on drivers, although for a long, long time there was only one. Me.

So what I designed was an interpreted language. There was a verb for "follow the jump", a verb for "move xx bytes along the file", and so on. There were two kinds of verb: single-byte verbs like "follow the jump", and multibyte verbs like "move xx bytes along the file". A byte in the range 0-127 would be a single-byte verb, 128-255 would be multibyte. How many bytes in a multibyte verb? The second byte would tell you that, and would be followed by the xx number, or whatever other info was needed. Altogether there were, going on my memory, a hundred or so verbs. And there was a compiler, which I thoughtfully called "compile", that boiled down an English-like description file into a byte sequence.
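The encoding described above can be sketched in a few lines of Python. This is only an illustration of the layout, not the real engine. One detail is an assumption: the text does not say whether the length byte counts the whole verb or just the bytes that follow it, and this sketch assumes the latter. The opcode numbers in the example are invented.

```python
def split_verbs(program: bytes) -> list[tuple[int, bytes]]:
    """Split a compiled driver into (opcode, operand_bytes) pairs.

    Opcodes 0-127 are single-byte verbs. Opcodes 128-255 are multibyte:
    the next byte is a length, assumed here to count only the operand
    bytes that follow it.
    """
    verbs = []
    i = 0
    while i < len(program):
        opcode = program[i]
        if opcode < 128:
            verbs.append((opcode, b""))
            i += 1
            continue
        if i + 1 >= len(program):
            raise ValueError("multibyte verb is missing its length byte")
        length = program[i + 1]
        operands = program[i + 2:i + 2 + length]
        if len(operands) != length:
            raise ValueError("multibyte verb is truncated")
        verbs.append((opcode, operands))
        i += 2 + length
    return verbs


# Hypothetical opcodes: 1 and 5 are single-byte verbs, 0x80 is a
# multibyte verb carrying a two-byte operand.
program = bytes([0x01, 0x80, 0x02, 0x34, 0x12, 0x05])
for opcode, operands in split_verbs(program):
    print(opcode, operands.hex())
```

Under that assumption, the length byte means a reader can step over a multibyte verb without knowing what the verb does.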

One of the beauties of this was that to port the scanner from one system to another, all you had to do was reimplement the engine; the same driver file would work on all engines. I did exactly that, porting the scanner from DOS to OS/2. Iolo Davidson also did a port, writing a TSR (memory resident) version that we called "Virus Guard". Later we ported to Windows (and to a VxD, the Windows equivalent of a TSR), and to Novell NetWare 2 and NetWare 3. Linux wasn't a big deal in those days.

And it was an extensible system. If I needed a new verb, then I'd just use the first unused number, define that for the new verb, implement it in the engine, add it to the compiler, and away we go. When the polymorphic viruses started to get more important, this was very useful, because I could add the verb for the "Generic Decryption Engine", a powerful tool that was able to decrypt any polymorphic virus, which made scanning the result really easy.
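As a rough sketch of what "use the first unused number" might look like, here is a small Python table of verbs. The class `VerbTable`, its method, and the handler names are hypothetical, and the real engine and compiler would have had more to do than this.

```python
class VerbTable:
    """Maps verb numbers to handlers (a hypothetical structure)."""

    def __init__(self):
        self.handlers = {}

    def add_verb(self, handler, multibyte=False):
        """Give `handler` the first unused number in the right range.

        Single-byte verbs take numbers 0-127 and multibyte verbs take
        128-255, matching the split described earlier in the post.
        """
        numbers = range(128, 256) if multibyte else range(0, 128)
        for number in numbers:
            if number not in self.handlers:
                self.handlers[number] = handler
                return number
        raise RuntimeError("no unused verb numbers left in that range")


def follow_jump():
    """Placeholder handler for a single-byte verb."""


def move_along(operands):
    """Placeholder handler for a multibyte verb."""


table = VerbTable()
print(table.add_verb(follow_jump))                 # first single-byte slot
print(table.add_verb(move_along, multibyte=True))  # first multibyte slot
```

Existing drivers keep working when a verb is added this way, because the numbers already handed out never change.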


That was 1991. Now it's 2017. We let the management buy the company, after a couple of years they floated it, then it got sold to McAfee, then McAfee got sold to Intel, and now Intel have sold that on to another company. Yesterday, I invited over Igor Muttik, a chap I hired for the virus lab from the Low Temperature Physics lab in Moscow, and we had coffee and talked about people, places and viruses. And he told me that 26 years later, they are still using Virtran (with the extensions that I had allowed room for).

I was pleased. Although not as pleased as I was when we got the Queen's Award for Technology for it!


