Archive for July, 2007

Writing a Virus Scanner (Part 1 of 2)


Okay, I’m sure everybody here wants to get down and dirty with the world of viruses. I’m sure you all have bad memories of viruses, and now is your chance to get some revenge. All right, so our scanner won’t be that strong. In fact, it will only detect one virus. To make matters worse, the “virus” is just a test file used to check that antivirus software works. Still, armed with this information, you can learn to apply these examples to finding real viruses. In the first part of this mini-series I will show you the basic theory behind antivirus software so that we can write our own little scanning script. Obviously we won’t have advanced features such as quarantining, but if you add to the program, it could actually do some fine work. You should be able to apply this theory to whatever language you program in. I’ll be using Python.


The basic theory behind antivirus software is to detect viruses based on their signatures: hexadecimal strings derived from the contents of a file. First I will show you how we get this string from a known virus. Then, in the next part of the series, I’ll show you how to put this information to good use. Basically, you find the virus signature by dumping the file in hexadecimal. Hexadecimal is a base-16 system (decimal is a base-10 system), and teaching it to you is outside the scope of this tutorial. If you want to learn hexadecimal (not completely necessary, but it definitely helps), I recommend going here.

Note: We will not be using the 0x1F notation, nor the $1F notation. We will just write 1F. These numbers are still hexadecimal.
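If the notation is new to you, here is a tiny Python 3 sketch showing that the bare form used in this tutorial and the 0x form describe the same number. The variable names are just for illustration.

```python
# Bare hex digits, as written in this tutorial, parsed as base 16.
value = int("1F", 16)

# 0x1F is Python's own literal syntax for the same number.
assert value == 0x1F == 31

# Going the other way: format an integer as two uppercase hex digits.
as_hex = format(value, "02X")
print(value, as_hex)
```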

Okay, for the next step, be aware that your antivirus/antispyware utility might interfere with our activities if it has automatic protection features that scan downloads. We are going to download the industry-standard EICAR test file.

WARNING: I cannot be held responsible for misuse of antivirus software when using this file. If you aren’t comfortable using antivirus software, this tutorial is not for you.

Download a test file (not one of the ZIP ones). Now we need to dump the file into hexadecimal. There are many programs that can dump into hexadecimal, but I assume you want to do this quickly. If that is the case, please visit Online Hex Dump. Upload your file and there should be hex output. Remove the line numbers and the parts that aren’t hex (the parts at the end of the line). Your final output should be this:

58 35 4F 21 50 25 40 41 50 5B 34 5C 50 5A 58 35
34 28 50 5E 29 37 43 43 29 37 7D 24 45 49 43 41
52 2D 53 54 41 4E 44 41 52 44 2D 41 4E 54 49 56
49 52 55 53 2D 54 45 53 54 2D 46 49 4C 45 21 24
48 2B 48 2A
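If you would rather not upload anything, the same kind of dump can be produced locally. This is only a minimal Python 3 sketch: the helper names `hex_dump` and `dump_file` are made up for this example, and the 16-bytes-per-line layout is chosen to match the output above.

```python
def hex_dump(data, width=16):
    """Return data as uppercase hex pairs, `width` bytes per line."""
    lines = []
    for start in range(0, len(data), width):
        chunk = data[start:start + width]
        # Iterating over bytes yields integers in Python 3.
        lines.append(" ".join("%02X" % byte for byte in chunk))
    return "\n".join(lines)


def dump_file(path):
    """Read a file in binary mode and return its hex dump."""
    with open(path, "rb") as handle:
        return hex_dump(handle.read())


# The first nine bytes of the test file, typed in by hand for the demo.
print(hex_dump(b"X5O!P%@AP"))
```

The printed pairs should line up with the start of the first row of the dump above. To dump the real file, call `dump_file` with wherever you saved your download.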

Now you need to unspace this code. You can do this manually, but I just wrote a nice little Python script to do it for me:

# Strip the spaces from the hex dump to get a one-line signature
hex_dump = "58 35 4F 21 50 25 40 41 50 5B 34 5C 50 5A 58 35 34 28 50 5E 29 37 43 43 29 37 7D 24 45 49 43 41 52 2D 53 54 41 4E 44 41 52 44 2D 41 4E 54 49 56 49 52 55 53 2D 54 45 53 54 2D 46 49 4C 45 21 24 48 2B 48 2A"
signature = hex_dump.replace(" ", "")
print(signature)

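Here’s what the final signature looks like:

58354F2150254041505B345C505A58353428505E2937434329377D2445494341522D5354414E444152442D414E544956495255532D544553542D46494C452124482B482A
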
This makes things a lot faster if you are serious about all this virus signature stuff. Now you have the virus signature (note that it should all be on one line; if the example above is wrapped onto several lines, that is due to bad WordPress blog editing in Opera)! Don’t worry, you won’t have to do this for every virus, or even most viruses. There are sites that provide free virus signatures. One site I found is run by Lightspeed Systems. The signature they provide for this test file isn’t the full hex dump; it’s actually only about half of it. Checking that half would still find the virus, but it would return more false positives.

The theory behind our virus scanner is that it will hex dump the file and compare it to a known list of virus signatures. To detect more advanced viruses you will have to learn about polymorphic viruses, which is definitely outside the scope of this tutorial. Good luck. Learn all this so you’ll be ready for the next tutorial. If you don’t wish to write your own virus scanner, maybe you would like to help out with an open source antivirus project.
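The theory described above (take the file’s contents and compare them against a list of known signatures) can be sketched in a few lines of Python 3. Treat this as an illustration, not the scanner from part two: the `SIGNATURES` table, the label `EICAR-Test-File`, and the function names are all made up for this example. It also compares raw bytes instead of hex text. That is the same check, but it avoids accidental matches that start halfway through a byte.

```python
# Hypothetical signature table: name -> hex signature. The only entry is
# the EICAR dump from this post, split across lines for readability.
SIGNATURES = {
    "EICAR-Test-File": (
        "58354F2150254041505B345C505A5835"
        "3428505E2937434329377D2445494341"
        "522D5354414E444152442D414E544956"
        "495255532D544553542D46494C452124"
        "482B482A"
    ),
}


def scan_bytes(data, signatures=SIGNATURES):
    """Return the names of all signatures whose bytes occur in data."""
    return [
        name
        for name, hex_signature in signatures.items()
        if bytes.fromhex(hex_signature) in data
    ]


def scan_file(path, signatures=SIGNATURES):
    """Read a file in binary mode and scan its contents."""
    with open(path, "rb") as handle:
        return scan_bytes(handle.read(), signatures)


# Demo without touching the disk: rebuild the test file's bytes from the
# signature itself, then scan them and some harmless data.
test_file = bytes.fromhex(SIGNATURES["EICAR-Test-File"])
print(scan_bytes(test_file))
print(scan_bytes(b"just some text"))

# A shorter signature matches more data, which is where false positives
# come from. This one is only the first four bytes of the dump.
short = {"too-short": "58354F21"}
print(scan_bytes(b"harmless X5O! text", short))
```

The last call shows why a partial signature is riskier: four bytes are enough to flag a perfectly innocent string.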


ASCII Table / Extended ASCII Codes– A table with ASCII and hex character codes. | Tools– Tools for converting various number systems (including hex) to text.

Online Hex Dump– An online hex dumper (if the title wasn’t obvious).

Python Programming Language — Official Site– Home to the famous scripting/programming language. It’s easy to learn, and allows you to make quick little functions that save you a ton of time. Python can also make robust applications. That’s why we’ll be using it to make the antivirus program in the next part of the tutorial.

-Information About Viruses, Hackers, and Spam– A huge virus info site.

Virus Detection Signatures– Free virus signatures. There’s probably better stuff out there though.

That’s all for part one, folks! Whether you love it, hate it, or don’t care, leave a comment. There might be a small wait for part two, but it should be within the next week or two (I might do some smaller posts for a while, and I’ll be out of town later this week). Thanks for reading!



July 25, 2007 at 9:47 pm 34 comments

Windows Live Messenger Has a New Protocol Underway

I’ve been messing around with some protocol stuff in Windows Live Messenger (formerly known as MSN). The current version of the protocol is MSNP15. This does not stand for Microsoft Network (MSN) Protocol, but for Microsoft Notification Protocol. Sending a VER message to the server checks whether you have an acceptable version of the protocol to connect with. Versions below seven do not return proper responses (those were never used in public programs), but all other protocols that have been used respond correctly. MSNP16 responds correctly, even though the current version is MSNP15! What I think this means is that the next version of the protocol is underway and there may be some new features in the upcoming Windows Live Messenger release.

The Windows Live Messenger team has been hard at work giving us a great product. I love Messenger and I think it provides a very nice work environment. It just looks and runs so much better than AIM (although AIM Pro has some features I really love). One thing I really love about the development cycle is that they get personal with the program’s users. I found this information out thanks to Python: I connected to the server with the built-in socket module, which provides a “low-level networking interface” (in other words, plain TCP/IP, which is what Windows Live Messenger uses to connect). Here’s the code I used to do this:

#Start Code
import socket

HOST = ''    # The remote host (left blank here)
PORT = 1863  # The same port as used by the server
s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
s.connect((HOST, PORT))
# VER asks the server whether it accepts this protocol version
s.sendall(b'VER 0 MSNP16 CVR0\r\n')
data = s.recv(1024)
s.close()
print('Received %r' % data)
#End Code
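To try several protocol versions in a row, it helps to separate building the VER line from reading the reply. The Python 3 sketch below does that without opening a connection. The helper names `build_ver` and `accepted_versions` are made up for this example, and the parser assumes that a successful reply has the same shape as the command sent above (`VER`, a transaction number, then the accepted `MSNP` versions). That reply format is an assumption, not something the code above proves.

```python
def build_ver(version, trid=0):
    """Build a VER command in the same format as the one sent above."""
    return ("VER %d MSNP%d CVR0\r\n" % (trid, version)).encode("ascii")


def accepted_versions(reply):
    """Pull MSNP version numbers out of a VER reply.

    Assumption: the reply has the same layout as the command.
    Anything else yields an empty list.
    """
    tokens = reply.decode("ascii", "replace").split()
    if not tokens or tokens[0] != "VER":
        return []
    return [
        int(token[4:])
        for token in tokens[2:]
        if token.startswith("MSNP") and token[4:].isdigit()
    ]


# The exact bytes used in the script above.
print(repr(build_ver(16)))

# Parsing a reply of the assumed shape, and one that is not a VER line.
print(accepted_versions(b"VER 0 MSNP16 CVR0\r\n"))
print(accepted_versions(b"something else\r\n"))
```

In a real run you would send `build_ver(n)` over the socket for each version you want to test and pass whatever `recv` returns to `accepted_versions`.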

I’m going to be working on a little text-based Windows Live Messenger in Python.




Turns out someone discovered this before me by using netcat.

July 25, 2007 at 2:56 am 1 comment

Back to Work

After forgetting about this blog for quite some time, I decided that I wasn’t going to waste an account with one small post.  No, it is time for me to start posting some content.  Expect code samples, project ideas, and commentaries on recent happenings in the geek world.  Stay tuned.


July 7, 2007 at 10:06 pm 1 comment
