Feb 22

Netcat Clone in Three Languages – Part II (Python)

Category: Programming, Python

As a continuation of this article, I’m now writing a very stupid version of netcat in Python.

NOTE: Apparently not everyone is having great success. I have tried this out under Cygwin and Red Hat Enterprise Linux 5, but if it doesn’t work for you, please leave a comment and I’ll see what I can do.


The full script can be found here

The second contestant is Python, the favorite scripting language of NASA and apparently Boost. For consistency with the Ruby example I’m going to put everything in a class. You don’t have to, but it’s a good excuse to try out the objects….

The Command Line Arguments

Like Ruby, Python has a command line argument parser…

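from optparse import OptionParser
import sys

class NetTool:
    def run(self):
        # Later sections add connect_socket() and forward_data()
        self.parse_options()

    def parse_options(self):
        parser = OptionParser(usage="usage: %prog [options]")
        # -c and -l store True/False into the same destination
        parser.add_option("-c", "--connect",
            action="store_true", dest="connect",
            help="Connect to a remote host")
        parser.add_option("-l", "--listen",
            action="store_false", dest="connect",
            help="Listen for a remote host to connect to this host")
        # The -r/--hostname and -p/--port flag names are assumptions
        parser.add_option("-r", "--hostname", dest="hostname",
            help="Specify the host to connect to")
        parser.add_option("-p", "--port", type="int", dest="port",
            help="Specify the TCP port")
        parser.set_defaults(connect=None, hostname=None)
        (options, args) = parser.parse_args()
        if options.connect is None:
            sys.stdout.write("no connection type specified\n")
            sys.exit(1)
        if options.port is None:
            sys.stdout.write("no port specified\n")
            sys.exit(1)
        if options.connect and options.hostname is None:
            sys.stdout.write("connect type requires a hostname\n")
            sys.exit(1)
        self.connect = options.connect
        self.hostname = options.hostname
        self.port = options.port

tool = NetTool()
tool.run()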
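
A quick way to poke at a parser like this without touching the real command line is to hand parse_args an explicit list. The sketch below is only an illustration: the build_parser helper and the -r/--hostname and -p/--port flag names are assumptions made for the example, not something taken from the full script.

```python
from optparse import OptionParser

def build_parser():
    # Hypothetical helper; the -r and -p flag names are assumed for this sketch
    parser = OptionParser(usage="usage: %prog [options]")
    parser.add_option("-c", "--connect", action="store_true", dest="connect")
    parser.add_option("-l", "--listen", action="store_false", dest="connect")
    parser.add_option("-r", "--hostname", dest="hostname")
    parser.add_option("-p", "--port", type="int", dest="port")
    parser.set_defaults(connect=None, hostname=None)
    return parser

# parse_args takes an optional argument list instead of reading sys.argv
options, args = build_parser().parse_args(
    ["-c", "-r", "example.com", "-p", "8080"])
print(options.connect, options.hostname, options.port)

# -l writes False into the same destination that -c writes True into
listen_options, _ = build_parser().parse_args(["-l", "-p", "9000"])
print(listen_options.connect, listen_options.port)
```

Because both flags share one destination, connect stays None when neither is given, which is what the "no connection type specified" check relies on.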

Just off the cuff, the documentation was waaaay better than the ruby documentation and this option parsing is less verbose, more readable, and just as powerful as the ruby version. It does annoy me that you have to specify the self argument to methods, but I have to admit that self.varname is a lot clearer than ruby’s @varname.

I was also a little annoyed that there aren’t any constants or enums built into the language though I guess that isn’t terrible. There are recipes for making them, but I’m always annoyed if I have to write custom code to do something the language should do for me.
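
For what it’s worth, newer Python releases (3.4 and up) ship an enum module in the standard library, so the recipes are no longer needed there. A minimal sketch, where Mode and describe are hypothetical names standing in for the connect/listen choice:

```python
from enum import Enum

class Mode(Enum):  # hypothetical name, not part of the original script
    CONNECT = 1
    LISTEN = 2

def describe(mode):
    # Enum members are singletons, so identity comparison is safe
    if mode is Mode.CONNECT:
        return "connect to a remote host"
    return "listen for a remote host"

print(describe(Mode.CONNECT))
print(Mode(2).name)
```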

Finally, I was initially surprised that code blocks are identified via indentation rather than {} or begin/end or whatever. I have to say that although I hated the idea initially I now think more languages should adopt it. Why? Because you never get this in python:

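if(a == b); //<---oops: the stray semicolon ends the if
    do_it();
if(a == b)
    do_it();
    do_it_again(); //<---oops: always runs, despite the indentation
if(a == b)
    if(c == d)
        do_it();
else //<---oops: binds to the inner if, not the outer one
    do_something_else();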

The form of the code has to indicate the function because the function is defined by the form. There isn’t any good reason to let someone write poorly indented code so why make it an option?
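
To make the contrast concrete, here is a small sketch of the nested case in Python. The pick function and its return strings are made up for illustration; the point is only that the else belongs to whichever if it lines up with.

```python
def pick(a, b, c, d):
    if a == b:
        if c == d:
            return "both match"
    else:
        # Lined up with the outer if, so it runs only when a != b
        return "a != b"
    return "a == b, c != d"

print(pick(1, 1, 2, 2))
print(pick(1, 2, 0, 0))
print(pick(1, 1, 2, 3))
```

Indent the else one level deeper and it pairs with the inner if instead; either way, what you see is what runs.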

Starting up the Socket

Now I need to create the socket.

import socket

class NetTool:
    def connect_socket(self):
        if self.connect:
            # Client mode: open a TCP connection to the remote host
            self.socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            self.socket.connect( (self.hostname, self.port) )
        else:
            # Listen mode: wait for a single incoming connection
            server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
            server.bind(('localhost', self.port))
            server.listen(1)
            self.socket, address = server.accept()

I just want to kick off by mentioning that when I’m using a scripting language I don’t want the documentation to tell me that:

  1. I should read a few C UNIX sockets programming books for details, as well as an RFC.
  2. The sockets may behave differently on different platforms (particularly without details on what is different).

In summary, this is very unreadable code, the most egregious bit of which is setting the socket.SO_REUSEADDR flag in the socket options. This basically means that you want to be able to listen on the specified port even if something else just recently stopped listening on it. By default that is refused, because there is some chance that lingering traffic from an old connection could be interpreted as valid for the new connection…blah blah blah blah.

Why do I even have to explain this crap? It’s bad default behavior based on esoteric details inherited from an ancient UNIX API in C. The code doesn’t make clear what is going on or why. Anyone who doesn’t already know Berkeley sockets is going to give up or just write code that doesn’t behave properly.

Ruby definitely wins on this one.
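
For anyone who wants to try the listen and connect halves on a single machine, the sketch below wires them together over the loopback address. It is only an illustration: make_listener is a hypothetical helper, and binding to port 0 (which asks the OS for any free port) is an assumption made so the example doesn’t collide with anything already running.

```python
import socket

def make_listener(port=0):
    # port=0 lets the OS pick a free port; an assumption for this demo
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    # Allow rebinding a port that was in use moments ago
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", port))
    server.listen(1)
    return server

server = make_listener()
port = server.getsockname()[1]  # the port the OS actually assigned

# The connect completes as soon as the listener is up, even before accept()
client = socket.create_connection(("127.0.0.1", port))
conn, _address = server.accept()

client.sendall(b"hello")
data = conn.recv(100)     # listener side receives the bytes
conn.sendall(data)        # and echoes them back
reply = client.recv(100)

for s in (client, conn, server):
    s.close()
print(reply)
```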

Asynchronous IO

This should be interesting. Now I need to simultaneously forward data from STDIN to the socket out and from the socket in to STDOUT all while checking to see if either the connection or STDIN has been closed.

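import os
import select
import socket
import sys

class NetTool:
    def forward_data(self):
        self.socket.setblocking(False)
        stdin_fd = sys.stdin.fileno()
        while True:
            # Block until the socket or STDIN has data (or an error)
            r, w, e = select.select(
                               [self.socket, stdin_fd], [],
                               [self.socket, stdin_fd])
            try:
                # Drain the socket to STDOUT; recv raises once it would block
                buffer = self.socket.recv(100)
                while buffer != b'':
                    os.write(sys.stdout.fileno(), buffer)
                    buffer = self.socket.recv(100)
                return  # an empty read means the peer closed the connection
            except socket.error:
                pass
            # Forward whatever is waiting on STDIN, one byte at a time
            while select.select([stdin_fd], [], [], 0)[0]:
                c = os.read(stdin_fd, 1)
                if c == b'':
                    return  # EOF on STDIN
                self.socket.sendall(c)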

Well, this is in many ways as bad as the last bit. There isn’t any non-blocking read, or at least not one easily found, and the select solution doesn’t work properly under Cygwin for some reason (it doesn’t register the entire chunk of data on stdin until a double newline or an EOF).

It definitely is not nearly as readable as the Ruby version and took a lot longer to code. On the other hand, the Python documentation was much better and more extensive than most of what I could find for Ruby.
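
As a rough illustration of the select pattern, the sketch below reads from a socket only when select reports it as readable, and stops when the peer closes or nothing arrives within a timeout. The pump function and the one-second timeout are hypothetical, and socket.socketpair() stands in for a real network connection so the example runs on its own.

```python
import select
import socket

def pump(sock, timeout=1.0):
    """Collect everything readable from sock until the peer closes it."""
    chunks = []
    while True:
        readable, _, _ = select.select([sock], [], [], timeout)
        if not readable:
            break  # nothing arrived within the timeout
        data = sock.recv(100)
        if data == b"":
            break  # an empty read means the peer closed the connection
        chunks.append(data)
    return b"".join(chunks)

# A connected pair of sockets standing in for the two ends of a connection
left, right = socket.socketpair()
left.sendall(b"first ")
left.sendall(b"second")
left.close()

received = pump(right)
right.close()
print(received)
```

Calling recv only after select says the socket is readable means the call never has to block, which sidesteps the try/except around recv at the cost of one extra select per chunk.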

The Author

Michael Smit is a software engineer in Seattle, Washington who works for Amazon.
