Hey! Socket programming got you down? Is this stuff just a little too difficult to figure out from the man pages?
Beej's Guide to Network Programming Using Internet Sockets. This is my little how-to guide on network programming using Internet sockets, or "sockets programming", for those of you who prefer it. It is meant for people who are just starting out with socket programming and are looking for a foothold. The XSL-FO output is then munged by Apache FOP to produce PDF documents, using Liberation fonts.
I could think of a few things, but they don't pertain to socket programming. Notice that this has the added benefit of allowing your program to do something else while it's connecting. Why does select keep falling out on a signal? These functions work for the unsigned variations as well. On the other hand.
You will. You know. You'll load this struct up a bit. And this is the important bit: A socket descriptor is the following type: This is where we start getting into the nitty-gritty details of what's inside an IP address structure.
Some structs are IPv4. To deal with struct sockaddr. I'll cover various data types used by the sockets interface. It's time to talk about programming. This is cool because your code can be IP version-agnostic. In this section. Things get weird from here. You might not usually need to write to these structures.
First the easy one: Note that this is a linked list: I'll make notes of which are what. I'd use the first result that worked. This structure is a more recent invention. It'll return a pointer to a new linked list of these structures filled out with all the goodies you need. It's also used in host name lookups. What is that thing? So you pass in this parallel structure. What about IPv6? Similar structs exist for it. Note that IPv6 has an IPv6 address and a port number. Let's dig deeper!
This structure makes it easy to reference elements of the socket address. Good riddance. Another quick note to mention once again the old way of doing things: The conversion can be made as follows: Part Deux Fortunately for you. You will use getaddrinfo to do that.
Two macros conveniently hold the size of the string you'll need to hold the largest IPv4 or IPv6 address: So check to make sure the result is greater than 0 before using! In this case. The function you want to use. What about the other way around? It's also obsolete and won't work with IPv6. When you call it. Private Or Disconnected Networks Lots of places have a firewall that hides the network from the rest of the world for their own protection. The answer is: How is this possible?
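To make the conversion in both directions concrete, here is a minimal sketch using inet_pton and inet_ntop. The helper name roundtrip_ip is made up for illustration; the buffer size comes from the standard INET6_ADDRSTRLEN macro, which is large enough for either address family.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: convert text to a binary address and back to text.
 * family is AF_INET or AF_INET6. Returns 0 on success, -1 on failure. */
int roundtrip_ip(int family, const char *text, char *out, size_t outlen)
{
    /* Big enough for an IPv6 address, so also big enough for IPv4. */
    unsigned char addr[sizeof(struct in6_addr)];

    if (family != AF_INET && family != AF_INET6)
        return -1;
    memset(addr, 0, sizeof addr);

    /* inet_pton returns 1 on success, 0 if the string is not a valid
     * address, and -1 on error, so only a result > 0 is usable. */
    if (inet_pton(family, text, addr) <= 0)
        return -1;

    /* inet_ntop returns NULL on failure, e.g. if out is too small. */
    if (inet_ntop(family, addr, out, (socklen_t)outlen) == NULL)
        return -1;

    return 0;
}
```

A caller would typically declare char ip[INET6_ADDRSTRLEN]; and pass sizeof ip as outlen.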
Two computers can't share the same IP address. But if you want to allocate addresses for yourself on a network that won't route outside. Networks behind a NATing firewall don't need to be on one of these reserved networks. The It's doing NAT! Less common is They'll start with fdxx: They are on a private network with 24 million IP addresses allocated to it.
They are all just for me. IPv6 has private networks. Fun fact! My external IP address isn't really The details of which private network numbers are available for you to use are outlined in RFC Who is translating the IP address from one to the other?
But if I ask my local computer what its IP address is. Are you getting nervous yet?
And often times. But I wanted to talk about the network behind the firewall in case you started getting confused by the network numbers you were seeing. Here's what's happening: If I log into a remote computer. I have a firewall at home. NAT and IPv6 don't generally mix. Of course. Any place that you find you're hard-coding anything related to the IP version. This will keep you IP version-agnostic. Instead of gethostbyname. Tell me now!
Almost everything in here is something I've gone over. Use IPv6 multicast instead. Instead of gethostbyaddr. Et voila! Nor is it desirable. In that. The place most people get stuck around here is what order to call these things in. Please note that for brevity.
It helps set up the structs you need later on. You give this function three input parameters.
And they very commonly assume that the result from calls to getaddrinfo succeed and return a valid entry in the linked list. A tiny bit of history: Both of these situations are properly addressed in the stand-alone programs.
System Calls or Bust This is the section where we get into the system calls and other library calls that allow you to access the network functionality of a Unix box.
When you call one of these functions. Let's take a look! Next is the parameter service. This is a real workhorse of a function with a lot of options. The node parameter is the host name to connect to.
In these modern times. Here's a sample call if you're a server who wants to listen on your host's IP address. This is no longer necessary. I've tried to lay out the system calls in the following sections in exactly approximately the same order that you'll need to call them in your programs. Note that this doesn't actually do any listening or network setup.
This short program will print the IP addresses for whatever host you specify on the command line: If everything works properly. Then we make the call. Let's write a quick demo program to show off this information. Here's a sample call if you're a client who wants to connect to a particular server. Or you can put a specific address in as the first parameter to getaddrinfo where I currently have NULL.
I keep saying that servinfo is a linked list with all kinds of address information. If there's an error getaddrinfo returns non-zero. This is nice because then you don't have to hardcode it. Sample run! Everyone loves screenshots: Sorry about that!
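Walking that linked list might look roughly like the sketch below. The helper name first_address is invented for this example, and it stops at the first IPv4 or IPv6 entry it can print; a real program would more likely try each entry until one works.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#include <arpa/inet.h>
#include <netdb.h>
#include <netinet/in.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: look up host and write the first IPv4 or IPv6
 * address found into out as text. Returns 0 on success, -1 on failure. */
int first_address(const char *host, char *out, size_t outlen)
{
    struct addrinfo hints, *res, *p;
    int status, found = -1;

    memset(&hints, 0, sizeof hints);
    hints.ai_family = AF_UNSPEC;      /* don't care: IPv4 or IPv6 */
    hints.ai_socktype = SOCK_STREAM;  /* one entry per address */

    /* getaddrinfo returns non-zero on error; gai_strerror explains it. */
    status = getaddrinfo(host, NULL, &hints, &res);
    if (status != 0) {
        fprintf(stderr, "getaddrinfo: %s\n", gai_strerror(status));
        return -1;
    }

    /* res points at a linked list; follow ai_next until it runs out. */
    for (p = res; p != NULL && found != 0; p = p->ai_next) {
        const void *addr;

        /* The address lives in a different struct per IP version. */
        if (p->ai_family == AF_INET)
            addr = &((struct sockaddr_in *)p->ai_addr)->sin_addr;
        else if (p->ai_family == AF_INET6)
            addr = &((struct sockaddr_in6 *)p->ai_addr)->sin6_addr;
        else
            continue;

        if (inet_ntop(p->ai_family, addr, out, (socklen_t)outlen) != NULL)
            found = 0;
    }

    freeaddrinfo(res);  /* always free the list when done with it */
    return found;
}
```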
I'm not sure of a better way around it. Keep reading! There's a little bit of ugliness there where we have to dig into the different types of struct sockaddrs depending on the IP version. That didn't happen.
If you're going to only be doing a connect because you're the client. The global variable errno is set to the error's value (see the errno man page for more details). But what are these arguments? They allow you to say what kind of socket you want (IPv4 or IPv6). Or you can call getprotobyname to look up the protocol you want. Read it anyway. It used to be people would hardcode these values. Once you have a socket. And they all lived happily ever after. The End.
The answer is that it's really no good by itself. Once upon a time. Here's the breakdown: The port number is used by the kernel to match an incoming packet to a certain process's socket descriptor.
What you really want to do is use the values from the results of the call to getaddrinfo. I guess I can put it off no longer—I have to talk about the socket system call.
Here is the synopsis for the bind system call: You can either wait for it to clear a minute or so. Let's have an example that binds the socket to the host the program is running on. Another thing to watch out for when calling bind: In the above code. Obviously this is IPv4-specific. You can have any port number above that.
sockfd is the socket file descriptor returned by socket. If you want to bind to a specific local IP address. That's a bit to absorb in one chunk. I'm telling the program to bind to the IP of the host it's running on. Let's just pretend for a few minutes that you're a telnet application. What if you don't want to connect to a remote host. You comply and call socket. You can do that if you want to. The kernel will choose a local port for us.
The process is two steps: If you are connecting to a remote machine and you don't care what your local port is (as is the case with telnet, where you only care about the remote port).
Your user commands you (just like in the movie TRON) to get a socket file descriptor.
Is this starting to make more sense? I can't hear you from here. The listen call is fairly simple. All of this information can be gleaned from the results of the getaddrinfo call.
No worries. Be sure to check the return value from connect; it'll return -1 on error and set the variable errno. So read furiously onward! No time to lose! The connect call is as follows: See the similar note in the bind section. What do you do now? Lucky for you. Most systems silently limit this number. This is where the information about the incoming connection will go and with it you can determine which host is calling you from which port.
Betcha didn't figure that. You call accept and you tell it to get the pending connection. Like before. The code in the accept section. What's going to happen is this: The really tricky part of this whole shebang is the call to accept. Their connection will be queued up waiting to be accepted. What does that mean? We're there! The call is as follows: Guess what?
It'll return to you a brand new socket file descriptor to use for this single connection! The original one is still listening for more new connections. You have to be able to tell your buddies which port to connect to! So if you're going to be listening for incoming connections. Easy enough. If it puts fewer in. If you're only getting one single connection ever, you can close the listening sockfd in order to prevent more incoming connections on the same port, if you so desire.
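Here is one way the bind, listen, accept order could be exercised inside a single process, as a rough sketch only. The function name accept_once_on_loopback and the backlog of 10 are arbitrary choices for the example, and it hardcodes IPv4 loopback purely to stay short; real code would take its address from getaddrinfo.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: listen on 127.0.0.1, connect to ourselves, and
 * accept that one connection. Returns 0 on success, -1 on failure. */
int accept_once_on_loopback(void)
{
    struct sockaddr_in addr;
    struct sockaddr_storage their_addr;  /* filled in by accept */
    socklen_t len = sizeof addr;
    int listener, client, conn, ok = -1;

    listener = socket(AF_INET, SOCK_STREAM, 0);
    if (listener == -1)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  /* port 0: let the kernel pick a free port */

    /* bind, then listen, then ask which port the kernel chose. */
    if (bind(listener, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(listener, 10) == -1 ||
        getsockname(listener, (struct sockaddr *)&addr, &len) == -1) {
        close(listener);
        return -1;
    }

    client = socket(AF_INET, SOCK_STREAM, 0);
    if (client == -1) {
        close(listener);
        return -1;
    }

    /* The connection is queued by the kernel until accept picks it up. */
    if (connect(client, (struct sockaddr *)&addr, sizeof addr) == 0) {
        len = sizeof their_addr;
        conn = accept(listener, (struct sockaddr *)&their_addr, &len);
        if (conn != -1) {
            /* accept hands back a new descriptor; listener is unchanged. */
            ok = (conn != listener) ? 0 : -1;
            close(conn);
        }
    }

    close(client);
    close(listener);  /* no more incoming connections wanted */
    return ok;
}
```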
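These two functions are for communicating over stream sockets or connected datagram sockets. If you want to use regular unconnected datagram sockets, you'll need to see the section on sendto and recvfrom, below. The send call takes the socket descriptor you want to send data to, a pointer to the data, the length of that data in bytes, and a flags argument. Just set flags to 0.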
See the send man page for more information concerning flags. send returns the number of bytes actually sent out, and this might be less than the number you told it to send! See, sometimes you tell it to send a whole gob of data and it just can't handle it. It'll fire off as much of the data as it can, and trust you to send the rest later. Remember, if the value returned by send doesn't match the value in len, it's up to you to send the rest of the string. Again, -1 is returned on error, and errno is set to the error number. The recv call is similar in many respects. See the recv man page for flag information.
recv can also return 0. This can mean only one thing: the remote side has closed the connection on you! A return value of 0 is recv's way of letting you know this has occurred. There, that was easy, wasn't it? You can now pass data back and forth on stream sockets! You're a Unix Network Programmer! For unconnected datagram sockets, we have just the thing. Since datagram sockets aren't connected to a remote host, guess which piece of information we need to give before we send a packet? That's right! The destination address! To get your hands on the destination address structure, you'll probably either get it from getaddrinfo, or from recvfrom, below, or you'll fill it out by hand.
Just like with send, sendto returns the number of bytes actually sent (which, again, might be less than the number of bytes you told it to send!). Equally similar are recv and recvfrom. When recvfrom returns, fromlen will contain the length of the address actually stored in from. So, here's a question: why use struct sockaddr_storage as the address type? Because, you see, we want to not tie ourselves down to IPv4 or IPv6. Casting it to struct sockaddr in every call seems extraneous and redundant, huh? So why isn't struct sockaddr itself big enough for any address? The answer is, it just isn't big enough, and I'd guess that changing it at this point would be Problematic.
So they made a new one. Remember, if you connect a datagram socket, you can then simply use send and recv for all your transactions. The socket itself is still a datagram socket and the packets still use UDP, but the socket interface will automatically add the destination and source information for you.
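A connected datagram socket can be sketched like this, with both ends living in one process on the loopback interface. The function name udp_connected_send is hypothetical, and IPv4 is hardcoded only to keep the example short.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: send msg over UDP to a receiver in this process and
 * copy what arrived into out (NUL-terminated, so outlen must be >= 1).
 * Returns the number of bytes received, or -1 on error. */
int udp_connected_send(const char *msg, char *out, size_t outlen)
{
    struct sockaddr_in addr;
    struct sockaddr_storage from;  /* big enough for IPv4 or IPv6 */
    socklen_t len = sizeof addr, fromlen = sizeof from;
    ssize_t n = -1;
    int rx, tx;

    rx = socket(AF_INET, SOCK_DGRAM, 0);
    tx = socket(AF_INET, SOCK_DGRAM, 0);
    if (rx == -1 || tx == -1 || outlen == 0)
        goto done;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;  /* let the kernel pick the receiver's port */

    if (bind(rx, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        getsockname(rx, (struct sockaddr *)&addr, &len) == -1)
        goto done;

    /* connect on a datagram socket only records the destination, so
     * plain send works afterwards. The packets are still UDP. */
    if (connect(tx, (struct sockaddr *)&addr, sizeof addr) == -1)
        goto done;
    if (send(tx, msg, strlen(msg), 0) == -1)
        goto done;

    /* The unconnected receiver learns the sender's address via from. */
    n = recvfrom(rx, out, outlen - 1, 0, (struct sockaddr *)&from, &fromlen);
    if (n >= 0)
        out[n] = '\0';

done:
    if (rx != -1)
        close(rx);
    if (tx != -1)
        close(tx);
    return n >= 0 ? (int)n : -1;
}
```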
You've been sending and receiving data all day long, and you've had it. You're ready to close the connection on your socket descriptor. This is easy. You can just use the regular Unix file descriptor close function. After that, anyone attempting to read or write the socket on the remote end will receive an error.
Just in case you want a little more control over how the socket closes, you can use the shutdown function.
It allows you to cut off communication in a certain direction, or both ways (just like close does). If you deign to use shutdown on unconnected datagram sockets, it will simply make the socket unavailable for further send and recv calls (remember that you can use these if you connect your datagram socket).
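Cutting off just one direction can be sketched with a pair of connected sockets. The function name half_close_demo is invented here, and socketpair stands in for a real network connection so the example needs no second program. SHUT_WR is the standard constant for "no more sends".

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: shut down the sending side of one socket and check
 * that the other direction still works. Returns 0 if everything behaved
 * as expected, -1 otherwise. */
int half_close_demo(void)
{
    int sv[2];
    char c = 0;
    int ok = -1;

    /* Two already-connected stream sockets in the same process. */
    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) == -1)
        return -1;

    if (shutdown(sv[0], SHUT_WR) == 0 &&   /* sv[0] may no longer send */
        recv(sv[1], &c, 1, 0) == 0 &&      /* peer sees end of stream */
        send(sv[1], "x", 1, 0) == 1 &&     /* other direction still open */
        recv(sv[0], &c, 1, 0) == 1 && c == 'x')
        ok = 0;

    /* shutdown did not free the descriptors; close does that. */
    close(sv[0]);
    close(sv[1]);
    return ok;
}
```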
It's important to note that shutdown doesn't actually close the file descriptor; it just changes its usability. To free a socket descriptor, you need to use close. Nothing to it. Except to remember that if you're using Windows and Winsock, you should call closesocket instead of close. This function is so easy. It's so easy, I almost didn't give it its own section.
But here it is anyway. The function getpeername will tell you who is at the other end of a connected stream socket.
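As a rough sketch, asking who is on the other end and turning the answer into something printable might look like this. The helper name peer_name is made up for the example; it uses struct sockaddr_storage so that it does not care about the IP version.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stddef.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: write the peer's IP address (as text) into host and
 * its port into *port. Returns 0 on success, -1 on error. */
int peer_name(int sockfd, char *host, size_t hostlen, int *port)
{
    struct sockaddr_storage ss;
    socklen_t len = sizeof ss;

    memset(&ss, 0, sizeof ss);

    /* getpeername returns -1 and sets errno on failure, for instance when
     * sockfd is not a valid or connected socket. */
    if (getpeername(sockfd, (struct sockaddr *)&ss, &len) == -1)
        return -1;

    if (ss.ss_family == AF_INET) {
        struct sockaddr_in *s = (struct sockaddr_in *)&ss;
        *port = ntohs(s->sin_port);
        return inet_ntop(AF_INET, &s->sin_addr, host,
                         (socklen_t)hostlen) ? 0 : -1;
    }
    if (ss.ss_family == AF_INET6) {
        struct sockaddr_in6 *s6 = (struct sockaddr_in6 *)&ss;
        *port = ntohs(s6->sin6_port);
        return inet_ntop(AF_INET6, &s6->sin6_addr, host,
                         (socklen_t)hostlen) ? 0 : -1;
    }
    return -1;  /* some other address family */
}
```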
The function returns -1 on error and sets errno accordingly. No, you can't get their login name. Ok, ok. If the other computer is running an ident daemon, this is possible.
This, however, is beyond the scope of this document. Check out RFC for more info. Even easier than getpeername is the function gethostname. It returns the name of the computer that your program is running on. The name can then be used by gethostbyname, below, to determine the IP address of your local machine.
What could be more fun? I could think of a few things, but they don't pertain to socket programming. Anyway, here's the breakdown: The function returns 0 on successful completion, and -1 on error, setting errno as usual.
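A small wrapper around gethostname could look like the following sketch. The name local_hostname is hypothetical, and the explicit terminator is there because a truncated name is not guaranteed to come back NUL-terminated.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#include <stddef.h>
#include <unistd.h>

/* Hypothetical helper: copy this machine's host name into buf.
 * Returns 0 on success, -1 on error. */
int local_hostname(char *buf, size_t len)
{
    if (len == 0)
        return -1;

    /* gethostname returns 0 on success and -1 on error, setting errno. */
    if (gethostname(buf, len) == -1)
        return -1;

    /* If the name was too long for buf it may have been cut off without
     * a terminator, so add one just in case. */
    buf[len - 1] = '\0';
    return 0;
}
```

Typical use would be char name[256]; followed by local_hostname(name, sizeof name).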
Just about everything on the network deals with client processes talking to server processes and vice-versa. The exchange of information between client and server is summarized in the above diagram. The server code Client-Server Interaction. All you need to do to test this server is run it in one window. Take telnet. Every time you use ftp. It handles the incoming telnet connection. Client-Server Background It's a client-server world.
When you connect to a remote host on port 23 with telnet (the client). The basic routine is: This is what our sample server does in the next section. IPv4 or IPv6:
You can get the data from this server by using the client listed in the next section. It gets the string that the server sends.
I have the code in one big main function for (I feel) syntactic clarity. Feel free to split it into smaller functions if it makes you feel better. If you make lots of zombies and don't reap them. The code that's there is responsible for reaping zombie processes that appear as the forked child processes exit. All this client does is connect to the host you specify on the command line.
A Simple Stream Client This guy's even easier than the server. Here is the source for listener.
Datagram Sockets We've already covered the basics of UDP datagram sockets with our discussion of sendto and recvfrom. Very useful. This is one of the perks of using unconnected datagram sockets! Next comes the source for talker. And that's all there is to it! Run listener on some machine.
Except for one more tiny detail that I've mentioned many times in the past: Let's say that talker calls connect and specifies the listener's address. From that point on.
I need to talk about this here. For this reason. Watch them communicate! Fun G-rated excitement for the entire nuclear family! You don't even have to run the server this time! You can run talker by itself. Slightly Advanced Techniques These aren't really advanced. Blocking Blocking.
By setting a socket to non-blocking. Generally speaking. I'll offer the synopsis of select: When you first create the socket descriptor with socket. Have at it! One possible alternative is libevent. Take the following situation: If you put your program in a busy-wait looking for data on the socket.
If you don't want a socket to be blocking. The reason they can do this is because they're allowed to. Which do you check for? The specification doesn't actually specify which your system will return.
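Since either error value may come back, a cautious program checks for both. The sketch below shows one way to do that; set_nonblocking and recv_would_block are names invented for the example, and preserving the existing flags with F_GETFL is my own choice rather than anything the guide prescribes.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#include <errno.h>
#include <fcntl.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: switch fd to non-blocking mode, keeping whatever
 * other file status flags were already set. Returns 0 or -1. */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);

    if (flags == -1)
        return -1;
    return fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1 ? -1 : 0;
}

/* Hypothetical helper: try to read one byte from a non-blocking socket.
 * Returns 1 if there was simply no data yet, 0 otherwise. Note that it
 * consumes a byte if one was waiting. */
int recv_would_block(int fd)
{
    char c;
    ssize_t n = recv(fd, &c, 1, 0);

    /* Systems may report either value, so test for both. */
    return n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK);
}
```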
What if you're blocking on an accept call? How are you going to recv data at the same time? You don't want to be a CPU hog. A more elegant solution for checking to see if there's data waiting to be read comes in the following section on select.
All the recv functions block. You've heard about it—now what the heck is it? In a nutshell. Not so fast.
It'll tell you which ones are ready for reading. You probably noticed that when you run listener. This being said. No problem. If you try to read from a non-blocking socket and there's no data there. What happened is that it called recvfrom. So here we go into the brave new world of some of the more esoteric things you might want to learn about sockets.
Lots of functions block. In this example..
When select returns. The following code snippet25 waits 2. There are 1. The following macros operate on this type: This time structure allows you to specify a timeout period. We have a microsecond resolution timer! Clear all entries from the set. If you set the parameter timeout to NULL. I'll talk about how to manipulate these sets. If you want to see if you can read from standard input and some socket descriptor.
The struct timeval has the following fields: This depends on what flavor of Unix you're running. Before progressing much further. Remove fd from the set. The parameter numfds should be set to the value of the highest file descriptor plus one. You'll probably have to wait some part of your standard Unix timeslice no matter how small you set your struct timeval. Return true if fd is in the set. If the time is exceeded and select still hasn't found any ready file descriptors.
Other things of interest: If you set the fields in your struct timeval to 0. Add fd to the set. Don't rely on that occurring if you want to be portable.
But others do not. This program acts like a simple multi-user chat server. You should see what your local man page says on the matter if you want to attempt it. But have a look. I know. One more note of interest about select: That's how you know the client has closed the connection. Some Unices can use select in this manner. When you actually do recv from it.
And that. Some Unices update the time in your struct timeval to reflect the amount of time still remaining before a timeout. It's a bummer. When you type something in one telnet session. Use gettimeofday if you need to track time elapsed. Start it running in one window. What happens if a socket in the read set closes the connection?
Due to circumstances beyond your control. You could write a function like this to do it. I know some data has been received. So I get it. I must store these safely away somewhere.
I have to add it to the master set? And every time a connection closes. What happened to the remaining bytes? The first. Notice I check to see when the listener socket is ready to read. I know the client has closed the connection.. At the last minute.
If the client recv returns non-zero. When it is. The reason I have the master set is that select actually changes the set you pass into it to reflect which sockets are ready to read. Check it out! Handling Partial sends Remember back in the section about send. I have to remove it from the master set?
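Handling a partial send comes down to looping until every byte has gone out or an error turns up. What follows is a minimal sketch of that idea; the name send_all and its exact interface are assumptions made for this example.

```c
#define _POSIX_C_SOURCE 200112L  /* ask for the POSIX declarations */
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical helper: keep calling send until all *len bytes of buf have
 * been sent. On return *len holds the number of bytes actually sent.
 * Returns 0 on success, or -1 on error with errno left as send set it. */
int send_all(int s, const char *buf, size_t *len)
{
    size_t total = 0;  /* how many bytes we've sent so far */

    while (total < *len) {
        /* Offer everything that is left; send may take only part of it. */
        ssize_t n = send(s, buf + total, *len - total, 0);

        if (n == -1) {
            *len = total;  /* report how far we got */
            return -1;
        }
        total += (size_t)n;
    }

    *len = total;
    return 0;
}
```

On failure the caller can look at *len to see how much of the buffer made it out before the error.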
That is. But doesn't this mean that every time I get a new connection. Since I have to keep track of the connections from one call of select to the next. In addition. This means it will block on the read after the select says it won't! Why you little—! Sometimes a human-readable protocol is excellent to use in a non-bandwidth-intensive situation.
I prefer the third method. The receiver will parse the text back into a number using a function like strtol. Convert the number into text with a function like sprintf. Just send the data raw. This will be the same number of bytes you asked it to send. Method two: So hunt around and do your homework before deciding to implement this stuff yourself.
This one is quite easy but dangerous! The receiver will decode it. For completeness. Sneak preview! Tonight only! Serialization: How to Pack Data It's easy enough to send text data across the network. The function returns -1 on error and errno is still set from the call to send. Encode the number into a portable binary form.
The first method. I include the information here for those curious about how things like this work. See the fcntl reference page for more info on setting a socket to non-blocking. Actually all the methods. I should tell you that there are libraries out there for doing this. It turns out you have a few options. If the packets are variable length.
You probably have to encapsulate remember that from the data encapsulation section way back there at the beginning? Read on for details! Quick note to all you Linux fans out there: Usage is fairly straightforward: I did. Hey—maybe you don't need portability. And since there's no standard way in C to do this. To reverse unencode the number. When packing integer types. But didn't I just get finished saying there wasn't any such function for other non-integer types?
Is all hope lost? Fear not! Were you afraid there for a second? Not even a little bit? There is something we can do: The code is decidedly non-portable. The thing to do is to pack the data into a known format and send that over the wire for decoding. That's what htons and its ilk do.
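For integers, "a known format" can be as simple as writing the bytes out most significant first, whatever the host's own byte order happens to be. The sketch below does that for a 32-bit value; pack_u32 and unpack_u32 are names made up for this example and are not the guide's own packing routines.

```c
#include <stdint.h>

/* Hypothetical helper: store v into buf[0..3] in big-endian (network)
 * byte order, independent of the host's byte order. */
void pack_u32(unsigned char *buf, uint32_t v)
{
    buf[0] = (unsigned char)(v >> 24);  /* most significant byte first */
    buf[1] = (unsigned char)(v >> 16);
    buf[2] = (unsigned char)(v >> 8);
    buf[3] = (unsigned char)v;
}

/* Hypothetical helper: rebuild the value from four big-endian bytes. */
uint32_t unpack_u32(const unsigned char *buf)
{
    return ((uint32_t)buf[0] << 24) |
           ((uint32_t)buf[1] << 16) |
           ((uint32_t)buf[2] << 8) |
           (uint32_t)buf[3];
}
```

Because the shifts work on values rather than on memory layout, the same code produces the same bytes on little-endian and big-endian machines.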
What can we do instead? Mostly—it doesn't encode NaN or Infinity. On the other hand. On the minus side. You can also see in the above example that the last couple decimal places are not correctly preserved. Here's some code that encodes floats and doubles into IEEE format Most computers use this format internally for doing floating point math.
But if you want your source code to be portable. Here's sample usage: Zeus saves a kitten every time I recommend it. But if you want to write your own packing utility in C. One thing you can do is write a helper function to help pack the data for you. The Practice of Programming is an excellent read. Python and Perl programmers will want to check out their language's pack and unpack functions for accomplishing the same thing.
That's a lot of work. At this point. To quote a friend. Here's a version I cooked up on my own based on that which hopefully will be enough to give you an idea of how such a thing can work. It'll be fun! This code references the pack functions. Back to it: I always blame Microsoft. And Java has a big-ol' Serializable interface that can be used in a similar way.
I'd link to them. Another question you might have is how do you pack structs? Unfortunately for you. You've found the Runestaff! Be wary when unpacking data you get over the network; a malicious user might send badly-constructed packets in an effort to attack your system!
How does the client know when one message starts and another stops? You could. What should your header look like? Why did I choose the 8-byte and byte limits for the fields? I pulled them out of the air. The problem is that the messages can be of varying lengths. But you're not obligated to. That's vague. When packing the data. So far so good? At least. So we encapsulate the data in a tiny header and packet structure. Whether you roll your own code or use someone else's.
NUL-padded if necessary. RFC Son of Data Encapsulation What does it really mean to encapsulate data? The Packet Police are not right outside your door. Don't look now. In any case. Excellent question. I suggest conforming to that if you're going to roll the data yourself. Your outgoing data stream looks like this: And so on. Let's have a look at a sample packet structure that we might use in this situation: Using the above packet definition. The length of the packet should be calculated as the length of this data plus 8 (the length of the name field).
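The length rule above can be sketched in code as follows. Everything about the layout other than the 8-byte NUL-padded name is an assumption made for this example: a single leading length byte, followed by the name, followed by the chat data. The function names build_packet and complete_packet_size are hypothetical too, and the 247-byte data limit is simply what fits in one length byte after the name.

```c
#include <stddef.h>
#include <string.h>

#define NAME_LEN 8  /* fixed-size name field, NUL-padded */

/* Hypothetical helper: write [len][name, 8 bytes][data] into out.
 * The len byte counts the name field plus the data, not itself.
 * Returns the total number of bytes written, or -1 if it won't fit. */
int build_packet(unsigned char *out, size_t outlen,
                 const char *name, const char *msg)
{
    size_t namelen = strlen(name), msglen = strlen(msg);
    size_t total;

    /* Assumed limits: name fits its field, len fits in one byte. */
    if (namelen > NAME_LEN || msglen > 255 - NAME_LEN)
        return -1;

    total = 1 + NAME_LEN + msglen;
    if (outlen < total)
        return -1;

    out[0] = (unsigned char)(NAME_LEN + msglen);  /* data length plus 8 */
    memset(out + 1, 0, NAME_LEN);                 /* NUL padding */
    memcpy(out + 1, name, namelen);
    memcpy(out + 1 + NAME_LEN, msg, msglen);
    return (int)total;
}

/* Hypothetical helper for the receiving side: given have bytes in buf,
 * return the size of the first packet if all of it has arrived, else 0. */
size_t complete_packet_size(const unsigned char *buf, size_t have)
{
    size_t need;

    if (have < 1)
        return 0;  /* not even the length byte yet */
    need = 1 + (size_t)buf[0];
    return have >= need ? need : 0;
}
```

A receiver would keep appending recv results to its work buffer and call complete_packet_size after each one, handling a packet only once the return value is non-zero.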
I don't think they are. But that wastes bandwidth! And then let's assume the data is variable length. In the simplest case. The choice is up to you. When you've handled the first one. The advantage of this method is that you only need a buffer large enough for one packet. Another option is just to call recv and say the amount you're willing to receive is the maximum number of bytes in a packet.
But this is why you made your work buffer large enough to hold two packets—in case this happened! Since you know the length of the first packet from the header. Every time you recv data. If you're still curious. Broadcast Packets—Hello. Are you juggling that in your head yet? Once the packet is complete.
Book Site. Jorgensen Publishing October 21, ; eBook Version 3. Book Description Beej's Guide to Network Programming has been one of the top socket programming guides on the Internet for the last 15 years, and it's now for the first time available as a lovingly bound paperback book!