Unix Socket Programming Questions And Answers

21⟩ What exactly does SO_KEEPALIVE do?

The SO_KEEPALIVE option causes a packet (called a 'keepalive probe') to be sent to the remote system if a long time (by default, more than 2 hours) passes with no other data being sent or received. This packet is designed to provoke an ACK response from the peer. This enables detection of a peer which has become unreachable (e.g. powered off or disconnected from the net).

Note that the figure of 2 hours comes from RFC1122, "Requirements for Internet Hosts". The precise value should be configurable, but I've often found this to be difficult. The only implementation I know of that allows the keepalive interval to be set per-connection is SVR4.2.
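
As a minimal sketch of how the option is switched on and checked, assuming an already-created TCP socket (the helper names enable_keepalive and keepalive_enabled are made up for illustration):

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Turn on keepalive probes for an existing socket.
   Returns 0 on success, -1 on failure (errno set by setsockopt). */
int enable_keepalive(int sock)
{
    int on = 1;
    return setsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on));
}

/* Report the current setting: 1 = on, 0 = off, -1 = error. */
int keepalive_enabled(int sock)
{
    int value = 0;
    socklen_t len = sizeof(value);

    if (getsockopt(sock, SOL_SOCKET, SO_KEEPALIVE, &value, &len) < 0)
        return -1;
    return value != 0;
}
```

This only enables probing; how long the connection must be idle before the first probe is still governed by the system-wide setting discussed above.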

22⟩ What exactly does SO_REUSEADDR do?

This socket option tells the kernel that even if this port is busy (in the TIME_WAIT state), go ahead and reuse it anyway. If it is busy, but with another state, you will still get an address already in use error. It is useful if your server has been shut down, and then restarted right away while sockets are still active on its port. You should be aware that if any unexpected data comes in, it may confuse your server, but while this is possible, it is not likely.

It has been pointed out that "A socket is a 5 tuple (proto, local addr, local port, remote addr, remote port). SO_REUSEADDR just says that you can reuse local addresses. The 5 tuple still must be unique!" by Michael Hunter (mphunter@qnx.com). This is true, and this is why it is very unlikely that unexpected data will ever be seen by your server. The danger is that such a 5 tuple is still floating around on the net, and while it is bouncing around, a new connection from the same client, on the same system, happens to get the same remote port. This is explained by Richard Stevens in ``2.7 Please explain the TIME_WAIT state.''.
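
A minimal sketch of a server setting the option, assuming an IPv4 TCP listener (make_listener is a hypothetical helper name, and the backlog of 5 is an arbitrary choice). The option has to be set before bind() is called to have any effect on that bind():

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Create a listening TCP socket on the given port with SO_REUSEADDR set.
   Returns the descriptor, or -1 on failure. */
int make_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int on = 1;
    int sock = socket(AF_INET, SOCK_STREAM, 0);

    if (sock < 0)
        return -1;

    /* Must happen before bind(). */
    if (setsockopt(sock, SOL_SOCKET, SO_REUSEADDR, &on, sizeof(on)) < 0) {
        close(sock);
        return -1;
    }

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(sock, (struct sockaddr *) &addr, sizeof(addr)) < 0 ||
        listen(sock, 5) < 0) {
        close(sock);
        return -1;
    }
    return sock;
}
```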

23⟩ How can I make my server a daemon?

There are two approaches you can take here. The first is to use inetd to do all the hard work for you. The second is to do all the hard work yourself. If you use inetd, you simply use stdin, stdout, or stderr for your socket (these three are all created with dup() from the real socket). You can use these as you would a socket in your code. The inetd process will even close the socket for you when you are done.

If you do the work yourself, a skeleton along these lines is a starting point (HOME_DIR and the elided parts are left for you to fill in):

#include <stdio.h>
#include <stdlib.h>
#include <ctype.h>
#include <unistd.h>
#include <fcntl.h>
#include <signal.h>
#include <sys/wait.h>

/* Global variables */
volatile sig_atomic_t keep_going = 1;  /* controls program termination */

/* Function prototypes: */
void termination_handler (int signum);  /* clean up before termination */

int
main (void)
{
  ...

  if (chdir (HOME_DIR))  /* change to directory containing data files */
    {
      fprintf (stderr, "`%s': ", HOME_DIR);
      perror (NULL);
      exit (1);
    }

  /* Become a daemon: */
  switch (fork ())
    {
    case -1:  /* can't fork */
      perror ("fork()");
      exit (3);
    case 0:  /* child, process becomes a daemon: */
      close (STDIN_FILENO);
      close (STDOUT_FILENO);
      close (STDERR_FILENO);
      if (setsid () == -1)  /* request a new session (job control) */
        {
          exit (4);
        }
      break;
    default:  /* parent returns to calling process: */
      return 0;
    }

  /* Establish signal handler to clean up before termination: */
  if (signal (SIGTERM, termination_handler) == SIG_IGN)
    signal (SIGTERM, SIG_IGN);
  signal (SIGINT, SIG_IGN);
  signal (SIGHUP, SIG_IGN);

  /* Main program loop */
  while (keep_going)
    {
      ...
    }
  return 0;
}

void
termination_handler (int signum)
{
  keep_going = 0;
  signal (signum, termination_handler);
}

24⟩ How come I get address already in use from bind()?

You get this when the address is already in use. (Oh, you figured that much out?) The most common reason for this is that you have stopped your server, and then re-started it right away. The sockets that were used by the first incarnation of the server are still active. This is further explained in ``2.7 Please explain the TIME_WAIT state.'', and ``2.5 How do I properly close a socket?''.

25⟩ Why do I get connection refused when the server is not running?

The connect() call will only block while it is waiting to establish a connection. When there is no server waiting at the other end, it gets notified that the connection can not be established, and gives up with the error message you see. This is a good thing, since if it were not the case clients might wait for ever for a service which just doesn't exist. Users would think that they were only waiting for the connection to be established, and then after a while give up, muttering something about crummy software under their breath.

26⟩ How can I set the timeout for the connect() system call?

Normally you cannot change this. Solaris does let you do this, on a per-kernel basis with the ndd tcp_ip_abort_cinterval parameter.

The easiest way to shorten the connect time is with an alarm() around the call to connect(). A harder way is to use select(), after setting the socket nonblocking. Also notice that you can only shorten the connect time; there's normally no way to lengthen it.

First, create the socket and put it into non-blocking mode, then call connect(). There are three possibilities:

o connect succeeds: the connection has been successfully made (this usually only happens when connecting to the same machine)

o connect fails: obvious

o connect returns -1/EINPROGRESS. The connection attempt has begun, but not yet completed.

If the connection succeeds:

o the socket will select() as writable (and will also select as readable if data arrives)

If the connection fails:

o the socket will select as readable *and* writable, but either a read or write will return the error code from the connection attempt. Also, you can use getsockopt(SO_ERROR) to get the error status - but be careful; some systems return the error code in the result parameter of getsockopt, but others (incorrectly) cause the getsockopt call *itself* to fail with the stored value as the error.

27⟩ Why does connect() succeed even before my server did an accept()?

Once you have done a listen() call on your socket, the kernel is primed to accept connections on it. The usual UNIX implementation of this works by immediately completing the SYN handshake for any incoming valid SYN segments (connection attempts), creating the socket for the new connection, and keeping this new socket on an internal queue ready for the accept() call. So the socket is fully open before the accept is done.

The other factor in this is the 'backlog' parameter for listen(); that defines how many of these completed connections can be queued at one time. If the specified number is exceeded, then new incoming connects are simply ignored (which causes them to be retried).
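
A small experiment makes this visible. The sketch below (connect_without_accept is a hypothetical name, and it assumes the loopback interface is usable) listens on a kernel-chosen port, connects to it from the same process, and never calls accept():

```c
#include <string.h>
#include <unistd.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Returns 0 if connect() completed although accept() was never called,
   -1 if any step failed. */
int connect_without_accept(void)
{
    struct sockaddr_in addr;
    socklen_t len = sizeof(addr);
    int lsock, csock, rc = -1;

    lsock = socket(AF_INET, SOCK_STREAM, 0);
    if (lsock < 0)
        return -1;

    memset(&addr, 0, sizeof(addr));
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;                       /* let the kernel pick a port */

    if (bind(lsock, (struct sockaddr *) &addr, sizeof(addr)) == 0 &&
        listen(lsock, 5) == 0 &&
        getsockname(lsock, (struct sockaddr *) &addr, &len) == 0) {
        csock = socket(AF_INET, SOCK_STREAM, 0);
        if (csock >= 0) {
            /* The kernel completes the handshake and queues the connection. */
            rc = connect(csock, (struct sockaddr *) &addr, sizeof(addr));
            close(csock);
        }
    }
    close(lsock);
    return rc;
}
```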

28⟩ How do I convert a string into an internet address?

If you are reading a host's address from the command line, you may not know if you have an aaa.bbb.ccc.ddd style address, or a host.domain.com style address. What I do with these, is first try to use it as a aaa.bbb.ccc.ddd type address, and if that fails, then do a name lookup on it. Here is an example:

#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>
#include <netdb.h>

/* Converts ascii text to in_addr struct.
   NULL is returned if the address can not be found. */
struct in_addr *atoaddr(char *address) {
    struct hostent *host;
    static struct in_addr saddr;

    /* First try it as aaa.bbb.ccc.ddd. */
    saddr.s_addr = inet_addr(address);
    if (saddr.s_addr != INADDR_NONE) {
        return &saddr;
    }

    /* Not a dotted address, so do a name lookup. */
    host = gethostbyname(address);
    if (host != NULL) {
        return (struct in_addr *) *host->h_addr_list;
    }
    return NULL;
}

29⟩ What are socket exceptions? What is out-of-band data?

Unlike exceptions in C++, socket exceptions do not indicate that an error has occurred. Socket exceptions usually refer to the notification that out-of-band data has arrived. Out-of-band data (called "urgent data" in TCP) looks to the application like a separate stream of data from the main data stream. This can be useful for separating two different kinds of data. Note that just because it is called "urgent data" does not mean that it will be delivered any faster, or with higher priority than data in the in-band data stream. Also beware that unlike the main data stream, the out-of-band data may be lost if your application can't keep up with it.

30⟩ Why do I keep getting EINTR from the socket calls?

This isn't really so much an error as an exit condition. It means that the call was interrupted by a signal. Any call that might block should be wrapped in a loop that checks for EINTR, as is done in the example code.
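
A minimal sketch of such a loop around read() (read_retry is a made-up name; the same pattern applies to write(), accept() and the other blocking calls):

```c
#include <errno.h>
#include <unistd.h>

/* Like read(), but restarts the call if a signal interrupted it.
   Returns whatever read() finally returns. */
ssize_t read_retry(int fd, void *buf, size_t len)
{
    ssize_t n;

    do {
        n = read(fd, buf, len);
    } while (n < 0 && errno == EINTR);

    return n;
}
```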

31⟩ Is there any advantage to handling the signal, rather than just ignoring it and checking for the EPIPE error? Are there any useful parameters passed to the signal catching function?

I see that send()/write() can generate SIGPIPE. Is there any advantage to handling the signal, rather than just ignoring it and checking for the EPIPE error? Are there any useful parameters passed to the signal catching function?

In general, the only parameter passed to a signal handler is the signal number that caused it to be invoked. Some systems have optional additional parameters, but they are no use to you in this case.

My advice is to just ignore SIGPIPE as you suggest. That's what I do in just about all of my socket code; errno values are easier to handle than signals (in fact, the first revision of the FAQ failed to mention SIGPIPE in that context; I'd got so used to ignoring it...)

There is one situation where you should not ignore SIGPIPE; if you are going to exec() another program with stdout redirected to a socket. In this case it is probably wise to set SIGPIPE to SIG_DFL before doing the exec().
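
As a rough sketch of both halves of that advice (the function names are invented for illustration): ignore the signal once at startup so that a write to a dead connection fails with EPIPE, and put the default back just before an exec().

```c
#include <errno.h>
#include <signal.h>
#include <unistd.h>

/* Ignore SIGPIPE for the whole process. Afterwards a write to a
   connection whose reader has gone away returns -1 with errno == EPIPE
   instead of killing the process. Returns 0 on success, -1 on error. */
int ignore_sigpipe(void)
{
    return signal(SIGPIPE, SIG_IGN) == SIG_ERR ? -1 : 0;
}

/* Restore the default action; intended to be called in the child
   just before exec()ing a program whose stdout is a socket. */
int restore_sigpipe_default(void)
{
    return signal(SIGPIPE, SIG_DFL) == SIG_ERR ? -1 : 0;
}
```

A caller then checks the return value of write() or send() and treats EPIPE as "the peer has closed the connection".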

32⟩ How do I send [this] over a socket?

Anything other than single bytes of data will probably get mangled unless you take care. For integer values you can use htons() and friends, and strings are really just a bunch of single bytes, so those should be OK. Be careful not to send a pointer to a string though, since the pointer will be meaningless on another machine. If you need to send a struct, you should write sendthisstruct() and readthisstruct() functions for it that do all the work of taking the structure apart on one side, and putting it back together on the other. If you need to send floats, you may have a lot of work ahead of you. You should read RFC 1014 which is about portable ways of getting data from one machine to another (thanks to Andrew Gabriel for pointing this out).
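
To illustrate the take-it-apart-and-rebuild-it idea, here is a minimal sketch for a made-up two-field structure. The struct, its field names and the 6-byte wire layout are all assumptions chosen for the example; the point is that each integer is converted to network byte order and copied into a byte buffer, rather than sending the struct's memory as-is.

```c
#include <stdint.h>
#include <string.h>
#include <arpa/inet.h>

/* Hypothetical structure to be sent. */
struct sample {
    uint32_t id;
    uint16_t count;
};

#define SAMPLE_WIRE_SIZE 6   /* 4 bytes of id + 2 bytes of count */

/* Take the structure apart into a fixed wire format. */
void pack_sample(const struct sample *s, unsigned char *buf)
{
    uint32_t id = htonl(s->id);
    uint16_t count = htons(s->count);

    memcpy(buf, &id, 4);
    memcpy(buf + 4, &count, 2);
}

/* Put the structure back together on the receiving side. */
void unpack_sample(const unsigned char *buf, struct sample *s)
{
    uint32_t id;
    uint16_t count;

    memcpy(&id, buf, 4);
    memcpy(&count, buf + 4, 2);
    s->id = ntohl(id);
    s->count = ntohs(count);
}
```

The buffer filled by pack_sample() is what gets passed to write(); the receiver reads SAMPLE_WIRE_SIZE bytes and hands them to unpack_sample(). Copying field by field also sidesteps any padding the compiler may have put inside the struct.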

33⟩ How come select says there is data, but read returns zero?

The data that causes select to return is the EOF because the other side has closed the connection. This causes read to return zero.

34⟩ How can I force a socket to send the data in its buffer?

You can't force it. Period. TCP makes up its own mind as to when it can send data. Now, normally when you call write() on a TCP socket, TCP will indeed send a segment, but there's no guarantee and no way to force this. There are lots of reasons why TCP will not send a segment: a closed window and the Nagle algorithm are two things to come immediately to mind.

Setting the TCP_NODELAY option only disables one of the many tests, the Nagle algorithm. But if the original poster's problem is this, then setting this socket option will help.

A quick glance at tcp_output() shows around 11 tests TCP has to make as to whether to send a segment or not.

As you've surmised, I've never had any problem with disabling Nagle's algorithm. It's basically a buffering method; there's a fixed overhead for all packets, no matter how small. Hence, Nagle's algorithm collects small packets together (no more than .2sec delay) and thereby reduces the amount of overhead bytes being transferred. This approach works well for rcp, for example: the .2 second delay isn't humanly noticeable, and multiple users have their small packets more efficiently transferred. It helps in university settings where most folks using the network are using standard tools such as rcp and ftp, and programs such as telnet may use it, too.

However, Nagle's algorithm is pure havoc for real-time control and not much better for keystroke interactive applications (control-C, anyone?). It has seemed to me that the types of new programs using sockets that people write usually do have problems with small packet delays. One way to bypass Nagle's algorithm selectively is to use "out-of-band" messaging, but that is limited in its content and has other effects (such as a loss of sequentiality). By the way, out-of-band is often used for that ctrl-C, too.

So to sum it all up, if you are having trouble and need to flush the socket, setting the TCP_NODELAY option will usually solve the problem. If it doesn't, you will have to use out-of-band messaging, but according to Andrew, "out-of-band data has its own problems, and I don't think it works well as a solution to buffering delays (haven't tried it though). It is not 'expedited data' in the sense that exists in some other protocols; it is transmitted in-stream, but with a pointer to indicate where it is."

35⟩ What are the pros/cons of select(), non-blocking I/O and SIGIO?

Using non-blocking I/O means that you have to poll sockets to see if there is data to be read from them. Polling should usually be avoided since it uses more CPU time than other techniques.

Using SIGIO allows your application to do what it does and have the operating system tell it (with a signal) that there is data waiting for it on a socket. The only drawback to this solution is that it can be confusing, and if you are dealing with multiple sockets you will have to do a select() anyway to find out which one(s) is ready to be read.

Using select() is great if your application has to accept data from more than one socket at a time since it will block until any one of a number of sockets is ready with data. One other advantage to select() is that you can set a time-out value after which control will be returned to you whether any of the sockets have data for you or not.

36⟩ Why does it take so long to detect that the peer died?

Because by default, no packets are sent on the TCP connection unless there is data to send or acknowledge.

So, if you are simply waiting for data from the peer, there is no way to tell if the peer has silently gone away, or just isn't ready to send any more data yet. This can be a problem (especially if the peer is a PC, and the user just hits the Big Switch...).

One solution is to use the SO_KEEPALIVE option. This option enables periodic probing of the connection to ensure that the peer is still present. BE WARNED: the default timeout for this option is AT LEAST 2 HOURS. This timeout can often be altered (in a system-dependent fashion) but not normally on a per-connection basis (AFAIK).

RFC1122 specifies that this timeout (if it exists) must be configurable. On the majority of Unix variants, this configuration may only be done globally, affecting all TCP connections which have keepalive enabled. The method of changing the value, moreover, is often difficult and/or poorly documented, and in any case is different for just about every version in existence.

If you must change the value, look for something resembling tcp_keepidle in your kernel configuration or network options configuration.

If you're sending to the peer, though, you have some better guarantees; since sending data implies receiving ACKs from the peer, then you will know after the retransmit timeout whether the peer is still alive. But the retransmit timeout is designed to allow for various contingencies, with the intention that TCP connections are not dropped simply as a result of minor network upsets. So you should still expect a delay of several minutes before getting notification of the failure.

The approach taken by most application protocols currently in use on the Internet (e.g. FTP, SMTP etc.) is to implement read timeouts on the server end; the server simply gives up on the client if no requests are received in a given time period (often of the order of 15 minutes). Protocols where the connection is maintained even if idle for long periods have two choices:

1. use SO_KEEPALIVE

2. use a higher-level keepalive mechanism (such as sending a null request to the server every so often).

37⟩ Why do I get EPROTO from read()?

EPROTO means that the protocol encountered an unrecoverable error for that endpoint. EPROTO is one of those catch-all error codes used by STREAMS-based drivers when a better code isn't available.

Not quite to do with EPROTO from read(), but I found out once that on some STREAMS-based implementations, EPROTO could be returned by accept() if the incoming connection was reset before the accept completes.

On some other implementations, accept seemed to be capable of blocking if this occurred. This is important, since if select() said the listening socket was readable, then you would normally expect not to block in the accept() call. The fix is, of course, to set nonblocking mode on the listening socket if you are going to use select() on it.

38⟩ Where can I get a library for programming sockets?

There is the Simple Sockets Library by Charles E. Campbell, Jr. PhD. and Terry McRoberts. The file is called ssl.tar.gz, and you can download it from this FAQ's home page. For C++ there is the Socket++ library, which is at ftp://ftp.virginia.edu/pub/socket++-1.10.tar.gz. There is also C++ Wrappers, which can be found under ftp://ftp.huji.ac.il/pub/languages/C++/. Thanks to Bill McKinnon for tracking it down for me! From http://www.cs.wustl.edu/~schmidt you should be able to find the ACE toolkit. PING Software Group has some libraries that include a sockets interface among other things. You can find them at http://love.geology.yale.edu/~markl/ping.

39⟩ What's the difference between select() and poll()?

The basic difference is that select()'s fd_set is a bit mask and therefore has some fixed size. It would be possible for the kernel to not limit this size when the kernel is compiled, allowing the application to define FD_SETSIZE to whatever it wants (as the comments in the system header imply today) but it takes more work. 4.4BSD's kernel and the Solaris library function both have this limit. But I see that BSD/OS 2.1 has now been coded to avoid this limit, so it's doable, just a small matter of programming. :-) Someone should file a Solaris bug report on this, and see if it ever gets fixed.

With poll(), however, the user must allocate an array of pollfd structures, and pass the number of entries in this array, so there's no fundamental limit. As Casper notes, fewer systems have poll() than select, so the latter is more portable. Also, with original implementations (SVR3) you could not set the descriptor to -1 to tell the kernel to ignore an entry in the pollfd structure, which made it hard to remove entries from the array; SVR4 gets around this. Personally, I always use select() and rarely poll(), because I port my code to BSD environments too. Someone could write an implementation of poll() that uses select(), for these environments, but I've never seen one. Both select() and poll() are being standardized by POSIX 1003.1g.
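
For comparison, a minimal sketch of waiting on several descriptors with poll(). The function name and return convention are invented for the example, and it assumes nfds is at least 1. Note that the caller-sized pollfd array replaces the fixed-size fd_set:

```c
#include <poll.h>
#include <stdlib.h>

/* Wait up to timeout_ms milliseconds for any descriptor to become readable.
   Returns the index of the first ready descriptor, -1 on time-out,
   or -2 on error. */
int first_readable_poll(const int *fds, int nfds, int timeout_ms)
{
    struct pollfd *pfds;
    int i, n, result = -1;

    /* The array is as large as the caller needs; no FD_SETSIZE limit. */
    pfds = calloc((size_t) nfds, sizeof(*pfds));
    if (pfds == NULL)
        return -2;

    for (i = 0; i < nfds; i++) {
        pfds[i].fd = fds[i];
        pfds[i].events = POLLIN;
    }

    n = poll(pfds, (nfds_t) nfds, timeout_ms);
    if (n < 0) {
        result = -2;
    } else if (n > 0) {
        for (i = 0; i < nfds; i++) {
            /* POLLHUP and POLLERR are reported even if not requested. */
            if (pfds[i].revents & (POLLIN | POLLHUP | POLLERR)) {
                result = i;
                break;
            }
        }
    }

    free(pfds);
    return result;
}
```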

40⟩ How do I use TCP_NODELAY?

First off, be sure you really want to use it in the first place. It will disable the Nagle algorithm (see ``2.11 How can I force a socket to send the data in its buffer?''), which will cause network traffic to increase, with smaller than needed packets wasting bandwidth. Also, from what I have been able to tell, the speed increase is very small, so you should probably do it without TCP_NODELAY first, and only turn it on if there is a problem.

Here is a code example, with a warning about using it:

int flag = 1;
int result = setsockopt(sock,            /* socket affected */
                        IPPROTO_TCP,     /* set option at TCP level */
                        TCP_NODELAY,     /* name of option */
                        (char *) &flag,  /* the cast is historical cruft */
                        sizeof(int));    /* length of option value */
if (result < 0)
    ... handle the error ...

TCP_NODELAY is for a specific purpose: to disable the Nagle buffering algorithm. It should only be set for applications that send frequent small bursts of information without getting an immediate response, where timely delivery of data is required (the canonical example is mouse movements).
