C: Linux Socket Programming, TCP, a simple HTTP client -- page 2

Let's go over some sections from the source.

On line 38, we create the socket by calling a custom function, create_tcp_socket, defined on lines 117 to 125.

  int create_tcp_socket()
  {
    int sock;
    if((sock = socket(AF_INET, SOCK_STREAM, IPPROTO_TCP)) < 0){
      perror("Can't create TCP socket");
      exit(1);
    }
    return sock;
  }

To get a TCP socket over IPv4, the domain has to be AF_INET, the socket type is SOCK_STREAM (a connection-oriented byte stream), and the protocol is set to IPPROTO_TCP.

Then we call get_ip(), defined on lines 128 to 145. get_ip takes a hostname as an argument and attempts to convert it to a string representing its IP address.

  char *get_ip(char *host)
  {
    struct hostent *hent;
    int iplen = 15; //XXX.XXX.XXX.XXX
    char *ip = (char *)malloc(iplen+1);
    memset(ip, 0, iplen+1);
    if((hent = gethostbyname(host)) == NULL)
    {
      herror("Can't get IP");
      exit(1);
    }
    // Pass the full buffer size (iplen+1): a 15-character address also needs room for its NUL
    if(inet_ntop(AF_INET, (void *)hent->h_addr_list[0], ip, iplen+1) == NULL)
    {
      perror("Can't resolve host");
      exit(1);
    }
    return ip;
  }

Let's look at this function a bit closer. First we allocate just enough space to hold an IPv4 address string (15 characters plus the terminating NUL). Then we call gethostbyname, which on success returns a non-NULL pointer to a struct of type hostent, which holds the host's aliases and network addresses (in network byte order). We then convert the first network address to a string with inet_ntop and return the string. Since the string is allocated with malloc, the caller has to free it.
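gethostbyname still works, but POSIX marks it obsolete in favour of getaddrinfo. Here is a minimal sketch of the same lookup using getaddrinfo. The helper name resolve_ipv4 and its caller-supplied buffer are assumptions made for this example; they are not part of the original source.

```c
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netdb.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: resolve host to a dotted-quad IPv4 string.
 * out should have room for INET_ADDRSTRLEN bytes (16, including the NUL).
 * Returns 0 on success, -1 on failure. */
int resolve_ipv4(const char *host, char *out, size_t outlen)
{
  struct addrinfo hints, *res;
  struct sockaddr_in *sin;
  const char *p;
  int rc;

  memset(&hints, 0, sizeof(hints));
  hints.ai_family = AF_INET;       /* IPv4 only, like get_ip */
  hints.ai_socktype = SOCK_STREAM; /* we want a TCP endpoint */

  if ((rc = getaddrinfo(host, NULL, &hints, &res)) != 0) {
    fprintf(stderr, "Can't get IP: %s\n", gai_strerror(rc));
    return -1;
  }

  /* Convert the first address returned, as get_ip does. */
  sin = (struct sockaddr_in *)res->ai_addr;
  p = inet_ntop(AF_INET, &sin->sin_addr, out, (socklen_t)outlen);
  freeaddrinfo(res);
  return p == NULL ? -1 : 0;
}
```

Letting the caller provide the buffer avoids the malloc/free pair, and returning an error code instead of calling exit leaves the decision to the caller.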

Back in main, on lines 41 to 53, we set up the remote address, and we finally connect our socket to it on line 55.
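Setting up the remote address boils down to filling a struct sockaddr_in with the address family, the IP address and the port, both in network byte order. The following is only a sketch of that step, not a copy of the code in main; the helper names make_remote_addr and connect_to are made up for the example.

```c
#include <string.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Hypothetical helper: fill *remote for a dotted-quad IP string and a port.
 * Returns 0 on success, -1 if ip is not a valid IPv4 address string. */
int make_remote_addr(const char *ip, unsigned short port,
                     struct sockaddr_in *remote)
{
  memset(remote, 0, sizeof(*remote));
  remote->sin_family = AF_INET;
  remote->sin_port = htons(port); /* port in network byte order */
  /* inet_pton returns 1 on success, 0 for a malformed string, -1 on error */
  if (inet_pton(AF_INET, ip, &remote->sin_addr) != 1)
    return -1;
  return 0;
}

/* Hypothetical helper: connect an existing TCP socket to ip:port.
 * Returns 0 on success, -1 on failure. */
int connect_to(int sock, const char *ip, unsigned short port)
{
  struct sockaddr_in remote;
  if (make_remote_addr(ip, port, &remote) < 0)
    return -1;
  return connect(sock, (struct sockaddr *)&remote, sizeof(remote));
}
```

With these, something like connect_to(sock, ip, 80) would connect the socket returned by create_tcp_socket to the address returned by get_ip.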

Now our socket is ready to send and receive data.
On line 59 we build the HTTP query, and we send it on lines 63 to 72. As there is no guarantee that the whole query is sent in one go (a single call may send only part of the buffer), we need to use a loop that makes sure all the bytes are sent.
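The idea behind such a send loop can be sketched as follows. This is an illustration written for this page, assuming the data goes out through send(); the helper name send_all does not come from the original source.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper: keep calling send() until len bytes have been sent.
 * Returns 0 on success, -1 if send() fails (errno is set by send()). */
int send_all(int sock, const char *buf, size_t len)
{
  size_t sent = 0;
  while (sent < len) {
    /* send() returns how many bytes it accepted, possibly fewer than asked */
    ssize_t n = send(sock, buf + sent, len - sent, 0);
    if (n < 0) {
      if (errno == EINTR)
        continue; /* interrupted by a signal: retry */
      return -1;
    }
    sent += (size_t)n;
  }
  return 0;
}
```

Each pass resumes at buf + sent, so a partial send just means one more trip through the loop.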

On lines 77 to 97, we retrieve the reply from the server. The same applies here: we need to loop, as we might not receive all the bytes in one shot. This algorithm will fail to detect the beginning of the HTML content if the "\r\n\r\n" sequence is split across two reads. But anyway, this is good enough for the example :D.
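If you want to handle a "\r\n\r\n" that straddles two reads, one option is to remember how much of the sequence has already been matched when a chunk ends. Below is a small sketch of that idea. The names hdr_scan and hdr_scan_feed are hypothetical, and this is not what the example source does.

```c
#include <stddef.h>

/* Hypothetical state: how many bytes of "\r\n\r\n" have matched so far.
 * Initialise matched to 0 before feeding the first chunk. */
struct hdr_scan { int matched; };

/* Feed one received chunk of len bytes.
 * Returns the offset in buf of the first byte after "\r\n\r\n" if the
 * sequence ends in this chunk, or -1 if it has not been seen yet.
 * An offset equal to len means the content starts with the next chunk. */
long hdr_scan_feed(struct hdr_scan *s, const char *buf, size_t len)
{
  static const char term[] = "\r\n\r\n";
  size_t i;
  for (i = 0; i < len; i++) {
    if (buf[i] == term[s->matched])
      s->matched++;          /* one more byte of the sequence matched */
    else if (buf[i] == '\r')
      s->matched = 1;        /* mismatch, but this may start a new sequence */
    else
      s->matched = 0;        /* mismatch: start over */
    if (s->matched == 4) {
      s->matched = 0;
      return (long)(i + 1);
    }
  }
  return -1;
}
```

The receive loop would call hdr_scan_feed on every chunk until it returns something other than -1, and treat everything from that offset on as content.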

Finally, we clean up the resources we allocated manually.


