Sockets in Unix
INTRODUCTION TO SOCKETS:
Sockets are communication endpoints that let processes on the same or different computers
exchange data. Sockets are supported by UNIX, Windows, Mac and many other operating systems. In
Unix, every I/O action is done by writing to or reading from a file descriptor. A file descriptor
is just an integer associated with an open file, and that file can be a network connection, a
text file, a terminal, or something else.
To a programmer, a socket looks and behaves much like a low-level file descriptor. This is
because calls such as read() and write() work with sockets in the same way they do with
files and pipes. The differences between sockets and normal file descriptors lie in how a socket
is created and in a variety of special operations used to control a socket.
Sockets were first introduced in 2.1BSD and subsequently refined into their current form with
4.2BSD. The sockets feature is now available with most current UNIX system releases.
Unix sockets are used in client-server application frameworks. A server is a process that
performs some function on request from a client. Most application-level protocols, such as FTP,
SMTP and POP3, use sockets to establish a connection between client and server and then to
exchange data.
SOCKET TYPES:
There are four types of sockets available to users. The first two are the most commonly used and
the last two are rarely used. Processes are presumed to communicate only between sockets of the
same type, but there is no restriction that prevents communication between sockets of different
types.
Sequenced Packet Sockets: They are similar to stream sockets, except that record boundaries
are preserved. This interface is provided only as part of the Network Systems (NS) socket
abstraction, and is very important in most serious NS applications. Sequenced-packet sockets
allow the user to manipulate the Sequence Packet Protocol (SPP) or Internet Datagram Protocol
(IDP) headers on a packet or a group of packets, either by writing a prototype header along
with whatever data is to be sent or by specifying a default header to be used with all
outgoing data. They also allow the user to receive the headers on incoming packets.
Address Classes:
IP addresses are managed and allocated by the Internet Assigned Numbers Authority (IANA).
There are 5 different address classes. You can determine which class any IP address is in by
examining the first 4 bits of the IP address.
Addresses beginning with 01111111, or 127 decimal, are reserved for loopback and for internal
testing on a local machine. (You can test this: you should always be able to ping 127.0.0.1,
which points to yourself.) Class D addresses are reserved for multicasting and Class E addresses
are reserved for future use. Neither should be used for host addresses.
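The leading-bit rule can be sketched in a few lines of C. This is only an illustration of the classful scheme described above; the function name ipv4_class is hypothetical and not part of any library.

```c
#include <stdint.h>

/* Classify an IPv4 address by the leading bits of its first octet.
 * Returns 'A'..'E'. (ipv4_class is a hypothetical helper name.) */
char ipv4_class(uint8_t first_octet)
{
    if ((first_octet & 0x80) == 0x00) return 'A';  /* 0xxxxxxx:   0-127 */
    if ((first_octet & 0xC0) == 0x80) return 'B';  /* 10xxxxxx: 128-191 */
    if ((first_octet & 0xE0) == 0xC0) return 'C';  /* 110xxxxx: 192-223 */
    if ((first_octet & 0xF0) == 0xE0) return 'D';  /* 1110xxxx: 224-239 */
    return 'E';                                    /* 1111xxxx: 240-255 */
}
```

Note that 127 falls in the Class A range by its leading bit, even though the whole 127.x.x.x block is set aside for loopback.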
CLIENT AND SERVER:
Client Process:
This is the process which typically makes a request for information. After getting the response,
this process may terminate or may do some other processing.
For example: an Internet browser works as a client application which sends a request to a web
server to get an HTML web page.
Server Process:
This is the process which takes requests from clients. After getting a request from a client,
this process does the required processing, gathers the requested information and sends it to
the requesting client. Once done, it becomes ready to serve another client. Server processes are
always alert and ready to serve incoming requests.
For example: a web server keeps waiting for requests from Internet browsers and, as soon as it
gets a request from a browser, it picks up the requested HTML page and sends it back to that
browser.
Notice that the client needs to know of the existence and the address of the server, but the server
does not need to know the address or even the existence of the client prior to the connection
being established. Once a connection is established, both sides can send and receive information.
2-tier architectures: In this architecture, the client interacts directly with the server. This
type of architecture may have some security holes and performance problems. Internet
Explorer and a web server work in a two-tier architecture. Here security problems are
addressed using Secure Sockets Layer (SSL).
3-tier architectures: In this architecture, one more piece of software sits between the client
and the server. This middle software is called middleware. Middleware is used to perform
security checks and, under heavy load, load balancing. The middleware takes all
requests from the client and, after doing the required authentication, passes each request to
the server. The server then does the required processing and sends the response back to the
middleware, which finally passes the response back to the client. If you want to
implement a 3-tier architecture, you can place middleware such as WebLogic or
WebSphere between your web server and web browsers.
Types of Server:
There are two types of servers you can have:
Iterative Server: This is the simplest form of server, where a server process serves one
client and takes a request from another client only after completing the first request.
Meanwhile, the other client keeps waiting.
Concurrent Servers: This type of server runs multiple concurrent processes to serve
many requests at a time, because one request may take a long time and another client cannot
wait that long. The simplest way to write a concurrent server under Unix is to fork a
child process to handle each client separately.
The system calls for establishing a connection are somewhat different for the client and the
server, but both involve the basic construct of a socket. The two processes each establish their
own sockets.
The steps involved in establishing a socket on the client side are as follows:
1. Create a socket with the socket() system call.
2. Connect the socket to the address of the server using the connect() system call.
3. Send and receive data. There are a number of ways to do this, but the simplest is to use
the read() and write() system calls.
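The client-side steps can be sketched as follows. This is a minimal illustration, assuming IPv4 and TCP; the function name connect_to_server and its parameters are hypothetical, and a real client would usually resolve host names with getaddrinfo() rather than take a numeric address.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Connect to a TCP server given a dotted-quad IPv4 address and a port.
 * Returns a connected socket descriptor, or -1 on failure.
 * (connect_to_server is a hypothetical helper name.) */
int connect_to_server(const char *ip, uint16_t port)
{
    struct sockaddr_in addr;
    int fd;

    /* Create a TCP (stream) socket. */
    fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    /* Fill in the server address; the port goes in network byte order. */
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_port = htons(port);
    if (inet_pton(AF_INET, ip, &addr.sin_addr) != 1) {
        close(fd);
        return -1;
    }

    /* Connect the socket to the server's address. */
    if (connect(fd, (struct sockaddr *)&addr, sizeof addr) < 0) {
        close(fd);
        return -1;
    }

    /* The caller can now exchange data with read() and write() on fd. */
    return fd;
}
```

A caller would then do something like write(fd, request, len) followed by read(fd, buf, sizeof buf), and close(fd) when finished.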
The steps involved in establishing a socket on the server side are as follows:
1. Create a socket with the socket() system call.
2. Bind the socket to an address using the bind() system call. For a server socket on the
Internet, an address consists of a port number on the host machine.
3. Listen for connections with the listen() system call.
4. Accept a connection with the accept() system call. This call typically blocks until a client
connects with the server.
5. Send and receive data using the read() and write() system calls.
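The server-side steps can be sketched in the same spirit. This is a minimal iterative example, assuming IPv4 and TCP; the names make_listener and echo_once are hypothetical, and error reporting is reduced to a -1 return.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a TCP socket, bind it to `port` on all local interfaces and
 * start listening. Returns the listening descriptor, or -1 on failure.
 * Passing port 0 lets the kernel pick a free port. */
int make_listener(uint16_t port, int backlog)
{
    struct sockaddr_in addr;
    int fd = socket(AF_INET, SOCK_STREAM, 0);
    if (fd < 0)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_ANY);
    addr.sin_port = htons(port);

    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, backlog) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Accept one client, read one chunk and write it straight back.
 * Returns the number of bytes echoed, 0 if the client sent nothing
 * before closing, or -1 on error. */
ssize_t echo_once(int listen_fd)
{
    char buf[256];
    ssize_t n;
    int cfd = accept(listen_fd, NULL, NULL);  /* blocks until a client connects */
    if (cfd < 0)
        return -1;

    n = read(cfd, buf, sizeof buf);
    if (n > 0)
        n = write(cfd, buf, (size_t)n);
    close(cfd);
    return n;
}
```

An iterative server would call make_listener() once and then call echo_once() in a loop, serving one client at a time.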
Routines for converting data between a host's internal representation and Network Byte Order are:
Function    Description
htons()     Host to Network Short
htonl()     Host to Network Long
ntohl()     Network to Host Long
ntohs()     Network to Host Short
unsigned short htons(unsigned short hostshort)
This function converts 16-bit (2-byte) quantities from host byte order to network byte order.
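A short sketch of what the conversion means in memory. The helper names put_port and get_port are hypothetical; they simply wrap htons() and ntohs() so that the byte layout is visible.

```c
#include <arpa/inet.h>
#include <stdint.h>
#include <string.h>

/* Store a port number into a 2-byte buffer in network byte order
 * (most significant byte first), whatever the host's own byte order. */
void put_port(uint8_t out[2], uint16_t port)
{
    uint16_t net = htons(port);
    memcpy(out, &net, sizeof net);
}

/* Read a port number back out of a 2-byte network-order buffer. */
uint16_t get_port(const uint8_t in[2])
{
    uint16_t net;
    memcpy(&net, in, sizeof net);
    return ntohs(net);
}
```

For example, port 8080 is 0x1F90, so after put_port() the buffer holds 0x1F followed by 0x90 on both little-endian and big-endian hosts.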
The connect function is used by a TCP client to establish a connection with a TCP server.
The bind function assigns a local protocol address to a socket. With the Internet protocols, the
protocol address is the combination of either a 32-bit IPv4 address or a 128-bit IPv6 address,
along with a 16-bit TCP or UDP port number. This function is normally called by servers; a client
may call it too, but usually lets the kernel choose its local address and port.
The listen function is called only by a TCP server and it performs two actions:
The listen function converts an unconnected socket into a passive socket, indicating that
the kernel should accept incoming connection requests directed to this socket.
The second argument to this function specifies the maximum number of connections the
kernel should queue for this socket.
The accept function is called by a TCP server to return the next completed connection from the
front of the completed connection queue. If the completed connection queue is empty, the
process is put to sleep.
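The send function is used to send data over stream sockets or CONNECTED datagram sockets. If you
want to send data over UNCONNECTED datagram sockets you must use sendto().
You can also use the write() system call to send the data. This call is described further below.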
The recv function is used to receive data over stream sockets or CONNECTED datagram
sockets. If you want to receive data over UNCONNECTED datagram sockets you must use
recvfrom().
You can also use the read() system call to read the data. This call is described further below.
The sendto function is used to send data over UNCONNECTED datagram sockets. Put simply, it is
the call to use when the socket type is SOCK_DGRAM.
The recvfrom function is used to receive data from UNCONNECTED datagram sockets. Put
simply, it is the call to use when the socket type is SOCK_DGRAM.
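A minimal sketch of sendto() and recvfrom() on unconnected UDP sockets, assuming IPv4 on the loopback interface. The names udp_bind_loopback and udp_echo_once are hypothetical helpers for illustration.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <unistd.h>

/* Create a UDP socket bound to a kernel-chosen port on 127.0.0.1.
 * The address actually bound is written to *addr.
 * Returns the descriptor, or -1 on failure. */
int udp_bind_loopback(struct sockaddr_in *addr)
{
    socklen_t len = sizeof *addr;
    int fd = socket(AF_INET, SOCK_DGRAM, 0);
    if (fd < 0)
        return -1;

    memset(addr, 0, sizeof *addr);
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr->sin_port = 0;  /* 0 asks the kernel to pick a free port */

    if (bind(fd, (struct sockaddr *)addr, sizeof *addr) < 0 ||
        getsockname(fd, (struct sockaddr *)addr, &len) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}

/* Receive one datagram and send the same bytes back to its sender.
 * recvfrom() reports who sent the datagram; sendto() names the
 * destination explicitly, since the socket is not connected.
 * Returns the number of bytes sent back, or -1 on error. */
ssize_t udp_echo_once(int fd)
{
    char buf[512];
    struct sockaddr_in peer;
    socklen_t plen = sizeof peer;
    ssize_t n = recvfrom(fd, buf, sizeof buf, 0,
                         (struct sockaddr *)&peer, &plen);
    if (n < 0)
        return -1;
    return sendto(fd, buf, (size_t)n, 0, (struct sockaddr *)&peer, plen);
}
```

Unlike the TCP case, there is no listen() or accept() step here: every datagram carries its own source address.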
The close function is used to close the communication between client and server.
The shutdown function is used to gracefully close the communication between client and server.
This function gives more control than the close function: it can shut down only the sending
side, only the receiving side, or both.
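The extra control that shutdown() offers can be illustrated with a half-close. In this sketch (the function name send_then_read_reply is hypothetical) the caller announces that it has nothing more to send, yet can still read the peer's reply on the same descriptor. It assumes the message is small enough to go out in a single write().

```c
#include <sys/socket.h>
#include <unistd.h>

/* Send a final message, close only the sending direction, then read
 * the peer's reply. Returns the number of reply bytes read, 0 if the
 * peer closed without replying, or -1 on error. */
ssize_t send_then_read_reply(int fd, const void *msg, size_t len,
                             void *reply, size_t cap)
{
    if (write(fd, msg, len) != (ssize_t)len)
        return -1;

    /* After this the peer's read() sees end-of-file once it has
     * consumed the message, but our receiving side stays open. */
    if (shutdown(fd, SHUT_WR) < 0)
        return -1;

    return read(fd, reply, cap);
}
```

A plain close() at the same point would have discarded the descriptor entirely, so the reply could not have been read.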
The select function indicates which of the specified file descriptors is ready for reading, ready for
writing, or has an error condition pending.
When an application calls recv or recvfrom, it is blocked until data arrives for that socket. An
application could be doing other useful processing while the incoming data stream is empty.
Another situation is when an application receives data from multiple sockets.
Calling recv or recvfrom on a socket that has no data in its input queue prevents immediate
reception of data from other sockets. The select function call solves this problem by allowing the
program to check all the socket descriptors to see which are available for non-blocking reading
and writing operations.
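A minimal sketch of using select() to watch several descriptors at once. The function name first_readable is hypothetical, and the example assumes every descriptor is smaller than FD_SETSIZE, which select() requires.

```c
#include <sys/select.h>
#include <sys/time.h>
#include <unistd.h>

/* Wait up to timeout_ms milliseconds for any of fds[0..n-1] to become
 * readable. Returns the index of the first ready descriptor, or -1 on
 * timeout or error. All descriptors must be below FD_SETSIZE. */
int first_readable(const int *fds, int n, int timeout_ms)
{
    fd_set rfds;
    struct timeval tv;
    int i, maxfd = -1;

    /* Build the set of descriptors to watch for reading. */
    FD_ZERO(&rfds);
    for (i = 0; i < n; i++) {
        FD_SET(fds[i], &rfds);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }

    tv.tv_sec = timeout_ms / 1000;
    tv.tv_usec = (timeout_ms % 1000) * 1000;

    /* select() returns the number of ready descriptors, 0 on timeout. */
    if (select(maxfd + 1, &rfds, NULL, NULL, &tv) <= 0)
        return -1;

    /* select() leaves only the ready descriptors set in rfds. */
    for (i = 0; i < n; i++)
        if (FD_ISSET(fds[i], &rfds))
            return i;
    return -1;
}
```

Once an index is returned, a recv() or read() on that descriptor will not block, so the program never stalls on an idle socket while another has data waiting.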
The write function attempts to write nbyte bytes from the buffer pointed to by buf to the file
associated with the open file descriptor, fildes.
You can also use the send() function to send data to another process.
The read function attempts to read nbyte bytes from the file associated with the open file
descriptor, fildes, into the buffer pointed to by buf.
You can also use the recv() function to read data from another process.
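The word "attempts" matters: on a socket, write() and read() may transfer fewer bytes than requested. A common way to handle this is to loop until the whole buffer has been transferred. The helpers below are a sketch of that pattern; write_all and read_full are hypothetical names, not standard functions.

```c
#include <errno.h>
#include <unistd.h>

/* Write exactly len bytes, retrying after short writes and EINTR.
 * Returns 0 on success, -1 on error. */
int write_all(int fd, const void *buf, size_t len)
{
    const char *p = buf;
    while (len > 0) {
        ssize_t n = write(fd, p, len);
        if (n < 0) {
            if (errno == EINTR)
                continue;       /* interrupted by a signal: try again */
            return -1;
        }
        p += n;
        len -= (size_t)n;
    }
    return 0;
}

/* Read until len bytes have arrived or end-of-file is reached.
 * Returns the number of bytes read (possibly less than len at EOF),
 * or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n < 0) {
            if (errno == EINTR)
                continue;
            return -1;
        }
        if (n == 0)
            break;              /* peer closed: end-of-file */
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```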
The fork function creates a new process. The new process, called the child process, is a
near-exact copy of the calling process (the parent process). The child process inherits many
attributes from the parent process, including its open file descriptors.
The bzero function places nbyte null bytes in the string s. This function is used to initialize
socket structures to all zero bytes before their fields are filled in.
The bcmp function compares byte string s1 against byte string s2. Both strings are assumed to be
nbyte bytes long.
The bcopy function copies nbyte bytes from string s1 to the string s2. Overlapping strings are
handled correctly.
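These three calls come from BSD, and POSIX later marked them as legacy; the standard C functions memset(), memcmp() and memmove() do the same jobs. The sketch below uses the standard functions on a socket address structure, with the BSD equivalent noted in comments. The helper names init_addr, same_addr and copy_addr are hypothetical.

```c
#include <arpa/inet.h>
#include <netinet/in.h>
#include <stdint.h>
#include <string.h>

/* Zero a sockaddr_in, then fill in family, wildcard address and port. */
void init_addr(struct sockaddr_in *addr, uint16_t port)
{
    memset(addr, 0, sizeof *addr);      /* same as bzero(addr, sizeof *addr) */
    addr->sin_family = AF_INET;
    addr->sin_addr.s_addr = htonl(INADDR_ANY);
    addr->sin_port = htons(port);
}

/* Compare two addresses byte for byte; returns 1 if equal, 0 if not.
 * Meaningful only if both structures were zeroed before being filled. */
int same_addr(const struct sockaddr_in *a, const struct sockaddr_in *b)
{
    return memcmp(a, b, sizeof *a) == 0;    /* bcmp(a, b, sizeof *a) == 0 */
}

/* Copy an address. Note the argument order: memmove() takes the
 * destination first, whereas bcopy() takes the source first. */
void copy_addr(struct sockaddr_in *dst, const struct sockaddr_in *src)
{
    memmove(dst, src, sizeof *dst);     /* bcopy(src, dst, sizeof *dst) */
}
```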