The World Wide Web (WWW) is a repository of information spread all over the world and linked together. The WWW is a distributed client-server service, in which a client using a browser can access a service provided by a server. The Web consists of Web pages that are accessible over the Internet.
The Web allows users to view documents that contain text and graphics. Since 1994, the Web has grown much faster than the rest of the Internet and has become its largest source of traffic. By 1995, Web traffic had overtaken FTP to become the leader, and by 2001 it completely overshadowed other applications.
Hypertext Transfer Protocol (HTTP)
The protocol used to transfer a Web page between a browser and a Web server is the Hypertext Transfer Protocol (HTTP). HTTP operates at the application level and is used mainly to access data on the World Wide Web. It functions like a combination of FTP and SMTP.
HTTP is similar to FTP because it transfers files, and similar to SMTP because the data transferred between the client and the server looks like SMTP messages. However, HTTP differs from SMTP in how messages are delivered: SMTP messages are stored and forwarded, whereas HTTP messages are delivered immediately.
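The resemblance to mail messages can be seen in the header lines. The following is a minimal sketch, not a full HTTP parser: it assumes a hand-written response (`RAW_RESPONSE` is hypothetical) and uses Python's standard mail parser to read the header block, which works because both formats use "Name: value" lines followed by a blank line.

```python
from email.parser import Parser

# A hypothetical HTTP response, written out by hand for illustration.
RAW_RESPONSE = (
    "HTTP/1.1 200 OK\r\n"
    "Content-Type: text/html; charset=us-ascii\r\n"
    "Content-Length: 13\r\n"
    "\r\n"
    "<p>Hello</p>\n"
)


def parse_http_response(raw):
    """Split a response into (status line, headers, body)."""
    # A blank line separates the header section from the body.
    head, _, body = raw.partition("\r\n\r\n")
    # The first line is the status line; the rest are header lines.
    status_line, _, header_block = head.partition("\r\n")
    # The header lines have the same layout as mail headers,
    # so the standard mail parser can read them.
    headers = Parser().parsestr(header_block, headersonly=True)
    return status_line, headers, body


status_line, headers, body = parse_http_response(RAW_RESPONSE)
print(status_line)
print(headers["Content-Type"])
```

Only the status line is specific to HTTP here; the header lookup is case-insensitive, as it is for mail headers.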
As a simple example, a browser sends an HTTP GET command to request a Web page from a server. The browser contacts the Web server directly: it begins with a URL, extracts the hostname section, uses DNS to map the name to an equivalent IP address, and uses that address to form a TCP connection to the server.
Once the TCP connection is in place, the browser and Web server use HTTP to communicate. When the browser sends a request for a specific page, the server responds by sending a copy of that page.
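The sequence described above (URL, hostname, DNS lookup, TCP connection, GET request) can be sketched roughly as follows. This is an illustration under simplifying assumptions, not how a real browser is built: it handles only plain `http://` URLs, and the function names `split_url`, `build_get_request` and `fetch` are hypothetical.

```python
import socket
from urllib.parse import urlsplit


def split_url(url):
    """Return (hostname, port, path) for an http:// URL."""
    parts = urlsplit(url)
    host = parts.hostname
    port = parts.port or 80      # default HTTP port when none is given
    path = parts.path or "/"     # an empty path means the root page
    if parts.query:
        path += "?" + parts.query
    return host, port, path


def build_get_request(host, path):
    """Build the bytes of a minimal HTTP/1.1 GET request."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",     # ask the server to close when done
        "",
        "",                      # blank line ends the header section
    ]
    return "\r\n".join(lines).encode("ascii")


def fetch(url):
    """Fetch a page over a fresh TCP connection and return the raw reply."""
    host, port, path = split_url(url)
    # DNS step: map the hostname to an address usable by a TCP socket.
    family, socktype, proto, _, address = socket.getaddrinfo(
        host, port, type=socket.SOCK_STREAM
    )[0]
    with socket.socket(family, socktype, proto) as conn:
        conn.connect(address)                      # form the TCP connection
        conn.sendall(build_get_request(host, path))
        chunks = []
        while True:
            data = conn.recv(4096)
            if not data:                           # server closed the connection
                break
            chunks.append(data)
    return b"".join(chunks)
```

Calling `fetch("http://www.example.com/")` would need network access and returns the status line, headers and page as raw bytes; the two helper functions can be tried without any network.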
HTTP also allows transfer from a browser to a server. HTTP allows browsers and servers to negotiate details such as the character set to be used during transfers. To improve response time, a browser caches a copy of each Web page it retrieves.
HTTP allows a machine along the path between a browser and a server to act as a proxy server that caches Web pages and answers a browser’s request from its cache. Proxy servers are an important part of the Web architecture because they reduce the load on servers.
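The load-reducing effect of a caching proxy can be illustrated with a small sketch. It assumes a hypothetical `fake_origin` function standing in for the real Web server, and it deliberately ignores expiry and revalidation, which real proxies must handle.

```python
class CachingProxy:
    """Answer repeated requests from a local cache instead of the server."""

    def __init__(self, fetch_from_origin):
        self._fetch = fetch_from_origin   # callable that contacts the server
        self._cache = {}                  # maps URL -> cached page
        self.origin_requests = 0          # how often the server was contacted

    def get(self, url):
        if url in self._cache:
            # Cache hit: the origin server is not contacted at all.
            return self._cache[url]
        # Cache miss: fetch from the server and remember the page.
        self.origin_requests += 1
        page = self._fetch(url)
        self._cache[url] = page
        return page


def fake_origin(url):
    """Hypothetical stand-in for a Web server (an assumption for this sketch)."""
    return f"<html><body>Page for {url}</body></html>"


proxy = CachingProxy(fake_origin)
first = proxy.get("http://www.example.com/a.html")
second = proxy.get("http://www.example.com/a.html")
print(proxy.origin_requests)
```

Both requests return the same page, but only the first one reaches the origin function; the second is answered from the cache.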
In summary, a browser and server use HTTP to communicate. HTTP is an application-level protocol with explicit support for negotiation, proxy servers, caching and persistent connections.
Hypertext Markup Language (HTML)
A browser consists of a controller, client protocols and interpreters, which together display a Web document on the screen. The client protocol can be one of several, such as HTTP, FTP, Gopher or TELNET. The interpreter can be HTML or Java, depending on the type of document.
The Hypertext Markup Language (HTML) is a language used to create Web pages. With a markup language such as HTML, the formatting instructions are embedded in the file itself and stored with the text. Thus, any browser can read the instructions and format the text according to the workstation being used.
Suppose a user creates formatted text on a Macintosh computer and stores it in a Web page. Another user on an IBM computer might not be able to read the page correctly, because the two computers use different formatting procedures.
The same problem arises when different word processors use different techniques to format text. To overcome these difficulties, HTML uses only ASCII characters for both the main text and the formatting instructions. Therefore, every computer can receive the whole document as an ASCII document.
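To make the idea concrete, the sketch below takes a small hand-written page (`PAGE` is a hypothetical example) and shows that it is plain ASCII, with the formatting instructions (tags) stored alongside the text. It uses Python's standard `html.parser` module to separate the two; the class and function names are illustrative.

```python
from html.parser import HTMLParser

# A hypothetical page: text and formatting instructions in one ASCII string.
PAGE = "<html><body><h1>Welcome</h1><p>This is <b>bold</b> text.</p></body></html>"


class TagAndTextCollector(HTMLParser):
    """Record the formatting tags and the main text separately."""

    def __init__(self):
        super().__init__()
        self.tags = []
        self.text = []

    def handle_starttag(self, tag, attrs):
        self.tags.append(tag)      # a formatting instruction

    def handle_data(self, data):
        self.text.append(data)     # the main text between tags


def split_markup(page):
    """Return (list of opening tags, concatenated text) for a page."""
    collector = TagAndTextCollector()
    collector.feed(page)
    collector.close()
    return collector.tags, "".join(collector.text)


tags, text = split_markup(PAGE)
print(PAGE.isascii())   # the whole document is ASCII
print(tags)
```

Because the tags are ordinary ASCII characters, any machine can receive the file unchanged, and each browser decides how to render `h1` or `b` on its own display.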
Common Gateway Interface (CGI)
A dynamic document is created by a Web server whenever a browser requests the document. When a request arrives, the Web server runs an application program that creates the dynamic document. Common Gateway Interface (CGI) is a technology that creates and handles dynamic documents.
CGI is a set of standards that defines how a dynamic document should be written, how the input data should be supplied to the program and how the output result should be used.
CGI is not a new language; instead, it allows programmers to use any of several languages such as C, C++, Bourne Shell, Korn Shell or Perl. A CGI program in its simplest form is code written in one of the languages supporting CGI.
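As an illustration, a very small CGI-style program might look like the sketch below. It is written in Python, which is an assumption of this example rather than one of the languages listed above. It relies on the usual CGI conventions: the server passes the query string in the `QUERY_STRING` environment variable, and the program writes a header, a blank line, and then the document to standard output. The `name` parameter and the `build_response` function are hypothetical.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def build_response(query_string):
    """Create a dynamic document from the input supplied by the server."""
    params = parse_qs(query_string)
    # "name" is a hypothetical parameter chosen for this example.
    name = params.get("name", ["world"])[0]
    # Escape the input so it cannot inject markup into the page.
    body = f"<html><body><p>Hello, {html.escape(name)}!</p></body></html>\n"
    # Header line, then a blank line, then the document itself.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The Web server supplies the input through the environment
    # and reads the output from standard output.
    sys.stdout.write(build_response(os.environ.get("QUERY_STRING", "")))
```

A request such as `/cgi-bin/hello?name=Ada` (a hypothetical path) would then produce a page generated at request time rather than read from a stored file.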
Java
Java is a combination of a high-level programming language, a run-time environment and a library that allows a programmer to write an active document and a browser to run it. Java can also be used to write stand-alone programs that run without a browser. However, Java is mostly used to create a small application program called an applet.