In part 1 of this series, we saw how a browser breaks down your requests, finds out which remote server to connect to, how a connection between the browser and server is established and how a browser uses HTTP instructions to request for a web page.

In this part, we shall see how the server prepares its response and returns it to the browser. We will also see how the browser interprets the response and renders the page on the screen.

We assume that the browser has already requested the server for a web page. The response from the server involves the following steps.

Step 1: The server tries to recognise the resource requested by the URL

As soon as the server gets the request from the browser, it starts looking for a valid resource represented by the URL, in our case /journey-with-the-world-wide-web-part-2. Depending on the configuration made in the server, the URL may point to one of several things. It may be a file residing on the disk of the server, it may be a record in a database, a program or a file loaded in memory and so on. If the server finds a valid resource at the end of the URL, then it continues further. Otherwise, you now know why you sometimes get the infamous 404 Not Found error. But what exactly is the number 404?

A note on response codes

HTTP is a request-response based communication. The browser requests for something and the server obliges with a response. But the server has impeccable manners and always begins its response with a polite greeting, apology or offer for help! Let us use the analogy of a shop whenever we discuss the server’s response. Let us say that you walk into a bakery and ask the shopkeeper for a chocolate muffin. If the shopkeeper says, ‘Sorry you cannot get muffins right now.’, you know that you don’t stand a chance of getting any muffins from this shop. However, you may probe the shopkeeper further for which he / she may say that they are out of muffins, out of ingredients or have a broken oven. But once you have heard ‘sorry’, you know that you have to leave the shop without muffins. The reason may be something of not much interest to you. Similarly the web server always responds with a polite response code everytime it responds to a request. Based on the initial response, the browser may decide whether or not to press this communication further. A 404 is the equivalent of, ‘Sorry, but we do not have what you are asking for’. Some servers may go on to send a ‘Not Found’ page specific to that website, otherwise the browser can show its own.

Response code ranges

A response code of 200 is used to signal success. This is the equivalent of, “Here you go”, usually followed by a more meaningful response such as, “One chocolate muffin coming up”.

All the response codes in the range of 400 and 500 are reserved for errors or ‘Sorry, we can give you this because …’.

Redirections

Let us also take some time to discuss the 300 range response codes. These are called ‘redirections’ and are used mainly for the following reasons.

An old URL is not valid temporarily and the server wants to play traffic cop / diversion sign. E.g. “Sorry, this road is closed for carnival. Please take that diversion sir / ma’am. Please bear with us for today.”.
An old URL is no longer valid and the server wants to intercept those going to that URL to direct them to the new location and then tell them that the old URL is no longer valid. E.g. “Sorry, but we shifted the dry fruits shelf to that aisle. Please remember that dry fruits will always be in that aisle from now on and not here”.
An ISP uses redirects to land the user at the login page and have him / her sign up before using Internet further. E.g. “Excuse me, please come here and get your baggage scanned before proceeding further”.

Step 2: The server builds the response

Once the server is satisfied with the request and it has data ready to send back, there is a specific format in which the server should build the response. The format is called HTML or HyperText Markup Language. This is the format which is understood by browsers and instructs the browser how to layout content in a specific design. E.g. where the header goes, how many columns of text should be there, colour of text, images, tables, etc. Along with the response, the server also sends out some instructions, such as ‘the data I am sending you is 1000 bytes long, so please stop listening to me as soon as you receive that much and show the data to the user’. Other instructions it can send are, “This user’s profile image that I am sending you is valid for another 100 days or until I tell you that the image has changed. Until such a thing happens, please keep a copy of this image on your end and use it instead of downloading again and again.”

Step 3: The server sends out the data

The server sends out the data and it follows more or less the reverse of the same path that was used during the request phase. The system of routers help the response find the way back to the original browser.

Step 4: The browser renders the page to the screen

Finally, sometime after the browser has requested the page and waited, the response makes its way back and the browser is ready to use the response to show the user a nicely designed webpage. The browser has a built-in HTML parser using which it is able to understand HTML instructions to layout text, images, tables and columns. The results are out there for the user to see.

Conclusion

Over two blog posts, we have hopped on an insightful return-ticket journey between the browser and the web server to understand what goes on behind the scenes as a user looks for and gets to an informative webpage. Very much like a public transit system’s dedicated officers, the world wide web is kept running smooth by a system of network engineers, administrators, domain registrars, data storage scientist and programmers. Looking for information on the Internet has been made as easy as hopping onto a bus or a train, knowing fully well that the system will eventually get you where you want to be.

Every day, we intuitively type addresses of websites into our browser and within seconds get a nicely informative and designed webpage as we learn more, achieve more and get better at our life with the power of the World Wide Web.

But what exactly happens between the time that you type an address into your browser and the browser showing you the contents of a webpage. Turns out that hundreds of machines work together with clockwork co-ordination to understand what you requested, bring back the relevant information and show it to you on a page. The World Wide Web is a complex ecosystem of machines of different types that run around the clock to ensure that the website that you love so much is available to you 24/7.

Let us hop on a journey that takes you from the moment you request the URL of this page (http://www.tech101.in/journey-with-the-world-wide-web-part-1 ) in your browser to the moment when this blog post shows up on your browser.

Continue reading “Journey with the World Wide Web: Part 1”

Month: April 2016

Openness and Standards: The Lingua Franca

How email works

Journey with the World Wide Web: Part 2