Email is probably the most common thing you do on the Internet apart from using your browser to read and post content on websites. It makes communicating one-on-one or in a group a snap, and it has been around since before the World Wide Web, the browser-based Internet that we see and love today. Companies and individuals were using Email through software like Outlook well before the era of browser surfing. It was the union of the browser and Email in the form of webmail, though, that gave a huge boost to the adoption of Email by the average Internet user, and the first movers in that market were Hotmail and AOL. Google’s Gmail is now one of the most commonly used Email platforms. I explained how the browser-based Internet works in a two-part series (Part 1, Part 2). In this post, I will explain how Email works. Exactly what happens when you click the Send button, and how does the Email find its way to the recipient?
Okay, so you compose a nice email, add the recipient and a subject, and attach an image too. You are using your favourite Email client, be it Outlook, Thunderbird, a webmail client in your browser or your mobile phone’s Email app. Now you are ready to send the Email, and you click ‘Send’. The magic of the Internet is ready to happen, and eventually the message will find its way to your recipient’s inbox. Let us go on a trip with the Email message as it travels over the Internet and becomes a new message in bold font in your recipient’s list of new Emails.
Step 1: Your Email client connects to your Email account’s SMTP server
Since you are sending an Email, you yourself must have an account registered with an Email provider. The Email provider has a database that contains a list of all valid users who are authorised to send / receive Emails using their service. The database can be in any format, but every Email provider offers a standard way for Email clients to communicate with it. This standard is called the Simple Mail Transfer Protocol (SMTP). If you are using a web client, e.g. Gmail, Yahoo! Mail or Hotmail, the underlying web application knows which SMTP server to connect to. However, when you are using a desktop / generic mobile Email client like Outlook, you need to configure the settings yourself, but these settings are generally well documented by your Email provider. ISPs provide this info in their booklets, and your office will have a network administrator who will configure your Outlook for you.
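Once you click the ‘Send’ button, the Email client attempts to connect to the SMTP server on a specific port (I explain the concept of ports here). Port 25 is the standard port assigned to SMTP and is what mail servers use to talk to each other. Email clients today usually submit mail on port 587 (or on port 465 for a connection that is encrypted from the start), as documented by your provider.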
The SMTP server is specified either as a domain name or as an IP address. As discussed in this post, a domain name is a human-readable name like smtp.gmail.com, whereas an IP address looks like 18.104.22.168. If the SMTP server is given as a domain name, the corresponding IP address must first be found, as I explained in the above-mentioned post. Once the IP address is known, the Email client knows which remote machine to connect to, so that the Internet can work its magic and send the Email to the intended recipient.
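To make the lookup concrete, here is a minimal sketch in Python of how a client might turn the configured server setting into an IP address. The function name resolve_server is made up for this illustration, and socket.gethostbyname only returns IPv4 addresses, so treat it as a simplification rather than what any particular Email client does.

```python
import ipaddress
import socket


def resolve_server(host: str) -> str:
    """Return an IP address for `host`, using DNS only when it is a name."""
    try:
        # If `host` is already an IP address, there is nothing to look up.
        return str(ipaddress.ip_address(host))
    except ValueError:
        # Otherwise treat it as a domain name and ask the system resolver.
        return socket.gethostbyname(host)


# An IP address is returned unchanged, without any network traffic.
print(resolve_server("18.104.22.168"))
```

Passing a name such as smtp.gmail.com instead would trigger a real DNS query, so the result depends on your network.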
Step 2: The Email client and the SMTP server exchange introductions
Now it is the small talk and the authentication phase. The two parties talk to each other, and the server asks the client to identify itself and authenticate with the correct Email account username and password. Once the formalities are over, the client gets to the point and starts sending the contents of the Email to the SMTP server.
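The introductions can be sketched with Python's standard smtplib module. This is only an illustration under some assumptions: the function name open_session is made up, port 587 with STARTTLS is a common setup for mail submission but your provider may differ, and the host, username and password are whatever your provider gave you.

```python
import smtplib


def open_session(host: str, username: str, password: str,
                 port: int = 587) -> smtplib.SMTP:
    """Connect to an SMTP server, say hello and log in (a sketch)."""
    server = smtplib.SMTP(host, port, timeout=30)
    server.ehlo()        # the client introduces itself
    server.starttls()    # switch to an encrypted connection
    server.ehlo()        # introduce itself again over the encrypted link
    server.login(username, password)  # authenticate with the account
    return server
```

The function is not called here because it needs a real account. The returned object can then be used to hand over the message, and should be closed with quit() when done.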
Step 3: The Email client sends the Email contents to the SMTP server
The Email is a combination of the main content of the mail accompanied by administrative / book-keeping data called headers. Think of them as the envelope and the stamps that go along with regular post. The headers that go along with an Email are typically the list of recipients, the subject, the date/time and other such data that are established as standards. Then comes the content of the Email. Now, an Email can contain text and attachments. Email uses a standard called MIME, and specifically its multipart format, to let the Email server and the recipient’s software know that the mail contains multiple entities. Each part of a multipart message is typically a header that describes the type of the data, followed by the data itself. The headers tell the server and the recipient how to interpret the data. Think of it as communication among the personnel working with movers and packers as they hand out sealed packages to each other. “Hey, this package contains a metal chest. It is rather heavy, so take care of your back when you lift it. The next sealed package is a crate of bottles. Fragile. Careful….” and so on.
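Here is a small sketch of such a message built with Python's standard email package. The addresses, the subject and the few bytes standing in for an image are all placeholders invented for the example.

```python
from email.message import EmailMessage

msg = EmailMessage()
# Headers: the book-keeping data that travels with the mail.
msg["From"] = "sender@example.com"
msg["To"] = "recipient@example.com"
msg["Subject"] = "Holiday photo"

# The text content of the mail.
msg.set_content("Hi, the photo is attached.")

# Placeholder bytes standing in for a real image file.
fake_png = b"\x89PNG\r\n\x1a\n"
msg.add_attachment(fake_png, maintype="image", subtype="png",
                   filename="photo.png")

# Adding an attachment turns the mail into a multipart message.
print(msg.get_content_type())
for part in msg.iter_parts():
    # Each part carries its own header describing the type of its data.
    print(part.get_content_type(), part.get_filename())
```

Printing str(msg) shows the raw text that would travel over the wire, including the boundary lines that separate the parts.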
Step 4: The SMTP server queues up the Email in its list of outgoing mails
The upcoming steps are elaborate and can take some time. Hence the server has to maintain a queue of outgoing mails and hold them at its end for some time.
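The idea of an outgoing queue can be sketched as follows. Everything here is hypothetical: drain_queue, the deliver callback and the limit of three attempts are made up for illustration, and a real mail server would also wait between retries, often for hours, rather than retrying immediately.

```python
from collections import deque


def drain_queue(queue, deliver, max_attempts=3):
    """Try to deliver each queued mail; give up after max_attempts tries."""
    failed = []
    while queue:
        message, attempts = queue.popleft()
        try:
            deliver(message)
        except OSError:
            attempts += 1
            if attempts < max_attempts:
                # Put the mail back at the end of the queue for another try.
                queue.append((message, attempts))
            else:
                # Too many failures: report the mail as undeliverable.
                failed.append(message)
    return failed


def always_fails(message):
    # Stand-in for a delivery attempt to an unreachable server.
    raise OSError("recipient server unreachable")


outgoing = deque([("mail 1", 0)])
print(drain_queue(outgoing, always_fails))
```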
Step 5: The SMTP server finds out the domain to use for the recipient
When the SMTP server reads through the Email, it retrieves the domain name from the recipient’s address, i.e. the part after the @ sign. E.g. for a recipient email@example.com, the domain name is example.com.
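Extracting the domain is simple enough to show in a few lines of Python. The helper name recipient_domain is made up for this sketch, and real servers apply stricter address validation than this.

```python
from email.utils import parseaddr


def recipient_domain(address: str) -> str:
    """Return the domain part of an Email address (the part after the @)."""
    # parseaddr strips an optional display name such as "Jane Doe <...>".
    _display_name, addr = parseaddr(address)
    local_part, sep, domain = addr.rpartition("@")
    if not sep or not local_part or not domain:
        raise ValueError(f"not a valid Email address: {address!r}")
    # Domain names are case-insensitive, so normalise to lower case.
    return domain.lower()


print(recipient_domain("email@example.com"))
```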
Step 6: The SMTP server seeks the MX record for the domain
If you are a beginner to the world of Emails, the title of step 6 will not make much sense to you. Allow me to explain with an analogy.
Let us ponder this question for a moment. You order a paperback book from Amazon and receive it at your doorstep. Who delivered the book to you? If you answered ‘Amazon’, then you may not be entirely right. While Amazon does have its own logistics service, it depends on many partners too, e.g. Fedex, DHL, etc. If you look at the delivery person or at the package, you will come to know who actually carried the book to you. It would be the same if you were to use Western Union to transfer money to someone’s Citibank account. You would say that you wired to Citibank, but it was the WU infrastructure which you used. In simple words, you purchased from one company, but a different company may have handled the delivery.
MX records are like ‘logistic partners’. People want to use convenient names like @gmail.com and @abclimited.com, typically a domain name which matches the main domain name of the organisation. This is the equivalent of saying, ‘I bought from Amazon’ or ‘I wired to Citibank’. However, just like Fedex and Western Union, Email (SMTP) servers typically reside on infrastructure separate from the main website of an organisation. This separate infrastructure is kept failsafe, so that Email can still be used via Email clients like Outlook even if the main website goes down. Conventionally, the Email server’s domain is the main domain prefixed with smtp or mail or something similar, e.g. smtp.abclimited.com, mail.mycompany.com, etc. If MX records did not exist, people would have to put the Email server’s domain name directly inside the address, instead of the shorter and more convenient main company domain. E.g. it would be inconvenient to write smtp.abclimited.com or mail.mycompany.com after the @ sign of every address, which would be like saying, “Hey Amazon, I want to buy this book. Please deliver it to me via DHL”!
MX records are looked up to solve this issue. When the sender’s SMTP server gets the domain name from the receiver’s address, it asks the DNS for that domain’s MX (mail exchanger) record, which gives the domain name of the Email server that accepts mail for the domain, e.g. a server such as mail.mycompany.com for a recipient at mycompany.com.
The above screenshot shows how any mail sent to a user email@example.com will be sent to mx1.dnsmadeeasy.com. The latter is the actual SMTP mail server which provides Email service for the company on behalf of the domain example.com.
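A domain can publish more than one MX record, each with a preference number, and the sending server tries the one with the lowest number first. Python's standard library has no function for MX lookups, so the sketch below skips the DNS query and works on a hand-written list. MXRecord, delivery_order and the backup server backup.example.net are made up for illustration.

```python
from typing import NamedTuple


class MXRecord(NamedTuple):
    preference: int  # lower numbers are tried first
    exchange: str    # domain name of the mail server


def delivery_order(records):
    """Return the mail servers in the order a sender should try them."""
    return [r.exchange for r in sorted(records, key=lambda r: r.preference)]


# Hypothetical MX records for example.com; a real server gets these from DNS.
records = [
    MXRecord(20, "backup.example.net"),
    MXRecord(10, "mx1.dnsmadeeasy.com"),
]
print(delivery_order(records))
```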
Step 7: The SMTP server connects to the recipient SMTP server
Using the TCP handshake described in this post, the sender’s SMTP server connects to the recipient’s SMTP server.
Step 8: The sender’s SMTP server transfers the mail to the recipient SMTP server
Everything, including the headers and the content of the Email, is transferred between the servers. The recipient server stores the Email in its database of messages.
Step 9: The Email client of the recipient retrieves new mails using an Email retrieval protocol
Once the Email has reached the recipient’s account, the Email client can access the list of new mails. To do this, one of the following techniques is used.
- A desktop / mobile client like Outlook uses a protocol such as IMAP (or the older POP3) to retrieve the list of Emails for a particular account.
- A webmail client in a browser talks to the provider’s web application, which may use a technique such as WebSocket or periodic polling to learn when new Emails are available for the account.
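For the IMAP case, a minimal sketch using Python's standard imaplib module could look like this. The function names are made up for the example, it assumes the provider offers IMAP over an encrypted connection (port 993 by default), and it only asks for the identifiers of unread mails in the inbox rather than downloading them.

```python
import imaplib


def message_ids(search_data):
    """Split the server's reply to a SEARCH command into message ids."""
    # The reply is a list holding one space-separated byte string.
    return search_data[0].split() if search_data and search_data[0] else []


def fetch_unseen_ids(host: str, username: str, password: str):
    """Log in over IMAP and return the ids of unread mails (a sketch)."""
    with imaplib.IMAP4_SSL(host) as conn:
        conn.login(username, password)
        conn.select("INBOX", readonly=True)  # do not mark mails as read
        status, data = conn.search(None, "UNSEEN")
        return message_ids(data) if status == "OK" else []


# Example of a reply for three unread mails, parsed without any network.
print(message_ids([b"1 2 3"]))
```

fetch_unseen_ids is not called here because it needs a real account.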
Step 10: The Email client parses the Email
The Email client then parses through the Email to retrieve the details about the sender, the subject line, the contents and attachments. These are then shown to the recipient.
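The parsing step can be tried out with Python's standard email package. The raw message below is a made-up example of what a simple text-only mail looks like when it arrives.

```python
from email import message_from_string
from email.policy import default

# A made-up raw mail: headers first, then a blank line, then the content.
raw = (
    "From: Alice <alice@example.com>\n"
    "To: bob@example.com\n"
    "Subject: Lunch on Friday?\n"
    "MIME-Version: 1.0\n"
    "Content-Type: text/plain; charset=\"utf-8\"\n"
    "\n"
    "Are you free at noon?\n"
)

msg = message_from_string(raw, policy=default)

sender = str(msg["From"])
subject = str(msg["Subject"])
# Pick the plain-text part of the mail and decode it to a string.
body = msg.get_body(preferencelist=("plain",)).get_content()
# Collect the file names of any attachments (none in this example).
attachments = [part.get_filename() for part in msg.iter_attachments()]

print(sender, subject, body.strip(), attachments, sep="\n")
```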
Email is one of the most commonly used communication methods over the Internet. Emails are simple, yet powerful and have been around for a long time. Even with the proliferation of instant messaging, Email still serves its purpose and remains ubiquitous.