Software programs are like food and clothing. They come in all shapes and sizes. There are billions of programs, but all of them can be classified into a few categories. This post describes the different categories of software programs, what can be achieved with each, where they fall short and how you can decide which one you should build or assemble a team for.
The topic seems very basic, but I feel that companies today are churning out too much code and too many variants of the same program for their own good. As a company, it is worth pausing and asking yourself what you really want to achieve for the users.
During the 1990s, there were only two ways to transfer voice communication from one place to another. The first type was radio communication, using gadgets like walkie-talkies. The second was telephone networks. At the turn of the century, the improved speed of the Internet was begging for a way to transfer voice in real time from one desktop computer to another. Enter VoIP or Voice over Internet Protocol.
What is VoIP?
VoIP is a way to transfer sound, mainly voice, from one device to another over the Internet as close to real time as possible, similar to the way radio or phone does. But instead of a radio frequency or a telephone exchange, the medium over which your voice is carried from your device to the recipient’s device is the Internet. Please note that this is not the same as downloading a song in MP3 form and listening to it later. This is like a phone conversation where the voice is streamed live from one end to the other.
In a telephone, a connection request is indicated to the recipient by ringing his/her device. A connection is established when the receiver picks up. VoIP too has the concept of connection request and establishment. Just as each phone has a phone number as its unique identity, VoIP devices have VoIP endpoints. These usually look like email addresses. Users connect to each other using endpoints. The act of dialling, ringing, picking up and having a conversation over a phone is known as a phone call. In VoIP, the entire cycle is called a VoIP session. Just as calls can be recorded in modern telephony systems, VoIP sessions can be recorded too. And just like conference calls in telephones, VoIP sessions can happen with more than two participants.
In a VoIP call, the two parties use a software app or a dedicated device that connects to a VoIP service provider’s server over the Internet. This is similar to the way that your landline telephone is plugged into a wall unit using an RJ11 cable and your mobile phone contains a SIM card that connects to the nearest tower belonging to your provider. The software connection to a VoIP server is established just like your web browser connects to the server of a website or your email software (e.g. Outlook, Thunderbird) connects to an email server. Once connected, the software app can ask the server to put it through to another person connected to the service. The server will then send a notification to the call recipient, who can choose to receive the call. The server then manages the path and the streaming of voice between the sender and receiver apps.
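The connect, request and pick-up cycle described above can be sketched in a few lines of Python. This is only a toy model that assumes an in-memory server: the class names, method names and example endpoints are all hypothetical, and no real VoIP protocol or voice streaming is involved.

```python
from dataclasses import dataclass, field


@dataclass
class Session:
    """A VoIP session: the equivalent of one phone call."""
    caller: str
    callee: str
    state: str = "ringing"  # ringing -> established


@dataclass
class ToySignallingServer:
    """Hypothetical in-memory stand-in for a provider's server."""
    online: set = field(default_factory=set)

    def connect(self, endpoint: str) -> None:
        # An app signs in, much like an email client connecting to its server.
        self.online.add(endpoint)

    def request_call(self, caller: str, callee: str) -> Session:
        # The server can notify the recipient only if both parties are connected.
        if caller not in self.online or callee not in self.online:
            raise LookupError("both endpoints must be connected")
        return Session(caller, callee)

    def accept(self, session: Session) -> Session:
        # The recipient picks up; a real server would now route the voice stream.
        session.state = "established"
        return session


server = ToySignallingServer()
server.connect("asha@example.org")
server.connect("ravi@example.org")
call = server.accept(server.request_call("asha@example.org", "ravi@example.org"))
print(call.state)
```

The endpoints look like email addresses, as described earlier, and the session object plays the role that a phone call plays in telephony.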
Advantages of VoIP
Phone communication requires dedicated hardware. You must use a telephone with a subscribed landline connection or a mobile phone with a SIM card whose service has been enabled.
The cost of a phone call depends on the distance between the participants. There are local, national and international calls all with different tariffs.
Phone networks can only be created and maintained by licensed service providers. You cannot start your own landline or GSM network. It is possible to set up a local telephone network, also known as an intercom, as long as all the phones are within the same building or locality. But if your office is spread around the country or the globe, you cannot simply start a landline or a mobile network of your own. That would be illegal. You need to purchase a service from BSNL, Vodafone, Rogers, AT&T, Movistar, etc. depending on where you are.
The phone numbers you receive from phone companies cannot be fenced such that only those from your company can call you. Phone numbers are globally accessible and anyone with your number can call you, whether they work in your organisation or not.
In contrast, VoIP can be used from a laptop, a desktop computer or a mobile phone with Internet access. Even a mobile phone which has no SIM card, but has access to the Internet over WiFi, is ready for VoIP. All you need is a software app that can connect to the VoIP network of your choice.
Since all the calls are over the Internet, you only pay for your Internet package, whether you speak to your spouse in the next room or to your friend on the other side of the globe.
Also, it is possible to set up your own VoIP network for communication between the multiple offices owned by your company across the world. Based on your access control, the rest of the Internet may be allowed to dial into your VoIP network.
Proprietary VoIP networks
Just like any software solution, VoIP comes in proprietary and open solutions. In proprietary solutions, you need to download the software or purchase the VoIP hardware that is released by the company providing the solution. The dedicated software or hardware automatically connects to the VoIP servers of the solution-providing company, without telling you where it connects to and without allowing you to set any parameters for the connection. Usually, proprietary solutions allow people to talk using only their software and only to those people who have joined their network and who also use their software. Cross-platform VoIP calls did work for certain companies, but the solutions didn’t last.
Skype (now owned by Microsoft) was one of the first VoIP solutions. They broke ground at the start of the millennium, making VoIP a household name. Other solutions followed over the next decade: Google Talk, Yahoo voice chat, etc. Other than Skype, most other solutions were simply voice add-ons to their chat applications. With the success of Android and iPhone, Viber and FaceTime became some of the earliest VoIP apps available for smartphones. The success of Viber prompted WhatsApp to add audio and video to their otherwise text-only application. Google scrapped their erstwhile Google Talk solution and built a new solution called Hangouts. Hangouts remains one of the very few VoIP offerings that works purely from a browser and does not need an app. Google have not settled for their robust Hangouts app and instead confused the users of their Play Store with another app named Duo. Other companies such as Zoom started their own VoIP solutions mainly with business conference calls in mind, where multiple participants can hold meetings by signing into a pre-determined ‘meeting room’.
All of the above solutions are proprietary and incompatible with each other. Skype cannot talk to WhatsApp, whose users cannot join a Hangout, whose users cannot converse with a user over FaceTime. This dreaded situation is called ‘vendor lock-in’. Users have to either confine themselves to talking to people who use the same solution, or download / purchase multiple solutions to include everyone. Sometimes, even that’s not possible. For instance, even if an Android user is ready to pay an arm and a leg, Apple simply will not make an Android FaceTime app.
What if we could use an open VoIP protocol that lets us set up our own VoIP server, one that can talk to other open VoIP servers? Instead of being tied in to specific software apps from vendors, users can have a choice of apps ranging from basic to advanced, which can be configured to connect to any VoIP server of their choice. Why not two VoIP servers? One for family and one for work. Can we build an ecosystem of VoIP apps and providers, where a user can simply change from app to app until he / she finds one that is right for him / her and then can simply switch from one VoIP provider within that favourite app to another based on the time of the day, one for casual purposes and one for work?
Yes, we can. That is the promise of two protocols named SIP (Session Initiation Protocol) and H.323, which are open protocols. We will learn more details about those protocols in part 2 of the series. In this post, I’ll simply summarise that any SIP-compliant app can connect to any SIP-compliant VoIP provider. The providers of one network may set up their servers such that the contacts and sessions of every user are accessible to other providers.
Radio communications assisted the world during the World Wars. Telephones changed the face of voice communication between the 1950s and 1990s. But VoIP has democratised the way people speak to each other, making voice and video calls available to anyone with a device that has Internet access, a front-facing camera and a microphone, at a very reasonable price.
With VoIP making its way into the Internet of Things, it may even be possible in the future to speak to your home’s virtual assistant to get things done while you are away. Your home’s security camera may automatically dial you with a VoIP call if it detects something fishy. It remains to be seen how much more advanced this technology will turn out to be.
One of the biggest technological advances in this decade is the usage of machines hosted by server giants like Amazon and Google for our businesses, in the form of AWS and Google Cloud Platform. Not only do these companies offer machines, but they also offer specific services such as databases, services to send SMS, online development tools and backup services. These services are collectively referred to as PaaS or Platform-as-a-Service.
Over the last two years, another new concept has rapidly caught on, mainly thanks to Amazon’s Lambda. We call this FaaS or Function-as-a-Service, where instead of running an entire software program or a website throughout the day, we simply run a single function, such as sorting a list of a million names or converting 50 videos from MPEG to AVI, on a remote machine which stays on only for as long as our function runs and then shuts down. By not keeping machines running all day, maintenance and operational costs go down significantly. This particular way of running machines for a specific short-term purpose and then shutting them down is now termed ‘serverless’ architecture.
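To make the idea of a single function concrete, here is a minimal sketch of the name-sorting example written in the shape that AWS Lambda’s Python runtime expects: a function that takes an event and a context. The function name `handler` and the `names` key inside the event are assumptions of this sketch, not something the platform prescribes.

```python
def handler(event, context):
    """Entry point that the platform would call once per invocation.

    `event` carries the input; the "names" key is an assumption of this sketch.
    """
    names = event.get("names", [])
    # Sort case-insensitively so that "arun" is not pushed behind "Zoya".
    return {"sorted": sorted(names, key=str.casefold)}


# A local call that imitates one invocation. Nothing stays running afterwards.
result = handler({"names": ["Meera", "arun", "Zoya"]}, None)
print(result)
```

The function holds no state between calls, which is what allows the provider to start a machine for it on demand and shut it down when the call returns.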
In the last article, “Introduction to clean architecture: Part 1”, we saw how clean architecture is a set of principles for designing software such that the purpose of a software program is clear on seeing its code, while the details about what tools or libraries are used to make it are buried deeper, out of sight of the person who views it. This is in line with real-world products such as buildings and kitchen tools, where a user knows what they are seeing rather than how they are made.
In this article, we will see how a very simple program is designed using clean architecture. I am going to present only the blueprint of a program. I won’t use any programming language, staying true to one of the principles of clean architecture, i.e. it doesn’t matter which programming language is used.
The simple program
In our program, we will allow our system to receive a greeting ‘Hi’ from the user and greet him/her back with a ‘Hello’. That’s all we need to study how to produce a good program blueprint with clean architecture.
In our system, we have a single user who greets our system. Let’s call him/her the greeter. Let’s just use the word ‘system’ to describe our greeting application. We have just one use case in our system, which we can call ‘Greet and be greeted back’. Here’s how it will look.
The greeter greets our system.
On receiving the greeting ‘Hi’ (and only ‘Hi’), our system responds with ‘Hello’, which the greeter receives.
Any greeting other than ‘Hi’ will be ignored and the system will simply not respond.
This simple use case has two aspects.
It is comprehensive. It covers every step, with all inputs and outputs. It distinctly says that only a greeting of ‘Hi’ will be responded to and that other greetings will be ignored without response. No error messages, etc.
The use case also has obvious omissions. The word ‘greet’ is a vague verb which doesn’t say how it’s done. Does the greeter speak to the system and the system speak back? Does the greeter type at a keyboard or use text and instant messaging? Does the system respond on the screen, shoot back an instant message or send an email? As far as a use case is concerned, those are implementation details, the decisions for which can be deferred until much later. In fact, input and output systems should be plug-and-play, where one system can be swapped for another without any effect on the program’s core working, which is to be greeted and to greet back.
The EBI system
Once the requirements are clear, we start with the use cases. The use case is the core of the system we are designing, and it is converted into a system of parts known as EBI, or Entity-Boundary-Interactor. There are five components within the EBI framework. Every use case in the system is converted to an EBI using these five parts.
Interactor (I): The interactor is the object which receives inputs from the user, gets work done by entities and returns the output to the user. The interactor sets things in motion, like the conductor of an orchestra, to make the execution of a use case possible. There is exactly one interactor per use case in the system.
Entities (E): The entities contain the data, the validation rules and logic that turns one form of input into another. After receiving input from the user, the interactor uses different entities in the system to achieve the output that is to be sent to the user. Remember that the interactor itself must NEVER directly contain the logic that transforms input into output. In our use case, our interactor uses the services of an entity called GreetingLookup. This entity contains a table of which greeting from the user should be responded to with which greeting. Our lookup table only contains one entry right now, i.e. a greeting of ‘Hi’ should be responded to with ‘Hello’.
Usually, when a system automates or moves online something that already exists in the real world, its entities closely resemble their real-world equivalents in name, properties and functionality. E.g. in an accounting system, you’ll have entities like account, balance sheet, ledger, debit and credit. In a shopping system, you’ll have shopping cart, wallet, payment, items and catalogues of items.
Boundaries (B): Many of the specifications in a use case are vague. The use case assumes that it receives input in a certain format regardless of the method of input. Similarly, it sends output in a predetermined format, assuming that the system responsible for showing it to the user will format it properly. Sometimes, an interactor or some of the entities will need external services to get things done. Access to such services is through a boundary known as a gateway.
E.g., in our use case, inputs and outputs may take several forms, such as typed or spoken text. The lookup table may seek the services of a database. Databases are an implementation detail that lies outside the scope of the use case and EBI. Why? Because we may even use something simpler, such as an Excel sheet or a CSV file, to create a lookup table. Using a database is an implementation choice rather than a necessity.
Request and response model: While not abbreviated in EBI, request and response models are important parts of the system. They specify the form of the data that is sent across the boundaries with each request and response. In our case, the greeting from the user to the system and vice-versa should be sent in the form of plain English text. This means that if our system works on voice-based inputs and outputs, the voices must be converted to plain English text and back.
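The blueprint itself stays independent of any programming language, but for readers who like to see shapes in code, here is one possible rendering of the five parts in Python. Only GreetingLookup is named in this article. Every other name (GreetingRequest, GreetingResponse, GreetingGateway, InMemoryGreetingGateway, GreetInteractor and all the method names) is an assumption made for this sketch.

```python
from dataclasses import dataclass
from typing import Optional, Protocol


@dataclass(frozen=True)
class GreetingRequest:
    """Request model: the greeting as plain text."""
    text: str


@dataclass(frozen=True)
class GreetingResponse:
    """Response model: the reply as plain text."""
    text: str


class GreetingGateway(Protocol):
    """Boundary to wherever the lookup table is stored."""
    def load_table(self) -> dict: ...


class InMemoryGreetingGateway:
    """One interchangeable storage choice; a CSV file or a database would also do."""
    def load_table(self) -> dict:
        return {"Hi": "Hello"}


class GreetingLookup:
    """Entity: holds the rule for which greeting earns which reply."""
    def __init__(self, table: dict):
        self._table = dict(table)

    def reply_for(self, greeting: str) -> Optional[str]:
        return self._table.get(greeting)


class GreetInteractor:
    """Interactor for the 'Greet and be greeted back' use case."""
    def __init__(self, gateway: GreetingGateway):
        self._lookup = GreetingLookup(gateway.load_table())

    def execute(self, request: GreetingRequest) -> Optional[GreetingResponse]:
        # The interactor only coordinates; the rule itself lives in the entity.
        reply = self._lookup.reply_for(request.text)
        return GreetingResponse(reply) if reply is not None else None


interactor = GreetInteractor(InMemoryGreetingGateway())
print(interactor.execute(GreetingRequest("Hi")))
print(interactor.execute(GreetingRequest("Good morning")))
```

A greeting other than ‘Hi’ yields no response object at all, which mirrors the use case: the system simply does not respond.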
With our EBI system complete to take care of the use case, we must realise that ultimately the system will be used by humans and that different people have different preferences for communication. One person may want to speak to the system, while another prefers instant messaging. One person may want to receive the response as an email message, while another may prefer the system to display it on a big flat LCD with decoration.
A controller is an object which takes the input in the form the user gives and converts it into the form required by the request model. If a user speaks to the system, then the controller’s job is to convert the voice to plain English text before passing it on to the interactor.
On the other side is a presenter that receives plain text from the interactor and converts it into a form that can be used by the UI of the system, e.g. a large banner with formatting, a spoken voice output, etc.
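As a rough illustration, the sketch below pairs a controller for typed input with a presenter that produces a banner. The names KeyboardController, BannerPresenter and demo_interactor are hypothetical, and the interactor is reduced to a plain function so that the example stands on its own.

```python
from typing import Callable, Optional

# Stand-in for the interactor boundary: plain text in, plain text (or None) out.
Interactor = Callable[[str], Optional[str]]


def demo_interactor(greeting: str) -> Optional[str]:
    return "Hello" if greeting == "Hi" else None


class KeyboardController:
    """Converts raw typed input into the plain text that the request model expects."""
    def __init__(self, interactor: Interactor):
        self._interactor = interactor

    def handle(self, raw_input: str) -> Optional[str]:
        # Typed input arrives with a trailing newline and stray spaces.
        return self._interactor(raw_input.strip())


class BannerPresenter:
    """Converts the plain-text reply into something a banner UI could show."""
    def present(self, reply: Optional[str]) -> str:
        if reply is None:
            return ""  # the system does not respond to other greetings
        return f"*** {reply.upper()} ***"


controller = KeyboardController(demo_interactor)
presenter = BannerPresenter()
print(presenter.present(controller.handle("Hi\n")))
```

Swapping the keyboard for voice would mean writing another controller with the same `handle` method. The interactor and the presenter would not notice the difference.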
Being able to test individual components is a big strength of the clean architecture system. Here are the ways in which the system can be tested.
Use case: Since the use case in the form of EBI is separated from the user interface, we can test the use case without having to input data manually through keyboards. Testing can be automated by using a tool that can inject data in the form of the request model, i.e. plain text. Likewise, the response from the use case can be easily tested since it is plain text. The individual entities and the interactor can also be tested separately.
Gateway: The gateways such as databases or API calls can be individually tested without having to go through the entire UI and use case. One can use tools that supply mock data to see if the inputs to and outputs from databases and services on the Internet work correctly.
Controllers and presenters: Without involving the UI and the use case, one can test if controllers are able to convert input data to the request model correctly or if presenters are able to convert the response model to output data.
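Here is a small sketch of what such automated tests could look like, using plain assertions and a fake gateway that supplies mock data. FakeGateway and the one-line `greet` function are stand-ins invented for this example; a real project would test its actual interactor and would probably use a test runner.

```python
class FakeGateway:
    """Mock data source standing in for a database or a web service."""
    def __init__(self, table):
        self.table = table
        self.calls = 0

    def load_table(self):
        self.calls += 1
        return self.table


def greet(gateway, text):
    """Tiny stand-in for the use case: plain text in, plain text or None out."""
    return gateway.load_table().get(text)


def test_known_greeting_gets_reply():
    assert greet(FakeGateway({"Hi": "Hello"}), "Hi") == "Hello"


def test_unknown_greeting_is_ignored():
    assert greet(FakeGateway({"Hi": "Hello"}), "Hey") is None


def test_gateway_is_consulted_once():
    gateway = FakeGateway({"Hi": "Hello"})
    greet(gateway, "Hi")
    assert gateway.calls == 1


if __name__ == "__main__":
    for test in (test_known_greeting_gets_reply,
                 test_unknown_greeting_is_ignored,
                 test_gateway_is_consulted_once):
        test()
    print("all tests passed")
```

No keyboard, screen or database is involved, which is the point: each piece is exercised through plain text and mock data alone.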
Freedom to swap and change components
Interactors: Changes to the interactors are often received well by the entire system. Interactors are usually algorithms and pieces of code that bind the other components together, typically a sequence of steps on what to do. Changes to the steps do not change any functionality in the other components of the system.
Entities: Entities are components that contain a lot of data and rules relevant to the system. Changes to entities will usually lead to corresponding changes in the interactor to comply with the new rules.
Boundaries: Boundaries are agreements between the interactor and external components like controllers, presenters and gateways. A change to the boundary will inevitably change some code in the external components, so that they continue to comply with the boundary.
UI: With a solid use case in place, you can experiment with various forms of UI to see which one is most popular with your users. You can experiment with text, email, chat, voice, banner, etc. The use case and the gateway do not change. However, some changes to the UI can cause a part of the controller and the presenter to change, since these two are directly related to how the UI works.
Controller and presenter: It is rare for the controller or presenter to change in their own right. A change to the controller or presenter usually means that the UI or the boundary has also changed.
Clean architecture separates systems such that the functionality is at the core of the system, while everything else, such as the user interface, storage and web, is kept at the periphery, where one component can be swapped for another. Hopefully, our example has given you a good idea about how to approach any system with clean architecture in mind.
Let’s say you want to follow live sports, but you are at work. You are not allowed to watch the game on video, but there are several other options. One of them is to visit the event’s official website and use the section for live text commentary.
In text commentary, you will see a running log of the moments of the game. In football, the minute of the game serves as the lead, followed by a brief description of the action, e.g. “57′: Buffon saves a Ronaldo free kick by diving to his left.” In cricket, you have the over and ball, followed by what happened, e.g. “17.4: Two runs. Kohli drives the ball to long-off, but Watson is in position to prevent the boundary.”
Not only sports, but other events like parliamentary sessions, seminars and board meetings are published the same way, e.g. “10:32 am: Mr. John Doe, Member of Parliament from Acme constituency, takes the mic.”
Let’s take this to the world of technology, with sensors recording the current temperature or an online analytics tool that tracks every minute you spend watching a movie online, shopping or reading a blog. These tools record a minute-by-minute log of what’s going on in the surroundings or what the user is doing on a website. In fact, minute-by-minute is an understatement. It is more like microsecond-by-microsecond. To help record this, the world of technology uses a special type of database called the time-series database.
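To give a feel for the shape of such data, here is a toy sketch in Python: readings are appended in time order and later queried by time range. ToyTimeSeries and its methods are invented for this illustration. A real time-series database does far more (compression, retention, aggregation), and this sketch claims none of it.

```python
from bisect import bisect_left, bisect_right
from datetime import datetime, timedelta


class ToyTimeSeries:
    """Append-only log of (timestamp, value) pairs, kept in time order."""
    def __init__(self):
        self._times = []
        self._values = []

    def append(self, when: datetime, value: float) -> None:
        # Like live commentary, entries only ever arrive at the end of the log.
        if self._times and when < self._times[-1]:
            raise ValueError("readings must arrive in time order")
        self._times.append(when)
        self._values.append(value)

    def between(self, start: datetime, end: datetime) -> list:
        # Binary search works because the timestamps are already sorted.
        lo = bisect_left(self._times, start)
        hi = bisect_right(self._times, end)
        return list(zip(self._times[lo:hi], self._values[lo:hi]))


start = datetime(2020, 1, 1, 10, 0, 0)
series = ToyTimeSeries()
for second, temperature in enumerate([21.5, 21.7, 22.0, 22.4]):
    series.append(start + timedelta(seconds=second), temperature)

window = series.between(start + timedelta(seconds=1), start + timedelta(seconds=2))
print([value for _, value in window])
```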
You probably know that a pressure cooker cannot be opened when it is very hot with plenty of steam built up inside. If you yank at the weight, it will protest with a loud hiss. But there are no problems opening the same cooker either before cooking or after it has completely cooled down. How can the same apparatus behave differently under different conditions for the same procedure: opening the lid? Software engineers will say that the cooker is using the state design pattern.
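A minimal sketch of the cooker in Python could look like this. The class names Cool, Pressurised and PressureCooker, and the exact replies, are made up for the illustration. The point is that `open_lid` is one procedure whose outcome depends on the state object currently in place.

```python
class Cool:
    """State before cooking or after the cooker has cooled down."""
    def open_lid(self, cooker):
        return "lid opens"

    def heat(self, cooker):
        cooker.state = Pressurised()

    def cool_down(self, cooker):
        pass  # already cool


class Pressurised:
    """State while steam has built up inside."""
    def open_lid(self, cooker):
        return "loud hiss, lid stays shut"

    def heat(self, cooker):
        pass  # already hot

    def cool_down(self, cooker):
        cooker.state = Cool()


class PressureCooker:
    def __init__(self):
        self.state = Cool()

    # The cooker delegates every action to whichever state it is currently in.
    def open_lid(self):
        return self.state.open_lid(self)

    def heat(self):
        self.state.heat(self)

    def cool_down(self):
        self.state.cool_down(self)


cooker = PressureCooker()
print(cooker.open_lid())
cooker.heat()
print(cooker.open_lid())
```

Notice that PressureCooker contains no `if hot:` check. The behaviour changes because the state object is swapped.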
After treating Adil, his doctor prescribes medicines in a complicated dosage. There are two medicinal tablets, one green and one red. The green medicine is to be taken 3 times in the coming week: Monday, Thursday and Saturday and the red one, 4 times: Monday, Wednesday, Friday and Sunday.
As Adil leaves, the doctor hands him two strips of medicine: 7 green tablets and 7 red ones. Adil is confused. He asks, “7 each? But doesn’t the prescription call for 3 of these and 4 of those, on a schedule?” The doctor replies, “How can I be sure that you won’t forget the complicated schedule? That’s why I have given you 7. Have one of each every day.” Adil is aghast. “But… isn’t that an overdose?” “No, it’s not. 4 of the green tablets are simply mint candies. 3 of the red ones are strawberry candies. Inside the strips, the real tablets are interspersed with identical-looking candies as per your dosage schedule. You don’t have to worry. Just habitually have one tablet of each colour every day. Start from the top of each strip.”
Brilliant! The doctor took a complex decision-making process away from Adil and just let him build a simple habit: one green tablet and one red tablet every day. The real tablets will fight against the illness that Adil approached the doctor with. The candies are there to simply … do nothing! In design pattern parlance, the doctor just used the Null Object pattern.
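Adil’s two strips translate into code quite directly. In the sketch below, Medicine, Candy and build_strip are hypothetical names. Candy is the null object: it offers the same `take` method as a real tablet and deliberately does nothing useful.

```python
DAYS = ["Mon", "Tue", "Wed", "Thu", "Fri", "Sat", "Sun"]


class Medicine:
    def __init__(self, colour):
        self.colour = colour

    def take(self):
        return f"{self.colour} tablet acts on the illness"


class Candy:
    """Null object: same interface as Medicine, no effect."""
    def take(self):
        return "nothing happens"


def build_strip(colour, dosage_days):
    # Real tablets on dosage days, look-alike candies on the rest.
    return [Medicine(colour) if day in dosage_days else Candy() for day in DAYS]


green_strip = build_strip("green", {"Mon", "Thu", "Sat"})
red_strip = build_strip("red", {"Mon", "Wed", "Fri", "Sun"})

# Adil's habit has no schedule checks: one of each colour, every day.
for day, green, red in zip(DAYS, green_strip, red_strip):
    print(day, "|", green.take(), "|", red.take())
```

The loop at the end never asks what day it is or whether a tablet is real. That decision was made once, when the strips were built.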
If you have ever visited a government office, you were probably directed from one counter to another to get tasks done. Why doesn’t the same person do everything? This is because the work is divided into small tasks and each government official is given the responsibility of only one task. Once done, that official will direct you to the next one. You are seeing the chain of responsibility design pattern in action.
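The counters can be sketched as a linked chain of handlers, each doing its one task and then sending you onward. OfficeCounter and the three task names are invented for this example. It follows the office story told here, where every counter does its bit before passing you on.

```python
class OfficeCounter:
    """One official with one task, who knows only the next counter in line."""
    def __init__(self, task, next_counter=None):
        self.task = task
        self.next_counter = next_counter

    def handle(self, completed_tasks):
        # Do this counter's single task...
        completed_tasks.append(self.task)
        # ...then direct the visitor to the next counter, if there is one.
        if self.next_counter is not None:
            return self.next_counter.handle(completed_tasks)
        return completed_tasks


chain = OfficeCounter(
    "verify documents",
    OfficeCounter("collect fee", OfficeCounter("issue certificate")),
)
print(chain.handle([]))
```

The visitor only ever talks to the first counter. Adding, removing or reordering counters changes the chain, not the visitor.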
Have you ever noticed? If you say “Hi” to Jay, he replies the same. However Jyoti always replies, “What’s up!”. When you get angry and say “Shut up!”, the reactions are different too. Jay is calm, but firm. His reply is, “Hey, watch your word, buddy!”. But Jyoti loses it and says, “Shut up yourself, dumbo!”