Elements of 3D computer graphics

Over the past decade, computer graphics have moved increasingly to the world of 3D. With the release of 3D graphics libraries like Unreal and Unity, more players are entering the game, figuratively and literally. But it takes a lot to change your perception from a 2D world to one of 3D. It is more than just adding a third axis named Z. We wouldn’t simply want to see flat 2D shapes floating around in our 3D world, would we?

Continue reading “Elements of 3D computer graphics”

Compilers and Interpreters

Khushi loves to write children’s stories. She has written several stories in English in her country India. Muskan, a friend of Khushi, works for Samsung. She goes on a business trip to South Korea and takes her young daughter along. She has some children’s books in one of her bags. A local colleague loves the book and asks Muskan if her friend can make the books available in South Korea in Korean.

Khushi is delighted at the opportunity and is ready for a trial book tour in Seoul, where she is to showcase three trial stories to South Korean children. If they love it, there’ll be a deal with a publisher. However she has a problem. She doesn’t speak a word of Korean. The children of South Korea aren’t comfortable with English. So Khushi has two choices. She can use help, translate the three stories into Korean and take the Korean versions with her so that the children can read them on their own. Otherwise she can seek someone in Seoul who speaks both the languages fluently. Then she can read the story line-by-line in English and the interpreter can translate each line into Korean for the children at the same time.

What does children’s books have to do with our topic today? There are two types of programming languages in software. Some of them use a compiler and some use an interpreter. Using the children’s books as an example, we will learn how the two work.

Compiler: When Khushi takes a translated book to South Korea

Khushi decides to do the hard work upfront. She hires a translator on Upwork and gets the three stories translated to Korean. She can then take the Korean books to South Korea and hand them to the children, who can read the books without help. She can take the Korean books to as many Korean communities as she likes, without having to look for a translator every time.

However, Khushi faces the following issues. On one hand, she needs to keep the English versions as her source products, whereas the Korean versions are her by-products. Instead of just 3 products (the original 3 stories), Khushi needs to maintain 6 products (including the 3 translations).

If Khushi finds the need to rewrite several parts of her stories, she must discard the Korean translations and have them done again. Thus, the process of translating beforehand is too sensitive to changes.

If someone invites Khushi to Netherlands, then the Korean translations are useless. She has to find a Dutch translator who will translate her books to Dutch. That will add 3 more products to her collection, taking the total to 9.

Finally, translation is a lot of work upfront. The entire book has to be translated from start to finish so that it can be used in South Korea. The time taken for translation will depend on the length of the stories and the complexity of the sentences used.

This is how a program using a compiler behaves. Let’s assume that you write your computer program in C or C++. A computer cannot understand either of those two languages. It can understand only something called machine language. Machine languages are different for different types of computers, just like Korean is to South Korea and Dutch to Netherlands. E.g. the machine language understood by a desktop computer is different from that of a mobile phone and that of a microwave oven. The microwave oven cannot understand the machine language meant for a desktop computer, the same way a Dutchman cannot understand Korean.

Once you write your program, you need to run a compiler to translate the program into the machine language of the target machine. You need to compile one copy for each type of computing device you wish to run the program on. You need to maintain each compiled copy along with your original program in C/C++. The original one is for you to develop and grow your program in the future, whereas the compiled versions are for distribution. As long as two machines understand the same machine language, you can use the same compiled version on both. This is similar to the way that Khushi can use the Korean translation at Seoul as well as Incheon, but not at Amsterdam.

Every time you make changes to your program, you need to re-compile to all the target machine language versions, discard the older compilations and distribute the new ones to the target machines.

Interpreter: When Khushi hires an accompanying translator within South Korea

In this scenario, Khushi does not get her book translated upfront. She takes the English book with her to South Korea with the promise that she will have an interpreter accompanying her all the time. She reaches this agreement with Netherlands too. As long as the two countries keep their end of the promise, Khushi does not need to get any translation done upfront. Her luggage to both the countries includes the three stories that she wrote in English. When Khushi reaches the target community in South Korea, she pulls out her stories and starts reading them line-by-line. As soon as she finishes reading a line, the interpreter speaks the whole line in Korean, thus reaching Khushi’s audience.

Khushi can makes as many changes to the stories as she likes. She is guaranteed that her latest version will be translated line-by-line during her next story-reading session.

But life isn’t all rosy with this approach too. Khushi needs to make absolutely sure that the target country has a translator available to her. Without the accompanying translator, her English books are just useless.

Also, Khushi needs a translator every time she wants to showcase her stories to the same community or different communities inside South Korea. Just because her translator translated her story orally line-by-line doesn’t mean that anything was recorded in writing. The line-by-line translation needs to be done all over again.

Finally, take the case of a country like Kenya or Congo. These countries are poor and translation is not a lucrative job. The quality of translators is usually mediocre, with limited vocabulary. The result may not do justice to Khushi’s hard work.

This is how interpreted languages like Javascript, Python and Java work. The program in the original language is directly copied to as many target machines as required without undergoing any compilation, with the promise that the target machines have the interpreter of the required programming language installed. The programmer can make as many changes as required and immediately copy the new version to the targets, which will use the new version during the next run.

If the target device does not have an interpreter for the programming language, then the whole program is useless. There is no way to run it in that device.

The program interpreter behaves similar to Khushi’s language interpreter. It converts the program instructions to machine language line-by-line, but doesn’t note down the translations. As a result, the conversion is done every time the program is run.

A hybrid approach

What if Khushi carries the English versions to South Korea, but the interpreter also notes down the Korean version at the same that she translates line-by-line. By the time the interpreter is finished translating to the first community, there will be a newly written Korean version that Khushi can simply distribute to the other communities in the same country. Khushi doesn’t have to get everything translated upfront before leaving for South Korea. Nor does she have to take her interpreter with her everywhere. Khushi needs to remember one thing. She needs to take her interpreter along when changes are made to a story. The interpreter can then note down the Korean translation of the new version of the story when reading it for the first time to a Korean community.

Modern interpreters behave this way. They read programs line-by-line and issue the converted machine language instruction to the processor. But they also note down the conversion into a special area called the interpreter cache. As long as the source program hasn’t changed, the interpreter will take instructions from the cache rather than reading from the source language and translating yet again.

Comparison of compilers and interpreters

Here is a tabular comparison of compilers and interpreters.

A compiler needs to be installed on the machine where the program is developed.

An interpreter must be installed on every machine that the program needs to run on.

The program in the source language, e,g. C/C++, must also be maintained only in the development machine.

The program in the source language must be present on every device where it is to be run.

The source program must be converted into a machine language translation, which must be copied to every device. This must be done for every type of computing device.

The source program is copied as it is to every device where it should run. The same source is copied to all types of computing devices.

The entire program must be converted to machine code up-front before copying to the target machine and running. For a huge program such as an operating system, this may take several hours.

The interpreter picks up one line from the program and converts only that line to machine code. This is done for every line until the entire program is interpreted.

Once a program is converted to machine code and copied to the target device, no more translation takes place.

Since the program exists in source form, the interpreter must translate every line every time. But if an interpreter has a cache, then it only needs to translate once.

Which approach should I use?

There is a reason why 99% of the programming languages today are interpreter-based. The developer needs to maintain only the program in the source code and does not need to learn the tools to convert it to machine language form. The onus of converting lies with the company that writes the interpreter for the various computing platforms.

There are two reasons why a developer would want to look at writing some modules in compiled languages.

  1. If you are running your program on a device that has very little storage and memory, it may not be possible to fit the interpreter into a low-capacity disk and then load it into limited memory. A compiled program, which is already in machine code for that device, would not need the interpreter at all. This is similar to Khushi’s situation in Kenya where good interpreters aren’t available.
  2. If you are running a program that needs to be really, really fast, then the cost of translating every line before it is executed will catch up to you and make the experience bad. You will never see video editors or 3-D modelling apps made in Java or Python. The delay in interpretation will visible cause the frame rates to drop below 25 frames per second and you will notice what is called ‘jank’, i.e. video frames being delivered slower than what the eyes perceive as motion. As a result, video applications or applications that need real-time response are always written in compiled languages. Directly reading machine code is much faster than interpreted code.

Conclusion

Compiler and interpreters follow two different approaches, but their goal is the same. Both convert a program from it source language to something that a computing device will understand. It’s just that a compiler does it with the approach of someone setting curd from milk, i.e. starting well in advance before the product is used, whereas the interpreter follows the approach of someone who chops herbs right before tossing them into a pan, just in time, so that they don’t go stale.

Message queue and task queue

Consider that you are preparing for a party at your home. There are several tasks to be done. Walls are to be decorated, plates are to be washed, food is to be made ready and you may need an extra shoe rack for guests to leave their footwear. Most probably, you do not do everything alone. You get the help of the members of your household. In fact if the load is too much, then you even ask your friends for help.

When more people work on the tasks for organising a party, you get multiple things done parallely. But with more participation comes the need for co-ordination and communication. Some of the tasks may be related to each other, e.g. someone who has promised to wipe washed dishes and stack them on your shelf will need the one who washes dishes to complete his/her task first. Usually, one of the participating persons needs to stay on top of what gets done and who does what. He/she is like the captain of a sports team. When everyone smoothly communicates the status of his/her tasks and one or two persons are aware of each task’s status, things go on quite well despite the complexity.

Modern computers work the same way. They have multiple processors capable of executing multiple things at the same time. Even a single modern processor performs time slicing, i.e. divides its attention among several processes / threads at the same time. Processes need to communicate among each other. For this, they use a system called a message queue.

Continue reading “Message queue and task queue”

How biometrics work: Face to face

Humans have been recognising their peers by identifying their faces. This process of recognition is several millenia old and we have lost track of which species of homo sapiens actually started it. Other than immediately recognising twins, this method has served us quite well and fails very rarely. It is one of the those activities that seems to happen instantly. So fast that we are unable to study how it happens.

The technology has recently been ported to machines. Work has gone on for decades, but it is only now that we can expect a reasonable performance from our laptops and phones to recognise our face using front camera. Even then, there are false positives or embarassing mismatches. Computers are nowhere near humans when it comes to the speed and accuracy of facial recognition. But they are getting there eventually.

In this post, we shall see the various technologies used to recognise human faces.

Recognition by texture

Texture recognition is the most commonly used form of facial recogntion because it has been around for a while and it is cheaper to implement. The image of a face coming through a camera is dissecting into a grid. Each cell in the grid deals with a certain section of the face. Specific features such as tone of the skin, presence of birthmarks, etc. are considered. These are noted down in the appropriate section of the grid. The shape of the jaw, ears and lips are paid particular attention to. Features like facial hair are sought to be ignored.

Facial recognition by texture needs only a single camera taking a reasonably close view of the face. However, it is prone to failures due to the following. If the lighting in the room changes or is too low, then the mapping algorithm gets thrown off or completely fails. While the algorithm seeks to ignore temporary details like facial hair, ornaments, sun glasses or small cuts, it doesn’t always succeed and may actually consider those features as unique identities. Recording of such temporary data causes the match algorithm to fail the next time when the person does not have those features anymore.

Recognition by reflection

This method attempts to build a model of the contours of the face and use that for recognition. There are tiny differences in the shapes of the face between two individuals, which can be used to identify and distinguish. The method is similar to the way an X-Ray works.

A pattern of a beam of light in the invisible spectrum (such as infrared), a steady sound inaudible to the human ear or radio waves are sent toward the face of the person. The reflection from the face is received. These are detected by sensors on the face detection device. The device then measures the time after which each beam / sound / wave that was sent is received. If the reflection takes less time, then that part of the face must have been closer. A reflection that takes more time must be coming from the part of the face that is farther away. This data is then used to build the model of the contour of the face.

For this method to work effectively, the face must be reasonably still, otherwise the calculations are thrown off and the contour is not mapped correctly.

Recognition with several angles

In this method, more than one camera are placed around the face at different angles. Multiple images are captured. All of them can together build a 3D model of the face. The more the number of cameras used, the more accurate the model built. Between 3 to 6 cameras are generally used. Anything below 3 cannot build a good model and anything over 6 is overkill.

Thermal detection

Thermal detection uses a heat-sensitive camera to create a heat map of the face. Heat maps are different for different faces. This method can be used even in the dark. Heat maps may differ due to weather conditions or changes in body conditions.

This is an emerging technology already used in zoology and tracking of animals in the dark. The human body is a warm entity, but different organs and different parts in each organ are warmer or colder than each other. The warm and cold areas of the face can vary from person to person. This variation can be used for identification and differentiation.

A heat sensor generates a heat map of the face by detecting warm and cool areas. The warmer areas are recorded as brighter shades and the cooler ones as darker shades.

The advantage of this method is that it works even in the dark. However, the body’s heat signature can vary due to changing weather, sickness or exposure to places of different temperatures. This can cause variations in the heatmap of the same person.

Which one to use?

None of the above methods is 100% accurate on its own. Each method is used in combination with another. Sometimes they are used simultaneously and other times, the failure of one method leads to a fallback to another method.

Overall, facial recognition is not as accurate as fingerprints or optical scan. I would wait for 5 more years to give this technology time. Facial recognition cannot be used as a mainstream authentication method yet.

Conclusion

While we as humans need no time to distinguish among the people we know and to identify them, a computer is just getting started with facial recognition and there are several years to go before the technology matures. It will be a while before a computer can put a name to a face or a face to a name.

How biometrics work: Within the blink of an eye

Usually you see and then identify things. You can make out that a cup is a cup, a ball is a ball and that your friend Jay is Jay. You identify because you see.

But what if we flip this around? You see, so you are identified. What technology makes it possible that the same eyes that you use to identify things around you, are now used to identify you? That technology has been around for a while now. I’ll give you two in fact. Retina scan and iris scan.

Continue reading “How biometrics work: Within the blink of an eye”

How biometrics work: At your fingertips!

The use of fingerprints has been at the forefront of biometrics even during the non-digital times. From dipping your finger into a pad of ink and transferring your print to a sheet of paper, we have come a long way in the form of touching an illuminated plate of glass.

Continue reading “How biometrics work: At your fingertips!”

Two-step authentication with OATH

You have just moved your desired product to the shopping cart of your favourite online shopping website. The website asks for your preferred mode of payment. You enter the details of your bank/card and click to proceed. Just before your payment is complete, you have one more step to complete. On the screen in front of you is a text box which expects 4 digits. But where do you get those 4 digits from? That’s when your phone pings. You reveal the source of the ping to be an SMS that contains 4 digits. The same digits that you need to enter on the screen to complete your payment. With the magic digits entered, your payment is unlocked and your product will soon be on its way.

Welcome to the world of OATH or Open AuTHentication, a specification that uses two steps of authentication to make sure that your transactions are truly safe. Let’s read how OATH works.

Continue reading “Two-step authentication with OATH”

Build a pipeline to success with CI/CD

Meet Laura. She’s a popular author and has written 25 books. She’s been writing for 10 years. It takes her about 6 months to roll out a book, from draft to publication. Her books are on an average 150 pages long.

Continue reading “Build a pipeline to success with CI/CD”

Learn the A-Z of fonts

Your writing has a distinct style. There is a way in which you shape the letter ‘a’ that is unique to you. This is true for every letter and number you write. Your writing style is formal and measured when you are writing a proposal for your company or filling a government form. It is totally different, very casual, when you write a love note to your spouse. It may turn chaotic when you are simply scribbling ideas on a notepad, post-its or a napkin. One way or the other your writing style changes depending on the application. Also, your style of writing differs from your friends’s, boss’s or spouse’s style. It is unique to you.

The same idea is captured in the digital world in the form of font. A particular font is suited for a particular application. E.g. sans serif fonts for websites, serif fonts for newspapers and mono space fonts to represent engineering codes or mathematical formulae. And then, there is one font that is good for you. It appeals to your taste or represents your character.

Let’s dive into the world of fonts in this post and find out how they work.

Continue reading “Learn the A-Z of fonts”

Follow the trend, use time-series DB

Let’s say you want live sports, but you are at work. You are not allowed to watch the game on video, but there are several other options. One of them is to visit the event’s official website and use the section for live text commentary. In text commentary, you will see a running log of the moments of the game. In football, the minute of the game serves as the lead, followed by a brief description of the action, e.g. “57′: Buffon saves a Ronaldo freekick by diving to his left.”. In cricket, you have the over and ball, followed by what happened, e.g. “17.4: Two runs. Kohli drives the ball to long-off, but Watson is in position to prevent the boundary.” Not only sports, but other events like parliamentary sessions, seminars and board meetings are published the same way, e.g. “10:32 am: Mr. John Doe, Member of Parliament from Acme constituency, takes the mic.” Let’s take this to the world of technology, with sensors recording the current temperature or an online analytics tool that watches every minute you watch a movie online, shop or read a blog. These tools record a minute-by-minute about what’s going on in the surroundings or what the user is doing on a website. In fact, minute-by-minute is an understatement and it is rather microsecond-by-microsecond. To help record this, the world of technology uses a special type of database called the time-series database. Continue reading “Follow the trend, use time-series DB”