Software FAQ

Presented below are explanations of several questions that may arise when speaking of software, IT and the Internet.

What is software? How is it created?

Software can be classified as application software and system software. Application software is software that people use by choice, like text editors, games, picture viewers, sound players, CD players, schedule-keepers, file compression / decompression software, and so on. The applications listed above are confined to a local system. But there are several applications that can use a network and enable communication, like the web browser that you are using to read this, your mail client, your chat client, or a multi-player game.

Software is typically a combination of several programs. Programs are sets of instructions to the computer that "runs" them. Programs can make the computer do various kinds of work: calculations, graphic displays, input and output, communications, and storage.

Is every computer programmable?

Computers are basically a collection of hardware components that talk to each other to get a particular job done. The keyboard senses what key you press, and conveys it to whichever component wanted to know what you pressed. The mouse sends your clicks and movements towards the monitor, with several components in between, so that the monitor can display the mouse pointer moving with the mouse.

The computer is not always bought with a single purpose, like the TV or the washing machine or the fridge. Computers are bought because, by running some applications on them, you can do some wonderful things. If you are a guitarist or a drummer in a band, you would probably want your computer to be a synthesizer, a sequencer, or a recorder and live telecaster. If you are a businessman, you would want the computer to be a financial assistant. A teacher would want to use the computer as an encyclopaedia, a dictionary, and a reference library. These functions do not typically come with the PC (personal computer) that you buy. It is the specific software (applications) you run that makes your computer into a gaming station, a telephone, a video player, a filing system, or anything else you might use the computer for.

So you acquire the necessary software from a vendor or other source, and "install" it on your computer. (The software is programmed using generic instructions on the programmer's computer, so it needs to be customized to your computer's configuration.)

Not everybody who owns pots can be a potter. But that is not the case with computers.

moreinfo programming

Programming is done using programming tools. A program begins as a text file of instructions written in a specific programming language. The file is then processed by the programming tools (these are also software and fall under the category of system software - namely compilers, interpreters, linkers, debuggers, and so on), and converted into "executable files". Each piece of software is made by writing several programs, and converting them into a few executable files (a.k.a. executables). The executables can be "run" by the computer. But at any one time there may be multiple executables in action in a computer, and so a controlling software, the overseer or the OS (operating system), makes the executable into a process, and allows it to run in a controlled fashion. An executable in action is called a process, and the OS is like a super-process (also in the family of system software). It runs from the moment you turn your PC on until you turn it off.
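As a small illustration of a text file of instructions becoming a process, here is a sketch in Python. Python is an interpreted language, so there is no separate executable file here; the interpreter (itself system software) runs the text file directly. The file name hello.py and the message are made up for the example.

```python
import subprocess
import sys
import tempfile
from pathlib import Path

# The "text file of instructions": a one-line program (contents are made up).
SOURCE = 'print("hello from a separate process")\n'


def run_program(source: str) -> str:
    """Save the source text to a file, run it as a new process, return its output."""
    with tempfile.TemporaryDirectory() as folder:
        program = Path(folder) / "hello.py"  # hypothetical file name
        program.write_text(source)
        # sys.executable is the Python interpreter that is running right now.
        # The OS starts it as a new process and hands it our file.
        result = subprocess.run(
            [sys.executable, str(program)],
            capture_output=True,
            text=True,
            check=True,
        )
    return result.stdout


if __name__ == "__main__":
    print(run_program(SOURCE), end="")
```

While the little program runs, the OS treats it like any other process, side by side with your browser and everything else.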

The above paragraph means that, given the system software, even you can be a programmer, and make your computer do things that the existing software cannot do. You can program your computer into an alarm clock, for instance, which plays your favorite music at specified times of the day, or says nice things about you :)
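Here is a minimal sketch of such an alarm clock in Python. It only prints a message instead of playing music, and the function names and the message are invented for the example.

```python
import time
from datetime import datetime, timedelta


def seconds_until(alarm_hour: int, alarm_minute: int, now: datetime) -> float:
    """Return the number of seconds from `now` to the next alarm time."""
    alarm = now.replace(hour=alarm_hour, minute=alarm_minute, second=0, microsecond=0)
    if alarm <= now:
        # Today's alarm time has already passed, so ring tomorrow.
        alarm += timedelta(days=1)
    return (alarm - now).total_seconds()


def run_alarm(alarm_hour: int, alarm_minute: int, message: str = "Wake up, you are wonderful!") -> None:
    """Sleep until the alarm time, then say something nice."""
    time.sleep(seconds_until(alarm_hour, alarm_minute, datetime.now()))
    print(message)
```

Calling run_alarm(7, 0) would wait until the next 7:00 and then print the message. Playing music instead would need an audio library, which is left out here.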

What is the Internet?

The Internet can be described by comparison to a trade fair (or carnival) that is open all year round. In such a fair, there are usually several shops and stalls that allow people to look around, sample the wares on display, and make purchases. Let us consider that our trade fair has about 30 million food stalls, a few million game stalls, and a few book stores, vehicle displays, clothes stores, post offices, and telephone exchanges.

The purpose of the Internet is to present to customers a wide (indeed practically unlimited) variety of products and services, ALL UNDER ONE ROOF.

moreinfo www

Now think of just the food stalls. Consider that there is this big highway, and the fair is located on either side of it.

The ones who came to dine are sitting at tables that are put up in all possible spots along the highway. Customers do not leave their table. From where they sit, they cannot see even one food stall. So they use a page-boy (for food stalls only), who takes down the name of the food place. Then the page sends his brother to look up the address of the stall in a giant directory called the Domain Name System (DNS). Once the brother returns, the page takes the address, uses the highway to go to the stall, and places a request. The stall server responds, and gives the page-boy what he asked for. The page-boy returns to the table he left from, bringing along the goodies.

The best part is, after the page-boy returns you can send him to a different food stall this time. You can eat food from different stalls all at the same table at the same session.

Lo, this is what the WWW is all about. WWW stands for World Wide Web. It is like the collection of all food stalls. But food stalls are only one aspect of the Internet. The page-boy of food stalls is your web-browser (that you are currently using to view this), the food stall is the web server, and the highway is the worldwide network.

The web-browser is networking software that uses your computer's connection to the Internet (via phone or cable or Digital Link or satellite or wireless or ...). It has the address bar, or the URL (Uniform Resource Locator) bar. The address bar itself does not do any locating. It is what you type in, containing the name of the server (food stall), that is the locator (URL). Your browser uses the server name in the URL to make a lookup in the DNS (giant directory) and find the server's network address. Computers do have addresses when they connect to any network; when the network is the Internet, the address is called an Internet address or IP address (Internet Protocol address).

The browser obtains the IP address from the DNS (the giant directory). Note that the DNS is also a server, so it has its own type of page-boy; the web browser used a brother of his. This lookup is done by almost all network applications, because names are easier to remember than IP addresses. The name yahoo.com is easier than 203.221.89.65 (just some random IP address, may not be yahoo). So almost all network applications have a brother to look up the DNS.
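In Python, the "brother" who looks up the directory can be sketched with the standard socket module. The function name lookup is made up. The resolver parameter is an assumption added so that the lookup can be tried without a network connection, by passing in a pretend directory.

```python
import socket


def lookup(name, resolver=socket.gethostbyname):
    """Ask the directory for the IP address of `name`.

    Returns the address as text, or None if the name is not listed.
    """
    try:
        return resolver(name)
    except socket.gaierror:
        # The directory has no entry for this name.
        return None


if __name__ == "__main__":
    # Needs a working Internet connection; the address printed will vary.
    print(lookup("yahoo.com"))
```

By default the operating system's own resolver does the work, which is the same service your browser relies on.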

To open your own food stall (web server) you need to register your stall's address and name with the DNS, so that your stall has a URL that customers can type in their address bar. After that you can put up your wares for business.

Technically the web-browser is called the HTTP user agent (a middle-man, someone who does your work for you). HTTP stands for Hyper-Text Transfer Protocol. And the web server is called the HTTP server. A web page is said to consist of not just text, but Hyper-text. When it carries non-text content, like multimedia (audio, images, graphics), it is called Hyper-media.

moreinfo email

Electronic mail or e-mail is another revolutionary service offered by the Internet. Just as you use a web client for the web, you use a "mail client" to access email. The most basic features provided by a mail service are an inbox (which accumulates mail that is addressed to you), and a mechanism to send mail to other people.

The mail client is also networking software, which uses your computer's connection to the Internet. The mail client operates in tandem with your mail server. For example, if your mail server is called "skytrooper.com", then for you to become a user, you need to get a user id on skytrooper.com. I have a user id on gmail.com, which is "ssk.ram". So my fully qualified email id is ssk.ram@gmail.com. (More on servers in the next question.)

Your inbox is maintained on the server. This means the server acts like a post office, and your inbox is like your PO box. Mail does not come home to your door, it only comes up to your post office (mail server). The mail client is like a helper who checks your PO box, and brings letters home (to your computer system, that is). So you have to give your user id (and password) to your mail client, which allows it to authenticate with the server, and locate your PO box.

The PO box is typically structured as a single file, the name of the file being your user id. New mail coming to you is appended to the file (in the common mbox format, at the end), with suitable separators between two mail messages. Your mail client also maintains a copy of the file. It finds out whether new mail is present by comparing the file on the server with what it has already copied. If it finds something new (that it has not already copied locally), it adds the new mail to its local copy. Once copied, you need not connect to the server again to read the message (you read it from the local copy).
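The single-file mailbox and the local copy can be sketched in a few lines of Python. This is only a toy: it assumes each message starts with a separator line beginning with "From " (as in the mbox format), and the function names are invented for the example.

```python
SEPARATOR = "From "  # assumed separator: a line starting with "From " opens a message


def split_messages(mailbox_text: str) -> list:
    """Cut the single mailbox file into individual messages."""
    messages, current = [], []
    for line in mailbox_text.splitlines():
        if line.startswith(SEPARATOR) and current:
            # A new separator line closes the message collected so far.
            messages.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        messages.append("\n".join(current))
    return messages


def fetch_new(server_text: str, local_copy: list) -> list:
    """Return the messages on the server that are not yet in the local copy."""
    return [m for m in split_messages(server_text) if m not in local_copy]
```

A real mail client talks to the server over a mail protocol rather than reading the file directly, but the idea of "copy only what I do not have yet" is the same.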

To send mail, you need to know the fully qualified mail id of the addressee. Your mail client may have a feature allowing you to compose/type out new mail. The form should have the following fields as a minimum:

destination mail id (possibly also Cc and Bcc)
Message subject
Message body (where you type in the message)
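These fields map directly onto Python's standard email library. The sketch below only composes a message and does not send it. The sender id "you@skytrooper.com" and the message text are made up for the example.

```python
from email.message import EmailMessage


def compose(sender, to, subject, body, cc=None):
    """Fill in the minimum fields of a mail message."""
    message = EmailMessage()
    message["From"] = sender      # normally filled in by the mail client itself
    message["To"] = to            # destination mail id
    if cc:
        message["Cc"] = cc
    message["Subject"] = subject  # message subject
    message.set_content(body)     # message body
    return message


if __name__ == "__main__":
    mail = compose("you@skytrooper.com", "jack@mumbo.com", "Lunch?", "Shall we meet at noon?")
    print(mail)
```

Printing the message shows the header lines followed by the body, which is roughly what travels from server to server.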

After you have finished typing out your message to the addressee, you send the mail. The mail goes to skytrooper.com, your mail server (this is the first step). Suppose the addressee is "jack@mumbo.com". The task of your skytrooper.com server is to find the server called mumbo.com. This it does using the DNS (about which sufficient info is present in moreinfo www). Once it finds mumbo.com, it sends the mail you composed to mumbo.com. The mumbo.com server has several users and consequently several PO boxes. mumbo.com uses the "To" field to find the PO box ("jack") to which it has to append the mail you composed. Let's hope that there actually exists a PO box called jack on mumbo.com (otherwise mumbo.com will return an error mail to skytrooper.com, using the "From" address which your mail client itself puts on each mail you compose).

This is how your mail travels to jack's mailbox. When jack logs in to mumbo.com, his mail client finds that a new message has arrived, and copies it into his computer system. Now jack can read the mail you sent him.

Before your message actually gets into jack's mailbox file, mumbo.com may run its anti-spam software and anti-virus software on your message. Spam is unwanted mail: a message that was never really meant for you, sent out in bulk without a care for who is going to receive it. Software is used to generate random mail addresses, and the messages go to all of them. Please do not indulge in spam. Email is the lifeline of several businesses, and the thousands of mail servers in existence are already clogged with such unwanted mail.

Unlike a web client, a simple mail client is authentication-based, and once you log in to a server, you cannot access your other mailboxes (on other servers) until you log out. Using the same analogies of moreinfo www, unless you finish up with food from one stall, you cannot order food from another :)

What is a server exactly?

In the world of networking, there are two models in which computers can communicate with each other. One is the peer-to-peer model. It is the way you talk with your neighbor - direct, face to face. But when you talk to him on the phone, you dial a number, which goes to your telephone exchange. The exchange locates the phone with the number you dialed, and then makes that phone ring. Once he picks up, you say something. Whatever you say first goes to the telephone exchange, and then reaches his ear.

The telephone exchange scenario is the client-server model. In the case of the WWW, there is nobody you are talking to but the telephone exchange itself (which is the server). In the case of Internet chat or voice chat or email, the person you talk to is just like you, a normal client. Apart from addressing the server, you also address the person you want to talk to. The email address (or chat address) thus has two parts - the user's name, and the server's name. "ssk.ram@gmail.com" addresses the user 'ssk.ram' and also the server 'gmail.com'.
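Splitting an address into its two parts takes only a few lines of Python. This is a simplified sketch that just cuts at the last "@" sign; the function name is invented, and real mail software checks much more.

```python
def split_address(address: str) -> tuple:
    """Split 'user@server' into (user, server)."""
    user, at_sign, server = address.strip().rpartition("@")
    if not at_sign or not user or not server:
        raise ValueError(f"not a fully qualified mail id: {address!r}")
    return user, server


if __name__ == "__main__":
    print(split_address("ssk.ram@gmail.com"))
```

The server part is what gets looked up in the DNS; the user part only means something to that server.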

moreinfo protocols

There are servers for different things. Each server does only one thing: listen for requests and, when one comes, respond. The type of response and what the server does with the requests are up to the "kind" of server (the *protocol* it follows). HTTP is a protocol which serves hypertext files. SMTP (Simple Mail Transfer Protocol) is used to send email to a mail server, and by mail servers to pass mail on to each other. POP (Post Office Protocol) is used by mail clients to fetch the mail waiting in someone's mailbox on the server. IRC (Internet Relay Chat) is used for chat.

But someone needs to run the telephone exchange (server). A server is also software. A normal computer on the Internet, running the server process (a.k.a. service), becomes a server machine.
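The "listen for requests, respond when one comes" idea can be tried on a single computer with Python's socket module. In this sketch the server and the client both run on your own machine (address 127.0.0.1), the server answers exactly one request and then stops, and the made-up "protocol" simply repeats the request back. All function names are invented for the example.

```python
import socket
import threading


def serve_once(listener: socket.socket) -> None:
    """Wait for one request, send one response, then stop."""
    connection, _client_address = listener.accept()
    with connection:
        request = connection.recv(1024)
        connection.sendall(b"You asked for: " + request)


def ask(port: int, request: bytes) -> bytes:
    """Act as the client: connect, send the request, read the whole reply."""
    with socket.create_connection(("127.0.0.1", port), timeout=5) as client:
        client.sendall(request)
        client.shutdown(socket.SHUT_WR)  # tell the server we have finished asking
        chunks = []
        while True:
            data = client.recv(1024)
            if not data:
                break
            chunks.append(data)
    return b"".join(chunks)


def demo() -> bytes:
    """Start the one-shot server in the background and send it one request."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as listener:
        listener.settimeout(5)
        listener.bind(("127.0.0.1", 0))  # port 0 lets the OS pick a free port
        listener.listen()
        port = listener.getsockname()[1]
        worker = threading.Thread(target=serve_once, args=(listener,))
        worker.start()
        reply = ask(port, b"menu")
        worker.join()
    return reply


if __name__ == "__main__":
    print(demo())
```

A real service would loop forever, accepting one request after another, instead of stopping after the first.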

Each server machine may be running several services. Yahoo has a mail service, a chat service, and an HTTP service too. Chances are that a single server machine runs more than one service. That is why the URL has a protocol part, and an address part. Your address bar may say something like "http://geoc....". That means you are using the HTTP protocol. The URL tells the browser not only the server machine's name, but also the service name. A service uses a network abstraction called a service port. This is just a number (ranging from 1 to 65535) that identifies the service. Each service is said to listen on a specific port. No two services on a single server machine can listen on the same service port. The port for HTTP is 80, for SMTP it is 25, for POP, 110, and for FTP, 21. There are so many common services that international standards have set apart the first 1024 numbers (0 to 1023) as standard service numbers. If you want to invent a new service, it should use a port numbered 1024 or higher.
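The protocol part and the address part of a URL can be pulled apart with Python's urllib.parse module. The sketch below also fills in the standard port when the URL does not name one. The small port table holds only the four ports quoted above, and the function name and example URLs are made up.

```python
from urllib.parse import urlsplit

# Standard ports quoted in the text above.
DEFAULT_PORTS = {"http": 80, "smtp": 25, "pop": 110, "ftp": 21}


def service_of(url: str):
    """Return (protocol, server name, port) for a URL."""
    parts = urlsplit(url)
    # Use the port written in the URL if there is one, else the standard port.
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)
    return parts.scheme, parts.hostname, port


if __name__ == "__main__":
    print(service_of("http://www.example.com/index.html"))
    print(service_of("http://www.example.com:8080/index.html"))
```

The second URL shows how a service listening on a non-standard port is addressed: the port number is written after the server name.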

You can see why the web browser is called an HTTP user agent. You are the user or client. And the browser you use is your agent. The agent can talk to only one thing, the HTTP server whose address you give in the address bar. In the case of email or any other protocol, the term 'client' frequently replaces 'agent', so people say 'mail client' or 'chat client' or 'ftp client'. Some even say 'web client' for browser.

Because the Internet is a world-wide network, people from different parts of the world will be accessing a service (through their user agents). Even if it is night time where you are, someone from the other side of the world may want to access your service. So server machines are almost always running, and so too are the services and the operating system on them. Stopping the service, or turning your server machine off, leads to a 'server not found' or a 'connection refused' response to a user agent.

What is a search engine?

Web sites like Google and AltaVista provide an HTML form in their web pages. This form is capable of sending special requests to the HTTP service. The search terms you type in the form are sent in a query string to the HTTP service. The service extracts the search terms from the query string, and does a lookup into its cache.
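A query string can be built and taken apart with Python's urllib.parse module. In this sketch the parameter name "q" and the address search.example are assumptions made for the example; each search site chooses its own names.

```python
from urllib.parse import parse_qs, urlencode, urlsplit


def build_search_url(base: str, terms: str) -> str:
    """What the form does: attach the search terms as a query string."""
    return base + "?" + urlencode({"q": terms})


def extract_terms(url: str) -> list:
    """What the service does: pull the search terms back out."""
    query = parse_qs(urlsplit(url).query)
    return query.get("q", [""])[0].split()


if __name__ == "__main__":
    url = build_search_url("http://search.example/search", "free web space")
    print(url)
    print(extract_terms(url))
```

Spaces and other special characters are encoded on the way out and decoded on the way in, so the service sees the terms as you typed them.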

Google maintains a cluster of computers, which share a huge volume of storage capacity. Google has web spiders, which are agents that crawl the WWW. They go from one web server machine to another, sending back to the Google clusters information about each web page on the machine. The clusters have their bit of work to do on the data the spiders send. The data is cached in the cluster storage, and different mechanisms are used to analyze the data and create ranks for the pages in cache. The ranking technology is different for different search engines, so we mostly get different results for the same query on different search engines. Your search terms are used to look up the cache in the clusters, and the results of the search are returned in order of relevance.

Yahoo started off as a web directory. Web directories were not only searchable; real humans categorized each new site that came up, and created a hierarchical directory of the entire WWW. Yahoo, by the way, is one of the best-known names on the WWW.

Ok. By now you know a great deal about computers and the Internet. There are still some details left, like firewalls, proxies, gateways, the DNS and an ocean of detail on security. But finding out about all this is more fun when you learn hands-on.
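The cache-and-rank idea can be shown with a toy example in Python. This is not how Google or any real search engine ranks pages; it is a bare-bones sketch that counts how many search terms each page contains. The page addresses and texts are made up.

```python
from collections import defaultdict


def build_index(pages: dict) -> dict:
    """Map each word to the set of page addresses that contain it."""
    index = defaultdict(set)
    for address, text in pages.items():
        for word in text.lower().split():
            index[word].add(address)
    return index


def search(index: dict, terms: str) -> list:
    """Return matching pages, the ones matching the most terms first."""
    scores = defaultdict(int)
    for term in terms.lower().split():
        for address in index.get(term, ()):
            scores[address] += 1
    # Highest score first; ties are broken alphabetically by address.
    return sorted(scores, key=lambda address: (-scores[address], address))


if __name__ == "__main__":
    pages = {
        "a.example/guitar": "guitar lessons for beginners",
        "b.example/drums": "drum and guitar shop",
        "c.example/food": "food stalls",
    }
    print(search(build_index(pages), "guitar lessons"))
```

The index is built once from what the spiders bring back; each search is then only a lookup, which is why results come back so quickly.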

How come everything comes free on the Internet?

The Internet started as an academic network, where academicians and scientists shared their knowledge freely on it. We still continue to do so, and most scientific organizations have an online presence.

The corporate part of the Internet is funded by ads. Take Google for example. Its search service is available free, in that you do not have to pay per search or some such thing. Then how do the Google engineers make a living? Through the ads they place to the right of the search results. You can pay Google to place an ad for your organization when someone searches for something relevant.

Other schemes include pay-per-click, pay-per-visit, etc., where you get paid when an ad that you put on your page is clicked, or when the advertiser's site is visited from your site, and so on. This strategy is used even by charity organizations like The Hunger Site, and a similar Indian counterpart.

Clicking on either one, you will find that it leads to an advertiser page. The page visit is logged on the servers of hungersite or bhookh, and they exchange the logs for money from the sponsors. This is one of the perks of having your own site. But there are some hosts which provide even webspace for free, on condition that some ads be shown on all pages. That is why you see the extra tab to the right of this page.

There are some multimedia sites which require that you pay per view. Some others require that you become a member of the site, by paying a membership fee. This money enables them to provide freebies on their site, or sell stuff at dirt cheap rates.

How did India become so good at IT?

Because of the networking infrastructure that connects every part of the world, work that needs to be done in one part can be despatched to some other part. This model of business is called offshore outsourcing. Highly skilled IT professionals could be put to work on jobs generated elsewhere in the world. And the standard of living means that IT labor charges in India are far lower than in Britain or the Americas, and will continue to be so for a long time. These factors, along with political impetus, made organizations in the US and other places reach out to India to get their work done. It was more profitable for them, and people in India got jobs. Soon people found that Indians became quite good at IT (I would attribute it to the presence of about 15 completely dissimilar languages in India, making it natural for people to easily learn computer programming languages), and the outsourcing model continues. China has taken the same route as India, and is even more competent and cost-effective.

What are web-based services?

Services like email can be provided even if you do not have a mail client. All you need is a web browser to access gmail or yahoo mail.

moreinfo webmail

As stated in moreinfo email, a simple mail client provides two features - accumulation of inbox messages, and the compose feature. As it is, the mailbox is maintained only on the server, and the mail client fetches new additions to the mailbox file on the server, and maintains a local copy. But this enforces the restriction that you need to carry your mail client with you wherever you go. The mail client that you have been using for the past six months has accessed probably about 1000 messages. These messages are marked "read" on your local copy only. The server does not know how many you have read, and how many are new messages to you. When you change to a new mail client, it does not know how many you read earlier with your old client. So the new client reports that you have a thousand unread messages!

The only hindrance to your mobility is the local copy mechanism, which is in reality hidden completely from the user. Does the read and unread concept need to remain at the client? What if it can be taken to the server? Then the only job of the client is authenticating to the server. The mailbox file is reinterpreted into an HTML page (dynamically, upon the 'open' request on any message), and the authentication can be done using an HTML form, or the secure HTTP protocol (HTTPS), from the browser itself. The server contains code which interprets the mailbox file and responds with HTML to the browser. Sending mail is done using HTML forms again. The 3 essential fields described in moreinfo email can be placed on an HTML page, and the message and other details sent using the HTTP methods (GET or POST - using query strings) to the web server. The web server converts the query information into a mail message and sends it to the mail service (probably sitting in the same machine or cluster).
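The server-side part of this idea can be sketched in Python: the server, not the client, remembers which messages have been read, and turns the mailbox into HTML on request. The data layout (a list of id, sender, subject) and the function names are assumptions made for the example, not how any particular webmail service works.

```python
from html import escape


def open_message(message_id: int, read_ids: set) -> None:
    """The user opened a message: the server records it as read."""
    read_ids.add(message_id)


def render_inbox(messages: list, read_ids: set) -> str:
    """Turn the mailbox into an HTML list, marking each message read or unread."""
    rows = []
    for message_id, sender, subject in messages:
        status = "read" if message_id in read_ids else "unread"
        # escape() keeps text inside a mail from being treated as HTML.
        rows.append(f'<li class="{status}">{escape(sender)}: {escape(subject)}</li>')
    return "<ul>\n" + "\n".join(rows) + "\n</ul>"


if __name__ == "__main__":
    inbox = [(1, "jack@mumbo.com", "Lunch?"), (2, "ssk.ram@gmail.com", "Software FAQ")]
    seen = set()
    open_message(1, seen)
    print(render_inbox(inbox, seen))
```

Because the read marks live on the server, any browser on any computer shows the same inbox state after you log in.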

After the message reaches the mail server, the sequence of steps is the same as earlier. Hotmail pioneered the concept of webmail, where the web browser can fetch email for you from the mail server by using the web server as a go-between.

The mechanism of tunneling other protocols into the HTTP protocol lets many applications be web-based. Many banks use web-based applications, so that setting up new banks is very simple (customers can also perform transactions online if they know the URL used by the bank, and how to navigate the web interface). To play chess online, you do not need chess software that shows you the pieces, and animates the moves, with all the bells and whistles. You can instead play online, on a smaller, less feature-rich board presented on a web page. Web technology has also risen to the occasion, with features like callbacks, session storage, and collaborative development capabilities.

All of this rests on the single premise that all modern computers come with a web browser installed. But a web interface is still no match for the dedicated application/client.