How Internet Cookies Work 
                  
                   
                  Internet cookies 
                    are incredibly simple, but they are one of those things that 
                    have taken on a life of their own. Cookies started receiving 
                    tremendous media attention back in February 2000 because of 
                    Internet privacy concerns, and the debate still rages. 
                   
                  On the other hand, cookies provide 
                    capabilities that make the Web much easier to navigate. The 
                    designers of almost every major site use them because they 
                    provide a better user experience and make it much easier to 
                    gather accurate information about the site's visitors. 
                  In this edition of  , 
                    we will take a look at the basic technology behind cookies, 
                    as well as some of the features they enable. You will also 
                    have the opportunity to see a real-world example of what cookies 
                    can and cannot do using a sample page that we developed here 
                    at  . 
                   
                  
                    
                    
                    Cookie Basics
                    In April of 2000 I read an in-depth article on Internet privacy 
                    in a large, respected newspaper, and that article contained 
                    a definition of cookies. Paraphrasing, the definition went 
                    like this: 
                  
                    Cookies are programs that Web 
                      sites put on your hard disk. They sit on your computer gathering 
                      information about you and everything you do on the Internet, 
                      and whenever the Web site wants to it can download all of 
                      the information the cookie has collected. [wrong]
                  
                  Definitions like that are fairly 
                    common in the press. The problem is, none of that information 
                    is correct. Cookies are not programs, and they cannot run 
                    like programs do. Therefore, they cannot gather any information 
                    on their own. Nor can they collect any personal information 
                    about you from your machine. 
                  Here is a valid definition of 
                    a cookie: 
                  
                    A cookie is a piece of text that a Web server 
                      can store on a user's hard disk. Cookies allow a Web site 
                      to store information on a user's machine and later retrieve 
                      it. The pieces of information are stored as name-value 
                      pairs. [Correct] 
                  
                  For example, a Web site might 
                    generate a unique ID number for each visitor and store the 
                    ID number on each user's machine using a cookie file. 
                  If you use Microsoft's Internet 
                    Explorer to browse the Web, you can see all of the cookies 
                    that are stored on your machine. The most common place for 
                    them to reside is in a directory called c:\windows\cookies. 
                    When I look in that directory on my machine, I find 165 files. 
                    Each file is a text file that contains name-value pairs, 
                    and there is one file for each Web site that has placed cookies 
                    on my machine. 
                  You can see in the directory that 
                    each of these files is a simple, normal text file. You can 
                    see which Web site placed the file on your machine by looking 
                    at the file name (the information is also stored inside the 
                    file). You can open each file by clicking on it. 
                  For example, I have visited and 
                    the site has placed a cookie on my machine. The cookie file 
                    for goto  contains the following information: 
                  It appears that Amazon stores 
                    a main user ID, an ID for each session, and the time the session 
                    started on my machine (as well as an x-main value, which could 
                    be anything). 
                  The vast majority of sites store 
                    just one piece of information -- a user ID -- on your 
                    machine. But there really is no limit -- a site can store 
                    as many name-value pairs as it likes. 
                  A name-value pair is simply a 
                    named piece of data. It is not a program, and it cannot "do" 
                    anything. A Web site can retrieve only the information that 
                    it has placed on your machine. It cannot retrieve information 
                    from other cookie files, nor any other information from your 
                    machine. 
                  
                    
                    
                    How Does Cookie Data Move?
                    As you saw in the previous section, cookie data is simply 
                    name-value pairs stored on your hard disk by a Web site. That 
                    is all cookie data is. The Web site stores the data, and later 
                    it receives it back. A Web site can only receive the data 
                    it has stored on your machine. It cannot look at any other 
                    cookie, nor anything else on your machine. 
                  The data moves in the following 
                    manner: 
                  
                  
                     
                    -  
                      
If you type the URL of a Web 
                        site into your browser, your browser sends a request to 
                        the Web site for the page (see How Web Servers and the 
                        Internet Work for a discussion). For example, if you type 
                        the URL  into your browser, your browser will 
                        contact Amazon's server and request its home page. 
                     
                    - 
                      
 When the browser does this, it will look 
                        on your machine for a cookie file that Amazon has set. 
                        If it finds an Amazon cookie file, your browser will send 
                        all of the name-value pairs in the file to Amazon's server 
                        along with the URL. If it finds no cookie file, it will 
                        send no cookie data. 
                     
                    -  
                      
Amazon's Web 
                        server receives the cookie data and the request for a 
                        page. If name-value pairs are received, Amazon can use 
                        them. 
                     -  
                      
If no name-value 
                        pairs are received, Amazon knows that you have not visited 
                        before. The server creates a new ID for you in Amazon's 
                        database and then sends name-value pairs to your machine 
                        in the header for the Web page it sends. Your machine 
                        stores the name-value pairs on your hard disk. 
                     
                    -  
                      
The Web server can change 
                        name-value pairs or add new pairs whenever you visit the 
                        site and request a page. 
                     
                     
                   
                   
                  There are other pieces of information 
                    that the server can send with the name-value pair. One of 
                    these is an expiration date. Another is a path 
                    (so that the site can associate different cookie values with 
                    different parts of the site). 
                  You have control over this 
                    process. You can set an option in 
                    your browser so that the browser informs you every time a 
                    site sends name-value pairs to you. You can then accept or 
                    deny the values. 
                  
                    
                    
                    How Do Web Sites Use Cookies?
                    Cookies evolved because they solve a big problem for the people 
                    who implement Web sites. In the broadest sense, a cookie allows 
                    a site to store state information on your machine. 
                    This information lets a Web site remember what state 
                    your browser is in. An ID is one simple piece of state information 
                    -- if an ID exists on your machine, the site knows that you 
                    have visited before. The state is, "Your browser has visited 
                    the site at least one time," and the site knows your ID from 
                    that visit. 
                  Web sites use cookies in many 
                    different ways. Here are some of the most common examples: 
                    
                  
                    -  
                      
Sites can accurately determine 
                        how many people actually visit the site. It turns 
                        out that because of proxy servers, caching, concentrators 
                        and so on, the only way for a site to accurately count 
                        visitors is to set a cookie with a unique ID for each 
                        visitor. Using cookies, sites can determine:  
                      
                     
                  
                  The way the site does this is 
                    by using a database. The first time a visitor arrives, 
                    the site creates a new ID in the database and sends the ID 
                    as a cookie. The next time the user comes back, the site can 
                    increment a counter associated with that ID in the database 
                    and know how many times that visitor returns. 
                   
                  
                    -  
                      
Sites can store 
                        user preferences so that the site can look different 
                        for each visitor (often referred to as customization). 
                        For example, if you visit it offers you the ability to 
                        "change content/layout/color." It also allows you to enter 
                        your zip code and get customized weather information. 
                        When you enter your zip code, the following name-value 
                        pair gets added to MSN's cookie file: 
                       
                      Since I live in Raleigh, NC, 
                        this makes sense. 
                       
                  
                   
                  Most sites seem to store preferences 
                    like this in the site's database and store nothing but an 
                    ID as a cookie, but storing the actual values in name-value 
                    pairs is another way to do it (we'll discuss later why this 
                    approach has lost favor). 
                   
                  
                     
                    -  
                      
E-commerce sites can implement 
                        things like shopping carts and "quick checkout" 
                        options. The cookie contains an ID and lets the site 
                        keep track of you as you add different things to your 
                        cart. Each item you add to your shopping cart is stored 
                        in the site's database along with your ID value. When 
                        you check out, the site knows what is in your cart by 
                        retrieving all of your selections from the database. It 
                        would be impossible to implement a convenient shopping 
                        mechanism without cookies or something like them. 
                     
                    
                  
                   
                  In all of these examples, note 
                    that what the database is able to store is things you have 
                    selected from the site, pages you have viewed from the site, 
                    information you have given to the site in online forms, etc. 
                    All of the information is stored in the site's database, and 
                    in most cases, a cookie containing your unique ID is all that 
                    is stored on your computer. 
                  
                    
                    
                    An Example
                    To give you a simple example of what cookies and a database 
                    can do, we have created a simple history and statistics system 
                    for this article. This system runs on the   servers 
                    and lets you view your activity on the   site. Here's 
                    how it works: 
                  
                  
                    -  
                      
 When you 
                        visit   for the first time, the server creates 
                        a unique ID number for you and stores a cookie on your 
                        machine containing that ID. For example, on the machine 
                        I am using now, this is what I see in the   
                        cookie file: 
                     
                  
                   
                  There is nothing 
                    magic about the number 35,005 -- it is simply an integer that 
                    we increment each time a new visitor arrives. I was user number 
                    35,005 to come to the   site since this cookie system 
                    was installed. We could make the ID value as elaborate as 
                    we desire -- many sites use IDs containing 20 digits or more. 
                    
                   
                  
                    -  
                      
Now, whenever 
                        you visit any page on  , your browser sends your 
                        cookie containing the ID value back to the server. The 
                        server then saves a record in the database that contains 
                        the time that you downloaded the page and the URL, along 
                        with your ID. 
                     -  
                      
To see the history 
                        of your activity on  , you can go to this URL on 
                        the site: 
                     
                  
                   
                  Your browser sends your ID value 
                    from the cookie file to the server along with the URL. The 
                    page runs a piece of code that queries the database and retrieves 
                    your history on the site. It also calculates a couple of interesting 
                    statistics. Then it creates a page and sends it to your browser. 
                    
                  Try the URL for the history page 
                    now: 
                  Then go view a couple of other 
                    pages on   and try it again. You will see that the 
                    statistics change and so does the list of files. (Also note 
                    that the allows you to reset your history list whenever you 
                    like.) 
                  
                    
                    
                    Problems with Cookies
                    Cookies are not a perfect state mechanism, but they certainly 
                    make a lot of things possible that would be impossible otherwise. 
                    Here are several of the things that make cookies imperfect. 
                    
                  
                  
                    -   
                      
People often share machines 
                        - Any machine that is used in a public area, and many 
                        machines used in an office environment or at home, are 
                        shared by multiple people. Let's say that you use a public 
                        machine (in a library, for example) to purchase something 
                        from an on-line store. The store will leave a cookie on 
                        the machine, and someone could later try to purchase something 
                        from the store using your account. Stores usually post 
                        large warnings about this problem, and that is why. Even 
                        so, mistakes can happen. For example, I had once used 
                        my wife's machine to purchase something from Amazon. Later, 
                        she visited Amazon and clicked the "one-click" button, 
                        not realizing that it really does allow the purchase of 
                        a book in exactly one click. 
                      On something like a Windows 
                        NT machine or a UNIX machine that uses accounts 
                        properly, this is not a problem. The accounts separate 
                        all of the users' cookies. Accounts are much more relaxed 
                        in other operating systems, and it is a problem. 
                      
                      If you 
                        try the example above on a public machine, and if other 
                        people using the machine have visited  , then the 
                        history URL may show a very long list of files. 
                     -   
                      
Cookies get erased 
                        - If you have a problem with your browser and call tech 
                        support, probably the first thing that tech support will 
                        ask you to do is to erase all of the temporary Internet 
                        files on your machine. When you do that, you lose all 
                        of your cookie files. Now when you visit a site again, 
                        that site will think you are a new user and assign you 
                        a new cookie. This tends to skew the site's record of 
                        new versus return visitors, and it also can make it hard 
                        for you to recover previously stored preferences. This 
                        is why sites ask you to register in some cases 
                        -- if you register with a user name and a password, you 
                        can login, even if you lose your cookie file, and restore 
                        your preferences. If preference values are stored directly 
                        on the machine (as in the MSN weather example above), 
                        then recovery is impossible. That is why many sites now 
                        store all user information in a central database and store 
                        only an ID value on the user's machine. 
                      
                      If you 
                        erase your cookie file for   and then revisit 
                        the history URL in the previous section, you will find 
                        that   has no history for you. The site has 
                        to create a new ID and cookie file for you, and that new 
                        ID has no data stored against it in the database. (Also 
                        note that the allows you to reset your history list whenever 
                        you like.) 
                     
                    -  
                      
Multiple machines 
                        - People often use more than one machine during the day. 
                        For example, I have a machine in the office, a machine 
                        at home and a laptop for the road. Unless the site is 
                        specifically engineered to solve the problem, I will have 
                        three unique cookie files on all three machines. Any site 
                        that I visit from all three machines will track me as 
                        three separate users. It can be annoying to set preferences 
                        three times. Again, a site that allows registration and 
                        stores preferences centrally may make it easy for me to 
                        have the same account on three machines, but the site 
                        developers must plan for this when designing the site. 
                        
                      If you visit the history URL 
                        demonstrated in the previous section from one machine 
                        and then try it again from another, you will find that 
                        your history lists are different. This is because the 
                        server created two IDs for you, one on each machine. 
                     
                    
                   
                   
                  There are probably not any easy 
                    solutions to these problems, except asking users to register 
                    and storing everything in a central database. 
                  When you register with the   
                    registration system, the problem is solved in the following 
                    way: The site remembers your cookie value and stores it with 
                    your registration information. If you take the time to login 
                    from any other machine (or a machine that has lost its cookie 
                    files), then the server will modify the cookie file on that 
                    machine to contain the ID associated with your registration 
                    information. You can therefore have multiple machines with 
                    the same ID value. 
                  
                    
                    
                    Why the Fury Around Cookies?
                    If you have read the article to this point, you may be wondering 
                    why there has been such an uproar in the media about cookies 
                    and Internet privacy. You have seen in this article that cookies 
                    are benign text files, and you have also seen that they provide 
                    lots of useful capabilities on the Web. 
                  There are two things that have 
                    caused the strong reaction around cookies: 
                  
                  
                     
                    -  
                      
The first is something that 
                        has plagued consumers for decades but is now getting out 
                        of hand. Let's say that you purchase something from a 
                        traditional mail order catalog. The catalog company has 
                        your name, address and phone number from your order, and 
                        it also knows what items you have purchased. It can sell 
                        your information to others who might want to sell 
                        similar products to you. That is the fuel that makes telemarketing 
                        and junk mail possible. 
                      On a Web site, the site can 
                        track not only your purchases, but also the pages that 
                        you read, the ads that you click on, etc. If you then 
                        purchase something and enter your name and address, the 
                        site potentially knows much more about you than a traditional 
                        mail order company does. This makes targeting much 
                        more precise, and that makes a lot of people uncomfortable. 
                        
                      Different sites have different 
                        policies.   has a strict privacy policy and 
                        does not sell or share any personal information about 
                        our readers with any third party except in cases where 
                        you specifically tell us to do so (for example, in an 
                        opt-in e-mail program). We do aggregate information together 
                        and distribute it. For example, if a reporter asks me 
                        how many visitors   has or which page on the 
                        site is the most popular, we create those aggregate statistics 
                        from data in the database. 
                     
                    - 
                      
 The second is new. There are certain 
                        infrastructure providers that can actually create cookies 
                        that are visible on multiple sites. Double Click is the 
                        most famous example of this. Many companies use DoubleClick 
                        to serve ad banners on their sites. Double Click can place 
                        small (1x1 pixels) GIF files on the site that allow DoubleClick 
                        to load cookies on your machine. Double Click can then 
                        track your movements across multiple sites. It can potentially 
                        see the search strings that you type into search engines 
                        (due more to the way some search engines implement their 
                        systems, not because anything sinister is intended). Because 
                        it can gather so much information about you from multiple 
                        sites, DoubleClick can form very rich profiles. 
                        These are still anonymous, but they are rich. 
                     
                    -  
                      
Double Click then went one 
                        step further. By acquiring a company, DoubleClick threatened 
                        to link these rich anonymous profiles back to name and 
                        address information -- it threatened to personalize them, 
                        and then sell the data. That began to look very much like 
                        spying to most people, and that is what caused the uproar. 
                        
                      Double Click and companies 
                        like it are in a unique position to do this sort of thing, 
                        because they serve ads on so many sites. Cross-site 
                        profiling is not a capability available to individual 
                        sites, because cookies are site specific