An Introduction To Data Files

Data files, as you may have guessed, contain data. They can contain any kind
of data, but they're most often used for storing large collections of data.
Data files usually contain most of the data stored by a program. As the
program runs, it makes extensive use of its data files to get all of its
information. The actual program executable file is often little more than a
manipulator, a tool to use the program data that's stored separately in other
files. Because they typically contain all of a program's data resources, data
files are also often called resource files. Non-technical people sometimes
call them "binary files". Nothing about data files makes them inherently
more "binary" than a plain text file (after all, every file is stored the
same binary way, as a sequence of 1s and 0s), but the term implies that data
files are meant to be read by a computer, whereas text files can be read by
a human. In other words, data files are not meant to be "human readable".
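
To make the distinction concrete, here is a small Python sketch (purely an
illustration, not taken from any particular program) that stores the same
number twice: once as text, and once as a packed 4-byte integer. The value
1234567 and the little-endian layout are arbitrary choices for the example.

```python
import struct

value = 1234567  # arbitrary example value

# As text: each decimal digit becomes one readable character.
as_text = str(value).encode("ascii")

# As packed binary: a 4-byte little-endian unsigned integer.
as_binary = struct.pack("<I", value)

print(as_text)    # b'1234567'
print(as_binary)  # b'\x87\xd6\x12\x00'

# Both are just bytes; the packed form only makes sense if you know its layout.
decoded = struct.unpack("<I", as_binary)[0]
print(decoded == value)  # True
```

Both versions end up as bytes on disk. The difference is only that the text
version happens to use a layout that people can read directly.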

However, there is no reason why a human couldn't simply open up a data file,
start reading it, and understand everything that was inside it. The only
thing they would need is a knowledge of the file's format. This leads to the
trouble with understanding data files: They can have any format at all. There
is no universal standard for them, and so professional programmers often just
end up inventing their own proprietary file format, a format they create
themselves which nobody else knows about. Every data file must have a
well-defined format to make the data inside it usable, but while some file
formats are publicly available standards, other file formats are
closely guarded secrets. A programmer can structure their file format any way
they choose, and many programmers don't want people to be able to browse
through their resource information at will.
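
As a rough illustration of why knowing the format is everything, the Python
sketch below writes and reads a tiny made-up format. The "DEMO" tag, the
field sizes, and the function names write_demo and read_demo are all
invented for this example; a real format would have its own layout.

```python
import struct

# Hypothetical layout, invented for this example:
#   4 bytes  magic tag b"DEMO"
#   2 bytes  version (unsigned, little-endian)
#   2 bytes  record count (unsigned, little-endian)
#   then, per record: 2 bytes x, 2 bytes y (signed, little-endian)
HEADER = struct.Struct("<4sHH")
RECORD = struct.Struct("<hh")

def write_demo(points, version=1):
    """Serialize a list of (x, y) pairs into the made-up DEMO format."""
    data = HEADER.pack(b"DEMO", version, len(points))
    for x, y in points:
        data += RECORD.pack(x, y)
    return data

def read_demo(data):
    """Parse bytes in the DEMO format back into (version, points)."""
    magic, version, count = HEADER.unpack_from(data, 0)
    if magic != b"DEMO":
        raise ValueError("not a DEMO file")
    points = []
    offset = HEADER.size
    for _ in range(count):
        points.append(RECORD.unpack_from(data, offset))
        offset += RECORD.size
    return version, points

raw = write_demo([(10, -20), (300, 400)])
print(raw.hex())       # an opaque run of bytes if you don't know the layout
print(read_demo(raw))  # (1, [(10, -20), (300, 400)])
```

Without the layout described in the comments, the bytes in raw are just
noise. With it, reading the file back is a few lines of code.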

If you've ever studied the file structure of a computer game, you've probably
noticed that the game's graphics aren't stored as JPEGs or GIF files.
Although these are two common standards for computer graphics files,
professional game developers will rarely incorporate their game's graphics by
simply creating a bunch of standard picture files. Instead, they'll more
likely convert those picture files into a format of their own design. The
game will come with several resource files which contain all the game's data
like the graphics, sound effects, music, and so on, but nobody will be able
to use those files without knowing their data format. The game's executable
file is simply the "engine", as it is often called in the game industry; what
you actually see in the game is stored in separate data files.

Ironically, game developers usually work with standard file formats when they
create the game. Game artists will create graphics in JPEG, GIF, PCX, BMP, or
some other well-known industry standard format, but as the development of the
game continues, these pictures are typically compiled into a few large files
which collectively contain all the game's picture data. These files use
formats proprietary to the developer. Sometimes, if you view the data files,
you can even still see the original graphic file names there, as part of the
packaged resource file.
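
The packaging step might look something like the following Python sketch.
This is only one possible arrangement, assumed for illustration: the layout,
the function names pack_resources and unpack_resources, and the file names
are all hypothetical, and the "file contents" are placeholder bytes rather
than real pictures or sounds.

```python
import struct

# Hypothetical package layout, invented for this example:
#   4 bytes  entry count (unsigned, little-endian)
#   per entry: 2 bytes name length, the UTF-8 name,
#              4 bytes data length, the data itself

def pack_resources(files):
    """Combine a {name: bytes} mapping into one package blob."""
    out = bytearray(struct.pack("<I", len(files)))
    for name, data in files.items():
        encoded = name.encode("utf-8")
        out += struct.pack("<H", len(encoded)) + encoded
        out += struct.pack("<I", len(data)) + data
    return bytes(out)

def unpack_resources(blob):
    """Recover the {name: bytes} mapping from a package blob."""
    (count,) = struct.unpack_from("<I", blob, 0)
    offset = 4
    files = {}
    for _ in range(count):
        (name_len,) = struct.unpack_from("<H", blob, offset)
        offset += 2
        name = blob[offset:offset + name_len].decode("utf-8")
        offset += name_len
        (data_len,) = struct.unpack_from("<I", blob, offset)
        offset += 4
        files[name] = blob[offset:offset + data_len]
        offset += data_len
    return files

originals = {"title.pcx": b"fake image bytes", "boom.wav": b"fake sound bytes"}
package = pack_resources(originals)

print(b"title.pcx" in package)                 # True: the name is stored as-is
print(unpack_resources(package) == originals)  # True
```

Because this sketch stores each name unmodified next to its data, the
original file names show up when you look through the package, which is the
effect described above.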

This is not necessarily a bad thing; it cuts down on file clutter, because
instead of having hundreds or thousands of files for all the pictures in the
game, there are only a few packaged data files which contain everything. It's
also not always an attempt by the developer to prevent people from
manipulating the program data (although that's a possibility as well); many
companies have actually publicized their proprietary file formats, because
they want to allow other people to modify the program's data files if they
wish to do so. Although this has sometimes led to copyright concerns, it has
also led to cases in which the mass appeal of a game is greatly heightened
because of user-created modifications to the game. A classic example is the
game Doom, which was extremely popular in part because people could (and did)
create their own levels and graphics for the game by modifying the game's
resource files.

In an interesting move, id Software released the source code to Wolfenstein
3D, a game that helped make the company a star, and similarly, Apogee
released the source code to two of their games, Rise Of The Triad and Duke
Nukem 3D. What was so clever about this move was that only the source code to
the games' engines was released; users who downloaded the source still
couldn't play the games, because the resource files weren't included. Users
could see all the game code, but unless they actually had the full games,
including the resource files, they simply had an empty game shell with no
graphics, no sound, and no music. This allowed everyone to study the games'
source code for their own interest, and it also gave owners of the games an
opportunity to re-program the games however they saw fit, while still
preventing the games from being pirated, because the game resources were
missing; nobody wants to pirate just a game engine with no graphics in it.

Of course, there are many different ways to make a computer program, and
programmers have many different styles. There are many games which are
contained within a single executable file. In such cases, all the game's
resource data is contained within that one file. There is nothing necessarily
wrong with that, but if the game is large, it would result in a single very
large file, which could become difficult and inefficient to work with.
Logically organizing the game's data into groups of a few files tends to make
development much smoother. In the early days of computing, many games were
made as single files, because they were small enough that they could be
effectively managed that way; today, only very small and amateur-created
games are programmed as single files.

There are also several games which do maintain their data as several separate
files; for example, all of a game's sound effects might take the form of
separate WAV files stored in some directory. Although a bit less compact,
such an arrangement opens up interesting possibilities by allowing users to
customize their games by changing the sound effects, replacing the original
sounds with new ones of their choosing. This is an interesting scenario, but
again, in reality, only small and usually non-retail-quality games are
arranged this way.

Just as in a kitchen, food (a finite resource) is stored separately from the
forks and spoons (tools used to act on the food), so it makes sense to
package a program's executable code separately from its data. Data files are
a way of life in the computer world, and they're here to stay. Once you get
used to the concept, you'll appreciate how they make the life of a programmer
easier and more organized.

Enjoy your data!