Chapter 1: Distributed Database Concepts

This chapter describes the basic concepts and terminology of Oracle's distributed database architecture. The chapter includes:

1.1              Distributed Database Architecture

1.2              Database Links

1.1 Distributed Database Architecture

A distributed database system allows applications to access data from local and remote databases. In a homogenous distributed system, each database is an Oracle database. In a heterogeneous distributed system, at least one of the databases is a non-Oracle database. Distributed databases use a client/server architecture to process information requests.

This section contains the following topics:

1.1.1      Homogenous Distributed Database Systems

1.1.2      Heterogeneous Distributed Database Systems

1.1.3      Client/Server Database Architecture

1.1.4      Data Fragmentation


1.1.1 Homogenous Distributed Database Systems

A homogenous distributed database system is a network of two or more Oracle databases that reside on one or more machines. Figure 1.1 illustrates a distributed system that connects three databases: ORACLE1, ORACLE2, and ORACLE0. An application can simultaneously access or modify the data in several databases in a single distributed environment. For example, a single query from a Manufacturing client on local database ORACLE2 can retrieve joined data from the PRODUCTS table on the local database and the DEPT table on the remote ORACLE1 database.

Figure 1.1 Homogeneous Distributed Database

For a client application, the location and platform of the databases are transparent. One can also create synonyms for remote objects in the distributed system so that users can access them with the same syntax as local objects. For example, if one is connected to database ORACLE2 yet wants to access data on database ORACLE1, creating a synonym on ORACLE2 for the remote DEPT table allows one to issue this query:

SELECT * FROM dept;

In this way, a distributed system gives the appearance of native data access. Users on ORACLE2 do not have to know that the data they access resides on remote databases.
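For example, the synonym for the remote DEPT table could be created as follows. This is a minimal sketch; it assumes that a database link named oracle1 pointing to ORACLE1 already exists on ORACLE2 (database links are covered in Section 1.2):

CREATE SYNONYM dept FOR dept@oracle1;

Once the synonym exists, the query shown above reads the remote table transparently.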

An Oracle distributed database system can incorporate Oracle databases of different versions. All supported releases of Oracle can participate in a distributed database system. Nevertheless, the applications that work with the distributed database must understand the functionality that is available at each node in the system--for example, a distributed database application cannot expect an Oracle7 database to understand the object SQL extensions that are only available with Oracle8i.

1.1.1.1 Distributed Databases vs. Distributed Processing

The terms distributed database and distributed processing are closely related, yet have distinct meanings.


Distributed database: A set of databases in a distributed system that can appear to applications as a single data source. For example, the network of databases shown in Figure 1.1 is such a system.

Distributed processing: The operations that occur when an application distributes its tasks among different computers in a network. For example, a database application typically distributes front-end presentation tasks to client computers and allows a back-end database server to manage shared access to a database. Consequently, a distributed database application processing system is more commonly referred to as a client/server database application system.


Oracle distributed database systems employ a distributed processing architecture. For example, an Oracle database server acts as a client when it requests data that another Oracle database server manages.

1.1.1.2 Distributed Databases vs. Replicated Databases

The terms distributed database system and database replication are related, yet distinct. In a pure (that is, non-replicated) distributed database, the system manages a single copy of all data and supporting database objects. Typically, distributed database applications use distributed transactions to access both local and remote data and modify the global database in real-time.

 

The term replication refers to the operation of copying and maintaining database objects in multiple databases belonging to a distributed system. While replication relies on distributed database technology, database replication offers applications benefits that are not possible within a pure distributed database environment.

Most commonly, replication is used to improve local database performance and protect the availability of applications because alternate data access options exist.

For example, an application may normally access a local database rather than a remote server to minimize network traffic and achieve maximum performance. Furthermore, the application can continue to function if the local server experiences a failure, as long as other servers with replicated data remain accessible.

1.1.2 Heterogeneous Distributed Database Systems

In a heterogeneous distributed database system, at least one of the databases is a non-Oracle system. To the application, the heterogeneous distributed database system appears as a single, local, Oracle database; the local Oracle database server hides the distribution and heterogeneity of the data. The Oracle database server accesses the non-Oracle system using Oracle8i Heterogeneous Services in conjunction with an agent. If one accesses the non-Oracle data store using an Oracle Transparent Gateway, then the agent is a system-specific application.

 For example, if one includes a Sybase database in an Oracle distributed system, then one needs to obtain a Sybase-specific transparent gateway so that the Oracle databases in the system can communicate with it.

Alternatively, generic connectivity can be used to access non-Oracle data stores so long as the non-Oracle system supports the ODBC or OLE DB protocols. If one uses the generic agent included with the Oracle database server, then one does not need to purchase a separate transparent gateway.

1.1.2.1 Generic Connectivity

Generic connectivity allows one to connect to non-Oracle8i data stores by using either a Heterogeneous Services ODBC agent or a Heterogeneous Services OLE DB agent--both are included with the Oracle8i product as a standard feature. Any data source compatible with the ODBC or OLE DB standards can be accessed using a generic connectivity agent.

Generic connectivity is implemented as one of two types of Heterogeneous Services agents: an ODBC agent or an OLE DB agent.

Each user session receives its own dedicated agent process spawned by the first use in that user session of the database link to the non-Oracle system. The agent process ends when the user session ends.

The advantage of generic connectivity is that it does not require one to purchase and configure a separate system-specific agent. One simply needs an ODBC or OLE DB driver that can interface with the agent.

1.1.2.2 Heterogeneous Services Features

The features of Heterogeneous Services include the following:

Distributed transactions: Allows a transaction to span both Oracle and non-Oracle systems, while still guaranteeing transaction consistency.

SQL translations: Integrates data from non-Oracle systems into the Oracle environment as if the data were stored in one local database. SQL statements are transparently transformed into SQL statements understood by the non-Oracle system.

Data dictionary translations: Makes a non-Oracle system appear as an Oracle database server. SQL statements containing references to Oracle's data dictionary tables are transformed into SQL statements containing references to a non-Oracle system's data dictionary tables.

Pass-through SQL: Gives application programmers direct access to a non-Oracle system from an Oracle application, using the non-Oracle system's SQL dialect.

Stored procedure access: Allows access to stored procedures in SQL-based non-Oracle systems as if they were PL/SQL remote procedures.

NLS support: Supports multibyte character sets, and translates character sets between a non-Oracle system and the Oracle8i server.

Multithreaded agents: Takes advantage of the operating system's threading capabilities by reducing the number of required processes.

Agent self-registration: Automates the updating of Heterogeneous Services configuration data on remote hosts, ensuring correct operation over heterogeneous database links.

Generic connectivity: Allows Oracle8i to connect to a non-Oracle data store using the ODBC or OLE DB protocol.


1.1.3 Client/Server Database Architecture

A database server is the Oracle software managing a database, and a client is an application that requests information from a server. Each computer in a network is a node that can host one or more databases. Each node in a distributed database system can act as a client, a server, or both, depending on the situation.

In Figure 1.2, the host for the ORACLE1 database is acting as a database server when a statement is issued against its local data (for example, the second statement in each transaction issues a statement against the local DEPT table), but is acting as a client when it issues a statement against remote data (for example, the first statement in each transaction is issued against the remote table EMP in the ORACLE0 database).


Figure 1.2 An Oracle Distributed Database System

 

1.1.3.1 Direct and Indirect Connections

A client can connect directly or indirectly to a database server. A direct connection occurs when a client connects to a server and accesses information from a database contained on that server.

For example, if one connects to the ORACLE1 database and accesses the DEPT table on this database as in Figure 1.2, one can issue the following:

SELECT * FROM dept;

This query is direct because one is not accessing an object on a remote database.

In contrast, an indirect connection occurs when a client connects to a server and then accesses information contained in a database on a different server.

For example, if one connects to the ORACLE1 database but accesses the EMP table on the remote ORACLE0 database as in Figure 1.2, one can issue the following:

SELECT * FROM emp@oracle0;
 

This query is indirect because the object one is accessing is not on the database to which one is directly connected.

1.1.4 Data Fragmentation

If a relation r is fragmented, r is divided into a number of fragments r1, r2, ..., rn. These fragments contain sufficient information to allow reconstruction of the original relation r. This reconstruction can take place through the application of either the union operation or a special type of join operation on the various fragments. There are two different schemes for fragmenting a relation: horizontal fragmentation and vertical fragmentation. Horizontal fragmentation splits the relation by assigning each tuple of r to one or more fragments. Vertical fragmentation splits the relation by decomposing the schema R of relation r in a special way. These two schemes can be applied successively to the same relation, resulting in a number of fragments. As a running example, consider the schema Account-schema = (branch, account, balance).

1.1.4.1 Horizontal Fragmentation

The relation r is partitioned into a number of subsets r1, r2, ..., rn. Each tuple of the relation must belong to at least one of the fragments, so that the original relation can be reconstructed if needed.

A fragment can be defined as a selection on the global relation r; that is, a predicate Pi is used to construct fragment ri as follows:

    ri = σPi(r)

Branch        Account    Balance
Hillside      a-305      500
Hillside      a-226      336
Valleyview    a-177      205
Valleyview    a-402      10000
Hillside      a-155      62
Valleyview    a-408      1123
Valleyview    a-639      750

The sample account relation

Branch        Account    Balance
Hillside      a-305      500
Hillside      a-226      336
Hillside      a-155      62

Account1

Branch        Account    Balance
Valleyview    a-177      205
Valleyview    a-402      10000
Valleyview    a-408      1123
Valleyview    a-639      750

Account2

We obtain the reconstruction of the relation r by taking the union of all the fragments, i.e.:

    r = r1 ∪ r2 ∪ ... ∪ rn

As an illustration, suppose that the relation r is the account relation. This relation can be divided into n different fragments, each of which consists of the tuples of accounts belonging to a particular branch. If the banking system has only two branches, Hillside and Valleyview, then there are two different fragments:

    Account1 = σbranch = "Hillside"(account)
    Account2 = σbranch = "Valleyview"(account)

The fragments in this example are disjoint.
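In SQL terms, these selection-based fragments could be materialized as separate tables. The following is a minimal sketch, assuming an existing account table with the columns shown above:

CREATE TABLE account1 AS
  SELECT branch, account, balance FROM account
  WHERE branch = 'Hillside';

CREATE TABLE account2 AS
  SELECT branch, account, balance FROM account
  WHERE branch = 'Valleyview';

-- reconstruct the original relation by taking the union of the fragments
SELECT * FROM account1
UNION ALL
SELECT * FROM account2;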

1.1.4.2 Vertical Fragmentation

Vertical fragmentation is the same as decomposition. Vertical fragmentation of r(R) involves the definition of several subsets of attributes R1, R2, ..., Rn of the schema R such that

    R = R1 ∪ R2 ∪ ... ∪ Rn

Each fragment ri of r is defined by

    ri = πRi(r)

The fragmentation should be done such that we can reconstruct relation r from the fragments by taking the natural join:

    r = r1 ⋈ r2 ⋈ r3 ⋈ ... ⋈ rn
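A corresponding SQL sketch, assuming the account table above and keeping the account column in every fragment so the join can reassemble the relation (table names are illustrative):

CREATE TABLE acct_branch AS
  SELECT account, branch FROM account;

CREATE TABLE acct_balance AS
  SELECT account, balance FROM account;

-- reconstruct r by joining the fragments on the common key
SELECT b.branch, b.account, m.balance
  FROM acct_branch b, acct_balance m
  WHERE b.account = m.account;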


1.2       Database Links

The central concept in distributed database systems is a database link. A database link is a connection between two physical database servers that allows a client to access them as one logical database.

This section contains the following topics:

1.2.1      What Are Database Links?

1.2.2      What Are Shared Database Links?

1.2.3      Why Use Database Links?

1.2.4      Global Database Names in Database Links

1.2.5      Types of Database Links

1.2.6      Users of Database Links

1.2.7      Creation of Database Links: Examples

1.2.8      Schema Objects and Database Links

1.2.9      Database Link Restrictions

1.2.1 What Are Database Links?

A database link is a pointer that defines a one-way communication path from an Oracle database server to another database server. The link pointer is actually defined as an entry in a data dictionary table. To access the link, one must be connected to the local database that contains the data dictionary entry.

A database link connection is one-way in the sense that a client connected to local database A can use a link stored in database A to access information in remote database B, but users connected to database B cannot use the same link to access data in database A. If local users on database B want to access data on database A, then they must define a link that is stored in the data dictionary of database B.

A database link connection allows local users to access data on a remote database. For this connection to occur, each database in the distributed system must have a unique global database name in the network domain. The global database name uniquely identifies a database server in a distributed system.

Figure 1.3 shows an example of user SCOTT accessing the EMP table on the remote database with the global name ORACLE1.VRCE.ernet.in:


Figure 1.3 Database Link

Database links are either private or public. If they are private, then only the user who created the link has access; if they are public, then all database users have access.

One principal difference among database links is the way that connections to a remote database occur: a user can access the remote database as a connected user, a current user, or a fixed user, as described in Section 1.2.6.

One can create database links using the CREATE DATABASE LINK statement. After a link is created, one can use it to specify schema objects in SQL statements.
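For example, a minimal sketch that creates a link and then uses it (the net service name oracle1 is assumed to be configured):

CREATE DATABASE LINK oracle1 USING 'oracle1';
SELECT * FROM emp@oracle1;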

1.2.2 What Are Shared Database Links?

A shared database link is a link between a local server process and the remote database. The link is shared because multiple client processes can use the same link simultaneously.

When a local database is connected to a remote database through a database link, either database can run in dedicated or multi-threaded server (MTS) mode. The following table illustrates the possibilities:

Local Database Mode     Remote Database Mode
Dedicated               Dedicated
Dedicated               Multi-threaded
Multi-threaded          Dedicated
Multi-threaded          Multi-threaded

A shared database link can exist in any of these four configurations. Shared links differ from standard database links in the following ways:

1.2.3 Why Use Database Links?

The great advantage of database links is that they allow users to access another user's objects in a remote database so that they are bounded by the privilege set of the object's owner. In other words, a local user can access a link to a remote database without having to be a user on the remote database.

For example, assume that employees submit expense reports to Accounts Payable (A/P), and further suppose that a user using an A/P application needs to retrieve information about employees from the HR database. The A/P users should be able to connect to the HR database and execute a stored procedure in the remote HR database that retrieves the desired information. The A/P users should not need to be HR database users to do their jobs; they should only be able to access HR information in a controlled way as limited by the procedure.

Database links allow one to grant limited access on remote databases to local users. By using current user links, one can create centrally managed global users whose password information is hidden from both administrators and non-administrators. For example, A/P users can access the HR database as SCOTT, but unlike fixed user links, SCOTT's credentials are not stored where database users can see them.

By using fixed user links, one can create non-global users whose password information is stored in unencrypted form in the LINK$ data dictionary table. Fixed user links are easy to create and require low overhead because there are no SSL or directory requirements, but a security risk results from the storage of password information in the data dictionary.

1.2.4 Global Database Names in Database Links

To understand how a database link works, one must first understand what a global database name is. Each database in a distributed database is uniquely identified by its global database name. Oracle forms a database's global database name by prefixing the database's network domain, specified by the DB_DOMAIN initialization parameter at database creation, with the individual database name, specified by the DB_NAME initialization parameter.

For example, Figure 1.4 illustrates a representative hierarchical arrangement of databases throughout a network.

Figure1.4 Network Directories and Global Database Names

The name of a database is formed by starting at the leaf of the tree and following a path to the root. For example, the MFTG database is in DIVISION3 of the VRCE_TOOLS branch of the ernet.in domain. The global database name for MFTG is created by concatenating the nodes in the tree as follows:

MFTG.DIVISION3.VRCE_TOOLS.ernet.in
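Equivalently, the global name follows from the initialization parameters of the MFTG database; a sketch of the relevant init.ora entries (values assumed from the figure):

DB_NAME = mftg
DB_DOMAIN = division3.vrce_tools.ernet.in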
 

While several databases can share an individual name, each database must have a unique global database name. For example, the network domains US.ETRLS.VRCEadmin.ernet.in and UK.ELNX.VRCEadmin.ernet.in each contain an ORACLE0 database. The global database naming system distinguishes the ORACLE0 database in the ETRLS division from the ORACLE0 database in the ELNX division as follows:

ORACLE0.US.ETRLS.VRCEadmin.ernet.in
ORACLE0.UK.ELNX.VRCEadmin.ernet.in
 

1.2.5 Types of Database Links

Oracle allows one to create private, public, and global database links. These basic link types differ according to which users are allowed access to the remote database:

Private: Owned by the user who created the link (ownership data is available through DBA_DB_LINKS or ALL_DB_LINKS). Creates a link in a specific schema of the local database. Only the owner of a private database link, or PL/SQL subprograms in the schema, can use this link to access database objects in the corresponding remote database.

Public: Owned by the user called PUBLIC (ownership data is available through DBA_DB_LINKS or ALL_DB_LINKS). Creates a database-wide link. All users and PL/SQL subprograms in the database can use the link to access database objects in the corresponding remote database.

Global: Owned by the user called PUBLIC (ownership data is available through DBA_DB_LINKS or ALL_DB_LINKS). Creates a network-wide link. When an Oracle network uses Oracle Names, the name servers in the system automatically create and manage global database links for every Oracle database in the network. Users and PL/SQL subprograms in any database can use a global link to access objects in the corresponding remote database.

Determining the type of database links to employ in a distributed database depends on the specific requirements of the applications using the system. Consider these advantages and disadvantages:

Private database link: More secure than a public or global link, because only the owner of the private link, or subprograms within the same schema, can use the link to access the remote database.

Public database link: When many users require an access path to a remote Oracle database, one can create a single public database link for all users in a database.

Global database link: When an Oracle network uses Oracle Names, an administrator can conveniently manage global database links for all databases in the system. Database link management is centralized and simple.

 

1.2.6 Users of Database Links

When creating the link, one determines which user should connect to the remote database to access the data. The following table explains the differences among the categories of users involved in database links:

Connected user: A local user accessing a database link in which no fixed username and password have been specified. If SYSTEM accesses a public link in a query, then the connected user is SYSTEM, and Oracle connects to the SYSTEM schema in the remote database. Note that a connected user does not have to be the user who created the link; it is any user who is accessing the link. Sample link creation syntax:

CREATE PUBLIC DATABASE LINK oracle1 USING 'oracle1';

Current user: A global user in a CURRENT_USER database link. The global user must be authenticated by an X.509 certificate and a private key and be a user on both databases involved in the link. Current user links are an aspect of the Oracle Advanced Security option. Sample link creation syntax:

CREATE PUBLIC DATABASE LINK oracle1 CONNECT TO CURRENT_USER USING 'oracle1';

Fixed user: A user whose username/password is part of the link definition. If a link includes a fixed user, then the fixed user's username and password are used to connect to the remote database. Sample link creation syntax:

CREATE PUBLIC DATABASE LINK oracle1 CONNECT TO Jhoro IDENTIFIED BY doe USING 'oracle1';

 

1.2.6.1 Connected User Database Links

Connected user links have no connect string associated with them. The advantage of a connected user link is that a user referencing the link connects to the remote database as the same user. Furthermore, because no connect string is associated with the link, no password is stored in clear text in the data dictionary.

Connected user links have some disadvantages. Because these links require users to have accounts and privileges on the remote databases to which they are attempting to connect, they require more privilege administration for administrators. Also, giving users more privileges than they need violates the fundamental security concept of least privilege: users should only be given the privileges they need to perform their jobs.

The ability to use a connected user database link depends on several factors, chief among them whether the user is authenticated by Oracle using a password, or externally authenticated by the operating system or a network authentication service. If the user is externally authenticated, then the ability to use a connected user link also depends on whether the remote database accepts remote authentication of users, which is set by the REMOTE_OS_AUTHENT initialization parameter.

The REMOTE_OS_AUTHENT parameter operates as follows:

If REMOTE_OS_AUTHENT is TRUE for the remote database, then an externally-authenticated user can connect to the remote database using a connected user database link.

If REMOTE_OS_AUTHENT is FALSE for the remote database, then an externally-authenticated user cannot connect to the remote database using a connected user database link, unless a secure protocol or a network authentication service supported by the Oracle Advanced Security option is used.

1.2.6.2 Fixed User Database Links

A benefit of a fixed user link is that it connects a user in a primary database to a remote database with the security context of the user specified in the connect string. For example, local user JOE can create a public database link in JOE's schema that specifies the fixed user SCOTT with password TIGER. If JHORO uses the fixed user link in a query, then JHORO is the user on the local database, but he connects to the remote database as SCOTT/TIGER.

Fixed user links have a username and password associated with the connect string. The username and password are stored in unencrypted form in the data dictionary in the LINK$ table. This fact creates a possible security weakness of fixed user database links: a user with the SELECT ANY TABLE privilege has access to the data dictionary so long as the O7_DICTIONARY_ACCESSIBILITY initialization parameter is set to TRUE, and thus the authentication associated with a fixed user is compromised.

 

For an example of this security problem, assume that JHORO does not have privileges to use a private link that connects to the HR database as SCOTT/TIGER, but has SELECT ANY TABLE privilege on a database in which the O7_DICTIONARY_ACCESSIBILITY initialization parameter is set to TRUE. He can select from LINK$ and read that the connect string to HR is SCOTT/TIGER. If JHORO has an account on the host on which HR resides, then he can connect to the host and then connect to HR as SCOTT using the password TIGER. He will have all SCOTT's privileges if he connects locally and any audit records will be recorded as if he were SCOTT.
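As a sketch of the exposure (the exact LINK$ column names vary by release, so USERID and PASSWORD here are assumptions), such a user could issue:

SELECT name, userid, password
  FROM sys.link$;   -- the password of a fixed user link is stored unencrypted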

1.2.6.3 Current User Database Links

Current user database links make use of a global user. A global user must be authenticated by an X.509 certificate and a private key and be a user on both databases involved in the link.

The user invoking the CURRENT_USER link does not have to be a global user. For example, if JHORO is authenticated by password to the Accounts Payable database, he can access a stored procedure to retrieve data from the HR database. The procedure uses a current user database link, which connects him to HR as global user SCOTT. SCOTT is a global user and is thereby authenticated through a certificate and private key over SSL, but JHORO is not.

Note that current user database links have these consequences:

 

1.2.7 Creation of Database Links: Examples

Create database links using the CREATE DATABASE LINK statement. The following examples show SQL statements that create database links in a local database to the remote ORACLE0.US.ETRLS.VRCEadmin.ernet.in database:

Statement:
CREATE DATABASE LINK oracle0.us.etrls.VRCEadmin.ernet.in USING 'oracle0_us';
Connects to database: ORACLE0 using net service name ORACLE0_US. Connects as: the connected user. Link type: private connected user.

Statement:
CREATE DATABASE LINK foo CONNECT TO CURRENT_USER USING 'am_sls';
Connects to database: ORACLE0 using net service name AM_SLS. Connects as: the current global user. Link type: private current user.

Statement:
CREATE DATABASE LINK oracle0.us.etrls.VRCEadmin.ernet.in CONNECT TO scott IDENTIFIED BY tiger USING 'oracle0_us';
Connects to database: ORACLE0 using net service name ORACLE0_US. Connects as: SCOTT using password TIGER. Link type: private fixed user.

Statement:
CREATE PUBLIC DATABASE LINK oracle0 CONNECT TO scott IDENTIFIED BY tiger USING 'rev';
Connects to database: ORACLE0 using net service name REV. Connects as: SCOTT using password TIGER. Link type: public fixed user.

Statement:
CREATE SHARED PUBLIC DATABASE LINK oracle0.us.etrls.VRCEadmin.ernet.in CONNECT TO scott IDENTIFIED BY tiger AUTHENTICATED BY kaustav IDENTIFIED BY kaustav USING 'oracle0';
Connects to database: ORACLE0 using net service name ORACLE0. Connects as: SCOTT using password TIGER, authenticated as KAUSTAV using password KAUSTAV. Link type: shared public fixed user.

 

1.2.8 Schema Objects and Database Links

After one has created a database link, one can execute SQL statements that access objects on the remote database. For example, to access remote object EMP using database link FOO, one can issue:

SELECT * FROM emp@foo;
 

Constructing properly formed object names using database links is an essential aspect of data manipulation in distributed systems.

1.2.8.1 Naming of Schema Objects Using Database Links

Oracle uses the global database name to name the schema objects globally using the following scheme:

schema.schema_object@global_database_name

where:

schema: A collection of logical structures of data, or schema objects. A schema is owned by a database user and has the same name as that user. Each user owns a single schema.

schema_object: A logical data structure such as a table, index, view, synonym, procedure, package, or database link.

global_database_name: The name that uniquely identifies a remote database. This name must be the same as the concatenation of the remote database's initialization parameters DB_NAME and DB_DOMAIN, unless the parameter GLOBAL_NAMES is set to FALSE, in which case any name is acceptable.

For example, using a database link to database ORACLE0.DIVISION3.VRCE.ernet.in, a user or application can reference remote data as follows:

SELECT * FROM scott.emp@oracle0.division3.VRCE.ernet.in;   -- emp table in scott's schema
SELECT loc FROM scott.dept@oracle0.division3.VRCE.ernet.in;
 

If GLOBAL_NAMES is set to FALSE, then one can use any name for the link to ORACLE0.DIVISION3.VRCE.ernet.in. For example, one can call the link FOO. Then, one can access the database as follows:

SELECT name FROM scott.emp@foo;   -- link name different from global name

1.2.8.2 Synonyms for Schema Objects

Oracle allows one to create synonyms so that one can hide the database link name from the user. A synonym allows access to a table on a remote database using the same syntax that one would use to access a table on a local database. For example, assume one issues the following query against a table in a remote database:

SELECT * FROM emp@oracle1.VRCE.ernet.in;
 

One can create the synonym EMP for EMP@ORACLE1.VRCE.ernet.in so that one can issue the following query instead to access the same data:

SELECT * FROM emp;
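The synonym itself could be created with a statement such as the following (a public synonym is shown; a private synonym works the same way):

CREATE PUBLIC SYNONYM emp FOR emp@oracle1.VRCE.ernet.in;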

 

1.2.8.3 Schema Object Name Resolution

To resolve application references to schema objects (a process called name resolution), Oracle forms object names hierarchically. For example, Oracle guarantees that each schema within a database has a unique name, and that within a schema each object has a unique name. As a result, a schema object's name is always unique within the database. Furthermore, Oracle resolves application references to an object's local name.

In a distributed database, a schema object such as a table is accessible to all applications in the system. Oracle extends the hierarchical naming model with global database names to effectively create global object names and resolve references to the schema objects in a distributed database system. For example, a query can reference a remote table by specifying its fully qualified name, including the database in which it resides.

For example, assume that one connects to the local database as user SYSTEM:

CONNECT system/manager@oracle1

One then issues the following statements, using database link ORACLE1.VRCE.ernet.in to access objects in the SCOTT and JHORO schemas on remote database ORACLE1:

SELECT * FROM scott.emp@oracle1.VRCE.ernet.in;
INSERT INTO Jhoro.accounts@oracle1.VRCE.ernet.in (acc_no, acc_name, balance)
  VALUES (5001, 'BOWER', 2000);
UPDATE Jhoro.accounts@oracle1.VRCE.ernet.in
  SET balance = balance + 500;
DELETE FROM Jhoro.accounts@oracle1.VRCE.ernet.in
  WHERE acc_name = 'BOWER';

 

 

1.2.9 Database Link Restrictions

One cannot perform the following operations using database links:

 

 

 

Chapter 2: Materialized Views

This chapter introduces one to the use of materialized views and discusses:

2.1 The Need for Materialized Views

2.2 Types of Materialized Views

2.3 Creating a Materialized View

2.4 Nested Materialized Views

2.5 Altering a Materialized View

2.6 Dropping a Materialized View

2.1 The Need for Materialized Views

Materialized views are used in data warehouses to increase the speed of queries on very large databases. Queries to large databases often involve joins between tables or aggregations such as SUM, or both. These operations are very expensive in terms of time and processing power. The type of materialized view that is created determines how the materialized view can be refreshed and used by query rewrite.

Materialized views can be used in a number of ways and almost identical syntax can be used to perform a number of roles. For example, a materialized view can be used to replicate data, which was formerly achieved by using the CREATE SNAPSHOT statement. Now CREATE MATERIALIZED VIEW is a synonym for CREATE SNAPSHOT.

Materialized views improve query performance by precalculating expensive join and aggregation operations on the database prior to execution time and storing the results in the database. The query optimizer can use materialized views by automatically recognizing when an existing materialized view can and should be used to satisfy a request. It then transparently rewrites the request to use the materialized view. Queries are then directed to the materialized view and not to the underlying detail tables. In general, rewriting queries to use materialized views rather than detail tables results in a significant performance gain.

Figure 2.1 Transparent Query Rewrite

When using query rewrite, one wants to create materialized views that satisfy the largest number of queries. For example, if one identifies 20 queries that are commonly applied to the detail or fact tables, then one might be able to satisfy them with five or six well-written materialized views. A materialized view definition can include any number of aggregations (SUM, COUNT(x), COUNT(*), COUNT(DISTINCT x), AVG, VARIANCE, STDDEV, MIN, and MAX) and/or include any number of joins. If one is unsure of which materialized views to create, Oracle provides a set of advisory functions in the DBMS_OLAP package to help in designing and evaluating materialized views for query rewrite.

If a materialized view is to be used by query rewrite, it must be stored in the same database as its fact or detail tables. A materialized view can be partitioned, and one can define a materialized view on a partitioned table and one or more indexes on the materialized view.

Materialized views are similar to indexes in several ways: they consume storage space, they must be refreshed when the data in their master tables changes, and, when used for query rewrite, they improve the performance of SQL execution and their existence is transparent to SQL applications and users. Unlike indexes, materialized views can be accessed directly using a SELECT statement. Depending on the type of refresh that is required, they can also be accessed directly in an INSERT, UPDATE, or DELETE statement.


2.2 Types of Materialized Views

The SELECT clause in the materialized view creation statement defines the data that the materialized view is to contain. There are only a few restrictions on what may be specified. Any number of tables can be joined together; however, they cannot be remote tables if one wishes to take advantage of query rewrite or the warehouse refresh facility (part of the DBMS_OLAP package). Besides tables, views, inline views, subqueries, and materialized views may all be joined or referenced in the SELECT clause.

The types of materialized views are:

2.2.1      Materialized Views with Joins and Aggregates

2.2.2      Single-Table Aggregate Materialized Views

2.2.3      Materialized Views Containing Only Joins

2.2.1 Materialized Views with Joins and Aggregates

In data warehouses, materialized views would normally contain one of the aggregates shown in Example 2 below. For fast refresh to be possible, the SELECT list must contain all of the GROUP BY columns (if present), and may contain one or more aggregate functions. The aggregate function must be one of: SUM, COUNT(x), COUNT(*),COUNT(DISTINCT x), AVG, VARIANCE, STDDEV, MIN, and MAX, and the expression to be aggregated can be any SQL value expression.

If a materialized view contains joins and aggregates, then it cannot be fast refreshed using a materialized view log. Therefore, for a fast refresh to be possible, only new data can be added to the detail tables and it must be loaded using the direct path method.

Here are some examples of the type of materialized view that can be created.

Create Materialized View: Example 1

CREATE MATERIALIZED VIEW vrce_oracle1_mv
  PCTFREE 0 TABLESPACE mviews
  STORAGE (initial 16k next 16k pctincrease 0)
  BUILD DEFERRED
  REFRESH COMPLETE ON DEMAND
  ENABLE QUERY REWRITE
  AS
  SELECT
   s.vrce_name,
     SUM(dollar_oracle1) AS sum_dollar_oracle1
      FROM vrce s, fact f
      WHERE f.vrce_key = s.vrce_key
      GROUP BY s.vrce_name;
 

Example 1 creates a materialized view vrce_oracle1_mv that computes the sum of oracle1 by vrce. It is derived by joining the tables vrce and fact on the column vrce_key. The materialized view does not initially contain any data because the build method is DEFERRED. A complete refresh is required for the first refresh of a build deferred materialized view. When it is refreshed, a complete refresh is performed and, once populated, this materialized view can be used by query rewrite.
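To illustrate query rewrite, once the materialized view is populated, a query of the following form could be transparently redirected to it (a sketch; whether rewrite actually occurs also depends on optimizer settings such as QUERY_REWRITE_ENABLED):

SELECT s.vrce_name, SUM(dollar_oracle1)
  FROM vrce s, fact f
  WHERE f.vrce_key = s.vrce_key
  GROUP BY s.vrce_name;

-- internally rewritten to the equivalent of:
SELECT vrce_name, sum_dollar_oracle1 FROM vrce_oracle1_mv;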

 

Create Materialized View: Example 2

CREATE MATERIALIZED VIEW vrce_stdcnt_mv
  PCTFREE 0 TABLESPACE mviews
  STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0)
  BUILD IMMEDIATE
  REFRESH FAST
  ENABLE QUERY REWRITE
  AS
  SELECT vrce_name, t.time_key,
     STDDEV(unit_oracle1) AS stdcnt_unit_oracle1,
     AVG(unit_oracle1) AS avgcnt_unit_oracle1,
     COUNT(unit_oracle1) AS count_days,
     SUM(unit_oracle1) AS sum_unit_oracle1
   FROM vrce s, fact f, time t
      WHERE s.vrce_key = f.vrce_key AND
            f.time_key = t.time_key
      GROUP BY vrce_name, t.time_key;
 

The statement above creates a materialized view vrce_stdcnt_mv that computes the standard deviation for the number of units sold by a vrce on a given date. It is derived by joining the tables vrce, time, and fact on the columns vrce_key and time_key. The materialized view is populated with data immediately because the build method is immediate, and it is available for use by query rewrite. In this example, the refresh method is FAST, which is allowed because the COUNT and SUM aggregates have been included to support fast refresh of the STDDEV aggregate.


2.2.2 Single-Table Aggregate Materialized Views

A materialized view that contains one or more aggregates (SUM, AVG, VARIANCE, STDDEV, COUNT) and a GROUP BY clause may be based on a single table. The aggregate function can involve an expression on the columns such as SUM(a*b). If this materialized view is to be incrementally refreshed, then a materialized view log must be created on the detail table with the INCLUDING NEW VALUES option, and the log must contain all columns referenced in the materialized view query definition.

CREATE MATERIALIZED VIEW LOG ON fact
  WITH ROWID (vrce_key, time_key, dollar_oracle1, unit_oracle1)
  INCLUDING NEW VALUES;
 
CREATE MATERIALIZED VIEW sum_oracle1  
  PARALLEL
  BUILD IMMEDIATE  
  REFRESH FAST ON COMMIT  
  AS  
  SELECT f.vrce_key, f.time_key,  
         COUNT(*) AS count_grp,  
  SUM(f.dollar_oracle1) AS sum_dollar_oracle1,  
        COUNT(f.dollar_oracle1) AS count_dollar_oracle1,  
  SUM(f.unit_oracle1) AS sum_unit_oracle1,  
        COUNT(f.unit_oracle1) AS count_unit_oracle1 
  FROM fact f  
  GROUP BY f.vrce_key, f.time_key;
  

In this example, a materialized view has been created which contains aggregates on a single table. Because the materialized view log has been created, the materialized view is fast refreshable. If DML is applied against the fact table, then, when the commit is issued, the changes will be reflected in the materialized view.
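For example (a sketch with hypothetical key values; the column list matches the materialized view log above):

INSERT INTO fact (vrce_key, time_key, dollar_oracle1, unit_oracle1)
  VALUES (10, 100, 250, 5);
COMMIT;   -- sum_oracle1 is fast refreshed as part of this commit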

Table 2.1 illustrates the aggregate requirements for a single-table aggregate materialized view.

Table 2.1 Single-Table Aggregate Requirements

If aggregate X is present, aggregate Y is required and aggregate Z is optional:

X                 Y              Z
COUNT(expr)       -              -
SUM(expr)         COUNT(expr)    -
AVG(expr)         COUNT(expr)    SUM(expr)
STDDEV(expr)      COUNT(expr)    SUM(expr * expr)
VARIANCE(expr)    COUNT(expr)    SUM(expr * expr)


Note that COUNT(*) must always be present.

Incremental refresh for a single-table aggregate materialized view is possible after any type of DML to the base tables (direct load or conventional INSERT, UPDATE, or DELETE).

A single-table aggregate materialized view can be defined to be refreshed ON COMMIT or ON DEMAND. If it is ON COMMIT, the refresh is performed at commit time of the transaction that does DML on one of the materialized view's detail tables.

After a refresh ON COMMIT, you are urged to check the alert log and trace files to see if any error occurred during the refresh.

2.2.3 Materialized Views Containing Only Joins

Materialized views may contain only joins and no aggregates, such as in the next example where a materialized view is created which joins the fact table to the vrce table. The advantage of creating this type of materialized view is that expensive joins will be precalculated.

Incremental refresh for a materialized view containing only joins is possible after any type of DML to the base tables (direct load or conventional INSERT, UPDATE, or DELETE).

A materialized view containing only joins can be defined to be refreshed ON COMMIT or ON DEMAND. If it is ON COMMIT, the refresh is performed at commit time of the transaction that does DML on the materialized view's detail table. If one specifies REFRESH FAST, Oracle performs further verification of the query definition to ensure that fast refresh can be performed if any of the detail tables change.

If some of the above restrictions are not met, then the materialized view should be created as REFRESH FORCE to take advantage of incremental refresh when it is possible. If the materialized view is created as ON COMMIT, Oracle performs all of the fast refresh checks. If one of the tables did not meet all of the criteria, but the other tables did, the materialized view would still be incrementally refreshable with respect to the other tables for which all the criteria are met.

A materialized view log should contain the rowid of the master table. It is not necessary to add other columns.

To speed up refresh, it is recommended that the user create indexes on the columns of the materialized view that store the rowids of the fact table.

 
CREATE MATERIALIZED VIEW LOG ON fact   WITH ROWID;
 
CREATE MATERIALIZED VIEW LOG ON time   WITH ROWID;
 
CREATE MATERIALIZED VIEW LOG ON vrce   WITH ROWID;
 
CREATE MATERIALIZED VIEW detail_fact_mv   PARALLEL 
  BUILD IMMEDIATE
       REFRESH FAST
       AS
       SELECT
    f.rowid "fact_rid", t.rowid "time_rid", s.rowid "vrce_rid",
      s.vrce_key, s.vrce_name, f.dollar_oracle1,
      f.unit_oracle1, f.time_key
       FROM fact f, time t, vrce s 
       WHERE f.vrce_key = s.vrce_key(+) AND 
       f.time_key = t.time_key(+);
 

In this example, in order to perform a REFRESH FAST, unique constraints should exist on s.vrce_key and t.time_key. It is also recommended that indexes be created on the columns fact_rid, time_rid, and vrce_rid, as illustrated below, which will improve the performance of refresh.

CREATE INDEX mv_ix_factrid  ON
  detail_fact_mv(fact_rid);
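The indexes on the remaining rowid columns follow the same pattern (index names are illustrative):

CREATE INDEX mv_ix_timerid  ON
  detail_fact_mv(time_rid);

CREATE INDEX mv_ix_vrcerid  ON
  detail_fact_mv(vrce_rid);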
 

Alternatively, if the example shown above did not include the columns time_rid and vrce_rid, and if the refresh method was REFRESH FORCE, then this materialized view would be fast refreshable only if the fact table was updated but not if the tables time or vrce were updated.

CREATE MATERIALIZED VIEW detail_fact_mv 
       PARALLEL
       BUILD IMMEDIATE
       REFRESH FORCE
       AS
       SELECT
    f.rowid "fact_rid",
      s.vrce_key, s.vrce_name, f.dollar_oracle1,
      f.unit_oracle1, f.time_key
       FROM fact f, time t, vrce s 
       WHERE f.vrce_key = s.vrce_key(+) AND 
       f.time_key = t.time_key(+);

2.3 Creating a Materialized View

A materialized view can be created with the CREATE MATERIALIZED VIEW statement or using Oracle Enterprise Manager. The following command creates the materialized view vrce_oracle1_mv.

CREATE MATERIALIZED VIEW vrce_oracle1_mv
  PCTFREE 0 TABLESPACE mviews
  STORAGE (INITIAL 16k NEXT 16k PCTINCREASE 0)
  PARALLEL
  BUILD IMMEDIATE
  REFRESH COMPLETE
  ENABLE QUERY REWRITE
  AS
  SELECT
   s.vrce_name,
     SUM(dollar_oracle1) AS sum_dollar_oracle1
      FROM vrce s, fact f
      WHERE f.vrce_key = s.vrce_key
      GROUP BY s.vrce_name;
 

 

It is not uncommon in a data warehouse to have already created summary or aggregation tables, and the DBA may not wish to repeat this work by building a new materialized view. In this instance, the table that already exists in the database can be registered as a prebuilt materialized view.
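For example, assuming a summary table named vrce_oracle1_sum already exists and contains columns matching the query below, it could be registered roughly as follows (a sketch; see the PREBUILT restriction in Section 2.3.3):

CREATE MATERIALIZED VIEW vrce_oracle1_sum
  ON PREBUILT TABLE WITH REDUCED PRECISION
  ENABLE QUERY REWRITE
  AS
  SELECT s.vrce_name, SUM(dollar_oracle1) AS sum_dollar_oracle1
  FROM vrce s, fact f
  WHERE f.vrce_key = s.vrce_key
  GROUP BY s.vrce_name;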


2.3.1 Used for Query Rewrite

Even though a materialized view is defined, it will not automatically be used by the query rewrite facility. The clause ENABLE QUERY REWRITE must be specified if the materialized view is to be considered available for rewriting queries.

If this clause is omitted or specified as DISABLE QUERY REWRITE when the materialized view is initially created, the materialized view can subsequently be enabled for query rewrite with the ALTER MATERIALIZED VIEW statement.
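For example:

ALTER MATERIALIZED VIEW vrce_oracle1_mv ENABLE QUERY REWRITE;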

If one defines a materialized view as BUILD DEFERRED, it is also not eligible for query rewrite until it is populated with data.

2.3.2 Query Rewrite Restrictions

Query rewrite is not possible with all materialized views. If query rewrite is not occurring when expected, check to see if the materialized view satisfies all of the following conditions.

2.3.3 Materialized View Restrictions

1.     There cannot be non-repeatable expressions (ROWNUM, SYSDATE, non-repeatable PL/SQL functions, and so on) anywhere in the defining query.

2.     There cannot be references to RAW or LONG RAW datatypes or object REFs.

3.     The query must be a single-block query, that is, it cannot contain set functions (UNION, MINUS, and so on). However, a materialized view can have multiple query blocks (for example, inline views in the FROM clause and subselects in the WHERE or HAVING clauses).

4.     If the materialized view was registered as PREBUILT, the precision of the columns must agree with the precision of the corresponding SELECT expressions unless overridden by WITH REDUCED PRECISION.

2.3.4 Query Rewrite Restrictions

1.     If a query has both local and remote tables, only local tables will be considered for potential rewrite.

2.     None of the detail tables can be owned by SYS, and the materialized view cannot be owned by SYS.

2.3.5 Refresh Options

When one defines a materialized view, one can specify its two refresh options: when to refresh and what type of refresh. If unspecified, the defaults are ON DEMAND and FORCE.

The two refresh execution modes are ON COMMIT and ON DEMAND. The mode selected affects the type of materialized view that can be defined.


ON COMMIT: Refresh occurs automatically when a transaction that modified one of the materialized view's fact tables commits. Can be used with materialized views on single-table aggregates and materialized views containing joins only.

ON DEMAND: Refresh occurs when a user manually executes one of the available refresh procedures contained in the DBMS_MVIEW package (REFRESH, REFRESH_ALL_MVIEWS, REFRESH_DEPENDENT).


If a materialized view fails during refresh at COMMIT time, the user has to explicitly invoke the refresh procedure using the DBMS_MVIEW package after addressing the errors specified in the trace files. Until this is done, the view will no longer be refreshed automatically at commit time.

One can specify how the materialized views are to be refreshed from the detail tables by selecting one of four options: FORCE, COMPLETE, FAST, and NEVER.

COMPLETE: Refreshes by recalculating the materialized view's defining query.

FAST: Refreshes by incrementally adding the new data that has been inserted into the tables. The new data is obtained from the direct path log or from the materialized view logs.

FORCE: Applies FAST refresh if possible; otherwise, applies COMPLETE refresh.

NEVER: Indicates that the materialized view will not be refreshed with the Oracle refresh mechanisms.


Whether the fast refresh option is available will depend upon the type of materialized view. Fast refresh is available for three general classes of materialized views: materialized views with joins only, materialized views with joins and aggregates, and materialized views with aggregates on a single table.

2.3.6 General Restrictions on Fast Refresh

The materialized view's defining query is restricted as follows:

2.3.7 Restrictions on Fast Refresh on Materialized Views with Joins Only

Defining queries for materialized views with joins only and no aggregates have these restrictions on fast refresh:

2.3.8 Restrictions on Fast Refresh on Materialized Views with Single-Table Aggregates

Defining queries for materialized views with single-table aggregates have these restrictions on fast refresh:

2.3.9 Restrictions on Fast Refresh on Materialized Views with Joins and Aggregates

Defining queries for materialized views with joins and aggregates have these restrictions on fast refresh:

 

ORDER BY

An ORDER BY clause is allowed in the CREATE MATERIALIZED VIEW statement. It is only used during the initial creation of the materialized view. It is not used during a full refresh or an incremental refresh.

To improve the performance of queries against large materialized views, store the rows in the materialized view in the order specified in the ORDER BY clause. This initial ordering provides physical clustering of the data. If indexes are built on the columns by which the materialized view is ordered, accessing the rows of the materialized view using the index will significantly reduce the time for disk I/O due to the physical clustering.

The ORDER BY clause is not considered part of the materialized view definition. As a result, there is no difference in the manner in which Oracle detects the various types of materialized views (for example, materialized join views with no aggregates). For the same reason, query rewrite is not affected by the ORDER BY clause. This feature is similar to the CREATE TABLE ... ORDER BY ... capability that exists in Oracle. For example:

CREATE MATERIALIZED VIEW sum_oracle1 
  REFRESH FAST ON DEMAND AS
  SELECT cityid, COUNT(*) count_all, 
  SUM(oracle1) sum_oracle1, COUNT(oracle1) cnt_oracle1
  FROM city_oracle1
  ORDER BY cityid;
 

In the above example, we would use the "ORDER BY cityid" clause only during the creation of the materialized view. The materialized view definition is not affected by the ORDER BY clause. The definition is:

SELECT cityid, COUNT(*) count_all, 
SUM(oracle1) sum_oracle1, COUNT(oracle1) cnt_oracle1
FROM city_oracle1


2.4 Nested Materialized Views

A nested materialized view is a materialized view whose definition is based on another materialized view. A nested materialized view may reference other relations in the database in addition to materialized views.

2.4.1 Why Use Nested Materialized Views?

In a data warehouse, you typically create many aggregate views on a single join (for example, rollups along different dimensions). Incrementally maintaining these distinct materialized aggregate views can take a long time because the underlying join has to be performed many times. By using nested materialized views, the join is performed just once (while maintaining the materialized view containing joins only) and incremental maintenance of single-table aggregate materialized views is very fast due to the self-maintenance refresh operations on this class of views. Using nested materialized views also overcomes the limitation posed by materialized aggregate views, where incremental maintenance can only be done with direct-load insert.

2.4.2 Limitations of Nested Materialized Views

Nested materialized views incur the space overhead of materializing the join and maintaining a materialized view log. This is in contrast to materialized aggregate views, which avoid the space requirements of the materialized join view and its log but have relatively long refresh times due to multiple computations of the same join.

Nested materialized views are incrementally refreshable under any type of DML while materialized aggregate views are incrementally refreshable under direct-load insert only.

2.4.3 Example of a Nested Materialized View

One can create a materialized join view or a single-table aggregate materialized view on top of another materialized join view, single-table aggregate materialized view, complex materialized view (a materialized view on which Oracle cannot perform incremental refresh), or base table. All the underlying objects (whether materialized views or tables) on which the materialized view is defined must have a materialized view log. All the underlying objects are treated as if they were tables. All the existing options for materialized join views and single-table aggregate materialized views can be used. Thus, ON COMMIT refresh is supported for these types of nested materialized views.

The following presents a retail database with an example schema and some materialized views to illustrate how nested materialized views can be created.

VRCE   (vrce_key, vrce_name, vrce_city, vrce_state, vrce_country)  
PRODUCT (prod_key, prod_name, prod_brand)   
TIME    (time_key, time_day, time_week, time_month)
FACT    (vrce_key, prod_key, time_key, dollar_oracle1)
  
/* create the materialized view logs */
CREATE MATERIALIZED VIEW LOG ON fact 
  WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON vrce 
  WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON time 
  WITH ROWID;
CREATE MATERIALIZED VIEW LOG ON product 
  WITH ROWID;
 
/*create materialized join view join_fact_vrce_time as incrementally 
refreshable at COMMIT time */
CREATE MATERIALIZED VIEW join_fact_vrce_time 
REFRESH FAST ON COMMIT AS
SELECT s.vrce_key, s.vrce_name, f.dollar_oracle1, t.time_key, t.time_day,  
       f.prod_key, f.rowid frid, t.rowid trid, s.rowid srid    
FROM fact f, vrce s, time t
WHERE f.time_key = t.time_key AND
      f.vrce_key = s.vrce_key;
 

To create a nested materialized view on the table join_fact_vrce_time, one would have to create a materialized view log on the table. Because this will be a single-table aggregate materialized view on join_fact_vrce_time, one needs to log all the necessary columns and use the INCLUDING NEW VALUES clause.

/* create materialized view log on join_fact_vrce_time */
CREATE MATERIALIZED VIEW LOG ON join_fact_vrce_time
  WITH ROWID (vrce_name, time_day, dollar_oracle1)
  INCLUDING NEW VALUES;
 
/* create the single-table aggregate materialized view sum_oracle1_vrce_time on 
join_fact_vrce_time as incrementally refreshable at COMMIT time. */
CREATE MATERIALIZED VIEW sum_oracle1_vrce_time 
  REFRESH FAST ON COMMIT 
  AS
  SELECT COUNT(*) cnt_all, SUM(dollar_oracle1) sum_oracle1, COUNT(dollar_oracle1)
         cnt_oracle1, vrce_name, time_day
  FROM join_fact_vrce_time
  GROUP BY vrce_name, time_day;
 

Note that the above single-table aggregate materialized view sum_oracle1_vrce_time is logically equivalent to a multi-table aggregate on the tables fact, time, and vrce whose definition is

SELECT COUNT(*) cnt_all, SUM(f.dollar_oracle1) sum_oracle1, 
       COUNT(f.dollar_oracle1) cnt_oracle1, s.vrce_name, t.time_day
FROM fact f, time t , vrce s
WHERE f.time_key = t.time_key AND 
      f.vrce_key = s.vrce_key 
GROUP BY vrce_name, time_day;
 

One can now define a materialized join view join_fact_vrce_time_prod as a join between join_fact_vrce_time and table product.

CREATE MATERIALIZED VIEW join_fact_vrce_time_prod 
  REFRESH FAST ON COMMIT 
  AS
  SELECT j.rowid jrid, p.rowid prid, j.vrce_name, j.prod_key, p.prod_name,
         j.dollar_oracle1
  FROM join_fact_vrce_time j, product p 
  WHERE j.prod_key = p.prod_key;
 

The above schema can be diagrammatically represented as in Figure 2.2.


Figure 2.1 Nested Materialized View Schema

 

2.4.3.1 Nesting Materialized Views with Joins and Aggregates

Materialized views with joins and aggregates can be nested if they are refreshed with COMPLETE REFRESH. Thus, a user can nest materialized views containing joins and aggregates arbitrarily deeply, but no incremental maintenance is possible for such materialized views.

Note that the ON COMMIT refresh option is not available for complex materialized views. Because you have to invoke the refresh functions manually, ordering has to be taken into account. This is because the refresh for a materialized view that is built on other materialized views will use the current state of the other materialized views, whether they are fresh or not. You can find the dependent materialized views for a particular object using the PL/SQL function GET_MV_DEPENDENCIES() in the DBMS_MVIEW package.
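For example, a minimal sketch (assuming the materialized views above are owned by the scott schema) that retrieves the dependency list through DBMS_MVIEW:

VARIABLE deplist VARCHAR2(2000)

BEGIN
  /* returns a comma-separated list of the materialized views
     that depend on the named object */
  DBMS_MVIEW.GET_MV_DEPENDENCIES('scott.join_fact_vrce_time', :deplist);
END;
/

PRINT deplist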

2.4.3.2 Nested Materialized View Usage Guidelines

Here are some guidelines on how to use nested materialized views:

1. If incremental refresh is desired for a materialized view that contains joins and aggregates, and standard fast refresh cannot be used because DML is occurring on the tables, consider creating a single-table aggregate materialized view nested on a materialized join view.

2. If incremental refresh is desired, incrementally refresh all the materialized views along any chain. It makes little sense to define an incrementally refreshable materialized view on top of a materialized view that must be refreshed with a complete refresh.

3. Materialized join views and single-table aggregate materialized views can be defined as ON COMMIT or ON DEMAND; the choice depends on the application using them. If the materialized views are expected always to remain fresh, all of them should have the ON COMMIT refresh option. If the time window for refresh does not permit refreshing all the materialized views at commit time, the appropriate materialized views can be created with (or altered to have) the ON DEMAND refresh option, as sketched below.
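As a sketch of guideline 3, an existing materialized view can be altered between the two refresh modes (assuming it remains fast refreshable):

/* refresh at commit time while the time window permits it */
ALTER MATERIALIZED VIEW sum_oracle1_vrce_time REFRESH FAST ON COMMIT;

/* or defer the refresh to an explicit request */
ALTER MATERIALIZED VIEW sum_oracle1_vrce_time REFRESH FAST ON DEMAND;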

2.5 Altering a Materialized View

There are five amendments that can be made to a materialized view:

*Change its refresh option (FAST, FORCE, COMPLETE, NEVER).

*Change its refresh mode (ON COMMIT, ON DEMAND).

*Recompile it.

*Enable or disable its use by query rewrite.

*Consider it fresh.

All other changes are achieved by dropping and then recreating the materialized view.

The COMPILE clause of the ALTER MATERIALIZED VIEW statement can be used when the materialized view has been invalidated as described in "Invalidating a Materialized View". This compile process is quick, and allows the materialized view to be used by query rewrite again.
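For example, to revalidate the materialized view created earlier (a minimal sketch):

ALTER MATERIALIZED VIEW sum_oracle1_vrce_time COMPILE;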


2.6 Dropping a Materialized View

Use the DROP MATERIALIZED VIEW statement to drop a materialized view. For example:

DROP MATERIALIZED VIEW oracle1_sum_mv;
 

This command drops the materialized view oracle1_sum_mv. If the materialized view was prebuilt on a table, then the table is not dropped but it can no longer be maintained with the refresh mechanism. Alternatively, you can drop a materialized view using Oracle Enterprise Manager.

Loading strategies can be classified as one-phase or two-phase. In one-phase loading, data is loaded directly into the target table, quality assurance tests are performed, and errors are resolved by performing DML operations prior to refreshing materialized views. If a large number of deletions are possible, then storage utilization may be adversely affected, but temporary space requirements and load time are minimized. The DML that may be required after one-phase loading causes multi-table aggregate materialized views to become unusable in the safest rewrite integrity level.

In a two-phase loading process, data is first loaded into a temporary (staging) table, quality assurance tests are performed there, and the data is published into the target table only after the tests pass.

Immediately after loading the detail data and updating the indexes on the detail data, the database can be opened for operation, if desired. Query rewrite can be disabled by default (with ALTER SYSTEM SET QUERY_REWRITE_ENABLED = FALSE) until all the materialized views are refreshed, but enabled at the session level for any users who do not require the materialized views to reflect the data from the latest load (with ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE). However, as long as QUERY_REWRITE_INTEGRITY = ENFORCED or TRUSTED, this is not necessary because the system ensures that only materialized views with updated data participate in a query rewrite.
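In sketch form, the statements referred to above are:

/* system-wide default: no query rewrite until the refreshes finish */
ALTER SYSTEM SET QUERY_REWRITE_ENABLED = FALSE;

/* per session, for users who do not need the latest load */
ALTER SESSION SET QUERY_REWRITE_ENABLED = TRUE;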


CHAPTER 3: CONCEPT OF REPLICATION

 

Replication is the process of copying and maintaining database objects, such as tables, in multiple databases that make up a distributed database system.

In a replicated environment, multiple copies of the same database object are present at the various sites constituting the replication environment. Changes applied to the database objects at one site are monitored continuously, stored locally, and, at frequent intervals, forwarded to and applied at each of the remote locations. Replication uses distributed database technology to share data between multiple sites, but it differs from a distributed database system: in a distributed architecture a particular database object resides at one particular location, whereas in a replicated database the same data is available at multiple locations.

 

There are many reasons for using replicated environments; some of them are listed below.

 

3.1 Availability

Replication improves the availability of applications because it provides them with alternative data access options. If one site becomes unavailable, users can continue to query or even update the remaining locations.

3.2 Performance

Replication provides fast, local access to shared data because it balances activity over multiple sites. Some users can access one server while other users access other servers, thereby reducing the load at all servers. Also, users can access data from the replication site that has the lowest access cost, which is typically the site that is geographically closest to them.  

 

 

3.3 Disconnected Computing

A snapshot is a complete or partial copy (replica) of a target master table from a single point in time. Snapshots enable users to work on a subset of a database while disconnected from the central database server. Later, when a connection is established, users can synchronize (refresh) snapshots on demand. When users refresh snapshots, they update the central database with all of their changes, and they receive any changes that may have happened while they were disconnected.

 

 

3.4 Network Load Reduction

Replication can be used to distribute data over multiple regional locations. Applications can then access various regional servers instead of one central server, which can reduce network load dramatically.

 

3.5 Types of Replication Environment         

 

Oracle supports three types of replication environment, as indicated below:

3.5.1 Multimaster replication

3.5.2 Snapshot replication

3.5.3 Multimaster and snapshot hybrid configuration

   

3.5.1 MULTIMASTER REPLICATION

Multimaster replication is the type of replication environment that allows multiple sites, acting as equal peers, to manage groups of replicated database objects. Each site participating in this environment is called a master site. The sites interact frequently with every other site in the environment to exchange the updates that have taken place to each of their database objects. These changes are then applied at all the sites; in other words, after each exchange all the sites contain the same updated information. A single master site acts as the master definition site for a master group (master groups are dealt with in detail later). This site performs many of the administrative and maintenance tasks for the multimaster replication environment. Any of the master sites can be configured as the master definition site, but there can be only one master definition site per master group, and the master definition site can be changed to a different master site if necessary.

 

The multimaster scheme ensures excellent failover protection, making sure information is available when and where users need it. It can be used to protect the availability of a mission-critical database. For example, a multimaster replication environment can replicate all of the data in the database to establish a failover site should the primary site become unavailable due to system or network outages. In contrast with Oracle's standby database feature, such a failover site can also serve as a fully functional database supporting application access while the primary site is operational, whereas a standby database becomes fully functional only if the primary site is unavailable.

 

There are two types of multimaster replication: asynchronous and synchronous.

Asynchronous replication, often referred to as store-and-forward replication, captures any local changes, stores them in a queue, and, at regular intervals, propagates and applies these changes at remote sites. With this form of replication, there is a period of time before all sites achieve data convergence.

Synchronous replication, also known as real-time replication, applies any changes or executes any replicated procedures at all sites participating in the replication environment as part of a single transaction. If the DML or procedure fails at any site, the entire transaction rolls back. Synchronous replication ensures data consistency at all sites in real time.

 


3.5.1.1 Asynchronous Replication

Asynchronous replication sends changes to all the sites in the environment independently: propagation occurs separately, after the originating transaction has completed locally. This is the default mode of replication. The process of asynchronous replication is described below.

 

*User issues DML or executes a wrapper for a replicated procedure.

Once a table has been set up for replication, any DML that a user issues on the table is captured for replication to all other master sites.

For each row that is inserted, updated, or deleted, an internal trigger creates a deferred remote procedure call (RPC). The deferred transaction queue contains all deferred RPCs. If a procedure has been replicated and its wrapper is executed at a master site, the procedure call is also placed in the deferred transaction queue.

 

*Deferred Transaction Queue stores deferred RPCs.

Each transaction in the deferred transaction queue has a list of destinations that define where the deferred transaction should be propagated; this list contains all master sites except for the originating site.
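As a sketch, the queue and its destinations can be inspected at the originating master site through the standard replication catalog views (assuming they are installed):

/* pending deferred transactions */
SELECT deferred_tran_id, start_time
FROM   deftran;

/* the destinations each transaction still has to reach */
SELECT deferred_tran_id, dblink
FROM   deftrandest;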

 

*Propagation sends deferred transaction queue entry to destination.

At scheduled intervals or on-demand, the deferred transactions in the deferred transaction queue are propagated to the target destinations.
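Propagation can also be forced on demand; a minimal sketch, run as the replication administrator (DBMS_DEFER_SYS.PUSH is a function, so its return value must be captured):

DECLARE
  ret NUMBER;
BEGIN
  /* push the local deferred transaction queue to one master site */
  ret := DBMS_DEFER_SYS.PUSH(destination => 'ORACLE1');
END;
/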

 

 

 

*Deferred transaction queue entry applied at remote destination.

As a deferred transaction is propagated to a target destination, each deferred RPC is applied at the destination site. If the deferred transaction cannot be applied successfully there, it is placed into the error queue at the destination site, where the DBA can fix the error condition and re-execute the deferred transaction.

When a deferred transaction queue entry is applied at the remote destination, Oracle checks for data conflicts. If a conflict is detected, it is logged and optionally a conflict resolution method is invoked.
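A sketch of how the DBA might inspect and re-apply an errored transaction at the destination site (the transaction id shown is hypothetical and would be taken from the DEFERROR view):

/* inspect the error queue */
SELECT deferred_tran_id, destination, error_msg
FROM   deferror;

/* after fixing the cause, re-execute one deferred transaction */
BEGIN
  DBMS_DEFER_SYS.EXECUTE_ERROR(
    deferred_tran_id => '1.23.456',    /* hypothetical id from DEFERROR */
    destination      => 'ORACLE1');
END;
/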

 

3.5.1.2 Synchronous Replication

Synchronous replication propagates any changes made at a local site to other synchronously linked masters in a replication environment during the same transaction as the initial change. If the propagation fails at any of the master sites, the entire transaction, including the initial change at the local master site, rolls back. This strict enforcement ensures data consistency across the replication environment. The process of synchronous replication is described below.                                                                                                     

* User issues DML or executes a wrapper for a replicated procedure.

Once a table has been set up for replication, any DML that a user issues on the target table is captured for replication to all other master replication sites. If a procedure has been replicated and its wrapper is executed at a master site, the procedure call is also captured for replication.

 

 

*DML or wrapper execution is immediately propagated to destination sites.

The internal trigger captures any DML and immediately propagates these actions to all other master sites in the replication environment. The internal trigger applies these actions in the security context of the propagator's database link and uses an internal RPC to apply these actions at the destination site.

Like an internal trigger, a wrapper for a replicated procedure immediately propagates the procedure call to all other master sites in the replication environment. If the transaction fails at any one of the master replication sites, then the transaction is rolled back at all master sites. This methodology ensures data consistency across all master replication sites. Because a transaction must be rolled back if any site fails, synchronous replication is extremely dependent on highly available networks, databases, and the associated hardware.

Asynchronous replication, by contrast, means that the data sets at the different master sites may differ for a period of time before the changes have been propagated, and data conflicts may occur. With synchronous replication there is never a period when the data at the master sites does not match, and no data conflicts are introduced; these benefits come at the cost of substantial hardware and networking resources, with no tolerance for downtime.

3.6 Conflict Resolution Concepts

 

Asynchronous replication might result in conflicts. The possible conflicts that can occur fall into three categories:

 

UPDATE CONFLICTS – Two or more sites update the same row before the preceding update has been propagated to the other sites.

 

UNIQUENESS CONFLICTS – Inserts are performed at two or more sites with the same primary key value, or an update at one site gives a row the same primary key value as an insert at another site.

 

DELETE CONFLICTS – A row is deleted at one site while an update occurs at another site (which may result in an attempt to update a row that no longer exists), or the same row is deleted at more than one site in the same interval.

 

An instance where an update conflict is possible is shown below.

TIME         Master Site A                         Master Site B
5:00 P.M.    Propagate changes to Master Site B    Propagate changes to Master Site A
5:05 P.M.    Updates row 1
5:07 P.M.                                          Updates row 1
5:10 P.M.    Propagate changes to Master Site B    Propagate changes to Master Site A

 

If the propagation of changes occurs every 10 minutes in this environment, a conflict occurs on row 1 of the table at 5:10 P.M. Once a data conflict is detected, the following actions occur: the conflict resolution methods try to resolve the data conflict; if the conflict is not resolved, it is logged in the error queue at the destination site, and the database administrator is then responsible for resolving it manually.

One conflict resolution approach is to consider only the latest change in the interval and apply it, ignoring the rest; a refined form of this is the "latest timestamp" method. In addition, Oracle provides a prebuilt "overwrite" method, which replaces the value at the destination site with the new value from the originating site; it does not guarantee convergence of data among multiple master sites and is designed to be used by a single master site with multiple snapshot sites. Oracle does not provide any prebuilt conflict resolution methods for delete or ordering conflicts. It does, however, allow users to build their own conflict resolution methods to resolve data conflicts.
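A sketch of registering the latest-timestamp method for a table (the column group name and the timestamp column are hypothetical; the column group itself would first be created with DBMS_REPCAT.MAKE_COLUMN_GROUP):

BEGIN
  DBMS_REPCAT.ADD_UPDATE_RESOLUTION(
    sname                 => 'scott',
    oname                 => 'sub',
    column_group          => 'sub_cg',        /* hypothetical column group */
    sequence_no           => 1,
    method                => 'LATEST TIMESTAMP',
    parameter_column_name => 'last_update');  /* hypothetical timestamp column */
END;
/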


3.7 Multimaster Topology

The network topology of the machines used in this project is as shown below.

       

Figure 3.1 Multimaster Topology: Oracle1 (128.127.50.16) is the master definition site; the master sites are connected to one another by database links.

Setting up the replication environment involves two broad tasks: setting up the master sites and setting up the master groups.

 

3.7.1 Configuring a Snapshot Replication Environment

In order to set up an environment for snapshot replication, the following tasks need to be done:

*Create the necessary schema (schema objects that may participate in replication include tables, among others).

*Create the necessary database links (refer to Chapter 1).

*Assign the appropriate privileges.

*Allocate sufficient job processes (refer to Chapter 4).

This may be done in two ways:

*Using an automated tool provided by Oracle, the Replication Manager Setup Wizard.

*Using the replication management API (application programming interface).

 

3.8 Creating a Schema

A snapshot schema in a remote database must match the schema that contains the master table in the master database. Since the scott user of oracle2 owns the tables to be replicated in the database network, the snapshot schema in oracle1 must also be scott.

 

A snapshot uses database links to reference remote tables, which enables refreshing the snapshot in the mode specified in its definition. To ensure secure behavior when refreshing, the defining query uses database links that embed a user name and password in their definition; a public database link cannot be used to define a snapshot. Also, the remote account that the link uses must have the SELECT privileges necessary to access the data referenced in the snapshot's defining query.

 

Before creating snapshots, some administrative database links must be created:

*A "public" database link from the snapshot site to the master site. This makes defining private database links easier.

*"Private" database links from the snapshot administrator to the proxy administrator and from the propagator to the receiver.

*Private database links connecting each snapshot database to the schema at the master database.

The Replication Manager Setup Wizard creates these database links automatically.

 

3.8.1 Privileges

Both the creator and the owner of the snapshot must be able to issue the defining SELECT statement of the snapshot. The user that creates the snapshot must have the CREATE SNAPSHOT privilege and the appropriate SELECT privileges to execute the defining SELECT statement.
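A minimal sketch of such a grant, using the schema of this project:

/* allow scott to create snapshots at the snapshot site */
GRANT CREATE SNAPSHOT TO scott;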

 

3.8.2 Schedule Purge at Master Site

 

To keep the size of the deferred transaction queues in check, a purge operation may be scheduled to remove all successfully completed deferred transactions from the deferred transaction queue.

 

3.8.3 Scheduled Push

Scheduling a push at the snapshot site automatically propagates the deferred transactions at the snapshot site to the associated target master site. Typically, there is only a single scheduled link per snapshot group at a snapshot site, because a snapshot group only has a single target master site.

3.8.4 SNP Background Processes and Interval

Sufficient SNP (or job queue) background processes should be allocated to handle the refreshing of snapshots. Because the snapshot site has only a single scheduled link to the target master site, it requires only a single SNP process; however, to handle additional activity such as scheduled jobs, it may be necessary to allocate at least two SNP processes at the snapshot site.

The SNP processes are defined using the JOB_QUEUE_PROCESSES initialization parameter in the database's initialization parameter file, which is named init(sid).ora. To set up SNP processes, the initialization parameter file may be edited manually. The SNP job interval determines how often the SNP processes "wake up" to execute any pending operations, such as pushing a queue.
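A sketch of the relevant init(sid).ora entries (the values are illustrative, not prescriptive):

# job queue settings in init(sid).ora
JOB_QUEUE_PROCESSES = 2     # number of SNP background processes
JOB_QUEUE_INTERVAL  = 60    # seconds between SNP wake-ups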

 


 

 The following are the three mechanisms used at the master site to support refreshing of the snapshots:

Master Table

The master table is the basis for the snapshot and is located at the target master site. This table may be involved in both snapshot replication and multimaster replication (remember that a snapshot points to only one master site).

Changes made to the master table, as recorded by the snapshot log, will be propagated to the snapshot during the refresh process.

 

Internal Triggers

When changes are made to the master table using DML, an internal trigger records information about the affected rows in the snapshot log. This information includes the values of the primary key and/or the ROWID. The internal trigger is created, and fires, once a snapshot log for the target master table has been created.

 

Snapshot Logs

A snapshot log is a table that records all of the DML changes to a master table. A master table has one and only one snapshot log, regardless of how many snapshots refresh from it.

Snapshot logs must be created at the master sites before any snapshot groups are created. The log is required for any master table that supports snapshots with fast refreshes.

 

A fast refresh of a snapshot is possible only if the snapshot's master table has a snapshot log. When a snapshot is fast refreshed, entries in the snapshot's associated snapshot log that have appeared since the snapshot was last refreshed are applied to the snapshot. The time of last refresh can be obtained from the database audit trail.

         

Also, to obtain the snapshot refresh times manually, the view DBA_SNAPSHOT_REFRESH_TIMES may be queried at the master site.
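For example, a sketch of such a query:

SELECT name, last_refresh
FROM   dba_snapshot_refresh_times
WHERE  owner = 'SCOTT';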

Any snapshot created at the snapshot site is automatically registered at the master site.

3.8.5 Snapshot Groups

A snapshot group in a replication system maintains a partial or complete copy of the objects at the target master group. Snapshot groups cannot span master group boundaries.

Snapshot groups are required for supporting updateable snapshots. If a snapshot does not belong to a snapshot group, then it must be a read-only snapshot.

 

3.8.6 Refresh Groups

One of the main tasks of Oracle is to provide a read-consistent view of the database at the statement level as well as the transaction level. Thus, within a given transaction, a group of logically related snapshots must be transactionally consistent.

 

A refresh group can contain snapshots from more than one snapshot group to maintain transactional (read) consistency across master group boundaries.

Avoid using multiple refresh groups to refresh the contents of a single snapshot group: doing so may introduce inconsistencies in the snapshot data, which may cause referential integrity problems at the snapshot site.
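A sketch of creating a refresh group for the snapshot used in this project (the group name is hypothetical):

BEGIN
  DBMS_REFRESH.MAKE(
    name      => 'scott.snap_refgrp',   /* hypothetical group name */
    list      => 'scott.test',
    next_date => SYSDATE,
    interval  => 'SYSDATE + 1/24');
END;
/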

 

3.8.7 OVERVIEW OF THE SNAPSHOT CREATED

An overview of the snapshot created against the master definition site Oracle1 is shown below.

The table 'sub' is created on the master definition site from the SQL prompt:

 

create table sub (
  subcode number,
  subname varchar2(30),
  constraint pk_2 primary key (subcode)
);

 

This table is created in the 'scott' schema of the Oracle1 database. The SQL script for creation of the snapshot follows:

 

create snapshot scott.test
  refresh next (sysdate) + 1/(24*60)
  for update as
  select * from scott.sub@scott_link;

 

The snapshot 'test' is created in the 'scott' schema of the Oracle2 database. It references the table 'sub' in the 'scott' schema of Oracle1 through the database link 'scott_link', as shown in the last line of the script. The snapshot is updateable, as indicated by the 'for update' clause, and is refreshed every minute, as indicated by the 'refresh' clause.

 

 

 

 

3.9 SETUP OF REPLICATION ENVIRONMENT.

 

The SQL scripts generated while setting up the multimaster and snapshot environments are given below. They include scripts for generating master sites, adding them to master groups, and configuring the materialized views. These scripts are generated after configuring the replication environment through DBA Studio and then clicking the 'Generate SQL Script' tab.

 

Steps in setting up a Master site:

At oracle0:

Connect as SYSTEM to oracle0

            CONNECT SYSTEM/MANAGER@oracle0   

 

Create replication administrator

            CREATE USER repadmin IDENTIFIED BY repadmin;

 

Grant privileges to replication administrator

To grant the replication administrator the privileges to create and manage a replication environment, we execute the GRANT_ADMIN_ANY_SCHEMA API:

BEGIN
  DBMS_REPCAT_ADMIN.GRANT_ADMIN_ANY_SCHEMA (
    username => 'repadmin');
END;
/

Note: So that REPADMIN is able to create snapshot logs for any replicated table, we grant COMMENT ANY TABLE and LOCK ANY TABLE to REPADMIN.
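In SQL, these two grants are simply:

GRANT COMMENT ANY TABLE TO repadmin;
GRANT LOCK ANY TABLE TO repadmin;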

 

Register propagator

A propagator, responsible for propagating the deferred transaction queue to other master sites, is registered:

BEGIN
  DBMS_DEFER_SYS.REGISTER_PROPAGATOR (
    username => 'repadmin');
END;
/

 

Register receiver

The receiver receives the propagated deferred transactions sent by the propagators at other master sites:

BEGIN
  DBMS_REPCAT_ADMIN.REGISTER_USER_REPGROUP (
    username       => 'repadmin',
    privilege_type => 'receiver',
    list_of_gnames => NULL);
END;
/

 

Schedule purge at master site

The SCHEDULE_PURGE API automates the purge process. This helps keep the size of the deferred transaction queue in check by purging successfully completed deferred transactions:

CONNECT repadmin/repadmin@oracle0

BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PURGE (
    next_date        => SYSDATE,
    interval         => 'sysdate + 1/24',
    delay_seconds    => 0,
    rollback_segment => '');
END;
/

 

Create master site users

This consists of the following steps:

Create proxy snapshot administrator

The proxy snapshot administrator performs tasks at the target master site on behalf of the snapshot administrator at the snapshot site:

CONNECT system/manager@oracle0

CREATE USER proxy_snapadmin IDENTIFIED BY proxy_snapadmin;

BEGIN
  DBMS_REPCAT_ADMIN.REGISTER_USER_REPGROUP (
    username       => 'PROXY_SNAPADMIN',
    privilege_type => 'PROXY_SNAPADMIN',
    list_of_gnames => NULL);
END;
/


Create proxy refresher

The proxy refresher performs tasks at the master site on behalf of the refresher at the snapshot site:

CREATE USER proxy_refresher IDENTIFIED BY proxy_refresher;
GRANT CREATE SESSION TO proxy_refresher;
GRANT SELECT ANY TABLE TO proxy_refresher;

 

Create database links between master sites

The database links provide the necessary distributed mechanisms to allow the different replication sites to replicate data among themselves. Before creating any private database links, the public database links that each private database link will use are created. The replication administrators are then connected using private database links:

 

CONNECT system/manager@oracle0
CREATE PUBLIC DATABASE LINK oracle1 USING 'oracle1';
CREATE PUBLIC DATABASE LINK oracle2 USING 'oracle2';

CONNECT repadmin/repadmin@oracle0
CREATE DATABASE LINK oracle1 CONNECT TO repadmin IDENTIFIED BY repadmin;
CREATE DATABASE LINK oracle2 CONNECT TO repadmin IDENTIFIED BY repadmin;

CONNECT system/manager@oracle1
CREATE PUBLIC DATABASE LINK oracle0 USING 'oracle0';
CREATE PUBLIC DATABASE LINK oracle2 USING 'oracle2';

CONNECT repadmin/repadmin@oracle1
CREATE DATABASE LINK oracle0 CONNECT TO repadmin IDENTIFIED BY repadmin;
CREATE DATABASE LINK oracle2 CONNECT TO repadmin IDENTIFIED BY repadmin;

CONNECT system/manager@oracle2
CREATE PUBLIC DATABASE LINK oracle0 USING 'oracle0';
CREATE PUBLIC DATABASE LINK oracle1 USING 'oracle1';

CONNECT repadmin/repadmin@oracle2
CREATE DATABASE LINK oracle0 CONNECT TO repadmin IDENTIFIED BY repadmin;
CREATE DATABASE LINK oracle1 CONNECT TO repadmin IDENTIFIED BY repadmin;

 

 

Create scheduled links

The scheduled link determines how often the deferred transaction queue is propagated to each of the other master sites. The SCHEDULE_PUSH procedure needs to be executed for each database link that is created:

CONNECT repadmin/repadmin@oracle0

BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PUSH (
    destination => 'ORACLE1',
    interval    => 'SYSDATE + 10 / (24 * 60)',
    next_date   => SYSDATE);
END;
/

BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PUSH (
    destination => 'ORACLE2',
    interval    => 'SYSDATE + 10 / (24 * 60)',
    next_date   => SYSDATE);
END;
/

CONNECT repadmin/repadmin@oracle1

BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PUSH (
    destination => 'ORACLE0',
    interval    => 'SYSDATE + 10 / (24 * 60)',
    next_date   => SYSDATE);
END;
/

BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PUSH (
    destination => 'ORACLE2',
    interval    => 'SYSDATE + 10 / (24 * 60)',
    next_date   => SYSDATE);
END;
/

CONNECT repadmin/repadmin@oracle2

BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PUSH (
    destination => 'ORACLE0',
    interval    => 'SYSDATE + 10 / (24 * 60)',
    next_date   => SYSDATE);
END;
/

BEGIN
  DBMS_DEFER_SYS.SCHEDULE_PUSH (
    destination => 'ORACLE1',
    interval    => 'SYSDATE + 10 / (24 * 60)',
    next_date   => SYSDATE);
END;
/

 


 

SETTING UP THE SNAPSHOT ENVIRONMENT

SETTING UP SNAPSHOT GROUPS

MONITORING THE DATABASE AND IMPROVING PERFORMANCE BASED ON THE LOGGED STATISTICS

Based on the information logged about replication activity through auditing (refer to Chapter 4), the performance of replication is optimized, as elaborated in the coming sections.


CHAPTER 4: Auditing

General Overview

Auditing is the tracking of database actions. The main aims of auditing are to:

*Gather data about specific database activities, so that the database administrator can then make useful decisions for improving either the performance or the security features of the database.

*Check for the occurrence of unauthorized activities.

The general types of auditing are:

 

4.1.1 Statement Auditing

The auditing of SQL statements with respect only to the type of statement is called statement auditing. For example, AUDIT TABLE tracks several DDL statements regardless of the table on which they are issued.

 

4.1.2 Privilege Auditing

The auditing of the use of powerful system privileges to perform corresponding actions is called Privilege auditing. Privilege auditing is more specific than statement auditing because it audits only the use of the privilege.

 

4.1.3 Schema Object Auditing

Schema object auditing is the auditing of specific statements on a particular schema object. It is the most specific kind of auditing, tracking only a given statement on a given schema object.
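As a sketch, one statement of each type (the object audited is the snapshot used in this project):

AUDIT TABLE;                  /* statement auditing */
AUDIT CREATE ANY TABLE;       /* privilege auditing */
AUDIT SELECT ON scott.test;   /* schema object auditing */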

 

4.3 The Database Audit Trail

Auditing records information such as the operation performed and the user who performed it, in the chronological order of occurrence. Audit records can be stored either in a data dictionary table, called the database audit trail, or in an operating system audit trail.

 

The database audit trail is enabled by setting the AUDIT_TRAIL parameter to DB or TRUE. This enables system-wide auditing and causes audit records to be written to the database audit trail (the SYS.AUD$ table).

 

In this project, the parameter AUDIT_TRAIL is set to TRUE in the parameter file init.ora and to FALSE in the parameter file init(sid).ora, where 'sid' denotes the system identifier (Oracle0, Oracle1, Oracle2).

If auditing is not enabled in this way, no audit information is generated. The database audit trail is a single table named AUD$ in the SYS schema of each Oracle database's data dictionary.

 

The following general information is included in each audit trail record:

*The user name
*The name of the schema object accessed
*The operation performed or attempted
*The completion code of the operation
*The date and time stamp
*The system privileges used

 

The audit trail has nothing to do with transaction commit or rollback: the audit information remains even after a rollback. To make meaningful interpretation of the audit information possible, two SQL scripts, CATALOG.SQL and CATAUDIT.SQL, need to be run as SYS. To disable auditing, we run the script CATNOAUD.SQL.

 

 

4.4 Project Specific Use of Auditing

The various steps are:

1. The AUDIT_TRAIL parameter was set to TRUE.

2. The scripts CATALOG.SQL and CATAUDIT.SQL were run.

3. An AUDIT SELECT was issued on the snapshot at the snapshot site.

4. Any read access of the snapshot is then recorded in the database along with its time stamp. The audit information for the snapshot is read from the audit trail as follows:

         

SELECT owner,
       TO_CHAR(timestamp, 'DD-MON-YY HH:MI P.M.'),
       action_name
FROM   sys.dba_audit_trail
WHERE  owner = 'SCOTT'
AND    TO_CHAR(timestamp, 'DD-MON-YY') IN
         (SELECT TO_CHAR(sysdate, 'DD-MON-YY') FROM dual)
AND    TO_CHAR(timestamp, 'HH:MI P.M.')
         BETWEEN '01:00 P.M.' AND '11:00 P.M.';
/* the original script spelled the last predicate as a chain of
   hourly BETWEEN conditions ('01:00 P.M.'-'02:00 P.M.', ...,
   '10:00 P.M.'-'11:00 P.M.'), which is equivalent */

         

This query retrieves the audit records for access to the snapshot during the afternoon and evening hours (1 P.M. to 11 P.M.). Suppose the snapshot refresh interval is one hour, i.e., refreshes occur at 11 A.M., 12 noon, 1 P.M., and so on. If accesses to the snapshot are very frequent within one such interval, the snapshot may be serving values that are not up to date. To provide updated information to the users of the snapshot, we force a refresh using the DBMS procedure

        DBMS_SNAPSHOT.REFRESH

 

For instance,

EXECUTE DBMS_SNAPSHOT.REFRESH('sub,marks,code', 'CF');

performs a complete refresh of the SUB snapshot, a fast refresh of the MARKS snapshot, and a default refresh of the CODE snapshot.

 

4.5 Job Queues

Job queues are event schedules specified by the user. In a distributed database configuration, they are used to enable automatic refresh of snapshots: the job queue processes wake up periodically and refresh any snapshots that are scheduled to be automatically refreshed. If more than one job queue process is used, the processes share the task of refreshing snapshots. These processes also execute job requests created with the DBMS_JOB package and propagate queued messages to queues on other databases.

 

The job queue background processes (SNPn) handle message propagation. To enable propagation, at least one job queue process must be started.

 

We can use job queues to schedule periodic execution of PL/SQL code.

 

4.5.1 SNP Background Processes

A multi-process Oracle system uses some additional processes called background processes  in order to improve the performance and to allow multiple users. Background processes consolidate functions that would otherwise be handled by multiple Oracle programs running for each user

process. Background processes perform I/O by interrupting main-line activity i.e. they function asynchronously and monitor other Oracle processes to provide increased parallelism for better performance and reliability.

 

SNP background processes execute job queues. Any PL/SQL code can be scheduled to be performed periodically using the job queue. To schedule a job we submit it to the job queue and specify the frequency at which the job is to be run. Jobs can be altered, deleted and disabled.

 

At least one SNP process must be running to execute queued jobs in the background. SNP processes periodically wake up and execute any queued jobs that are due to run. SNP background processes differ from other Oracle background processes in that the failure of an SNP process does not cause the instance to fail; if an SNP process fails, Oracle restarts it.

 

SNP background processes will not execute jobs if the system has been started in restricted mode. The mode can be turned on or off as follows.

 

          ALTER SYSTEM ENABLE RESTRICTED SESSION;

          ALTER SYSTEM DISABLE RESTRICTED SESSION;

 

In a restricted session the jobs are not executed; we have to switch to unrestricted mode for the processes to run.

 

An Oracle instance can support a maximum of 36 SNP processes, named SNP0 through SNP9 and SNPA through SNPZ. The task of executing queued jobs can be shared among multiple SNP processes.

 

Job queue initialization parameters control the operation of the SNP background processes. The parameters can be set in the init(sid).ora file, and JOB_QUEUE_PROCESSES can also be changed dynamically:

 

            ALTER SYSTEM SET JOB_QUEUE_PROCESSES = 10;

 

The JOB_QUEUE_INTERVAL parameter controls how frequently the SNP processes wake up.

 

4.5.2 Managing  Job Queues

To schedule and manage jobs in the job queue, the procedures in the DBMS_JOB package are used. Any user who can execute the job queue procedures can use the job queue. The DBMS_JOB package provides the following procedures:

 

SUBMIT

REMOVE

CHANGE

WHAT

NEXT_DATE

INTERVAL

BROKEN

RUN

 

4.5.2.1 SUBMIT

Used to submit a new job to the job queue. For example:

VARIABLE jobno NUMBER;
BEGIN
  DBMS_JOB.SUBMIT(:jobno,
                  'snapjob(''db2'',''sub'');',
                  SYSDATE, 'SYSDATE + 1');
  COMMIT;
END;
/

                  

jobno -- the job number assigned to the new job added to the queue.

'snapjob(...)' -- the procedure call to be executed periodically, passed as a string (here a user-defined procedure).

SYSDATE -- the next time the job is to run.

'SYSDATE + 1' -- an expression that calculates the interval between runs (here, one day).

 

 

 

Job Execution Mechanism

SNP background processes execute jobs. To execute a job, the process creates a session to run it. When an SNP process runs a job, the job runs in the same environment in which it was submitted and with the owner's default privileges. When a job is forced to run using the procedure DBMS_JOB.RUN, the job is run by a user process, with that user's default privileges only.

 

4.5.2.2 REMOVE

Used to remove a job from the job queue. The following statement removes job number jno from the job queue:

DBMS_JOB.REMOVE (jno);

Restrictions

A job that is currently executing can be removed from the job queue, but the execution in progress will not be interrupted and will run to completion. You can remove only jobs you own; if you try to remove a job that you do not own, you receive a message stating that the job is not in the job queue.

 

4.5.2.3 CHANGE

We can alter any of the user-definable parameters associated with a job by calling the DBMS_JOB.CHANGE procedure. In the following example, the job identified as jno is changed to execute every n days:

               DBMS_JOB.CHANGE(jno, null, null, 'SYSDATE + n');

If we specify NULL for WHAT, NEXT_DATE, or INTERVAL when calling the procedure CHANGE, the current value remains unchanged.

         

4.5.2.4 WHAT

The definition of a job is changed by calling the DBMS_JOB.WHAT procedure. The following example changes the definition of the job identified as jno (the new definition is passed as a string):

DBMS_JOB.WHAT(jno, 'new_procedure_call;');

4.5.2.5 NEXT_DATE

The next date that Oracle executes a job can be altered by calling the DBMS_JOB.NEXT_DATE procedure, as follows:

 

                         DBMS_JOB.NEXT_DATE(jno, 'SYSDATE + 1');

 

4.5.2.6 INTERVAL

The following statement changes the execution interval for a job by calling the DBMS_JOB.INTERVAL procedure:

              DBMS_JOB.INTERVAL(jno, 'NULL');

In this case, the job will not run again after it successfully executes.

 

4.5.2.7 BROKEN

Used to mark a job as broken or not broken.

A job is said to be broken if:

a. It has failed to execute in 16 consecutive attempts, or

b. It has been explicitly marked broken using the DBMS_JOB.BROKEN procedure:

           DBMS_JOB.BROKEN(jno, TRUE);

Oracle does not attempt to execute a broken job until the job is either marked as not broken or forced to execute by calling the procedure DBMS_JOB.RUN. A job can be marked as not broken, after which it runs again at its next scheduled date:

           DBMS_JOB.BROKEN(jno, FALSE);

 

4.5.2.8 RUN

Used to manually execute a job. To force a job to be executed immediately, the procedure RUN is used; Oracle then recomputes the next execution date. The following statement runs job jno in your session and recomputes the next execution date:

DBMS_JOB.RUN (jno);

 

Terminating a Job

A running job is terminated by marking the job as broken, identifying the session running the job, and disconnecting that session. The job should be marked as broken first, so that Oracle does not attempt to run it again.
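A sketch of the sequence (jno is the job number; the sid and serial# values are placeholders that would be looked up in DBA_JOBS_RUNNING and V$SESSION):

BEGIN
  DBMS_JOB.BROKEN(jno, TRUE);   /* stop Oracle from rescheduling the job */
END;
/

ALTER SYSTEM KILL SESSION 'sid,serial#';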


Standby Database

A standby database is a database replica created from a backup of a primary database. By applying archived redo logs from the primary database to the standby database, we can keep the two databases synchronized.

The main purposes of a standby database are:

·                     Disaster protection

·                     Protection against data corruption

·                     Supplemental reporting

If the primary database is destroyed or its data becomes corrupted, we can perform a failover to the standby database, in which case the standby database becomes the new primary database. We can also open a standby database with the read-only option, thereby allowing it to function as an independent reporting database.
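A sketch of the two operations mentioned above, issued on the standby instance (Oracle8i syntax):

/* failover: make the standby the new primary */
ALTER DATABASE ACTIVATE STANDBY DATABASE;

/* or open the standby for reporting only */
ALTER DATABASE OPEN READ ONLY;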

Standby Database Creation

We can construct the standby database from backups of the primary database control files and datafiles, and then prepare it for managed recovery. This involves the following basic steps (a sketch of the key commands follows the list):

1.                 Make a backup of the primary datafiles (or access a previous backup) and create the standby control file.

2.                 Transfer the standby datafiles and control file to the standby site.

3.                 Configure Net8 so that we can connect to the standby service name.

4.                 Configure the primary and standby initialization parameter files.

5.                 Initiate automatic archiving on the primary site.

6.                 Start the standby instance without mounting it.
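A sketch of the commands behind steps 1 and 6 (the control file name is hypothetical):

/* step 1, on the primary */
ALTER DATABASE CREATE STANDBY CONTROLFILE AS '/backup/stby.ctl';

/* step 6, on the standby */
STARTUP NOMOUNT;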

Figure 1-8 Standby Database Creation



Configuration of the Standby Database Environment

A standby database environment refers to the physical configuration of a primary database with one or more associated standby databases. The main factors affecting the configuration of the standby database environment are described below.

Number of Standby Databases

Although a standby database can be synchronized with one and only one primary database, a single primary database can support a theoretically unlimited number of standby databases. These standby databases are separate and independent, and can reside on multiple machines or on a single machine.

 

 

  Using Backups for Standby Creation

Each standby database must be created from a backup of the primary database. This backup can be either consistent or inconsistent, open or closed.

We can also use a single backup of the primary database to create an unlimited number of standby databases, although the various standby databases in the environment do not have to be created from the same backup. For example, we can create one standby database from a backup of the primary database taken on one date and another standby database from a backup taken on another date.

Figure 1-9 Creating Standby Databases Using Different Backups



Limits Imposed by Automatic Archival to Standby Sites

In a managed standby environment, the primary database can automatically archive to a maximum of four standby sites. Consequently, we can simultaneously run at most four standby databases in managed recovery mode in any given standby database environment.

Although automatic archival limits the number of standby databases that can be maintained in managed recovery mode, it does not limit the number of standby databases that can be maintained in manual recovery mode.

Transferring Archived Redo Logs

We have two options for transferring the logs to the standby site:

·                     Configure the primary database to archive automatically to the standby site

·                     Transfer the logs manually

Managed Recovery Environment

If we choose to implement a managed recovery environment, then we must connect the primary and standby databases through Net8. To configure the primary initialization parameter file for automatic archival to a standby site, we need to specify a service name. To specify a service name, we must configure network files such as tnsnames.ora, listener.ora, and possibly names.ora.

When a primary database archives to a standby service, Oracle automatically transfers the archived redo logs through Net8 to a directory on the standby site. As the primary database archives each log, Oracle automatically transfers the new log to the standby site.

Independence of Automatic Archival and Managed Recovery

The primary database can continue to archive to the standby site even if the standby database is not in standby recovery mode, but only if the standby instance is started. The mechanism for recovery of a standby database is independent of the mechanism for automatic transfer of archived redo logs to the standby site. Consequently, we can take a standby database out of managed recovery mode and temporarily place it in read-only mode. While the standby database is in read-only mode, archived redo logs continue to accumulate at the standby site.
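A sketch of moving a standby between the two modes from SQL*Plus (assuming Oracle8i syntax):

/* place the standby in managed recovery mode */
RECOVER MANAGED STANDBY DATABASE;

/* or take it out of recovery and open it read-only */
ALTER DATABASE OPEN READ ONLY;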

 

 

Optional Manual Transfer of Archived Redo Logs

Even if we configure the primary database to archive automatically to the standby site, we can still transfer the logs manually if the occasion requires it, but only if we have specified the standby site as an optional destination.

Non-Managed Recovery Environment

In a non-managed recovery environment, we do not configure the primary database to archive automatically to the standby site and so must transfer the archived redo logs by hand. Because the primary database is not archiving automatically to the standby site, we do not need to maintain a Net8 connection. For example, we can use an operating system utility such as UNIX cp or ftp to transfer the archived redo logs to the standby site, connect to the standby database using operating system authentication, then recover the database manually.
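A sketch of one manual cycle (host names and paths are hypothetical):

/* on the primary host, ship the logs at the operating system level,
   e.g. with ftp or rcp, into the standby archive directory */

/* on the standby, apply them manually from SQL*Plus */
RECOVER STANDBY DATABASE;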

 

Figure 1-10 Non-Managed Recovery Environment



 

 

 

Location and Directory Structure of Primary and Standby Sites

We need to know the number and configuration of the machines involved. Of particular importance are whether:

·                     The primary and standby databases reside on the same host or on different hosts

·                     The primary and standby sites have identical or different directory structures

Number and Location of Machines

There is no theoretical limit to the number of machines on which the standby databases reside. For example, we can locate a standby database:

·                     On the same host as the primary database

·                     On a different host in the same data center

·                     On a different host in a different data center, but in the same metropolitan area

·                     On a different host in a data center in a geographically remote location, for example, on a different continent

The location of hosts involved in the standby database environment has obvious implications for a disaster recovery strategy. For example, if the primary host in a data center is destroyed, then we cannot perform failover to a standby database unless it resides on a different host, which may or may not be in the same data center. In a worst case scenario, if the data center is completely destroyed, then we cannot perform a failover to a standby database unless the standby database is located on a different machine in a remote location.

 

 

 

 

Directory Structure of Standby Sites

The directory structure of the various standby sites determines the path names for the standby datafiles and redo logs. A standby site on the same host as the primary database necessarily uses a different directory structure; otherwise, the standby database attempts to overwrite the primary database files.

Configuration Options

We have three basic configuration options, which are illustrated in the figure given below:

·                     A standby database on the same host as the primary database

·                     A standby database on a separate host that uses the same directory structure as the primary host

·                     A standby database on a separate host that uses a different directory structure from the primary host


Figure 1-11 Possible Standby Configurations




The following table describes possible configurations of primary and standby databases and the consequences of each:

Standby Host: Same as primary host
Directory Structure: Differs from primary host (necessarily)
Consequences:
·  We must set the LOCK_NAME_SPACE parameter.
·  We must rename primary database filenames in the standby database control file.
·  The standby database does not protect against disaster.

Standby Host: Separate
Directory Structure: Same as primary host
Consequences:
·  We do not need to rename primary database filenames in the standby control file, although we can still do so if we want a new naming scheme.
·  Using separate physical media for our databases safeguards our primary data.

Standby Host: Separate
Directory Structure: Differs from primary host
Consequences:
·  We must rename primary database filenames in the standby database control file.
·  Using separate physical media for our databases safeguards our primary data.

Although these configurations are mutually exclusive, that is, a given standby database must use only one configuration, we can run multiple standby databases simultaneously for a given primary database. Consequently, we can implement any combination of configurations. For example, we can maintain one standby database on the same host as the primary database, another standby database on a separate host in the same data center, and a third standby database on a separate host on the other side of the world. We can run each standby in managed recovery mode, manual recovery mode, or read-only mode.

Standby Database Maintenance

Although a managed standby environment is mostly automated, it is not completely automated. In various situations, we may need to perform status checks or maintenance on the standby database. The most common maintenance operations are:

·                     Checking the Status of Archived Redo Logs

·                     Backing Up the Standby Database

·                     Responding to Physical Changes in the Primary Database

 

Advantages and Disadvantages

A standby database can be a powerful tool for both disaster prevention and supplementary reporting. For example, we can:

·                     Maintain a standby database in a location that is geographically remote from the primary database, or maintain several standby databases in geographically diverse locations.

·                     Maintain the primary and standby databases on different disk drives of the same machine, so that if the primary database's drive fails, we can activate the standby database and resume normal operations.

·                     Implement a managed standby configuration, whereby a standby database automatically applies archived redo logs that are automatically shipped to the standby site by a primary database. In this way, changes to a primary database are regularly propagated to a standby database.

·                     Make a standby database the new primary database with minimal loss of time and data if the primary database is completely destroyed.

·                     Provide possible protection against erroneous batch jobs, user errors (for example, truncating the wrong table), or application corruptions on the primary database by not applying archived logs containing corrupt data to the standby database. We can then activate the uncorrupted standby database, making it the primary database.

While a standby database can be a tremendous benefit in our backup and recovery strategy, it involves costs as well. For example, a standby database requires:

·                     An additional computer if we want to maximize disaster prevention by maintaining a standby database on a separate host

·                     Implementation and maintenance of a Net8 connection if we use the managed standby environment

·                     Additional system resources and extra storage space no matter which implementation we choose

·                     Administration of the standby database to mirror some structural operations (for example, adding a tablespace or datafile) performed on the primary database