Ways to Improve Performance
Your databases are growing bigger and bigger, and it is extremely important to
consider data access performance throughout the development cycle. The
performance of an application is typically dominated by data access.
Performance optimization may involve many aspects, such as server hardware and
software, database design and size, network tuning, optimization of distributed
queries, and client application coding. In many cases, performance can be
improved simply by coding your data access components correctly. This short
article focuses mainly on tuning client-side code with the OleDBPro module.
OleDBPro offers various ways to boost your application's performance, some of
them unique, and all of them can be implemented simply without much coding.
-
Avoid network
traffic as much as possible.
Network roundtrips and data packing between a client and a server are
usually the number one cause of poor data access performance, and must be
reduced as much as possible, especially when a large set of data is
involved. OleDBPro has two core classes, CRBase and CBatchParam<T>,
which by default use batch mode to update and retrieve data to and from
an OLE DB data source. All of their derived classes inherit the same
mechanism.
It is critical to construct a SQL query statement correctly to avoid
fetching needless data. One of the capabilities of the SQL language is
its ability to filter data at the server side so that only the data
required is returned to the client. Using this facility minimizes
expensive network traffic between a server and a client, which implies
that both the SELECT fields and the WHERE clause must be restrictive
enough to retrieve only the data required. Reducing the size of a rowset
improves data access speed, remote usability, and multi-user scalability.
If you only update (SQL INSERT, UPDATE and DELETE) a lot of data in a
data source, it is highly desirable to use CBatchParam<T> instead of
CRBase-derived classes, and to send multiple sets (20, 40, or more) of
data to the server in a single call to CBatchParam<T>::DoBatch
through a parameterized SQL statement or a stored procedure. This avoids
retrieving data from the server to the client and reduces data packing
and movement over the expensive network,
as shown in the example MultiProcs.
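The idea behind sending multiple parameter sets per call can be sketched in plain C++. This is a toy model, not the actual OleDBPro API: `OrderParams` is a hypothetical parameter set, and each inner vector returned by `MakeBatches` stands for the payload of one batched call, i.e. one network roundtrip.

```cpp
#include <algorithm>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical parameter set for something like
// "INSERT INTO Orders VALUES (?, ?)".
struct OrderParams {
    int id;
    std::string customer;
};

// Pack parameter sets into groups of `perBatch`; each inner vector
// represents the payload of one batched call (one roundtrip), instead
// of one roundtrip per row.
std::vector<std::vector<OrderParams>>
MakeBatches(const std::vector<OrderParams>& rows, std::size_t perBatch) {
    std::vector<std::vector<OrderParams>> batches;
    for (std::size_t i = 0; i < rows.size(); i += perBatch) {
        std::size_t end = std::min(rows.size(), i + perBatch);
        batches.emplace_back(rows.begin() + i, rows.begin() + end);
    }
    return batches;
}
```

With 45 rows and 20 sets per batch, only 3 roundtrips are needed instead of 45.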
The OleDBPro module has a powerful template class,
CMultiBulkRecord<T>, which can handle complicated statement
batches like "SELECT * FROM Orders;INSERT INTO Employees
VALUES(.....);EXECUTE GetOrderInfo(?, ?, ?,....)". A statement batch is
a way of sending multiple statements from a client to a server at one
time, thereby reducing the number of network roundtrips to the server. If
the statement batch contains multiple SELECT statements, the server
returns multiple rowsets to the client in a single data stream.
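Building such a batch amounts to joining individual statements with semicolons so they travel in a single roundtrip. A minimal sketch (generic code, not the CMultiBulkRecord<T> API):

```cpp
#include <string>
#include <vector>

// Combine individual SQL statements into one batch string so they are
// sent to the server in a single roundtrip instead of one per statement.
std::string MakeStatementBatch(const std::vector<std::string>& statements) {
    std::string batch;
    for (const auto& s : statements) {
        if (!batch.empty()) batch += ';';
        batch += s;
    }
    return batch;
}
```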
As shown in the example Scroll,
this OleDBPro module supports the use of bookmarks, keys and indexes to
pinpoint records; see CRBase and
CRBaseEx.
Additionally, you can jump from one record to another by setting nSkipped
in CRBase::MoveNext(LONG
nSkipped=0). All of these methods are designed
to reduce data traffic over the network.
As shown and discussed in the
examples FilterSort and DataShape,
using MS data access services can eliminate avoidable data movement
over the network in many cases.
If possible, it is highly recommended to move a statement batch with
multiple executions into a stored procedure and let the server run it,
reducing data movement over the network as much as possible.
CRBase uses batch mode to fetch records from a server to a client. By
default, the batch size is 20. However, if a record has only a few
fields and its size is small, it may be worthwhile to increase the
batch size before opening a rowset to boost performance.
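The effect of the batch size on roundtrips is simple arithmetic, sketched below (generic code, not the CRBase API): fetching N records in batches of B costs ceil(N / B) roundtrips, so raising the batch size for small records divides the roundtrip count accordingly.

```cpp
#include <cstddef>

// Roundtrips needed to fetch `totalRows` records `batchSize` at a time.
std::size_t FetchRoundTrips(std::size_t totalRows, std::size_t batchSize) {
    return (totalRows + batchSize - 1) / batchSize;  // ceiling division
}
```

Fetching 1000 small records takes 50 roundtrips at the default batch size of 20, but only 10 at a batch size of 100.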
-
Don't use
more rowset properties than necessary.
In OLE DB, rowset properties determine what cursor is used to manage a
resultant rowset. Cursors are a useful and flexible tool in a database
management system, but they are expensive for a server to manage. The
more functionality a cursor has, the more it costs.
-
Use
transactions often but correctly.
A primary goal of using transactions
(COSession::BeginTrans,
COSession::Commit and
COSession::Rollback) is to reduce
the amount of data transferred and data packing between server and
client. Long-running transactions can be fine for a single user, but they
scale poorly to multiple users, may block other users accessing the
same resources, and may even cause deadlocks. Therefore, an application
should avoid long-running transactions in a multi-user environment.
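One way to keep transactions short and correctly ended is a scope guard: the transaction is rolled back automatically unless it is explicitly committed. The sketch below is a generic RAII pattern, not the actual COSession API; the commit/rollback callables stand in for COSession::Commit and COSession::Rollback.

```cpp
#include <functional>
#include <utility>

// Minimal RAII transaction guard (a pattern sketch, not COSession):
// unless Commit() is called, the destructor rolls the transaction
// back, so an early return or exception cannot leave it open and
// blocking other users.
class TransactionGuard {
public:
    TransactionGuard(std::function<void()> commit,
                     std::function<void()> rollback)
        : commit_(std::move(commit)), rollback_(std::move(rollback)) {}
    ~TransactionGuard() { if (!done_) rollback_(); }
    void Commit() { commit_(); done_ = true; }
private:
    std::function<void()> commit_, rollback_;
    bool done_ = false;
};
```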
-
Use prepared
parameterized statements or procedures.
Both the CBatchParam<T> and
CBulkRecordParam<T>
classes fully support prepared parameterized statements and procedures with
any number (1, 2, 3, .......) and type (INPUT, OUTPUT and INPUT/OUTPUT)
of parameters. Using prepared parameterized statements or procedures
avoids repeated parsing of a SQL statement on the server side.
Furthermore, the two classes reuse one OLE DB command object instead of
repeatedly creating it on the client side.
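The benefit can be modeled by counting parses. The mock below is not the CBatchParam<T> API; it is a toy statement object that parses its SQL text once when prepared and then reuses the plan for every execution, versus re-parsing on each unprepared execution.

```cpp
#include <cstddef>
#include <string>
#include <utility>

// Toy model of prepare-once / execute-many (not the OleDBPro API).
// ParseCount() shows how many times the server would parse the SQL.
class PreparedStatement {
public:
    explicit PreparedStatement(std::string sql) : sql_(std::move(sql)) {}
    void Prepare() {
        if (!prepared_) { ++parseCount_; prepared_ = true; }
    }
    void Execute() {
        if (!prepared_) ++parseCount_;  // unprepared: parse every time
        // prepared: reuse the existing plan, no parse
    }
    std::size_t ParseCount() const { return parseCount_; }
private:
    std::string sql_;
    bool prepared_ = false;
    std::size_t parseCount_ = 0;
};
```

Preparing once before 100 executions costs one parse; executing unprepared costs one parse per call.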
-
Consider using
provider-specific interfaces.
OLE DB is extremely flexible and extensible. You may
easily use provider-specific interfaces to send and fetch data to and
from a data source. For example, you can use the interface
IRowsetFastLoad of the SQL Server provider to load records into a table
at the fastest speed (BCP, bulk copy). In comparison with ODBC, it is
really simple to use provider-specific interfaces.
-
Select an OLEDB
provider correctly.
Typically, it is highly recommended to use a native
OLE DB provider. Today there may be several OLE DB providers available
for a given DBMS. Some of them may be faster at retrieving records, and
others faster at updating records. You may need to compare them and
select the one that fits your specific purposes.
-
Use
Just-In-Need to retrieve data from a server.
As shown in the example FastAccess,
this OleDBPro module has a unique feature at the time of this writing:
deferred columns. If you have a big rowset with a large number
of fields but don't always need to access all of them for each record
(accessing them only under particular conditions), you can use
CRBase::SetDBPart to discard the fields that are not often
accessed at run time. If you do need to access those discarded fields
under particular conditions, you can use CRBase::GetDataEx or
CRBase::SetDataEx to retrieve or update them. This feature works like
just-in-need retrieval, and obviously reduces network traffic.
The improvement mainly depends on whether the property DBPROP_DEFERRED
is set to true and what percentage of data fetching can be avoided. For
details and reasons, refer to the short article, Deferred
Columns and Performance. At the time of this
writing, the MS Access provider sets DBPROP_DEFERRED to true by default.
Even if an OLE DB provider does not set this property to true, it is still
safe and recommended to use this feature, because the provider may add
it in the future; if so, your current code will help improve your
application's performance then.
Additionally, this unique feature reduces copying data from an OLE DB
provider to its consumer, which increases speed somewhat as well. When we
tested this idea with the MS Access provider, the result amazed us!
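The deferred-column idea itself is just lazy loading, which the sketch below models in plain C++ (a toy of the concept behind DBPROP_DEFERRED, not the OleDBPro API): the column value is fetched from the provider only on first access, so a column that is never read costs no transfer or copy.

```cpp
#include <functional>
#include <optional>
#include <string>
#include <utility>

// Toy deferred column: `fetch` stands in for the provider call that
// would actually move the data over the network. It runs at most once.
class DeferredColumn {
public:
    explicit DeferredColumn(std::function<std::string()> fetch)
        : fetch_(std::move(fetch)) {}
    const std::string& Value() {
        if (!cache_) cache_ = fetch_();  // fetch on first access only
        return *cache_;
    }
    bool Fetched() const { return cache_.has_value(); }
private:
    std::function<std::string()> fetch_;
    std::optional<std::string> cache_;
};
```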
-
Avoid
conversions between data types when possible.
By default, OleDBPro uses the data types of the columns of a
raw rowset, which are determined by the OLE DB provider and the table
column definitions. This is fast. When possible, use the default data
types.
-
Configure the
underlying DBBINDING structures correctly.
By default, CRBase and its derived classes retrieve
data values, statuses and lengths for variable-length data types from a
provider. You can configure the underlying DBBINDING structures at run
time to bind data values only. This can reduce the work of copying data
from a provider into its client consumer and slightly increase data
access speed.
-
Reduce
traversal of the array of DBBINDING structures.
Getting data values, statuses or lengths requires traversing an array
of DBBINDING structures inside CRBase and its derived classes. This may
consume a little time, especially for a big rowset. You can obtain an
array of pointers to the internal buffers described by the DBBINDING
structures by calling CRBase::GetData,
CRBase::GetStatusPtr and
CRBase::GetLengthPtr, eliminating
repeated traversal of the DBBINDING structures. This may slightly
increase performance.
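The pointer-caching trick can be illustrated with a simplified binding model (a toy of the idea, not the real DBBINDING layout or the OleDBPro API): instead of scanning the binding array for a column's buffer offset on every access, resolve each column's pointer once and index directly afterwards.

```cpp
#include <cstddef>
#include <vector>

// Simplified binding: each column's value lives at `offset` bytes into
// a row buffer (a toy model, not the real OLE DB DBBINDING structure).
struct Binding {
    std::size_t ordinal;
    std::size_t offset;
};

// Slow path: scan the binding array for the column on every access.
char* FindValue(const std::vector<Binding>& bindings, char* row,
                std::size_t ordinal) {
    for (const auto& b : bindings)
        if (b.ordinal == ordinal) return row + b.offset;
    return nullptr;
}

// Fast path: resolve each column's buffer pointer once up front, then
// index the cached array directly on every subsequent access.
std::vector<char*> CacheValuePtrs(const std::vector<Binding>& bindings,
                                  char* row) {
    std::vector<char*> ptrs;
    for (const auto& b : bindings) ptrs.push_back(row + b.offset);
    return ptrs;
}
```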