DBA InfoPower Powerful Database Insight
 


Keep These Databases Running.

(Achieving the highest level of database business continuity)

Introduction

 

Achieving high database business continuity cost-effectively can resemble a catch-22: with current technology dominated by hardware-rich cluster solutions, companies must spend heavily to reach high database uptime while trying to limit spending in order to stay cost-effective.

One common mistake made in an effort to increase database business continuity is to put 100% of the effort into creating highly available and fault-tolerant physical architectures.

While Gartner reports that a high percentage of lost database availability is due to physical causes (hardware failures), at least 15% of it is attributed to logical unavailability caused by the human factor.

For active business environments that constantly require new database code deployment and database re-architecture (e.g., online retailers, banks involved in securities trading, telecommunications companies), the cost of the human factor is significantly higher. Therefore, understanding and controlling the human factor and the changes it introduces can bring huge benefits to the company and to the stability of its operational environment.

 

Overview
 

What does achieving high database business continuity actually mean? First, we need to understand our goal: what exactly are we trying to accomplish? Based on industry publications, best practices and expert opinions, we can define the following set of goals:
1) Identify database related problems impacting business functions
2) Proactively contain and resolve these issues
3) Meet business response time and availability SLAs
4) Have full information on planned and ongoing business changes and an understanding of their impact on databases
5) Have early identification and ability to forecast scalability and capacity needs
6) Avoid unnecessary spending on hardware, software licenses and human labor.

 

The Challenge
 

Oracle and DB2 databases are two of the most reliable and configurable database engines used by businesses and governments these days.

On the positive side, both databases can support any required business model and with the right architecture can scale to satisfy extremely fast expansion of business activity.

On the negative side, application growth, upgrades and patches, constant growth in the number of supported databases, a changing user base and changing business needs place a heavy load on database support personnel as they work toward high business continuity.

In addition, the underlying hardware and software infrastructures are constantly changing, and the introduction of new hardware servers, front ends and application servers only adds to this challenge.

An analyst from the Gartner Group writes, “At the core of business data for most production applications is a relational database management system (RDBMS). RDBMS monitoring and administration has evolved into a highly specialized market…” This is a highly specialized market because providing and sustaining high database continuity is not an easy task. It cannot and should not be approached without an adequate strategy, supporting procedures and products in place.

The mission of this paper is to outline a successful methodology and strategy for achieving superior database business continuity, supported by software that is written by domain experts and has been applied with great success across multiple clients and industries.

 

Database Administration Functions

 

The following sections describe important database administration functions aimed at reducing critical application downtime caused by catastrophic database events and at improving database performance. They also cover the key requirements for a product capable of aiding a DBA in performing those functions.

 

Monitoring, Detection and Alerting

 

The first task in monitoring is to identify the right set of performance characteristics to monitor. While a number of characteristics are generic and can be used across databases, each database has specifics related to the application and business functions it supports.

To identify these database-specific performance characteristics, it is critical to use the data gathered during root cause analysis of database outages and during performance analysis, and to include the leading performance characteristics in the monitored set.

The list of monitored metrics can include a combination of database and OS metrics (including kernel statistics) and derivative metrics generated by custom SQL and custom scripts.

Another important task is the creation of a database performance baseline. While an absolute baseline is helpful on very stable databases (which is seldom the case in real production environments), its value quickly diminishes when the application profile, the business functions or the usage of the database changes. As a result, baselines need to be frequently re-evaluated and re-established.

During the period when the old baseline is no longer valid and a new baseline has not yet been established, judgment of database load and performance can be very subjective.

The resolution of this situation is to introduce automatic baseline generation into database monitoring. The baseline is then constantly and automatically adjusted according to the database behavior and is valid 100% of the time. As an additional benefit, it can be normalized to fit a range of, say, 0 to 1, so any significant change in the baseline will be spotted as a drop from 1 to 0.5 or 0, or a rise from 0 to 0.5 or 1.
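
The idea of an automatically adjusted, normalized baseline can be sketched as follows. This is a minimal illustration only: it assumes the baseline is a rolling window of recent samples and that normalization is a min-max rescaling over that window. The class name, window size and sample values are hypothetical and are not taken from any DBA InfoPower product.

```python
from collections import deque


class RollingBaseline:
    """Keeps the last `window` samples and rescales each new value to 0..1."""

    def __init__(self, window=288):
        # 288 five-minute samples cover one day; the window size is an assumption.
        self.samples = deque(maxlen=window)

    def update(self, value):
        """Record a sample and return its position within the current baseline range."""
        self.samples.append(value)
        low, high = min(self.samples), max(self.samples)
        if high == low:
            return 0.5  # flat history: there is no range to normalize against
        return (value - low) / (high - low)


baseline = RollingBaseline(window=5)
for v in [100, 110, 90, 105, 95]:
    baseline.update(v)
print(baseline.update(200))  # 1.0: the new sample sits at the top of the range
```

Because old samples fall out of the window, the range follows the database's recent behavior without a manual re-baselining step.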

In addition to the automatic metric baseline, real-time statistical processing can be used to identify the accumulation of small changes in system or SQL metric behavior and to alert proactively on issues that could affect database business continuity. Real-time alerting allows the DBA to take the necessary actions to prevent imminent database problems. It also enables automatic or semi-automatic containment and resolution of such problems.
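
One common way to catch an accumulation of small changes is a cumulative sum (CUSUM) detector. The paper does not name the statistical method it has in mind, so the sketch below is a stand-in technique, and the `target`, `slack` and `threshold` parameters and the sample readings are illustrative assumptions.

```python
def cusum_alerts(values, target, slack, threshold):
    """Return the indexes at which a one-sided upper CUSUM exceeds `threshold`."""
    total = 0.0
    alerts = []
    for i, v in enumerate(values):
        # Accumulate only the part of the deviation that exceeds the slack.
        total = max(0.0, total + (v - target - slack))
        if total > threshold:
            alerts.append(i)
            total = 0.0  # restart accumulation after alerting
    return alerts


# No single reading is dramatic, but the drift adds up.
readings = [10, 10, 11, 12, 12, 13, 13, 14]
print(cusum_alerts(readings, target=10, slack=0.5, threshold=5))  # [5, 7]
```

A fixed threshold of, say, 15 on the raw metric would never fire on these readings, while the accumulated deviation does.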

Other important components of software supporting these key DBA functions are listed below:
Trend clarification using smoothing filters with moving averages
Critical conditions alerts
Alert voice notification for crowded operations rooms and busy DBAs
Secured connection to databases
Easy deployment across databases
Visual features: overlaying and comparing metrics across Oracle RAC nodes or DB2 EEE nodes
Ability to mix / visually correlate databases in heterogeneous environment
Use of Monitoring Dashboard to consolidate and monitor many databases at once
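
As an illustration of the first item in the list above, a trailing moving average is one of the simplest smoothing filters. This is a generic sketch, not the filter used by any particular product, and the function name is hypothetical.

```python
def moving_average(values, window):
    """Trailing moving average; early points average the samples seen so far."""
    smoothed = []
    running = 0.0
    for i, v in enumerate(values):
        running += v
        if i >= window:
            running -= values[i - window]  # drop the sample that left the window
        smoothed.append(running / min(i + 1, window))
    return smoothed


print(moving_average([1, 2, 3, 4, 5], window=3))  # [1.0, 1.5, 2.0, 3.0, 4.0]
```

A larger window gives a smoother curve but reacts more slowly to real changes, so the window size is a tuning choice.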

 

Containment and Resolution

 

When a critical situation is identified, in many cases the DBA has two to five minutes to take corrective action before the database becomes unresponsive and has to be restarted or failed over.

In these situations, typing SQL commands or running scripts to find the “culprit” is not an effective way to restore database availability to the business. Unfortunately, this is how such situations are usually dealt with today. Sometimes third-party GUI tools are used, but only to terminate sessions explicitly specified by a DBA.

As a sound alternative, expert-level software needs to be in place that can contain the issue in seconds and resolve it automatically (or semi-automatically) in no more than three steps: containment, action request and resolution. In many cases this can be the automatic resolution of blocking locks, or the termination of resource-consuming unauthorized SQL, bad SQL whose execution plan has suddenly changed, and so on.
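
The three steps can be sketched for the blocking-lock case. The sketch assumes the lock information has already been collected as a mapping from each waiting session to the session blocking it; how that mapping is obtained, and how a session is terminated, are database-specific and are left to the hypothetical `confirm` and `terminate` callbacks.

```python
def root_blockers(waits_on):
    """Given {waiting_session: blocking_session}, return the sessions that
    block others but are not themselves waiting on anyone."""
    blockers = set(waits_on.values())
    waiters = set(waits_on.keys())
    return sorted(blockers - waiters)


def resolve(waits_on, confirm, terminate):
    """Run containment, action request and resolution, in that order."""
    candidates = root_blockers(waits_on)               # 1. containment: isolate the culprits
    approved = [s for s in candidates if confirm(s)]   # 2. action request: DBA or policy decides
    for session in approved:                           # 3. resolution
        terminate(session)
    return approved


# Sessions 12 and 15 wait on 7, which waits on 3; session 20 waits on 21.
waits = {12: 7, 15: 7, 7: 3, 20: 21}
print(root_blockers(waits))  # [3, 21]
```

Passing `confirm=lambda s: True` makes the flow fully automatic, while a callback that prompts the DBA makes it semi-automatic. A pure deadlock cycle has no root blocker and would need separate handling.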

In some cases a very “small” DBA or user action can have a significant impact on database continuity (e.g., collecting statistics on a very busy table). Such factors are usually overlooked when a general resolution approach is used.

An immediate corrective action would be the group elimination of database connections, or a logical “mark down” of the business component that is impacting overall database continuity.

 

Root Cause and Performance/Scalability Analysis

Information about what has changed in the database from one period to another (change capture) is vital for a deep understanding of database-level events and for the ability to correlate them with potential changes in business activity, software and/or user behavior.

Change capture should cover all instrumented database performance areas. For Oracle, these include wait events, system statistics, latches, I/O, UNDO, individual SQL, changed SQL plans, etc. For DB2, they include numerous instance-level and database-level metrics, as well as I/O metrics at the buffer pool, table space and table levels.
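
A minimal sketch of change capture, assuming each period has already been summarized as a dictionary of metric totals. The function name, the `min_ratio` cutoff and the figures are illustrative assumptions; the metric names are examples of Oracle wait events.

```python
def capture_changes(before, after, min_ratio=2.0):
    """Return (metric, growth_ratio) pairs for metrics that grew by at least
    `min_ratio` between two periods, largest growth first."""
    changes = []
    for name, new in after.items():
        old = before.get(name, 0.0)
        if old == 0:
            if new > 0:
                changes.append((name, float("inf")))  # metric appeared in the later period
        elif new / old >= min_ratio:
            changes.append((name, new / old))
    return sorted(changes, key=lambda item: item[1], reverse=True)


before = {"db file sequential read": 120.0, "log file sync": 40.0, "latch free": 5.0}
after = {"db file sequential read": 130.0, "log file sync": 200.0,
         "latch free": 5.0, "enq: TX - row lock contention": 90.0}
for metric, ratio in capture_changes(before, after):
    print(metric, ratio)
```

Here the newly appearing row-lock wait and the fivefold growth in `log file sync` are reported, while the small increase in read waits is filtered out.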

Other important components of root cause analysis and performance/scalability analysis are:

Seasonality identification - Ability to understand patterns of change in database behavior, for example to identify whether a given level of database resource consumption occurs regularly or started only a few days ago.
Combined metrics - Ability to combine multiple classes of database metrics for visual correlation.
Cross-database combined metrics - Ability to combine database metrics across databases for visual correlation. This ability is extremely important for both “share nothing” database systems (DB2 EEE or replicated databases) and “share everything” systems (Oracle RAC/OPS).
Smoothing capability - Ability to apply smoothing filters to database metrics for clear trend identification. Combined with linear regression, this gives the DBA a reliable indication of a change in database resource consumption and of its potential impact on database business continuity.
Automated reporting - Ability to generate change capture reports and combined database metrics reports automatically across all required databases.
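
The linear regression mentioned in the list above can be sketched with an ordinary least-squares fit of a metric against its sample index. The function names and the capacity limit are hypothetical; the sketch assumes evenly spaced samples and a trend that stays linear.

```python
def fit_line(values):
    """Least-squares fit of values against their index; returns (slope, intercept)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two samples")
    mean_x = (n - 1) / 2
    mean_y = sum(values) / n
    cov = sum((i - mean_x) * (y - mean_y) for i, y in enumerate(values))
    var = sum((i - mean_x) ** 2 for i in range(n))
    slope = cov / var
    return slope, mean_y - slope * mean_x


def samples_until(values, limit):
    """Estimate how many more samples pass before the trend line reaches `limit`.
    Returns None when the trend is flat or falling."""
    slope, intercept = fit_line(values)
    if slope <= 0:
        return None
    return (limit - intercept) / slope - (len(values) - 1)


usage = [2, 4, 6, 8]                    # e.g. resource consumption per sample
print(fit_line(usage))                  # (2.0, 2.0)
print(samples_until(usage, limit=20))   # 6.0
```

In practice the fit would normally be applied to smoothed data, since a few outliers can noticeably tilt a least-squares line.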

 

DBA InfoPower, Inc. Products

 

To support the best strategy for achieving high database business continuity, DBA InfoPower, Inc. created a product line intended to help every DBA execute the methods listed above.

DBA InfoPower, Inc. offers three products that assist clients in the daily task of achieving high database business continuity: DBA Heartbeat (for Oracle, DB2 and MySQL), a real-time database monitoring and proactive alerting component; DBAct (GA for Oracle, beta release for DB2), a real-time problem containment and resolution product; and DBA Performance Explorer-I (GA for Oracle, beta release for DB2), a root cause analysis and performance/scalability analysis product.

A major benefit of DBA InfoPower products is that DBA Heartbeat and Performance Explorer are heterogeneous, cross-database products that let a DBA work with maximum efficiency when supporting complex multi-database, multi-vendor environments.

 

Applying DBA InfoPower, Inc. Products to the Process

 

This section describes how DBA InfoPower, Inc. products are applied to the strategy for achieving high database business continuity.

 

Monitoring, Detection and Alerting

 

DBA PHB (PROactive Heartbeat) gives the DBA the power to proactively identify, and stay alerted to, issues that can seriously impact business continuity.

The DBA PHB setup and workflow is as follows:

Step 1: Identify the set of metrics to monitor. This can be done by:
a) Selecting a prepackaged set of metrics prepared by DBA InfoPower experts
b) Creating a custom metric set aligned with the enterprise's business usage patterns
c) Using the DBA Performance Explorer-I product to identify performance metrics that are directly associated with events threatening database continuity and are therefore good candidates for monitoring.
Step 2: Once the metrics to monitor have been identified, the DBA Heartbeat™ Agent Builder™ module is used to design and create the monitoring agent definition.
Step 3: DBA Heartbeat™ Alert Builder™ is used to create and set alert conditions. Alert Builder™ sets visual and voice alerts per agent metric, its moving average and its automatic metric baseline.
Step 4: DBA Heartbeat™ Connection Manager™ is used to deploy agents across database servers. Mass agent deployment and agent activation/deactivation are accomplished from a single point of control.
Step 5: DBA Heartbeat Console™ connects to the agents and begins real-time monitoring and alerting.

DBA Heartbeat™ also provides the DBA with the following advanced real-time monitoring features:

Simple and effective Agent Builder supporting database metrics, OS metrics, custom script metrics, custom SQL metrics
Alert Builder with custom message and voice alert capabilities
Easy to identify alert message panel
Effective Connection Manager acting as a single point of control for connection configuration and agent deployment, activation and deactivation
Secured communication with agents over customized SSH/SSL protocol
Automated metric baseline coupled with proactive problem identification algorithm
Smoothing filters for clear trend identification
Ability to mix monitored metrics across databases and database platforms on the same monitoring panel
Ability to consolidate and monitor multiple database servers as a logical cluster
Ability to consolidate and monitor multiple database servers on a dashboard alert panel with instant visualization of alerted metric
Portability and support of many hardware platforms - written in Java.

 

Containment and Resolution

 

DBAct gives the DBA the power to contain and instantly resolve issues threatening database business continuity. This command-line, cross-platform module equips the DBA with over 80 actionable functions needed to identify and automatically or semi-automatically resolve cases of extreme contention and resource consumption.

Examples:
“dbact killblock now” – eliminate blocking locks
“dbact cleanswap now” – eliminate idle/unused sessions consuming swap space

 

Root Cause and Performance/Scalability Analysis

 

DBA PEi (Performance Explorer-I) is a complete root cause analysis tool that enables very fast discovery of the causes of database faults and overload, replacing difficult and time-consuming manual performance analysis and report generation.

Performance Explorer-I features:

1) Automatic period comparison and change capture - Allows quick identification of the root cause of any changes threatening database continuity and performance characteristics. It eliminates the need for eyeballing across dozens of reports to find a problem that needs to be fixed NOW!
2) High performance visualization – Tens of thousands of time points are rendered in seconds. A whole quarter of data can be visualized at 5-minute snapshot granularity.
3) Overlapping view of current and historic data – Quickly identifies if database behavior is normal or anomalous.
4) Performance prognosis - Calculating and visualizing regression trends reduces complex data view to a manageable and understandable form that can be used to generate clear headroom and capacity prognoses.
5) Data Smoothing - Sophisticated, tunable filtering smooths data to clarify performance trends.
6) SQL level Detail - Drill Down to the level of individual SQL queries to determine how a change in different SQL characteristics is affecting the system. Similar queries can be aggregated to display overall system impact!
7) Full performance management – All statistics, wait events, latches, SQL and I/O performance data collected
8) Full batch capability – Run analysis in batch and in parallel across hundreds of databases. Instantly generate HTML and graphical reports.
9) Portability and support of many platforms - Written in Java
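
The aggregation of similar queries mentioned in item 6 can be illustrated by reducing each statement to a signature in which literals are replaced by placeholders. This is a generic sketch, not Performance Explorer-I's actual method; the regular expressions handle only simple string and numeric literals, and lower-casing the text ignores case-sensitive quoted identifiers.

```python
import re
from collections import defaultdict

_STRING = re.compile(r"'(?:[^']|'')*'")    # quoted strings, with '' as an escaped quote
_NUMBER = re.compile(r"\b\d+(?:\.\d+)?\b")  # standalone integer or decimal literals


def signature(sql):
    """Replace literals with '?' and normalize case and whitespace."""
    sql = _STRING.sub("?", sql)
    sql = _NUMBER.sub("?", sql)
    return " ".join(sql.lower().split())


def aggregate(executions):
    """Sum elapsed seconds per signature from (sql_text, elapsed_seconds) pairs."""
    totals = defaultdict(float)
    for sql, elapsed in executions:
        totals[signature(sql)] += elapsed
    return dict(totals)


runs = [
    ("SELECT * FROM orders WHERE id = 42", 1.5),
    ("select *  from orders where id = 7", 2.5),
    ("SELECT 1 FROM dual", 0.25),
]
print(aggregate(runs))
# {'select * from orders where id = ?': 4.0, 'select ? from dual': 0.25}
```

Grouping this way shows the combined cost of a statement that is individually cheap but executed with many different literal values.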

Conclusion

 

Providing high database business continuity is not an easy task and requires thorough knowledge of database internals and of the specifics of the business environment. For this task, DBA InfoPower, Inc. provides a complete line of high-quality components that automate monitoring and alert identification, provide a powerful problem containment and resolution module, and offer an automated solution for root cause analysis and performance/scalability analysis. DBA InfoPower, Inc. provides the technology you need to feel confident in the hard task of securing high database business continuity and availability.

 

About the Author

 

Ron Warshawsky is a principal technologist and the founder of DBA InfoPower, Inc. He has over 12 years of technical experience with Oracle and DB2 databases, in roles ranging from entry-level to principal positions in database and high availability architecture. He has designed and implemented numerous database high availability and fault-tolerant/fault-preventive solutions that serve Fortune 100 and leading e-commerce companies.

 

About DBA InfoPower, Inc.

 

DBA InfoPower, Inc. is an emerging provider of database business continuity solutions. Our products give our clients a significant boost in the business continuity of critical database-centric systems, driving up business availability numbers and significantly reducing the management and maintenance costs related to identification, prevention, containment and root cause analysis of database-related problems. Founded in 2001 and based in Santa Clara, California, DBA InfoPower has offices in Boston, MA and Yardley, PA. Currently in an expansion stage, we work closely with 500+ trial business clients who are using our technology.

 
