RELIABILITY AND SYSTEM ENGINEERING TEAM WORK IN SAFETY ANALYSIS FOR AVIONICS SYSTEMS.

 

Vigdor Brecher and Dan Rabinovitz

Elbit Systems Ltd, POB 539, Haifa 31053 Tel. 04-8316462 e-mail: brecher@elbit.co.il

 

ABSTRACT

For avionics systems, the safety aspects need detailed and accurate attention. A teamwork methodology is presented that provides a systematic approach to the proper use of information and tools. The presentation includes practical equations, checklists and examples.

 

INTRODUCTION

At Elbit Systems Ltd (ESL), a teamwork concept for performing safety-related activities was developed and applied on several projects during the last three years. The Reliability, Availability, Maintainability and Safety (RAMS) group collaborates with the System Engineering Group and the Chief Safety Engineer in training the system engineers, project by project, to use the proper tools and adopt the proper methodology. The methodology, in part or in full, was applied on projects ranging from simple display systems through complex avionics display systems and weapon delivery systems to full aircraft upgrade projects, both military and civil. The methodology is used for Aircraft and Helicopter Upgrades, Unmanned Airborne Vehicles (UAVs) and Combat Vehicle Upgrades (see Figure 1).

 

Figure 1

 


We tried to ensure that the same safety process is used across the whole company and that it covers all lifecycle phases, from pre-development to in-service.

The safety process was also conducted at aircraft level and was therefore able to identify and manage safety issues that cross system boundaries.

We considered it very important that the safety process be concurrent with, and interact with, system design and development. This is why the involvement of the System Engineering Group was so important to us.


SAFETY ASSESSMENT - TEAMWORK OF SYSTEM ENGINEERING AND RAMS

 

Identification of Safety Standards/Procedures Currently Applied

At ESL, the application of safety tasks for military systems as per MIL-STD-882C/D, System Safety Program Requirements, was well known to the Engineering Teams and System Engineers.

Identifying the safety standards/procedures currently applied at ESL and properly tailoring the appropriate methodology was more difficult for Civil Aviation applications and for UAVs.

The Safety Assessment Process (see Figure 2) includes Functional Hazard Assessment, Preliminary System Safety Assessment, System Safety Assessment, Common Cause Analysis and Safety-Related Flight Operations or Maintenance Tasks.

 

Figure 2

 

 

FAA documents, along with Military Standards, were adopted as the basis for the assessment process.

 

Tools and Methodologies

System Engineering developed the Functional Flows and prepared simplified schematics that were the basis for preparing overall Fault Trees to be discussed with the RAMS team.

We introduced FTA software tools to replace the work previously performed with Visio, and trained the system engineers in using them.

Basic Steps of Hazard Analysis

1. The Hazard Analysis was performed on the identified hazards.

2. Equipment failure rates for the failure modes resulting in the hazards defined in the PHL were allocated from reliability data provided by the equipment manufacturers' designers and from Elbit proprietary data.

3. On the basis of the FMECA of the equipment comprising the system, the failure rates causing total failure and the failure rates causing misleading or improper data generation were calculated.

4. Assuming random failure occurrence and independence between failures, the probability of each basic failure was calculated for a normalized mission duration of one hour.

5. Functional and Electrical block diagrams were constructed for the hazard assessment.

6. Fault trees were defined on the basis of the Functional and Electrical block diagrams, and the hazard probabilities were calculated using the FTA method.

7. The calculated probabilities of hazard occurrence were allocated to the mission phases in which they are applicable.

8. A Hazard Analysis results table was prepared.
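Steps 2 to 7 above can be illustrated numerically. The sketch below is not the ESL tool or its data. It assumes a constant failure rate, so that the probability of a basic failure during a mission of t hours is 1 - exp(-λt), and it combines independent basic events through OR and AND gates. The event names, failure rates and phase fractions are hypothetical, and allocating the hazard probability to mission phases in proportion to exposure time is an assumption made only for this example.

```python
import math

def basic_event_probability(failure_rate_per_hour, mission_hours=1.0):
    """P(at least one failure in the mission) for a constant failure rate."""
    # -expm1(-x) equals 1 - exp(-x) and stays accurate for very small x.
    return -math.expm1(-failure_rate_per_hour * mission_hours)

def or_gate(probabilities):
    """Output event occurs if any of the independent inputs occurs."""
    none_occurs = 1.0
    for p in probabilities:
        none_occurs *= (1.0 - p)
    return 1.0 - none_occurs

def and_gate(probabilities):
    """Output event occurs only if all independent inputs occur."""
    all_occur = 1.0
    for p in probabilities:
        all_occur *= p
    return all_occur

# Hypothetical failure rates per hour, as would come from the FMECA (step 3).
rates = {
    "display_misleading": 2.0e-6,
    "processor_misleading": 5.0e-6,
    "monitor_fails_to_detect": 1.0e-4,
}

# Step 4: basic event probabilities for a normalized one-hour mission.
p = {name: basic_event_probability(rate) for name, rate in rates.items()}

# Step 6: hazard = misleading data generated AND not detected by the monitor.
p_misleading = or_gate([p["display_misleading"], p["processor_misleading"]])
p_hazard = and_gate([p_misleading, p["monitor_fails_to_detect"]])

# Step 7 (assumed rule): allocate by the fraction of the mission in each phase.
phase_fraction = {"takeoff": 0.05, "cruise": 0.85, "landing": 0.10}
p_by_phase = {phase: p_hazard * f for phase, f in phase_fraction.items()}

print(f"hazard probability per mission hour: {p_hazard:.3e}")
for phase, value in p_by_phase.items():
    print(f"  {phase}: {value:.3e}")
```

For failure rates this small, 1 - exp(-λt) is practically equal to λt, which is why the one-hour normalization lets failure rates and probabilities be compared directly.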

 

 

The system safety design order of precedence is consistent with MIL-STD-882D. The order of precedence for satisfying system safety requirements and resolving identified hazards is as follows:

(1) Design for Minimum Risk. From the onset of design development, the design is set to eliminate hazards. If an identified hazard cannot be eliminated, the associated risk is reduced to an acceptable level, as defined by the managing authority, through design selection.

(2) Incorporate Safety Devices. If identified hazards cannot be eliminated or their associated risk adequately reduced through design selection, that risk shall be reduced to an acceptable level through the use of fixed, automatic, or other protective safety design features or devices. Provisions shall be made for periodic functional checks of safety devices when applicable.

(3) Provide Warning Devices. When neither design nor safety devices can effectively eliminate identified hazards or adequately reduce the associated risk, devices shall be used to detect the condition and to produce an adequate warning signal to alert personnel of the hazard. Warning signals and their application shall be designed to minimize the probability of incorrect personnel reaction to the signals and shall be standardized within like types of systems.

(4) Develop Procedures and Training. Where it is impractical to eliminate hazards through design selection or adequately reduce the associated risk with safety and warning devices, procedures and training shall be used. However, without a specific waiver, no warning, caution, or other form of written advisory shall be used as the only risk reduction method for Category I or II hazards (as defined in Table 4).

 

Used Tools

Fault Tree Analysis (FTA) is used throughout the safety process during the safety assessment activity.

The adopted FTA methodology, implemented using FTA software, has two advantages: the analysis becomes progressively deeper (the fault tree more detailed) as the system development progresses, and the tree provides a pictorial model of the system. The Fault Tree tool supports the following quantitative failure models:

· fixed unavailability and failure frequency model

· constant failure and repair rate model

· mean time to failure and repair model

· dormant failure with periodic inspection model

· sequential failure model

· standby model

· uncertainty values
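Three of the models listed above have simple closed forms. The sketch below uses the usual textbook expressions and is not taken from the FTA software, which may implement these models differently. The function names and the numerical values are hypothetical.

```python
import math

def unavailability_repairable(failure_rate, repair_rate, t=None):
    """Constant failure and repair rate model.

    Returns the steady-state unavailability, or the value at time t
    (component assumed available at t = 0) when t is given.
    """
    steady_state = failure_rate / (failure_rate + repair_rate)
    if t is None:
        return steady_state
    return steady_state * (1.0 - math.exp(-(failure_rate + repair_rate) * t))

def unavailability_from_mttf_mttr(mttf, mttr):
    """Mean time to failure and repair model (steady state)."""
    return mttr / (mttf + mttr)

def mean_unavailability_dormant(failure_rate, inspection_interval):
    """Dormant failure revealed only by periodic inspection.

    Average of 1 - exp(-lambda * t) over one inspection interval;
    approximately lambda * interval / 2 when that product is small.
    """
    x = failure_rate * inspection_interval
    return 1.0 + math.expm1(-x) / x

# Hypothetical values: lambda = 1e-4 /h, mean repair time 10 h,
# and a dormant monitor (1e-5 /h) checked every 1000 flight hours.
q_repairable = unavailability_repairable(1.0e-4, 1.0 / 10.0)
q_mttf = unavailability_from_mttf_mttr(mttf=1.0 / 1.0e-4, mttr=10.0)
q_dormant = mean_unavailability_dormant(1.0e-5, 1000.0)

print(f"repairable, steady state: {q_repairable:.3e}")
print(f"from MTTF/MTTR:           {q_mttf:.3e}")
print(f"dormant, mean:            {q_dormant:.3e}")
```

The first two results coincide because MTTF = 1/λ and MTTR = 1/μ under constant rates. The dormant model shows why the inspection interval matters: halving it roughly halves the mean unavailability of a hidden failure.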

The analyses supported by this tool are:

· cut sets (qualitative analysis)

· calculation of system unavailability and related parameters  

· calculation of gates probabilities

· common cause failures.
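The qualitative cut set analysis can be illustrated with a small example. The sketch below is a simplified top-down expansion written for this paper, not the algorithm of the FTA software. It represents a fault tree as nested tuples, derives the minimal cut sets, and estimates the top event probability with the rare-event approximation (sum over the minimal cut sets). The tree, the event names and the probabilities are hypothetical.

```python
from itertools import product

def minimize(cut_sets):
    """Keep only cut sets that contain no other cut set as a proper subset."""
    unique = set(cut_sets)
    return [s for s in unique if not any(other < s for other in unique)]

def minimal_cut_sets(node):
    """node is a basic event name (str) or a tuple (gate, [children])."""
    if isinstance(node, str):
        return [frozenset([node])]
    gate, children = node
    child_sets = [minimal_cut_sets(child) for child in children]
    if gate == "OR":
        # Any cut set of any child causes the gate output.
        combined = [cs for sets in child_sets for cs in sets]
    elif gate == "AND":
        # One cut set from every child must occur together.
        combined = [frozenset().union(*combo) for combo in product(*child_sets)]
    else:
        raise ValueError(f"unknown gate type: {gate}")
    return minimize(combined)

# Hypothetical tree: both display channels are lost. Each channel is lost
# if its own sensor fails or if the shared power supply fails.
top = ("AND", [
    ("OR", ["power_loss", "sensor_1_fail"]),
    ("OR", ["power_loss", "sensor_2_fail"]),
])

mcs = sorted(minimal_cut_sets(top), key=lambda s: (len(s), sorted(s)))

# Hypothetical basic event probabilities per flight hour.
p = {"power_loss": 1.0e-6, "sensor_1_fail": 1.0e-4, "sensor_2_fail": 1.0e-4}

def cut_set_probability(cut_set):
    result = 1.0
    for event in cut_set:
        result *= p[event]
    return result

p_top = sum(cut_set_probability(cs) for cs in mcs)

for cs in mcs:
    print(sorted(cs), f"{cut_set_probability(cs):.1e}")
print(f"top event (rare-event approximation): {p_top:.3e}")
```

The single-event cut set in this example is the shared power supply. A first-order cut set under an AND gate that was meant to provide redundancy is exactly the kind of dependency that the common cause analysis looks for.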

 

The Steps in Building a Fault Tree are:

 

Values for the Basic Events in the FTA are calculated by RAMS for:

Electrical components

· Predicted failure rate data based on MIL-HDBK-217

· Information from manufacturers (life test data)

· Data need to be adjusted for the proper environment and stresses

· Software databases

· Field use (last resort)

Mechanical components

· Determine stresses - loads (mechanical, environmental)

· Construct stress/strength equation for multiple loads if required

· Calculate design (safety) margin and reliability (probability of failure) for the required life

· Manufacturing defects per factory data or field failure data.
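For mechanical components, the stress/strength calculation can be sketched as follows. This is a minimal static interference model, assuming that strength and loads are independent and normally distributed and that multiple loads simply add. It does not model degradation over the required life. The function names and all numerical values are hypothetical.

```python
import math

def combine_loads(loads):
    """Combine independent additive loads given as (mean, std dev) pairs."""
    mean = sum(m for m, _ in loads)
    std_dev = math.sqrt(sum(s ** 2 for _, s in loads))
    return mean, std_dev

def safety_margin(mean_strength, sd_strength, mean_load, sd_load):
    """Margin in standard deviations of the (strength - load) difference."""
    return (mean_strength - mean_load) / math.sqrt(sd_strength ** 2 + sd_load ** 2)

def failure_probability(margin):
    """P(load > strength) = Phi(-margin) for the normal interference model."""
    return 0.5 * math.erfc(margin / math.sqrt(2.0))

# Hypothetical bracket: strength 500 +/- 30 (same unit as the loads),
# mechanical load 300 +/- 24 and environmental load 50 +/- 32.
mean_load, sd_load = combine_loads([(300.0, 24.0), (50.0, 32.0)])
margin = safety_margin(500.0, 30.0, mean_load, sd_load)
p_fail = failure_probability(margin)

print(f"combined load: {mean_load:.0f} +/- {sd_load:.0f}")
print(f"safety margin: {margin:.2f} standard deviations")
print(f"probability of failure: {p_fail:.3e}")
print(f"reliability: {1.0 - p_fail:.6f}")
```

The resulting probability of failure can then be used as the basic event value of the mechanical component in the fault tree, alongside the failure-rate-based values of the electrical components.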

 

Prepare a table of Criticality Levels and Probability Classifications of the evaluated failure conditions, to include: Failure Condition - Criticality Level - Probability Classification
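Such a table can be generated directly from the fault tree results. The sketch below assumes the order-of-magnitude thresholds commonly used in civil transport-aircraft guidance (1e-5, 1e-7 and 1e-9 per flight hour) and the usual pairing of criticality levels with probability classifications. These thresholds, the failure conditions and their probabilities are assumptions for illustration; on a real project they come from the applicable certification basis or, for military systems, from the MIL-STD-882 risk matrix.

```python
# Probability classifications ordered from most to least likely.
ORDER = ["Probable", "Remote", "Extremely Remote", "Extremely Improbable"]

# Assumed minimum classification required for each criticality level.
REQUIRED = {
    "Minor": "Probable",
    "Major": "Remote",
    "Hazardous": "Extremely Remote",
    "Catastrophic": "Extremely Improbable",
}

def probability_classification(p_per_flight_hour):
    """Classify a probability using the assumed 1e-5 / 1e-7 / 1e-9 limits."""
    if p_per_flight_hour > 1.0e-5:
        return "Probable"
    if p_per_flight_hour > 1.0e-7:
        return "Remote"
    if p_per_flight_hour > 1.0e-9:
        return "Extremely Remote"
    return "Extremely Improbable"

def build_table(failure_conditions):
    """Return rows of (condition, criticality, classification, compliant)."""
    rows = []
    for name, criticality, probability in failure_conditions:
        achieved = probability_classification(probability)
        compliant = ORDER.index(achieved) >= ORDER.index(REQUIRED[criticality])
        rows.append((name, criticality, achieved, compliant))
    return rows

# Hypothetical failure conditions with probabilities from the fault trees.
conditions = [
    ("Loss of primary attitude display", "Major", 3.0e-6),
    ("Misleading altitude data", "Hazardous", 7.0e-10),
    ("Undetected erroneous steering cue", "Catastrophic", 4.0e-8),
]

table = build_table(conditions)
for name, criticality, achieved, compliant in table:
    status = "meets target" if compliant else "does NOT meet target"
    print(f"{name} - {criticality} - {achieved} ({status})")
```

A row that does not meet its target points the team back to the design order of precedence: redesign first, then safety devices, warnings and procedures.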

 

 

In the Safety Assessment process, the Common Cause Analysis (CCA) at equipment level contributes to verifying that the independence requirements between equipment internal failures, related to a catastrophic or hazardous Undesired Event (UE), are met.

The CCA is performed during the equipment design and development phases. It is complementary to the Equipment Safety Analysis (ESA) but uses a different methodology. The CMA is generally a qualitative analysis.

The CCA process is made up of five steps:

· Collect the CCA inputs,

· Identify the CCA independence criteria,

· Establish a specific list of potential Common Cause Failures (Common Cause Types, Sources) adapted to the equipment under study,

· Analyse the design and the installation/maintenance/operation rules against the independence requirements, by identifying the precautions implemented to prevent Common Cause Failures (CCF),

· Document the results of the above steps. (See also Figure 3)

 

Why Teamwork Is So Important

 Safety analyses involve some degree of intrinsic uncertainty. While the intended behavior of the system is usually well understood, failures and their effects are harder to understand.

As a consequence, there is a degree of subjectivity in identifying them. Dealing with safety issues relies on the experience of the safety engineer, and brainstorming with system engineers and pilots is also of real help.

 

There is no way of testing for completeness of the hazard identification during the early phases of the design process, and often no way of checking in advance whether the safety problems that have been suggested are actually present in the system (although this can usually be decided as experience is gained in the longer term).


Different groups need to work with different views of the system (e.g. systems engineers / functional view; safety engineers / hazard-directed view). This is generally a benefit, in that taking different views can achieve new insights and help to identify flaws. However, this diversity can become a weakness if the views are not consistent, or if there are problems of clarity and completeness of comprehension when working with unfamiliar models. This is why teamwork is so important.

 

Existing / traditional safety analysis techniques are difficult to use on modern, complex systems for the following reasons:

- It is hard to produce the analysis in a timely manner (it is a slow, costly, labor-intensive, mainly manual process), which limits the contribution safety analysis can make to the evaluation of design alternatives.

- Minor design changes can have an extensive impact on the related safety analysis.

- Modern systems have a multitude of functions and contain intricate failure detection and management methods. As a consequence, there is a huge number of failure conditions, and it is difficult to assess and model the way in which failures propagate through the system.

- It is hard to represent the dynamic behavior of complex systems using techniques such as fault trees.

 

The following issues need further attention and methodological solutions:

· Interaction between structural and systems safety issues

· How to assess human error

· How to assess software error

· After the initial hazard identification (based on information from past programs / lessons learned) and the aircraft level FHA analysis, there is a lack of a systematic process for identifying new hazards, particularly those introduced as the design changes and develops.

· The specification of safety targets in terms of aircraft loss rates puts too much emphasis on quantitative (rather than qualitative) safety analysis.

· Textual description of failure modes is often too ambiguous.

· The classification of failure conditions. This covers a number of related issues:

- lack of consistency and repeatability (making it hard to achieve reliable comparisons / trade-offs between the risks of different hazards or failure modes)

- over-pessimistic assessment of failure mode effects (we reliability engineers are always pessimists!)

· Determination of Design Assurance Levels / Safety Integrity Levels, and especially the criteria for independence of failures

 

 

SUMMARY

A disciplined, effective and integrative process dealing with the safety aspects of the system is implemented in ESL projects. Serious steps were taken toward effective teamwork between System Engineering and RAMS in performing safety tasks for complex systems. This methodology was reviewed and applied to several projects. The integration of software with hardware, and a systematic process for identifying new hazards that combines the methodologies provided by military and civil standards, shall receive further attention. This presentation stressed the importance of teamwork in achieving the goal.