Application of control charts for non-normally distributed data using a statistical software program: A technical case study

Control charts are a valuable assessment tool in the healthcare industry. Ease of use is crucial if these trending charts are to deliver timely results with minimum effort. The current case shows how non-normal data can be analyzed to obtain useful control charts without resorting to multiple transformations and/or omitting aberrant values. Raw results for the quality of purified water from a treatment plant that converts municipal water to purified water were collected from two points-of-use (ζ and Ψ). The data gathered were conductivity and total organic carbon (TOC) measurements. Statistical processing and control charting were done using commercial statistical software packages. The conductivity and TOC results of both points did not follow a Gaussian distribution, except for TOC at point Ψ, which passed a normality test; the remaining datasets were closer to other distributions. Several outlier values were observed, and removing these extreme values did not improve normality. The data were then interpreted using Laney-modified attribute control charts and compared with the original results plotted as individual-moving range (I-MR) charts. Interestingly, both types of control chart agreed on the control limits and on some alarm points. I-MR and Laney attribute charts could therefore be used for non-normal data with unusual distributions that may not suit conventional control charts, with the variable charts being more sensitive in alarm detection than the attribute charts.


Introduction
Process-behavior charts (also known as control, trending or Shewhart charts) are widely applied in different fields, including the healthcare industry [1,2]. Shewhart charts are important for monitoring patterns in the inspected characteristics [3]. However, trending charts have prerequisites that must be met before they are applied in an investigation, such as normality of the data for individual-moving range (I-MR) control charts [4]. Interestingly, it has been suggested that individuals (I) charts could be used for non-normal and attribute data [5].
In contrast, some authors recommend that I-MR charts be applied only when specific requirements are met, such as continuous data, time ordering and a certain degree of normality [6,7]. Accordingly, if the collected results do not meet the distribution specified for a trending chart, that control chart would not be appropriate for use. On the other hand, discrete data that do not meet the rules, and importantly the presumed distribution, of the ordinary attribute control charts may be handled better by applying the Laney modification, which has become available in statistical software programs such as Minitab [8]. Concerning the normal distribution requirement, however, the dilemma may intensify when some references stress the need for transformation before variable control charts, especially I-MR charts, are applied [4,9]. The current case aimed to investigate the application of conventional control charts to raw data that do not follow commonly known distributions such as the normal, Poisson or negative binomial (referred to herein as wild data) and to study the output obtained with no or minimal treatment of the data before charting. In addition, a comparison between the two systems of control charts reveals their agreements and differences, which helps determine the preference for each type and the flexibility to switch between them.

The subject of the study
The subject is a water treatment station that supplies a healthcare facility with purified water, using municipal water as the feed source. The plant generally comprises a coarse filtration tower, a chlorination adjustment unit, interchangeable softeners, a sanitization plant, and reverse osmosis (RO) and electro-deionization compartments. In addition, the purified water tank and distribution loop are monitored regularly for the quality of the water produced for consumption in the healthcare facility.

Sampling period
Samples of purified water were taken daily during the third quarter (July, August and September) of 2018.

Sample data
The water plant was sampled at two points-of-use, denoted by ζ and Ψ, and the samples were tested for total organic carbon (TOC) and conductivity. Results were collected in an Excel sheet by the quality engineer of the firm.

Unit of measurements and limits
The alert and action limits for TOC are not more than (NMT) 300.0 and 500.0 parts per billion (ppb), respectively. Conductivity measurements were made at 25 °C, and the primary criterion is that the reading must be NMT 1.30 µS/cm [10].

Data interpretation
Raw results were first subjected to descriptive statistical analysis accompanied by distribution fitting and normality testing. Shewhart charts were then constructed through two approaches. First, the data were plotted directly on I-MR charts regardless of whether they were normal. Second, the results were converted to discrete values by multiplying by a factor matching the decimal resolution of the test, i.e. ×10 for TOC and ×100 for conductivity, and interpreted using Laney-modified attribute control charts.
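The first approach, the I-MR chart, can be illustrated with a minimal sketch. It assumes the conventional constants for a moving range of length two (d2 = 1.128, D4 = 3.267); the function name and the TOC readings are hypothetical and are not the study data, and the limits produced by the software used in this study may differ if other options were selected.

```python
from statistics import mean

def imr_limits(values):
    """Center lines and 3-sigma limits for an I-MR chart (moving range of 2)."""
    if len(values) < 2:
        raise ValueError("need at least two observations")
    # Moving range: absolute difference between consecutive observations
    moving_ranges = [abs(b - a) for a, b in zip(values, values[1:])]
    x_bar = mean(values)
    mr_bar = mean(moving_ranges)
    sigma_hat = mr_bar / 1.128  # d2 constant for subgroups of size 2
    return {
        "i_center": x_bar,
        "i_ucl": x_bar + 3 * sigma_hat,
        "i_lcl": x_bar - 3 * sigma_hat,
        "mr_center": mr_bar,
        "mr_ucl": 3.267 * mr_bar,  # D4 constant for subgroups of size 2
        "mr_lcl": 0.0,
    }

# Hypothetical TOC readings in ppb, in time order (illustration only)
toc = [52.0, 48.5, 55.1, 47.3, 90.4, 50.2, 49.8]
limits = imr_limits(toc)
# Indices of points outside the I-chart limits
alarms = [i for i, x in enumerate(toc)
          if x > limits["i_ucl"] or x < limits["i_lcl"]]
```

Because sigma is estimated from the average moving range rather than from the overall standard deviation, the limits do not depend on fitting a particular distribution to the data.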

Role of commercial statistical software
Column and descriptive statistics were performed using GraphPad Prism for Windows version 6.01 [11]. Best distribution fitting analysis was done using XLSTAT version 2014.5.03 [12]. Box-and-whisker diagrams and control charts were plotted using Minitab® 17.1.0 [13].

Statistical analysis
Column statistics showed that the mean ± standard deviation (SD) of conductivity was 1.29 ± 0.42 and 1.25 ± 0.34 µS/cm at ports ζ and Ψ, respectively, and that of TOC was 52.3 ± 17.3 and 46.0 ± 12.1 ppb, respectively. Table 1 shows the percentile analysis, the upper and lower 95 % confidence intervals (CI) for the means, medians and geometric means, the coefficients of variation (CV) and the degree of skewness and kurtosis. In general, for each measured parameter the results from the two sampling points did not differ significantly at the 0.05 significance level, and a nonparametric assessment showed weak Spearman correlations between the points: 0.49 (TOC) and 0.33 (conductivity).

Normality testing
Normality was tested by three methods in GraphPad Prism at α = 0.05, as shown in Table 2. None of the datasets passed the normality tests except TOC Ψ, which passed the Kolmogorov-Smirnov test. The non-normal spread of the data is illustrated by the box plots in Figure 1, where TOC Ψ is the only dataset whose shape approximates the symmetry of a normal distribution. It should be noted that aberrant values were more frequent in the conductivity results than in the TOC results. However, removing the outliers did not improve normality, and the data still failed the normality tests.
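For readers without access to GraphPad Prism, a comparable screening can be sketched with SciPy. This is an illustration under stated assumptions, not the procedure used in this study: the Shapiro-Wilk test is assumed as the third method, the function name and the sample values are hypothetical, and the plain Kolmogorov-Smirnov p-value computed with an estimated mean and SD is only approximate, so the figures need not match those of Table 2.

```python
import numpy as np
from scipy import stats

def normality_report(values, alpha=0.05):
    """Run three normality tests and report whether each passes at alpha."""
    x = np.asarray(values, dtype=float)
    # KS test against a normal distribution with the sample mean and SD
    # (approximate, because the parameters are estimated from the data)
    ks_p = stats.kstest(x, "norm", args=(x.mean(), x.std(ddof=1)))[1]
    p_values = {
        "D'Agostino-Pearson": stats.normaltest(x)[1],
        "Shapiro-Wilk": stats.shapiro(x)[1],  # assumed third method
        "Kolmogorov-Smirnov": ks_p,
    }
    return {name: {"p": float(p), "passes": bool(p > alpha)}
            for name, p in p_values.items()}

# Hypothetical, strongly right-skewed sample (illustration only)
sample = [2.0 ** k for k in range(20)]
report = normality_report(sample)
```

A dataset "passes" a test when its p-value exceeds α, meaning that the test found no significant departure from normality.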

Control charts
Trending charts of TOC and conductivity for both the ζ and Ψ points-of-use are shown in Figures 2 to 5 using both I-MR and Laney-modified attribute charts, with red dots indicating alarms for out-of-control results. Means are indicated by green lines, and control limits (CLs) are indicated by red lines above (UCL) and below (LCL) the mean lines. The Z value (reported as Sigma Z in Minitab) in the Laney-corrected attribute chart is a measure of overdispersion of the data, which was marked for TOC but not for conductivity at both use points of the water treatment station.
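The second approach can be sketched in the same spirit. The text does not state which Laney chart was selected, so the sketch below assumes the U' form with a subgroup size of one per observation; the function names and readings are hypothetical. Sigma Z is estimated from the average moving range of the standardized values, and a value well above 1 indicates overdispersion relative to the Poisson assumption.

```python
from math import sqrt
from statistics import mean

def to_counts(values, factor):
    """Convert readings to discrete values, e.g. factor=10 (TOC) or 100 (conductivity)."""
    return [round(v * factor) for v in values]

def laney_u_prime(counts, sizes=None):
    """Laney U' chart: returns (center line, sigma Z, LCL list, UCL list)."""
    if len(counts) < 2:
        raise ValueError("need at least two observations")
    if sizes is None:
        sizes = [1] * len(counts)  # assumption: one unit per observation
    u = [c / n for c, n in zip(counts, sizes)]
    u_bar = sum(counts) / sum(sizes)
    # Within-subgroup (Poisson) standard deviation for each point
    within = [sqrt(u_bar / n) for n in sizes]
    z = [(ui - u_bar) / s for ui, s in zip(u, within)]
    # Between-subgroup variation from the moving range of the z-scores
    sigma_z = mean(abs(b - a) for a, b in zip(z, z[1:])) / 1.128
    ucl = [u_bar + 3 * s * sigma_z for s in within]
    lcl = [max(0.0, u_bar - 3 * s * sigma_z) for s in within]
    return u_bar, sigma_z, lcl, ucl

# Hypothetical conductivity readings in µS/cm (illustration only)
conductivity = [1.29, 1.31, 1.25, 1.40, 1.22]
counts = to_counts(conductivity, 100)
center, sigma_z, lcl, ucl = laney_u_prime(counts)
```

Under the unit subgroup size assumption, the U' limits reduce algebraically to the center line ± 3 × (average moving range of the counts) / 1.128, i.e. the individuals-chart limits computed on the scaled values. That would be consistent with the agreement in control limits between the two chart types observed here.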
Table 1 Statistical analysis of the inspected TOC and conductivity qualities of purified water in the healthcare facility

Discussion
Trending charts have become an indispensable tool in almost every field of research and industry, with a special focus on the healthcare industry, such as hospitals, medical device manufacturing and pharmaceutical firms [14-17]. However, ease of implementation without compromising the conclusions drawn from Shewhart charts is the crux of their application [18,19].
The present study aimed to solve this dilemma by using two approaches that require a minimum of conversions or transformations, which reduces the risk of calculation errors and saves time and effort while yielding a similar outcome. Wild data that do not follow a specific distribution, as in the current case, are common [20,21]. The use of I-MR charts with non-normal data has been reported previously, with the individuals (I) chart giving an outcome similar to that of Laney attribute control charts [22,23]. However, variable control charts are more sensitive than attribute trending charts because they can detect more types of alarm, which may provide an early warning of excursions before they occur. In Minitab, eight types of alarm can be detected in variable control charts versus four in their attribute counterparts. In addition, the MR chart of an I-MR pair provides a measure of the stability of the process variation [24]. Thus, while I-MR and Laney attribute control charts share alarm types one to four, alarm types five to eight are specific to variable control charts.
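Two of the alarm types shared by both chart families can be sketched as follows. The sketch assumes the commonly used definitions of test 1 (one point more than three standard deviations from the center line) and test 2 (nine points in a row on the same side of the center line); the function names and thresholds are illustrative, and the settings applied in this study may have differed.

```python
def rule_beyond_3_sigma(values, center, sigma):
    """Test 1: indices of points more than 3 sigma from the center line."""
    return [i for i, x in enumerate(values) if abs(x - center) > 3 * sigma]

def rule_run_same_side(values, center, run_length=9):
    """Test 2: indices that complete a run of `run_length` points on one side."""
    flagged = []
    run, side = 0, 0
    for i, x in enumerate(values):
        s = (x > center) - (x < center)  # +1 above, -1 below, 0 on the line
        if s != 0 and s == side:
            run += 1
        else:
            # Side changed or point sits on the center line: restart the run
            side = s
            run = 1 if s != 0 else 0
        if run >= run_length:
            flagged.append(i)
    return flagged

# Hypothetical standardized readings (illustration only)
readings = [0.2, 0.4, 0.1, 0.3, 0.5, 0.2, 0.6, 0.1, 0.3, 3.5, -0.2]
beyond = rule_beyond_3_sigma(readings, center=0.0, sigma=1.0)
runs = rule_run_same_side(readings, center=0.0)
```

The remaining tests available for variable charts follow the same pattern but rely on the one- and two-sigma zones, which is one reason they are not offered for attribute charts.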
Since many points (indicated by red dots) showed out-of-control states, the assignable causes of these variations need to be identified and corrected to improve the process and the quality of the purified water that feeds the healthcare facility. Once the inspected properties (TOC and conductivity) improve and come into control, which will be evident when the control charts show only common-cause variation, the process capability parameters can be calculated.

Conclusion
In general, both I-MR and Laney attribute charts can be applied simply to wild raw data that do not follow any of the distributions assumed for specific control charts. Accordingly, they save time and effort and provide a simple and effective way of continuously monitoring the inspected properties and/or process. However, I-MR charts may show some advantages over Laney attribute charts in terms of ease of construction and detection of out-of-control points. In the present case, better control and maintenance of the water treatment plant are needed to improve the TOC and conductivity of the purified water produced.

Disclosure of conflict of interest
None to declare.

Table 2 Normality tests investigated by three different methods at α = 0.05
Normality test | Conductivity ζ | TOC ζ | Conductivity Ψ | TOC Ψ
I- D'Agostino and Pearson omnibus normality test
KS = Kolmogorov-Smirnov; ns = not significant