Management of left-censored data in dietary exposure assessment of chemical substances

EFSA Journal

Abstract

Within the general framework of chemical risk assessment, a difficult step in dietary exposure assessment is the handling of concentration data reported to be below the limit of detection (LOD). These data are known as non-detects, and the resulting distribution of occurrence values is left-censored. Handling left-censored data represents a challenge for EFSA's collection and statistical analysis of chemical occurrence data. EFSA has so far treated left-censored data with widely used substitution methods recommended by international organisations. This approach has inherent limitations when computing percentiles and when applying statistical techniques. An EFSA working group was established to estimate the accuracy of the methods currently used and to propose recommendations for more advanced alternative statistical approaches. Based on a simulation study and on analyses of real data, an ad hoc evaluation was carried out to assess the performance of different statistical methods for handling non-detects, i.e. parametric maximum likelihood (ML) models, the log-probit regression method and the non-parametric Kaplan-Meier (KM) method. Results showed that the number of samples had a relatively limited impact on the accuracy and precision of estimates, but the degree of censoring had a large effect. When analysing a complex set of data, it was also shown that it is essential to identify possible sources of heterogeneity in a dataset, such as country of sample collection/origin, food group, laboratory, etc. Statistical analyses should either be conducted separately for each of these factors or, to account explicitly for this heterogeneity, fixed/random effect ML models could be used.
Based on the minimum number of available samples and on different censoring percentages, the working group outlined recommendations, including the use of appropriate statistical tests, for handling left-censored distributions of chemical contaminant data in the context of exposure assessment.
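
As an illustration of the substitution approach mentioned above, the following minimal sketch replaces each non-detect with a fixed value and compares the resulting means. The fill values (0, LOD/2 and LOD), the single shared LOD, the function names and the sample concentrations are all assumptions made for this example; they are not taken from the working group's evaluation.

```python
from statistics import mean


def substitution_bounds(values, lod):
    """Return (lower, middle, upper) means for left-censored data.

    values: measured concentrations, with None marking a non-detect.
    lod: limit of detection (assumed identical for all samples).
    """
    def substituted_mean(fill):
        # Replace every non-detect with the chosen fill value, then average.
        return mean(fill if v is None else v for v in values)

    return substituted_mean(0.0), substituted_mean(lod / 2), substituted_mean(lod)


def censoring_fraction(values):
    """Share of samples reported as non-detects."""
    return sum(v is None for v in values) / len(values)


# Hypothetical concentrations (e.g. mg/kg); None marks a result below the LOD.
samples = [None, None, 0.8, 1.2, 2.0]
lb, mb, ub = substitution_bounds(samples, lod=0.5)
print(f"censored: {censoring_fraction(samples):.0%}")
print(f"mean with 0: {lb:.3f}, with LOD/2: {mb:.3f}, with LOD: {ub:.3f}")
```

The gap between the lowest and highest of these means widens as the share of non-detects grows, which is one way to see why the degree of censoring matters more than the number of samples. Percentiles are affected more strongly than means, since any percentile that falls within the censored portion simply takes the substituted value.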