1. Brandsæter A, Glad IK. Shapley values for cluster importance. Data Min Knowl Discov 2022. DOI: 10.1007/s10618-022-00896-3
Abstract
This paper proposes a novel approach to explaining the predictions made by data-driven methods. Since such predictions rely heavily on the data used for training, explanations that convey how the training data affects the predictions are useful. The paper quantifies how different clusters of the training data affect a prediction. The quantification is based on Shapley values, a concept from coalitional game theory developed to fairly distribute the payout among a set of cooperating players; a player's Shapley value is a measure of that player's contribution. Shapley values are often used to quantify feature importance, i.e., how features affect a prediction. This paper extends the idea to cluster importance, letting clusters of the training data act as players in a game where the predictions are the payouts. The proposed methodology lets us explore and investigate how different clusters of the training data affect the predictions made by any black-box model, conveying new aspects of a prediction model's reasoning and inner workings to its users. The methodology is fundamentally different from existing explanation methods, provides insight that would not be available otherwise, and should complement existing explanation methods, including explanations based on feature importance.
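As a minimal sketch of the idea (not the paper's implementation), exact Shapley values over a small number of clusters can be computed by averaging each cluster's weighted marginal contribution over all coalitions of the other clusters; `fit_predict` is a hypothetical interface that trains a model on a subset of the data and returns its prediction at a point `x`:

```python
from itertools import combinations
from math import factorial

def shapley_cluster_importance(clusters, fit_predict, x):
    """Exact Shapley value of each training-data cluster for the
    prediction at x. `clusters` is a list of training subsets;
    `fit_predict(data, x)` trains on `data` and predicts at x
    (illustrative interface; an empty `data` should return a
    baseline prediction). Exponential cost: small n only."""
    n = len(clusters)
    values = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            for subset in combinations(others, size):
                # Shapley weight for a coalition of this size
                w = factorial(size) * factorial(n - size - 1) / factorial(n)
                with_i = [row for j in (*subset, i) for row in clusters[j]]
                without_i = [row for j in subset for row in clusters[j]]
                values[i] += w * (fit_predict(with_i, x) - fit_predict(without_i, x))
    return values
```

By the efficiency property, the cluster values sum to the full-data prediction minus the empty-coalition baseline.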
2. Chen Y, Lin D, Dai X, Wu X, Hong C, Liu Y. Preliminary research on the evolution laws of overburden soil structure and its radon reduction ability for uranium tailings impoundment in extreme heat and insolation conditions. J Radioanal Nucl Chem 2021. DOI: 10.1007/s10967-021-08023-0
3. De Saint Jean C, Tamagno P, Archier P, Noguere G. CONRAD – a code for nuclear data modeling and evaluation. EPJ Nuclear Sciences & Technologies 2021. DOI: 10.1051/epjn/2021011
Abstract
The CONRAD code is an object-oriented software tool developed at CEA since 2005. It provides nuclear reaction model calculations, data assimilation procedures based on Bayesian inference, and a framework for treating all uncertainties involved in the nuclear data evaluation process: experimental uncertainties (statistical and systematic) as well as model parameter uncertainties. This paper presents the status of the CONRAD-V1 developments on the theoretical and evaluation sides. Each development is illustrated with examples, and the calculations were validated by comparison with existing codes (SAMMY, REFIT, ECIS, TALYS) or with experiment. The paper closes with a general perspective for CONRAD (concerning the evaluation and theoretical modules) and its ongoing developments.
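The Bayesian assimilation step that such tools perform can be illustrated, in its simplest conjugate form, by updating a single normally distributed model parameter with one measurement (a textbook sketch under a normal-normal assumption, not CONRAD's actual API):

```python
def bayes_update_normal(prior_mean, prior_var, measurement, meas_var):
    """Conjugate normal-normal update: combine a prior model parameter
    (mean, variance) with one measurement and its variance. The gain k
    weights the data against the prior; posterior variance shrinks."""
    k = prior_var / (prior_var + meas_var)
    post_mean = prior_mean + k * (measurement - prior_mean)
    post_var = (1.0 - k) * prior_var
    return post_mean, post_var
```

With equal prior and measurement variances the posterior mean lands halfway between prior and measurement, and the variance halves.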
4. Ivanov E, De Saint-Jean C, Sobes V. Nuclear data assimilation, scientific basis and current status. EPJ Nuclear Sciences & Technologies 2021. DOI: 10.1051/epjn/2021008
Abstract
Data assimilation methodologies, also known as data adjustment, link the results of theoretical and experimental studies, improving the accuracy of simulation models and giving confidence to designers and regulatory bodies. Mathematically, data assimilation produces an optimized fit to experimental data, inferring unknown causes from known consequences, which is crucial for data calibration and validation. It adds value to the nuclear data evaluation process in several ways: it adjusts nuclear data to a particular application, providing a so-called optimized design-oriented library; it calibrates nuclear data against integral experiments, since theories and differential experiments alone provide only relative values; and it provides an evidence-based background for validating nuclear data libraries, substantiating the uncertainty quantification process. It likewise valorizes experimental data and the experiments themselves, bringing legacy and newly designed experiments into scientific use, extracting the essential information they contain, and prioritizing dedicated basic experimental programs. A number of popular algorithms, both deterministic (such as the Generalized Linear Least Squares methodology) and stochastic (such as Backward Monte Carlo, Total Monte Carlo, and Hierarchic Monte Carlo), differ in their numerical formalism but share a common Bayesian theoretical basis. They have demonstrated sufficient maturity, providing optimized design-oriented data libraries and evidence-based backgrounds for the science-driven validation of general-purpose libraries across a wide range of practical applications.
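As a sketch of the deterministic route mentioned above, a single Generalized Linear Least Squares (GLLS) adjustment step can be written as follows, assuming a linearized model `C = S @ p` (illustrative variable names, not tied to any particular evaluation code):

```python
import numpy as np

def glls_adjust(p, M, S, E, V):
    """One GLLS adjustment step.
    p: prior parameters, M: prior parameter covariance,
    S: sensitivity matrix of calculated responses C = S @ p
       (linear model assumed for illustration),
    E: measured responses, V: experimental covariance.
    Returns adjusted parameters and their reduced covariance."""
    C = S @ p
    G = M @ S.T @ np.linalg.inv(S @ M @ S.T + V)  # Kalman-like gain
    p_post = p + G @ (E - C)
    M_post = M - G @ S @ M
    return p_post, M_post
```

The posterior covariance `M_post` is never larger than the prior `M`, reflecting the information added by the measurements.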
5. Kumar D, Alam SB, Sjöstrand H, Palau J, De Saint Jean C. Nuclear data adjustment using Bayesian inference, diagnostics for model fit and influence of model parameters. EPJ Web of Conferences 2020. DOI: 10.1051/epjconf/202023913003
Abstract
The mathematical models used for nuclear data evaluations contain a large number of theoretical parameters that are usually uncertain. These parameters can be calibrated (or improved) using information collected from integral and differential experiments, and the Bayesian inference technique is used to assimilate these measurements. The Bayesian approximation is based on least-squares or Monte Carlo approaches, and the model parameters are optimized in the process. In the adjustment process it is essential to analyze the influence of the model parameters on the adjusted data. In this work, statistical indicators such as Cook's distance; the Akaike, Bayesian, and deviance information criteria; and the effective degrees of freedom are developed within the CONRAD platform. These indicators are then applied to a test case of 155Gd to evaluate and compare the influence of the resonance parameters.
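Cook's distance, one of the diagnostics mentioned, can be sketched for a generic ordinary least-squares fit (the textbook OLS version, not the CONRAD implementation); it flags observations whose removal would shift the fitted parameters most:

```python
import numpy as np

def cooks_distance(X, y):
    """Cook's distance for each observation of an OLS fit y ~ X.
    Combines the raw residual with the leverage h_i (diagonal of
    the hat matrix): influential points have both large."""
    n, k = X.shape
    H = X @ np.linalg.inv(X.T @ X) @ X.T  # hat matrix
    resid = y - H @ y
    h = np.diag(H)
    s2 = resid @ resid / (n - k)          # residual variance estimate
    return resid**2 * h / (k * s2 * (1 - h) ** 2)
```

A gross outlier in the response shows up as the observation with by far the largest distance.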