Zhan T, Kang J. A general, flexible, and harmonious framework to construct interpretable functions in regression analysis.
Biometrics 2025;
81:ujaf014. [PMID:
40037599 DOI:
10.1093/biomtc/ujaf014]
[Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Reference Citation Analysis] [Abstract] [Key Words] [MESH Headings] [Grants] [Track Full Text] [Journal Information] [Subscribe] [Scholar Register] [Received: 08/15/2024] [Revised: 12/04/2024] [Accepted: 02/05/2025] [Indexed: 03/06/2025]
Abstract
An interpretable model or method has several appealing features, such as reliability to adversarial examples, transparency of decision-making, and communication facilitator. However, interpretability is a subjective concept, and even its definition can be diverse. The same model may be deemed as interpretable by a study team, but regarded as a black-box algorithm by another squad. Simplicity, accuracy and generalizability are some additional important aspects of evaluating interpretability. In this work, we present a general, flexible and harmonious framework to construct interpretable functions in regression analysis with a focus on continuous outcomes. We formulate a functional skeleton in light of users' expectations of interpretability. A new measure based on Mallows's $C_p$-statistic is proposed for model selection to balance approximation, generalizability, and interpretability. We apply this approach to derive a sample size formula in adaptive clinical trial designs to demonstrate the general workflow, and to explain operating characteristics in a Bayesian Go/No-Go paradigm to show the potential advantages of using meaningful intermediate variables. Generalization to categorical outcomes is illustrated in an example of hypothesis testing based on Fisher's exact test. A real data analysis of NHANES (National Health and Nutrition Examination Survey) is conducted to investigate relationships between some important laboratory measurements. We also discuss some extensions of this method.
Collapse