📋 AI Review from DeepReviewer will be automatically processed
📋 AI Review from ZGCA will be automatically processed
The paper proposes a Bayesian quadrature formulation of conformal risk control. Given calibration losses ℓ_1,...,ℓ_n, the authors define an aggregated loss L^{+} = ∑_{i=1}^{n+1} U_i ℓ_{(i)} with U ~ Dir(1,...,1) over the order statistics augmented by ℓ_{(n+1)} = B, and select a threshold λ such that Pr(L^{+} ≤ α) ≥ β (Sections 1–4). This framework is claimed to recover Split Conformal Prediction (SCP) via an order-statistic criterion and Conformal Risk Control (CRC) via the posterior mean E[L^{+}] = (∑ℓ_i + B)/(n+1) (Sections 3–4). The paper introduces a High Posterior Density (HPD) rule that uses Monte Carlo sampling of U to enforce the inequality and compares HPD with SCP, CRC, and a uniform concentration baseline (RCPS) on synthetic binomial-loss and heteroskedastic regression tasks (Sections 5–6). Empirically, HPD attains 0% failure with better utility than RCPS and less optimism than CRC.
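For concreteness, the construction summarized above can be sketched numerically. The following is an illustrative Monte Carlo check of the HPD-style criterion Pr(L⁺ ≤ α) ≥ β, under the assumption (implicit in the construction) that B upper-bounds every calibration loss; the function name and sampling budget are mine, not the paper's.

```python
import numpy as np

rng = np.random.default_rng(0)

def hpd_accepts(losses, B, alpha, beta, n_samples=10_000):
    """Monte Carlo check of Pr(L+ <= alpha) >= beta at one candidate threshold.

    `losses` are the n calibration losses at that threshold; B is assumed to
    upper-bound them and plays the role of the augmented loss l_(n+1) = B.
    """
    order_stats = np.append(np.sort(losses), B)      # l_(1) <= ... <= l_(n+1) = B
    # U ~ Dirichlet(1, ..., 1) over the n + 1 order statistics
    U = rng.dirichlet(np.ones(order_stats.size), size=n_samples)
    L_plus = U @ order_stats                         # samples of the aggregated loss
    return float(np.mean(L_plus <= alpha)) >= beta

# The CRC rule corresponds to thresholding the posterior mean instead:
# E[L+] = (sum(l_i) + B) / (n + 1), since each E[U_i] = 1 / (n + 1).
```

This is a sketch of the decision criterion only; the paper's own Monte Carlo procedure may differ in details such as the sampling budget or how candidate thresholds λ are scanned.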
Cross‑Modal Consistency: 31/50
Textual Logical Soundness: 16/30
Visual Aesthetics & Clarity: 10/20
Overall Score: 57/100
Detailed Evaluation (≤500 words):
1. Cross‑Modal Consistency
• Major 1: Table numbering/content inconsistent: Abstract says results are in Table 1, but Sec. 4 uses Table 1 for decision rules. Evidence: “These findings, summarized in Table 1” (Abstract) vs. “Table 1: Summary of decision rules” (Sec 4).
• Major 2: Results are referenced as “Table below” without numbering, breaking traceability. Evidence: “These results can be summarized in Table below” (Sec 6).
• Minor 1: Methods mention future figures (“histograms… boxplots”) not present anywhere, causing expectation–content mismatch. Evidence: “presented using detailed tables and figures (e.g., histograms… boxplots)” (Sec 5).
2. Text Logic
• Major 1: Claimed theoretical guarantees lack formal statements/proofs (no theorems/lemmas or parameterized bounds). Evidence: “We derive theoretical guarantees… ensure that Pr(L+ ≤ α) ≥ β” (Sec 1).
• Major 2: Unspecified “stochastic dominance by a Beta” claim without parameters or justification. Evidence: “aggregated loss stochastically dominates a Beta distribution” (Sec 2).
• Minor 1: “Recovers SCP and CRC as special cases” is asserted but not rigorously derived beyond informal formulas. Evidence: “recovers standard methods (SCP and CRC) as special cases” (Sec 1).
3. Figure Quality
• Major 1: The critical decision-rule table has malformed/ambiguous formulas (missing parentheses/bars), risking misinterpretation. Evidence: “CRC Σi=1nℓi+B/n+1≤α” and “RCPS ℓ+ √log(1/δ)/2n≤α” (Sec 4, Table 1); the intended criteria are presumably (∑_{i=1}^{n} ℓ_i + B)/(n+1) ≤ α and ℓ̄ + √(log(1/δ)/(2n)) ≤ α.
• Major 2: Promised qualitative/quantitative figures are absent; none are labeled or numbered. Evidence: “presented using… figures” (Sec 5) but none included.
• Minor 1: Results tables lack captions/IDs; cannot be referenced unambiguously. Evidence: Two uncaptioned tables in Sec 6.
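To make the formula criticism concrete: the two quoted Table 1 entries appear to garble the standard CRC and Hoeffding-style RCPS criteria. A minimal sketch of the presumably intended forms (reconstructed from context, not taken verbatim from the paper):

```python
import math

def crc_accepts(losses, B, alpha):
    """Reconstructed CRC rule: (sum of losses + B) / (n + 1) <= alpha."""
    n = len(losses)
    return (sum(losses) + B) / (n + 1) <= alpha

def rcps_accepts(losses, alpha, delta):
    """Reconstructed Hoeffding-style RCPS rule:
    empirical mean + sqrt(log(1/delta) / (2n)) <= alpha."""
    n = len(losses)
    return sum(losses) / n + math.sqrt(math.log(1 / delta) / (2 * n)) <= alpha
```

If the table indeed intends these rules, the missing parentheses in the printed versions change the order of operations and hence the stated criteria.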
📋 AI Review from SafeReviewer will be automatically processed
The paper introduces a new framework for conformal risk control using Bayesian quadrature. The authors propose a new high posterior density (HPD) rule for selecting decision thresholds, which offers a more flexible and interpretable approach to conformal prediction. The method is validated on synthetic binomial loss and heteroskedastic regression tasks, demonstrating improved performance over existing methods.
I find several aspects of this paper to be commendable. The core strength lies in the novel perspective it offers on conformal risk control. By framing the problem through Bayesian quadrature, the authors provide a fresh interpretation of existing methods and introduce a new decision rule, the HPD rule, that leverages the full posterior distribution of the aggregated loss. This approach is theoretically grounded and provides a more flexible framework compared to methods that rely solely on order statistics or the posterior mean.

The paper's ability to recover existing methods like SCP and CRC as special cases is also a significant strength. This demonstrates that the proposed framework is not just an alternative but a generalization of existing approaches, providing a unified perspective on conformal risk control. The empirical results, while limited in scope, provide some evidence that the proposed HPD rule can achieve a better trade-off between risk control and the size of the prediction sets compared to SCP and CRC. This is a practically relevant finding, as it suggests that the proposed method could lead to more efficient prediction sets in real-world applications.

Furthermore, the paper is generally well-written and easy to follow, making the core ideas accessible to a broad audience. The authors clearly articulate their approach and provide sufficient details for the reader to understand the main contributions. The use of synthetic data allows for a controlled evaluation of the proposed method, highlighting its potential advantages in specific scenarios. The paper's focus on risk control, rather than just coverage, is also a strength, as it addresses a critical need in many practical applications where the consequences of misestimation can be severe.

Finally, the paper's exploration of a Bayesian approach to conformal prediction is a valuable contribution to the field, which has traditionally been dominated by frequentist methods. This opens up new avenues for research and could lead to the development of more robust and adaptable uncertainty quantification methods.
Despite its strengths, this paper has several weaknesses that I have identified through my analysis. A primary concern is the lack of a rigorous theoretical justification for the Bayesian quadrature (BQ) approach in the context of conformal prediction. While the authors present BQ as a core motivation, the actual derivation of the method relies on the exchangeability assumption and properties of order statistics, similar to traditional conformal prediction. The paper does not explicitly detail the integration process over the quantile function using BQ, nor does it clearly explain how the BQ weights relate to the conformal prediction procedure. This makes the BQ perspective seem somewhat superficial, as the resulting method closely resembles standard conformal risk control techniques.

The paper defines the aggregated loss $L^+$ as $L^+ = \sum_{i=1}^{n+1} U_i \ell_{(i)}$, where $U \sim Dir(1,\dots,1)$, and states that it is motivated by classical results that guarantee $L^+$ stochastically dominates a Beta distribution. However, the connection between this construction and the BQ approach is not clearly established. The paper does not provide a detailed explanation of how the integral approximation in BQ leads to this specific form of $L^+$. This lack of clarity undermines the claim that the method is a principled application of BQ.

My analysis also reveals that the paper's contribution beyond existing conformal risk control methods is not entirely clear. The proposed HPD rule, while novel, does not demonstrate a significant advantage over existing methods, particularly Risk-Controlling Prediction Sets (RCPS). The empirical results show that the HPD rule achieves similar risk control to RCPS, while being less conservative than Split Conformal Prediction (SCP) and Conformal Risk Control (CRC). However, the paper does not provide a compelling argument for why the HPD rule is superior to RCPS, especially considering the computational overhead of sampling from the posterior distribution.

The paper also does not adequately address the limitations of the proposed method, particularly its reliance on the exchangeability assumption and the difficulty in handling complex decision sets. The authors acknowledge these limitations, but they do not provide a detailed discussion of how these limitations might impact the practical applicability of the method.

The paper's experimental evaluation is also limited in scope. The experiments are conducted on synthetic datasets, which may not fully reflect the challenges of real-world applications. The paper does not include experiments on more complex datasets or with different types of models. This limits the generalizability of the findings and makes it difficult to assess the practical relevance of the proposed method. Furthermore, the paper lacks a detailed analysis of the computational cost of the proposed method. The HPD rule involves Monte Carlo sampling, which can be computationally expensive, especially for large datasets. The paper does not provide any information about the computational time required for the HPD rule compared to other methods, making it difficult to assess its practical feasibility.

The paper also does not provide a clear explanation of how the method can be used to construct prediction sets in regression problems. The method focuses on selecting a threshold based on risk control, but it does not explicitly detail how this translates to the construction of prediction intervals in regression. This lack of clarity makes it difficult to understand the practical implications of the proposed method for regression tasks.

Finally, I found that the paper's writing could be improved in several places. The notation is not always clearly defined, and some of the explanations are vague or difficult to follow. For example, the paper uses the notation $R(\lambda)$ without explicitly defining how the loss depends on $\lambda$, and it uses $L^+$ before fully defining all its components. These issues make it more difficult for the reader to understand the core ideas of the paper.
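On the computational-cost concern specifically: much of the Monte Carlo overhead of an HPD-style rule is amortizable, since one Dirichlet sample matrix can be shared across all candidate thresholds (a common-random-numbers estimator). A hypothetical vectorized sketch, with names and shapes my own, illustrates the point:

```python
import numpy as np

rng = np.random.default_rng(0)

def hpd_curve(loss_matrix, B, alpha, n_samples=5_000):
    """Estimate Pr(L+ <= alpha) for K candidate thresholds at once.

    loss_matrix[k, i] is calibration loss i at threshold lambda_k; B is
    assumed to upper-bound all losses. A single (n_samples x (n + 1))
    Dirichlet(1,...,1) sample matrix is reused for every threshold, so the
    dominant cost is one matrix product rather than K separate MC runs.
    """
    K, n = loss_matrix.shape
    order = np.sort(loss_matrix, axis=1)                # order statistics per row
    augmented = np.hstack([order, np.full((K, 1), B)])  # append l_(n+1) = B
    U = rng.dirichlet(np.ones(n + 1), size=n_samples)   # shared weights
    L_plus = U @ augmented.T                            # (n_samples, K) aggregated losses
    return (L_plus <= alpha).mean(axis=0)               # acceptance prob. per threshold
```

Whether the authors use such a vectorized scheme is not stated in the paper; reporting wall-clock comparisons against the closed-form CRC and RCPS rules would settle the feasibility question the review raises.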
Based on the weaknesses I've identified, I have several suggestions for improving this paper. First, the authors should provide a more rigorous theoretical justification for the use of Bayesian quadrature in the context of conformal prediction. This should include a detailed explanation of how the integral approximation in BQ leads to the specific form of the aggregated loss $L^+$. The authors should also clarify the connection between the BQ weights and the conformal prediction procedure. This could involve explicitly deriving the method from a Bayesian quadrature perspective, showing how the choice of basis functions and the approximation of the integral directly lead to the specific form of $L^+$. Furthermore, the authors should discuss the convergence properties of the BQ approximation in this context and how it relates to the finite-sample guarantees of conformal prediction. This would strengthen the theoretical foundation of the proposed method and clarify its relationship to existing conformal risk control techniques.

Second, the authors should provide a more compelling argument for the superiority of the HPD rule over existing methods, particularly RCPS. This could involve demonstrating a clear advantage in terms of computational efficiency or prediction set size. The authors should also provide a more detailed analysis of the trade-offs between risk control and prediction set size for the HPD rule, and compare these trade-offs to those of other methods. This could involve exploring different ways to construct the prediction set given the risk control constraint, and analyzing the resulting set sizes. The authors should also consider comparing their method to other conformal risk control methods that do not rely on randomization, as this would provide a more direct comparison of the core ideas.

Third, the authors should address the limitations of the proposed method more thoroughly. This should include a detailed discussion of the exchangeability assumption and its implications for the practical applicability of the method. The authors should also discuss the challenges of applying the method to complex decision sets and propose potential solutions. This could involve exploring alternative formulations of the aggregated loss or developing new decision rules that are better suited for complex decision sets.

Fourth, the authors should expand the experimental evaluation to include more complex datasets and different types of models. This would help to assess the generalizability of the findings and demonstrate the practical relevance of the proposed method. The authors should also include a detailed analysis of the computational cost of the proposed method, comparing it to other methods. This would help to assess the practical feasibility of the method for large datasets.

Fifth, the authors should provide a clearer explanation of how the method can be used to construct prediction sets in regression problems. This should include a step-by-step explanation of how the method can be used to construct prediction intervals in heteroskedastic regression.

Finally, the authors should improve the clarity of the paper's writing. This should include explicitly defining all notation before its first use and providing more detailed explanations of the core concepts. The authors should also ensure that the paper is consistent in its notation and that all equations are clearly explained. By addressing these suggestions, the authors can significantly strengthen the paper and make it a more valuable contribution to the field of conformal prediction.
Several questions arose during my analysis of this paper. First, I am curious about the specific choice of the Dirichlet distribution for the weights in the aggregated loss. While the paper mentions that this choice is motivated by classical results on tolerance regions, I would like to understand if other choices of prior distributions would lead to similar or different results. What are the implications of using a different prior, and how would this affect the theoretical properties of the method?

Second, I am interested in the practical implications of the proposed method for complex decision sets. The paper acknowledges that the method is limited to simple decision sets, but I would like to know if there are any potential approaches to extend the method to more complex decision sets. Are there any alternative formulations of the aggregated loss or decision rules that could be used in this context?

Third, I would like to understand the sensitivity of the proposed method to the choice of the confidence level $\beta$. The paper mentions that the HPD rule selects a threshold based on a user-defined confidence level, but I would like to know how the choice of $\beta$ affects the resulting prediction sets and the overall performance of the method. Is there an optimal way to choose $\beta$, or does it depend on the specific application?

Fourth, I am curious about the computational cost of the proposed method, particularly the HPD rule. The paper does not provide any information about the computational time required for the HPD rule, and I would like to know how it compares to other methods. Are there any ways to optimize the computational efficiency of the HPD rule, such as using parallel computing or other techniques?

Finally, I would like to understand the relationship between the proposed method and other Bayesian approaches to conformal prediction. The paper focuses on a specific Bayesian approach using Dirichlet sampling, but I would like to know if other Bayesian methods could be used in this context. How would these methods compare to the proposed approach in terms of theoretical properties and practical performance? These questions are aimed at clarifying key uncertainties and methodological choices in the paper, and I believe that addressing them would significantly enhance the paper's contribution to the field.
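On the first question (the choice of Dirichlet prior), a quick numerical experiment shows what is at stake: replacing Dir(1,…,1) with a symmetric Dir(c,…,c) leaves the posterior mean of $L^+$ unchanged at $(\sum \ell_i + B)/(n+1)$ but shrinks the spread of $L^+$ as $c$ grows, which directly changes Pr($L^+ \le \alpha$) and hence the HPD decision. This is a hypothetical exploration of the reviewer's question, not an experiment from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

def aggregated_loss_samples(losses, B, concentration, n_samples=10_000):
    """Draw samples of L+ = sum_i U_i * l_(i) with U ~ Dir(c, ..., c)."""
    order_stats = np.append(np.sort(losses), B)
    U = rng.dirichlet(np.full(order_stats.size, concentration), size=n_samples)
    return U @ order_stats

losses = np.linspace(0.0, 0.3, 100)   # fixed synthetic calibration losses
diffuse = aggregated_loss_samples(losses, B=1.0, concentration=0.5)
uniform = aggregated_loss_samples(losses, B=1.0, concentration=1.0)
peaked = aggregated_loss_samples(losses, B=1.0, concentration=5.0)
# All three share the same mean, (sum(losses) + 1) / 101, but larger c
# concentrates L+ around it, so Pr(L+ <= alpha) rises for alpha above the mean.
```

Only $c = 1$ corresponds to the classical exchangeability-based tolerance-region results the paper invokes, so a different concentration would need its own frequentist justification; the experiment merely shows the rule is not prior-insensitive.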