📋 AI Review from DeepReviewer will be automatically processed
📋 AI Review from ZGCA will be automatically processed
The paper proposes Hierarchical Adaptive Normalization for wearable HAR to address performance degradation from sensor placement and orientation variability. The method has two stages: Stage 1 performs gravity-based orientation correction and infers a "placement context" via variance features with a small classifier; a stability gate based on the L2 norm of the normalized input allows or suppresses adaptive updates. Stage 2 applies a placement-conditioned adaptive Batch Normalization whose momentum depends on the inferred placement and confidence; a lightweight CNN produces final predictions. The approach targets real-time deployment and reports macro F1 improvements (0.847 ± 0.023) over static and state-of-the-art UDA baselines, with low latency (2.3 ms) and modest memory (45.2 MB). Extensive experiments include ablations, multi-dataset tests (including Opportunity and a custom dataset), cross-subject analysis, robustness to noise/sampling rates, and sensitivity to hyperparameters (e.g., stability threshold τ and kernel size).
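The two mechanisms summarized above (a norm-based stability gate and a confidence-scaled BN momentum) can be sketched as follows. This is a hypothetical reading of the description, not the authors' code: the function names, the threshold value `tau`, and the exact update rule are all assumptions.

```python
import numpy as np

def stability_gate(window, tau=1.5):
    """Gate adaptive updates on the L2 norm of the normalized input window.

    Per the paper's description, a sufficiently high norm is treated as
    'stable' and adaptive updates are allowed; tau is illustrative only.
    """
    norm = np.linalg.norm(window) / np.sqrt(window.size)
    return norm >= tau

def adaptive_bn_step(running_mean, running_var, window,
                     placement_momentum, confidence):
    """One confidence-scaled update of BN running statistics.

    placement_momentum: base momentum for the inferred placement context.
    confidence: classifier confidence in that placement, in [0, 1].
    """
    m = placement_momentum * confidence
    new_mean = (1.0 - m) * running_mean + m * window.mean(axis=0)
    new_var = (1.0 - m) * running_var + m * window.var(axis=0)
    return new_mean, new_var
```

Under this reading, the Stage-2 update only fires when the gate passes, and its step size shrinks as placement confidence drops.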
Cross‑Modal Consistency: 16/50
Textual Logical Soundness: 12/30
Visual Aesthetics & Clarity: 10/20
Overall Score: 38/100
Detailed Evaluation (≤500 words):
Evaluation protocol: figures are treated as the image‑first visual ground truth.
1. Cross‑Modal Consistency
• Major 1: Fig. 1(b) shows F1 ≤0.5 but Sec 5.1 claims F1>0.8 for larger kernels. Evidence: “kernel sizes 5 and 7 consistently achieve superior performance (F1 > 0.8)”.
• Major 2: Fig. 2(a) bars ≈0.25–0.3 contradict caption claim of F1>0.75 across domains. Evidence: “maintains consistent performance (F1 > 0.75) across both domains”.
• Major 3: Multi‑dataset bars (≈0.19–0.29) conflict with Sec 5.5 claims (0.823/0.847/0.789). Evidence: “achieves 0.823 ± 0.021 on the Opportunity dataset… 0.847 ± 0.023… 0.789 ± 0.019”.
• Minor 1: Fig. 1 titles say “Baseline,” while Sec 5.1 text attributes curves to the proposed model.
• Minor 2: Figure numbering/captions in text (Figs. 3–5) do not clearly correspond to provided panels; datasets anonymized as “Dataset1–3”.
2. Textual Logical Soundness
• Major 1: Internal inconsistency on achievable F1. Sec 5 states “kernel sizes 5 and 7 yielded final training F1 ≈0.43 and validation ≈0.49,” yet Sec 5.1 says “peak F1‑scores above 0.8.” Evidence: “final training F1‑scores around 0.43 and validation F1‑scores near 0.49”.
• Minor 1: Stability‑gate rationale equates high‑norm signals with stability; justification is heuristic and may invert typical notions of instability.
• Minor 2: Claims of +36% and +13.7% rely on Table 1, but the figures widely disagree, weakening the argument chain.
3. Visual Aesthetics & Clarity
• Minor 1: Several panels use very small fonts/legends; difficult at print size to distinguish all kernel lines.
• Minor 2: Generic dataset labels (“Dataset1–3”) and missing units hinder stand‑alone interpretability.
• Minor 3: Key claims (e.g., error bars/CI for F1) absent in plots; bars lack variance depiction.
📋 AI Review from SafeReviewer will be automatically processed
This paper introduces a hierarchical adaptive normalization method designed to enhance the robustness of wearable human activity recognition (HAR) systems when sensor placement and orientation vary. The core idea revolves around a two-stage cascade. In the first stage, the method employs gravity-based orientation correction to align sensor data with the gravitational vector, aiming to mitigate the impact of sensor orientation changes. This is followed by a placement context inference mechanism that uses signal variance to categorize the sensor's location (e.g., wrist, waist, ankle). A stability gate, inspired by robotics, is incorporated to prevent harmful updates during unstable periods, such as abrupt movements or high-impact activities. The second stage utilizes a placement-conditioned adaptive Batch Normalization (BN) to refine feature representations in real-time, adapting to the specific sensor placement context. The authors evaluate their method on a custom dataset, demonstrating improved performance compared to baseline approaches and some state-of-the-art methods. The method is presented as computationally efficient, making it suitable for real-time, on-device applications. The paper's main contribution lies in the integration of these components into a unified framework that addresses the challenges of sensor variability in wearable HAR. The experimental results, while showing promise, are primarily based on a custom dataset, and the comparisons with state-of-the-art methods could be more comprehensive. Overall, the paper presents an interesting approach to a significant problem in wearable sensing, but it would benefit from further validation and a more thorough comparison with existing techniques.
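The gravity-based orientation correction described above is not specified in detail in the paper; one common realization rotates each accelerometer window so that its mean acceleration (dominated by gravity) maps onto the +z axis via the Rodrigues rotation formula. The following is a generic sketch of that standard technique, not the authors' implementation; the function name and axis convention are assumptions.

```python
import numpy as np

def gravity_align(acc):
    """Rotate a (T, 3) accelerometer window so its mean (gravity)
    direction maps onto the +z axis (Rodrigues rotation formula)."""
    g = acc.mean(axis=0)
    g = g / np.linalg.norm(g)
    z = np.array([0.0, 0.0, 1.0])
    v = np.cross(g, z)          # rotation axis (unnormalized)
    c = np.dot(g, z)            # cosine of rotation angle
    if np.allclose(v, 0):       # already aligned or anti-aligned
        return acc if c > 0 else -acc  # anti-parallel case handled crudely
    vx = np.array([[0.0, -v[2], v[1]],
                   [v[2], 0.0, -v[0]],
                   [-v[1], v[0], 0.0]])
    R = np.eye(3) + vx + vx @ vx / (1.0 + c)
    return acc @ R.T
```

Note that a single mean-based rotation like this is only reliable when gravity dominates the window, which is exactly the dynamic-activity limitation the review raises below.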
I found several aspects of this paper commendable. Its core strength is that it tackles a highly relevant problem in wearable sensor-based human activity recognition: the variability introduced by changes in sensor placement and orientation. The proposed hierarchical adaptive normalization, with its two-stage cascade, is a well-structured response to this issue. Gravity-based orientation correction is a logical first step against orientation changes, and the subsequent variance-based placement context inference provides a way to categorize sensor locations, which is crucial for adapting the processing pipeline. The robotics-inspired stability gate is a novel attempt to prevent harmful updates during unstable periods, and using the signal norm as a stability indicator is a practical choice; the adaptive Batch Normalization, conditioned on the inferred placement context, is likewise a reasonable mechanism for refining feature representations. The experimental results, although based primarily on a custom dataset, do show improved performance over baseline approaches and some state-of-the-art methods, and the authors rightly emphasize computational efficiency, a critical factor for real-time, on-device applications. The ablation studies, while not exhaustive, offer some insight into each component's contribution. The paper is generally well organized and easy to follow, with a clear description of the experimental setup and evaluation metrics. The custom dataset, despite raising generalizability concerns, does permit a controlled evaluation under specific conditions.
The paper's focus on real-time performance and on-device deployment is a further strength, as both are crucial considerations for practical wearable applications.
Despite these strengths, I have identified several weaknesses that warrant careful consideration. First, the reliance on a custom dataset as the primary evaluation benchmark is a significant concern. Although results on the Opportunity dataset are included, the core quantitative comparisons in Tables 1 and 2 rest on the custom dataset, which raises questions about the generalizability of the findings. The paper omits key details about this dataset, including the number of subjects, the specific activities performed, and the recording durations, making it difficult to assess its representativeness and potential biases. No justification is given for the choice of activities or for how they relate to real-world scenarios, which limits my confidence in the results.
Second, the comparison with state-of-the-art methods is not as comprehensive as it could be. The selection of baselines appears somewhat arbitrary: no rationale is given for why these particular methods were chosen or why other relevant approaches were excluded. For instance, the paper does not compare against methods that explicitly model sensor orientation using quaternion-based representations, or against signal-decomposition techniques that isolate gravitational components. Nor does it discuss the limitations of the chosen baselines for this problem, which further weakens the justification for the proposed method.
Third, the discussion of the stability gate, while interesting, lacks sufficient detail. The gate is said to be inspired by robotics applications, but no specific robotics literature is cited, and the paper does not explain how the gate distinguishes genuine activity changes from unstable periods. It relies on a simple norm-based threshold without discussing the limitations of that choice, how it might behave across different activities or sensor placements, or how the gate performs under challenging conditions such as abrupt movements or high-impact activities.
Fourth, the explanation of the placement-conditioned adaptive Batch Normalization (BN) is somewhat vague. The paper states that BN parameters are adjusted based on the inferred placement context, but it does not explain how: whether separate BN layers are maintained per placement or a single BN layer is adapted, which BN parameters are adapted, and how those parameters are updated during training. This lack of clarity makes it difficult to fully understand the inner workings of the method.
Fifth, the description of the experimental setup omits crucial details. The CNN is trained on data from a single sensor placement, but the paper does not specify which one (wrist, waist, or ankle), nor does it explain how the baseline methods are trained and evaluated or discuss the implications of single-placement training for cross-placement generalization. The domains used in the cross-domain evaluation are never clearly defined, which makes the results difficult to interpret.
Finally, the presentation could be improved. Some abbreviations are used without explicit definition; there is no flowchart or block diagram of the proposed method, which would greatly aid understanding; the evaluation metrics are not clearly explained; and no detailed analysis of computational cost is provided, despite its importance for real-time applications.
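To make the BN ambiguity raised above concrete, these are the two designs the paper could plausibly mean: a bank of per-placement running statistics, or a single set of statistics adapted with a context-dependent momentum. Both are hypothetical sketches (class and method names invented here), not the authors' implementation.

```python
import numpy as np

class PerPlacementBN:
    """Reading 1: separate running statistics per placement context."""
    def __init__(self, n_features, placements=("wrist", "waist", "ankle")):
        self.stats = {p: [np.zeros(n_features), np.ones(n_features)]
                      for p in placements}

    def normalize(self, x, placement, eps=1e-5):
        mean, var = self.stats[placement]
        return (x - mean) / np.sqrt(var + eps)

class SingleAdaptiveBN:
    """Reading 2: one set of statistics, adapted with context-scaled momentum."""
    def __init__(self, n_features):
        self.mean = np.zeros(n_features)
        self.var = np.ones(n_features)

    def update(self, x, momentum):
        self.mean = (1 - momentum) * self.mean + momentum * x.mean(axis=0)
        self.var = (1 - momentum) * self.var + momentum * x.var(axis=0)

    def normalize(self, x, eps=1e-5):
        return (x - self.mean) / np.sqrt(self.var + eps)
```

The two readings have different memory footprints and failure modes (a misclassified placement corrupts only one statistics slot in Reading 1, but the shared statistics in Reading 2), which is why the paper should state which it uses.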
Based on the identified weaknesses, I recommend several concrete improvements. First, evaluate the method on multiple publicly available datasets in addition to the custom dataset, to assess generalizability and address concerns about dataset bias; also report the custom dataset's characteristics (number of subjects, activities performed, recording durations) so that other researchers can better understand the context of the findings. Second, broaden the comparison with state-of-the-art methods to include quaternion-based orientation modeling, signal-decomposition techniques, and other relevant approaches, with a clear rationale for the selection and a discussion of each baseline's limitations in the context of this problem. Third, explain the stability gate in more depth, with specific citations to the robotics literature that inspired it, a description of how it differentiates genuine activity changes from unstable periods, and an analysis of its behavior under challenging conditions. Fourth, clarify the placement-conditioned adaptive BN: how its parameters are adjusted from the inferred placement context, whether separate BN layers are kept per placement or a single layer is adapted, which parameters are adapted, and how they are updated during training. Fifth, specify the experimental setup fully, including which sensor placement is used to train the CNN and the baseline methods, and give a clear definition of the domains used in the cross-domain evaluation. Sixth, improve the presentation with explicit definitions of all abbreviations, a block diagram of the proposed method, a clear explanation of the evaluation metrics, and a detailed analysis of computational cost. Finally, consider alternative approaches to orientation correction, such as quaternion-based representations or signal decomposition, which could provide more robust and accurate correction during dynamic activities, and explore more advanced placement-context inference that combines signal features with sensor-location information. These improvements would strengthen the paper and make it more impactful.
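As a pointer for the quaternion-based alternative recommended above: rotating a sensor vector by a unit quaternion is a compact, gimbal-lock-free operation. This is the standard rotation formula as a generic sketch, not tied to the paper under review.

```python
import numpy as np

def quat_rotate(q, v):
    """Rotate 3-vector v by unit quaternion q = (w, x, y, z),
    using v' = v + 2*u x (u x v + w*v) with u = (x, y, z)."""
    w = q[0]
    u = np.asarray(q[1:], dtype=float)
    v = np.asarray(v, dtype=float)
    return v + 2.0 * np.cross(u, np.cross(u, v) + w * v)
```

In practice the quaternion would come from a sensor-fusion filter (e.g. a complementary or Madgwick-style filter over accelerometer and gyroscope), which is precisely what makes such correction more robust during dynamic activities than a gravity-mean heuristic.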
I have several questions that arise from my analysis of the paper. First, regarding the custom dataset, what specific criteria were used to select the subjects, and how was the data collection process standardized to ensure consistency across subjects? Second, concerning the stability gate, what is the specific threshold value used, and how was this value determined? What is the sensitivity of the method to different threshold values? Third, regarding the placement-conditioned adaptive BN, how are the parameters of the BN layer adjusted based on the inferred placement context? Are separate BN layers used for each placement, or is a single BN layer adapted? If a single BN layer is adapted, how is this adaptation achieved? Fourth, regarding the training process, what is the architecture of the CNN used in the experiments, and what are the specific training parameters used? Fifth, regarding the evaluation, what are the specific activities included in the custom dataset, and how do these activities relate to real-world scenarios? What is the performance of the method on each individual activity? Sixth, regarding the comparison with state-of-the-art methods, why were specific methods chosen for comparison, and why were other relevant methods excluded? What are the limitations of the chosen comparison methods? Seventh, regarding the computational cost, what is the inference time and memory usage of the proposed method on a typical wearable device? How does this compare to other state-of-the-art methods? Finally, regarding the gravity-based orientation correction, how does the method handle noisy accelerometer data, and how does it perform during activities that involve rapid changes in orientation? These questions are aimed at clarifying key methodological choices and assumptions, and they are crucial for a deeper understanding of the paper's contributions and limitations.