Overview
This study developed a quantitative prediction model for polyphenol content in Lonicera caerulea (blue honeysuckle) using mid-infrared spectroscopy coupled with a hybrid variable selection strategy designed for high-dimensional, small-sample datasets. The research addresses the need for rapid, non-destructive quality control methods in functional food assessment.
Methods and approach
A total of 191 blue honeysuckle samples from Northeast China were analyzed. Spectral data (7468 dimensions) were acquired by Fourier transform infrared (FTIR) spectrometry, and polyphenol reference values were determined using the Folin-Ciocalteu method. An evaluation of 10 preprocessing methods identified multiplicative scatter correction (MSC) combined with a Savitzky-Golay first derivative as optimal. The hybrid variable selection approach, which intersects the variables passing a variable importance in projection (VIP) threshold of 1.0 with the top 30% of variables ranked by random forest regression, reduced the dimensionality to 984 wavelengths. The samples were divided into calibration (n=152) and prediction (n=39) sets using the SPXY algorithm (sample set partitioning based on joint x-y distances), and four machine learning models (partial least squares, random forest regression, support vector regression, and XGBoost) underwent three-stage hyperparameter tuning.
Key findings
The optimized XGBoost model performed best on the independent test set, with an R-squared of 0.92, a root mean square error (RMSE) of 0.098, and a residual prediction deviation (RPD) of 3.47. The hybrid variable selection method reduced dimensionality by 86.8% (from 7468 to 984 variables) while improving predictive accuracy over classical competitive adaptive reweighted sampling (CARS), which yielded an R-squared of 0.78 and an RPD of 2.14. These gains are reported as improvements of 16.3% in R-squared and 55.2% in RPD.
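For readers unfamiliar with these figures of merit, the following sketch shows how R-squared, root mean square error (RMSE), and residual prediction deviation (RPD) are commonly computed, along with the dimensionality reduction percentage. The function names are hypothetical, and the use of the sample standard deviation (ddof=1) in RPD is an assumption, since the summary does not state which convention the study used.

```python
import numpy as np


def regression_metrics(y_true, y_pred, ddof=1):
    """R-squared, RMSE, and RPD for a set of predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    ss_res = np.sum(residuals ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    rmse = np.sqrt(np.mean(residuals ** 2))
    r2 = 1.0 - ss_res / ss_tot
    # RPD: spread of the reference values relative to the prediction error.
    rpd = np.std(y_true, ddof=ddof) / rmse
    return {"r2": r2, "rmse": rmse, "rpd": rpd}


def dimensionality_reduction(n_original, n_selected):
    """Fraction of variables removed by the selection step."""
    return 1.0 - n_selected / n_original


# Variable counts quoted in the text: 7468 variables reduced to 984.
reduction_pct = round(100 * dimensionality_reduction(7468, 984), 1)
```

With ddof=0 and all three metrics computed on the same sample set, RPD equals 1 / sqrt(1 - R-squared), so the two statistics carry closely related information about model quality.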
Implications
The hybrid variable selection strategy mitigates the difficulty of modeling high-dimensional spectral datasets with limited sample sizes, a constraint common in spectroscopy-based quality control. The framework may transfer to the rapid, non-destructive quantification of bioactive compounds in other plant materials, including other functional food matrices that require polyphenol characterization.
Disclosure
Key points
- Research title: Quantitative Analysis of Polyphenols in Lonicera caerulea Based on Mid-Infrared Spectroscopy and Hybrid Variable Selection
- Authors: Haiwei Wu, Xuexin Li, Jianwei Liu, Zhihao Wang, Yuchun Liu
- Publication date: 2026-02-23
- DOI: https://doi.org/10.3390/molecules31040750
- Disclosure: This post was generated by Claude (Anthropic). The original authors did not write or review this post.