Significance
One-class classification (OCC) is growing in importance within analytical chemistry, largely because it applies to situations where only a single, well-characterized class of samples is available. This becomes especially relevant in areas like food authentication and contamination detection. However, translating OCC from theory into reliable practice remains a major challenge. Many widely used OCC approaches depend heavily on tuned hyperparameters, such as the number of principal components in a spectral domain model, and small shifts in sample conditions, instrumental variability, or environmental noise can all destabilize these optimally tuned systems. The assumption that source and target data are identically distributed rarely holds in practice, especially when hidden matrix effects or subtle structural variations creep in, as they inevitably do. Adding to this, most OCC strategies implicitly rely on the availability of labeled non-class samples to calibrate decision thresholds. In reality, those “negative” examples are frequently inaccessible or poorly defined. For instance, in authenticating a product like extra virgin olive oil, it’s impractical, if not impossible, to catalog every conceivable adulterant. This leaves a glaring gap between how OCC is often implemented and the actual constraints faced in field applications.

In a new research paper published in the Journal of Chemometrics, Hyrum Redd and Professor John Kalivas from the Department of Chemistry at Idaho State University developed a novel autonomous one-class classification method called Consensus OCC (Con OCC), which eliminates the need for parameter optimization. It is built upon a novel composite similarity framework called PRISM (Physicochemical Responsive Integrated Similarity Measure), a fusion-based metric that integrates multiple similarity measures to assess how well a sample matches a known class.
PRISM sidesteps optimization by fusing multiple similarity metrics across a range of parameter values, capturing sample relationships from different angles without locking the model into a narrow parameter set. By translating PRISM scores into either z-scores or conformal prediction p-values, the new method enables probabilistic, interpretable classification using only source class data.
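The consensus idea can be sketched in a few lines of code. The three similarity measures and the equal-weight average below are illustrative stand-ins, not the paper's actual method: PRISM fuses 26 measures with its own fusion rule, which is defined in the paper itself.

```python
import numpy as np

def similarity_scores(x, source):
    """Three illustrative similarity measures between a sample x and the
    source class (stand-ins for PRISM's 26 fused metrics)."""
    mean = source.mean(axis=0)                      # source-class mean profile
    cos = x @ mean / (np.linalg.norm(x) * np.linalg.norm(mean))
    corr = np.corrcoef(x, mean)[0, 1]               # Pearson correlation
    dist = 1.0 / (1.0 + np.linalg.norm(x - mean))   # distance, rescaled to (0, 1]
    return np.array([cos, corr, dist])

def composite(x, source):
    """Equal-weight fusion of the individual measures into one consensus score."""
    return similarity_scores(x, source).mean()

def z_score(x, source):
    """Standardize the target's composite score against leave-one-out
    composite scores of the source samples themselves."""
    ref = np.array([composite(s, np.delete(source, i, axis=0))
                    for i, s in enumerate(source)])
    return (composite(x, source) - ref.mean()) / ref.std(ddof=1)
```

A strongly negative z-score flags a target as dissimilar to the source class; a "soft" reading would hand borderline scores to an analyst rather than applying a fixed cutoff.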
In an effort to test their autonomous classification framework under conditions that resemble practical application, Redd and Kalivas designed a suite of experiments grounded in seven real-world datasets. The researchers selected data with inherent messiness: spectral noise, variable instrumentation, and diverse chemical profiles. From food purity assessment to the detection of heavy metal contamination, the datasets spanned a wide range of challenges, capturing the unpredictable nature of the environments where one-class classification is typically needed.
What makes their experimental approach notable is its refusal to rely on parameter optimization. Rather than tailoring each model to its dataset, a common but fragile practice, the team applied their PRISM-based system uniformly. PRISM integrates 26 distinct similarity metrics into a single composite score, without tuning. It’s an elegant workaround to a persistent problem: how to achieve stability in classification without overfitting the model to a narrow slice of the data landscape. The authors used both z-scores and conformal prediction to assess whether a target sample could reasonably be considered part of the same class as the training data. Z-scores offered a straightforward statistical lens, quantifying deviation from the source distribution, while conformal prediction approached the same question probabilistically. Across 28 classification scenarios, both strategies were tested using “hard” thresholds (fixed cutoffs) and more nuanced “soft” interpretations that leave room for expert judgment. They found that z-scoring tended to capture more true positives, boosting sensitivity, while conformal prediction proved more consistent across datasets, a trait that matters when building tools meant to function outside the lab. In several cases, including the classification of milk and the detection of contaminated clams, the method outperformed even optimized, preprocessed models reported in earlier studies. Most impressively, on the olive oil HPLC dataset, Con OCC achieved over 96% accuracy using completely raw chromatographic profiles, with no smoothing, derivative transformations, or parameter tweaks.
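The conformal side of the decision rule is simple enough to sketch. Assuming composite similarity scores for the source samples are already in hand, a conformal p-value counts how many source samples look at least as "non-conforming" as the target; the function names and the 0.05 significance level here are illustrative, not the paper's exact settings:

```python
import numpy as np

def conformal_p(target_score, source_scores):
    """Conformal p-value: fraction of source nonconformity scores at
    least as extreme as the target's. Nonconformity is taken as
    negative similarity (lower similarity = more non-conforming)."""
    alpha_t = -target_score
    alphas = -np.asarray(source_scores)
    return (np.sum(alphas >= alpha_t) + 1) / (len(alphas) + 1)

def in_class(target_score, source_scores, eps=0.05):
    """'Hard' decision at significance level eps; a 'soft' reading
    would report the p-value itself for expert review."""
    return conformal_p(target_score, source_scores) > eps
```

Because the p-value is a rank statistic over the source scores, it carries a finite-sample validity guarantee regardless of how the scores are distributed, which helps explain the consistency across datasets noted above.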
The impact of Redd and Kalivas’s work reaches well beyond the immediate scope of the datasets they examined. At its core, their study nudges chemometricians, and the analytical sciences more broadly, toward a different mindset. Instead of chasing optimal tuning for every model and dataset, they demonstrated that robust classification can emerge from something far more flexible: a system built to accommodate variation rather than suppress it. The work resonates because it speaks directly to the practical realities of analytical labs. In most real-world applications, such as food quality testing, environmental screening, or pharmaceutical validation, only the “correct” class is well-characterized. The space of everything that’s not the target class is poorly defined, if it’s defined at all. Analysts rarely have the luxury of labeled outliers or comprehensive counterexamples to train against. What Redd and Kalivas proposed is a classification system that doesn’t require that luxury. By focusing strictly on the known class and still generating reliable, probabilistic outcomes, their Con OCC method fills a very real gap. One key implication is ease of deployment: because Con OCC avoids parameter tuning and handles raw data with confidence, it is highly suitable for real-time or embedded use inside portable spectrometers, automated inspection systems, or diagnostic platforms operating far from central labs. This self-sufficiency is especially important in remote or under-resourced labs, where expertise is limited and rapid decision-making is essential. Additionally, the method’s use of z-scores and conformal prediction adds a layer of interpretability that’s often lacking in machine learning pipelines. This is particularly advantageous in regulatory or forensic scenarios, where traceability and justification matter, and transparency can be just as valuable as the classification itself.
When the number of target samples to be classified is small, Kalivas’ laboratory is finding that better decisions are made by combining the method with virtual reality (VR) and immersive analytics processes. For OCC, autonomously determined false positives and negatives are converted to their respective true negatives and positives.
References
Redd, H. J., & Kalivas, J. H. (2024). Assessment of Conformal Prediction and Standard Normal Distribution for Autonomous Consensus One-Class Classification. Journal of Chemometrics, 39. https://doi.org/10.1002/cem.3639
Kalivas, J. H., & Redd, H. (2025). One-class Classification [Video]. YouTube. https://www.youtube.com/watch?v=Yzn_n5Yd0ms