Significance
Concrete remains the backbone of modern infrastructure, yet it now faces growing environmental scrutiny. While ultra-high-performance concrete (UHPC) has opened up new frontiers in structural engineering with its exceptional mechanical properties and durability, its use carries environmental consequences that are impossible to ignore. Much of UHPC’s strength is rooted in its high content of ordinary Portland cement (OPC), a material that accounts for a significant portion of global CO₂ emissions. One alternative that has drawn increasing interest is alkali-activated concrete, especially alkali-activated UHPC (AA-UHPC), which replaces OPC with industrial byproducts such as fly ash and ground granulated blast-furnace slag (GGBS). These materials don’t just reduce the carbon footprint; they also perform surprisingly well when properly formulated. That said, designing with AA-UHPC isn’t as straightforward as swapping ingredients. Its behavior is highly sensitive to shifts in mix composition and processing, from activator ratios to curing temperatures. Even experienced materials scientists often find it difficult to predict outcomes without extensive and often tedious lab work.
To address this gap, a recent paper published in Archives of Civil and Mechanical Engineering, led by Professor Doo-Yeol Yoo from the Department of Architecture and Architectural Engineering at Yonsei University together with Dr. Farzin Kazemi and Professor Robert Jankowski from Gdańsk University of Technology and Dr. Torkan Shafighfard from the Polish Academy of Sciences, started from the observation that although a fair amount of experimental data on AA-UHPC exists, the field still lacks reliable tools for drawing predictive conclusions from it. Rather than continuing down the path of labor-intensive trial and error, the team explored whether machine learning, specifically a stacked model augmented with active learning, could offer a more efficient way forward. Their idea was to create a system that could identify the most informative data points, adapt its internal logic, and improve as it learned.
The research team compiled 284 distinct AA-UHPC mix designs from 26 peer-reviewed studies. These weren’t cherry-picked examples but a broad and intentionally messy collection reflecting diverse material combinations, different curing strategies, and varied dosages of key ingredients such as fly ash, GGBS, silica fume, and alkaline activators. What unified these data points was a common endpoint: compressive strength. Everything else, including fiber aspect ratio, water-to-binder ratio, and curing temperature, varied widely, and that variability was central to the study’s purpose. Instead of repeating these tests experimentally, the authors used machine learning both to build a predictive tool and to see whether they could train a system that learned where its own uncertainties lay. They used a stacked model architecture with active learning layered on top, which meant the model wasn’t just fed a static dataset. It was allowed to iteratively select the most “informative” samples, those that would best improve its predictions, and retrain itself over time. In materials science, where data is often scarce and expensive to generate, that approach provides a practical edge.
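The loop described above, a stacked predictor that repeatedly picks the sample it is least sure about, can be sketched in a few lines of plain Python. This is only an illustrative toy, not the authors' implementation: the article does not say which base learners or query strategy the paper uses, so the two base learners (a linear fit and a nearest-neighbour average), the blend-weight meta-learner, the disagreement-based query rule, and the `oracle` function standing in for a lab test are all assumptions made for this example.

```python
import random

def oracle(x):
    # Hypothetical stand-in for a lab test: strength (MPa) as a function
    # of one normalized mix variable. Not data from the paper.
    return 120.0 + 60.0 * x - 45.0 * x * x

def fit_linear(xs, ys):
    # Base learner 1: ordinary least squares on a single feature.
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx if sxx > 0 else 0.0
    return lambda x: my + slope * (x - mx)

def fit_knn(xs, ys, k=2):
    # Base learner 2: average of the k nearest labeled neighbours.
    pairs = list(zip(xs, ys))
    def predict(x):
        nearest = sorted(pairs, key=lambda p: abs(p[0] - x))[:k]
        return sum(y for _, y in nearest) / len(nearest)
    return predict

def fit_stack(xs, ys):
    # Stacking: collect leave-one-out predictions from both base learners,
    # then fit a meta-learner (a blend weight w in [0, 1]) on them.
    lin_oof, knn_oof = [], []
    for i in range(len(xs)):
        tx, ty = xs[:i] + xs[i + 1:], ys[:i] + ys[i + 1:]
        lin_oof.append(fit_linear(tx, ty)(xs[i]))
        knn_oof.append(fit_knn(tx, ty)(xs[i]))
    diff = [a - b for a, b in zip(lin_oof, knn_oof)]
    resid = [y - b for y, b in zip(ys, knn_oof)]
    den = sum(d * d for d in diff)
    w = sum(d * r for d, r in zip(diff, resid)) / den if den > 0 else 0.5
    w = min(1.0, max(0.0, w))  # clip the least-squares weight to [0, 1]
    lin, knn = fit_linear(xs, ys), fit_knn(xs, ys)
    predict = lambda x: w * lin(x) + (1.0 - w) * knn(x)
    disagreement = lambda x: abs(lin(x) - knn(x))  # uncertainty proxy
    return predict, disagreement

def active_learning(pool, n_initial=4, n_queries=6, seed=0):
    # Pool-based active learning: start from a few labeled mixes, then
    # repeatedly label the candidate where the base learners disagree most.
    rng = random.Random(seed)
    pool = list(pool)
    rng.shuffle(pool)
    labeled_x = pool[:n_initial]
    labeled_y = [oracle(x) for x in labeled_x]
    unlabeled = pool[n_initial:]
    for _ in range(n_queries):
        _, disagreement = fit_stack(labeled_x, labeled_y)
        query = max(unlabeled, key=disagreement)
        unlabeled.remove(query)
        labeled_x.append(query)
        labeled_y.append(oracle(query))
    predict, _ = fit_stack(labeled_x, labeled_y)
    return predict, labeled_x

candidates = [i / 20 for i in range(21)]
model, queried = active_learning(candidates)
print("labeled points:", sorted(queried))
print("predicted strength at x=0.5:", round(model(0.5), 2))
```

Using disagreement between base learners as the uncertainty signal is one common choice; a production workflow would more likely rely on an established library's stacking and uncertainty estimates than on hand-rolled learners like these.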
Of course, before any model could be trained, the data had to be cleaned. The team used scaling and imputation to manage missing values and outliers, an unavoidable reality when pulling from multiple sources. Then, to interpret how the model made its decisions, they applied Shapley values. What came out wasn’t entirely surprising: flowability, NaOH content, curing duration, and water content had the strongest influence on strength predictions. Still, some subtleties emerged. For instance, flow had a clearly positive effect, likely tied to better compaction. But the role of NaOH was less straightforward: too much seemed to weaken the structure, possibly by disturbing the setting reactions. When tested, the model delivered. The best configuration, AL-Stacked ML-3, reached an accuracy close to 99%. That’s impressive on its own, but what really stood out was the external validation. On a completely independent dataset, one it hadn’t seen before, the model maintained comparable performance. That consistency suggests it’s not overfitting noise but capturing real, generalizable patterns. For researchers working on sustainable concrete design, this kind of tool could meaningfully cut down the trial-and-error cycle, and that’s no small win.
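To make the Shapley-value step more concrete, here is a minimal sketch that computes exact Shapley values by enumerating every feature ordering. Everything in it is assumed for illustration: `toy_strength` is a made-up surrogate with invented coefficients, the feature names and the baseline mix are hypothetical, and the paper's actual model and SHAP tooling are not reproduced here. The quadratic NaOH term is included only to mimic the qualitative finding that too much NaOH can reduce strength.

```python
from itertools import permutations

def toy_strength(mix):
    # Hypothetical surrogate model (MPa); coefficients are invented.
    flow, naoh, cure = mix["flow"], mix["naoh"], mix["curing_days"]
    return 80.0 + 0.1 * flow + 12.0 * naoh - 1.5 * naoh ** 2 + 0.5 * cure

def shapley_values(model, instance, baseline):
    # Exact Shapley values: for every ordering of the features, switch them
    # one at a time from the baseline value to the instance value and credit
    # each feature with the change in prediction it causes; then average
    # those marginal contributions over all orderings.
    features = list(instance)
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        current = dict(baseline)
        previous = model(current)
        for f in order:
            current[f] = instance[f]
            updated = model(current)
            totals[f] += updated - previous
            previous = updated
    return {f: t / len(orderings) for f, t in totals.items()}

# Hypothetical mix to explain, relative to a hypothetical reference mix.
instance = {"flow": 250.0, "naoh": 6.0, "curing_days": 28.0}
baseline = {"flow": 200.0, "naoh": 4.0, "curing_days": 7.0}

for name, value in shapley_values(toy_strength, instance, baseline).items():
    print(f"{name}: {value:+.2f} MPa")
```

The contributions always sum to the difference between the prediction for the explained mix and the prediction for the baseline mix, which is what makes the attribution easy to audit. Exact enumeration grows factorially with the number of features, so practical tools approximate it for larger feature sets.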
What stands out most in the work of Professor Doo-Yeol Yoo and colleagues, at least from a research perspective, is its potential to change how we approach the design of advanced concretes, especially those in the alkali-activated category. The field has, for some time, acknowledged the benefits of AA-UHPC in terms of performance and sustainability. Yet its complexity has been a persistent barrier. You can’t just swap in a few ingredients and expect reliable results. The material is sensitive, and even small changes in mix proportions or curing conditions can shift its behavior dramatically. That unpredictability makes it tough to scale. What this study offers is a practical workaround, not by simplifying the material but by tackling the design process itself. Rather than relying entirely on traditional experimental methods, which are expensive and slow, the researchers turned to machine learning, more specifically a stacked model guided by active learning. In doing so, they developed a system that doesn’t just predict compressive strength; it learns which variables matter most and adapts as more data becomes available. That’s a big step toward computationally assisted mix design. In materials labs, it’s easy to underestimate how much time is spent on trial and error: preparing batches, waiting on curing cycles, testing, and then tweaking based on partial intuition. And yet, even after all that effort, the outcome isn’t always conclusive. What the team demonstrates here is that if we mine the data we already have with enough precision and context, we can reduce that guesswork significantly.
There are wider implications too, especially when it comes to sustainability. AA-UHPC has enormous potential to reduce emissions, but its adoption has been mostly limited to research labs or high-end projects. By creating an accessible predictive tool, this work effectively lowers the entry threshold for engineers who may not have the resources to run dozens of test mixes. That kind of accessibility could accelerate the shift toward greener infrastructure, particularly in regions where cost and material availability are constraints. Lastly, the study does a commendable job addressing one of the main critiques of ML in engineering: interpretability. Through Shapley values and visualization tools, it becomes possible not only to see the outcome but also to understand why a particular prediction was made. In safety-critical fields, that kind of clarity isn’t just useful; it’s essential.
Reference
Kazemi, F., Shafighfard, T., Jankowski, R. et al. Active learning on stacked machine learning techniques for predicting compressive strength of alkali-activated ultra-high-performance concrete. Arch. Civ. Mech. Eng. 25, 24 (2025). https://doi.org/10.1007/s43452-024-01067-5