
Neural Networks in Industrial Control: How Many Are Actually Running?

Posted on March 23, 2026 by Dr. Rafał Noga
Neural Networks · Reinforcement Learning · APC · Process Control · Industrial AI · Deep RL

Everyone is talking about the industrial AI revolution. Vendors promise AI-driven plants, autonomous processes, self-optimising factories. The question worth asking — and rarely asked clearly enough — is: how much of this is actually neural-network control running in closed loop on real production assets?

The honest answer is: a very small amount. The gap between what is marketed as “industrial AI” and what is actually operating as a neural-network controller is significant — and understanding that gap is more useful than accepting the marketing framing.

A 2025 review of 672 industrial-automation studies found that only 22% of published neural-network controller research included real-world implementations — laboratory rigs, pilot plants, and production systems combined. The remaining 78% never left simulation.[1] Of those real-world implementations, the large majority are research-grade setups: university test beds, small-scale pilots, proof-of-concept demonstrations. Confirmed deployments on live industrial production systems — with documented quantified results — number ten.


Where “industrial AI” actually lives

When practitioners and vendors use the phrase “industrial AI”, they typically mean one of several distinct things. Only one of them is a neural-network controller operating in a closed production loop.

The question is being asked with new urgency because the current AI wave is driven by large language models — GPT, Gemini, Claude. LLMs are neural networks, and their dramatic impact on text processing has led every industry to ask: can the same technology transform us? In process control, the answer requires distinguishing what LLMs do from what a control system must do. LLMs are pattern-matching systems trained on static corpora. A process controller must close a feedback loop in real time, respect hard physical constraints, and remain stable under disturbances — a fundamentally different problem. The neural-network architecture relevant to control is not the transformer; it is the policy network trained by reinforcement learning. And the other tools commonly labelled “industrial AI” — MPC, soft sensors, digital twins, scheduling optimisation — predate the LLM era by decades. They did not become useful because of GPT.

| What it is called | What it actually is | Closes the loop? |
|---|---|---|
| Predictive maintenance | Anomaly detection on vibration, temperature, current | No — advisory only |
| Digital twin | Offline simulation model used for planning or training | No |
| Soft sensor / virtual sensor | Regression model inferring unmeasured quality variables | Sometimes — supervisory role |
| Process optimisation | LP/MILP scheduling, batch planning, not real-time | No |
| APC / MPC | Linear model-predictive control — technology in use since the 1980s | Yes — but not a neural network |
| NN controller | The thing the hype is about | Rare — see below |

Advanced process control using linear MPC has been in industrial use for forty years. It is embedded in the DCS platforms of every major vendor and is standard practice in refining, petrochemicals, and polymer production. When a plant claims “AI-driven optimisation”, it is often describing MPC installed in the 1990s. When a building-management vendor claims “AI energy management”, it is often describing a rule-based scheduler that has been rebranded.

None of this is dishonest — these are useful tools. But they are not what “neural-network controller” means, and conflating them inflates the apparent maturity of NN-based closed-loop control.


The confirmed count

After an extensive review of the public record, the number of confirmed neural-network or reinforcement-learning controllers operating in closed loop on real industrial production systems — with documented quantified results and verifiable sources — is ten.

Tokamak fusion experiments, university rigs, and papers that validate algorithms on historical industrial data are excluded. The count includes only cases where the controller is confirmed to be running in closed loop on a live production asset.


The ten confirmed deployments

1. Petroleum refining — Imubit, USA

Imubit has deployed an NN predictor combined with an RL-trained optimisation controller across more than 90 live applications in industrial processing plants, with customers including 7 of the 10 largest US refiners — Marathon Petroleum, HF Sinclair, and Citgo among them.[2] The company launched its dedicated RL product in September 2024.[3]

Documented results: $0.30–$0.50 per barrel margin improvement in refinery operations; up to 30% natural gas reduction in rotary kiln operations.[2]

2. Chemical distillation — JSR / Yokogawa, Japan

In January–February 2022, Yokogawa’s FKDPP RL controller ran for 840 consecutive hours (35 days) in fully autonomous closed-loop control of a distillation column at a JSR production facility — documented as a world first for direct RL control of a variable in a chemical plant.[4] Following a yearlong extended trial, ENEOS Materials (which had acquired the facility) formally adopted the system in production in March 2023.[5]

3. Air separation unit, 2021

Blum et al. report a model-based deep RL controller operating in a production air separation unit, benchmarked directly against the previous linear MPC setup.[6]

4. Industrial dividing wall column, 2025

Park et al. report offline RL for temperature control of an industrial-scale dividing wall column, achieving an automation ratio of 93.11% versus manual operation.[7] The policy was trained on logged historical data — no live plant exploration.

5. Industrial photobioreactor, 2025

Gil et al. report RL-plus-behaviour-cloning deployed in an industrial photobioreactor for pH regulation, with an eight-day continuous run under varying environmental conditions.[8]

6. District heating — 13 buildings, 2026

Moshari et al. report model-free RL managing district heating across 13 real buildings over 138 winter days: 29.7% heating-energy reduction versus historical baselines, no hardware upgrades required.[9]

7. Office HVAC / TABS — SAC controller, 2024

Silvestri et al. report a Soft Actor-Critic controller in a real office building over a two-month cooling season: 68% fewer temperature comfort violations with no increase in energy use.[10]

8. Office HVAC — transfer learning, 2025

Coraci et al. report a DRL controller adapted from one building to a second via online transfer learning (HiLo case study), demonstrating multi-site deployment without retraining from scratch.[11]

9. Office HVAC — imitation learning, 2025

Silvestri et al. report an imitation-learning-assisted DRL controller on a TABS system, where policy initialisation from expert operation reduced risky early-deployment behaviour.[12]

10. Data centre cooling — Google DeepMind, worldwide

In 2016, DeepMind demonstrated 40% reduction in cooling energy (15% overall PUE reduction) using an AI advisory system.[13] In 2018 the system was upgraded to direct AI control of cooling actuators, delivering sustained ~30% cooling energy savings under operator supervision.[14] In 2022, Trane Technologies applied the same approach to two commercial (non-Google) buildings, reporting 9% and 13% energy savings in those live experiments.[15]


Why is the number not larger?

Ten confirmed deployments across the entire global industrial base is a small number. It is small not because the technology does not work — the confirmed cases prove that it does — but because a specific set of prerequisites must be in place, and most plants do not meet them yet.

The dominant reasons:

The reward signal problem. RL controllers learn by optimising a reward function. If the quantity you want to optimise cannot be measured in real time from existing instrumentation, there is no reward to compute. Cement clinker quality requires kiln sampling and laboratory analysis — hours after the control decision was made. Semiconductor critical dimension (CD) is measured by metrology tools with multi-hour queue times. Batch product quality is only known at batch end. In each of these cases, the reward is delayed, sparse, or absent, making closed-loop NN control impractical with current methods.[16]

The exploration problem. Model-free online RL requires the agent to take exploratory actions — to try things it has not tried before — in order to learn what the reward function looks like. On a live production asset, exploratory actions that violate operating constraints can cause equipment damage, safety incidents, or product loss. This is manageable through offline RL (training entirely on historical data) or model-based RL (training in a validated simulator), but it adds significant engineering effort that the simple pitch of “deploy an AI controller” omits.[18]
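One common mitigation short of full offline RL is to constrain any exploratory move to a small trust region around the current operating point, inside the hard operating limits. The sketch below is purely illustrative — the step sizes, limits, and function name are hypothetical, and this is not a substitute for a safety system.

```python
def propose_exploratory_action(current_u, lo, hi, max_step, noise):
    """Clip an exploratory move to a trust region around the current
    input, then to the absolute operating limits (illustrative only)."""
    candidate = current_u + noise
    # Limit step size relative to the current operating point
    candidate = max(current_u - max_step, min(current_u + max_step, candidate))
    # Enforce hard operating constraints last, so they always win
    return max(lo, min(hi, candidate))
```

In practice the envelope itself has to come from process engineering (alarm limits, SIS trip points), not from the learning algorithm.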

MPC already works. For a large fraction of industrial control problems, a properly commissioned linear MPC already handles the constraints and economic objectives with high interpretability and established certification pathways. The business case for NN control must compare against best-in-class MPC, not against a poorly tuned PID baseline. Where MPC performs well, the added complexity of NN control does not produce a business case.[1]

Operator trust. Neural-network controllers do not produce interpretable explanations for their decisions. For operators who work with MPC — where the cost function is explicit and the predicted trajectory is visible — this opacity is a genuine barrier. The confirmed deployments all addressed it through extended trial periods, explicit fallback modes, and sustained operator engagement. Plants that skip this step tend to find that operators revert to manual within days.


The ROI table from confirmed cases

| Sector | Reported benefit | Basis |
|---|---|---|
| Petroleum refining | $0.30–$0.50/bbl margin improvement | Imubit, 90+ live applications[2] |
| Chemical distillation | 93.11% automation ratio vs manual | Park et al., industrial DWC[7] |
| Building HVAC | 29.7% heating energy reduction | Moshari et al., 13 buildings[9] |
| Building HVAC | 68% fewer comfort violations | Silvestri et al., office building[10] |
| Data centre cooling | ~30% cooling energy reduction (autonomous) | DeepMind, Google DCs[14] |
| Rotary kiln operations | Up to 30% natural gas reduction | Imubit[2] |

Are you a good candidate?

Not every plant with a control problem is a good candidate for NN control. The confirmed deployments reveal a clear profile. A company is likely a strong candidate if it meets most of the following:

You have a measurable objective that is already in your historian. Every confirmed deployment optimises a metric — dollar margin per barrel, kWh consumed, automation ratio, pH deviation — that can be computed in real time from existing instrumentation. If your key performance indicator requires a laboratory analysis to measure, requires an expert to judge, or is only known at batch end, NN control cannot close the loop on it today.

Your process has nonlinear interactions that your current control strategy handles conservatively. NN controllers justify their cost in situations where the relationship between inputs and outputs is nonlinear, where multiple variables interact, and where the current approach (manual overrides, conservative setpoints, or simple PID) leaves performance on the table. If a well-tuned MPC already handles your constraints optimally, there is no business case for NN control — MPC’s explicit model and cost function are more transparent and easier to certify.[1]

You have at least 6–12 months of continuous historian data at scan rates of 5 minutes or faster. Offline RL and model-based RL — the methods used in every confirmed process-industry deployment — require substantial logged data to train a policy that will generalise across operating conditions.[7]

Your DCS or BMS can receive external setpoint targets. All ten confirmed deployments operate in a supervisory role: the NN controller sends setpoints to the existing regulatory layer, which executes them. If your control system does not support this integration pattern, that is an infrastructure problem to solve first.

The cost of a suboptimal decision is bounded and recoverable. In petroleum refining, a suboptimal FCC setpoint costs margin — undesirable, but recoverable. In a building, a bad heating setpoint produces a slightly uncomfortable room for an hour. These consequences are tolerable during the commissioning and early learning phase. If a bad control decision could cause an equipment trip, a safety incident, or the loss of an entire product batch, the deployment methodology changes significantly.


What you need in place: technical prerequisites

For a process engineer, the implementation prerequisites are more specific than the business indicators above.

Historian with sufficient coverage and scan rate. The minimum is typically 6–12 months of data covering multiple operating conditions, logged at a scan rate that captures the relevant process dynamics. For a distillation column with time constants of 30–60 minutes, a 5-minute scan rate is adequate. For a fast chemical reactor, sub-minute logging may be required.
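The adequacy criterion above can be expressed as a simple rule-of-thumb check: require several samples per dominant process time constant. The threshold of six samples per time constant is an illustrative assumption chosen to match the article's distillation example (5-minute scans for 30–60 minute time constants), not a standard.

```python
def scan_rate_adequate(time_constant_min, scan_interval_min, samples_per_tau=6):
    """Rule-of-thumb check: at least `samples_per_tau` historian samples
    within one dominant process time constant. The default of 6 is an
    illustrative assumption, not an industry standard."""
    return time_constant_min / scan_interval_min >= samples_per_tau
```

By this rule, a 5-minute scan passes for a 30-minute column time constant but a 10-minute scan does not.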

A formalised objective function. You must be able to write a mathematical expression for what “better control” means, using variables already in the historian. This is not a software problem — it is an engineering and business problem. Agreeing on the objective function requires alignment between process engineering, operations, and management, because the objective function is what the controller will optimise, and changing it later requires retraining.
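For a continuous process, such an objective often reduces to an economic rate computable every scan. The sketch below shows the shape of such a function; all tag names, units, and prices are hypothetical placeholders, not a real plant's economics.

```python
def margin_reward(product_flow_bph, product_price, feed_flow_bph, feed_cost,
                  fuel_flow, fuel_cost):
    """Illustrative economic objective in $/h, computable every scan
    from tags already in the historian (all names are hypothetical)."""
    revenue = product_flow_bph * product_price
    costs = feed_flow_bph * feed_cost + fuel_flow * fuel_cost
    return revenue - costs
```

The hard part is not the arithmetic but the agreement: once this expression is fixed, it is what the controller will maximise, including any term you forgot to include.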

A validated process model, or willingness to train one. For safety-constrained processes (the majority of chemical and bioprocess applications), the confirmed deployment strategy is model-based RL: train the policy in a validated simulator, then transfer to the real plant. This requires a dynamic process model of adequate fidelity. If one does not already exist, it must be built — from first principles, from system identification on historical data, or from a combination of both. Building this model is often the majority of the engineering effort.
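At its simplest, such a training environment is a numerically integrated dynamic model. The toy below integrates a first-order lag dx/dt = (K·u − x)/τ with explicit Euler — a minimal stand-in for the validated simulator a model-based RL loop would actually train against; gains and time constants here are arbitrary.

```python
def simulate_first_order(u_sequence, K=2.0, tau=30.0, dt=1.0, x0=0.0):
    """Euler integration of dx/dt = (K*u - x)/tau. Returns the state
    trajectory for a given input sequence (toy model, arbitrary units)."""
    x, traj = x0, []
    for u in u_sequence:
        x += dt * (K * u - x) / tau
        traj.append(x)
    return traj
```

A real deployment replaces this with a first-principles or system-identified model validated against plant data, which is where most of the engineering effort goes.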

DCS/BMS integration capability. The standard integration pattern is: NN controller runs on an edge server or process computer, reads tags from the historian or OPC-UA server, and writes setpoint targets back to the DCS at a defined cycle time (typically minutes). Most modern DCS platforms support this. Older systems may require a middleware layer.
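One supervisory cycle of that pattern can be sketched as below. The `read_tags` and `write_setpoint` callables stand in for whatever OPC-UA or DCS interface the site actually uses — they are injected here precisely so the sketch stays library-agnostic and testable.

```python
def supervisory_step(read_tags, policy, write_setpoint, lo, hi):
    """One supervisory cycle: read process tags, ask the policy for a
    setpoint, clamp it to the regulatory layer's limits, write it back.
    The I/O callables are placeholders for a real OPC-UA/DCS client."""
    state = read_tags()
    sp = policy(state)
    sp = max(lo, min(hi, sp))   # never send a setpoint outside limits
    write_setpoint(sp)
    return sp
```

Clamping at this layer is deliberate: even a misbehaving policy can only request setpoints the regulatory layer is allowed to execute.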

Inference runtime for the NN model. The trained policy runs on an edge server or process computer — not inside the PLC itself. The standard export format is ONNX (portable across training frameworks), served by ONNX Runtime for CPU inference or TensorRT for GPU-accelerated inference. For Siemens S7-1500 users, the AI Inference Server add-on supports ONNX models directly; Beckhoff’s TwinCAT Machine Learning extension provides equivalent capability on TwinCAT 3 systems. MATLAB’s Deep Learning Toolbox with Simulink generates deployable code directly from trained networks. For process control loops with cycle times of minutes, standard CPU inference is more than sufficient. Fast control loops — machine motion (1–10 ms scan cycle), fast reactor temperature or pressure (seconds-range dynamics) — require inference latency validation before committing to a model architecture.
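The latency validation mentioned above needs nothing more than timing the policy call under realistic inputs. The sketch measures worst-case latency of any callable; the averaging lambda is a trivial stand-in for a wrapped inference session, not real model code.

```python
import time

def max_inference_latency(policy, state, n_runs=200):
    """Measure worst-case wall-clock latency of a policy call over
    repeated runs. `policy` is any callable, e.g. a wrapped ONNX
    Runtime session; here a trivial stand-in is used."""
    worst = 0.0
    for _ in range(n_runs):
        t0 = time.perf_counter()
        policy(state)
        worst = max(worst, time.perf_counter() - t0)
    return worst

# Stand-in policy: average of a 64-element state vector
latency = max_inference_latency(lambda s: sum(s) / len(s), [1.0] * 64)
```

For a minutes-scale supervisory loop this check is a formality; for millisecond motion control it decides whether the architecture is deployable at all.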

Operator acceptance protocol. Every confirmed deployment includes a mechanism for operators to exit AI control and revert to the previous mode. This is not just a safety requirement — it is an adoption requirement. Operators who understand the system and trust its behaviour are the difference between a deployment that runs continuously and one that reverts to manual within a week. The JSR/Yokogawa deployment explicitly involved operator engagement throughout the trial period.[4]
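The exit mechanism is logically simple: revert whenever the operator asks or the NN output leaves the validated envelope. A hedged sketch of that decision (names and limits hypothetical):

```python
def select_control_mode(nn_setpoint, fallback_setpoint, lo, hi, operator_override):
    """Revert to the previous control layer whenever the operator
    requests it or the NN output leaves the validated envelope."""
    if operator_override or not (lo <= nn_setpoint <= hi):
        return "FALLBACK", fallback_setpoint
    return "NN", nn_setpoint
```

The point of making this explicit is cultural as much as technical: operators who can see exactly when and why control reverts are far more willing to leave the NN in the loop.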


Enabling factors: what makes deployment succeed

Across the confirmed deployments and the research literature, five factors consistently enable successful real-world deployment.

Offline RL removes the exploration problem. The single biggest practical barrier to RL in industry is that model-free RL requires the agent to explore — to try actions it has not tried before — in order to learn. On a live production asset, exploration can cause equipment damage, safety incidents, or product loss. Offline RL, which learns a policy entirely from logged historical data with no online interaction with the plant, removes this barrier entirely.[17] The dividing wall column deployment is the clearest industrial example: the policy was trained on historical data, deployed, and immediately achieved 93.11% automation ratio without any live exploration.[7]

Behaviour cloning and imitation learning for safe initialisation. Where some online learning is needed after deployment, initialising the NN policy from expert operator demonstrations (behaviour cloning) ensures that the system starts within the safe operating envelope rather than from a random initial policy.[12] The photobioreactor deployment used this approach; so did the imitation-learning building case. The technical effect is that the policy inherits the operator’s knowledge of safe operation before any online adaptation begins.
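In its simplest form, behaviour cloning is just supervised regression from logged states to operator actions. The sketch below fits a one-dimensional linear "policy" by ordinary least squares — a deliberately minimal illustration of the idea; real deployments use richer state and a neural policy.

```python
def clone_linear_policy(states, actions):
    """Fit action ~ slope*state + intercept to logged operator data by
    ordinary least squares: the simplest behaviour-cloned 'policy'."""
    n = len(states)
    mean_s = sum(states) / n
    mean_a = sum(actions) / n
    cov = sum((s - mean_s) * (a - mean_a) for s, a in zip(states, actions))
    var = sum((s - mean_s) ** 2 for s in states)
    slope = cov / var
    intercept = mean_a - slope * mean_s
    return lambda s: slope * s + intercept
```

The cloned policy only reproduces behaviour the operators actually demonstrated, which is exactly why it is a safe starting point for online adaptation.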

Model-based RL for physical systems where exploration cost is high. Where a validated dynamic model exists, the RL policy can be trained in simulation to a high level of capability before ever touching the real plant.[6] The sim-to-real transfer still requires careful validation, but it confines the risky learning phase to a computer rather than a production asset.

Transfer learning for multi-site deployment. Once an RL policy is trained and validated on one asset, transfer learning allows it to be adapted to a second similar asset at much lower cost than training from scratch.[11] For operators of multiple similar units — refiners with several FCC units, building portfolios, multi-site chemical manufacturers — this is commercially significant: the training investment is amortised across the portfolio.

Process nonlinearity that exceeds what MPC can handle effectively. NN controllers are not universally better than MPC. MPC with a linear or mildly nonlinear model, properly tuned, handles a large fraction of industrial control problems efficiently and with high interpretability. The cases where NN controllers have been deployed are typically those where the process dynamics are sufficiently nonlinear or high-dimensional that an accurate linear model cannot be maintained, or where the operating envelope is too large for a single linear approximation to be valid across all conditions.[6]


Model maintenance after deployment

A deployed NN controller is not a set-and-forget system. Industrial processes drift — feedstock composition changes, equipment ages, operating targets shift — and a policy trained on historical data will degrade in performance over time if not maintained.

Retraining frequency depends on how fast the process drifts. The practical approach is to monitor the controller’s performance KPI continuously and trigger a retraining review when it falls below an acceptable threshold. Intervals in confirmed deployments range from months (slow-drifting continuous processes) to event-driven reviews after significant plant changes.
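The monitoring logic described above amounts to a rolling-mean threshold check. The sketch below is an illustrative trigger only; window length and threshold are site-specific choices, not recommendations.

```python
def retrain_review_due(kpi_history, threshold, window=30):
    """Flag a retraining review when the rolling-mean KPI over the last
    `window` samples falls below an agreed threshold (illustrative)."""
    if len(kpi_history) < window:
        return False          # not enough data yet to judge drift
    recent = kpi_history[-window:]
    return sum(recent) / window < threshold
```

A triggered review leads to the offline retrain-and-validate cycle, not to an automatic weight update on the live controller.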

Online learning — continuously updating model weights from live production data — is generally not viable for certified production systems. Uncontrolled weight updates cannot be validated before taking effect, which conflicts with process safety and quality management requirements. Offline retrain-and-validate is the current best practice for production deployments.

The retraining procedure follows the same steps as the original deployment, but faster because the infrastructure is in place: collect new historian data → retrain offline → validate in simulation → supervised commissioning trial on the live plant → promote to production if criteria are met.

Change management. Any modification to a deployed NN controller — new training data, changed objective function, different architecture — constitutes a software change and must pass through the site’s management of change (MoC) procedure, with documentation and re-approval. To a DCS engineer, this is the normal process control change workflow. To an ML engineer accustomed to continuous deployment on web services, it is a significant operational constraint that must be planned for before deployment.


Why some sectors lead and others lag

The table below maps each sector against the four factors that determine deployability, explaining the full pattern of confirmed deployments.

| Factor | Petroleum refining | Building HVAC | Chemical plants | Cement / batch chemistry |
|---|---|---|---|---|
| Reward measurable in real time? | Yes — $/bbl, yield from DCS | Yes — kWh from meter | Yes — temperature, pH, ratio | Often no — quality from lab, hours later |
| Cost of a bad decision | Lost margin (recoverable) | Slight discomfort (recoverable) | Equipment damage / SIS trip (high) | Product batch loss, kiln trip (high) |
| Control timescale | Minutes | 5–60 minutes | Minutes | Minutes to hours |
| Historian and data quality | Strong — standard in refineries | Strong — BMS standard | Variable | Often weak |
| Result | Portfolio-scale commercial deployment | Widest academic deployment base | Narrow but confirmed | Absent from confirmed list |

Petroleum refining leads commercially because all four factors align: the reward is computable in real time from the DCS, a bad setpoint costs margin rather than triggering a safety incident, and refineries maintain extensive historians. The Imubit deployments are the direct consequence of this alignment across a large number of similar assets.

Building HVAC leads academically because the timescale factor is exceptionally forgiving. A controller making decisions every 15–60 minutes experiences recoverable consequences from suboptimal early behaviour — the worst case is a slightly uncomfortable room, not a column flood. This makes online model-free RL viable, which reduces the implementation cost compared with the offline or model-based approaches required for safety-constrained processes.

Chemical process applications have narrow but confirmed deployments because the reward is measurable but the cost of a bad decision is high. Offline RL and model-based RL resolve this — but they require engineering work (process models, data pipelines) that adds cost and timeline, explaining why fewer deployments exist.

Cement, batch chemistry, and semiconductor fabrication are absent for one dominant reason: the reward signal is not computable in real time. Cement clinker quality requires kiln sampling. Semiconductor CD is measured by metrology tools with multi-hour lead times. Batch endpoints are only known at batch end. Until real-time soft sensors that predict these quality variables from process data are available and trusted, RL cannot close the loop. This is a data and instrumentation problem, not an algorithm problem.


The bottom line

Most of what is sold as “industrial AI” is not neural-network control. It is predictive maintenance, digital twins, soft sensors, and scheduling optimisation — all useful, none of it a closed-loop controller.

Neural-network controllers confirmed in real production operation:

  • Petroleum refining — commercial portfolio scale, documented ROI in $/bbl[2][3]
  • Chemical and bioprocess plants — narrow but confirmed; requires offline or model-based RL[4][5][6][7][8]
  • Building energy systems — widest academic deployment base[9][10][11][12]
  • Data centre cooling — continuous AI-directed operation since 2018[13][14][15]

If you operate a continuous process with a real-time measurable objective, a plant historian, and a DCS that accepts supervisory setpoints — and your current control strategy is leaving measurable performance on the table — the technology is ready. The constraint is no longer algorithmic. It is engineering execution: building the process model or data pipeline, designing the objective function, and managing the operator adoption process.

The right starting point is a structured assessment of whether your specific process meets the deployment prerequisites — and if not, what would need to change first.


Dr. Rafał Noga specialises in model-based predictive and learning-based control for industrial systems. If you want to assess whether neural-network or model-predictive control is the right next step for your process, the free diagnostic call is where that conversation starts.


Is NN or MPC control right for your process?

Get a straight answer in a free 30-minute diagnostic call.

Book a free 30-min call →

References

1. Alginahi, Y.M., Sabri, O., Said, W. (2025). Reinforcement Learning for Industrial Automation: A Comprehensive Review of Adaptive Control and Decision-Making in Smart Factories. Machines, 13(12), 1140. https://doi.org/10.3390/machines13121140

2. Imubit (2024). Imubit Launches Closed-Loop AI Optimization Solution Powered by Reinforcement Learning. Hydrocarbon Processing, September 2024. https://www.hydrocarbonprocessing.com/news/2024/09/imubit-launches-closed-loop-ai-optimization-solution-powered-by-reinforcement-learning/

3. Imubit (2024). The Process Industry’s First Reinforcement Learning-Powered Closed-Loop AI Optimization. https://imubit.com/blog/the-process-industrys-first-reinforcement-learning-powered-closed-loop-ai-optimization/

4. Yokogawa Electric Corporation (2022). Yokogawa and JSR Achieve World-First Adoption of AI Autonomous Control in Chemical Plant. Press release, March 22, 2022. https://www.yokogawa.com/us/news/press-releases/2022/2022-03-22/

5. Yokogawa Electric Corporation (2023). ENEOS Materials and Yokogawa Achieve First Successful Autonomous Control of a Chemical Plant Using Reinforcement Learning AI. Press release, March 30, 2023. https://www.yokogawa.com/us/news/press-releases/2023/2023-03-30/

6. Blum, F. et al. (2021). Investigation of a Model-Based Deep Reinforcement Learning Controller Applied to an Air Separation Unit in a Production Environment. Chemie Ingenieur Technik. https://doi.org/10.1002/cite.202100094

7. Park, J., Choi, W., Kim, D., Park, H.E., Lee, J.M. (2025). Real-World Implementation of Offline Reinforcement Learning for Process Control in Industrial Dividing Wall Column. SSRN preprint. https://papers.ssrn.com/sol3/papers.cfm?abstract_id=5253477

8. Gil, J.D., del Rio Chanona, E.A., Guzmán, J.L., Berenguel, M. (2025). Reinforcement learning meets bioprocess control through behaviour cloning: Real-world deployment in an industrial photobioreactor. Engineering Applications of Artificial Intelligence. https://arxiv.org/abs/2509.06853

9. Moshari, A., Javanroodi, K., Nik, V.M. (2026). Real-world deployment of model-free reinforcement learning for energy control in district heating systems. Applied Energy, 402. https://doi.org/10.1016/j.apenergy.2025.126997

10. Silvestri, A., Coraci, D., Brandi, S., Capozzoli, A., Borkowski, E., Köhler, J., Wu, D., Zeilinger, M.N., Schlueter, A. (2024). Real building implementation of a deep reinforcement learning controller to enhance energy efficiency and indoor temperature control. Applied Energy, 368, 123447. https://doi.org/10.1016/j.apenergy.2024.123447

11. Coraci, D., Silvestri, A., Razzano, G., Fop, D., Brandi, S., Borkowski, E., Hong, T., Schlueter, A., Capozzoli, A. (2025). A scalable approach for real-world implementation of deep reinforcement learning controllers in buildings based on online transfer learning: The HiLo case study. Energy and Buildings, 329, 115254. https://doi.org/10.1016/j.enbuild.2024.115254

12. Silvestri, A., Coraci, D., Brandi, S., Capozzoli, A., Borkowski, E., Köhler, J., Wu, D., Zeilinger, M.N. (2025). Practical deployment of reinforcement learning for building controls using an imitation learning approach. Energy and Buildings, 335. https://www.sciencedirect.com/science/article/pii/S0378778825002415

13. DeepMind (2016). DeepMind AI reduces Google data centre cooling bill by 40%. https://deepmind.google/blog/deepmind-ai-reduces-google-data-centre-cooling-bill-by-40/

14. DeepMind (2018). Safety-first AI for autonomous data centre cooling and industrial control. https://deepmind.google/blog/safety-first-ai-for-autonomous-data-centre-cooling-and-industrial-control/

15. Luo, J. et al. (2022). Controlling Commercial Cooling Systems Using Reinforcement Learning. arXiv:2211.07357. https://arxiv.org/abs/2211.07357

16. Dulac-Arnold, G., Levine, N., Mankowitz, D.J., Li, J., Paduraru, C., Gowal, S., Hester, T. (2021). Challenges of Real-World Reinforcement Learning: Definitions, Benchmarks and Analysis. Machine Learning, 110, 2419–2468. https://doi.org/10.1007/s10994-021-05961-4

17. Levine, S., Kumar, A., Tucker, G., Fu, J. (2020). Offline Reinforcement Learning: Tutorial, Review, and Perspectives on Open Problems. arXiv:2005.01643. https://arxiv.org/abs/2005.01643

18. García, J., Fernández, F. (2015). A Comprehensive Survey on Safe Reinforcement Learning. Journal of Machine Learning Research, 16(1), 1437–1480. https://jmlr.org/papers/v16/garcia15a.html

Have a project or a question?

Contact Dr. Noga →