AI in Wind Turbine Operations

Michael Tegtmeier
Jul 10, 2023

Updated: May 26, 2025

Transcript:

My name is Michael Tegtmeier, and I have a background in physics. I worked at Senvion, focusing on various measurements. I analyzed a lot of data and found that manufacturers, despite having a wealth of sensors and data, don't really analyze it the way it could be analyzed today, especially with AI. This was particularly evident in 2017.

Today, I'd like to present what we have discovered so far, and peek into the current state of the wind industry. Most importantly, I will discuss the potential advantages that AI can bring to wind energy applications and provide some foresight into where things may head in the next five years.

So, I will be discussing the capabilities of AI, not only in general but also specifically in the wind energy sector. I will talk about the added value it brings, and why the direction of AI's development is vital. It's common knowledge that machine learning and big data have become buzzwords, but I'll try to explain why they're important.

There have been three major advancements in AI in recent years. One of the most important is training AI on large data sets so that it learns how the data relates. For instance, I entered into the chat: "You are the technical operator of a wind energy plant. Damage has occurred to the main bearing of a Senvion turbine, indicated by increased temperature. Write an email to the service team requesting an oil sample from the bearing as soon as possible, and please do so within 300 characters."

The result was a well-constructed message, demonstrating that processes, such as writing an email, can already be automated effectively. On a lighter note, AI can also generate images. While the generated wind turbine image had four blades, a minor discrepancy, it still demonstrates the capabilities of AI.

This brings us to the second advancement: object recognition, which requires vast amounts of measurement data or labeled data sets. This has broad applications, especially in autonomous driving, but also for startups and companies that use AI to detect blade damage or cracks on towers.

Today, I want to present a different approach, focusing on reinforcement learning. It's an algorithm that is particularly proficient at learning to play complex games. The idea here is for an AI agent to adapt its behavior in an environment according to a specific objective function, even if it doesn't know the environment. This concept is highly applicable to a wind energy plant as well.
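To make the idea of an agent adapting to an unknown environment more concrete, here is a minimal sketch using tabular Q-learning. The two load states, the two power setpoints, and all reward values are invented for illustration; they are assumptions, not figures from our systems.

```python
import random

# Hypothetical toy environment: the state is a coarse load level and the
# action is a power setpoint. All rewards are invented for illustration.
STATES = ["low_load", "high_load"]
ACTIONS = ["reduced", "full"]

def step(state, action, rng):
    """Return (reward, next_state). The agent never sees these rules."""
    if action == "full":
        # At high load, assumed wear costs outweigh the extra yield.
        reward = 1.0 if state == "low_load" else -1.0
    else:
        reward = 0.5 if state == "low_load" else 0.2
    return reward, rng.choice(STATES)

def train(steps=20000, alpha=0.05, gamma=0.5, epsilon=0.1, seed=0):
    """Learn action values from trial and error with epsilon-greedy exploration."""
    rng = random.Random(seed)
    q = {(s, a): 0.0 for s in STATES for a in ACTIONS}
    state = rng.choice(STATES)
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(ACTIONS)  # explore
        else:
            action = max(ACTIONS, key=lambda a: q[(state, a)])  # exploit
        reward, next_state = step(state, action, rng)
        best_next = max(q[(next_state, a)] for a in ACTIONS)
        # Standard Q-learning update toward the observed reward plus
        # the discounted value of the best follow-up action.
        q[(state, action)] += alpha * (reward + gamma * best_next - q[(state, action)])
        state = next_state
    return q

q_table = train()
policy = {s: max(ACTIONS, key=lambda a: q_table[(s, a)]) for s in STATES}
print(policy)
```

The agent only ever sees rewards, yet it ends up with a setpoint per load state that reflects the objective function. A real turbine would need a far richer state and a carefully designed reward.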

The benefits of AI application can be highlighted quite well already. But, before diving into that, let's discuss why it's relevant and what problems it addresses. As wind turbines become larger, we face specific challenges. For instance, in a repowering project, if one out of 50 turbines fails, it's not a significant issue. But after repowering, if I have 17 much more powerful turbines and one of them fails, the risk and loss of yield are much greater.

As the turbines increase in size, so does the physical strain on them, making them more sensitive and prone to faults. There's also the issue of staff shortages and demographic change: who will perform all the necessary tasks? Additionally, Original Equipment Manufacturers (OEMs) tend to move these risks off their books by making full-service contracts less comprehensive, so the opportunity costs from yield losses end up on the operators' balance sheets.

For instance, if there's an issue with the transformer station or a turbine in a wind farm of 17 turbines, the costs from yield losses due to long delivery times can quickly escalate to millions. When you sum up these damages and evaluate which of them lead to a yield loss, you arrive at approximately 26,000 euros per megawatt per year due to downtime. If software can reduce this, you've made a good return on investment.
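As a back-of-the-envelope sketch of that return-on-investment argument, the snippet below applies the 26,000 euros per megawatt per year to a 17-turbine farm. The 5 MW rating, the share of downtime avoided, and the software cost are assumptions chosen only to show the arithmetic.

```python
COST_PER_MW_YEAR_EUR = 26_000  # downtime cost per MW per year, as quoted above

def annual_downtime_cost(n_turbines, rated_mw):
    """Expected yearly downtime cost for a farm of identical turbines."""
    return n_turbines * rated_mw * COST_PER_MW_YEAR_EUR

def software_roi(avoided_fraction, annual_software_cost_eur, n_turbines, rated_mw):
    """Return on investment if software avoids a given share of downtime cost."""
    savings = avoided_fraction * annual_downtime_cost(n_turbines, rated_mw)
    return (savings - annual_software_cost_eur) / annual_software_cost_eur

# Assumed example: 17 turbines rated at 5 MW each.
farm_cost = annual_downtime_cost(17, 5.0)
# Assumed example: software avoids 20% of that cost and costs 100,000 euros a year.
roi = software_roi(0.2, 100_000, 17, 5.0)
print(f"downtime cost: {farm_cost:,.0f} EUR/year, ROI: {roi:.2f}")
```

With these assumed inputs the farm carries about 2.2 million euros of downtime cost per year, so even a modest reduction would pay for the software several times over.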

Here's a chart showing damage statistics. On the y-axis, we have the probability of damage occurrence, and the bubble size represents the downtime when the damage occurs. The point here is that we can prevent many of the events that lead to downtime and that currently sit on the operators' balance sheets.

Let's look at a real-world example. We warned an operator and an OEM that a generator was about to fail. The temperature trend is shown here, but our warning was ignored. When the turbine suffered total failure, the operation of the other two turbines showing similar behavior was altered, preventing the same fate. What I want to highlight here is that faults can be detected months in advance. This gives you time to plan for component replacement and significantly reduce downtime.

Here's another example. A photo was taken of black grease on a turbine; it was originally white. Several weeks later, we sent a warning, even though we didn't detect an increased temperature in the rotor bearing. Simple preventative measures can be taken, which I believe Eric will explain better later.

The whole thing falls under the umbrella of machine learning and big data. We take historical data and train a neural network on past behavior. We know how well this works because we split the data into training and validation datasets, which shows that we can predict temperature within +/- 0.5 degrees and power output within +/- 30 kW.

We take external temperature, wind speed, and wind direction as input parameters to predict a turbine's power output. We can then make very precise predictions about power output behavior, and thus identify minor deviations, pitch problems, or issues in the control system. We can detect things that previously weren't visible.
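The principle of learning normal behavior and then watching the residuals can be sketched in a few lines. This is a simplified stand-in, not our production model: instead of a neural network it uses mean power per wind-speed bin, it uses wind speed as the only input, and the 3 MW power curve and noise level are synthetic assumptions.

```python
import random
from statistics import mean

def true_power_kw(wind_ms):
    """Hypothetical 3 MW power curve, used only to generate synthetic data."""
    if wind_ms < 3.0:
        return 0.0
    return min(3000.0, 3000.0 * ((wind_ms - 3.0) / 9.0) ** 3)

def make_samples(n, rng, derate=1.0):
    """Synthetic (wind speed, power) pairs with 20 kW measurement noise."""
    samples = []
    for _ in range(n):
        wind = rng.uniform(3.0, 15.0)
        samples.append((wind, derate * true_power_kw(wind) + rng.gauss(0.0, 20.0)))
    return samples

def fit_binned_model(samples, bin_width=0.1):
    """Mean power per wind-speed bin; a stand-in for the neural network."""
    bins = {}
    for wind, power in samples:
        bins.setdefault(int(wind / bin_width), []).append(power)
    return {b: mean(p) for b, p in bins.items()}, bin_width

def predict(model, wind_ms):
    table, bin_width = model
    idx = int(wind_ms / bin_width)
    if idx not in table:  # fall back to the nearest bin seen in training
        idx = min(table, key=lambda b: abs(b - idx))
    return table[idx]

def mean_abs_error(model, samples):
    return mean(abs(power - predict(model, wind)) for wind, power in samples)

def mean_residual(model, samples):
    """Average of measured minus predicted power; near zero when healthy."""
    return mean(power - predict(model, wind) for wind, power in samples)

rng = random.Random(42)
train = make_samples(5000, rng)        # historical data for fitting
validation = make_samples(1000, rng)   # held-out data to check accuracy
model = fit_binned_model(train)

val_mae = mean_abs_error(model, validation)
healthy_bias = mean_residual(model, validation)
# A turbine producing 5% less than expected, e.g. from a pitch problem.
derated_bias = mean_residual(model, make_samples(1000, rng, derate=0.95))
print(val_mae, healthy_bias, derated_bias)
```

The validation error tells you how tight the model is, and a persistent negative residual on new data is the kind of minor deviation that would otherwise stay invisible.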

Here you can see the power curve isn't according to the manufacturer's standard, but it has been compared to a simulation. Small deviations are clearly visible. At the bottom, you can see the difference between the simulation and the actual data.

Now, a crucial aspect is what to do with all these reports about anomalies, high temperatures, etc. If I have a portfolio of several hundred turbines, it's impossible to check everything. What we do is collect constant feedback from our customers about the relevance of these alerts. Over time, a predicted relevance score emerges using a 1 to 5-star rating system. In this example, the relevance score rose before the algorithm marked an issue, indicating the effectiveness of this system.

So, somehow the algorithm has recognized that it needs to pay attention. It has deemed something here relevant. This is excellent for prioritizing within large portfolios and identifying which events need our focus.
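One simple way to picture such a relevance score is a smoothed average of the star ratings that each type of alert has received. The sketch below is only an illustration under that assumption; the class name, the alert types, and the prior of three stars are hypothetical, and the real scoring is learned rather than averaged.

```python
from collections import defaultdict

class RelevanceModel:
    """Smoothed average of 1 to 5 star feedback per alert type (illustrative)."""

    def __init__(self, prior_stars=3.0, prior_weight=2.0):
        # The prior keeps alert types with little feedback near a neutral score.
        self.prior_stars = prior_stars
        self.prior_weight = prior_weight
        self.sum_stars = defaultdict(float)
        self.count = defaultdict(int)

    def add_feedback(self, alert_type, stars):
        if not 1 <= stars <= 5:
            raise ValueError("stars must be between 1 and 5")
        self.sum_stars[alert_type] += stars
        self.count[alert_type] += 1

    def predicted_relevance(self, alert_type):
        total = self.prior_stars * self.prior_weight + self.sum_stars[alert_type]
        return total / (self.prior_weight + self.count[alert_type])

model = RelevanceModel()
for stars in (5, 5, 4):
    model.add_feedback("gearbox_temperature", stars)
for stars in (2, 1):
    model.add_feedback("pitch_deviation", stars)

open_alerts = ["pitch_deviation", "gearbox_temperature", "yaw_misalignment"]
ranked = sorted(open_alerts, key=model.predicted_relevance, reverse=True)
print(ranked)
```

Sorting open alerts by this score puts the ones customers have historically cared about at the top, which is the prioritization effect described above.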

The holy grail, so to speak, is fault mode prediction, which makes this talk on damage classification particularly exciting. In the end, I want an analysis, preferably automated rather than always done manually, that indicates the possible causes of, for example, a reduction in performance.

However, I don't always have all the necessary information. I might have some status codes, but they can be challenging to interpret. Assigning all of this is an enormous amount of work. So, what we do here is use the labels that our customers apply to their data or to the anomalies that our detection system identifies. From these, we give a probability estimate of how a customer would likely label a new anomaly.
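A very reduced sketch of such a probability estimate is to count how past anomalies of the same type were labeled and smooth the counts. The anomaly types and label names below are hypothetical, and simple frequency counting stands in for the actual classifier.

```python
from collections import Counter

def label_probabilities(history, anomaly_type, labels, smoothing=1.0):
    """Estimate how likely each label is for a new anomaly of a given type.

    history is a list of (anomaly_type, label) pairs from past feedback.
    Additive smoothing keeps labels that were never seen above zero.
    """
    counts = Counter(label for kind, label in history if kind == anomaly_type)
    total = sum(counts.values()) + smoothing * len(labels)
    return {label: (counts[label] + smoothing) / total for label in labels}

# Hypothetical labels and feedback history.
LABELS = ["bearing_damage", "sensor_fault", "no_action"]
history = [
    ("generator_temp_high", "bearing_damage"),
    ("generator_temp_high", "bearing_damage"),
    ("generator_temp_high", "sensor_fault"),
    ("generator_temp_high", "bearing_damage"),
    ("power_deviation", "no_action"),
]

probs = label_probabilities(history, "generator_temp_high", LABELS)
most_likely = max(probs, key=probs.get)
print(most_likely, probs)
```

In practice the estimate would draw on many more inputs than the anomaly type, but the output has the same shape: a probability per candidate label.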

In this case, we see room for improvement, but we already see in orange the rise of each patch, which is extremely exciting as an initial indication. We probably don't need to worry about the patch.

Let's take a moment to discuss our AI infrastructure, which we have developed in four stages. We believe that the system will continue to improve over time. The first stage involves classic data engineering, examining raw data and figuring out how to acquire it. In the second stage, we clean the data using AI methods or statistical methods. Then, in the third stage, we define the training set for anomaly detection. This is all part of the anomaly detection stage, where we use traditional neural networks and other statistical methods for outlier detection.

Transfer learning is particularly important. Sometimes we don't have a lot of data, and so we learn from larger datasets. For example, if a wind farm has been newly built, I don't have any data yet but still need monitoring. So, we learn beforehand from neighboring facilities or similar types, then quickly train the system to understand how this new facility operates.
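The effect can be illustrated with a much simpler mechanism than fine-tuning a neural network: start from a power curve learned at a neighboring turbine and shift it toward the new turbine's measurements as they arrive. The shrinkage blend below is a stand-in chosen for brevity, and the bin numbers, power values, and prior weight are assumptions.

```python
def adapt_curve(prior_curve, new_samples, prior_weight=20.0):
    """Blend a neighbor's binned power curve with data from a new turbine.

    prior_curve maps a wind-speed bin to mean power (kW) at the neighbor.
    new_samples is a list of (bin, power_kw) pairs from the new turbine.
    prior_weight is how many new samples the prior is worth per bin.
    """
    sums, counts = {}, {}
    for b, power in new_samples:
        sums[b] = sums.get(b, 0.0) + power
        counts[b] = counts.get(b, 0) + 1
    adapted = {}
    for b, prior_power in prior_curve.items():
        n = counts.get(b, 0)
        # With no new data the prior is kept; with lots of data it fades out.
        adapted[b] = (prior_weight * prior_power + sums.get(b, 0.0)) / (prior_weight + n)
    return adapted

# Hypothetical neighbor curve: bins at 8 m/s and 10 m/s.
neighbor_curve = {8: 1000.0, 10: 2000.0}
# The new turbine has so far only been observed around 8 m/s.
first_days = [(8, 900.0)] * 20

adapted = adapt_curve(neighbor_curve, first_days)
print(adapted)
```

Monitoring can start on day one with the neighbor's behavior, and each bin moves toward the new turbine's own behavior as soon as data for it exists.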

The fourth stage, which is perhaps the most crucial, involves classification or fault mode prediction. We use another AI layer here that incorporates all available data, status code differences, anomalies, and anything else we can find to make this prediction.

None of this would work if the customer's process or the feedback from the service team was off. Hence, we've invested a lot of time and effort into refining this process. The crucial question with thousands of facilities is: how can we scale this up? Prioritizing the alarms via AI is crucial. Another major issue we should discuss in the roundup is how to get the service partner to actually act on the events, or even initiate events at the facility.

Having confidence in the data and ensuring a low false-positive rate are important. We aim to be fast - sending an alarm within 36 hours. We do this on a facility-by-facility basis. For instance, we have 4,000 networks in production and receive an average of 120 alarms per week, labeled by customers. This has provided us with a wealth of experience in fault detection.

So, where is all this leading us? I've brought an old newspaper article here. It tells of math teachers who were worried when calculators were introduced, fearing they would lose their jobs. Of course, we know that didn't happen - there are still math teachers, but the teaching tools and methods have changed. Similarly, job roles are changing. For example, the role of the technical operator is changing faster than we might think, but it's not going away - it's important to emphasize this.

What changes do I foresee? AI-assisted operation will be a reality, with AI advising where to focus or what actions to take. At the same time, higher-level decision-making will be supported with tools that weigh whether it's worth sending a service team or not. It's essential to always have human oversight, particularly when significant costs are involved.

Another major theme is big data. We need to move away from 10-minute data intervals and move towards 60-second or even second-by-second data. There's much more information and precision in this, and it can improve the granularity of monitoring. For example, if I'm monitoring an azimuth that oscillates back and forth, I won't see anything in 10-minute intervals. But if I want to calculate the lifetime of the azimuth motor, I need to look at minute data.
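The azimuth example is easy to reproduce with synthetic numbers. In the sketch below, the oscillation amplitude and period are assumptions, and total angular travel is used as a rough, hypothetical proxy for wear on the azimuth motor.

```python
import math

def yaw_signal(duration_s=600, amplitude_deg=5.0, period_s=200.0, mean_deg=180.0):
    """Synthetic one-second yaw positions oscillating around a fixed heading."""
    return [
        mean_deg + amplitude_deg * math.sin(2 * math.pi * t / period_s)
        for t in range(duration_s)
    ]

def total_travel_deg(samples):
    """Sum of absolute position changes between consecutive samples."""
    return sum(abs(b - a) for a, b in zip(samples, samples[1:]))

one_second = yaw_signal()
ten_minute_mean = sum(one_second) / len(one_second)

travel_1s = total_travel_deg(one_second)            # what the motor actually did
travel_10min = total_travel_deg([ten_minute_mean])  # what a 10-minute value shows
print(ten_minute_mean, travel_1s, travel_10min)
```

The 10-minute average sits at the nominal heading and suggests the nacelle never moved, while the one-second data shows close to 60 degrees of travel in the same interval.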

In the next five years, we believe that facilities will be controlled in real time. This means we can weigh up situations such as this: if a component is completely broken and the market electricity price is currently low, we can reduce output. Alternatively, if I have a good market electricity price in the evening and know I have a fault, I might reduce output in the afternoon and run at maximum when the market price and wind conditions are good.
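A minimal sketch of that trade-off, assuming the fault limits the turbine to a fixed number of full-load hours: rank the hours by expected revenue and spend the full-load budget where it earns the most. The function name, the prices, the available power, and the 50% reduced setpoint are all hypothetical, and negative prices are not handled.

```python
def plan_output(prices_eur_mwh, available_mw, full_load_hours, reduced_fraction=0.5):
    """Choose a setpoint per hour when only some hours may run at full output.

    Returns a list with 1.0 for full output and reduced_fraction otherwise.
    The gain from running full instead of reduced is proportional to
    price times available power, so the hours are ranked by that product.
    """
    revenue_at_full = [p * w for p, w in zip(prices_eur_mwh, available_mw)]
    ranked = sorted(range(len(revenue_at_full)),
                    key=lambda h: revenue_at_full[h], reverse=True)
    full_hours = set(ranked[:full_load_hours])
    return [1.0 if h in full_hours else reduced_fraction
            for h in range(len(revenue_at_full))]

# Hypothetical afternoon-to-evening window: prices rise toward the evening.
prices = [40.0, 30.0, 90.0, 120.0]   # EUR/MWh per hour
available = [3.0, 3.0, 2.0, 3.0]     # MW the wind would allow per hour

plan = plan_output(prices, available, full_load_hours=2)
print(plan)
```

Here the plan runs reduced in the two cheap afternoon hours and at full output in the two evening hours. A learning agent would have to make the same kind of choice with uncertain forecasts instead of known values.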

We can deploy agents that can learn to manage these constantly changing factors like consumption and market electricity prices in real-time. These trends are also noticeable offshore, with many customers and projects already taking place. Large-scale projects that happen offshore will also occur onshore. More responsibility is falling to the operator instead of the OEM due to the large associated risks, and service agreements are becoming less comprehensive.

The automation of processes and decisions, including with AI, is already common offshore.

Thank you for your time!