Wind power is a major component of the American portfolio of renewable energy options, and its role is poised to grow rapidly. Approximately one fifth of the world’s total capacity is now installed in the U.S.A. The U.S. Department of Energy has advocated increasing the installed capacity to 300 GW so that 20% of total demand is furnished by wind by 2030. In 2009, the U.S.A. increased its capacity by 39%, reaching a total installed capacity of 35 GW, yet wind resources meet only 2% (70 billion kWh) of net American demand.

We are developing machine learning approaches that provide powerful, scalable statistical tools to the wind energy community. These include modeling methods such as Bayesian networks, copula-based dependence modeling, and Gaussian processes, as well as optimization approaches based on sampling and generative approaches. We collaborate closely with AWS Truepower. We have identified three areas where machine learning and information technology can help improve wind system performance. The first supports building a wind farm through resource assessment. The second optimizes the layout of a farm given turbine models, farm constraints, and the wind resource. We have also developed an optimal power routing algorithm for large farms that improves their efficiency (the current implementation of the algorithm is used commercially in OpenWind). Finally, we are interested in developing techniques that improve forecasting accuracy, helping wind integrate seamlessly into our energy portfolio.

Small wind developers are forced to cut assessment costs because of concerns about project return on investment. Accordingly, only one or two site anemometers are set up, and recording takes place over a short interval. This makes the data “sparse”. Sparse data inadequately captures the wind speed and directional properties of the site. Sparsity exacerbates the measure-correlate-predict problem because, in addition to possibly complex correlations between met towers and the site, there is missing data.

To examine this problem, we collected wind speed and direction data from 14 airports in reasonable proximity to each other, including Logan International Airport. The data was recorded at a frequency of 1 sample/minute. For example, we used Logan Airport wind data from January and February 2007 for training and tested our algorithm's ability to predict the other 10 months. We found that the predictors of wind velocity at Logan depend on the wind direction interval. We *adaptively* binned the wind data using a Particle Swarm Optimization algorithm that selected varying direction intervals and bin sizes. This proved superior to using 12 bins each containing an equal amount of data, or 12 equal-size bins of 30 degrees each.
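The idea of direction-dependent prediction can be illustrated with a minimal sketch. It assumes the bin edges are already given (in the work described above they are selected by Particle Swarm Optimization, which is not reproduced here) and it substitutes a simple per-bin linear fit for the statistical models actually used. The function names `assign_bins`, `fit_per_bin`, and `predict_site_speed` are hypothetical.

```python
import numpy as np

def assign_bins(direction_deg, edges_deg):
    """Map each wind direction (degrees) to the index of its direction bin.

    edges_deg is a sorted sequence of bin start angles in [0, 360).
    Bin i covers [edges[i], edges[i+1]); the last bin wraps through 360
    back to edges[0], so bins may have unequal widths.
    """
    d = np.mod(np.asarray(direction_deg, dtype=float), 360.0)
    edges = np.asarray(edges_deg, dtype=float)
    idx = np.searchsorted(edges, d, side="right") - 1
    # Directions below the first edge fall in the wrapping last bin.
    return np.where(idx < 0, len(edges) - 1, idx)

def fit_per_bin(ref_speed, ref_dir, site_speed, edges_deg):
    """Fit one linear model (slope, intercept) per direction bin.

    Assumption: a straight-line fit stands in for the richer
    dependence models described in the text.
    """
    ref_speed = np.asarray(ref_speed, dtype=float)
    site_speed = np.asarray(site_speed, dtype=float)
    bins = assign_bins(ref_dir, edges_deg)
    models = {}
    for b in range(len(edges_deg)):
        mask = bins == b
        if mask.sum() >= 2:  # need at least two points for a line
            slope, intercept = np.polyfit(ref_speed[mask], site_speed[mask], 1)
            models[b] = (slope, intercept)
    return models

def predict_site_speed(models, ref_speed, ref_dir, edges_deg):
    """Predict site speed using the model of each sample's direction bin."""
    ref_speed = np.asarray(ref_speed, dtype=float)
    bins = assign_bins(ref_dir, edges_deg)
    out = np.full(len(ref_speed), np.nan)  # NaN where no model was fitted
    for b, (slope, intercept) in models.items():
        mask = bins == b
        out[mask] = slope * ref_speed[mask] + intercept
    return out
```

An optimizer such as PSO would wrap this: each candidate set of `edges_deg` is scored by the prediction error of the per-bin models on held-out data, and the edges with the lowest error are kept.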

**Techniques for Accurate Wind Resource Estimation by Modeling Statistical Dependency**, K. Veeramachaneni, Xiang Ye, U.M. O'Reilly, in Computational Intelligent Data Analysis for Sustainable Development, Editors Nitesh Chawla, Simeon Simoff, Ting Yu, Data Mining and Knowledge Discovery Series, Taylor & Francis, 2013.

**Copula Graphical Models for Wind Resource Estimation**, Kalyan Veeramachaneni, Alfredo Cuesta-Infante, Una-May O'Reilly, IJCAI 2015.

Wind farm design is a complex and involved process with multiple stakeholders. It can extend over a long period as stakeholders iterate over proposed solutions and collect more information to inform the process. The design team needs a micro-siting tool that helps manage this complexity and informs the process at each iteration. Current tools allow many site factors, such as wind resource and topography, to inform the design to an arguably reasonable degree. They currently lack, however, a layout optimizer powerful enough to consider multiple siting objectives, such as energy capture maximization and land cost minimization, simultaneously. Optimizing for one or the other separately won’t find the best layouts.

We are currently developing an accurate, efficient, and parallelizable optimization algorithm for the layout of hundreds, and eventually 1000, turbines. Efficient and accurate optimization is challenged by large numbers of turbines, large farm areas, constraints on feasible sitings, and expensive wake models whose cost scales nonlinearly with each additional turbine. The algorithm could be incorporated as an "optimizer" component choice in a layout tool such as OpenWind. It is modular, which allows different wake effect models to be incorporated. Its cost can be expressed as a function of how many layouts it searches and how expensive it is to calculate wake loss. We demonstrate how well it maximizes energy capture. We use it to observe how wake loss scales with energy capture as additional turbines are sited.
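The modular structure and the cost relation described above can be sketched as follows. This is not the optimizer from the project: it is a plain stochastic local search, and `toy_wake_loss` is a hypothetical stand-in that depends only on inter-turbine distance and ignores wind direction. All names and parameter values are assumptions chosen for illustration.

```python
import math
import random

def toy_wake_loss(layout, decay=200.0):
    """Hypothetical wake model: per-turbine fractional loss in [0, 1].

    Loss grows as other turbines get closer. A real wake model would
    also depend on wind direction, wind speed, and rotor geometry.
    """
    losses = []
    for i, (xi, yi) in enumerate(layout):
        loss = 0.0
        for j, (xj, yj) in enumerate(layout):
            if i != j:
                loss += math.exp(-math.hypot(xi - xj, yi - yj) / decay)
        losses.append(min(loss, 1.0))
    return losses

def energy_capture(layout, wake_model):
    """Normalized energy: each turbine yields 1 minus its wake loss."""
    return sum(1.0 - loss for loss in wake_model(layout))

def local_search(n_turbines, width, height, wake_model,
                 iterations=500, step=50.0, seed=0):
    """Move one turbine at a time; keep the move if energy does not drop.

    The wake model is passed in as a function, so it can be swapped
    without touching the search. Returns (layout, energy, evaluations).
    """
    rng = random.Random(seed)
    layout = [(rng.uniform(0, width), rng.uniform(0, height))
              for _ in range(n_turbines)]
    best = energy_capture(layout, wake_model)
    evaluations = 1
    for _ in range(iterations):
        i = rng.randrange(n_turbines)
        x, y = layout[i]
        candidate = list(layout)
        # Clamp the perturbed position to the farm boundary (the only
        # siting constraint modeled in this sketch).
        candidate[i] = (min(max(x + rng.gauss(0, step), 0.0), width),
                        min(max(y + rng.gauss(0, step), 0.0), height))
        score = energy_capture(candidate, wake_model)
        evaluations += 1
        if score >= best:
            layout, best = candidate, score
    return layout, best, evaluations
```

In this sketch the total cost is the number of layouts evaluated (`evaluations`) times the cost of one wake calculation, which for the toy model is quadratic in the number of turbines because every pair is compared.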

Markus Wagner, Kalyan Veeramachaneni, Frank Neumann, Una-May O'Reilly, **Optimizing the layout of 1000 turbines**, in Proceedings of the Annual European Wind Energy Conference (EWEA), 2011.

Dennis Wilson, Emmanuel Awa, Sylvain Cussat-Blanc, Kalyan Veeramachaneni, Una-May O'Reilly, **On Learning to Generate Wind Farm Layouts**, in Proceedings of the 2013 Genetic and Evolutionary Computation Conference (GECCO 2013).

Laying out the network of power cables between wind turbines and substations incurs a significant cost when building a wind farm. For small farms, an expert can often identify a good layout by hand, or by simulating all the possible layouts. But for larger farms, these approaches are no longer applicable. We present some initial work towards automating the design of cabling layouts for large-scale wind farms. We build a problem model that incorporates the relevant real-world constraints, and then decompose the problem into three layers: the circuit, the substation, and the full farm. In the case when there is a single cable type, the circuit and substation layers map to graph problems (the uncapacitated and capacitated minimum spanning tree). For the full farm layer, we present a greedy top-down algorithm to find a feasible solution. In the case when there are multiple cable types, we focus on the first layer, presenting an algorithm to find the optimal circuit. We then discuss under what conditions the problem can be simplified to the case with a single cable type. We are grateful for advice from Nicholas Robinson in our endeavors.
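For the single-cable-type case, the uncapacitated minimum spanning tree can be sketched with Prim's algorithm. This is an illustrative example, not the paper's implementation: it assumes cable cost is proportional to straight-line distance, treats `points[0]` as the substation, and ignores capacity limits and other real-world constraints. The function name is hypothetical.

```python
import math

def minimum_spanning_tree(points):
    """Prim's algorithm on the complete Euclidean graph over `points`.

    points[0] is treated as the substation (the tree root).
    Returns (edges, total_length), where each edge is (parent, child)
    as indices into `points`.
    """
    n = len(points)
    if n == 0:
        return [], 0.0
    in_tree = [False] * n
    best_cost = [math.inf] * n   # cheapest known link into the tree
    parent = [-1] * n
    best_cost[0] = 0.0
    edges, total = [], 0.0
    for _ in range(n):
        # Pick the cheapest node not yet connected.
        u = min((i for i in range(n) if not in_tree[i]),
                key=lambda i: best_cost[i])
        in_tree[u] = True
        if parent[u] >= 0:
            edges.append((parent[u], u))
            total += best_cost[u]
        # Relax the link cost of every remaining node via u.
        for v in range(n):
            if not in_tree[v]:
                d = math.dist(points[u], points[v])
                if d < best_cost[v]:
                    best_cost[v] = d
                    parent[v] = u
    return edges, total
```

For example, `minimum_spanning_tree([(0, 0), (1, 0), (2, 0), (2, 1)])` connects the four points in a chain of total length 3. The capacitated variant, where each cable can serve only a limited number of turbines, needs a different algorithm and is not covered by this sketch.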

**Algorithms for Cable Network Design on Large-scale Wind Farms**, Constantin Berzan, Kalyan Veeramachaneni, James McDermott, Una-May O'Reilly.