A LOAD-BALANCING DOMAIN DECOMPOSITION METHOD FOR FINITE DIFFERENCE NUMERICAL WEATHER PREDICTION MODELS

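  • Summary: Distributed-memory parallel processing is widely used in very large-scale scientific computing such as numerical weather prediction. Mesoscale models, with their high resolution and heavy computational cost, need more processors for parallel execution. At the same time, the adoption of complex physical processes increases the imbalance in computational load under different weather conditions. The parallel processing methods now in wide use, however, cannot balance the computational load well when the number of processors is large, which lowers parallel efficiency. This paper proposes a new load-distribution method based on irregular domain decomposition. Analysis and experimental comparison with existing load-distribution methods show that the new method balances the load more effectively and achieves better speedup.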

     

    Abstract: Modern parallel computers are composed of hundreds to thousands of nodes and can reach teraflops performance, but programming such systems is difficult. A very important issue is load balancing: the more nodes in the system, the more difficult it is to balance the load. Domain decomposition is a common technique in the parallel processing of mesoscale weather prediction models, in which the different columns of the model are distributed across different nodes. One can expect to increase the speedup of the model by increasing its resolution. However, as the resolution is increased, both the number of grid points and the number of iteration steps increase. More nodes are needed if the model is to finish in the same period of time. As a result, fewer columns run on each node, and even a small load imbalance can become a serious problem for highly parallel models. At the same time, the physical processes of a higher-resolution model can be more complex, which results in more imbalance among processors. Many models use a regular east-west, north-south domain decomposition on n by m nodes, ignoring the load-balancing problem completely. Its advantages are simplicity and low communication between processors. It succeeds if the grid dimensions are highly compatible with the number of nodes and the physics is not very complex. However, when the number of grid points is not highly compatible with the number of nodes, which is often the case in very dynamic environments, the load becomes unbalanced. For example, if one processor has one more row and one more column of grid points than the others, the processor with more grid points slows down every other processor even though the load on each grid point is the same. The problem can be more serious if the load differs greatly among grid points because the physics of the model differs greatly under different weather or over different land surfaces.
Some research shows that the speedup of the model drops rapidly after the model has run for several hours when the microphysics is turned on. The solution is to use an adjustable domain that tracks the variation of the load. Some researchers use an adjustable rectangular domain, which is better than a fixed domain. Our results show, however, that a nearly rectangular domain, formed by adding a few steps on some sides of the rectangular domain, balances the load better than the rectangular domain, with only a small increase in communication due to the steps.
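
The effect described in the abstract can be illustrated with a small calculation. The sketch below is not the paper's algorithm: it assumes a uniform cost per grid column, and the stepped scheme (cutting each strip of rows into pieces by point count, so that a boundary between neighbors may contain one step) is only one hypothetical way to form nearly rectangular domains. All function names are illustrative.

```python
def block_sizes(n, p):
    """Split n items into p contiguous blocks whose sizes differ by at most 1."""
    base, extra = divmod(n, p)
    return [base + 1 if i < extra else base for i in range(p)]


def regular_loads(nx, ny, px, py):
    """Grid columns per node for a regular px-by-py rectangular decomposition."""
    return [w * h for h in block_sizes(ny, py) for w in block_sizes(nx, px)]


def stepped_loads(nx, ny, px, py):
    """Grid columns per node when each strip of rows is cut into px pieces
    by point count (assumed scheme: the points of a strip are taken in
    column-major order, so each cut is a vertical line with at most one step)."""
    loads = []
    for h in block_sizes(ny, py):
        loads.extend(block_sizes(nx * h, px))
    return loads


def imbalance(loads):
    """Ratio of the busiest node's load to the mean load (1.0 is perfect)."""
    return max(loads) * len(loads) / sum(loads)


if __name__ == "__main__":
    for name, fn in (("regular", regular_loads), ("stepped", stepped_loads)):
        loads = fn(65, 65, 8, 8)
        print(name, "max load:", max(loads), "imbalance:", round(imbalance(loads), 3))
```

For a 65 by 65 grid on 8 by 8 nodes, the busiest node holds 81 columns under the regular split and 74 under the stepped split, against a mean of about 66. Under these assumptions the steps recover part of the lost efficiency. The remaining imbalance comes from the strips themselves, which still differ by one row.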

     
