GDSTILE

How Computations are Distributed

Tiling a large GDSII file is not a computation that can easily be divided into equal shares of work for all members of the team. Because of the nature of the GDSII database, the initial loading, scanning and exploding of the hierarchy should be done on a single very fast machine with lots of memory.

We recommend that the "master" machine have as much installed RAM as possible and that it run at the highest CPU speed possible.

The slaves need not have as much RAM as the master since they process one stripe of data at a time -- each stripe might be anywhere from 20 to 200 MB in size, depending on the stripe's height, width and density.

In terms of efficiency, the slaves should be equally matched machines -- it is hardly worth the trouble to put a 400 MHz CPU together with a couple of 2 GHz CPUs on the slave team, since the faster CPUs will do the majority of the work.
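
As a rough illustration of why a slow slave adds little, the arithmetic can be sketched as follows. This is only a back-of-the-envelope model, not a description of how GDSTILE schedules work: it assumes each slave takes a new stripe as soon as it finishes the previous one and that processing speed scales with clock speed. The function name work_shares is hypothetical.

```python
def work_shares(clock_mhz):
    """Estimate the fraction of stripes each slave processes.

    Assumption (not from the GDSTILE documentation): stripes are handed
    out on demand and a slave's speed is proportional to its clock rate.
    """
    total = sum(clock_mhz)
    return [clock / total for clock in clock_mhz]


# One 400 MHz slave teamed with two 2 GHz slaves.
shares = work_shares([400, 2000, 2000])
for clock, share in zip([400, 2000, 2000], shares):
    print(f"{clock} MHz slave: {share:.1%} of the stripes")
```

Under these assumptions the 400 MHz machine handles only about 9% of the stripes, while the two 2 GHz machines handle about 91% between them.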

Finally, one should do some benchmarking to determine whether throughput is limited by the master or by the slaves.

The master machine is responsible for scanning and loading the input data, then exploding it and passing "stripes" of data to the slaves.

The master does as little processing on a stripe as possible, saving most of the boolean computations for the slaves. It does need to clip any entities that cross the stripe boundary, but it leaves all entities that fall completely within the stripe untouched.
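
The master's per-stripe decision can be sketched as below. This is a simplified illustration, not GDSTILE's actual code: it assumes a stripe is a horizontal band between two y coordinates and that every entity is an axis-aligned rectangle, whereas real GDSII entities are polygons and paths. The names Box and prepare_for_stripe are hypothetical.

```python
from typing import NamedTuple, Optional


class Box(NamedTuple):
    """Axis-aligned rectangle standing in for a layout entity (assumption)."""
    xmin: float
    ymin: float
    xmax: float
    ymax: float


def prepare_for_stripe(box: Box, stripe_ymin: float,
                       stripe_ymax: float) -> Optional[Box]:
    """Return the part of `box` that belongs to the stripe, or None."""
    # Entirely outside the stripe: nothing to send.
    if box.ymax <= stripe_ymin or box.ymin >= stripe_ymax:
        return None
    # Entirely inside the stripe: pass it through untouched.
    if box.ymin >= stripe_ymin and box.ymax <= stripe_ymax:
        return box
    # Crosses a stripe boundary: clip it to the stripe.
    return box._replace(ymin=max(box.ymin, stripe_ymin),
                        ymax=min(box.ymax, stripe_ymax))


inside = Box(0, 2, 10, 8)
crossing = Box(0, 5, 10, 15)
outside = Box(0, 20, 10, 30)
results = [prepare_for_stripe(b, 0, 10) for b in (inside, crossing, outside)]
```

Only the crossing entity costs the master any geometric work; everything else is either forwarded as-is or skipped, which keeps the master's share of the processing small.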

The system is balanced when the master can deliver stripes to the slaves at the same rate that the slaves can chop them into tiles. Under the current setup there is only one master for any number of slaves, so as additional slaves are added the master will at some point become the limiting factor.
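
The balance point can be estimated with a simple model, sketched below. It assumes the master delivers stripes at a fixed rate and that every slave processes stripes at the same fixed rate; the rates in the example are made-up numbers, and in practice they should come from benchmarking. The function names are hypothetical.

```python
import math


def throughput(master_rate, slave_rate, n_slaves):
    """Stripes per minute completed by the whole team.

    The team can go no faster than the master delivers stripes and no
    faster than the slaves can collectively process them.
    """
    return min(master_rate, n_slaves * slave_rate)


def slaves_to_saturate(master_rate, slave_rate):
    """Smallest number of slaves that can keep up with the master."""
    return math.ceil(master_rate / slave_rate)


# Hypothetical rates: master delivers 12 stripes/min, each slave tiles 3.
MASTER_RATE = 12
SLAVE_RATE = 3

balance = slaves_to_saturate(MASTER_RATE, SLAVE_RATE)
for n in (2, balance, 8):
    print(n, "slaves:", throughput(MASTER_RATE, SLAVE_RATE, n), "stripes/min")
```

With these example rates, throughput stops improving once the team reaches the balance point; slaves added beyond that wait on the master.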
