GDSII data is hierarchical, which allows it to describe millions or billions of polygons with a much smaller number of database records.
The basic container is called a structure (often referred to as a cell). A structure can contain geometric elements such as boundaries and paths, as well as references to other structures (SREFs, which place a single instance, and AREFs, which place an array of instances). Let's look at a very simple example to understand what is going on.
BGNLIB                                   start of the so called "library"
  BGNSTR                                 begin a structure
    NAME=TOP                             the structure's name
    BOUNDARY                             a geometric boundary
    BOUNDARY                             a geometric boundary
    SREF CELL_A [X,Y,ROT,MIR,MAG]        a reference placing another structure
  ENDSTR                                 end the structure
  BGNSTR                                 begin a new structure
    NAME=CELL_A                          the structure's name
    BOUNDARY                             a geometric boundary
    BOUNDARY                             a geometric boundary
    BOUNDARY                             a geometric boundary
    BOUNDARY                             a geometric boundary
    BOUNDARY                             a geometric boundary
  ENDSTR                                 end the structure
ENDLIB                                   end the library
In the above pseudo GDSII code we have two structures defined: TOP and CELL_A. TOP contains a reference to CELL_A, essentially placing it at some position with some transformations (rotation, mirroring and magnification). CELL_A is a very simple cell and has no references to other cells (though it could).
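To make the idea concrete, here is a minimal Python sketch of the same two structures held in memory. The class and field names (`SRef`, `Structure`, `boundaries`, `srefs`) are invented for illustration and are not part of the GDSII format or of any particular library; the sketch only counts how many element records the listing stores.

```python
from dataclasses import dataclass, field


@dataclass
class SRef:
    """A reference that places another structure (hypothetical field names)."""
    target: str            # name of the referenced structure
    x: float = 0.0
    y: float = 0.0
    rot_deg: float = 0.0
    mirror: bool = False
    mag: float = 1.0


@dataclass
class Structure:
    name: str
    boundaries: int = 0                         # number of BOUNDARY elements
    srefs: list = field(default_factory=list)   # references to other structures


# The two structures from the pseudo GDSII listing.
library = {
    "TOP": Structure("TOP", boundaries=2, srefs=[SRef("CELL_A")]),
    "CELL_A": Structure("CELL_A", boundaries=5),
}

# Element records stored in the file: each boundary and each reference, once.
stored = sum(s.boundaries + len(s.srefs) for s in library.values())
print(stored)  # 8
```

Note that the reference stores only the name of CELL_A, not a copy of its geometry; this is what keeps the record count small when a cell is placed many times.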
Here is a picture assuming that CELL_A (the smaller blue one) is placed in the lower left quadrant relative to the center of TOP.
Now TOP could easily reference CELL_A four times, each time in a different position and rotation.
BGNLIB                                   start of the so called "library"
  BGNSTR                                 begin a structure
    NAME=TOP                             the structure's name
    BOUNDARY                             a geometric boundary
    BOUNDARY                             a geometric boundary
    SREF CELL_A [X1,Y1,ROT1,MIR,MAG]     a reference placing another structure
    SREF CELL_A [X2,Y2,ROT2,MIR,MAG]     a reference placing another structure
    SREF CELL_A [X3,Y3,ROT3,MIR,MAG]     a reference placing another structure
    SREF CELL_A [X4,Y4,ROT4,MIR,MAG]     a reference placing another structure
  ENDSTR                                 end the structure
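What each set of placement parameters does to the geometry of CELL_A can be sketched in a few lines of Python. The function name `place` and the coordinate values are made up for illustration. The order of operations (mirror about the x axis, then magnify and rotate counterclockwise, then translate) is how GDSII placements are usually described, but treat it as an assumption and check it against the format specification before relying on it.

```python
import math


def place(point, x, y, rot_deg=0.0, mirror=False, mag=1.0):
    """Map a point from the referenced cell into the parent's coordinates.

    Order assumed here: mirror about the x axis, magnify, rotate
    counterclockwise by rot_deg, then translate to (x, y).
    """
    px, py = point
    if mirror:
        py = -py
    px, py = px * mag, py * mag
    a = math.radians(rot_deg)
    rx = px * math.cos(a) - py * math.sin(a)
    ry = px * math.sin(a) + py * math.cos(a)
    # Round away floating point noise from the trigonometric functions.
    return (round(rx + x, 9), round(ry + y, 9))


# Four placements of the same cell: (X, Y, ROT). Made-up values.
placements = [
    (-10.0, -10.0, 0.0),
    (10.0, -10.0, 90.0),
    (10.0, 10.0, 180.0),
    (-10.0, 10.0, 270.0),
]
corner = (2.0, 1.0)  # one vertex of a boundary inside CELL_A
for x, y, rot in placements:
    print(place(corner, x, y, rot_deg=rot))
```

The same vertex of CELL_A ends up at four different locations in TOP, although it is stored only once.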
Here is what it might look like:
Additional Levels of Nesting
CELL_A could itself reference another cell, CELL_B. [Note that CELL_A must not reference a cell that references it, directly or indirectly; this would create an infinite recursion.] If CELL_A did reference CELL_B, then each time TOP referenced CELL_A, CELL_B would be placed as well.
BGNSTR                                   begin a structure
  NAME=TOP                               the structure's name
  BOUNDARY                               a geometric boundary
  BOUNDARY                               a geometric boundary
  SREF CELL_A [X1,Y1,ROT1,MIR,MAG]       a reference placing another structure
  SREF CELL_A [X2,Y2,ROT2,MIR,MAG]       a reference placing another structure
  SREF CELL_A [X3,Y3,ROT3,MIR,MAG]       a reference placing another structure
  SREF CELL_A [X4,Y4,ROT4,MIR,MAG]       a reference placing another structure
ENDSTR                                   end the structure
BGNSTR                                   begin a new structure
  NAME=CELL_A                            the structure's name
  BOUNDARY                               a geometric boundary
  BOUNDARY                               a geometric boundary
  BOUNDARY                               a geometric boundary
  BOUNDARY                               a geometric boundary
  BOUNDARY                               a geometric boundary
  SREF CELL_B [X,Y,ROT,MIR,MAG]          a reference placing another structure
ENDSTR                                   end the structure
BGNSTR                                   begin a new structure
  NAME=CELL_B                            the structure's name
  BOUNDARY                               a geometric boundary
  BOUNDARY                               a geometric boundary
ENDSTR                                   end the structure
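The effect of nesting on the polygon count can be checked with a short Python sketch. The dictionary layout and the function name `flat_count` are invented for illustration; the function expands every reference recursively and refuses a structure that ends up referencing itself.

```python
def flat_count(name, cells, _active=()):
    """Count boundaries after fully expanding the structure `name`.

    `cells` maps a structure name to a pair: (its own boundary count,
    list of referenced structure names). Raises ValueError on a loop.
    """
    if name in _active:
        chain = " -> ".join(_active + (name,))
        raise ValueError("recursive reference: " + chain)
    boundaries, refs = cells[name]
    return boundaries + sum(
        flat_count(ref, cells, _active + (name,)) for ref in refs
    )


# The three structures from the listing above.
cells = {
    "TOP": (2, ["CELL_A"] * 4),
    "CELL_A": (5, ["CELL_B"]),
    "CELL_B": (2, []),
}
print(flat_count("TOP", cells))  # 30
```

The file stores 9 boundaries and 5 references, yet the fully expanded TOP contains 30 boundaries: 2 of its own plus 4 copies of the 7 that CELL_A and CELL_B contribute together.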
Order and Sorting?
A programmer would note that there ought to be order in the database, i.e. one would prefer that a cell be defined before it is referenced. It turns out that GDSII has no such requirement, and one will regularly encounter files where structures are referenced early on and only defined later.
This makes rendering the layout more work, since a single pass through the database is not enough to compute what the image will look like.
For small layout files the extra pass is not a big deal but as the file size grows (and it does, to many gigabytes) this lack of "order" becomes a significant penalty when "opening" the data.
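The need for the extra pass can be illustrated with a toy record stream in Python. The tuple representation and the function names are assumptions made for this sketch; a real GDSII file is a binary stream with more record types.

```python
# Records in file order: TOP references CELL_A before CELL_A is defined.
records = [
    ("BGNSTR", "TOP"), ("BOUNDARY", None), ("BOUNDARY", None),
    ("SREF", "CELL_A"), ("ENDSTR", None),
    ("BGNSTR", "CELL_A"), ("BOUNDARY", None), ("ENDSTR", None),
]


def forward_refs(records):
    """Names a single pass meets as references before their definition."""
    seen, pending = set(), []
    for kind, name in records:
        if kind == "BGNSTR":
            seen.add(name)
        elif kind == "SREF" and name not in seen:
            pending.append(name)
    return pending


def index_structures(records):
    """First pass: remember where each structure starts in the stream."""
    return {name: i for i, (kind, name) in enumerate(records) if kind == "BGNSTR"}


print(forward_refs(records))      # ['CELL_A']
print(index_structures(records))  # {'TOP': 0, 'CELL_A': 5}
```

A single pass reaches the SREF to CELL_A while CELL_A is still unknown. With the index from a first pass in hand, a second pass can jump to any structure by name regardless of where it sits in the file.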
Optimizing the initial opening of the file and creating a way to quickly access elements in any particular geometric window turns out to be both very important in dealing with large files and devilishly difficult to do efficiently. This is what separates the men from the boys when it comes to code for viewing, plotting and manipulating GDSII files.
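As a rough illustration of what "quick access to a geometric window" means, here is a Python sketch of one simple technique, a uniform grid of tiles over bounding boxes. This is a generic spatial index chosen for brevity, not the method used by any particular GDSII tool; the tile size, the box coordinates and all function names are made up.

```python
from collections import defaultdict

TILE = 100.0  # tile edge length in layout units; an arbitrary choice


def tiles_for(box):
    """Yield the (column, row) of every tile that a bounding box touches."""
    xmin, ymin, xmax, ymax = box
    for col in range(int(xmin // TILE), int(xmax // TILE) + 1):
        for row in range(int(ymin // TILE), int(ymax // TILE) + 1):
            yield (col, row)


def build_index(boxes):
    """Map each tile to the ids of the boxes overlapping it."""
    index = defaultdict(set)
    for box_id, box in enumerate(boxes):
        for tile in tiles_for(box):
            index[tile].add(box_id)
    return index


def query(index, boxes, window):
    """Return ids of boxes intersecting the window, checking only nearby tiles."""
    wx0, wy0, wx1, wy1 = window
    candidates = set()
    for tile in tiles_for(window):
        candidates |= index.get(tile, set())
    return sorted(
        i for i in candidates
        if boxes[i][0] <= wx1 and boxes[i][2] >= wx0
        and boxes[i][1] <= wy1 and boxes[i][3] >= wy0
    )


# Bounding boxes as (xmin, ymin, xmax, ymax).
boxes = [(0, 0, 50, 50), (120, 120, 180, 160), (900, 900, 950, 950)]
index = build_index(boxes)
print(query(index, boxes, (40, 40, 130, 130)))  # [0, 1]
```

The query never looks at the third box because it lies in a tile the window does not touch. Doing this well on a hierarchical database with billions of expanded polygons is a much harder problem than this flat toy example suggests.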