Memory Usage for Large GDSII Files

Introduction

Displaying or otherwise processing a GDSII file requires that QISLIB first create tables describing the internal organization of the file and the location of its entities. A GDSII file is hierarchical: a number of entities can be grouped together, and the group (known as a cell or structure) can be referenced many times. Cell references can be and are nested: cell A references cell B, which in turn references cell C. While this makes the GDSII file very compact, it also requires that any program designed to display or process the file be able to unpack the referencing.
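
To make the cost of unpacking concrete, here is a minimal Python sketch of how nested references multiply the number of entities a viewer has to deal with. The cell names, the dictionary layout, the counts and the function name are all made up for illustration; none of this is QISLIB's actual data structure or API.

```python
from functools import lru_cache

# Hypothetical hierarchy: each cell holds some entities of its own
# (polygons, paths, ...) plus a list of (child_cell, placement_count)
# references. The names and numbers are illustrative assumptions.
CELLS = {
    "A": {"entities": 2, "refs": [("B", 3)]},
    "B": {"entities": 5, "refs": [("C", 4)]},
    "C": {"entities": 10, "refs": []},
}


@lru_cache(maxsize=None)
def flat_entity_count(cell_name: str) -> int:
    """Count the entities produced when every reference under a cell is expanded."""
    cell = CELLS[cell_name]
    total = cell["entities"]
    for child, count in cell["refs"]:
        # Each placement of the child contributes the child's full expansion.
        total += count * flat_entity_count(child)
    return total


print(flat_entity_count("A"))  # 2 + 3 * (5 + 4 * 10) = 137
```

The file in this toy case stores only 17 entities, yet the expanded view of cell A contains 137. The memoization (`lru_cache`) matters in practice because the same cell may be reached through many different parents.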

This would not be a serious issue if the files were small, but modern GDSII files can be as large as 50 GB, with more than 100,000 cell definitions and millions of cell references. To make matters more difficult, cell definitions and cell references may appear in the file in any order: a cell may be referenced first and defined later downstream.

Before QISLIB can start to display a file, it has to perform two organizational tasks:

a) scan the file once to build a list of cell definitions, the extents of each cell, and a hierarchy tree.

b) scan the file a second time to create a family of tables known as quad trees, which organize both entities and references by location so that, when zoomed in, the program can quickly access entities without traversing the entire layout database.

[Figure: the scan data and quad tree data are loaded into memory]
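
The first task, together with the "referenced first, defined later" problem, can be sketched as follows. This is only an illustration under simplifying assumptions: the record stream is a made-up list of tuples rather than real GDSII records, references are pure translations (no rotation, mirroring or arrays), every cell contains at least one box or reference, and the function names are hypothetical.

```python
# Hypothetical, simplified record stream. Cell TOP references cell LEAF
# before LEAF has been defined, which the scan has to tolerate.
RECORDS = [
    ("cell", "TOP"),
    ("ref", "LEAF", 100, 0),      # place LEAF translated by (100, 0)
    ("box", 0, 0, 10, 10),        # (xmin, ymin, xmax, ymax)
    ("cell", "LEAF"),
    ("box", 0, 0, 5, 20),
]


def scan(records):
    """Collect each cell's own boxes and references in file order."""
    cells = {}
    current = None
    for rec in records:
        kind = rec[0]
        if kind == "cell":
            current = cells.setdefault(rec[1], {"boxes": [], "refs": []})
        elif kind == "box":
            current["boxes"].append(rec[1:])
        elif kind == "ref":
            current["refs"].append(rec[1:])
    return cells


def extents(cells, name, cache=None):
    """Resolve a cell's extents after the scan, so definition order is irrelevant."""
    cache = {} if cache is None else cache
    if name in cache:
        return cache[name]
    boxes = list(cells[name]["boxes"])
    for child, dx, dy in cells[name]["refs"]:
        x0, y0, x1, y1 = extents(cells, child, cache)
        boxes.append((x0 + dx, y0 + dy, x1 + dx, y1 + dy))
    result = (min(b[0] for b in boxes), min(b[1] for b in boxes),
              max(b[2] for b in boxes), max(b[3] for b in boxes))
    cache[name] = result
    return result


cells = scan(RECORDS)
print(extents(cells, "TOP"))  # (0, 0, 105, 20)
```

The point of the sketch is the ordering: extents are computed only once every definition has been seen, which is why a complete pass over the file has to finish before anything can be displayed.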

Scan and Quad Data Are Always Placed in Memory

Both the scan data and the quad tree data are stored in memory since they are accessed constantly when viewing a file. The scan data is relatively small in the scheme of things but the quad tree footprint can be quite large since it is actually an array of quad trees - one for each layer of data in the layout and another for array and cell references.
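
For readers unfamiliar with quad trees, the following Python sketch shows the general idea of keeping one tree per layer plus one for references. It is a generic textbook-style quad tree, not QISLIB's implementation; the class, the capacity and depth limits, the layer keys and the payload strings are all assumptions made for illustration.

```python
class QuadTree:
    """Minimal region quad tree over axis-aligned boxes (illustrative only)."""

    def __init__(self, bounds, capacity=4, depth=0, max_depth=8):
        self.bounds = bounds      # (xmin, ymin, xmax, ymax)
        self.capacity = capacity
        self.depth = depth
        self.max_depth = max_depth
        self.items = []           # (box, payload) pairs held at this node
        self.children = None

    @staticmethod
    def _overlaps(a, b):
        return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

    @staticmethod
    def _contains(outer, inner):
        return (outer[0] <= inner[0] and outer[1] <= inner[1]
                and inner[2] <= outer[2] and inner[3] <= outer[3])

    def _split(self):
        x0, y0, x1, y1 = self.bounds
        xm, ym = (x0 + x1) / 2, (y0 + y1) / 2
        quads = [(x0, y0, xm, ym), (xm, y0, x1, ym),
                 (x0, ym, xm, y1), (xm, ym, x1, y1)]
        self.children = [QuadTree(q, self.capacity, self.depth + 1, self.max_depth)
                         for q in quads]
        pending, self.items = self.items, []
        for box, payload in pending:
            self.insert(box, payload)

    def insert(self, box, payload):
        if (self.children is None and len(self.items) >= self.capacity
                and self.depth < self.max_depth):
            self._split()
        if self.children is not None:
            for child in self.children:
                if self._contains(child.bounds, box):
                    child.insert(box, payload)
                    return
        # Leaf node, or the box straddles a quadrant boundary: keep it here.
        self.items.append((box, payload))

    def query(self, window):
        """Return the payloads of all boxes that overlap the window."""
        found = [p for box, p in self.items if self._overlaps(box, window)]
        if self.children is not None:
            for child in self.children:
                if self._overlaps(child.bounds, window):
                    found.extend(child.query(window))
        return found


# One tree per layer plus one for references (keys are hypothetical).
extent = (0, 0, 1000, 1000)
trees = {1: QuadTree(extent), 2: QuadTree(extent), "refs": QuadTree(extent)}
trees[1].insert((10, 10, 20, 20), "polygon@offset 4096")
trees[1].insert((900, 900, 950, 950), "polygon@offset 8192")
trees["refs"].insert((0, 0, 500, 500), "reference to cell B")

print(trees[1].query((0, 0, 100, 100)))  # ['polygon@offset 4096']
```

Even in this toy form it is easy to see where the memory goes: every entity and every reference needs at least one node entry, and the whole structure is repeated for each layer.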

Memory Requirements

Initially, our implementation of the quad tree was memory intensive and often reached 30% of the GDSII file size when resident in memory. This limited the size of the GDSII file that could be loaded on a given machine. We recently optimized the quad tree, and it now typically takes up only 10-20% of the GDSII file size in memory. For a 40 GB GDSII test file that we recently evaluated, the scan data plus quad data occupied 7.5 GB of memory (about 19% of the file size). The scan data typically amounts to a smaller percentage of the GDSII file size, and it varies depending on the number of cell definitions and the nature of the hierarchy.

Computation Requirements

Scanning a large file and building the quad tree take time. Not only is there disk I/O involved, but the bigger the file, the more data there is to sort and the more pointers have to be computed. Depending on the speed of your workstation and the nature of the file, this can take from 30 to 60 seconds per GB of GDSII data.

At that rate, opening a 40 GB GDSII file might take anywhere from 20 to 40 minutes. Quite a while to wait.
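
The arithmetic behind that estimate is simple enough to put in a few lines of Python. The function name and parameters below are hypothetical; the 30 and 60 seconds per GB figures are the rough range quoted above, not a guarantee for any particular workstation or file.

```python
def open_time_minutes(file_size_gb, sec_per_gb_low=30, sec_per_gb_high=60):
    """Return a (low, high) estimate of the time to open a file, in minutes."""
    low = file_size_gb * sec_per_gb_low / 60
    high = file_size_gb * sec_per_gb_high / 60
    return low, high


low, high = open_time_minutes(40)
print(f"{low:.0f} to {high:.0f} minutes")  # 20 to 40 minutes
```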

Generating a Display

To actually put polygons on the screen, QISLIB has to get them from the GDSII file located on disk. Unless you are displaying 100% of the layout, the quad tree lets the program access only the data that falls inside the viewing window. Nevertheless, for each pan and zoom, QISLIB must read the polygon data from the GDSII file, and it does this via a disk read.

[Figure: using the scan data and quad data, the program collects the needed entities from the GDSII file and computes the polygons to display]
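
The disk access pattern can be pictured with the Python sketch below. It assumes, purely for illustration, that a window query yields (offset, length) pairs pointing into the file; the helper name is hypothetical, and the demo file contains fake 8-byte records rather than real GDSII data.

```python
import os
import tempfile


def read_entities(path, hits):
    """Read the byte ranges named by (offset, length) index hits from disk."""
    chunks = []
    with open(path, "rb") as f:
        # Visiting offsets in ascending order keeps the seeks moving forward.
        for offset, length in sorted(hits):
            f.seek(offset)
            chunks.append(f.read(length))
    return chunks


# Stand-in for a layout file: three fake 8-byte "records".
with tempfile.NamedTemporaryFile(delete=False) as tmp:
    tmp.write(b"POLY0001POLY0002POLY0003")
    demo_path = tmp.name

# Pretend the window query returned the third and first records.
visible = read_entities(demo_path, [(16, 8), (0, 8)])
print(visible)  # [b'POLY0001', b'POLY0003']
os.remove(demo_path)
```

Only the records inside the window are read, but every pan or zoom repeats the seek-and-read cycle, which is where the disk becomes the limiting factor.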

Loading the Entity Data into RAM for Faster Performance

If you want faster pans and zooms (and faster computations for other QISLIB functions such as fracturing or care area selections), you can load the entity data normally read from disk into RAM. To do this in QISLIB, use the API command Set_Load_Memory = ON. The GDSII data is compressed when loaded into RAM, so it typically needs only about 30% of the space in memory that it takes up on disk. Nevertheless, it does take up a lot of memory.

[Figure: loading the entity data into RAM requires much more RAM but also improves display performance]

[Figure: flow chart showing the loading and display process when Set_Load_Memory = ON]
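
A rough way to budget memory for the two modes is sketched below in Python. The function and its parameters are hypothetical and are not part of the QISLIB API; the 30% ratio is the typical compression figure quoted above, and the 7.5 GB figure is the scan plus quad data measured for our 40 GB test file.

```python
def memory_footprint_gb(file_size_gb, index_gb, load_memory=False, entity_ratio=0.30):
    """Estimate resident memory: scan + quad data, plus compressed entities if loaded."""
    entities_gb = file_size_gb * entity_ratio if load_memory else 0.0
    return index_gb + entities_gb


off = memory_footprint_gb(40, 7.5)
on = memory_footprint_gb(40, 7.5, load_memory=True)
print(f"OFF: {off:.1f} GB, ON: {on:.1f} GB")  # OFF: 7.5 GB, ON: 19.5 GB
```

In other words, turning the option on adds roughly 12 GB of compressed entity data on top of the 7.5 GB of scan and quad data for this file.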

Let's compare the two possibilities:

                              Set_Load_Memory_Off    Set_Load_Memory_On
GDSII file size               40 GB                  40 GB
Memory usage                  7.5 GB                 19.5 GB
Entities read for display     From disk              From RAM
Pan and zoom speed            Average                Faster

Page 2 - where we explain the heavy price paid for trying to cram too much data into RAM ....