{"title":"Large-scale analytical Fourier transform of photomask layouts using graphics processing units","authors":"J. A. Sakamoto","doi":"10.1117/12.2192040","DOIUrl":null,"url":null,"abstract":"Compensation of lens-heating effects during the exposure scan in an optical lithographic system requires knowledge of the heating profile in the pupil of the projection lens. A necessary component in the accurate estimation of this profile is the total integrated distribution of light, relying on the squared modulus of the Fourier transform (FT) of the photomask layout for individual process layers. Requiring a layout representation in pixelated image format, the most common approach is to compute the FT numerically via the fast Fourier transform (FFT). However, the file size for a standard 26- mm×33-mm mask with 5-nm pixels is an overwhelming 137 TB in single precision; the data importing process alone, prior to FFT computation, can render this method highly impractical. A more feasible solution is to handle layout data in a highly compact format with vertex locations of mask features (polygons), which correspond to elements in an integrated circuit, as well as pattern symmetries and repetitions (e.g., GDSII format). Provided the polygons can decompose into shapes for which analytical FT expressions are possible, the analytical approach dramatically reduces computation time and alleviates the burden of importing extensive mask data. Algorithms have been developed for importing and interpreting hierarchical layout data and computing the analytical FT on a graphics processing unit (GPU) for rapid parallel processing, not assuming incoherent imaging. Testing was performed on the active layer of a 392- μm×297-μm virtual chip test structure with 43 substructures distributed over six hierarchical levels. The factor of improvement in the analytical versus numerical approach for importing layout data, performing CPU-GPU memory transfers, and executing the FT on a single NVIDIA Tesla K20X GPU was 1.6×104, 4.9×103, and 3.8×103, respectively. Various ideas for algorithm enhancements will be discussed.","PeriodicalId":212434,"journal":{"name":"SPIE Optical Systems Design","volume":"51 1","pages":"0"},"PeriodicalIF":0.0000,"publicationDate":"2015-10-02","publicationTypes":"Journal Article","fieldsOfStudy":null,"isOpenAccess":false,"openAccessPdf":"","citationCount":"0","resultStr":null,"platform":"Semanticscholar","paperid":null,"PeriodicalName":"SPIE Optical Systems Design","FirstCategoryId":"1085","ListUrlMain":"https://doi.org/10.1117/12.2192040","RegionNum":0,"RegionCategory":null,"ArticlePicture":[],"TitleCN":null,"AbstractTextCN":null,"PMCID":null,"EPubDate":"","PubModel":"","JCR":"","JCRName":"","Score":null,"Total":0}
引用次数: 0
Abstract
Compensation of lens-heating effects during the exposure scan in an optical lithographic system requires knowledge of the heating profile in the pupil of the projection lens. A necessary component in the accurate estimation of this profile is the total integrated distribution of light, relying on the squared modulus of the Fourier transform (FT) of the photomask layout for individual process layers. Requiring a layout representation in pixelated image format, the most common approach is to compute the FT numerically via the fast Fourier transform (FFT). However, the file size for a standard 26- mm×33-mm mask with 5-nm pixels is an overwhelming 137 TB in single precision; the data importing process alone, prior to FFT computation, can render this method highly impractical. A more feasible solution is to handle layout data in a highly compact format with vertex locations of mask features (polygons), which correspond to elements in an integrated circuit, as well as pattern symmetries and repetitions (e.g., GDSII format). Provided the polygons can decompose into shapes for which analytical FT expressions are possible, the analytical approach dramatically reduces computation time and alleviates the burden of importing extensive mask data. Algorithms have been developed for importing and interpreting hierarchical layout data and computing the analytical FT on a graphics processing unit (GPU) for rapid parallel processing, not assuming incoherent imaging. Testing was performed on the active layer of a 392- μm×297-μm virtual chip test structure with 43 substructures distributed over six hierarchical levels. The factor of improvement in the analytical versus numerical approach for importing layout data, performing CPU-GPU memory transfers, and executing the FT on a single NVIDIA Tesla K20X GPU was 1.6×104, 4.9×103, and 3.8×103, respectively. Various ideas for algorithm enhancements will be discussed.