How to Use GDAL in Web Applications (Part 3)



This content originally appeared on DEV Community and was authored by Yangholmes

This article focuses on optimization.

The Necessity of Optimization

The previous article introduced a complete compilation script that successfully builds the WebAssembly version of GDAL.

Figure: GDAL build artifacts

However, the compilation results are not suitable for production environments because:

  1. Excessive file sizes: Core wasm file (27MB), glue code (272KB), data file (11MB)
  2. Redundant glue code: Contains Node.js and JavaScript-shell environment code that bundlers cannot tree-shake
  3. Debug info in production: Debug information is unnecessary in production environments

File size is the most critical issue: the artifacts total more than 38MB, which is unacceptable for a web application.

Additionally, the Makefile contains misconfigurations. Because emcc (the Emscripten compiler driver) silently ignores unsupported compilation options, these errors don’t halt the build. This article also interprets the original author’s intent and fixes the incorrect compilation parameters.

Disclaimer

Compilation-level optimization can shrink the GDAL 3.x WebAssembly artifacts, but probably not enough. The same techniques work well for GDAL 2.x and OpenCV 4.x. The deeper reasons lie in GDAL’s source code and build mechanisms, which are beyond the scope of this series.

TODO: Add OpenCV optimization comparison

Optimization Approaches

For web applications, smaller resources are better. Classical frontend workflows use modern build tools and modular design to shrink JavaScript via lazy loading and tree-shaking. Non-JS resources are transformed by “loaders” into JS modules for optimization. However, these methods don’t work for WebAssembly:

  1. Loader limitations: Wasm files can be compressed but can’t be used client-side without extra code.
  2. No tree-shaking: Wasm is binary code; dead code elimination can’t be done like with JS ASTs.

Could a *.wasm loader exist? Tools like Vite support loading Wasm via ?init, but this doesn’t suit Emscripten output, because the generated glue code expects to locate and instantiate the .wasm file itself.

Thus, we optimize during the wasm compilation phase.

Code Separation

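WASM

Options:

  • 0: Compile to JavaScript instead of WebAssembly (via wasm2js)
  • 1: Compile to WebAssembly (default): a .wasm file plus JS glue code
  • 2: Output both a .wasm file and a .wasm.js JavaScript fallback

The JavaScript fallback serves legacy browsers without WebAssembly support. -sWASM=2 outputs both files and picks one at runtime, but if all target browsers support wasm, the fallback is unnecessary. The fallback is the same program translated to JavaScript, which is larger and slower than the wasm binary. (Embedding the wasm binary in the JS file as base64 is a separate setting, SINGLE_FILE, and it also increases file size.)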

Demand-Driven Compilation

“Compile only what you use”

1. Library Functions

Projects typically use only a small subset of a library. Dead code elimination is controlled by:

EXPORTED_FUNCTIONS # List of exported functions
EXPORT_ALL # Export all functions

Note: Each C function name in the list needs a leading underscore. For example, to export a C function named add:

-sEXPORTED_FUNCTIONS="['_add']"
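The underscore rule is easy to get wrong when the list is maintained by hand. A minimal sketch of a build-script helper that derives the flag from plain C names follows; the helper name exportedFunctionsFlag is hypothetical, and it assumes the comma-separated form of list settings that emcc accepts in place of the bracketed form.

```javascript
// Build the -sEXPORTED_FUNCTIONS flag from plain C function names.
// Assumption: every input is the C-level name WITHOUT the leading underscore.
function exportedFunctionsFlag(cNames) {
  // Emscripten expects each exported C symbol to carry a leading underscore.
  const symbols = cNames.map((name) => `_${name}`);
  // The comma-separated form avoids quoting the bracketed list in a shell.
  return `-sEXPORTED_FUNCTIONS=${symbols.join(',')}`;
}

console.log(exportedFunctionsFlag(['malloc', 'free', 'GDALOpen', 'GDALInfo', 'GDALClose']));
```

Keeping the names in one array also makes it obvious which functions the application actually depends on.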

2. Emscripten Runtime Functions

EXPORTED_RUNTIME_METHODS

Default is empty. Export only necessary methods. For virtual filesystems:

-sEXPORTED_RUNTIME_METHODS="['FS']"

The original gdal3.js exports nearly all GDAL functions, a key reason for large artifacts.

Debug Information

emcc parameters resemble gcc’s. The following options control debug information and optimization; set them so that production builds carry no debug info:

-gsource-map
--source-map-base
-O<level>
-g<level>

1. -gsource-map and --source-map-base

Control sourcemap generation. If enabled, debuggers load .map files from <base-url>/<wasm-file-name>.map, with <base-url> set by --source-map-base (default: same as wasm path).

2. -O<level>

Optimization levels:

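  • -O0: No optimization (the default); keeps runtime assertions
  • -O1: Basic optimizations, remove runtime asserts
  • -O2: More optimizations than -O1, including JavaScript optimizations such as minification and dead code elimination
  • -O3: Like -O2, with additional optimizations that may take longer to run
  • -Og: Similar to -O1, intended to keep the code debuggable
  • -Os: Like -O3, but favors code size over speed
  • -Oz: Like -Os, but reduces size even further

The default, -O0, performs no optimization and keeps runtime assertions and unminified glue code.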

Higher optimization levels increase compilation time.

3. -g<level>

Debug levels:

  • -g0: No debug info
  • -g1: Preserve whitespace in JS
  • -g2: Preserve function names
  • -g3: Full debug info (DWARF + LLVM metadata)

Omitting the number (e.g., -g) defaults to -g3.

Environment Configuration

By default, emscripten generates environment-detection code for multiple targets. For fixed environments, this is redundant. Use:

ENVIRONMENT

Valid values:

  • node: Node.js
  • web: Web browsers
  • webview: Same as web (embedded webviews)
  • worker: Web Worker
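  • shell: Standalone JavaScript shells such as d8 or jsc (not a system shell)

For a web app that runs GDAL inside a Web Worker, compile with -sENVIRONMENT=worker only. Several targets can be combined as a comma-separated list, for example -sENVIRONMENT=web,worker. Also configure: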

EXPORT_ES6

Set to 1 to emit the glue code as an ES module with a default export. The default output is an IIFE with CommonJS/AMD export sniffing, which cannot be loaded with import. Compare the tail of the generated glue code in each mode:

// -sEXPORT_ES6=1

;return moduleRtn}export default CModule;

// -sEXPORT_ES6=0

;return moduleRtn}})();if(typeof exports==="object"&&typeof module==="object"){module.exports=CModule;module.exports.default=CModule}else if(typeof define==="function"&&define["amd"])define([],()=>CModule);
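With -sEXPORT_ES6=1 the default export is a factory function that resolves to the module instance. The sketch below shows one way a worker might initialize it. The file name ./gdal3.js and the helper initGdal are hypothetical; locateFile is the Emscripten Module option used to tell the glue code where the .wasm and .data files live.

```javascript
// In a real module worker you would write:
//   import CModule from './gdal3.js';   // hypothetical file name
//   const Module = await initGdal(CModule, new URL('./', import.meta.url).href);
// `factory` stands in for that default export so the sketch stays self-contained.
function initGdal(factory, baseUrl) {
  return factory({
    // Resolve gdal3.wasm / gdal3.data relative to baseUrl instead of the page URL.
    locateFile: (fileName) => new URL(fileName, baseUrl).href,
  });
}
```

Passing the factory in as an argument keeps the loading logic independent of the bundler, which is useful when the artifacts are served from a CDN path that differs from the script location.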

Filesystem

Libraries like GDAL rely on OS filesystems. Emscripten emulates this in JS. Disable if unused:

FILESYSTEM

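Filesystem support is included automatically when the code uses stdio.h functions such as fprintf. For pure computation, disable it manually with -sFILESYSTEM=0.

When enabled, the filesystem API is reachable as Module.FS, provided 'FS' is listed in EXPORTED_RUNTIME_METHODS.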
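As a rough illustration of what Module.FS is used for, the following sketch copies user-supplied bytes into the in-memory filesystem so that C code can open them by path. The helper name stageFile and the example path are hypothetical; FS.writeFile is part of Emscripten's File System API.

```javascript
// Copy bytes into Emscripten's virtual filesystem and return the path.
// Assumes the build exported FS via -sEXPORTED_RUNTIME_METHODS.
function stageFile(Module, path, bytes) {
  if (!Module.FS) {
    throw new Error("FS is not available; add 'FS' to EXPORTED_RUNTIME_METHODS");
  }
  // writeFile creates or overwrites the file at `path` with the given data.
  Module.FS.writeFile(path, bytes);
  return path;
}

// Usage in a browser, given a File object from an <input type="file">:
//   const bytes = new Uint8Array(await file.arrayBuffer());
//   const path = stageFile(Module, '/input.tif', bytes);
```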

Other Options

1. Polyfill

POLYFILL

Default true. Disable if polyfills are handled elsewhere.

2. Use JS Math Library

JS_MATH

Set to 1 to use the JavaScript Math object for math functions instead of compiling them from libc. This can shrink the output but may be slower and less precise. Recommended only for precision-insensitive tasks.

3. Minimal Runtime

MINIMAL_RUNTIME

Emits a minimal runtime (no POSIX emulation, no Module object, no built-in XHR loading). It may break functionality, so it is not recommended.

Practical Optimization

Fixing gdal3.js Build Script Errors

1. Invalid Debug Level

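Line 4 of the Makefile uses -g4, which current emcc does not support. Older Emscripten releases used -g4 to request source maps; current releases use -gsource-map instead.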

Fix for type=debug:

GDAL_EMCC_FLAGS += -O0 -g3

2. Sourcemap Misconfiguration

The same line passes --source-map-base without -gsource-map, so no source map is generated. Fix:

GDAL_EMCC_FLAGS += -gsource-map --source-map-base $(BASE_URL)

Optimizing gdal3.js Build Script

1. Disable Debug in Production

Line 6:

GDAL_EMCC_FLAGS += -Oz -g0

2. Specify Environment

GDAL_EMCC_FLAGS += -s ENVIRONMENT=worker -s EXPORT_ES6=1

3. Reduce Exported Functions

The use case in Part 2 needs only GDALOpen, GDALInfo, and GDALClose, plus the malloc, free, and CSLCount helpers:

GDAL_EMCC_FLAGS += -s EXPORTED_FUNCTIONS="[\
'_malloc',\
'_free',\
'_CSLCount',\
'_GDALOpen',\
'_GDALClose',\
'_GDALInfo'\
]"

Minimal runtime methods:

GDAL_EMCC_FLAGS += -s EXPORTED_RUNTIME_METHODS="[\
'ccall',\
'cwrap',\
'FS'\
]"
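With ccall and cwrap exported, the JavaScript side can bind the remaining C functions by name. The sketch below wraps GDALOpen and GDALClose. The helper names bindGdal and withDataset are hypothetical, and the sketch assumes GDAL's drivers have already been registered (GDALAllRegister would have to be exported and called for GDALOpen to succeed).

```javascript
// GA_ReadOnly is 0 in GDAL's GDALAccess enum.
const GA_ReadOnly = 0;

function bindGdal(Module) {
  return {
    // GDALDatasetH GDALOpen(const char *filename, GDALAccess access)
    open: Module.cwrap('GDALOpen', 'number', ['string', 'number']),
    // GDALClose takes the handle returned by GDALOpen; its result is ignored here.
    close: Module.cwrap('GDALClose', null, ['number']),
  };
}

// Open a dataset, run `fn` with its handle, and always close it afterwards.
function withDataset(Module, path, fn) {
  const gdal = bindGdal(Module);
  const handle = gdal.open(path, GA_ReadOnly);
  if (handle === 0) throw new Error(`GDALOpen failed for ${path}`);
  try {
    return fn(handle);
  } finally {
    gdal.close(handle);
  }
}
```

Any function bound this way must appear in EXPORTED_FUNCTIONS; otherwise it is removed at link time and the cwrap call fails at runtime.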

Results

Figure: Optimization results

  • Wasm file: Reduced by 6,177,075 bytes (22.44%)
  • JS file: Reduced by 18,299 bytes (10.21%)
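To reproduce this kind of comparison for your own builds, a small helper can turn before and after byte counts into the same two figures. This is a sketch: the function name sizeReduction and the byte counts in the example are made up for illustration and are not the measurements above.

```javascript
// Compute absolute and relative savings between two build sizes (in bytes).
function sizeReduction(beforeBytes, afterBytes) {
  const saved = beforeBytes - afterBytes;
  // Percentage of the original size that was removed, to two decimals.
  const percent = ((saved / beforeBytes) * 100).toFixed(2);
  return { saved, percent };
}

// Hypothetical sizes, for illustration only.
const { saved, percent } = sizeReduction(1000000, 775600);
console.log(`Reduced by ${saved.toLocaleString('en-US')} bytes (${percent}%)`);
```

In Node.js the inputs could come from fs.statSync(path).size for each artifact.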

Conclusion

Future articles will cover:

  1. Emscripten’s virtual filesystem
  2. Purpose and optimization of *.data files

