Overview

The FreeWorld OS kernel decompression system provides complete support for decompressing compressed kernel images. It supports three major compression formats: gzip, LZMA2/XZ, and Zstandard (ZSTD). This allows the kernel to be stored in compressed form, reducing boot media size and improving load times.

✅ Fully Implemented

Total Implementation: ~5,034 lines of production-quality assembly code

Supported Formats

1. gzip (DEFLATE)

Location: kernel/compression/inflate.asm (~970 lines)

Complete DEFLATE/INFLATE algorithm implementation with:

  • Bitstream Handling: Variable-length bit reading with buffer management
  • Huffman Coding: Fixed and dynamic Huffman table building and decoding
  • LZ77 Decompression: Literal and match decoding with sliding window
  • Block Types: Uncompressed, fixed Huffman, and dynamic Huffman blocks
  • 32KB Sliding Window: Dictionary management for match references

Key Functions

  • inflate_deflate() - Main DEFLATE decompression function
  • read_bits() - Read variable-length bits from stream
  • build_huffman_codes() - Build Huffman decoding tables
  • decode_huffman_symbol() - Decode symbol using Huffman tree
  • decode_length_and_distance() - Decode LZ77 match length and distance
  • copy_from_window() - Copy match from sliding window

2. LZMA2/XZ

Location: kernel/compression/lzma2.asm (~981 lines) and kernel/compression/lzma2_helpers.asm (~858 lines)

Complete LZMA2/XZ decompression implementation with:

  • XZ Format Parsing: Magic bytes, stream headers, block handling
  • Range Decoder: Arithmetic coding with bit normalization
  • LZMA Decoder: Literal, length, and distance decoding
  • Context-Aware Decoding: lc/lp/pb properties for literal context
  • Dictionary Management: Circular buffer for match references
  • Unpack Size Termination: Proper stream termination handling

Key Functions

  • decompress_xz() - Main XZ decompression function
  • range_decoder_decode_bit() - Decode single bit using range decoder
  • range_decoder_normalize() - Normalize range decoder state
  • lzma_decode_literal() - Decode literal byte with context
  • lzma_decode_len() - Decode match length
  • lzma_decode_distance() - Decode match distance
  • lzma_copy_match() - Copy match from dictionary

3. Zstandard (ZSTD)

Location: kernel/compression/zstd.asm (~668 lines) and kernel/compression/zstd_helpers.asm (~1,300 lines)

Complete ZSTD decompression implementation with:

  • ZSTD Format Parsing: Magic bytes, frame headers, block processing
  • FSE (Finite State Entropy): Table building and symbol decoding
  • Huffman Coding: Canonical Huffman tree building and decoding
  • Literal Decompression: Raw, RLE, and Huffman modes
  • Sequence Decompression: FSE decoding for lengths and offsets
  • Match Copying: Window-based match copying with wrapping
  • Window Management: Allocation, position tracking, and wrapping

Key Functions

  • decompress_zstd() - Main ZSTD decompression function
  • zstd_build_fse_table() - Build FSE decoding table
  • zstd_decode_fse_symbol() - Decode symbol using FSE
  • zstd_build_huffman_table() - Build canonical Huffman table
  • zstd_decode_huffman_symbol() - Decode symbol using Huffman
  • zstd_decompress_literals() - Decompress literal block
  • zstd_decompress_sequences() - Decompress sequence block
  • zstd_copy_match() - Copy match with window management

Architecture

Decompression Flow

Kernel Image (Compressed)
    ↓
kernel_decompress() - Format Detection
    ↓
    ├─> gzip → decompress_gzip() → inflate_deflate()
    ├─> XZ → decompress_xz() → LZMA2 decoder
    └─> ZSTD → decompress_zstd() → ZSTD decoder
    ↓
Kernel Image (Decompressed)
    ↓
Kernel Initialization

Format Detection

The decompression dispatcher (kernel/compression/decompress.asm) automatically detects the compression format by checking magic numbers:

  • gzip: 0x1F 0x8B at offset 0
  • XZ: 0xFD 0x37 0x7A 0x58 0x5A 0x00 at offset 0
  • ZSTD: 0x28 0xB5 0x2F 0xFD at offset 0

Memory Management

  • Compressed kernel buffer: Provided by bootloader
  • Decompressed kernel buffer: Allocated via kmalloc()
  • Working buffers: Allocated as needed for algorithms
  • Sliding windows: Managed within algorithm implementations

Implementation Details

gzip/DEFLATE

The DEFLATE implementation follows RFC 1951 and includes:

  • Bit-level reading with buffer refills
  • Fixed Huffman tables (pre-defined in RFC 1951)
  • Dynamic Huffman table building from code lengths
  • LZ77 match decoding (lengths 3-258, distances 1-32768)
  • 32KB sliding window for match references

LZMA2/XZ

The LZMA2 implementation includes:

  • XZ container format parsing (magic, stream flags, CRC)
  • LZMA2 block types (uncompressed, compressed, LZMA variants)
  • Range decoder with 32-bit precision
  • Literal decoding with lc/lp/pb context properties
  • Length and distance decoding with position-based probabilities
  • Dictionary management with circular buffer

Zstandard

The ZSTD implementation includes:

  • ZSTD frame header parsing (window descriptor, content size)
  • Block types (raw, RLE, compressed, end blocks)
  • FSE table building with normalization
  • Canonical Huffman tree construction
  • Literal decompression (raw, RLE, Huffman modes)
  • Sequence decompression (FSE for lengths/offsets)
  • Window management for match copying

Usage

Kernel Initialization

Decompression is automatically called during kernel initialization:

kernel_init()
    ├─> early_hardware_detect()
    ├─> kernel_decompress()  ← Automatic decompression
    ├─> hardware_detect()
    └─> ...

Manual Decompression

; Decompress kernel if compressed
; Input: RSI = compressed buffer, RCX = compressed size
; Output: RAX = decompressed buffer, RDX = decompressed size
extern kernel_decompress
call kernel_decompress

Build System

The decompression system is built as part of the kernel:

cd kernel/compression
make

This compiles:

  • decompress.asm - Format detection and dispatcher
  • inflate.asm - gzip/DEFLATE decompression
  • lzma2.asm - XZ format parsing and block handling
  • lzma2_helpers.asm - LZMA2 core algorithms
  • zstd.asm - ZSTD format parsing and block handling
  • zstd_helpers.asm - ZSTD core algorithms

Performance

  • gzip: Fast decompression, moderate compression ratio
  • LZMA2/XZ: Excellent compression ratio, slower decompression
  • ZSTD: Balanced compression ratio and speed

All algorithms are optimized for kernel-space execution with minimal memory overhead.

Integration Points

With Bootloader

The bootloader can load a compressed kernel image, which is automatically decompressed during kernel initialization.

With Kernel Initialization

Decompression is integrated into the kernel initialization sequence, occurring after early hardware detection but before full kernel startup.

Related Documentation