Private Homepage of Hartmut Henkel

Experimental PdfTeX Patch: AVL Search Tree for pdftoepdf.cc File

Introduction

This is just an informal write-up of my private/spare-time fiddling with pdfTeX (it's for fun).

PdfTeX makes use of frequent dictionary lookups by arrays or linked lists at many places. One finds e. g. code sniplets like for(...){...if(!strcmp(...)) break;}, which tend to make a program slow, if the list grows. This can be improved by hashing, e. g. by a patch from Heiko Oberdiek for the fm_tab list.

I experimented with the using an AVL search tree as replacement of the linked list PdfDocuments in file pdftoepdf.cc. AVL trees allow filling in and deleting of elements, while always remaining properly balanced as an near-optimum binary search tree. They therefore allow very fast searches.

This patch was actually rather straight-forward, as the main places to patch are located in a single C++ file. AVL tree sources are available from many places. I use here the libavl, version 2.0.1 (which BTW fits nicely to the current teTeX version :-) from www.gnu.org. This AVL tree implementation in C is written by Ben Pfaff, who did an amazingly brilliant and illustrative job in documenting his software.

The Code

The patch is based on the teTeX 2.0.1 version. The files/patches below go into the subdirectory pdftexdir.

Installation

Forgot anything?

Experience, Status

Code got a little smaller, as one hasn't to make his own list management. The code doesn't crash here at least, but it's in a very early test phase.

TODOs

End Remark

Help, advice, critics greatly welcome! Have fun!

This page first put online 19 February 2003.