when faster is slower, libc world

Investigating different memmem string search algorithms for my future vi like text editor, turns out that the libc replacements for embededd devices are slower than the native gnulibc. The test program opens a 300MB file (linux kernel header and source files merged-that is my reference), and searches for the (random) string “xxxxxxxxxxxxxxxxboeuoeuboeuoeu”. I linked against the memmem functions from 4 different libraries + a Boyer-Moore algorithm implementation. The results below show the time to search through the are the results (measured with time utility):

* dietlibc: 3.529
* newlibc: 1.955
* libc: 1.238
* klibc: 0.694
* Boyer Moore impl: 0.228

Did anybody really hear about klibc? Me not before. But it seems that it is providing the fastest algorithm. Dietlibc and newlibc are interesting projects for embedded devices. The goal was not speed, but size. However for a string search algorithm space shuld not be so much of compromise.

Making further tests… hm.. doing simple grep hm… result: 0.157, searching for the same string in the same file (about file cache: all programs had the same chance of having the file inside the fs cache). Well the reason is that grep is not loading the full file into the memory, it just loads chunks, and searches inside those chunks. The test program loads the full buffer into memory. The numbers above are for a mmap-ed version. Mmap seems to be overall better solution. However the page misses introduce too much noise in the results of the measurement of the algorithm. However in order to get more accurate results, I modified the test program to allocate ~300MB, and “read” the file into the 300MB. Reading the file is slower than the going through the mmap, as you can see below. However once having the full file in RAM, the measurements are more accurate:

with malloc/read
* Boyer-Moore: read/mmap time 0.358, memmem time 0.092
* klibc: read/mmap time 0.358, memmem time 0.555
* libc: read/mmap time 0.357, memmem time 1.100
* newlib: read/mmap time 0.358, memmem time 1.177
* dietlibc: read/mmap time 0.358, memmem time 3.344

So, the “Naïve string search algorithm” used by dietlibc is clearly the looser. Having in mind that dietlibc is intended for embedded devices, would make sense to improve it.

note: the read/mmap functions are from glibc for klibc/dietlibc/newlibc

Leave Your Comment

Name*
Mail*
Website
Comment