This is part 1 of possibly a 1-part series. We’ll see.
Has this ever happened to you? You have some code like this:
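A minimal sketch of such code, assuming a short string literal copied into a local buffer and printed with puts() (the function name, buffer size, and string are illustrative assumptions):

```c
#include <stdio.h>
#include <string.h>

/* Illustrative stand-in: the name greet(), the 32-byte buffer, and the
 * string are assumptions made for this sketch. */
int greet(void)
{
    char buf[32];

    strcpy(buf, "Hello, built-in world!");  /* the call we want to break on */
    return puts(buf);                        /* non-negative on success */
}
```

Call greet() from main() and build it with plain gcc to get the tiny program discussed here.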
You compile it, run it under a debugger like GDB (or ltrace?), and set a breakpoint on the call to strcpy():
Why isn’t strcpy() defined? We’re definitely calling it in that tiny program, right? So what gives? We double-check with nm:
puts() is there but strcpy() is not! It turns out that GCC has built-in implementations of many string functions. Emphasis mine:
GCC provides a large number of built-in functions other than the ones mentioned above. Some of these are for internal use in the processing of exceptions or variable-length argument lists and are not documented here because they may change from time to time; we do not recommend general use of these functions.
The remaining functions are provided for optimization purposes.
The generated code on 32-bit x86 (below) is pretty neat; the call to strcpy() simply becomes a sequence of immediate-to-memory movs!
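To make the idea concrete, here is a rough C rendering of what those immediate-to-memory moves amount to. It is a sketch, not GCC's actual output: the string "Hello, GCC!", the helper names, and the little-endian byte order (as on x86) are all assumptions.

```c
#include <stdint.h>
#include <string.h>

/* Models one "movl $imm32, offset(dst)": store a 32-bit constant. */
static void store32(char *p, uint32_t imm)
{
    memcpy(p, &imm, sizeof imm);
}

/* Hypothetical equivalent of strcpy(dst, "Hello, GCC!") on a
 * little-endian machine: 12 bytes (including the NUL) in three stores. */
void copy_by_immediates(char *dst)
{
    store32(dst + 0, 0x6c6c6548u);  /* "Hell"   */
    store32(dst + 4, 0x47202c6fu);  /* "o, G"   */
    store32(dst + 8, 0x00214343u);  /* "CC!\0"  */
}
```

The string never exists as separate data; its bytes are packed into the constants themselves.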
I suppose this will have a couple of performance benefits:
- One fewer symbol for the dynamic linker (ld.so) to resolve at process startup
- No calls, so no stack manipulations (GCC on 32-bit x86 uses the “cdecl” calling convention, where all arguments are passed on the stack)
- The string lives in the mov instructions themselves, so there are no separate TLB or data-cache misses for the source string!
It’s a little more convoluted on amd64. As far as I can tell from page 218 of the AMD64 Architecture Programmer’s Manual, the only 64-bit immediate mov is to a register, not to memory!
The generated code on amd64 reflects this, using two moves (immediate to register, then register to memory) for every 8 bytes of the string:
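The two-step pattern can be sketched in C as well. Again this is only an illustration under assumptions (the string "Hello, built-in", the function name, and little-endian byte order), with a local variable standing in for the register:

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical equivalent of strcpy(dst, "Hello, built-in") on amd64:
 * 16 bytes (including the NUL), moved 8 bytes at a time. */
void copy_via_register(char *dst)
{
    uint64_t reg;                    /* stands in for a 64-bit register */

    reg = 0x62202c6f6c6c6548u;       /* immediate -> register: "Hello, b" */
    memcpy(dst + 0, &reg, 8);        /* register  -> memory               */

    reg = 0x006e692d746c6975u;       /* immediate -> register: "uilt-in\0" */
    memcpy(dst + 8, &reg, 8);        /* register  -> memory                */
}
```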
There’s a code size overhead here: 3 bytes of instructions for every 4 bytes of string. As this is a significant overhead, gcc -Os disables the optimization. If you absolutely need to set a breakpoint on strcpy(), strcat(), or the other GCC built-ins, compile with -fno-builtin to turn this behaviour off.
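If rebuilding everything with that flag is too heavy-handed, one possible per-call workaround (my own assumption, not something from the GCC documentation quoted above) is to call strcpy() through a volatile function pointer. The compiler can no longer see which function is being called, so it should emit a real call. The helper name is hypothetical:

```c
#include <string.h>

/* Hypothetical helper: the volatile pointer hides the callee from the
 * optimizer, so the copy should not be expanded inline. */
char *real_strcpy(char *dst, const char *src)
{
    char *(*volatile fn)(char *, const char *) = strcpy;

    return fn(dst, src);
}
```

With this in place, strcpy should show up as an undefined symbol in nm again, and a breakpoint on it has something to land on.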