inline: The Reality

Eliminating the CALL opcode.
Merging logic directly into the instruction stream.

1. Generated Assembly

In C++, when a function is inlined, the compiler omits the call instruction and inserts the function’s body at the call site. This changes the structure of the generated assembly and removes the instructions a call would otherwise need. The inline keyword itself is only a hint; the optimizer decides whether a given call is actually expanded.

A normal function call typically involves:
  • Saving the return address
  • Passing parameters in registers or on the stack
  • Jumping to the function
  • Returning control to the caller
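
The steps listed above can be made visible by asking the compiler to keep a call in place. This is a minimal sketch: add_one and call_add_one are hypothetical names, and [[gnu::noinline]] is a GCC/Clang extension that other compilers may ignore. The assembly in the comments is illustrative x86-64 output, not a guaranteed listing.

```cpp
// Hypothetical helper. [[gnu::noinline]] (GCC/Clang extension) asks the
// compiler to keep this function as a real out-of-line call.
[[gnu::noinline]] int add_one(int x) {
    return x + 1;  // the function ends with `ret`, returning to the caller
}

int call_add_one(int v) {
    // Roughly what an x86-64 compiler emits for the call (illustrative only):
    //   mov  edi, <v>     ; pass the parameter in a register
    //   call add_one      ; save the return address and jump
    //   ...               ; execution resumes here after `ret`
    return add_one(v) * 2;
}
```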

2. Call Elimination

When a function is inlined, the call and return instructions disappear, and the parameter-passing setup can usually be simplified or removed. In their place, the compiler emits the equivalent instructions of the function body directly in the caller's instruction stream.

This reduces branching and can allow the compiler to perform further optimizations, such as constant propagation, register reuse, and instruction reordering.
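
A minimal sketch of call elimination, assuming an optimizing build (for example -O2 with GCC or Clang). The names square and sum_of_squares are hypothetical, and whether the calls are actually expanded is up to the compiler.

```cpp
// Hypothetical small function. `inline` is only a request; the optimizer
// makes the final decision.
inline int square(int x) {
    return x * x;
}

int sum_of_squares(int a, int b) {
    // With optimization enabled, a compiler will typically emit two
    // multiplications and one addition right here, with no call to square.
    return square(a) + square(b);
}
```

Comparing the assembly of sum_of_squares at -O0 and -O2 is a simple way to check whether the calls were removed on a given toolchain.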

3. Optimization Opportunities

Inlining often enables additional compiler optimizations because the compiler can see the entire context of the code at the call site. This allows:
  • Better register allocation
  • Constant folding when parameters are known at compile time
  • Removal of redundant calculations
  • Improved instruction scheduling

As a result, inlining can sometimes produce faster machine code than a traditional function call.
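
Two of the optimizations listed above, constant folding and removal of redundant calculations, can be sketched as follows. The function names are hypothetical, and the comments describe what an optimizer is able to do after inlining, not what every compiler is guaranteed to emit.

```cpp
// Hypothetical helper used only for illustration.
inline int scale(int value, int factor) {
    return value * factor;
}

int scaled_constant() {
    // Both arguments are compile-time constants, so once the body is
    // inlined the multiplication can be folded to the literal 12.
    return scale(3, 4);
}

int scaled_twice(int v) {
    // After inlining, both calls are visibly the same expression (v * 5),
    // so the compiler can compute it once and reuse the result.
    return scale(v, 5) + scale(v, 5);
}
```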

4. Trade-offs

Although inlining can improve performance, it also increases the amount of generated machine code. If a function is inlined at many locations, the binary may grow significantly.

Larger code can put more pressure on the instruction cache, which may offset the gains from eliminating function calls. For this reason, modern compilers use cost heuristics, weighing factors such as the size of the callee and how often a call site runs, to decide when inlining is beneficial.
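
One common way to limit code growth is to keep rarely executed code out of line so that the part worth inlining stays small. The sketch below assumes GCC or Clang for [[gnu::noinline]]; fail_out_of_range and checked_get are hypothetical names.

```cpp
#include <stdexcept>

// Hypothetical cold-path helper. [[gnu::noinline]] (GCC/Clang extension)
// keeps the error handling in one place instead of copying it into
// every caller.
[[noreturn]] [[gnu::noinline]] void fail_out_of_range() {
    throw std::out_of_range("index out of range");
}

// Small hot-path accessor: cheap to inline because the bulky part
// (constructing and throwing the exception) stays out of line.
inline int checked_get(const int* data, int size, int index) {
    if (index < 0 || index >= size) {
        fail_out_of_range();
    }
    return data[index];
}
```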

5. Practical Insight

In performance-critical domains such as embedded systems, game engines, and high-performance libraries, developers usually leave inlining decisions to the compiler and build with optimization enabled (for example, -O2 in GCC and Clang). Small functions are then expanded where the compiler judges it profitable, without unnecessary code growth.
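
Where measurement shows that the compiler's default choice is not the right one, codebases in these domains sometimes add a stronger, compiler-specific request. The sketch below is one possible shape: FORCE_INLINE, mix, and blend_sum are hypothetical names, while __forceinline (MSVC) and __attribute__((always_inline)) (GCC/Clang) are the vendor extensions being wrapped. Even these are requests that a compiler can decline or reject in some situations.

```cpp
// Hypothetical portability macro; FORCE_INLINE is not a standard name.
#if defined(_MSC_VER)
#define FORCE_INLINE __forceinline
#elif defined(__GNUC__) || defined(__clang__)
#define FORCE_INLINE inline __attribute__((always_inline))
#else
#define FORCE_INLINE inline  // fall back to an ordinary hint
#endif

// Tiny function called in a hot loop: a typical candidate for inlining.
FORCE_INLINE float mix(float a, float b, float t) {
    return a + (b - a) * t;
}

float blend_sum(const float* from, const float* to, int count, float t) {
    float total = 0.0f;
    for (int i = 0; i < count; ++i) {
        total += mix(from[i], to[i], t);  // expected to expand in place
    }
    return total;
}
```

Such a macro is best reserved for functions that profiling has identified as hot, since forcing expansion everywhere brings back the code-growth problem described in the trade-offs section.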
Hinting at Silicon Efficiency