Optimizing C and C++ Code
Embedded software often runs on processors with limited computational power, so
optimizing the code becomes a necessity. This article explores optimization
techniques for C and C++ code developed for real-time and embedded systems.
Many techniques discussed here have roots in the material we covered in the articles
dealing with C to Assembly translation. A good understanding of the following articles
will help:
C To Assembly Translation
C To Assembly Translation II
C To Assembly Translation III
To reduce the number of comparisons performed, judiciously break big switch
statements into nested switches. Put the frequently occurring case labels in one switch
and move the remaining case labels into a second switch placed in the default leg of the
first.
pMsg = ReceiveMessage();
switch (pMsg->type)
{
case FREQUENT_MSG1:
    handleFrequentMsg1();
    break;
case FREQUENT_MSG2:
    handleFrequentMsg2();
    break;
. . .
case FREQUENT_MSGn:
    handleFrequentMsgn();
    break;
default:
    // Nested switch statement for handling infrequent messages.
    switch (pMsg->type)
    {
    case INFREQUENT_MSG1:
        handleInfrequentMsg1();
        break;
    case INFREQUENT_MSG2:
        handleInfrequentMsg2();
        break;
    . . .
    case INFREQUENT_MSGm:
        handleInfrequentMsgm();
        break;
    }
}
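The following is a minimal, compilable sketch of the same pattern. The message types, the Msg structure and the dispatch function are hypothetical names chosen for illustration, and the handlers are replaced by return codes so that the path taken can be checked.

```cpp
#include <cassert>

// Hypothetical message types; the names are illustrative only.
enum MsgType {
    FREQUENT_MSG1,
    FREQUENT_MSG2,
    INFREQUENT_MSG1,
    INFREQUENT_MSG2,
    UNKNOWN_MSG
};

// Hypothetical message structure carrying only the type field.
struct Msg {
    MsgType type;
};

// Returns a code identifying which leg handled the message.
int dispatch(const Msg* pMsg)
{
    switch (pMsg->type)
    {
    case FREQUENT_MSG1:
        return 1;
    case FREQUENT_MSG2:
        return 2;
    default:
        // Only infrequent messages reach the nested switch.
        switch (pMsg->type)
        {
        case INFREQUENT_MSG1:
            return 101;
        case INFREQUENT_MSG2:
            return 102;
        default:
            return -1;   // unrecognized message type
        }
    }
}
```

Whether the split actually saves comparisons depends on how the compiler translates each switch (comparison chain or jump table), so it is worth checking the generated assembly for the target.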
When a function has few enough local variables for the compiler to keep them all in
registers, accessing them is faster than accessing them from memory.
In addition, if no local variables need to be saved on the stack, the compiler does not
incur the overhead of setting up and restoring the frame pointer.
Passing an object by value makes the compiler create, copy and destroy a temporary
object, so it is more efficient to pass references as parameters. This optimization can
be applied easily, without a major impact on the code, by replacing pass-by-value
parameters with const references. (It is important to pass const references so that a bug
in the called function cannot change the actual value of the parameter.)
Returning bigger objects by value has the same performance issue: a temporary return
object is created in this case too.
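The cost of pass by value can be made visible with a small sketch. BigRecord, firstByValue and firstByConstRef are hypothetical names introduced for this illustration; the copy constructor counts how many times it runs.

```cpp
#include <cassert>

// Hypothetical large object that counts how often it is copied.
struct BigRecord {
    static int copies;
    int data[64];

    BigRecord() : data() {}                 // zero-initialize the array
    BigRecord(const BigRecord& other)
    {
        for (int i = 0; i < 64; ++i)
            data[i] = other.data[i];
        ++copies;                           // record every copy made
    }
};
int BigRecord::copies = 0;

// Pass by value: the argument is copied into the parameter.
int firstByValue(BigRecord r)
{
    return r.data[0];
}

// Pass by const reference: no copy is made, and the callee
// cannot modify the caller's object.
int firstByConstRef(const BigRecord& r)
{
    return r.data[0];
}
```

Calling firstByValue with an existing BigRecord increments the copy counter once per call, while firstByConstRef leaves it unchanged.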
Let's consider two functions that perform the same operation, one with char and one
with int. Calling the char version involves the following steps:
1. Convert the second parameter into an int by sign extension (C and C++ push
parameters in reverse order).
2. Push the sign extended parameter on to the stack as b.
3. Convert the first parameter into an int by sign extension.
4. Push the sign extended parameter on to the stack as a.
5. The called function adds a and b.
6. The result is cast to a char.
7. The result is stored in char c.
8. c is again sign extended.
9. Sign extended c is copied into the return value register and the function returns to
the caller.
10. The caller now converts again from int to char.
11. The result is stored.
Thus we can conclude that int should be used for all integer variables unless storage
requirements force us to use a char or short. When char and short have to be used,
consider the impact of byte alignment and ordering to see if you would really save
space. (Many compilers pad structure elements to their natural alignment boundaries, so
a smaller type does not always yield a smaller structure.)
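As a rough illustration of the char versus int comparison, the sketch below shows what the two functions might look like. The names sum_char, sum_int, WithChar and WithInt are assumptions made for this example, not the original listing. The static_assert confirms that the language itself performs char arithmetic as int, and the two structures show why a char member may not save any space.

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical char version: operands are promoted to int for the
// addition and the result is narrowed back to char.
char sum_char(char a, char b)
{
    char c;
    c = a + b;
    return c;
}

// Hypothetical int version: no conversions are needed.
int sum_int(int a, int b)
{
    int c;
    c = a + b;
    return c;
}

// Adding two char values yields an int (integer promotion).
static_assert(std::is_same<decltype(char(1) + char(2)), int>::value,
              "char + char is computed as int");

// A char member followed by an int is usually padded so that the
// int stays aligned; the structure may be no smaller than WithInt.
struct WithChar { char flag; int count; };
struct WithInt  { int flag;  int count; };
```

On a typical platform with a 4-byte, 4-byte-aligned int, sizeof(WithChar) and sizeof(WithInt) both come out at 8 bytes, so the char member buys nothing. The exact sizes depend on the compiler and target.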
void foo_optimized()
{
    Complex c = 5;
}
In the function foo, the complex number c is being initialized first by the instantiation and
then by the assignment. In foo_optimized, c is being initialized directly to the final value,
thus saving a call to the default constructor of Complex.
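The definition of foo is not shown above. The sketch below assumes it declares c and then assigns to it, as the description suggests. The Complex class here is a hypothetical stand-in with counters added so that the constructor and assignment calls can be observed.

```cpp
#include <cassert>

// Hypothetical Complex class that counts which members are called.
struct Complex {
    static int defaultCtorCalls;
    static int convertingCtorCalls;
    static int assignmentCalls;
    double re, im;

    Complex() : re(0), im(0) { ++defaultCtorCalls; }
    Complex(double r) : re(r), im(0) { ++convertingCtorCalls; }
    Complex& operator=(const Complex& rhs)
    {
        re = rhs.re;
        im = rhs.im;
        ++assignmentCalls;
        return *this;
    }
};
int Complex::defaultCtorCalls = 0;
int Complex::convertingCtorCalls = 0;
int Complex::assignmentCalls = 0;

// Assumed shape of foo: construct first, assign afterwards.
void foo()
{
    Complex c;          // default constructor
    c = (Complex)5;     // temporary built from 5, then assignment
}

void foo_optimized()
{
    Complex c = 5;      // constructed directly from 5
}
```

With these definitions, foo runs the default constructor, the converting constructor and the assignment operator, while foo_optimized runs only the converting constructor.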
In the example given below, the optimized version of the Employee constructor saves
the default constructor calls for m_name and m_designation strings.