(Continued from part 6)
“The profiling flag (-pg) causes the compiler to issue a call to mcount() on every function entry. This is dispatched by machine specific glue to _mcount(frompc,selfpc).”
This quote is taken from a very informative document at NetBSD.org:
Since we’re using gcc to compile the kernel, building with the “-pg” option applies that “glue” for us. All we have to implement ourselves is the counter function _mcount(frompc, selfpc).
This approach is really “instrumenting” the kernel, but doing it in a (mostly) automatic way.
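To make the idea concrete, here is a minimal sketch of what such a counter function might look like. It is not the NetBSD implementation, and every name and size in it (sketch_mcount, arc_table, ARC_TABLE_SIZE, arc_count) is made up for illustration. In a kernel the body would sit behind the name _mcount that the glue calls, and the file holding it would presumably have to be built without -pg so that the counter does not end up calling itself.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical table size; a real kernel would size this from its text segment. */
#define ARC_TABLE_SIZE 1024u

/* One caller -> callee edge and the number of times it was taken. */
struct arc {
    uintptr_t frompc;     /* return address inside the caller */
    uintptr_t selfpc;     /* address of the function just entered */
    unsigned long count;  /* 0 marks an unused slot */
};

static struct arc arc_table[ARC_TABLE_SIZE];
static unsigned long arcs_dropped;  /* calls that could not be recorded */
static volatile int mcount_busy;    /* crude guard against re-entry */

/* Pick the slot where probing for this (frompc, selfpc) pair starts. */
static size_t arc_slot(uintptr_t frompc, uintptr_t selfpc)
{
    return (size_t)((frompc ^ (selfpc >> 2)) % ARC_TABLE_SIZE);
}

/* Assumption: in the kernel this body would be reached under the name _mcount. */
void sketch_mcount(uintptr_t frompc, uintptr_t selfpc)
{
    size_t start, i;

    if (mcount_busy) {          /* already inside the counter: give up */
        arcs_dropped++;
        return;
    }
    mcount_busy = 1;

    start = arc_slot(frompc, selfpc);
    for (i = 0; i < ARC_TABLE_SIZE; i++) {
        struct arc *a = &arc_table[(start + i) % ARC_TABLE_SIZE];

        if (a->count == 0) {    /* free slot: first time we see this arc */
            a->frompc = frompc;
            a->selfpc = selfpc;
            a->count = 1;
            break;
        }
        if (a->frompc == frompc && a->selfpc == selfpc) {
            a->count++;         /* known arc: bump its counter */
            break;
        }
    }
    if (i == ARC_TABLE_SIZE)    /* table full and arc not found */
        arcs_dropped++;

    mcount_busy = 0;
}

/* Return how many times the arc frompc -> selfpc was recorded. */
unsigned long arc_count(uintptr_t frompc, uintptr_t selfpc)
{
    size_t start = arc_slot(frompc, selfpc);
    size_t i;

    for (i = 0; i < ARC_TABLE_SIZE; i++) {
        const struct arc *a = &arc_table[(start + i) % ARC_TABLE_SIZE];

        if (a->count == 0)      /* hit an empty slot: arc was never stored */
            return 0;
        if (a->frompc == frompc && a->selfpc == selfpc)
            return a->count;
    }
    return 0;
}

/* Return how many calls were not recorded. */
unsigned long arcs_lost(void)
{
    return arcs_dropped;
}
```

A fixed-size table with linear probing is used here only because it needs no memory allocation, which seems like a reasonable constraint for code that runs on every function entry. A real implementation would also need proper locking or interrupt masking instead of the simple busy flag.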
The NetBSD version of mcount can be found here:
A gprof-outfitted kernel collects the data, and then has to store it somewhere. In the usual case with gprof kernel profiling, the data is kept in kernel structures and accessed (indirectly) by userland programs. As far as gprof profiling is concerned, the difference between a kernel target and an application target is mostly a matter of how the data is collected and extracted. The kernel adds some complexity to the extraction process.
Two main types of profiling data can be extracted: first, a cumulative histogram of call frequency, and second, a call graph, which is a report of the program flow from one function to the next. The call graph can be very long and repetitive. Sometimes, the call frequency histogram can be used to pinpoint a place in the call graph.
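As a rough illustration of how the two kinds of data relate, the sketch below sums a list of call graph arcs into a per-function call histogram, picks the most frequently called function, and then goes back to the arcs to find its heaviest caller. The types and names (call_arc, NFUNCS, build_histogram and so on) are invented for this example, and functions are identified by small integer ids rather than addresses to keep it short. It is not how gprof itself stores or reports its data.

```c
#include <stddef.h>

/* Hypothetical number of functions being tracked, identified by ids 0..NFUNCS-1. */
#define NFUNCS 4

/* One edge of the call graph: caller invoked callee `count` times. */
struct call_arc {
    int caller;
    int callee;
    unsigned long count;
};

/* Sum the arc counts per callee to get a call-frequency histogram. */
void build_histogram(const struct call_arc *arcs, size_t n,
                     unsigned long hist[NFUNCS])
{
    size_t i;

    for (i = 0; i < NFUNCS; i++)
        hist[i] = 0;
    for (i = 0; i < n; i++) {
        if (arcs[i].callee >= 0 && arcs[i].callee < NFUNCS)
            hist[arcs[i].callee] += arcs[i].count;
    }
}

/* Return the id of the most frequently called function (lowest id on a tie). */
int hottest_function(const unsigned long hist[NFUNCS])
{
    int best = 0;
    int f;

    for (f = 1; f < NFUNCS; f++) {
        if (hist[f] > hist[best])
            best = f;
    }
    return best;
}

/* Return the caller responsible for most calls to `callee`, or -1 if none. */
int heaviest_caller(const struct call_arc *arcs, size_t n, int callee)
{
    unsigned long best_count = 0;
    int best = -1;
    size_t i;

    for (i = 0; i < n; i++) {
        if (arcs[i].callee == callee && arcs[i].count > best_count) {
            best_count = arcs[i].count;
            best = arcs[i].caller;
        }
    }
    return best;
}
```

With the arcs 0→1 (3 calls), 0→2 (1 call), 1→2 (5 calls) and 2→3 (2 calls), function 2 comes out on top of the histogram with 6 calls, and the arc list then points at function 1 as the caller to look at first.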
Profiling with histograms and call graphs can give a pretty decent overall picture of the code, but brute-force step-throughs with the debugger can also be informative. An issue with the kernel we’re using, which shows up when the system is running on particular versions of the QEMU emulator, seems to be localized to two particular functions. The following graphic shows one of them:
The issue is intermittent disk access, and it’s been a real tough chase thus far. Stepping into the first function of interest (atapiStartStop()) never results in any exception or error. Stepping over the function usually causes the problem to exhibit itself. The same is true of the second function that seems to be related to the problem: sendAtapiPacket(). The following graphic shows the execution inside of the second function:
So, we see that a debugger can actually change the behavior of the program being debugged. By using the debugger in a certain way, we “fix” the problem, and at the same time make it more difficult to find the true cause. What the graphics present is a situation where a function operates normally when it is stepped into, and misbehaves when it is stepped over. The latency (extreme latency) of the debugger, especially when single stepping into a function, can change the behavior of the function. In the present case, it actually “fixes” the problem, at least for as long as the debugger is in operation. For this reason, graphical debuggers are not cure-alls. The following graphic shows more of the code in the second function:
The graphic (below) shows what happens when the functions we’re looking at are stepped over, rather than into: