How Does GDB Parse Breakpoint Location

When you use GDB, you can add breakpoints, giving it some arguments. For example, you can specify the location by giving line numbers, function names, variable names, etc. you can also add conditions. When the condition is met, GDB will pause the inferior when the breakpoint is hit. But how does GDB find out the address?

Let's look at the file gdb/breakpoint.c. It implements the command "break" in GDB. Actually, the function break_command_really does parse the command arguments, converts the arguments to address, and adds the breakpoint.

It first calls do_captured_parse_breakpoint, which will eventually calls parse_breakpoint_sals to parse the argument of the command break and return the address specification in the form of the type symtabs_and_lines.

symtabs_and_lines is a struct. It represents a sequences of symtab_and_line. symtab_and_line is a struct containing the information of a symbol table, a line number, the pc, and etc.

The function parse_breakpoint_sals will find out the address, the line number, the symbol table, and the section of the given location, and store them in the variable sal which is of type struct symbol_and_line. What it does is dependent on whether the location is specified.

It is quite simple if the location is not specified. That is, the command is "break" but "break 47". The function just simply assigns the defaults to sal. These defaults are stored in the variables default_breakpoint_pc, default_breakpoint_line, default_breakpoint_symtab. These variables are set by the function print_stack_frame.

In the other case where the location is given, it will call decode_line_1 to parse the location. Considering the location can be an address, a line number, a function, etc, the function decode_line_1 uses different methods to parse them.
For the address, it calls decode_indirect.
If you specify an Object-C method or a compound method without a file name ("A::B::func", "[object func]", etc), it will search the symbol table for the symbol of the function.
If you specify a file name (e.g. "file.c:47"), it first reads the symbol table of that file. Then it calls decode_all_digits if the location is a number (e.g. a line number) or decode_variable if the location is a name (a function or a variable).

The function decode_variable merely looks up the symbol of the variable in the symbol table. GDB uses the struct symbol to represent a function, and a variable, etc. Each symbol has an address. A function address, unlike a local variable address, is known when the when the program is loaded. So function decode_variable can extract the address from the symbol easily.

The function decode_all_digits will calculate the line number (if it begins with "+" or "-", meaning relative to the current line number) and finds out the symbol table which will be used to calculate the address. It also sets the flag explicit_line to 1. And when to find the address of the given line number? it is finished in the function breakpoint_sals_to_pc.

Also, GDB will create a canonical string, which is similar to "file.c:123".

When all these functions return, break_command_really will call breakpoint_sals_to_pc, which will call sals_to_pc, to extract information from the struct symtab_and_line and return the address.

What sals_to_pc does is to read each item in the symbol table and to compare the line of the item with the given line. And it will find the exact match item or the approximate one. There is a line table in each symbol table. It is a mapping of a line number to an address. GDB uses this technique to find out the address of a given line number.

Once the address is found, the function sals_to_pc will find the section in which this line is.

Facebooktwitterredditlinkedintumblr

Leave a comment

Your email address will not be published. Required fields are marked *