Git Debugging and Testing - Cactusinhand personal blog

TL;DR

The previous part, Git Internals, covered the basics of how Git works internally, such as the working tree and Git's data objects.

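This part takes a practical development angle: how to debug Git and profile its performance to find things worth optimizing, how to write test cases that verify each change, and finally how to submit patches to the upstream community.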

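This article introduces ways to debug and profile the Git source code, focusing on gprof and flame graphs as tools for locating performance bottlenecks. A worked example walks through the profiling process, including the compiler flags, per-function timing analysis, and reading the call graph. The goal is to help developers understand how Git executes internally and to lay the groundwork for later optimization and contributions.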

Once you have the latest source, you can build it:

$ git clone https://gitee.com/mirrors/git.git
$ cd git
# For development builds, turn on the DEVELOPER=1 knob
$ make DEVELOPER=1

Performance analysis#

We need a few tools to analyze performance.

gprof#

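gprof is the profiler that ships with the GNU toolchain. It reads the profile data file a program writes and reports where the program spent its time.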

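To use it, build with the -pg flag. The compiler then inserts instrumentation into the object code. At run time this instrumentation records the call relationships and call counts between functions, along with the time spent in each function itself and in the functions it calls.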

First, modify Git's build configuration file:

diff --git a/config.mak.in b/config.mak.in
index e6a6d0f941..f909e5938a 100644
--- a/config.mak.in
+++ b/config.mak.in
@@ -3,6 +3,7 @@

 CC = @CC@
 CFLAGS = @CFLAGS@
+CFLAGS += -pg
 CPPFLAGS = @CPPFLAGS@
 LDFLAGS = @LDFLAGS@
 AR = @AR@

After modifying config.mak.in, run ./configure to regenerate config.mak.autogen:

# In a fresh clone the configure script must be generated first
$ make configure
$ ./configure
$ make DEVELOPER=1 -j2

Then pick the Git command you want to profile:

# /path/bin-wrappers/git is the wrapper that runs the freshly built git
$ /path/bin-wrappers/git sub-command >/dev/null
# To get meaningful numbers, pick a relatively expensive operation:
$ ./git rev-list --objects HEAD >/dev/null

This produces a gmon.out file.

Inspect it:

$ gprof -b ./git gmon.out | less

Flat profile:

Each sample counts as 0.01 seconds.
  %   cumulative   self              self     total
 time   seconds   seconds    calls   s/call   s/call  name
 66.43      0.93     0.93 38840715     0.00     0.00  lookup_object
  7.14      1.03     0.10 38833916     0.00     0.00  tree_entry
  5.00      1.10     0.07 77424572     0.00     0.00  decode_tree_entry
  2.86      1.14     0.04    63460     0.00     0.00  process_tree
  2.14      1.17     0.03 38539150     0.00     0.00  object_as_type
  1.43      1.19     0.02 36794383     0.00     0.00  process_blob
  1.43      1.21     0.02   776884     0.00     0.00  hashmap_get
  1.43      1.23     0.02   369351     0.00     0.00  bsearch_hash
  1.43      1.25     0.02   301565     0.00     0.00  create_object
  0.71      1.26     0.01 38712286     0.00     0.00  update_tree_entry
  0.71      1.27     0.01 36794383     0.00     0.00  lookup_blob
  0.71      1.28     0.01  1966315     0.00     0.00  lookup_tree
  0.71      1.29     0.01   629866     0.00     0.00  do_xmalloc
  0.71      1.30     0.01   360462     0.00     0.00  nth_packed_object_offset
  0.71      1.31     0.01   360253     0.00     0.00  do_oid_object_info_extended
  0.71      1.32     0.01   237776     0.00     0.00  show_object
  0.71      1.33     0.01   237776     0.00     0.00  show_object_with_name

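Column 1, % time: the percentage of the total run time spent in this function itself (excluding the functions it calls).

Column 2, cumulative seconds: a running total of the self seconds of this function and all functions listed above it in the table.

Column 3, self seconds: the total time spent in this function itself across all calls (excluding the functions it calls). The table is sorted by this column.

Column 4, calls: the number of times this function was called.

Column 5, self s/call: the average time per call spent in this function itself, in seconds.

Column 6, total s/call: the average time per call spent in this function and the functions it calls, in seconds.

Column 7, name: the name of the function.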
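As a quick sanity check of how the columns relate, % time is a function's self seconds divided by the total sampled time. The sketch below assumes the total of 1.40 seconds that gprof reports for this same run in its call graph header. pct_time is a hypothetical helper name, not part of gprof.

```shell
# pct_time SELF_SECONDS TOTAL_SECONDS
# Hypothetical helper: share of the total run time spent in one function itself.
pct_time() {
    awk -v self="$1" -v total="$2" 'BEGIN { printf "%.2f\n", self / total * 100 }'
}

pct_time 0.93 1.40   # lookup_object, prints 66.43
pct_time 0.10 1.40   # tree_entry, prints 7.14
```

Both results agree with the first column of the flat profile above.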

This run shows that for git rev-list --objects HEAD, lookup_object is the most expensive function, accounting for 66% of the total time (0.93s). It is followed by tree_entry at 0.10s and decode_tree_entry at 0.07s.

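We can also focus on a single function. For example, after changing lookup_object, we only need to compare that function's time before and after the change and can ignore unrelated functions.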
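One way to do that comparison is to pull just that function's row out of the flat profile. The following is a minimal sketch that assumes the `gprof -b` output layout shown above. profile_row is a hypothetical helper name, and the saved file names in the comments are placeholders.

```shell
# profile_row FUNCTION
# Hypothetical helper: read gprof output on stdin and print only the
# flat-profile row whose last field is exactly FUNCTION.
profile_row() {
    awk -v fn="$1" '
        /^Flat profile:/ { in_flat = 1; next }   # flat profile starts here
        /Call graph/     { in_flat = 0 }         # stop at the call graph section
        in_flat && $NF == fn { print }
    '
}

# Typical use, once before and once after a change (placeholder file names):
#   gprof -b ./git gmon.out | profile_row lookup_object > before.txt
#   gprof -b ./git gmon.out | profile_row lookup_object > after.txt
```

Matching on the last field avoids picking up similarly named functions such as lookup_blob, and stopping at the call graph avoids the rows there that also mention the function.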

Scroll down to see the call graph:

                        Call graph

granularity: each sample hit covers 2 byte(s) for 0.71% of 1.40 seconds

index % time    self  children    called     name
                0.00    1.38       1/1           cmd_main [2]
[1]     98.6    0.00    1.38       1         handle_builtin [1]
                0.00    1.38       1/1           cmd_rev_list [3]
                0.00    0.00       1/1           check_pager_config [71]
                0.00    0.00       1/1           setup_git_directory_gently [89]
                0.00    0.00       2/2           validate_cache_entries [320]
                0.00    0.00       1/1           commit_pager_choice [326]
                0.00    0.00       1/1           get_super_prefix [353]
                0.00    0.00       1/1           trace_argv_printf_fl [426]
                0.00    0.00       1/1           trace2_cmd_name_fl [424]
                0.00    0.00       1/1           trace2_cmd_list_config_fl [422]
                0.00    0.00       1/1           trace2_cmd_list_env_vars_fl [423]
                0.00    0.00       1/1           trace2_cmd_exit_fl [421]
                0.00    0.00       1/1           get_builtin [345]
                0.00    0.00       1/1           setup_git_directory [405]
                0.00    0.00       1/1           trace_repo_setup [429]
-----------------------------------------------
                                                 <spontaneous>
[2]     98.6    0.00    1.38                 cmd_main [2]
                0.00    1.38       1/1           handle_builtin [1]
                0.00    0.00       1/1           trace_command_performance [427]
                0.00    0.00       1/1           handle_options [365]
                0.00    0.00       1/1           setup_path [408]
-----------------------------------------------
                0.00    1.38       1/1           handle_builtin [1]
[3]     98.6    0.00    1.38       1         cmd_rev_list [3]
                0.00    1.38       1/1           traverse_commit_list_filtered [5]
                0.00    0.00       1/1           repo_config [74]
                0.00    0.00       1/1           setup_revisions [80]
                0.00    0.00       1/1           prepare_revision_walk [82]
                0.00    0.00       1/1           repo_init_revisions [123]
                0.00    0.00       1/1           reflog_walk_empty [395]
                0.00    0.00       1/1           mark_edges_uninteresting [380]
                0.00    0.00       1/1           stop_progress_msg [412]
                0.00    0.00       1/1           stop_progress [411]
                0.00    0.00       1/1           git_config [355]
                0.00    0.00       1/1           init_display_notes [369]
-----------------------------------------------

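This excerpt of the call graph shows three entries, with index numbers [1], [2] and [3].

In each entry, the line that starts with the index number is the function the entry describes.

The lines above it within the entry are its direct callers (parents).

The lines below it within the entry are the functions it calls directly (children).

The drawback of this format is that the call relationships are hard to see at a glance. Fortunately, a flame graph gives a much clearer view of the same information.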

flame graph#

https://www.brendangregg.com/flamegraphs.html

First, install perf:

$ sudo apt install linux-tools-common linux-tools-$(uname -r) linux-cloud-tools-$(uname -r)

On WSL2 under Windows this does not work, because WSL2 runs a customized Linux kernel.

Instead, download the kernel source and build perf by hand:

$ sudo apt install build-essential flex bison libssl-dev libelf-dev
$ git clone https://gitee.com/mirrors/WSL2-Linux-Kernel.git
$ cd WSL2-Linux-Kernel/tools/perf
$ make
# Copy the freshly built perf into /usr/bin
$ sudo cp perf /usr/bin
# Check that it works
$ perf -v

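Commonly used perf commands:

perf list: list the events that perf can monitor

perf stat: show counts for various events over a program's run

perf record: record more detailed data, including instruction pointers and stacks, and write it to a perf.data file

perf report: read perf.data and print the profile

perf script: read perf.data and print the trace

For more, see the perf help pages.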

Then fetch the FlameGraph scripts:

$ git clone https://github.com/brendangregg/FlameGraph.git

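After cloning, nothing needs to be built; the scripts in the repository can be run directly. It includes the following stack-collapsing scripts: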

stackcollapse.pl: for DTrace stacks
stackcollapse-perf.pl: for Linux perf_events “perf script” output
stackcollapse-pmc.pl: for FreeBSD pmcstat -G stacks
stackcollapse-stap.pl: for SystemTap stacks
stackcollapse-instruments.pl: for XCode Instruments
stackcollapse-vtune.pl: for Intel VTune profiles
stackcollapse-ljp.awk: for Lightweight Java Profiler
stackcollapse-jstack.pl: for Java jstack(1) output
stackcollapse-gdb.pl: for gdb(1) stacks
stackcollapse-go.pl: for Golang pprof stacks
stackcollapse-vsprof.pl: for Microsoft Visual Studio profiles

Generate the perf.data file:

$ sudo perf record -g -F 100 'test script'

-g records call stacks in the output data
-F sets the sampling frequency

Generate the SVG file:

$ sudo perf script -i perf.data | stackcollapse-perf.pl | flamegraph.pl > out.svg

Open the resulting file to see the flame graph.
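The intermediate output of stackcollapse-perf.pl is also useful on its own. Each line is one call stack written as semicolon-separated frames, followed by a space and a sample count. As a rough sketch, the helper below sums the samples per leaf frame, which gives a text ranking of the functions that were on-CPU most often. hot_leaves is a hypothetical name, and the stack contents in the usage comment are only illustrative.

```shell
# hot_leaves
# Hypothetical helper: read folded stacks ("frame1;frame2;...;leaf COUNT")
# on stdin and print "TOTAL leaf" lines, largest total first.
hot_leaves() {
    awk '{
        count = $NF
        stack = $0
        sub(/ [0-9]+$/, "", stack)       # drop the trailing sample count
        n = split(stack, frames, ";")    # frames[n] is the leaf (on-CPU) frame
        total[frames[n]] += count
    }
    END { for (f in total) print total[f], f }' | sort -rn
}

# Possible use, assuming the FlameGraph scripts are on PATH:
#   sudo perf script -i perf.data | stackcollapse-perf.pl | hot_leaves | head
```

This only counts time spent in a function itself, much like the self seconds column in gprof. The flame graph remains the better view for seeing which callers lead there.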

Testing#

Git uses its own test framework, whose output follows the TAP format (Test Anything Protocol):

http://testanything.org

In this model, a program that emits TAP output is called a TAP producer, and a program that reads TAP output is called a TAP consumer.
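To make the producer side concrete, here is a minimal sketch of a TAP producer in plain shell. It is not Git's test library. tap_ok and tap_plan are hypothetical helper names, and only the basic "ok" / "not ok" lines and the trailing plan line of TAP are shown.

```shell
# Minimal TAP producer sketch (hypothetical helpers, not Git's test-lib.sh).
tap_count=0

# tap_ok DESCRIPTION COMMAND [ARGS...]
# Run the command and emit one TAP result line based on its exit status.
tap_ok() {
    desc=$1
    shift
    tap_count=$((tap_count + 1))
    if "$@" >/dev/null 2>&1; then
        echo "ok $tap_count - $desc"
    else
        echo "not ok $tap_count - $desc"
    fi
}

# Emit the plan line, which tells the consumer how many tests were expected.
tap_plan() {
    echo "1..$tap_count"
}

tap_ok "true succeeds" true
tap_ok "false succeeds" false
tap_plan
```

Running this prints "ok 1 - true succeeds", then "not ok 2 - false succeeds", then the plan line "1..2". A TAP consumer reads exactly these lines to decide which tests passed.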

All of Git's tests are shell scripts in the t/ directory. Running them is simple: just run make.

$ cd t
# Run all tests
$ make
# Or run a single test script directly
$ sh ./t0000-basic.sh
# Or
$ ./t0000-basic.sh

Linux systems also ship the prove tool, which can run TAP tests and has many useful options.

This makes Git's test framework very flexible to use, and it gave rise to a standalone framework, sharness, which is worth a look if you are interested.