Title: [Visual Leak Detector] Core Source Code Analysis (VLD 1.0)

Author: 河曲智叟    Date: 2023-4-28 00:06

Notes

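These are study notes compiled while using the VLD memory leak detection tool during development. This post analyzes how the VLD 1.0 source code detects memory leaks. The index for the other posts in this series is the "Memory Leak Detection Tools" table of contents.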

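1. Obtaining the Source Code

Version 1.0 and all earlier versions use the old detection approach. The version 1.0 source can be downloaded from CodeProject-Visual-Leak-Detector (a mirror for readers in mainland China: Baidu Netdisk, vld-1.0 source package), where the library's author, Dan Moulding, also describes the old detection principle. The site carries the passage shown in the figure below, but despite some searching I could not find an article by Dan Moulding describing the newer detection principle, so this post analyzes only the version 1.0 source.

Including comments, the version 1.0 source is under 3000 lines, and the comments are detailed, so interested readers are encouraged to read it closely. The following material may help in understanding how the detection works: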
2. Source File Overview

The version 1.0 source package contains 11 files, laid out as follows:

    vld-10-src
        CHANGES.txt
        COPYING.txt
        README.html
        vld.cpp
        vld.dsp
        vld.h
        vldapi.cpp
        vldapi.h
        vldint.h
        vldutil.cpp
        vldutil.h

There are 3 .cpp files, 4 .h files, 2 .txt files, 1 .dsp file, and 1 .html file. The purpose of each file is summarized below:
3. Source Code Analysis

3.1 Registering a Custom AllocHook Function

The #pragma init_seg (compiler) directive is used to define a global object, visualleakdetector, so that its constructor runs before that of any user global object (see vld.cpp, lines 49-55).

    // The one and only VisualLeakDetector object instance. This is placed in the
    // "compiler" initialization area, so that it gets constructed during C runtime
    // initialization and before any user global objects are constructed. Also,
    // disable the warning about us using the "compiler" initialization area.
    #pragma warning (disable:4074)
    #pragma init_seg (compiler)
    VisualLeakDetector visualleakdetector;
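#pragma init_seg is specific to MSVC, but the reason it matters rests on a standard C++ rule: objects are destroyed in the reverse order of their construction, so whatever is constructed first is destroyed last. A minimal portable sketch of that rule, using automatic objects and a hypothetical Tracer type (not part of VLD):

```cpp
#include <cassert>
#include <string>
#include <utility>
#include <vector>

// Hypothetical helper type: records its own construction and destruction.
struct Tracer {
    Tracer(std::vector<std::string>& log, std::string name)
        : m_log(log), m_name(std::move(name)) {
        m_log.push_back("ctor " + m_name);
    }
    ~Tracer() { m_log.push_back("dtor " + m_name); }

    std::vector<std::string>& m_log;
    std::string m_name;
};

// Builds two objects in order, lets them go out of scope, and returns the log.
std::vector<std::string> lifetimeOrder() {
    std::vector<std::string> log;
    {
        Tracer detector(log, "detector"); // constructed first
        Tracer user(log, "user");         // constructed second
    } // destroyed in reverse order: user, then detector
    return log;
}
```

The same ordering applies to objects with static storage duration, which is why constructing the detector before any user global also means it is destroyed after them.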
In its constructor, the global visualleakdetector object calls _CrtSetAllocHook to register a custom AllocHook function, which lets the program capture every subsequent memory operation (allocation or release); see vld.cpp, lines 57-95.

    // Constructor - Dynamically links with the Debug Help Library and installs the
    //   allocation hook function so that the C runtime's debug heap manager will
    //   call the hook function for every heap request.
    //
    VisualLeakDetector::VisualLeakDetector ()
    {
        ...
        if (m_tlsindex == TLS_OUT_OF_INDEXES) {
            report("ERROR: Visual Leak Detector: Couldn't allocate thread local storage.\n");
        }
        else if (linkdebughelplibrary()) {
            // Register our allocation hook function with the debug heap.
            m_poldhook = _CrtSetAllocHook(allochook);
            report("Visual Leak Detector Version "VLD_VERSION" installed ("VLD_LIBTYPE").\n");
            ...
        }

        report("Visual Leak Detector is NOT installed!\n");
    }
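The install step follows a save-and-replace pattern: the setter returns whatever hook was installed before, and the caller keeps it so that it can be restored (or chained to) later. A minimal portable sketch of that pattern follows. setHook, g_hook, detectorHook, and HookInstaller are hypothetical stand-ins and do not call the CRT.

```cpp
#include <cassert>

// Hypothetical hook signature; the real CRT allocation hook takes more parameters.
using Hook = int (*)(int type);

static Hook g_hook = nullptr; // stand-in for the runtime's current hook slot

// Mimics the contract of a "set hook" call: install the new hook, return the old one.
Hook setHook(Hook newHook) {
    Hook previous = g_hook;
    g_hook = newHook;
    return previous;
}

// Stand-in for the detector's hook: always allows the request.
int detectorHook(int) { return 1; }

// Installs detectorHook on construction and restores the previous hook on destruction.
class HookInstaller {
public:
    HookInstaller() : m_oldHook(setHook(detectorHook)) {}

    ~HookInstaller() {
        Hook replaced = setHook(m_oldHook);
        if (replaced != detectorHook) {
            // Someone else replaced our hook in the meantime; put theirs back.
            setHook(replaced);
        }
    }

private:
    Hook m_oldHook; // whatever was installed before us
};
```

Keeping the old hook is what makes it possible to leave the process in the state it was found in once the detector shuts down.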
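In addition, the visualleakdetector constructor also does the following:

3.2 Using StackWalk64 to Obtain Call Stack Information

The global visualleakdetector object has a member variable, m_mallocmap, that stores the call stack recorded at each heap allocation. It is a custom map container built on a red-black tree (similar to the STL map); its declaration and definition are in vldutil.h and vldutil.cpp.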

    ////////////////////////////////////////////////////////////////////////////////
    //
    //  The BlockMap Class
    //
    //  This data structure is similar in concept to a STL map, but is specifically
    //  tailored for use by VLD, making it more efficient than a standard STL map.
    //
    //  The purpose of the BlockMap is to map allocated memory blocks (via their
    //  unique allocation request numbers) to the call stacks that allocated them.
    //  One of the primary concerns of the BlockMap is to be able to quickly insert
    //  search and delete. For this reason, the underlying data structure is
    //  a red-black tree (a type of balanced binary tree).
    //
    //  The red-black tree is overlayed on top of larger "chunks" of pre-allocated
    //  storage. These chunks, which are arranged in a linked list, make it possible
    //  for the map to have reserve capacity, allowing it to grow dynamically
    //  without incurring a heap hit each time a new element is added to the map.
    //
    class BlockMap
    {
        ...
    };
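The chunking idea in the comment above can be sketched independently of the tree: storage for nodes is reserved a chunk at a time, and individual nodes are handed out from that reserve. The ChunkPool below is a hypothetical illustration, not VLD's BlockMap. For brevity it keeps its free list in a std::vector, which may itself allocate.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <vector>

// Hands out slots of type T from storage that is reserved ChunkSize slots at a time.
template <typename T, std::size_t ChunkSize = 64>
class ChunkPool {
public:
    // Returns a free slot, reserving a new chunk only when the reserve is empty.
    T* acquire() {
        if (m_free.empty()) {
            grow();
        }
        T* slot = m_free.back();
        m_free.pop_back();
        return slot;
    }

    // Returns a slot to the reserve so that a later acquire() can reuse it.
    void release(T* slot) { m_free.push_back(slot); }

    std::size_t chunkCount() const { return m_chunks.size(); }

private:
    void grow() {
        m_chunks.push_back(std::make_unique<T[]>(ChunkSize));
        T* base = m_chunks.back().get();
        for (std::size_t i = 0; i < ChunkSize; ++i) {
            m_free.push_back(base + i);
        }
    }

    std::vector<std::unique_ptr<T[]>> m_chunks; // owns the reserved storage
    std::vector<T*> m_free;                     // slots not currently in use
};
```

With a chunk size of 4, the first four calls to acquire() share one reservation and only the fifth triggers another.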
Every memory operation (alloc/realloc/free) automatically invokes the custom AllocHook function described above. Its definition is shown below; see vld.cpp, lines 175-260.

    // allochook - This is a hook function that is installed into Microsoft's
    //   CRT debug heap when the VisualLeakDetector object is constructed. Any time
    //   an allocation, reallocation, or free is made from/to the debug heap,
    //   the CRT will call into this hook function.
    //
    //  Note: The debug heap serializes calls to this function (i.e. the debug heap
    //    is locked prior to calling this function). So we don't need to worry about
    //    thread safety -- it's already taken care of for us.
    //
    //  - type (IN): Specifies the type of request (alloc, realloc, or free).
    //
    //  - pdata (IN): On a free allocation request, contains a pointer to the
    //      user data section of the memory block being freed. On alloc requests,
    //      this pointer will be NULL because no block has actually been allocated
    //      yet.
    //
    //  - size (IN): Specifies the size (either real or requested) of the user
    //      data section of the memory block being freed or requested. This function
    //      ignores this value.
    //
    //  - use (IN): Specifies the "use" type of the block. This can indicate the
    //      purpose of the block being requested. It can be for internal use by
    //      the CRT, it can be an application defined "client" block, or it can
    //      simply be a normal block. Client blocks are just normal blocks that
    //      have been specifically tagged by the application so that the application
    //      can separately keep track of the tagged blocks for debugging purposes.
    //
    //  - request (IN): Specifies the allocation request number. This is basically
    //      a sequence number that is incremented for each allocation request. It
    //      is used to uniquely identify each allocation.
    //
    //  - filename (IN): String containing the filename of the source line that
    //      initiated this request. This function ignores this value.
    //
    //  - line (IN): Line number within the source file that initiated this request.
    //      This function ignores this value.
    //
    //  Return Value:
    //
    //    Always returns true, unless another allocation hook function was already
    //    installed before our hook function was called, in which case we'll return
    //    whatever value the other hook function returns. Returning false will
    //    cause the debug heap to deny the pending allocation request (this can be
    //    useful for simulating out of memory conditions, but Visual Leak Detector
    //    has no need to make use of this capability).
    //
    int VisualLeakDetector::allochook (int type, void *pdata, size_t size, int use, long request, const unsigned char *file, int line)
    {
        ...
        // Call the appropriate handler for the type of operation.
        switch (type) {
        case _HOOK_ALLOC:
            visualleakdetector.hookmalloc(request);
            break;
        case _HOOK_FREE:
            visualleakdetector.hookfree(pdata);
            break;
        case _HOOK_REALLOC:
            visualleakdetector.hookrealloc(pdata, request);
            break;
        default:
            visualleakdetector.report("WARNING: Visual Leak Detector: in allochook(): Unhandled allocation type (%d).\n", type);
            break;
        }
        ...
    }
Among this function's input parameters is request, which serves as the unique identifier of the allocated block, that is, the key in m_mallocmap. The function body handles each kind of memory event accordingly; hookmalloc(), hookfree(), and hookrealloc() are defined in vld.cpp, lines 594-660.

    void VisualLeakDetector::hookfree (const void *pdata)
    {
        long request = pHdr(pdata)->lRequest;
        m_mallocmap->erase(request);
    }
    void VisualLeakDetector::hookmalloc (long request)
    {
        CallStack *callstack;
        if (!enabled()) {
            // Memory leak detection is disabled. Don't track allocations.
            return;
        }
        callstack = m_mallocmap->insert(request);
        getstacktrace(callstack);
    }
    void VisualLeakDetector::hookrealloc (const void *pdata, long request)
    {
        // Do a free, then do a malloc.
        hookfree(pdata);
        hookmalloc(request);
    }
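Taken together, the three handlers keep the map equal to the set of live allocations: whatever is still in the map at exit was never freed. A minimal sketch of that bookkeeping follows, using std::map in place of BlockMap and a caller-supplied request number on free (the real hookfree() recovers it from the block header). AllocationTracker and HookType are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

enum class HookType { Alloc, Realloc, Free };

using CallStack = std::vector<std::uintptr_t>;

class AllocationTracker {
public:
    // Dispatches one heap event. oldRequest identifies the block being released
    // (Free, Realloc); newRequest identifies the block being created (Alloc, Realloc).
    void onHeapEvent(HookType type, long oldRequest, long newRequest) {
        switch (type) {
        case HookType::Alloc:
            m_live[newRequest] = CallStack{};
            break;
        case HookType::Free:
            m_live.erase(oldRequest);
            break;
        case HookType::Realloc:
            // A reallocation is treated as a free followed by an allocation.
            m_live.erase(oldRequest);
            m_live[newRequest] = CallStack{};
            break;
        }
    }

    // Request numbers that were allocated but never freed.
    std::vector<long> leakedRequests() const {
        std::vector<long> requests;
        for (const auto& entry : m_live) {
            requests.push_back(entry.first);
        }
        return requests;
    }

private:
    std::map<long, CallStack> m_live; // request number -> recorded call stack
};
```

In this sketch the call stacks are left empty; in VLD they are filled in at allocation time so that the report can show where each leaked block came from.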
(1) If new memory is being allocated, VLD uses inline assembly to obtain the current program address, takes it as the starting value, and calls StackWalk64 in a loop to collect the full call stack, a CallStack holding the address of each instruction on the stack; see getstacktrace() in vld.cpp, lines 530-592. The call stack is stored in m_mallocmap under the request value. In the code below, pStackWalk64 is a function pointer to the StackWalk64 function in dbghelp.dll.

    void VisualLeakDetector::getstacktrace (CallStack *callstack)
    {
        DWORD        architecture;
        CONTEXT      context;
        unsigned int count = 0;
        STACKFRAME64 frame;
        DWORD_PTR    framepointer;
        DWORD_PTR    programcounter;
        // Get the required values for initialization of the STACKFRAME64 structure
        // to be passed to StackWalk64(). Required fields are AddrPC and AddrFrame.
    #if defined(_M_IX86) || defined(_M_X64)
        architecture = X86X64ARCHITECTURE;
        programcounter = getprogramcounterx86x64();
        __asm mov [framepointer], BPREG // Get the frame pointer (aka base pointer)
    #else
    // If you want to retarget Visual Leak Detector to another processor
    // architecture then you'll need to provide architecture-specific code to
    // retrieve the current frame pointer and program counter in order to initialize
    // the STACKFRAME64 structure below.
    #error "Visual Leak Detector is not supported on this architecture."
    #endif // defined(_M_IX86) || defined(_M_X64)
        // Initialize the STACKFRAME64 structure.
        memset(&frame, 0x0, sizeof(frame));
        frame.AddrPC.Offset    = programcounter;
        frame.AddrPC.Mode      = AddrModeFlat;
        frame.AddrFrame.Offset = framepointer;
        frame.AddrFrame.Mode   = AddrModeFlat;
        // Walk the stack.
        while (count < _VLD_maxtraceframes) {
            count++;
            if (!pStackWalk64(architecture, m_process, m_thread, &frame, &context,
                              NULL, pSymFunctionTableAccess64, pSymGetModuleBase64, NULL)) {
                // Couldn't trace back through any more frames.
                break;
            }
            if (frame.AddrFrame.Offset == 0) {
                // End of stack.
                break;
            }
            // Push this frame's program counter onto the provided CallStack.
            callstack->push_back((DWORD_PTR)frame.AddrPC.Offset);
        }
    }
The inline assembly that obtains the current program address is in getprogramcounterx86x64(), vld.cpp, lines 501-528, shown below. It works by reading this function's own return address.

    // getprogramcounterx86x64 - Helper function that retrieves the program counter
    //   (aka the EIP (x86) or RIP (x64) register) for getstacktrace() on Intel x86
    //   or x64 architectures (x64 supports both AMD64 and Intel EM64T). There is no
    //   way for software to directly read the EIP/RIP register. But it's value can
    //   be obtained by calling into a function (in our case, this function) and
    //   then retrieving the return address, which will be the program counter from
    //   where the function was called.
    //
    //  Note: Inlining of this function must be disabled. The whole purpose of this
    //    function's existence depends upon it being a *called* function.
    //
    //  Return Value:
    //
    //    Returns the caller's program address.
    //
    #if defined(_M_IX86) || defined(_M_X64)
    #pragma auto_inline(off)
    DWORD_PTR VisualLeakDetector::getprogramcounterx86x64 ()
    {
        DWORD_PTR programcounter;
        __asm mov AXREG, [BPREG + SIZEOFPTR] // Get the return address out of the current stack frame
        __asm mov [programcounter], AXREG    // Put the return address into the variable we'll return
        return programcounter;
    }
    #pragma auto_inline(on)
    #endif // defined(_M_IX86) || defined(_M_X64)
(2) If old memory is being freed, the block's request value and CallStack are removed from m_mallocmap; see hookfree().
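The stack walk in step (1) has a simple shape: start from one frame, record its program counter, step to the caller's frame, and stop at a frame limit or at the end of the chain. The sketch below reproduces only that shape over a simulated frame chain. Frame and walkFrames are hypothetical; the real code delegates each step to StackWalk64.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Simulated stack frame: a link to the caller's frame plus a return address.
struct Frame {
    const Frame*   caller;
    std::uintptr_t returnAddress;
};

// Collects return addresses from the innermost frame outward, stopping at
// maxFrames entries or when the chain ends (caller == nullptr).
std::vector<std::uintptr_t> walkFrames(const Frame* frame, std::size_t maxFrames) {
    std::vector<std::uintptr_t> callstack;
    while (frame != nullptr && callstack.size() < maxFrames) {
        callstack.push_back(frame->returnAddress);
        frame = frame->caller;
    }
    return callstack;
}
```

The frame limit plays the same role as _VLD_maxtraceframes in the original loop: it bounds the cost of every tracked allocation.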
3.3 Walking the Doubly Linked List to Generate the Leak Report

When the program exits, the destructor of the global visualleakdetector object is called last, because destruction order is the reverse of construction order. The destructor (see vld.cpp, lines 97-173) mainly does the following:
(1) Unregister the custom AllocHook function.

    // Unregister the hook function.
    pprevhook = _CrtSetAllocHook(m_poldhook);
    if (pprevhook != allochook) {
        // WTF? Somebody replaced our hook before we were done. Put theirs
        // back, but notify the human about the situation.
        _CrtSetAllocHook(pprevhook);
        report("WARNING: Visual Leak Detector: The CRT allocation hook function was unhooked prematurely!\n"
               "    There's a good possibility that any potential leaks have gone undetected!\n");
    }
(2) Generate the leak report. See reportleaks(), vld.cpp, lines 802-962. The report is produced as follows:
(3) Unload the dbghelp.dll library.

    // Unload the Debug Help Library.
    FreeLibrary(m_dbghelp);
(4) Leak self-check. VLD walks the doubly linked list that the system (the CRT debug heap) uses to manage memory blocks and determines whether VLD itself has leaked memory, again based on each node's nBlockUse value.

    // Do a memory leak self-check.
    pheap = new char;
    pheader = pHdr(pheap)->pBlockHeaderNext;
    delete pheap;
    while (pheader) {
        if (_BLOCK_SUBTYPE(pheader->nBlockUse) == VLDINTERNALBLOCK) {
            // Doh! VLD still has an internally allocated block!
            // This won't ever actually happen, right guys?... guys?
            internalleaks++;
            leakfile = pheader->szFileName;
            leakline = pheader->nLine;
            report("ERROR: Visual Leak Detector: Detected a memory leak internal to Visual Leak Detector!!\n");
            report("---------- Block %ld at "ADDRESSFORMAT": %u bytes ----------\n", pheader->lRequest, pbData(pheader), pheader->nDataSize);
            report("%s (%d): Full call stack not available.\n", leakfile, leakline);
            dumpuserdatablock(pheader);
            report("\n");
        }
        pheader = pheader->pBlockHeaderNext;
    }
    if (_VLD_configflags & VLD_CONFIG_SELF_TEST) {
        if ((internalleaks == 1) && (strcmp(leakfile, m_selftestfile) == 0) && (leakline == m_selftestline)) {
            report("Visual Leak Detector passed the memory leak self-test.\n");
        }
        else {
            report("ERROR: Visual Leak Detector: Failed the memory leak self-test.\n");
        }
    }
(5) Print a message confirming that VLD has been unloaded. This is the last statement before the destructor's closing brace }.

    report("Visual Leak Detector is now exiting.\n");
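The self-check in step (4) is a filtered walk over a linked list of block headers. A minimal sketch over a simulated list follows. BlockHeader is a hypothetical, much reduced stand-in for _CrtMemBlockHeader, kInternalSubtype mirrors the VLDINTERNALBLOCK value, and an unsigned field is assumed so that the shifts stay well defined.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Reduced stand-in for a debug-heap block header: a forward link and a use tag.
struct BlockHeader {
    const BlockHeader* next;
    std::uint32_t      blockUse; // low 16 bits: type, high 16 bits: subtype
};

constexpr std::uint32_t kInternalSubtype = 0xbf42;

constexpr std::uint32_t subtypeOf(std::uint32_t blockUse) {
    return (blockUse >> 16) & 0xFFFFu;
}

// Counts the blocks in the list whose subtype marks them as detector-internal.
std::size_t countInternalBlocks(const BlockHeader* header) {
    std::size_t count = 0;
    for (; header != nullptr; header = header->next) {
        if (subtypeOf(header->blockUse) == kInternalSubtype) {
            ++count;
        }
    }
    return count;
}
```

A non-zero count at this point would mean the detector itself still owns heap blocks, which is what the original code reports as an internal leak.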
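4. Other Questions

4.1 How to Tell Where an Allocation Came From

The _CrtMemBlockHeader structure has an nBlockUse member that identifies what a block is used for, and the caller can set this value. VLD takes advantage of this by overloading the memory allocation functions it uses internally, so that every memory request made inside the library sets nBlockUse to a VLD-specific tag; see vldutil.h, lines 49-153.
(1) At allocation time, the core code is as follows; the second argument is the nBlockUse value being set:

    void *pdata = _malloc_dbg(size, _CRT_BLOCK | (VLDINTERNALBLOCK << 16), file, line);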
(2) When a block is examined, its origin is determined from nBlockUse as follows:

    // Check whether the block was allocated by the CRT or by VLD
    if (_BLOCK_TYPE(pheader->nBlockUse) == _CRT_BLOCK) {
        ...
    }
    // Check whether the block was allocated by VLD
    if (_BLOCK_SUBTYPE(pheader->nBlockUse) == VLDINTERNALBLOCK) {
        ...
    }

(3) The macros involved are defined as follows.
In crtdbg.h:

    #define _BLOCK_TYPE(block)          (block & 0xFFFF)
    #define _BLOCK_SUBTYPE(block)       (block >> 16 & 0xFFFF)
    // Memory block identification
    #define _FREE_BLOCK      0
    #define _NORMAL_BLOCK    1
    #define _CRT_BLOCK       2
    #define _IGNORE_BLOCK    3
    #define _CLIENT_BLOCK    4
    #define _MAX_BLOCKS      5

In vldutil.h:

    #define VLDINTERNALBLOCK   0xbf42    // VLD internal memory block subtype
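The tagging scheme packs two 16-bit fields into one integer: the block type in the low half and an owner-defined subtype in the high half. A minimal sketch follows, assuming an unsigned 32-bit carrier to keep the shifts well defined (the CRT field is a plain int). makeBlockUse, blockType, and blockSubtype are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

constexpr std::uint32_t kCrtBlock         = 2;      // same value as _CRT_BLOCK
constexpr std::uint32_t kVldInternalBlock = 0xbf42; // same value as VLDINTERNALBLOCK

// Packs a block type (low 16 bits) and a subtype (high 16 bits) into one value.
constexpr std::uint32_t makeBlockUse(std::uint32_t type, std::uint32_t subtype) {
    return (type & 0xFFFFu) | ((subtype & 0xFFFFu) << 16);
}

// Extracts the block type from the low 16 bits.
constexpr std::uint32_t blockType(std::uint32_t blockUse) {
    return blockUse & 0xFFFFu;
}

// Extracts the subtype from the high 16 bits.
constexpr std::uint32_t blockSubtype(std::uint32_t blockUse) {
    return (blockUse >> 16) & 0xFFFFu;
}
```

Because the type field still reads as a CRT block, code that only looks at the low 16 bits treats such a block as runtime-internal, while the subtype lets VLD recognize its own allocations.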
4.2 How Multithreaded Detection Works

VLD uses thread local storage (TLS); see Microsoft-Using-Thread-Local-Storage. The global visualleakdetector object has a member variable, m_tlsindex (see vldint.h, line 146):

    DWORD m_tlsindex;     // Index for thread-local storage of VLD data

This variable receives the index returned by TlsAlloc() and is initialized in the visualleakdetector constructor; see vld.cpp, line 69 and lines 77-79:

    m_tlsindex = TlsAlloc();
    ...

    if (m_tlsindex == TLS_OUT_OF_INDEXES) {
        report("ERROR: Visual Leak Detector: Couldn't allocate thread local storage.\n");
    }

Once initialization succeeds, any thread in the process can use this index to store and retrieve a value local to that thread. Threads do not affect one another, and the value a thread reads is independent of every other thread, so the index can hold VLD's on/off state for each thread. Allocating new memory triggers hookmalloc(), which runs in the thread that made the allocation; see vld.cpp, lines 611-636:

    void VisualLeakDetector::hookmalloc (long request)
    {
        CallStack *callstack;
        if (!enabled()) {
            // Memory leak detection is disabled. Don't track allocations.
            return;
        }
        callstack = m_mallocmap->insert(request);
        getstacktrace(callstack);
    }

(1) Check whether VLD is enabled for the current thread. enabled() calls TlsGetValue() to read the thread's local value and uses it to decide whether leak detection is on. On the first access (when TlsGetValue() returns VLD_TLS_UNINITIALIZED), it initializes the thread's local value with TlsSetValue() according to the user's configuration.

    // enabled - Determines if memory leak detection is enabled for the current
    //   thread.
    //
    //  Return Value:
    //
    //    Returns true if Visual Leak Detector is enabled for the current thread.
    //    Otherwise, returns false.
    //
    bool VisualLeakDetector::enabled ()
    {
        unsigned long status;
        status = (unsigned long)TlsGetValue(m_tlsindex);
        if (status == VLD_TLS_UNINITIALIZED) {
            // TLS is uninitialized for the current thread. Use the initial state.
            if (_VLD_configflags & VLD_CONFIG_START_DISABLED) {
                status = VLD_TLS_DISABLED;
            }
            else {
                status = VLD_TLS_ENABLED;
            }
            // Initialize TLS for this thread.
            TlsSetValue(m_tlsindex, (LPVOID)status);
        }
        return (status & VLD_TLS_ENABLED) ? true : false;
    }

(2) Set VLD's on/off state for the current thread. Two public interface functions do this; they are defined in vldapi.cpp, lines 31-57, and simply call TlsSetValue() to set the corresponding value.
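The per-thread switch can be sketched portably with C++11 thread_local, which gives each thread its own copy of a variable much as a TLS index does. This is an illustration under that substitution, not VLD's code: TrackingState, g_startDisabled, t_state, trackingEnabled, and setTracking are hypothetical names.

```cpp
#include <cassert>

enum class TrackingState { Uninitialized, Enabled, Disabled };

// Process-wide default, standing in for a "start disabled" configuration flag.
static bool g_startDisabled = false;

// One copy per thread; every thread starts out Uninitialized.
static thread_local TrackingState t_state = TrackingState::Uninitialized;

// Lazily initializes the calling thread's state on first use, then reports it.
bool trackingEnabled() {
    if (t_state == TrackingState::Uninitialized) {
        t_state = g_startDisabled ? TrackingState::Disabled : TrackingState::Enabled;
    }
    return t_state == TrackingState::Enabled;
}

// Switches tracking on or off for the calling thread only.
void setTracking(bool enabled) {
    t_state = enabled ? TrackingState::Enabled : TrackingState::Disabled;
}
```

Calling setTracking(false) in one thread leaves every other thread's state untouched, which is the property that lets a program silence leak tracking around a single thread's work.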