Profiling WebGPU with PIX


Note: These steps are largely adapted and expanded from this incredibly helpful gist by Popov72 and the debug marker documentation from the Dawn repo.

Original article: Profiling WebGPU with PIX | Toji.dev

On Windows, where WebGPU runs on the D3D12 backend, use PIX. For the Vulkan backend you can use RenderDoc, and for the Metal backend you can use GPU Frame Capture.

Microsoft PIX

PIX is a tool from Microsoft for graphics debugging of D3D11/12 applications. It’s a well known, well supported, professional tool for detailed graphics debugging.

Since Chrome doesn’t operate like your average game, it takes a little bit more jumping through hoops than usual to get PIX to play nicely with Chrome and capture our WebGPU pages, but the insight it can offer once you do is well worthwhile!

Which Chrome channel?

First you should figure out which install of Chrome you’re going to be profiling. It’s not uncommon for developers to have multiple Chrome channels installed on their device, so take a moment to double check that you’re using the right one.

The default install directories for each channel are:

  • Chrome Stable: C:\Program Files\Google\Chrome\Application

  • Chrome Beta: C:\Program Files\Google\Chrome Beta\Application

  • Chrome Dev: C:\Program Files\Google\Chrome Dev\Application

  • Chrome Canary: C:\Users\<username>\AppData\Local\Google\Chrome SxS\Application

We’ll refer to whichever one you’re using as <Chrome Dir>.
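If you end up scripting any of the later steps, it can help to keep the channel-to-directory mapping in one place. The following is only a minimal sketch: the function name `chromeDir` and the channel keys are hypothetical, and the paths are the default install locations listed above, which may differ on a customized install.

```javascript
// Default Chrome install directories on Windows, as listed above.
// The channel keys ('stable', 'beta', ...) are names made up for this sketch.
function chromeDir(channel, username = '<username>') {
  const dirs = new Map([
    ['stable', 'C:\\Program Files\\Google\\Chrome\\Application'],
    ['beta', 'C:\\Program Files\\Google\\Chrome Beta\\Application'],
    ['dev', 'C:\\Program Files\\Google\\Chrome Dev\\Application'],
    // Canary installs per-user, so its path depends on the Windows user name.
    ['canary', `C:\\Users\\${username}\\AppData\\Local\\Google\\Chrome SxS\\Application`],
  ]);
  const dir = dirs.get(channel);
  if (dir === undefined) {
    throw new Error(`Unknown Chrome channel: ${channel}`);
  }
  return dir;
}
```

Calling `chromeDir('canary', 'me')` would give the Canary path under `C:\Users\me`, while an unrecognized channel name throws instead of silently returning a wrong directory.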

Also make sure you completely exit any running instance of the browser channel that you’re going to be profiling. By default, Chrome can keep running in the background even after you close it in order to do things like provide push notifications, but if it’s already running you’ll have a much harder time attaching PIX to it properly. Usually the easiest way to exit Chrome fully is to click the Chrome icon in the Windows status bar, then click “Exit”.

Fully exiting Chrome from the Windows status bar

Enabling Debug Markers

If you want to be able to see labels from WebGPU’s pushDebugGroup()/popDebugGroup() calls (and trust me, you do!) then you’ll need to do one more bit of setup to enable Debug Markers before we launch into PIX itself.

  • Visit https://www.nuget.org/packages/WinPixEventRuntime and click the “Download package” link.

  • Rename the extension of the downloaded file from .nupkg to .zip, so you can open it in Explorer.

  • Find and copy the bin\x64\WinPixEventRuntime.dll file.

  • Paste it into the <Chrome Dir>\<Version Number> folder associated with the browser version you’re running.

    • For example: C:\Program Files\Google\Chrome\Application\121.0.6167.85

If there is more than one version folder in <Chrome Dir>, it’s usually a safe bet to pick the one with the highest version number, but you can run that Chrome channel and visit chrome://version to be sure. Also, whenever your browser install updates you’ll need to copy the DLL into the new version folder again. For Chrome Canary that can be multiple times per day!
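Picking "the highest version number" is easy to get wrong with a plain string sort, since "99…" sorts after "121…" alphabetically. As a rough sketch of how a script could choose the folder, assuming it has already listed the entry names in <Chrome Dir> (the function names here are hypothetical, and the second version number in the usage line is made up for illustration):

```javascript
// Compare dotted version strings such as "121.0.6167.85" numerically,
// component by component, instead of alphabetically.
function compareVersions(a, b) {
  const pa = a.split('.').map(Number);
  const pb = b.split('.').map(Number);
  for (let i = 0; i < Math.max(pa.length, pb.length); i++) {
    const diff = (pa[i] ?? 0) - (pb[i] ?? 0);
    if (diff !== 0) return diff;
  }
  return 0;
}

// Given the entry names found in <Chrome Dir>, return the highest version
// folder name, ignoring entries that are not four-part version numbers
// (for example "chrome.exe"). Returns null if no version folder is found.
function newestVersionFolder(entryNames) {
  const versions = entryNames.filter((name) => /^\d+(\.\d+){3}$/.test(name));
  if (versions.length === 0) return null;
  versions.sort(compareVersions);
  return versions[versions.length - 1];
}

// Usage with illustrative folder names:
const newest = newestVersionFolder(['99.0.4844.51', '121.0.6167.85', 'chrome.exe']);
// newest is '121.0.6167.85'
```

WinPixEventRuntime.dll would then go into that folder. Checking chrome://version remains the more reliable way to confirm which version is actually running.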

Running Chrome in PIX

Now it’s time to start debugging with PIX. If you haven’t already, download and install it, then follow these steps:

  • Launch the PIX application as an administrator

    • Right click on the app icon or search for the app in the Start Menu and click “Run as Administrator”

    • This is necessary to get timing information from captures. If you don’t run as an administrator you may get an E_PIX_MISSING_PERFORMANCE_LOGGING_PERMISSIONS error when trying to view timing data.

Launching PIX as administrator

  • In the “Select Target Process” panel, click the “Launch Win32” tab and set the following fields:

    • Path to Executable: <Chrome Dir>\chrome.exe

    • Working Directory: <Chrome Dir>

    • Command Line Arguments: --disable-gpu-sandbox --disable-gpu-watchdog

    • If you want to see readable shader code in the capture, add --enable-dawn-features=emit_hlsl_debug_symbols,disable_symbol_renaming to the Command Line Arguments as well.

    • You may also want to put the address of the page you’re profiling at the end of the command line arguments string, so that Chrome navigates to it automatically on launch.

  • Make sure “Launch for GPU capture” is checked

  • Click “Launch”

Example PIX launch args
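If you switch between configurations often, the argument string can be assembled programmatically and pasted into the “Command Line Arguments” field. This is just a sketch: the function name `pixChromeArgs` and its option names (`readableShaders`, `url`) are hypothetical, and only the flags themselves come from the steps above.

```javascript
// Build the "Command Line Arguments" string for PIX's launch panel.
// Only the flags are from the steps above; the option names are made up.
function pixChromeArgs({ readableShaders = false, url = '' } = {}) {
  // These two flags are always needed so PIX can attach to the GPU process.
  const args = ['--disable-gpu-sandbox', '--disable-gpu-watchdog'];
  if (readableShaders) {
    // Optional: ask Dawn to keep shader code readable in the capture.
    args.push('--enable-dawn-features=emit_hlsl_debug_symbols,disable_symbol_renaming');
  }
  if (url) {
    // The page address goes last so Chrome navigates to it on launch.
    args.push(url);
  }
  return args.join(' ');
}
```

For example, `pixChromeArgs({ readableShaders: true, url: 'https://example.com/' })` (with a placeholder address) yields the two required flags, the Dawn feature flag, and the URL, separated by spaces.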

This will launch the browser. Navigate to the WebGPU content you want to profile, then go back to PIX and find the “GPU Capture” panel. Here you’ll want to expand the “Options” dropdown and ensure that the “Capture Frame count” is something like 2-4. This is because, in addition to capturing calls from your WebGPU application, PIX will also see rendering commands from the browser compositor. Setting a higher number of frames increases the chances that you’ll capture the WebGPU content that you intend to.

Example PIX GPU Capture args

Note: If the GPU Capture panel says “No target process has been selected”, that’s probably because the Chrome channel you picked was still running in the background. As explained above, make sure it’s fully exited before launching from PIX.

Click the camera icon to take a capture. If it captures the browser commands correctly, you’ll see a screenshot of the captured content appear. Click it and it’ll take you to the capture details, where you should be able to see every GPU command executed by your application.

Please keep in mind, however, that this capture represents the D3D12 commands that your WebGPU code was translated into, not the WebGPU commands directly. That means the function names will be different, the arguments may not always line up 1:1, and there will be additional commands in this stream that don’t directly correspond to any WebGPU calls that you made. (This is especially true of resource barriers, which are handled implicitly by your WebGPU implementation.) Usually, though, it’s not too hard to make connections between the commands you see here and the commands your page issues.

Example PIX Capture overview

From here there’s a lot you can do, such as view the values in buffers, see the state of a pass when a draw was issued, or see render targets be built up step-by-step. It’s an incredibly powerful tool, and I don’t have the time to cover even a fraction of what it can do here. For that you’re better off looking through the official PIX docs.

Timelines and Debug Groups

One thing I do want to highlight, however, is that you can see a timing breakdown of all the commands listed in the capture by clicking the “Click Here to start analysis and collect timing data.” link in the bottom panel. This will produce a timeline that looks something like this:

Example PIX timeline

You can mouse over or click on any of the bars here to see how long a given operation took on the GPU, as well as get a sense for the concurrency at play by seeing how the bars overlap. Of course, it’s easy to get lost in a sea of draw calls if you don’t have any indication as to what they’re for, and that’s where the Debug Markers mentioned previously come in! You can see in the above screenshot a set of black bars with plain text labels like “Render Bloom”, “Render Shadows”, and “Forward Render Pass”. Those are labels that I set in my WebGPU code, which (because I followed the steps in the “Enabling Debug Markers” section of this doc) are being passed all the way down to this native debugging tool.

The code in my page looks something like this:

const commandEncoder = device.createCommandEncoder({});
commandEncoder.pushDebugGroup('Render Bloom');

  commandEncoder.pushDebugGroup('1st Pass (Horizontal Blur)');
    let passEncoder = commandEncoder.beginRenderPass(/*...*/);
    passEncoder.setPipeline(this.blurHorizonalPipeline);
    passEncoder.setBindGroup(0, this.pass0BindGroup);
    passEncoder.draw(3);
    passEncoder.end();
  commandEncoder.popDebugGroup();

  commandEncoder.pushDebugGroup('2nd Pass (Vertical Blur)');
    passEncoder = commandEncoder.beginRenderPass(/*...*/);
    passEncoder.setPipeline(this.blurVerticalPipeline);
    passEncoder.setBindGroup(0, this.pass1BindGroup);
    passEncoder.draw(3);
    passEncoder.end();
  commandEncoder.popDebugGroup();

  // Blend pass
  commandEncoder.pushDebugGroup('Blend Pass');
    passEncoder = commandEncoder.beginRenderPass(/*...*/);
    passEncoder.setPipeline(this.blendPipeline);
    passEncoder.setBindGroup(0, this.blendPassBindGroups);
    passEncoder.draw(3);
    passEncoder.end();
  commandEncoder.popDebugGroup();

commandEncoder.popDebugGroup();

device.queue.submit([commandEncoder.finish()]);

Note that the debug groups can be nested! (See WebGPU error handling best practices for more information on using debug groups and labels.)

You can see each debug group I set here both in the timeline and in the list of captured commands, where each group can be collapsed and expanded for easier navigation. This makes it far easier to correlate the commands you see here with the ones being made by your page, and helps highlight where the majority of the GPU time is going in your page.
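Since every pushDebugGroup() needs a matching popDebugGroup() (as far as I know, WebGPU reports a validation error if groups are left unbalanced when the encoder is finished), a small wrapper can keep the pairs together. This is only one possible sketch, not how my page does it: `withDebugGroup` and `createRecordingEncoder` are hypothetical helpers, and the recording encoder is a stand-in so the example runs without a GPU.

```javascript
// Runs `fn` between pushDebugGroup()/popDebugGroup() on `encoder`.
// The pop happens in `finally`, so groups stay balanced even if `fn` throws.
function withDebugGroup(encoder, label, fn) {
  encoder.pushDebugGroup(label);
  try {
    return fn();
  } finally {
    encoder.popDebugGroup();
  }
}

// A stand-in for a real command encoder that only records the debug group
// calls made on it. It exists purely for demonstration.
function createRecordingEncoder() {
  const calls = [];
  return {
    calls,
    pushDebugGroup(label) { calls.push(`push ${label}`); },
    popDebugGroup() { calls.push('pop'); },
  };
}

// Usage: nested groups, mirroring the structure of the bloom example above.
const demoEncoder = createRecordingEncoder();
withDebugGroup(demoEncoder, 'Render Bloom', () => {
  withDebugGroup(demoEncoder, 'Blend Pass', () => {
    // beginRenderPass(), draw(), end() would be encoded here.
  });
});
// demoEncoder.calls is now:
// ['push Render Bloom', 'push Blend Pass', 'pop', 'pop']
```

With a real device you would pass the object returned by device.createCommandEncoder() (or a pass encoder) in place of the recording stand-in. The nesting in the capture then follows the nesting of the callbacks.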

“The requested resource is in use”

One last tip I’ll leave you with is that if you start getting errors from PIX stating “The requested resource is in use” when trying to capture timing information, try temporarily disabling “Real-time protection” in Windows’ “Virus and Threat protection settings”. This doesn’t seem to be a super common problem, but it’s one that I lost an hour or two of troubleshooting to, so I figured it was worth a mention here!

Good luck!

This was just a brief overview of getting started with PIX, but I hope you found it useful. Graphics debuggers like this are a fantastic way to gain insight into the behaviors and performance of your WebGPU application, and I encourage everyone to make good use of them!