[SYCL][Driver] Enable time tracing capability for SYCL applications. #21207

Draft
srividya-sundaram wants to merge 17 commits into intel:sycl from srividya-sundaram:enable-time-trace-sycl

@srividya-sundaram
Contributor

srividya-sundaram commented Feb 3, 2026

Enable Clang driver to generate JSON time-trace output and propagate time-trace options to the compilation commands.

srividya-sundaram marked this pull request as ready for review February 3, 2026 23:22
srividya-sundaram requested a review from a team as a code owner February 3, 2026 23:22
@Maetveis
Contributor

Maetveis commented Feb 9, 2026

Hi @srividya-sundaram, this is an area I've explored previously, and I remember that -ftime-trace has usability problems when it comes to offloading. At that time I was looking at CUDA offloading, but I think they also apply to SYCL.

Have you checked the comments at https://reviews.llvm.org/D150282 and https://reviews.llvm.org/D133662, and the GitHub issue llvm/llvm-project#55455?

It would be ideal to resolve this in upstream clang, and do so for all offloading models, not just SYCL.

@srividya-sundaram
Contributor Author

> Hi @srividya-sundaram, this is an area I've explored previously, and I remember that -ftime-trace has usability problems when it comes to offloading. At that time I was looking at CUDA offloading, but I think they also apply to SYCL.
>
> Have you checked the comments at https://reviews.llvm.org/D150282 and https://reviews.llvm.org/D133662, and the GitHub issue llvm/llvm-project#55455?
>
> It would be ideal to resolve this in upstream clang, and do so for all offloading models, not just SYCL.

Hi @Maetveis,
Thanks for pointing these out.
I've gone through the referenced reviews briefly but haven't studied the comments in depth. I will go through them as well as the linked issue.

> I remember that -ftime-trace has usability problems when it comes to offloading.

Could you please share the usability problems you encountered?

Some questions I have are:
When compilation and linkage are done in one compiler driver invocation, which actions should produce traces?
My understanding is that both the SYCL host compilation and SYCL device compilation invocations should produce trace files (and not in the /tmp folder).

@Maetveis
Contributor

Maetveis commented Feb 10, 2026

> Could you please share the usability problems you encountered?

Sure :). This was a while ago, and at the time for a different toolchain (AMD's HIP) but I think they mostly still apply.

  1. The fact that there are multiple traces for compiling a single offloading source file runs counter to tooling and user expectations.

    • If a user runs clang++ -ftime-trace source.cpp -o source.o, clang also produces source.json. IMO it is reasonable to expect that clang++ -fsycl -fsycl-targets=A,B -ftime-trace sycl-source.cpp -o sycl-source.o should produce a single sycl-source.json, and for this file to contain the traces for the host and all device compilation steps.
    • Tools used for visualizing / understanding build times, like ninjatracing, assume a model like this. I think it's more fruitful to try to match non-offload behavior in the compiler than to require tooling to adapt to the niche use case that is offloading.
  2. The traces for device-side compiles have very poor discoverability:

    • They are written somewhere in /tmp, but the only way to know that is by looking at -v output.
    • For some compilations they are not written to /tmp, but instead use the same file name as the host trace and are therefore overwritten by it.

To frame these a bit more, I think it's useful to think about the following use-cases:

Use-case A:

As a developer of the library libFoo which uses an offloading API for (some of) its sources, I want to analyze the overall build-time and look for "hot-spots" where I can reduce it the most. In order to do this, I use tools like ninjatracing and pass -ftime-trace to all compilations through the build-system. This gives me an overview of which files take the longest to compile, and also gives me the ability to dig down into details to spot patterns like a particular template instantiation that slows down multiple compilation units.

Use-case B:

I have identified that the file A.cpp takes a long time to compile for offload target foo. In order to understand why, I pass the -ftime-trace option to clang invoked from the command line.

The second case is already reasonably well served by what clang can do for -ftime-trace with offloading if we simply "re-enable" it. Yes, it's annoying to look for files in /tmp, but it's not the end of the world for a single file.

The first case basically breaks down: the level of detail is reduced to the object-file level instead of the fine-grained detail we would have without offloading. We don't get any information about which step of the combined offload "compilation" took longest.

> When compilation and linkage are done in one compiler driver invocation, which actions should produce traces?
> My understanding is that both the SYCL host compilation and SYCL device compilation invocations should produce trace files (and not in the /tmp folder).

In an ideal world, in my opinion, there should be just one trace that covers every step: host compilation, device compilation, and linking too, assuming the linker is capable of producing compatible traces.
Combined compilation and linking isn't common, though, because most projects of moderate complexity use a build system, so -ftime-trace doesn't accommodate it (even for non-offload compilations). I think this is fine: advanced users who are looking at build time should be able to break down clang++ foo.cpp into clang++ -ftime-trace -c foo.cpp and time clang++ foo.o.

@srividya-sundaram
Contributor Author

srividya-sundaram commented Feb 10, 2026

Thank you for the detailed explanation. Super helpful to understand your POV with the example use cases.
I have some follow up questions and observations:

> If a user runs clang++ -ftime-trace source.cpp -o source.o, clang also produces source.json. IMO it is reasonable to expect that clang++ -fsycl -fsycl-targets=A,B -ftime-trace sycl-source.cpp -o sycl-source.o should produce a single sycl-source.json, and for this file to contain the traces for the host and all device compilation steps.

For a regular (non-offload) compilation such as clang++ -ftime-trace source.cpp -o source.o, we currently produce a trace corresponding to a single clang -cc1 invocation.

When -ftime-trace is enabled for offload compilations, Clang could generate one time-trace JSON file per compiler invocation. (one JSON file for the host compilation and one JSON file for each device compilation)

This design aligns with the existing semantics of -ftime-trace, which today produces a trace corresponding to a single clang -cc1 invocation.

Also, with my current patch, I was able to generate the SYCL host compilation trace file and the SYCL device compilation trace file separately, and the two traces appear to be quite different: different targets, different passes, different toolchains/backends. Combining the SYCL host and device JSON files into a single one might need namespacing everywhere and proper grouping of the events/passes, etc. It seems to blur the host/device invocation boundaries.

I used chrome://tracing/ to load the generated SYCL host and device JSON files, and I believe it expects a single timeline from a single process (clang-22). I am not sure the tool would adapt if we were to emit one big JSON file for the host and all device compilations. As the device count grows, the single JSON file will also grow. Please see the attached image (top right, processes tab).

[Image: sycl-device-json]

Given these observations, I was wondering if we could instead generate descriptive JSON filenames in user-discoverable directories, as you have mentioned.
Example:

clang++ -fsycl -fsycl-targets=spir64,nvptx64-nvidia-cuda  -ftime-trace source.cpp -o source.o
source.json                (host)
source-sycl-spir64.json    (device)
source-sycl-nvptx64.json   (device)

WDYT?

@Maetveis
Contributor

> For a regular (non-offload) compilation such as clang++ -ftime-trace source.cpp -o source.o, we currently produce a trace corresponding to a single clang -cc1 invocation.
>
> When -ftime-trace is enabled for offload compilations, Clang could generate one time-trace JSON file per compiler invocation. (one JSON file for the host compilation and one JSON file for each device compilation)
>
> This design aligns with the existing semantics of -ftime-trace, which today produces a trace corresponding to a single clang -cc1 invocation.

I don't think that was an intentional design choice for -ftime-trace, probably offloading simply wasn't a consideration back then. Allow me to ask this slightly provocative question: clang++ -fsycl source.cpp also produces a single object file: downstream tools like the linker are shielded from the complexity of multiple compiler invocations by the offloading toolchain. What's the difference between object files and compile-time traces that justifies the difference in behaviour?

> Also, with my current patch, I was able to generate the SYCL host compilation trace file and the SYCL device compilation trace file separately, and the two traces appear to be quite different: different targets, different passes, different toolchains/backends. Combining the SYCL host and device JSON files into a single one might need namespacing everywhere and proper grouping of the events/passes, etc. It seems to blur the host/device invocation boundaries.

There are already separate high-level categories in the traces like "Frontend" and "Backend", I don't see why an additional level of "Offload Host", "Offload Device (nvptx)" etc couldn't be added.
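
To make the idea of an extra "Offload Host" / "Offload Device" level concrete, here is a minimal sketch, not clang's implementation. It assumes the merged file would follow the Chrome Trace Event Format convention of grouping events by pid and naming each process with a "process_name" metadata event (phase "M"). The TraceEvent and CompilationTrace types and the mergeTraces function are hypothetical names invented for illustration.

```cpp
#include <string>
#include <utility>
#include <vector>

// Hypothetical, simplified stand-in for one time-trace event.
struct TraceEvent {
  std::string Name;  // e.g. "Frontend", "Backend", or "process_name"
  std::string Phase; // "X" = complete event, "M" = metadata event
  int Pid;           // process id; trace viewers group rows by it
  long long TsUs;    // start timestamp in microseconds
  long long DurUs;   // duration in microseconds
  std::string Label; // only meaningful for metadata events
};

// One per-compilation trace with a human-readable label,
// e.g. "Offload Host" or "Offload Device (nvptx)".
struct CompilationTrace {
  std::string Label;
  std::vector<TraceEvent> Events;
};

// Merge per-compilation traces into one event list. Each compilation is
// assigned its own pid and gets a leading "process_name" metadata event
// that carries its label, so host and device events stay separated.
std::vector<TraceEvent>
mergeTraces(const std::vector<CompilationTrace> &Traces) {
  std::vector<TraceEvent> Merged;
  int NextPid = 1;
  for (const CompilationTrace &T : Traces) {
    const int Pid = NextPid++;
    Merged.push_back({"process_name", "M", Pid, 0, 0, T.Label});
    for (TraceEvent E : T.Events) {
      E.Pid = Pid; // re-home the event under this compilation's pid
      Merged.push_back(std::move(E));
    }
  }
  return Merged;
}
```

Keeping the original event names and changing only the pid means the existing "Frontend"/"Backend" categories would not need renaming; whether a real merged trace should also rebase timestamps is left open here.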

> I used chrome://tracing/ to load the generated SYCL host and device JSON files, and I believe it expects a single timeline from a single process (clang-22). I am not sure the tool would adapt if we were to emit one big JSON file for the host and all device compilations.
> As the device count grows, the single JSON file will also grow. Please see the attached image (top right, processes tab).

Perfetto is the successor of the chrome-tracing visualizer; it supports binary traces (much smaller sizes) and is designed with multi-process traces in mind.

> Given these observations, I was wondering if we could instead generate descriptive JSON filenames in user-discoverable directories, as you have mentioned.
> WDYT?

I think your suggestion improves the status quo for at least the simpler use case, so SGTM. I understand that implementing a single trace is significantly more work, and there might not be a big enough motivation to do that.

@mdtoguchi
Contributor

For short-term usability, having separate traces for each compilation (host/targetA/targetB) with unique file names sounds reasonable to me. The idea of a single time-trace file with all targets embedded when offloading is enabled does make sense, since from a general user perspective there is one binary generated, at least when generating an object. This of course goes beyond the scope of just modifying the driver.

Documentation should be updated in the SYCL space to show generated file expectations.

srividya-sundaram marked this pull request as draft February 21, 2026 01:17
Args.hasArg(options::OPT_offload_new_driver) &&
Args.hasArg(options::OPT_ftime_trace, options::OPT_ftime_trace_EQ);

const bool CreatePrefixForHost =

Contributor Author

SYCL Offloading Actions: AtTopLevel Behavior

| Invocation Type | Example | Action | AtTopLevel |
|---|---|---|---|
| Compile-only (-c) | clang++ -fsycl -c sycl-code.cpp -o sycl-code.o | SYCL host offloading | true |
| | | SYCL device offloading | false |
| Compile + Link | clang++ -fsycl sycl-code.cpp | Linking action | true |
| | | SYCL host offloading | false |
| | | SYCL device offloading | false |

With -c, the SYCL host offloading action is top-level. Without -c, the linking action is top-level and both SYCL offloading actions are nested.

The current requirement is to generate trace files for both SYCL host compilation and SYCL device compilation, with corresponding offloading filename prefixes:

  • Host compilation: input-file-name-host-x86_64-unknown-linux-gnu.json
  • Device compilation (SPIR-V targets): input-file-name-sycl-spir64-unknown-unknown.json

In the previous implementation, offloading filename prefixes were only generated for offload actions (host or device) that were not at the top level.

With the new requirements, we need to generate offloading filename prefixes for SYCL host offloading actions even when they are at the top level (i.e., in compile-only mode with -c).
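
The change in rule described above can be sketched as a pair of predicates plus a filename builder. This is a simplified illustration and not the code in Driver.cpp: OffloadRole, ActionInfo, needsPrefixOld, needsPrefixNew and traceFileName are hypothetical names, and tying the new case to time tracing being requested is an assumption based on the condition visible in the diff hunk.

```cpp
#include <string>

// Hypothetical, simplified view of an offload action; not clang's Action.
enum class OffloadRole { None, Host, Device };

struct ActionInfo {
  OffloadRole Role;
  bool AtTopLevel;
};

// Previous rule as described: only nested (non-top-level) offload actions
// received an offloading filename prefix.
bool needsPrefixOld(const ActionInfo &A) {
  return A.Role != OffloadRole::None && !A.AtTopLevel;
}

// New rule (assumed shape): additionally prefix a top-level SYCL host
// offloading action when time tracing was requested, which is the
// compile-only (-c) case from the table.
bool needsPrefixNew(const ActionInfo &A, bool TimeTrace) {
  if (needsPrefixOld(A))
    return true;
  return TimeTrace && A.Role == OffloadRole::Host && A.AtTopLevel;
}

// Build "<stem>-<kind>-<triple>.json", where kind is "host" or "sycl".
std::string traceFileName(const std::string &Stem, const std::string &Kind,
                          const std::string &Triple) {
  return Stem + "-" + Kind + "-" + Triple + ".json";
}
```

For an input stem of a, the builder yields a-host-x86_64-unknown-linux-gnu.json for the host and a-sycl-spir64-unknown-unknown.json for a SPIR-V device, matching the names listed above.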

Contributor

What is the impact when using -fsycl-device-only? Do we care about the output file name in that case?

Contributor

Copilot AI left a comment

Pull request overview

This PR enables Clang’s driver-side time tracing for SYCL (new offload driver), ensuring -ftime-trace* options are propagated to SYCL host/device compilation jobs and producing distinct JSON outputs to avoid host/device filename collisions.

Changes:

  • Add SYCL driver test coverage for -ftime-trace behavior (compile-only, with -dumpdir, and link flow).
  • Extend driver logic to generate per-(host/device) SYCL time-trace filenames by incorporating offloading prefixes.
  • Adjust offloading prefix creation conditions and time-trace handling to cover SYCL offloading actions.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 5 comments.

| File | Description |
|---|---|
| clang/test/Driver/sycl-time-trace.cpp | New driver test validating SYCL host/device time-trace propagation and JSON output naming. |
| clang/lib/Driver/Driver.cpp | Driver changes to compute SYCL-specific time-trace filenames and propagate time-trace options through SYCL offload jobs. |

Comment on lines +8 to +19
// SYCL-HOST-COMPILE: -cc1{{.*}} "-fsycl-is-host"{{.*}} "-ftime-trace=e/a-host-x86_64-unknown-linux-gnu.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"

// Verify that the Clang driver generates JSON time-trace output for compile-only
// invocation and propagates the time-trace options, respecting the specified dump directory.
// RUN: %clang -### -fsycl --offload-new-driver -c -ftime-trace -ftime-trace-granularity=0 -ftime-trace-verbose d/a.cpp -dumpdir f/ 2>&1 | FileCheck %s --check-prefixes=SYCL-DEVICE-DUMPDIR,SYCL-HOST-DUMPDIR
// SYCL-DEVICE-DUMPDIR: -cc1{{.*}} "-fsycl-is-device"{{.*}} "-ftime-trace=f/a-sycl-spir64-unknown-unknown.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"
// SYCL-HOST-DUMPDIR: -cc1{{.*}} "-fsycl-is-host"{{.*}} "-ftime-trace=f/a-host-x86_64-unknown-linux-gnu.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"

// This test verifies that Clang driver correctly propagates time-trace related options
// during a compile-and-link invocation and enables JSON time-trace output.
// RUN: %clang -### -fsycl --offload-new-driver -ftime-trace=e -ftime-trace-granularity=0 -ftime-trace-verbose d/a.cpp -o f/x -dumpdir f/ 2>&1 | FileCheck %s --check-prefixes=LINK-DEVICE,LINK-CLW
// LINK-HOST: -cc1{{.*}} "-fsycl-is-host"{{.*}} "-ftime-trace=e{{/|\\\\}}a-host-x86_64-unknown-linux-gnu.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"

Copilot AI Mar 11, 2026

This test hardcodes the host triple in expected time-trace filenames (e.g. x86_64-unknown-linux-gnu) but the RUN lines don’t fix the target triple or restrict the test to a specific host. This will fail on non-x86_64 and/or non-Linux bots. Consider either (a) adding --target=x86_64-unknown-linux-gnu and // REQUIRES: system-linux (matching other SYCL new-driver tests), or (b) relaxing the checks to match any host triple with a regex.

Suggested change
-// SYCL-HOST-COMPILE: -cc1{{.*}} "-fsycl-is-host"{{.*}} "-ftime-trace=e/a-host-x86_64-unknown-linux-gnu.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"
-// Verify that the Clang driver generates JSON time-trace output for compile-only
-// invocation and propagates the time-trace options, respecting the specified dump directory.
-// RUN: %clang -### -fsycl --offload-new-driver -c -ftime-trace -ftime-trace-granularity=0 -ftime-trace-verbose d/a.cpp -dumpdir f/ 2>&1 | FileCheck %s --check-prefixes=SYCL-DEVICE-DUMPDIR,SYCL-HOST-DUMPDIR
-// SYCL-DEVICE-DUMPDIR: -cc1{{.*}} "-fsycl-is-device"{{.*}} "-ftime-trace=f/a-sycl-spir64-unknown-unknown.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"
-// SYCL-HOST-DUMPDIR: -cc1{{.*}} "-fsycl-is-host"{{.*}} "-ftime-trace=f/a-host-x86_64-unknown-linux-gnu.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"
-// This test verifies that Clang driver correctly propagates time-trace related options
-// during a compile-and-link invocation and enables JSON time-trace output.
-// RUN: %clang -### -fsycl --offload-new-driver -ftime-trace=e -ftime-trace-granularity=0 -ftime-trace-verbose d/a.cpp -o f/x -dumpdir f/ 2>&1 | FileCheck %s --check-prefixes=LINK-DEVICE,LINK-CLW
-// LINK-HOST: -cc1{{.*}} "-fsycl-is-host"{{.*}} "-ftime-trace=e{{/|\\\\}}a-host-x86_64-unknown-linux-gnu.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"
+// SYCL-HOST-COMPILE: -cc1{{.*}} "-fsycl-is-host"{{.*}} "-ftime-trace=e/a-host-{{[^"]*}}.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"
+// Verify that the Clang driver generates JSON time-trace output for compile-only
+// invocation and propagates the time-trace options, respecting the specified dump directory.
+// RUN: %clang -### -fsycl --offload-new-driver -c -ftime-trace -ftime-trace-granularity=0 -ftime-trace-verbose d/a.cpp -dumpdir f/ 2>&1 | FileCheck %s --check-prefixes=SYCL-DEVICE-DUMPDIR,SYCL-HOST-DUMPDIR
+// SYCL-DEVICE-DUMPDIR: -cc1{{.*}} "-fsycl-is-device"{{.*}} "-ftime-trace=f/a-sycl-spir64-unknown-unknown.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"
+// SYCL-HOST-DUMPDIR: -cc1{{.*}} "-fsycl-is-host"{{.*}} "-ftime-trace=f/a-host-{{[^"]*}}.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"
+// This test verifies that Clang driver correctly propagates time-trace related options
+// during a compile-and-link invocation and enables JSON time-trace output.
+// RUN: %clang -### -fsycl --offload-new-driver -ftime-trace=e -ftime-trace-granularity=0 -ftime-trace-verbose d/a.cpp -o f/x -dumpdir f/ 2>&1 | FileCheck %s --check-prefixes=LINK-DEVICE,LINK-CLW
+// LINK-HOST: -cc1{{.*}} "-fsycl-is-host"{{.*}} "-ftime-trace=e{{/|\\\\}}a-host-{{[^"]*}}.json" "-ftime-trace-granularity=0" "-ftime-trace-verbose"

Contributor

Adding --target=x86_64-unknown-linux-gnu makes sense here.

Comment on lines +9517 to +9530
// For SYCL device compilation with -c -o and c + l -o
const bool IsSYCLDeviceTimeTrace =
    JA.isDeviceOffloading(Action::OFK_SYCL) && isa<BackendJobAction>(JA) &&
    !AtTopLevel && C.getArgs().hasArg(options::OPT_offload_new_driver) &&
    C.getArgs().hasArg(options::OPT_ftime_trace, options::OPT_ftime_trace_EQ);

if (IsSYCLDeviceTimeTrace) {
  if (Arg *FinalOutput = C.getArgs().getLastArg(options::OPT_o))
    return C.addResultFile(FinalOutput->getValue(), &JA);
  else {
    StringRef BaseName = llvm::sys::path::filename(BaseInput);
    return C.addResultFile(C.getArgs().MakeArgString(BaseName), &JA);
  }
}

Copilot AI Mar 11, 2026

GetNamedOutputPath is returning the user-provided -o path for nested SYCL device backend jobs when -ftime-trace is enabled. This changes the actual output file for that backend job (not just the trace filename) and can overwrite the real host object / break the offload pipeline. Instead, keep normal temp/named outputs for the job and derive the time-trace path in handleTimeTrace from -o (or another dedicated “trace basename”) without affecting the job output path.

Suggested change
-// For SYCL device compilation with -c -o and c + l -o
-const bool IsSYCLDeviceTimeTrace =
-    JA.isDeviceOffloading(Action::OFK_SYCL) && isa<BackendJobAction>(JA) &&
-    !AtTopLevel && C.getArgs().hasArg(options::OPT_offload_new_driver) &&
-    C.getArgs().hasArg(options::OPT_ftime_trace, options::OPT_ftime_trace_EQ);
-if (IsSYCLDeviceTimeTrace) {
-  if (Arg *FinalOutput = C.getArgs().getLastArg(options::OPT_o))
-    return C.addResultFile(FinalOutput->getValue(), &JA);
-  else {
-    StringRef BaseName = llvm::sys::path::filename(BaseInput);
-    return C.addResultFile(C.getArgs().MakeArgString(BaseName), &JA);
-  }
-}

Comment on lines 9531 to 9545
// For SYCL Host compilation with c + l -o
const bool IsSYCLHostTimeTraceNotTopLevel =
    JA.isHostOffloading(Action::OFK_SYCL) && isa<AssembleJobAction>(JA) &&
    !AtTopLevel && C.getArgs().hasArg(options::OPT_offload_new_driver) &&
    C.getArgs().hasArg(options::OPT_ftime_trace, options::OPT_ftime_trace_EQ);

if (IsSYCLHostTimeTraceNotTopLevel) {
  if (Arg *FinalOutput = C.getArgs().getLastArg(options::OPT_o))
    return C.addResultFile(FinalOutput->getValue(), &JA);
  else {
    StringRef BaseName = llvm::sys::path::filename(BaseInput);
    return C.addResultFile(C.getArgs().MakeArgString(BaseName), &JA);
  }
}

Copilot AI Mar 11, 2026

Similarly, GetNamedOutputPath returns the user -o value for nested SYCL host assemble jobs when -ftime-trace is enabled. For a compile+link invocation, -o typically names the final executable, so emitting an intermediate object/assembly to that path is incorrect and can clobber the final output. Please avoid special-casing job output paths for time-trace; compute the trace filename separately while leaving intermediate outputs unchanged.

Suggested change
-// For SYCL Host compilation with c + l -o
-const bool IsSYCLHostTimeTraceNotTopLevel =
-    JA.isHostOffloading(Action::OFK_SYCL) && isa<AssembleJobAction>(JA) &&
-    !AtTopLevel && C.getArgs().hasArg(options::OPT_offload_new_driver) &&
-    C.getArgs().hasArg(options::OPT_ftime_trace, options::OPT_ftime_trace_EQ);
-if (IsSYCLHostTimeTraceNotTopLevel) {
-  if (Arg *FinalOutput = C.getArgs().getLastArg(options::OPT_o))
-    return C.addResultFile(FinalOutput->getValue(), &JA);
-  else {
-    StringRef BaseName = llvm::sys::path::filename(BaseInput);
-    return C.addResultFile(C.getArgs().MakeArgString(BaseName), &JA);
-  }
-}

if (llvm::sys::fs::is_directory(Path)) {
SmallString<128> Tmp(Result.getFilename());
if (!OffloadingPrefix.empty() &&
Args.hasArg(options::OPT_offload_new_driver) &&

Contributor

Why the --offload-new-driver requirement? Adding the prefix should be valid for old and new model when performing -ftime-trace. The ability to use -ftime-trace -fsycl should work with the old model too.

if (Arg *FinalOutput = C.getArgs().getLastArg(options::OPT__SLASH_o))
return C.addResultFile(FinalOutput->getValue(), &JA);
}
// For SYCL device compilation with -c -o and c + l -o

Contributor

Maybe spell out 'compile and link' here; 'c + l' isn't very clear :)

const bool IsSYCLHostTimeTraceNotTopLevel =
JA.isHostOffloading(Action::OFK_SYCL) && isa<AssembleJobAction>(JA) &&
!AtTopLevel && C.getArgs().hasArg(options::OPT_offload_new_driver) &&
C.getArgs().hasArg(options::OPT_ftime_trace, options::OPT_ftime_trace_EQ);

Contributor

What is the expected output name for this JobAction when -ftime-trace is not enabled? It is my understanding that the only thing we should be modifying is the output file for -ftime-trace=file and not modify any of the intermediate files that are generated during the toolchain execution.

For SYCL non-top-level jobs: derives trace path from -o (with full directory path) or BaseInput
For all other jobs: uses Result.getFilename() (existing behavior)
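
The two rules above, together with the review feedback that only the trace path should change and never the job's own output, can be sketched as a standalone helper. This is a hypothetical illustration assuming C++17 and std::filesystem, not the code in handleTimeTrace; the function name, its parameters and the prefix format are all assumptions, and it does not model -ftime-trace=<dir> or -dumpdir.

```cpp
#include <filesystem>
#include <optional>
#include <string>

namespace fs = std::filesystem;

// Hypothetical helper: pick the time-trace path without modifying the
// job's own output path.
//   NestedSYCLJob - true for non-top-level SYCL host/device jobs
//   FinalOutput   - value of -o, if the user passed one
//   BaseInput     - the source file being compiled
//   JobOutput     - the job's normal (possibly temporary) output file
//   Prefix        - offloading prefix, e.g. "-sycl-spir64-unknown-unknown"
fs::path deriveTracePath(bool NestedSYCLJob,
                         const std::optional<std::string> &FinalOutput,
                         const std::string &BaseInput,
                         const std::string &JobOutput,
                         const std::string &Prefix) {
  fs::path Base;
  if (!NestedSYCLJob)
    Base = JobOutput;                       // existing behavior
  else if (FinalOutput)
    Base = *FinalOutput;                    // keeps the directory of -o
  else
    Base = fs::path(BaseInput).filename();  // fall back to the input name
  Base.replace_extension();                 // drop ".o", ".cpp", ...
  Base += Prefix + ".json";
  return Base;
}
```

Because JobOutput is only read and never rewritten, intermediate objects keep their temporary or named locations, which is the property the reviewers asked for.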