Topic: -Z self-profile
Audience
Anyone interested in profiling the compiler
Key People
Wesley Wiser, Michael Woerister
When
TBD
Where
TBD
Meeting Style
Discussion
Deliverables
  • Action Items for the next steps regarding the implementation in the compiler
  • Action Items for the next steps regarding visualization
Homework to do before meeting
  • Read up on the current state of the feature
Agenda


Questions and Notes
  • Query times don’t seem to subtract time spent in sub-queries
  • that’s true => the feature will be added back in. PR will be updated.
  • Add a -Z flag to dump all of the raw profiler data
  • Should we try to record query keys too?
  • Storing DepNodes instead of query keys might be an option. There is already some infrastructure in DepGraph. Might need cleanup.
  • The strategy we came up with was:
  • store the DepNodeIndex with each event that corresponds to query
  • do one pass over all query tables right before the tcx is destroyed and in that pass populate a DepNodeIndex => format!("{:?}", query_key) table in the profiler that can then be used for generating the actual output.
  • Record events in thread-local arrays?
  • yes, somehow 🙂 
  • Would it make sense to store data in a SQL database for more complex querying?
  • perf-focus does that
  • Use existing UI for inspecting multi-threaded call graphs (Chromium?)
  • Chromium seems to have a text-based format for profiles, which might be usable to visualize self-profile output. A separate tool would be needed to transform rustc’s format to Chromium’s.
  • Add query/lock contention events (useful for parallel queries)
  • How efficient is the current implementation? Does it skew results?
  • Let’s just do the optimizations (avoid hash table lookups and possibly locking)
  • Can we assign small integer indices to queries and other tasks, so that we don’t have to hash a string each time an event is recorded?
  • let the query macro generate something per query / or use enum discriminant.
  • It would be nice to have a start/stop event for loading things off the incremental disk cache. (#58309)
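
To make the self-time and integer-index points above concrete, here is a minimal sketch of how nested start/stop events could be folded into per-query self-times. Everything in it is an assumption for illustration: the QueryKind variants, the Event struct, and self_times are hypothetical names, not rustc's actual profiler types. It assumes events from a single thread that are properly nested, and it indexes a fixed-size array by enum discriminant so that no string is hashed per event.

```rust
/// Hypothetical stand-ins for query names. The idea from the notes is that
/// the query macro could generate an enum like this (assumption).
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum QueryKind {
    TypeOf = 0,
    TypeckTables = 1,
    OptimizedMir = 2,
}

/// Number of variants in `QueryKind`; must be kept in sync by hand here.
pub const QUERY_COUNT: usize = 3;

#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum EventKind {
    Start,
    Stop,
}

/// Hypothetical raw event: which query, start or stop, and a timestamp.
#[derive(Debug, Clone, Copy)]
pub struct Event {
    pub query: QueryKind,
    pub kind: EventKind,
    pub timestamp_ns: u64,
}

/// Returns the self-time per query, indexed by `QueryKind as usize`.
/// Self-time is total time minus the time spent in directly nested queries.
pub fn self_times(events: &[Event]) -> [u64; QUERY_COUNT] {
    let mut totals = [0u64; QUERY_COUNT];
    // Each frame: (query, start timestamp, time accumulated by its children).
    let mut stack: Vec<(QueryKind, u64, u64)> = Vec::new();

    for ev in events {
        match ev.kind {
            EventKind::Start => stack.push((ev.query, ev.timestamp_ns, 0)),
            EventKind::Stop => {
                // An unmatched Stop is skipped rather than treated as an error.
                let (query, start, child_ns) = match stack.pop() {
                    Some(frame) => frame,
                    None => continue,
                };
                let total_ns = ev.timestamp_ns.saturating_sub(start);
                totals[query as usize] += total_ns.saturating_sub(child_ns);
                // Charge the full duration to the parent's child time.
                if let Some(parent) = stack.last_mut() {
                    parent.2 += total_ns;
                }
            }
        }
    }
    totals
}
```

With a TypeckTables query running from 0 to 100 ns and a nested TypeOf query running from 10 to 30 ns, this attributes 20 ns to TypeOf and 80 ns to TypeckTables. A multi-threaded version would presumably keep one such stack per thread.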


Wesley TODO:

  • Fix results to show self-time (to land PR) (#58085)
  • Add events for lock contention for parallel queries (#58309)
  • Dump raw profile events
  • Chromium viewer
  • Change the time stamping to use SystemTime 
  • Wrap Profiler in Arc<>, create it earlier, and also record LLVM stuff
  • We can also remove the timeline thing at this point.
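
For the raw event dump and Chromium viewer items, a rough sketch of what the separate conversion tool might do: turn raw start/stop events into the JSON array form of Chromium's trace event format, where "ph" is "B" for begin and "E" for end and "ts" is in microseconds. RawEvent and to_chrome_trace are hypothetical names, the pid is hard-coded to 1, and event names are assumed to contain no characters that need JSON escaping.

```rust
/// Hypothetical raw profiler event as it might appear in a dump.
pub struct RawEvent {
    pub name: &'static str,
    pub is_start: bool,
    pub timestamp_us: u64,
    pub thread_id: u32,
}

/// Renders events as a JSON array of Chromium trace events.
/// Assumes `name` needs no JSON escaping (plain ASCII identifiers).
pub fn to_chrome_trace(events: &[RawEvent]) -> String {
    let mut out = String::from("[");
    for (i, ev) in events.iter().enumerate() {
        if i > 0 {
            out.push(',');
        }
        // "B" marks the beginning of a duration event, "E" its end.
        let phase = if ev.is_start { "B" } else { "E" };
        out.push_str(&format!(
            "{{\"name\":\"{}\",\"ph\":\"{}\",\"ts\":{},\"pid\":1,\"tid\":{}}}",
            ev.name, phase, ev.timestamp_us, ev.thread_id
        ));
    }
    out.push(']');
    out
}
```

Writing the JSON by hand keeps the sketch dependency-free; a real tool would more likely use a JSON library and handle escaping properly.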

Notes Friday

  • track try-mark-green?