
Commit be2df95

authored Dec 9, 2024
Switch builtin strings to use string tables (#118734)
The Clang binary (and any binary linking Clang as a library), when built using PIE, ends up with a pretty shocking number of dynamic relocations to apply to the executable image: roughly 400k. Each of these takes up binary space in the executable and, perhaps most interestingly, takes start-up time to apply.

The largest pattern I identified was the strings used to describe target builtins. The addresses of these string literals were stored into huge arrays, each one requiring a dynamic relocation. The way to avoid this is to design the target builtins to use a single large table of strings and offsets within the table for the individual strings. This switches the builtin management to such a scheme.

This saves over 100k dynamic relocations by my measurement, an over 25% reduction. Just looking at byte size improvements, using the `bloaty` tool to compare a newly built `clang` binary to an old one:

```
    FILE SIZE        VM SIZE
 --------------  --------------
  +1.4%  +653Ki  +1.4%  +653Ki    .rodata
  +0.0%    +960  +0.0%    +960    .text
  +0.0%    +197  +0.0%    +197    .dynstr
  +0.0%    +184  +0.0%    +184    .eh_frame
  +0.0%     +96  +0.0%     +96    .dynsym
  +0.0%     +40  +0.0%     +40    .eh_frame_hdr
  +114%     +32  [ = ]       0    [Unmapped]
  +0.0%     +20  +0.0%     +20    .gnu.hash
  +0.0%      +8  +0.0%      +8    .gnu.version
  +0.9%      +7  +0.9%      +7    [LOAD #2 [R]]
  [ = ]       0 -75.4% -3.00Ki    .relro_padding
 -16.1%  -802Ki -16.1%  -802Ki    .data.rel.ro
 -27.3% -2.52Mi -27.3% -2.52Mi    .rela.dyn
  -1.6% -2.66Mi  -1.6% -2.66Mi    TOTAL
```

We get a 16% reduction in the `.data.rel.ro` section, and nearly 30% reduction in `.rela.dyn` where those relocations are stored.
This is also visible in my benchmarking of binary start-up overhead at least:

```
Benchmark 1: ./old_clang --version
  Time (mean ± σ):      17.6 ms ±   1.5 ms    [User: 4.1 ms, System: 13.3 ms]
  Range (min … max):    14.2 ms …  22.8 ms    162 runs

Benchmark 2: ./new_clang --version
  Time (mean ± σ):      15.5 ms ±   1.4 ms    [User: 3.6 ms, System: 11.8 ms]
  Range (min … max):    12.4 ms …  20.3 ms    216 runs

Summary
  './new_clang --version' ran
    1.13 ± 0.14 times faster than './old_clang --version'
```

We get about 2ms faster `--version` runs. While there is a lot of noise in binary execution time, this delta is pretty consistent, and represents over 10% improvement. This is particularly interesting to me because for very short source files, repeatedly starting the `clang` binary is actually the dominant cost. For example, `configure` scripts running against the `clang` compiler are slow in large part because of binary start up time, not the time to process the actual inputs to the compiler.

----

This PR implements the string tables using `constexpr` code and the existing macro system. I understand that the builtins are moving towards a TableGen model, and if complete that would provide more options for modeling this. Unfortunately, that migration isn't complete, and even the parts that are migrated still rely on the ability to break out of the TableGen model and directly expand an X-macro style `BUILTIN(...)` textually. I looked at trying to complete the move to TableGen, but it would both require the difficult migration of the remaining targets, and solving some tricky problems with how to move away from any macro-based expansion.

I was also able to find a reasonably clean and effective way of doing this with the existing macros and some `constexpr` code that I think is clean enough to be a pretty good intermediate state, and maybe give a good target for the eventual TableGen solution.
I was also able to factor the macros into a set of consistent patterns that avoids a significant regression in overall boilerplate.
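
For readers unfamiliar with the technique, here is a minimal sketch (not code from this commit) contrasting the two layouts the message describes. The names `PointerTable`, `StringTable`, `Offsets`, and `lookup` are made up for illustration, and the three builtin names are just sample data.

```cpp
#include <cassert>
#include <cstring>

// Pointer-based layout: every element stores an address, so a PIE build
// needs one dynamic relocation per entry.
static const char *const PointerTable[] = {
    "__builtin_abs", "__builtin_fabs", "__builtin_labs"};

// Offset-based layout: one blob of NUL-terminated strings plus plain
// integers. Neither array contains an address, so neither needs relocating.
static constexpr char StringTable[] = "__builtin_abs\0"
                                      "__builtin_fabs\0"
                                      "__builtin_labs";
static constexpr int Offsets[] = {0, 14, 29};

// Resolve an index to its string by adding the offset to the table base.
inline const char *lookup(int Index) { return StringTable + Offsets[Index]; }
```

The offsets here are written by hand (each is the previous offset plus the previous string's length plus one for the NUL); the commit instead computes them at compile time so they cannot drift from the strings.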
1 parent f6c51ea commit be2df95


48 files changed

+606
-309
lines changed
 

‎clang/include/clang/Basic/Builtins.h

+169-36
Original file line number | Diff line number | Diff line change
@@ -55,6 +55,7 @@ struct HeaderDesc {
5555
#undef HEADER
5656
} ID;
5757

58+
constexpr HeaderDesc() : ID() {}
5859
constexpr HeaderDesc(HeaderID ID) : ID(ID) {}
5960

6061
const char *getName() const;
@@ -68,23 +69,152 @@ enum ID {
6869
FirstTSBuiltin
6970
};
7071

72+
// The info used to represent each builtin.
7173
struct Info {
72-
llvm::StringLiteral Name;
73-
const char *Type, *Attributes;
74-
const char *Features;
74+
// Rather than store pointers to the string literals describing these four
75+
// aspects of builtins, we store offsets into a common string table.
76+
struct StrOffsets {
77+
int Name;
78+
int Type;
79+
int Attributes;
80+
int Features;
81+
} Offsets;
82+
7583
HeaderDesc Header;
7684
LanguageID Langs;
7785
};
7886

87+
// The storage for `N` builtins. This contains a single pointer to the string
88+
// table used for these builtins and an array of metadata for each builtin.
89+
template <size_t N> struct Storage {
90+
const char *StringTable;
91+
92+
std::array<Info, N> Infos;
93+
94+
// A constexpr function to construct the storage for a given string table in
95+
// the first argument and an array in the second argument. This is *only*
96+
// expected to be used at compile time, we should mark it `consteval` when
97+
// available.
98+
//
99+
// The `Infos` array is particularly special. This function expects an array
100+
// of `Info` structs, where the string offsets of each entry refer to the
101+
// *sizes* of those strings rather than their offsets, and for the target
102+
// string to be in the provided string table at an offset the sum of all
103+
// previous string sizes. This function walks the `Infos` array computing the
104+
// running sum and replacing the sizes with the actual offsets in the string
105+
// table that should be used. This arrangement is designed to make it easy to
106+
// expand `.def` and `.inc` files with X-macros to construct both the string
107+
// table and the `Info` structs in the arguments to this function.
108+
static constexpr Storage<N> Make(const char *Strings,
109+
std::array<Info, N> Infos) {
110+
// Translate lengths to offsets.
111+
int Offset = 0;
112+
for (auto &I : Infos) {
113+
Info::StrOffsets NewOffsets = {};
114+
NewOffsets.Name = Offset;
115+
Offset += I.Offsets.Name;
116+
NewOffsets.Type = Offset;
117+
Offset += I.Offsets.Type;
118+
NewOffsets.Attributes = Offset;
119+
Offset += I.Offsets.Attributes;
120+
NewOffsets.Features = Offset;
121+
Offset += I.Offsets.Features;
122+
I.Offsets = NewOffsets;
123+
}
124+
return {Strings, Infos};
125+
}
126+
};
127+
128+
// A detail macro used below to emit a string literal that, after string literal
129+
// concatenation, ends up triggering the `-Woverlength-strings` warning. While
130+
// the warning is useful in general to catch accidentally excessive strings,
131+
// here we are creating them intentionally.
132+
//
133+
// This relies on a subtle aspect of `_Pragma`: that the *diagnostic* ones don't
134+
// turn into actual tokens that would disrupt string literal concatenation.
135+
#ifdef __clang__
136+
#define CLANG_BUILTIN_DETAIL_STR_TABLE(S) \
137+
_Pragma("clang diagnostic push") \
138+
_Pragma("clang diagnostic ignored \"-Woverlength-strings\"") \
139+
S _Pragma("clang diagnostic pop")
140+
#else
141+
#define CLANG_BUILTIN_DETAIL_STR_TABLE(S) S
142+
#endif
143+
144+
// A macro that can be used with `Builtins.def` and similar files as an X-macro
145+
// to add the string arguments to a builtin string table. This is typically the
146+
// target for the `BUILTIN`, `LANGBUILTIN`, or `LIBBUILTIN` macros in those
147+
// files.
148+
#define CLANG_BUILTIN_STR_TABLE(ID, TYPE, ATTRS) \
149+
CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" /*FEATURE*/ "\0")
150+
151+
// A macro that can be used with target builtin `.def` and `.inc` files as an
152+
// X-macro to add the string arguments to a builtin string table. This is
153+
// typically the target for the `TARGET_BUILTIN` macro.
154+
#define CLANG_TARGET_BUILTIN_STR_TABLE(ID, TYPE, ATTRS, FEATURE) \
155+
CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" FEATURE "\0")
156+
157+
// A macro that can be used with target builtin `.def` and `.inc` files as an
158+
// X-macro to add the string arguments to a builtin string table. This is
159+
// typically the target for the `TARGET_HEADER_BUILTIN` macro. We can't delegate
160+
// to `TARGET_BUILTIN` because the `FEATURE` string changes position.
161+
#define CLANG_TARGET_HEADER_BUILTIN_STR_TABLE(ID, TYPE, ATTRS, HEADER, LANGS, \
162+
FEATURE) \
163+
CLANG_BUILTIN_DETAIL_STR_TABLE(#ID "\0" TYPE "\0" ATTRS "\0" FEATURE "\0")
164+
165+
// A detail macro used internally to compute the desired string table
166+
// `StrOffsets` struct for arguments to `Storage::Make`.
167+
#define CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS) \
168+
Builtin::Info::StrOffsets { \
169+
sizeof(#ID), sizeof(TYPE), sizeof(ATTRS), sizeof("") \
170+
}
171+
172+
// A detail macro used internally to compute the desired string table
173+
// `StrOffsets` struct for arguments to `Storage::Make`.
174+
#define CLANG_TARGET_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS, FEATURE) \
175+
Builtin::Info::StrOffsets { \
176+
sizeof(#ID), sizeof(TYPE), sizeof(ATTRS), sizeof(FEATURE) \
177+
}
178+
179+
// A set of macros that can be used with builtin `.def` files as an X-macro to
180+
// create an `Info` struct for a particular builtin. It both computes the
181+
// `StrOffsets` value for the string table (the lengths here, translated to
182+
// offsets by the Storage::Make function), and the other metadata for each
183+
// builtin.
184+
//
185+
// There is a corresponding macro for each of `BUILTIN`, `LANGBUILTIN`,
186+
// `LIBBUILTIN`, `TARGET_BUILTIN`, and `TARGET_HEADER_BUILTIN`.
187+
#define CLANG_BUILTIN_ENTRY(ID, TYPE, ATTRS) \
188+
Builtin::Info{CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS), \
189+
HeaderDesc::NO_HEADER, ALL_LANGUAGES},
190+
#define CLANG_LANGBUILTIN_ENTRY(ID, TYPE, ATTRS, LANG) \
191+
Builtin::Info{CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS), \
192+
HeaderDesc::NO_HEADER, LANG},
193+
#define CLANG_LIBBUILTIN_ENTRY(ID, TYPE, ATTRS, HEADER, LANG) \
194+
Builtin::Info{CLANG_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS), \
195+
HeaderDesc::HEADER, LANG},
196+
#define CLANG_TARGET_BUILTIN_ENTRY(ID, TYPE, ATTRS, FEATURE) \
197+
Builtin::Info{ \
198+
CLANG_TARGET_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS, FEATURE), \
199+
HeaderDesc::NO_HEADER, ALL_LANGUAGES},
200+
#define CLANG_TARGET_HEADER_BUILTIN_ENTRY(ID, TYPE, ATTRS, HEADER, LANG, \
201+
FEATURE) \
202+
Builtin::Info{ \
203+
CLANG_TARGET_BUILTIN_DETAIL_STR_OFFSETS(ID, TYPE, ATTRS, FEATURE), \
204+
HeaderDesc::HEADER, LANG},
205+
79206
/// Holds information about both target-independent and
80207
/// target-specific builtins, allowing easy queries by clients.
81208
///
82209
/// Builtins from an optional auxiliary target are stored in
83210
/// AuxTSRecords. Their IDs are shifted up by TSRecords.size() and need to
84211
/// be translated back with getAuxBuiltinID() before use.
85212
class Context {
86-
llvm::ArrayRef<Info> TSRecords;
87-
llvm::ArrayRef<Info> AuxTSRecords;
213+
const char *TSStrTable = nullptr;
214+
const char *AuxTSStrTable = nullptr;
215+
216+
llvm::ArrayRef<Info> TSInfos;
217+
llvm::ArrayRef<Info> AuxTSInfos;
88218

89219
public:
90220
Context() = default;
@@ -100,12 +230,13 @@ class Context {
100230

101231
/// Return the identifier name for the specified builtin,
102232
/// e.g. "__builtin_abs".
103-
llvm::StringRef getName(unsigned ID) const { return getRecord(ID).Name; }
233+
llvm::StringRef getName(unsigned ID) const;
104234

105235
/// Get the type descriptor string for the specified builtin.
106-
const char *getTypeString(unsigned ID) const {
107-
return getRecord(ID).Type;
108-
}
236+
const char *getTypeString(unsigned ID) const;
237+
238+
/// Get the attributes descriptor string for the specified builtin.
239+
const char *getAttributesString(unsigned ID) const;
109240

110241
/// Return true if this function is a target-specific builtin.
111242
bool isTSBuiltin(unsigned ID) const {
@@ -114,40 +245,40 @@ class Context {
114245

115246
/// Return true if this function has no side effects.
116247
bool isPure(unsigned ID) const {
117-
return strchr(getRecord(ID).Attributes, 'U') != nullptr;
248+
return strchr(getAttributesString(ID), 'U') != nullptr;
118249
}
119250

120251
/// Return true if this function has no side effects and doesn't
121252
/// read memory.
122253
bool isConst(unsigned ID) const {
123-
return strchr(getRecord(ID).Attributes, 'c') != nullptr;
254+
return strchr(getAttributesString(ID), 'c') != nullptr;
124255
}
125256

126257
/// Return true if we know this builtin never throws an exception.
127258
bool isNoThrow(unsigned ID) const {
128-
return strchr(getRecord(ID).Attributes, 'n') != nullptr;
259+
return strchr(getAttributesString(ID), 'n') != nullptr;
129260
}
130261

131262
/// Return true if we know this builtin never returns.
132263
bool isNoReturn(unsigned ID) const {
133-
return strchr(getRecord(ID).Attributes, 'r') != nullptr;
264+
return strchr(getAttributesString(ID), 'r') != nullptr;
134265
}
135266

136267
/// Return true if we know this builtin can return twice.
137268
bool isReturnsTwice(unsigned ID) const {
138-
return strchr(getRecord(ID).Attributes, 'j') != nullptr;
269+
return strchr(getAttributesString(ID), 'j') != nullptr;
139270
}
140271

141272
/// Returns true if this builtin does not perform the side-effects
142273
/// of its arguments.
143274
bool isUnevaluated(unsigned ID) const {
144-
return strchr(getRecord(ID).Attributes, 'u') != nullptr;
275+
return strchr(getAttributesString(ID), 'u') != nullptr;
145276
}
146277

147278
/// Return true if this is a builtin for a libc/libm function,
148279
/// with a "__builtin_" prefix (e.g. __builtin_abs).
149280
bool isLibFunction(unsigned ID) const {
150-
return strchr(getRecord(ID).Attributes, 'F') != nullptr;
281+
return strchr(getAttributesString(ID), 'F') != nullptr;
151282
}
152283

153284
/// Determines whether this builtin is a predefined libc/libm
@@ -158,29 +289,29 @@ class Context {
158289
/// they do not, but they are recognized as builtins once we see
159290
/// a declaration.
160291
bool isPredefinedLibFunction(unsigned ID) const {
161-
return strchr(getRecord(ID).Attributes, 'f') != nullptr;
292+
return strchr(getAttributesString(ID), 'f') != nullptr;
162293
}
163294

164295
/// Returns true if this builtin requires appropriate header in other
165296
/// compilers. In Clang it will work even without including it, but we can emit
166297
/// a warning about missing header.
167298
bool isHeaderDependentFunction(unsigned ID) const {
168-
return strchr(getRecord(ID).Attributes, 'h') != nullptr;
299+
return strchr(getAttributesString(ID), 'h') != nullptr;
169300
}
170301

171302
/// Determines whether this builtin is a predefined compiler-rt/libgcc
172303
/// function, such as "__clear_cache", where we know the signature a
173304
/// priori.
174305
bool isPredefinedRuntimeFunction(unsigned ID) const {
175-
return strchr(getRecord(ID).Attributes, 'i') != nullptr;
306+
return strchr(getAttributesString(ID), 'i') != nullptr;
176307
}
177308

178309
/// Determines whether this builtin is a C++ standard library function
179310
/// that lives in (possibly-versioned) namespace std, possibly a template
180311
/// specialization, where the signature is determined by the standard library
181312
/// declaration.
182313
bool isInStdNamespace(unsigned ID) const {
183-
return strchr(getRecord(ID).Attributes, 'z') != nullptr;
314+
return strchr(getAttributesString(ID), 'z') != nullptr;
184315
}
185316

186317
/// Determines whether this builtin can have its address taken with no
@@ -194,33 +325,33 @@ class Context {
194325

195326
/// Determines whether this builtin has custom typechecking.
196327
bool hasCustomTypechecking(unsigned ID) const {
197-
return strchr(getRecord(ID).Attributes, 't') != nullptr;
328+
return strchr(getAttributesString(ID), 't') != nullptr;
198329
}
199330

200331
/// Determines whether a declaration of this builtin should be recognized
201332
/// even if the type doesn't match the specified signature.
202333
bool allowTypeMismatch(unsigned ID) const {
203-
return strchr(getRecord(ID).Attributes, 'T') != nullptr ||
334+
return strchr(getAttributesString(ID), 'T') != nullptr ||
204335
hasCustomTypechecking(ID);
205336
}
206337

207338
/// Determines whether this builtin has a result or any arguments which
208339
/// are pointer types.
209340
bool hasPtrArgsOrResult(unsigned ID) const {
210-
return strchr(getRecord(ID).Type, '*') != nullptr;
341+
return strchr(getTypeString(ID), '*') != nullptr;
211342
}
212343

213344
/// Return true if this builtin has a result or any arguments which are
214345
/// reference types.
215346
bool hasReferenceArgsOrResult(unsigned ID) const {
216-
return strchr(getRecord(ID).Type, '&') != nullptr ||
217-
strchr(getRecord(ID).Type, 'A') != nullptr;
347+
return strchr(getTypeString(ID), '&') != nullptr ||
348+
strchr(getTypeString(ID), 'A') != nullptr;
218349
}
219350

220351
/// If this is a library function that comes from a specific
221352
/// header, retrieve that header name.
222353
const char *getHeaderName(unsigned ID) const {
223-
return getRecord(ID).Header.getName();
354+
return getInfo(ID).Header.getName();
224355
}
225356

226357
/// Determine whether this builtin is like printf in its
@@ -245,27 +376,25 @@ class Context {
245376
/// Such functions can be const when the MathErrno lang option and FP
246377
/// exceptions are disabled.
247378
bool isConstWithoutErrnoAndExceptions(unsigned ID) const {
248-
return strchr(getRecord(ID).Attributes, 'e') != nullptr;
379+
return strchr(getAttributesString(ID), 'e') != nullptr;
249380
}
250381

251382
bool isConstWithoutExceptions(unsigned ID) const {
252-
return strchr(getRecord(ID).Attributes, 'g') != nullptr;
383+
return strchr(getAttributesString(ID), 'g') != nullptr;
253384
}
254385

255-
const char *getRequiredFeatures(unsigned ID) const {
256-
return getRecord(ID).Features;
257-
}
386+
const char *getRequiredFeatures(unsigned ID) const;
258387

259388
unsigned getRequiredVectorWidth(unsigned ID) const;
260389

261390
/// Return true if builtin ID belongs to AuxTarget.
262391
bool isAuxBuiltinID(unsigned ID) const {
263-
return ID >= (Builtin::FirstTSBuiltin + TSRecords.size());
392+
return ID >= (Builtin::FirstTSBuiltin + TSInfos.size());
264393
}
265394

266395
/// Return real builtin ID (i.e. ID it would have during compilation
267396
/// for AuxTarget).
268-
unsigned getAuxBuiltinID(unsigned ID) const { return ID - TSRecords.size(); }
397+
unsigned getAuxBuiltinID(unsigned ID) const { return ID - TSInfos.size(); }
269398

270399
/// Returns true if this is a libc/libm function without the '__builtin_'
271400
/// prefix.
@@ -277,16 +406,20 @@ class Context {
277406

278407
/// Return true if this function can be constant evaluated by Clang frontend.
279408
bool isConstantEvaluated(unsigned ID) const {
280-
return strchr(getRecord(ID).Attributes, 'E') != nullptr;
409+
return strchr(getAttributesString(ID), 'E') != nullptr;
281410
}
282411

283412
/// Returns true if this is an immediate (consteval) function
284413
bool isImmediate(unsigned ID) const {
285-
return strchr(getRecord(ID).Attributes, 'G') != nullptr;
414+
return strchr(getAttributesString(ID), 'G') != nullptr;
286415
}
287416

288417
private:
289-
const Info &getRecord(unsigned ID) const;
418+
std::pair<const char *, const Info &> getStrTableAndInfo(unsigned ID) const;
419+
420+
const Info &getInfo(unsigned ID) const {
421+
return getStrTableAndInfo(ID).second;
422+
}
290423

291424
/// Helper function for isPrintfLike and isScanfLike.
292425
bool isLike(unsigned ID, unsigned &FormatIdx, bool &HasVAListArg,
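
The size-to-offset translation done by `Storage::Make` in the header above can be tried in isolation. The following is a reduced sketch under stated assumptions, not the commit's code: `DemoInfo`, `sizesToOffsets`, `DEMO_BUILTINS`, `DEMO_STR_TABLE`, and `DEMO_ENTRY` are hypothetical names, only two of the four string fields are modeled, and a list-style X-macro stands in for the real `.def` files. It assumes C++17.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for Builtin::Info, reduced to two string fields.
struct DemoInfo {
  int Name;
  int Type;
};

// Same idea as Storage::Make: entries arrive holding string *sizes*
// (including the NUL) and leave holding *offsets* into the shared table.
template <std::size_t N>
constexpr std::array<DemoInfo, N>
sizesToOffsets(std::array<DemoInfo, N> Infos) {
  int Offset = 0;
  for (auto &I : Infos) {
    DemoInfo New = {};
    New.Name = Offset;
    Offset += I.Name;
    New.Type = Offset;
    Offset += I.Type;
    I = New;
  }
  return Infos;
}

// A tiny X-macro list playing the role of a `.def` file.
#define DEMO_BUILTINS(X)                                                       \
  X(demo_abs, "ii")                                                            \
  X(demo_fabs, "dd")

// One expansion emits the concatenated strings, the other emits the sizes.
#define DEMO_STR_TABLE(ID, TYPE) #ID "\0" TYPE "\0"
#define DEMO_ENTRY(ID, TYPE) DemoInfo{sizeof(#ID), sizeof(TYPE)},

static constexpr char DemoStrings[] = DEMO_BUILTINS(DEMO_STR_TABLE);
static constexpr auto DemoInfos =
    sizesToOffsets(std::array<DemoInfo, 2>{{DEMO_BUILTINS(DEMO_ENTRY)}});

// The running sum is evaluated entirely at compile time.
static_assert(DemoInfos[0].Name == 0 && DemoInfos[0].Type == 9, "");
static_assert(DemoInfos[1].Name == 12 && DemoInfos[1].Type == 22, "");
```

Because both expansions walk the same list in the same order, `sizeof` of each literal (which counts the trailing NUL) matches exactly the bytes that literal contributes to the table, which is why a running sum is enough to recover every offset.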

‎clang/include/clang/Basic/BuiltinsPPC.def

+1
Original file line numberDiff line numberDiff line change
@@ -1138,5 +1138,6 @@ UNALIASED_CUSTOM_BUILTIN(mma_pmxvbf16ger2nn, "vW512*VVi15i15i3", true,
11381138
// FIXME: Obviously incomplete.
11391139

11401140
#undef BUILTIN
1141+
#undef TARGET_BUILTIN
11411142
#undef CUSTOM_BUILTIN
11421143
#undef UNALIASED_CUSTOM_BUILTIN
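
The commit message does not spell out why `#undef TARGET_BUILTIN` is added here. A plausible reading is that the file is now expanded more than once with different macro definitions (once for the string table, once for the `Info` entries), so each pass has to leave the macro undefined for the next. The sketch below illustrates that general pattern only; `DEMO_DEF_BODY`, `DemoTable`, and `DemoCount` are hypothetical, and a macro replaces the `#include` of a real `.def` file so the example fits in one translation unit.

```cpp
#include <cassert>
#include <cstring>

// Hypothetical stand-in for the body of a `.def` file.
#define DEMO_DEF_BODY                                                          \
  BUILTIN(demo_a, "v")                                                         \
  TARGET_BUILTIN(demo_b, "ii", "feat")

// Pass 1: expand every entry into its piece of the string table.
#define BUILTIN(ID, TYPE) #ID "\0" TYPE "\0" /*FEATURE*/ "\0"
#define TARGET_BUILTIN(ID, TYPE, FEATURE) #ID "\0" TYPE "\0" FEATURE "\0"
static constexpr char DemoTable[] = DEMO_DEF_BODY;
// Without these, the differing redefinitions below would be ill-formed.
#undef BUILTIN
#undef TARGET_BUILTIN

// Pass 2: expand the same list again, this time just counting entries.
#define BUILTIN(ID, TYPE) +1
#define TARGET_BUILTIN(ID, TYPE, FEATURE) +1
static constexpr int DemoCount = 0 DEMO_DEF_BODY;
#undef BUILTIN
#undef TARGET_BUILTIN

static_assert(DemoCount == 2, "both passes see the same two entries");
```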
