LLVM Study Notes (3)

2.2.2. Operand descriptions

In an Instruction definition, OutOperandList and InOperandList are dags of the form (outs op1, op2, …) and (ins op1, op2, …). An operand can be a register, in which case its register type is given by a RegisterClass.

2.2.2.1. Registers

2.2.2.1.1. Register

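Every target in practical use today has registers (machines that run purely on a stack are no longer around), so describing registers is an important part of an LLVM backend. The register definitions live in the target's RegisterInfo.td file (X86RegisterInfo.td for X86), and they all build on the following base class from Target.td.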


    class Register<string n, list<string> altNames = []> {
      string Namespace = "";
      string AsmName = n;
      list<string> AltNames = altNames;

      // Aliases - A list of registers that this register overlaps with. A read or
      // modification of this register can potentially read or modify the aliased
      // registers.
      list<Register> Aliases = [];

      // SubRegs - A list of registers that are parts of this register. Note these
      // are "immediate" sub-registers and the registers within the list do not
      // themselves overlap. e.g. For X86, EAX's SubRegs list contains only [AX],
      // not [AX, AH, AL].
      list<Register> SubRegs = [];

      // SubRegIndices - For each register in SubRegs, specify the SubRegIndex used
      // to address it. Sub-sub-register indices are automatically inherited from
      // SubRegs.
      list<SubRegIndex> SubRegIndices = [];

      // RegAltNameIndices - The alternate name indices which are valid for this
      // register.
      list<RegAltNameIndex> RegAltNameIndices = [];

      // DwarfNumbers - Numbers used internally by gcc/gdb to identify the register.
      // These values can be determined by locating the <target>.h file in the
      // directory llvmgcc/gcc/config/<target>/ and looking for REGISTER_NAMES. The
      // order of these names correspond to the enumeration used by gcc. A value of
      // -1 indicates that the gcc number is undefined and -2 that register number
      // is invalid for this mode/flavour.
      list<int> DwarfNumbers = [];

      // CostPerUse - Additional cost of instructions using this register compared
      // to other registers in its class. The register allocator will try to
      // minimize the number of instructions using a register with a CostPerUse.
      // This is used by the x86-64 and ARM Thumb targets where some registers
      // require larger instruction encodings.
      int CostPerUse = 0;

      // CoveredBySubRegs - When this bit is set, the value of this register is
      // completely determined by the value of its sub-registers. For example, the
      // x86 register AX is covered by its sub-registers AL and AH, but EAX is not
      // covered by its sub-register AX.
      bit CoveredBySubRegs = 0;

      // HWEncoding - The target specific hardware encoding for this register.
      bits<16> HWEncoding = 0;
    }

Each target derives from this class as needed. For X86 the derived class is X86Reg (X86RegisterInfo.td):


    class X86Reg<string n, bits<16> Enc, list<Register> subregs = []> : Register<n> {
      let Namespace = "X86";
      let HWEncoding = Enc;
      let SubRegs = subregs;
    }

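The Register definition is more involved than one might expect, because some registers have individually addressable parts. On X86, EAX, AX and AL are all addressable registers, and each has its own Register definition in the TD file. Yet all three name the same physical register: EAX contains AX, and AX contains AL. The backend has to know these relationships to use registers well, which is what SubRegs and SubRegIndices are for. SubRegIndices is a list of SubRegIndex records (Target.td); a SubRegIndex gives an offset and a size, which together identify one addressable part (a sub-register) of a register. Note that a sub-register index is independent of any particular register: a given offset and size needs only one index definition, however many registers use it. SubRegs and SubRegIndices must correspond one to one.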
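The pairing of SubRegs with SubRegIndices can be sketched in a few lines of Python. This is only an illustrative model, not LLVM code: the `Register` and `SubRegIndex` classes and the `sub` method are hypothetical names, and the index names and sizes are borrowed from the X86 definitions that appear later in these notes.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class SubRegIndex:
    name: str
    size: int        # width in bits
    offset: int = 0  # first bit of the part inside the parent register


@dataclass
class Register:
    name: str
    sub_regs: list = field(default_factory=list)     # immediate sub-registers only
    sub_indices: list = field(default_factory=list)  # one index per sub-register

    def __post_init__(self):
        # SubRegs and SubRegIndices must pair up one to one.
        if len(self.sub_regs) != len(self.sub_indices):
            raise ValueError(f"{self.name}: SubRegs/SubRegIndices length mismatch")

    def sub(self, index):
        """Return the immediate sub-register addressed by `index`, or None."""
        for reg, idx in zip(self.sub_regs, self.sub_indices):
            if idx == index:
                return reg
        return None


sub_8bit = SubRegIndex("sub_8bit", 8)
sub_8bit_hi = SubRegIndex("sub_8bit_hi", 8, 8)
sub_16bit = SubRegIndex("sub_16bit", 16)

AL, AH = Register("AL"), Register("AH")
AX = Register("AX", [AL, AH], [sub_8bit, sub_8bit_hi])
EAX = Register("EAX", [AX], [sub_16bit])

# The same two indices are reused for a different register.
BL, BH = Register("BL"), Register("BH")
BX = Register("BX", [BL, BH], [sub_8bit, sub_8bit_hi])
```

Here `AX.sub(sub_8bit_hi)` gives AH and `BX.sub(sub_8bit)` gives BL, while `EAX.sub(sub_8bit)` gives None because AL is not an immediate sub-register of EAX. Reaching it requires the derivation that TableGen performs.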


    class SubRegIndex<int size, int offset = 0> {
      string Namespace = "";

      // Size - Size (in bits) of the sub-registers represented by this index.
      int Size = size;

      // Offset - Offset of the first bit that is part of this sub-register index.
      // Set it to -1 if the same index is used to represent sub-registers that can
      // be at different offsets (for example when using an index to access an
      // element in a register tuple).
      int Offset = offset;

      // ComposedOf - A list of two SubRegIndex instances, [A, B].
      // This indicates that this SubRegIndex is the result of composing A and B.
      // See ComposedSubRegIndex.
      list<SubRegIndex> ComposedOf = [];

      // CoveringSubRegIndices - A list of two or more sub-register indexes that
      // cover this sub-register.
      //
      // This field should normally be left blank as TableGen can infer it.
      //
      // TableGen automatically detects sub-registers that straddle the registers
      // in the SubRegs field of a Register definition. For example:
      //
      //   Q0    = dsub_0 -> D0, dsub_1 -> D1
      //   Q1    = dsub_0 -> D2, dsub_1 -> D3
      //   D1_D2 = dsub_0 -> D1, dsub_1 -> D2
      //   QQ0   = qsub_0 -> Q0, qsub_1 -> Q1
      //
      // TableGen will infer that D1_D2 is a sub-register of QQ0. It will be given
      // the synthetic index dsub_1_dsub_2 unless some SubRegIndex is defined with
      // CoveringSubRegIndices = [dsub_1, dsub_2].
      list<SubRegIndex> CoveringSubRegIndices = [];
    }

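TableGen can infer the relationships between sub-register indices from the SubRegIndices fields of the Register definitions. Some cases, however, are too complex to describe or infer this way (see the ARM example below). For those, the ComposedOf and CoveringSubRegIndices fields of SubRegIndex are used (in LLVM 3.6, CoveringSubRegIndices is not actually used yet).

For example, suppose index Id1 selects sub-register Rr of register R, and index Id2 selects sub-register r of Rr. If index Id3 selects the part of R that corresponds to r, then Id3 is equivalent to applying Id1 and then Id2 to R (called composition below). TableGen can usually infer such relationships; when it cannot, they must be stated explicitly in the ComposedOf field.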
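The Id1/Id2/Id3 relationship can be written out as a small Python sketch. The tables and function names here (`SUBREGS`, `COMPOSED_OF`, `resolve`) are hypothetical and exist only to illustrate the idea; TableGen's real bookkeeping is considerably richer.

```python
# Each register maps an index name to the immediate sub-register it selects.
SUBREGS = {
    "R": {"Id1": "Rr"},
    "Rr": {"Id2": "r"},
    "r": {},
}

# Id3 is declared to mean "Id1 followed by Id2", which is the kind of fact
# that ComposedOf = [Id1, Id2] states explicitly.
COMPOSED_OF = {"Id3": ("Id1", "Id2")}


def apply_index(reg, index):
    """Select an immediate sub-register of `reg`."""
    return SUBREGS[reg][index]


def resolve(reg, index):
    """Resolve an index that is either direct or declared as a composition."""
    if index in SUBREGS[reg]:
        return apply_index(reg, index)
    first, second = COMPOSED_OF[index]
    return apply_index(apply_index(reg, first), second)
```

With these tables, `resolve("R", "Id3")` reaches r by way of Rr, exactly as applying Id1 and then Id2 would.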

2.2.2.1.2. The X86 example

For example, X86RegisterInfo.td defines the following SubRegIndex records for the X86 target:


    let Namespace = "X86" in {
      def sub_8bit    : SubRegIndex<8>;
      def sub_8bit_hi : SubRegIndex<8, 8>;
      def sub_16bit   : SubRegIndex<16>;
      def sub_32bit   : SubRegIndex<32>;
      def sub_xmm     : SubRegIndex<128>;
      def sub_ymm     : SubRegIndex<256>;
    }

The registers available on X86 are then defined as follows. The 8-bit registers (AL through R15B) are the smallest addressable registers on X86, so they need no indices. From AX onward, every register has addressable parts; each part gets an index alongside its own definition. For instance, the indices sub_8bit and sub_8bit_hi select AL and AH within AX.

This description is incomplete, though. Take the chain R8 → R8D → R8W → R8B: R8B is directly addressable within R8, but no index for it is defined directly. Deriving such indices is TableGen's job, as will be seen later.

    // 8-bit registers
    // Low registers
    def AL : X86Reg;
    def DL : X86Reg;
    def CL : X86Reg;
    def BL : X86Reg;

    // High registers. On x86-64, these cannot be used in any instruction
    // with a REX prefix.
    def AH : X86Reg;
    def DH : X86Reg;
    def CH : X86Reg;
    def BH : X86Reg;

    // X86-64 only, requires REX.
    let CostPerUse = 1 in {
      def SIL  : X86Reg;
      def DIL  : X86Reg;
      def BPL  : X86Reg;
      def SPL  : X86Reg;
      def R8B  : X86Reg;
      def R9B  : X86Reg;
      def R10B : X86Reg;
      def R11B : X86Reg;
      def R12B : X86Reg;
      def R13B : X86Reg;
      def R14B : X86Reg;
      def R15B : X86Reg;
    }

    // 16-bit registers
    let SubRegIndices = [sub_8bit, sub_8bit_hi], CoveredBySubRegs = 1 in {
      def AX : X86Reg;
      def DX : X86Reg;
      def CX : X86Reg;
      def BX : X86Reg;
    }


    let SubRegIndices = [sub_8bit] in {
      def SI : X86Reg;
      def DI : X86Reg;
      def BP : X86Reg;
      def SP : X86Reg;
    }
    def IP : X86Reg;

    // X86-64 only, requires REX.
    let SubRegIndices = [sub_8bit], CostPerUse = 1 in {
      def R8W  : X86Reg;
      def R9W  : X86Reg;
      def R10W : X86Reg;
      def R11W : X86Reg;
      def R12W : X86Reg;
      def R13W : X86Reg;
      def R14W : X86Reg;
      def R15W : X86Reg;
    }

    // 32-bit registers
    let SubRegIndices = [sub_16bit] in {
      def EAX : X86Reg, DwarfRegNum;
      def EDX : X86Reg, DwarfRegNum;
      def ECX : X86Reg, DwarfRegNum;
      def EBX : X86Reg, DwarfRegNum;
      def ESI : X86Reg, DwarfRegNum;
      def EDI : X86Reg, DwarfRegNum;
      def EBP : X86Reg, DwarfRegNum;
      def ESP : X86Reg, DwarfRegNum;
      def EIP : X86Reg, DwarfRegNum;

      // X86-64 only, requires REX
      let CostPerUse = 1 in {
        def R8D  : X86Reg;
        def R9D  : X86Reg;
        def R10D : X86Reg;
        def R11D : X86Reg;
        def R12D : X86Reg;
        def R13D : X86Reg;
        def R14D : X86Reg;
        def R15D : X86Reg;
      }
    }

    // 64-bit registers, X86-64 only
    let SubRegIndices = [sub_32bit] in {
      def RAX : X86Reg, DwarfRegNum;
      def RDX : X86Reg, DwarfRegNum;
      def RCX : X86Reg, DwarfRegNum;
      def RBX : X86Reg, DwarfRegNum;
      def RSI : X86Reg, DwarfRegNum;
      def RDI : X86Reg, DwarfRegNum;
      def RBP : X86Reg, DwarfRegNum;
      def RSP : X86Reg, DwarfRegNum;

      // These also require REX.
      let CostPerUse = 1 in {
        def R8  : X86Reg, DwarfRegNum;
        def R9  : X86Reg, DwarfRegNum;
        def R10 : X86Reg, DwarfRegNum;
        def R11 : X86Reg, DwarfRegNum;
        def R12 : X86Reg, DwarfRegNum;
        def R13 : X86Reg, DwarfRegNum;
        def R14 : X86Reg, DwarfRegNum;
        def R15 : X86Reg, DwarfRegNum;
        def RIP : X86Reg, DwarfRegNum;
      }
    }

    // MMX Registers. These are actually aliased to ST0 .. ST7
    def MM0 : X86Reg, DwarfRegNum;
    def MM1 : X86Reg, DwarfRegNum;
    def MM2 : X86Reg, DwarfRegNum;
    def MM3 : X86Reg, DwarfRegNum;
    def MM4 : X86Reg, DwarfRegNum;
    def MM5 : X86Reg, DwarfRegNum;
    def MM6 : X86Reg, DwarfRegNum;
    def MM7 : X86Reg, DwarfRegNum;

    // Pseudo Floating Point registers
    def FP0 : X86Reg;
    def FP1 : X86Reg;
    def FP2 : X86Reg;
    def FP3 : X86Reg;
    def FP4 : X86Reg;
    def FP5 : X86Reg;
    def FP6 : X86Reg;
    def FP7 : X86Reg;

    // XMM Registers, used by the various SSE instruction set extensions.
    def XMM0 : X86Reg, DwarfRegNum;
    def XMM1 : X86Reg, DwarfRegNum;
    def XMM2 : X86Reg, DwarfRegNum;
    def XMM3 : X86Reg, DwarfRegNum;
    def XMM4 : X86Reg, DwarfRegNum;
    def XMM5 : X86Reg, DwarfRegNum;
    def XMM6 : X86Reg, DwarfRegNum;
    def XMM7 : X86Reg, DwarfRegNum;

    // X86-64 only
    let CostPerUse = 1 in {
      def XMM8  : X86Reg, DwarfRegNum;
      def XMM9  : X86Reg, DwarfRegNum;
      def XMM10 : X86Reg, DwarfRegNum;
      def XMM11 : X86Reg, DwarfRegNum;
      def XMM12 : X86Reg, DwarfRegNum;
      def XMM13 : X86Reg, DwarfRegNum;
      def XMM14 : X86Reg, DwarfRegNum;
      def XMM15 : X86Reg, DwarfRegNum;

      def XMM16 : X86Reg, DwarfRegNum;
      def XMM17 : X86Reg, DwarfRegNum;
      def XMM18 : X86Reg, DwarfRegNum;
      def XMM19 : X86Reg, DwarfRegNum;
      def XMM20 : X86Reg, DwarfRegNum;
      def XMM21 : X86Reg, DwarfRegNum;
      def XMM22 : X86Reg, DwarfRegNum;
      def XMM23 : X86Reg, DwarfRegNum;
      def XMM24 : X86Reg, DwarfRegNum;
      def XMM25 : X86Reg, DwarfRegNum;
      def XMM26 : X86Reg, DwarfRegNum;
      def XMM27 : X86Reg, DwarfRegNum;
      def XMM28 : X86Reg, DwarfRegNum;
      def XMM29 : X86Reg, DwarfRegNum;
      def XMM30 : X86Reg, DwarfRegNum;
      def XMM31 : X86Reg, DwarfRegNum;

    } // CostPerUse

    // YMM0-15 registers, used by AVX instructions and
    // YMM16-31 registers, used by AVX-512 instructions.
    let SubRegIndices = [sub_xmm] in {
      foreach Index = 0-31 in {
        def YMM#Index : X86Reg<"ymm"#Index, Index, [!cast<X86Reg>("XMM"#Index)]>,
                        DwarfRegAlias<!cast<X86Reg>("XMM"#Index)>;
      }
    }

    // ZMM Registers, used by AVX-512 instructions.
    let SubRegIndices = [sub_ymm] in {
      foreach Index = 0-31 in {
        def ZMM#Index : X86Reg<"zmm"#Index, Index, [!cast<X86Reg>("YMM"#Index)]>,
                        DwarfRegAlias<!cast<X86Reg>("XMM"#Index)>;
      }
    }

    // Mask Registers, used by AVX-512 instructions.
    def K0 : X86Reg, DwarfRegNum;
    def K1 : X86Reg, DwarfRegNum;
    def K2 : X86Reg, DwarfRegNum;
    def K3 : X86Reg, DwarfRegNum;
    def K4 : X86Reg, DwarfRegNum;
    def K5 : X86Reg, DwarfRegNum;
    def K6 : X86Reg, DwarfRegNum;
    def K7 : X86Reg, DwarfRegNum;

    // Floating point stack registers. These don't map one-to-one to the FP
    // pseudo registers, but we still mark them as aliasing FP registers. That
    // way both kinds can be live without exceeding the stack depth. ST registers
    // are only live around inline assembly.
    def ST0 : X86Reg, DwarfRegNum;
    def ST1 : X86Reg, DwarfRegNum;
    def ST2 : X86Reg, DwarfRegNum;
    def ST3 : X86Reg, DwarfRegNum;
    def ST4 : X86Reg, DwarfRegNum;
    def ST5 : X86Reg, DwarfRegNum;
    def ST6 : X86Reg, DwarfRegNum;
    def ST7 : X86Reg, DwarfRegNum;

    // Floating-point status word
    def FPSW : X86Reg;

    // Status flags register
    def EFLAGS : X86Reg;

    // Segment registers
    def CS : X86Reg;
    def DS : X86Reg;
    def SS : X86Reg;
    def ES : X86Reg;
    def FS : X86Reg;
    def GS : X86Reg;

    // Debug registers
    def DR0  : X86Reg;
    def DR1  : X86Reg;
    def DR2  : X86Reg;
    def DR3  : X86Reg;
    def DR4  : X86Reg;
    def DR5  : X86Reg;
    def DR6  : X86Reg;
    def DR7  : X86Reg;
    def DR8  : X86Reg;
    def DR9  : X86Reg;
    def DR10 : X86Reg;
    def DR11 : X86Reg;
    def DR12 : X86Reg;
    def DR13 : X86Reg;
    def DR14 : X86Reg;
    def DR15 : X86Reg;

    // Control registers
    def CR0  : X86Reg;
    def CR1  : X86Reg;
    def CR2  : X86Reg;
    def CR3  : X86Reg;
    def CR4  : X86Reg;
    def CR5  : X86Reg;
    def CR6  : X86Reg;
    def CR7  : X86Reg;
    def CR8  : X86Reg;
    def CR9  : X86Reg;
    def CR10 : X86Reg;
    def CR11 : X86Reg;
    def CR12 : X86Reg;
    def CR13 : X86Reg;
    def CR14 : X86Reg;
    def CR15 : X86Reg;

    // Pseudo index registers
    def EIZ : X86Reg;
    def RIZ : X86Reg;

    // Bound registers, used in MPX instructions
    def BND0 : X86Reg;
    def BND1 : X86Reg;
    def BND2 : X86Reg;
    def BND3 : X86Reg;

These definitions can be checked against Table 3-2 in the Intel® 64 and IA-32 Architectures Software Developer's Manual, Volume 1.

Register type | Without REX prefix | With REX prefix
Byte registers | AL, BL, CL, DL, AH, BH, CH, DH | AL, BL, CL, DL, DIL, SIL, BPL, SPL, R8L-R15L
Word registers | AX, BX, CX, DX, DI, SI, BP, SP | AX, BX, CX, DX, DI, SI, BP, SP, R8W-R15W
Doubleword registers | EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP | EAX, EBX, ECX, EDX, EDI, ESI, EBP, ESP, R8D-R15D
Quadword registers | N.A. | RAX, RBX, RCX, RDX, RDI, RSI, RBP, RSP, R8-R15
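Returning to the R8 → R8D → R8W → R8B chain mentioned earlier, the following Python sketch shows one way such nested sub-registers and their indices could be derived from the immediate relationships alone. It is a simplified model under stated assumptions: the tables `INDEX` and `IMMEDIATE` and both functions are hypothetical, the sizes are the ones the index names suggest, and TableGen's actual algorithm handles many more cases.

```python
# Index name -> (size in bits, offset in bits). All three start at bit 0.
INDEX = {"sub_8bit": (8, 0), "sub_16bit": (16, 0), "sub_32bit": (32, 0)}

# Immediate sub-registers only, as listed in each register's SubRegs.
IMMEDIATE = {
    "R8": [("sub_32bit", "R8D")],
    "R8D": [("sub_16bit", "R8W")],
    "R8W": [("sub_8bit", "R8B")],
    "R8B": [],
}


def all_subregs(reg, base=0):
    """Yield (sub-register, size, offset within `reg`) for every nested part."""
    for idx, sub in IMMEDIATE[reg]:
        size, off = INDEX[idx]
        yield sub, size, base + off
        yield from all_subregs(sub, base + off)


def index_for(size, offset):
    """Find an existing index with this size and offset, if there is one."""
    for name, (s, o) in INDEX.items():
        if (s, o) == (size, offset):
            return name
    return None


derived = {sub: index_for(size, off) for sub, size, off in all_subregs("R8")}
```

In this model R8B ends up at offset 0 with size 8 inside R8, so the existing sub_8bit index can address it even though R8's own definition only mentions sub_32bit.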

2.2.2.1.3. RegisterClass

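Registers are typed as well: a floating-point value, for instance, cannot be stored in a general-purpose register such as EBX. The scheme differs from an ordinary type system, though. MMX registers, for example, can hold either integer or floating-point values, and they can stand in for integer storage when general-purpose registers run short, but not the other way round. To describe what registers are used for, LLVM defines RegisterClass. Registers with the same use go into the same RegisterClass, and registers within one RegisterClass are interchangeable.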


    class RegisterClass<string namespace, list<ValueType> regTypes, int alignment,
                        dag regList, RegAltNameIndex idx = NoRegAltName>
      : DAGOperand {
      string Namespace = namespace;

      // RegType - Specify the list ValueType of the registers in this register
      // class. Note that all registers in a register class must have the same
      // ValueTypes. This is a list because some targets permit storing different
      // types in same register, for example vector values with 128-bit total size,
      // but different count/size of items, like SSE on x86.
      //
      list<ValueType> RegTypes = regTypes;

      // Size - Specify the spill size in bits of the registers. A default value of
      // zero lets tablegen pick an appropriate size.
      int Size = 0;

      // Alignment - Specify the alignment required of the registers when they are
      // stored or loaded to memory.
      //
      int Alignment = alignment;

      // CopyCost - This value is used to specify the cost of copying a value
      // between two registers in this register class. The default value is one
      // meaning it takes a single instruction to perform the copying. A negative
      // value means copying is extremely expensive or impossible.
      int CopyCost = 1;

      // MemberList - Specify which registers are in this class. If the
      // allocation_order_* method are not specified, this also defines the order of
      // allocation used by the register allocator.
      //
      dag MemberList = regList;

      // AltNameIndex - The alternate register name to use when printing operands
      // of this register class. Every register in the register class must have
      // a valid alternate name for the given index.
      RegAltNameIndex altNameIndex = idx;

      // isAllocatable - Specify that the register class can be used for virtual
      // registers and register allocation. Some register classes are only used to
      // model instruction operand constraints, and should have isAllocatable = 0.
      bit isAllocatable = 1;

      // AltOrders - List of alternative allocation orders. The default order is
      // MemberList itself, and that is good enough for most targets since the
      // register allocators automatically remove reserved registers and move
      // callee-saved registers to the end.
      list<dag> AltOrders = [];

      // AltOrderSelect - The body of a function that selects the allocation order
      // to use in a given machine function. The code will be inserted in a
      // function like this:
      //
      //   static inline unsigned f(const MachineFunction &MF) { ... }
      //
      // The function should return 0 to select the default order defined by
      // MemberList, 1 to select the first AltOrders entry and so on.
      code AltOrderSelect = [{}];

      // Specify allocation priority for register allocators using a greedy
      // heuristic. Classes with higher priority values are assigned first. This is
      // useful as it is sometimes beneficial to assign registers to highly
      // constrained classes first. The value has to be in the range [0,63].
      int AllocationPriority = 0;
    }

RegTypes lists the value types that registers of the class can hold; a class may support several, hence the list. MemberList names the registers of the class and, by default, the order in which they are allocated (earlier members first). For some processor families, X86 among them, register kinds and counts differ widely between CPU variants, so the MemberList order suits only some of them and the others need a different sequence: that is what AltOrders provides. Something then has to say which order applies, so AltOrderSelect holds the body of an embedded selection function: returning 0 picks the MemberList order, returning 1 picks the first AltOrders entry, and so on.
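How the selector's return value maps onto an allocation order can be sketched in Python. This is a model only: `allocation_order` is a hypothetical function, the machine function is represented by a plain dict with an assumed `is64Bit` key, and the register list is a shortened version of an X86-style 8-bit class.

```python
def allocation_order(member_list, alt_orders, alt_order_select, machine_function):
    """Return the order to allocate from.

    0 selects member_list; k >= 1 selects alt_orders[k - 1].
    """
    choice = alt_order_select(machine_function) if alt_orders else 0
    if choice == 0:
        return member_list
    return alt_orders[choice - 1]


# Shortened member list in the style of an X86 8-bit register class.
GR8 = ["AL", "CL", "DL", "AH", "CH", "DH", "BL", "BH", "SIL", "DIL"]

# One alternative order: the same list without the high-byte registers.
GR8_ALT = [r for r in GR8 if r not in {"AH", "BH", "CH", "DH"}]


def select(machine_function):
    # Mirrors a selector body that returns the is64Bit flag as 0 or 1.
    return int(machine_function["is64Bit"])


order_32 = allocation_order(GR8, [GR8_ALT], select, {"is64Bit": False})
order_64 = allocation_order(GR8, [GR8_ALT], select, {"is64Bit": True})
```

Here `order_32` is the full member list and `order_64` is the alternative that skips AH, BH, CH and DH.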

The X86 target defines these RegisterClass records:


    def GR8 : RegisterClass<"X86", [i8], 8,
                            (add AL, CL, DL, AH, CH, DH, BL, BH, SIL, DIL, BPL, SPL,
                                 R8B, R9B, R10B, R11B, R14B, R15B, R12B, R13B)> {
      let AltOrders = [(sub GR8, AH, BH, CH, DH)];
      let AltOrderSelect = [{
        return MF.getSubtarget<X86Subtarget>().is64Bit();
      }];
    }

    def GR16 : RegisterClass<"X86", [i16], 16,
                             (add AX, CX, DX, SI, DI, BX, BP, SP,
                                  R8W, R9W, R10W, R11W, R14W, R15W, R12W, R13W)>;

    def GR32 : RegisterClass<"X86", [i32], 32,
                             (add EAX, ECX, EDX, ESI, EDI, EBX, EBP, ESP,
                                  R8D, R9D, R10D, R11D, R14D, R15D, R12D, R13D)>;

    // GR64 - 64-bit GPRs. This oddly includes RIP, which isn't accurate, since
    // RIP isn't really a register and it can't be used anywhere except in an
    // address, but it doesn't cause trouble.
    def GR64 : RegisterClass<"X86", [i64], 64,
                             (add RAX, RCX, RDX, RSI, RDI, R8, R9, R10, R11,
                                  RBX, R14, R15, R12, R13, RBP, RSP, RIP)>;

    // Segment registers for use by MOV instructions (and others) that have a
    // segment register as one operand. Always contain a 16-bit segment
    // descriptor.
    def SEGMENT_REG : RegisterClass<"X86", [i16], 16, (add CS, DS, SS, ES, FS, GS)>;

    // Debug registers.
    def DEBUG_REG : RegisterClass<"X86", [i32], 32, (sequence "DR%u", 0, 7)>;

    // Control registers.
    def CONTROL_REG : RegisterClass<"X86", [i64], 64, (sequence "CR%u", 0, 15)>;

    // GR8_ABCD_L, GR8_ABCD_H, GR16_ABCD, GR32_ABCD, GR64_ABCD - Subclasses of
    // GR8, GR16, GR32, and GR64 which contain just the "a" "b", "c", and "d"
    // registers. On x86-32, GR16_ABCD and GR32_ABCD are classes for registers
    // that support 8-bit subreg operations. On x86-64, GR16_ABCD, GR32_ABCD,
    // and GR64_ABCD are classes for registers that support 8-bit h-register
    // operations.
    def GR8_ABCD_L : RegisterClass<"X86", [i8], 8, (add AL, CL, DL, BL)>;
    def GR8_ABCD_H : RegisterClass<"X86", [i8], 8, (add AH, CH, DH, BH)>;
    def GR16_ABCD : RegisterClass<"X86", [i16], 16, (add AX, CX, DX, BX)>;
    def GR32_ABCD : RegisterClass<"X86", [i32], 32, (add EAX, ECX, EDX, EBX)>;
    def GR64_ABCD : RegisterClass<"X86", [i64], 64, (add RAX, RCX, RDX, RBX)>;
    def GR32_TC : RegisterClass<"X86", [i32], 32, (add EAX, ECX, EDX)>;
    def GR64_TC : RegisterClass<"X86", [i64], 64, (add RAX, RCX, RDX, RSI, RDI,
                                                       R8, R9, R11, RIP)>;
    def GR64_TCW64 : RegisterClass<"X86", [i64], 64, (add RAX, RCX, RDX,
                                                          R8, R9, R11)>;

    // GR8_NOREX - GR8 registers which do not require a REX prefix.
    def GR8_NOREX : RegisterClass<"X86", [i8], 8,
                                  (add AL, CL, DL, AH, CH, DH, BL, BH)> {
      let AltOrders = [(sub GR8_NOREX, AH, BH, CH, DH)];
      let AltOrderSelect = [{
        return MF.getSubtarget<X86Subtarget>().is64Bit();
      }];
    }
    // GR16_NOREX - GR16 registers which do not require a REX prefix.
    def GR16_NOREX : RegisterClass<"X86", [i16], 16,
                                   (add AX, CX, DX, SI, DI, BX, BP, SP)>;
    // GR32_NOREX - GR32 registers which do not require a REX prefix.
    def GR32_NOREX : RegisterClass<"X86", [i32], 32,
                                   (add EAX, ECX, EDX, ESI, EDI, EBX, EBP, ESP)>;
    // GR64_NOREX - GR64 registers which do not require a REX prefix.
    def GR64_NOREX : RegisterClass<"X86", [i64], 64,
                                   (add RAX, RCX, RDX, RSI, RDI, RBX, RBP, RSP, RIP)>;

    // GR32_NOAX - GR32 registers except EAX. Used by AddRegFrm of XCHG32 in 64-bit
    // mode to prevent encoding using the 0x90 NOP encoding. xchg %eax, %eax needs
    // to clear upper 32-bits of RAX so is not a NOP.
    def GR32_NOAX : RegisterClass<"X86", [i32], 32, (sub GR32, EAX)>;

    // GR32_NOSP - GR32 registers except ESP.
    def GR32_NOSP : RegisterClass<"X86", [i32], 32, (sub GR32, ESP)>;

    // GR64_NOSP - GR64 registers except RSP (and RIP).
    def GR64_NOSP : RegisterClass<"X86", [i64], 64, (sub GR64, RSP, RIP)>;

    // GR32_NOREX_NOSP - GR32 registers which do not require a REX prefix except
    // ESP.
    def GR32_NOREX_NOSP : RegisterClass<"X86", [i32], 32,
                                        (and GR32_NOREX, GR32_NOSP)>;

    // GR64_NOREX_NOSP - GR64_NOREX registers except RSP.
    def GR64_NOREX_NOSP : RegisterClass<"X86", [i64], 64,
                                        (and GR64_NOREX, GR64_NOSP)>;

    // A class to support the 'A' assembler constraint: EAX then EDX.
    def GR32_AD : RegisterClass<"X86", [i32], 32, (add EAX, EDX)>;

    // Scalar SSE2 floating point registers.
    def FR32 : RegisterClass<"X86", [f32], 32, (sequence "XMM%u", 0, 15)>;

    def FR64 : RegisterClass<"X86", [f64], 64, (add FR32)>;


    // FIXME: This sets up the floating point register files as though they are f64
    // values, though they really are f80 values. This will cause us to spill
    // values as 64-bit quantities instead of 80-bit quantities, which is much much
    // faster on common hardware. In reality, this should be controlled by a
    // command line option or something.

    def RFP32 : RegisterClass<"X86", [f32], 32, (sequence "FP%u", 0, 6)>;
    def RFP64 : RegisterClass<"X86", [f64], 32, (add RFP32)>;
    def RFP80 : RegisterClass<"X86", [f80], 32, (add RFP32)>;

    // Floating point stack registers (these are not allocatable by the
    // register allocator - the floating point stackifier is responsible
    // for transforming FPn allocations to STn registers)
    def RST : RegisterClass<"X86", [f80, f64, f32], 32, (sequence "ST%u", 0, 7)> {
      let isAllocatable = 0;
    }

    // Generic vector registers: VR64 and VR128.
    def VR64 : RegisterClass<"X86", [x86mmx], 64, (sequence "MM%u", 0, 7)>;
    def VR128 : RegisterClass<"X86", [v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
                              128, (add FR32)>;
    def VR256 : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
                              256, (sequence "YMM%u", 0, 15)>;

    // Status flags registers.
    def CCR : RegisterClass<"X86", [i32], 32, (add EFLAGS)> {
      let CopyCost = -1;  // Don't allow copying of status registers.
      let isAllocatable = 0;
    }
    def FPCCR : RegisterClass<"X86", [i16], 16, (add FPSW)> {
      let CopyCost = -1;  // Don't allow copying of status registers.
      let isAllocatable = 0;
    }

    // AVX-512 vector/mask registers.
    def VR512 : RegisterClass<"X86", [v16f32, v8f64, v64i8, v32i16, v16i32, v8i64], 512,
                              (sequence "ZMM%u", 0, 31)>;

    // Scalar AVX-512 floating point registers.
    def FR32X : RegisterClass<"X86", [f32], 32, (sequence "XMM%u", 0, 31)>;

    def FR64X : RegisterClass<"X86", [f64], 64, (add FR32X)>;

    // Extended VR128 and VR256 for AVX-512 instructions
    def VR128X : RegisterClass<"X86", [v16i8, v8i16, v4i32, v2i64, v4f32, v2f64],
                               128, (add FR32X)>;
    def VR256X : RegisterClass<"X86", [v32i8, v16i16, v8i32, v4i64, v8f32, v4f64],
                               256, (sequence "YMM%u", 0, 31)>;

    // Mask registers
    def VK1  : RegisterClass<"X86", [i1],    8,  (sequence "K%u", 0, 7)> { let Size = 8; }
    def VK2  : RegisterClass<"X86", [v2i1],  8,  (add VK1)>  { let Size = 8; }
    def VK4  : RegisterClass<"X86", [v4i1],  8,  (add VK2)>  { let Size = 8; }
    def VK8  : RegisterClass<"X86", [v8i1],  8,  (add VK4)>  { let Size = 8; }
    def VK16 : RegisterClass<"X86", [v16i1], 16, (add VK8)>  { let Size = 16; }
    def VK32 : RegisterClass<"X86", [v32i1], 32, (add VK16)> { let Size = 32; }
    def VK64 : RegisterClass<"X86", [v64i1], 64, (add VK32)> { let Size = 64; }

    def VK1WM  : RegisterClass<"X86", [i1],    8,  (sub VK1, K0)> { let Size = 8; }
    def VK2WM  : RegisterClass<"X86", [v2i1],  8,  (sub VK2, K0)> { let Size = 8; }
    def VK4WM  : RegisterClass<"X86", [v4i1],  8,  (sub VK4, K0)> { let Size = 8; }
    def VK8WM  : RegisterClass<"X86", [v8i1],  8,  (sub VK8, K0)> { let Size = 8; }
    def VK16WM : RegisterClass<"X86", [v16i1], 16, (add VK8WM)>   { let Size = 16; }
    def VK32WM : RegisterClass<"X86", [v32i1], 32, (add VK16WM)>  { let Size = 32; }
    def VK64WM : RegisterClass<"X86", [v64i1], 64, (add VK32WM)>  { let Size = 64; }

    // Bound registers
    def BNDR : RegisterClass<"X86", [v2i64], 128, (sequence "BND%u", 0, 3)>;

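The TD language offers operators for simple set manipulation. In the definitions above, add builds a set from the listed members, sub removes the listed members from a given set, and and keeps only the members two sets have in common. sequence generates a run of numbered members and adds them to the current dag: (sequence "BND%u", 0, 3) on the last line produces a set containing BND0, BND1, BND2 and BND3. TableGen carries out these operations when it processes the register class definitions.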
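The behaviour of these set operators can be approximated with a few Python functions. This is an illustrative model, not TableGen's implementation: the function names simply mirror the operators (with `and_` renamed because `and` is a Python keyword), and sets are represented as ordered lists of register names.

```python
def add(*items):
    """Union that keeps first-seen order; items are names or lists of names."""
    out = []
    for item in items:
        for reg in (item if isinstance(item, list) else [item]):
            if reg not in out:
                out.append(reg)
    return out


def sub(base, *items):
    """Remove the given members from `base`, keeping the order of `base`."""
    removed = set(add(*items))
    return [reg for reg in base if reg not in removed]


def and_(a, b):
    """Intersection, in the order of the first operand."""
    return [reg for reg in a if reg in b]


def sequence(fmt, first, last):
    """Generate fmt with %u replaced by first..last inclusive."""
    return [fmt.replace("%u", str(i)) for i in range(first, last + 1)]


BNDR = sequence("BND%u", 0, 3)
GR32_NOREX = add("EAX", "ECX", "EDX", "ESI", "EDI", "EBX", "EBP", "ESP")
GR32_NOSP = sub(GR32_NOREX, "ESP")  # simplified: starts from GR32_NOREX, not full GR32
GR32_NOREX_NOSP = and_(GR32_NOREX, GR32_NOSP)
```

With these definitions `BNDR` is BND0 through BND3, and `GR32_NOREX_NOSP` is the eight legacy 32-bit registers minus ESP.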

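Specifically, in 64-bit mode the alternative allocation order for GR8 and GR8_NOREX leaves out AH, BH, CH and DH. The reason is that these registers cannot be encoded in an instruction that needs a REX prefix, while SIL, DIL, BPL, R8D and the like all require one. For example, addb %ah, %dil and movzbl %ah, %r8d cannot be encoded.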
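The encoding restriction can be expressed as a simple check. The sketch below is a deliberate simplification in Python: `encodable` is a hypothetical helper that looks only at register operands, and it ignores other reasons an instruction may need a REX prefix (such as a 64-bit operand size).

```python
# High-byte registers: not encodable once a REX prefix is present.
HIGH_BYTE = {"AH", "BH", "CH", "DH"}

# Registers that can only be named with a REX prefix.
NEEDS_REX = {"SIL", "DIL", "BPL", "SPL"} | {
    f"R{n}{suffix}" for n in range(8, 16) for suffix in ("", "D", "W", "B")
}


def encodable(*operands):
    """Return False if the operands mix a high-byte register with a REX-only one."""
    uses_high_byte = any(op in HIGH_BYTE for op in operands)
    uses_rex_only = any(op in NEEDS_REX for op in operands)
    return not (uses_high_byte and uses_rex_only)
```

Under this model `encodable("AH", "DIL")` and `encodable("AH", "R8D")` are both False, matching the two examples above, while `encodable("AH", "BL")` and `encodable("AL", "DIL")` are True.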

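2.2.2.1.4. The ARM example

The X86 register description is not especially complex; ARM sits at the other extreme. The ARM architecture has 16 uniform 32-bit registers, and its VFP and NEON features add a further 16 × 64-bit registers. VFP and NEON can treat these registers as having different sizes (32, 64, 128 or 256 bits). Because the relationships are this intricate, ARM does not spell out each piece with a plain SubRegIndex the way X86 does; TableGen provides a more automated approach.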


    let Namespace = "ARM" in {
      def qqsub_0 : SubRegIndex<256>;
      def qqsub_1 : SubRegIndex<256, 256>;

      // Note: Code depends on these having consecutive numbers.
      def qsub_0 : SubRegIndex<128>;
      def qsub_1 : SubRegIndex<128, 128>;
      def qsub_2 : ComposedSubRegIndex<qqsub_1, qsub_0>;  // offset 256, size 128
      def qsub_3 : ComposedSubRegIndex<qqsub_1, qsub_1>;  // offset 384, size 128

      def dsub_0 : SubRegIndex<64>;
      def dsub_1 : SubRegIndex<64, 64>;
      def dsub_2 : ComposedSubRegIndex<qsub_1, dsub_0>;   // offset 128, size 64
      def dsub_3 : ComposedSubRegIndex<qsub_1, dsub_1>;   // offset 192, size 64
      def dsub_4 : ComposedSubRegIndex<qsub_2, dsub_0>;   // offset 256, size 64
      def dsub_5 : ComposedSubRegIndex<qsub_2, dsub_1>;   // offset 320, size 64
      def dsub_6 : ComposedSubRegIndex<qsub_3, dsub_0>;   // offset 384, size 64
      def dsub_7 : ComposedSubRegIndex<qsub_3, dsub_1>;   // offset 448, size 64

      def ssub_0 : SubRegIndex<32>;
      def ssub_1 : SubRegIndex<32, 32>;
      def ssub_2 : ComposedSubRegIndex<dsub_1, ssub_0>;   // offset 64, size 32
      def ssub_3 : ComposedSubRegIndex<dsub_1, ssub_1>;   // offset 96, size 32

      def gsub_0 : SubRegIndex<32>;
      def gsub_1 : SubRegIndex<32, 32>;
      // Let TableGen synthesize the remaining 12 ssub_* indices.
      // We don't need to name them.
    }

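In the TD description, the indices above address pieces of a block of up to 512 bits, that is, four consecutive Q registers of 128 bits each; the offsets in the comments run from 0 up to 448. The layout is needed because the same storage can also be viewed as D registers (64 bits each), and depending on the ARM version either 16 or 32 D registers are visible. That is why the SubRegIndex records shown above are defined.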
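The overlap between ARM's Q, D and S register views follows a regular pattern, which the Python sketch below makes explicit. The helper names are hypothetical; the mapping itself (Qn covers D2n and D2n+1, and only D0 to D15 split further into S registers) is the standard VFP/NEON register layout.

```python
def d_regs_of_q(n):
    """D registers that make up Qn (Q0..Q15)."""
    if not 0 <= n < 16:
        raise ValueError("Q register number out of range")
    return [f"D{2 * n}", f"D{2 * n + 1}"]


def s_regs_of_d(n):
    """S registers that make up Dn; D16..D31 have no S sub-registers."""
    if not 0 <= n < 32:
        raise ValueError("D register number out of range")
    if n >= 16:
        return []
    return [f"S{2 * n}", f"S{2 * n + 1}"]


def s_regs_of_q(n):
    """All S registers nested inside Qn, via its two D halves."""
    if not 0 <= n < 16:
        raise ValueError("Q register number out of range")
    return [s for d in (2 * n, 2 * n + 1) for s in s_regs_of_d(d)]
```

For instance, `d_regs_of_q(1)` gives D2 and D3, which agrees with the "Q1 = dsub_0 -> D2, dsub_1 -> D3" line in the SubRegIndex comments, and `s_regs_of_q(8)` is empty because Q8 is built from D16 and D17.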

The ComposedSubRegIndex records above describe how one SubRegIndex is composed from two others.


    class ComposedSubRegIndex<SubRegIndex A, SubRegIndex B>
      : SubRegIndex<B.Size, !if(!eq(A.Offset, -1), -1,
                            !if(!eq(B.Offset, -1), -1,
                                !add(A.Offset, B.Offset)))> {
      // See SubRegIndex.
      let ComposedOf = [A, B];
    }

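The size of the composed SubRegIndex equals the size of B, and its offset is the sum of the offsets of A and B (provided neither is -1; otherwise it is -1). In the snippet above, each composed SubRegIndex carries a comment giving the register slice it describes. These slices need matching Register definitions, so TableGen also has a helper class for describing registers in a more automated way.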
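The size and offset rule is easy to check by hand with a short Python sketch. The `SubRegIndex` tuple and `composed` function are hypothetical stand-ins for the TableGen classes, and the operand pairs used below are assumptions chosen so that the results reproduce the offsets given in the ARM comments.

```python
from typing import NamedTuple


class SubRegIndex(NamedTuple):
    size: int        # width in bits
    offset: int = 0  # -1 means "no fixed offset"


def composed(a, b):
    """Compose a then b: size of b, summed offsets, or -1 if either is -1."""
    if a.offset == -1 or b.offset == -1:
        return SubRegIndex(b.size, -1)
    return SubRegIndex(b.size, a.offset + b.offset)


qqsub_1 = SubRegIndex(256, 256)
qsub_0 = SubRegIndex(128, 0)
qsub_1 = SubRegIndex(128, 128)
dsub_1 = SubRegIndex(64, 64)

qsub_2 = composed(qqsub_1, qsub_0)  # offset 256, size 128
qsub_3 = composed(qqsub_1, qsub_1)  # offset 384, size 128
dsub_5 = composed(qsub_2, dsub_1)   # offset 320, size 64
floating = composed(SubRegIndex(128, -1), dsub_1)
```

The three composed indices land at offsets 256, 384 and 320, as in the comments, and `floating` keeps offset -1 because its first operand has no fixed position.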


    class RegisterTuples<list<SubRegIndex> Indices, list<dag> Regs> {
      // SubRegs - N lists of registers to be zipped up. Super-registers are
      // synthesized from the first element of each SubRegs list, the second
      // element and so on.
      list<dag> SubRegs = Regs;

      // SubRegIndices - N SubRegIndex instances. This provides the names of the
      // sub-registers in the synthesized super-registers.
      list<SubRegIndex> SubRegIndices = Indices;
    }

The effect of RegisterTuples is easiest to see from an example. The definition

    def EvenOdd : RegisterTuples<[sube, subo], [(add R0, R2), (add R1, R3)]>;

produces definitions equivalent to the following:

    let SubRegIndices = [sube, subo] in {
      def R0_R1 : RegisterWithSubRegs<"", [R0, R1]>;
      def R2_R3 : RegisterWithSubRegs<"", [R2, R3]>;
    }
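The zipping that RegisterTuples performs can be mimicked in Python. This is a sketch of the idea only: `register_tuples` is a hypothetical function, and each synthesized super-register is represented as a small dict rather than a TableGen record.

```python
def register_tuples(indices, reg_lists):
    """Zip N register lists into super-registers with N named parts each."""
    if len(indices) != len(reg_lists):
        raise ValueError("need exactly one index per register list")
    tuples = []
    for regs in zip(*reg_lists):  # stops at the shortest list
        tuples.append({
            "name": "_".join(regs),
            "sub_regs": dict(zip(indices, regs)),
        })
    return tuples


even_odd = register_tuples(["sube", "subo"], [["R0", "R2"], ["R1", "R3"]])
names = [t["name"] for t in even_odd]
```

The result contains R0_R1 and R2_R3, where sube selects the even register and subo the odd one, mirroring the EvenOdd example.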

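ARM's own definitions are more involved than this example. Here is how ARM describes the D register class (double-precision floating-point or general 64-bit vector registers):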


    def DPR : RegisterClass<"ARM", [f64, v8i8, v4i16, v2i32, v1i64, v2f32], 64,
                            (sequence "D%u", 0, 31)> {
      // Allocate non-VFP2 registers D16-D31 first.
      let AltOrders = [(rotl DPR, 16)];
      let AltOrderSelect = [{ return 1; }];
    }

By this definition the DPR class contains D0 through D31. A super-register made of three consecutive D registers is then defined as:


    def Tuples3D : RegisterTuples<[dsub_0, dsub_1, dsub_2],
                                  [(shl DPR, 0),
                                   (shl DPR, 1),
                                   (shl DPR, 2)]>;

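(shl DPR, N) drops the first N members. So (shl DPR, 0) yields D0 to D31, (shl DPR, 1) yields D1 to D31, and (shl DPR, 2) yields D2 to D31. Zipping the three lists gives the super-registers [D0, D1, D2], [D1, D2, D3], …, [D29, D30, D31], whose parts are addressed by dsub_0, dsub_1 and dsub_2 respectively.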
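The shifting and zipping can be reproduced in Python to confirm the count and the endpoints. The helpers `shl` and `rotl` below are hypothetical models of the TableGen operators of the same names, working on plain lists of register names.

```python
def shl(regs, n):
    """Drop the first n members."""
    return regs[n:]


def rotl(regs, n):
    """Rotate left by n: the first n members move to the end."""
    if not regs:
        return []
    n %= len(regs)
    return regs[n:] + regs[:n]


DPR = [f"D{i}" for i in range(32)]

# Three shifted views of DPR, zipped position by position.
shifted = [shl(DPR, 0), shl(DPR, 1), shl(DPR, 2)]
tuples_3d = ["_".join(group) for group in zip(*shifted)]

# The alternative allocation order of DPR: D16..D31 first, then D0..D15.
dpr_alt_order = rotl(DPR, 16)
```

Because zipping stops at the shortest list (D2 to D31, 30 members), `tuples_3d` holds 30 super-registers, from D0_D1_D2 to D29_D30_D31. `dpr_alt_order` starts at D16, which is the order requested by the DPR class's AltOrders entry.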

This is clearly far more compact than defining each register one by one, but it makes TableGen's processing correspondingly more complex.

Source: wuhui_gdnt's column
