Merge branch 'backend-O1-1' into backend
@@ -74,7 +74,6 @@ graph TD
|
|||||||
- **Eliminating `fallthrough`**:
|
||||||
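By ensuring that every basic block ends with a terminator instruction, we eliminate `fallthrough` between basic blocks, which simplifies construction and analysis of the control-flow graph (CFG). Because every control transfer is explicit, the mid-end passes become more uniform to write and easier to maintain.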
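
The invariant described above can be checked mechanically. The following is a minimal sketch under stated assumptions: `Op`, `Block`, and `findFallthroughBlocks` are hypothetical, simplified stand-ins invented for illustration and are not the project's IR classes.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical, simplified IR types for illustration only.
enum class Op { Add, Load, Store, Br, CondBr, Ret };

struct Block {
    std::string name;
    std::vector<Op> insts;
};

// Terminators are the only instructions allowed to end a block.
inline bool isTerminator(Op op) {
    return op == Op::Br || op == Op::CondBr || op == Op::Ret;
}

// Returns the names of blocks that would fall through, i.e. blocks that are
// empty or whose last instruction is not a terminator.
std::vector<std::string> findFallthroughBlocks(const std::vector<Block>& blocks) {
    std::vector<std::string> bad;
    for (const Block& b : blocks) {
        if (b.insts.empty() || !isTerminator(b.insts.back())) {
            bad.push_back(b.name);
        }
    }
    return bad;
}
```

A verifier of this kind could run after each transformation pass, so that a pass which accidentally drops a terminator is caught before CFG analyses consume the function.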
|
||||||
|
|
||||||
|
|
||||||
### 3.2. Core Optimizations in Detail
|
||||||
|
|
||||||
The compiler's analyses and optimizations are organized as a series of independent passes. Each pass is a self-contained algorithm module that performs one specific analysis or transformation on the IR, which keeps the design modular and extensible.
|
||||||
@@ -116,24 +115,18 @@ graph TD
|
|||||||
|
|
||||||
#### 3.2.4. Other Optimizations
|
||||||
|
|
||||||
- **LargeArrayToGlobal (`LargeArrayToGlobal.cpp`)**:
|
#### 3.3. Core Analysis Passes
|
||||||
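- **Goal**: Prevent stack overflow caused by large local arrays, and possibly improve data locality.
- **Technique**: Walk the `alloca` instructions in a function; if the allocation size computed by `calculateTypeSize` exceeds a threshold (e.g. 1024 bytes), convert the allocation into a global variable.
- **Implementation**: `convertAllocaToGlobal` creates a new `GlobalValue`, calls `replaceAllUsesWith` to redirect all users of the original `alloca` to the new global, and finally deletes the original `alloca` instruction.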
|
|
||||||
|
|
||||||
#### 3.3. Core Analysis Passes
|
|
||||||
|
|
||||||
To collect information for the optimization passes and expose as much optimization potential as possible, we have so far designed and implemented the following key analysis passes:
|
||||||
|
|
||||||
- **Dominator Tree Analysis**:
|
||||||
- **Technique**: Compute the dominators of every basic block and build the dominator tree from them. Blocks are visited in **reverse post-order (RPO)**, which speeds up convergence of the data-flow computation. Immediate dominators (idom) are computed with the classic **Lengauer-Tarjan (LT) algorithm**, which relies on a union-find structure with path compression and runs in near-linear time.
|
||||||
- **Implementation**: `Dom.cpp` implements the dominator tree analysis. It assigns each basic block its immediate dominator and builds the whole tree recursively. The dominator tree underlies many advanced optimizations, especially those on SSA form. For example, Mem2Reg relies on it to place Phi instructions correctly and to traverse the CFG efficiently during variable renaming. Loop optimizations such as loop-invariant code motion also use dominance information to identify loop headers and their loop bodies.
|
||||||
|
|
||||||
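- **Liveness Analysis**:
- **Technique**: Liveness analysis determines, at a given program point, which variables hold values that will be used later. We use the **classic fixed-point iteration algorithm**: within the data-flow framework, basic blocks are visited in reverse order and the `live-in` and `live-out` sets of each block are recomputed until they converge. The approach is simple to implement and sufficient for most compiler optimizations.
- **Future plans**: If analysis speed becomes a concern, more efficient algorithms such as a **worklist algorithm** or a **reformulation as SSA-based graph reachability** could be introduced to improve performance on large functions or complex control flow.
- **Implementation**: `Liveness.cpp` provides the liveness analysis. It iteratively computes the `live-in` and `live-out` sets of each basic block using the classic data-flow framework. Liveness information is a prerequisite for dead code elimination (DCE), register allocation, and similar optimizations: it identifies unused variables and instructions for later passes to act on.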
|
|
||||||
|
|
||||||
|
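- **Liveness Analysis**:
- **Technique**: Liveness analysis determines, at a given program point, which variables hold values that will be used later. We use the **classic fixed-point iteration algorithm**: within the data-flow framework, basic blocks are visited in reverse order and the `live-in` and `live-out` sets of each block are recomputed until they converge. The approach is simple to implement and sufficient for most compiler optimizations.
- **Future plans**: If analysis speed becomes a concern, more efficient algorithms such as a **worklist algorithm** or a **reformulation as SSA-based graph reachability** could be introduced to improve performance on large functions or complex control flow.
- **Implementation**: `Liveness.cpp` provides the liveness analysis. It iteratively computes the `live-in` and `live-out` sets of each basic block using the classic data-flow framework. Liveness information is a prerequisite for dead code elimination (DCE), register allocation, and similar optimizations: it identifies unused variables and instructions for later passes to act on.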
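
The fixed-point iteration used for liveness can be sketched as follows. This is an illustrative sketch only, not the contents of `Liveness.cpp`: `BlockInfo` and `computeLiveness` are hypothetical names, variables are modeled as plain integers, and successors are stored as block indices.

```cpp
#include <cassert>
#include <cstddef>
#include <set>
#include <utility>
#include <vector>

// Hypothetical, simplified CFG node; variables are plain ints.
struct BlockInfo {
    std::set<int> use;               // read before any write in this block
    std::set<int> def;               // written in this block
    std::vector<std::size_t> succs;  // indices of successor blocks
    std::set<int> liveIn, liveOut;
};

// Classic fixed-point iteration for a backward data-flow problem:
//   liveOut(B) = union of liveIn(S) over all successors S of B
//   liveIn(B)  = use(B) union (liveOut(B) minus def(B))
void computeLiveness(std::vector<BlockInfo>& blocks) {
    bool changed = true;
    while (changed) {
        changed = false;
        // Visit blocks in reverse index order; for a backward problem this
        // usually needs fewer rounds than forward order.
        for (std::size_t i = blocks.size(); i-- > 0;) {
            BlockInfo& b = blocks[i];
            std::set<int> out;
            for (std::size_t s : b.succs) {
                assert(s < blocks.size());
                out.insert(blocks[s].liveIn.begin(), blocks[s].liveIn.end());
            }
            std::set<int> in = b.use;
            for (int v : out) {
                if (!b.def.count(v)) in.insert(v);
            }
            if (in != b.liveIn || out != b.liveOut) {
                b.liveIn = std::move(in);
                b.liveOut = std::move(out);
                changed = true;
            }
        }
    }
}
```

The loop stops once a full sweep changes no set. A worklist variant would instead revisit only the predecessors of blocks whose `liveIn` changed.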
|
||||||
|
|
||||||
### 3.4. Future Plans
|
||||||
|
|
||||||
@@ -145,6 +138,7 @@ graph TD
|
|||||||
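Function inlining substitutes the body of a simple function at the corresponding `call` instruction (deciding what counts as simple may require collecting more information), which reduces stack manipulation and exposes further optimization opportunities to other passes.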
|
||||||
- **`LLVM IR` formatting**:
|
||||||
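We will design and implement a generic printer for all IR constructs so that the IR can be emitted as LLVM IR that compiles and runs. With orchestration scripts and the LLVM toolchain we can then compile and run the intermediate code without going through our backend, which gives a systematic way to validate mid-end correctness and makes it easier to trace bugs during backend development.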
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
## 4. Backend Techniques and Optimizations
|
||||||
@@ -215,16 +209,16 @@ graph TD
|
|||||||
end
|
end
|
||||||
```
|
```
|
||||||
|
|
||||||
1. **`analyzeLiveness()`**: Runs data-flow analysis over the machine instructions to compute the live range of each virtual register.
2. **`build()`**: Builds the **interference graph** from the liveness information. If two virtual registers are live at the same time they interfere, and an edge is added between them.
3. **`makeWorklist()`**: Places the graph nodes (virtual registers) into different worklists according to their degree, in preparation for coloring.
4. **Core coloring phase (The Loop)**:
    - **`simplify()`**: Greedily removes nodes whose degree is less than the number of physical registers and pushes them onto a stack. These nodes are guaranteed to be colorable.
    - **`coalesce()`**: Tries to merge the source and destination nodes of a move instruction (`MV`) so that the instruction can be eliminated. Merging is gated by the **Briggs** or **George** heuristic so that the graph does not become uncolorable.
    - **`freeze()`**: When a move-related node can be neither coalesced nor simplified, gives up on coalescing its moves and "freezes" it into an ordinary node.
    - **`selectSpill()`**: When none of the above applies (only high-degree nodes remain), selects a node to **spill**, i.e. marks it as a candidate to be kept in memory.
5. **`assignColors()`**: After all nodes have been processed, pops them from the stack one by one and picks for each a physical register not used by its already-colored neighbors.
6. **`rewriteProgram()`**: Called if `assignColors()` finds nodes marked as spilled. It rewrites the machine instructions, inserting loads (`lw`/`ld`) and stores (`sw`/`sd`) for the spilled virtual registers. The whole allocation then restarts from step 1.
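
To make the simplify/select core of the loop concrete, here is a reduced sketch that deliberately omits coalescing, freezing, and program rewriting. It is not the code in `RISCv64RegAlloc`: `Graph` and `colorGraph` are hypothetical names, nodes are non-negative integers, and the interference graph is assumed to be symmetric.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <set>
#include <vector>

using Graph = std::map<int, std::set<int>>;  // node -> interfering neighbours

// Simplify/select without coalescing or freezing. Returns node -> colour in
// [0, K); nodes that receive no colour are appended to `spilled`.
std::map<int, int> colorGraph(const Graph& g, int K, std::vector<int>& spilled) {
    assert(K > 0);
    Graph work = g;          // working copy that shrinks as nodes are removed
    std::vector<int> stack;  // removal order; colours are assigned in reverse

    while (!work.empty()) {
        // simplify: take any node whose current degree is below K.
        bool found = false;
        int pick = 0;
        for (const auto& kv : work) {
            if (static_cast<int>(kv.second.size()) < K) {
                pick = kv.first;
                found = true;
                break;
            }
        }
        // selectSpill: otherwise take the node with the highest degree as an
        // optimistic spill candidate (it may still receive a colour later).
        if (!found) {
            std::size_t best = 0;
            for (const auto& kv : work) {
                if (!found || kv.second.size() > best) {
                    pick = kv.first;
                    best = kv.second.size();
                    found = true;
                }
            }
        }
        // Remove the picked node and its edges from the working graph.
        for (int m : work.at(pick)) {
            auto it = work.find(m);
            if (it != work.end()) it->second.erase(pick);
        }
        work.erase(pick);
        stack.push_back(pick);
    }

    // assignColors: pop nodes and give each the lowest colour not used by an
    // already-coloured neighbour in the original graph.
    std::map<int, int> color;
    for (auto it = stack.rbegin(); it != stack.rend(); ++it) {
        std::vector<bool> used(static_cast<std::size_t>(K), false);
        for (int m : g.at(*it)) {
            auto c = color.find(m);
            if (c != color.end()) used[static_cast<std::size_t>(c->second)] = true;
        }
        int chosen = -1;
        for (int c = 0; c < K; ++c) {
            if (!used[static_cast<std::size_t>(c)]) { chosen = c; break; }
        }
        if (chosen < 0) spilled.push_back(*it);  // actual spill
        else color[*it] = chosen;
    }
    return color;
}
```

In a full allocator, a non-empty `spilled` list would trigger the rewrite step and a fresh allocation round, as step 6 describes.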
|
||||||
|
|
||||||
### 4.4. Backend-Specific Optimizations
|
||||||
|
|
||||||
|
|||||||
Binary file not shown.
@@ -103,6 +103,81 @@ void RISCv64ISel::select() {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
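// Save the argument registers only when the function meets specific conditions (finer-grained filtering):
// 1. The function contains a call instruction (non-leaf function): the argument registers (a0-a7) are caller-saved,
//    and a call may clobber them, so they must be saved.
// 2. The function contains an alloca instruction (needs stack allocation).
// 3. The function's instruction count exceeds a threshold (45 in the code below), which marks it as a complex leaf function;
//    its arguments are saved to be safe.
// Simple leaf functions (such as min) can skip this step as an optimization.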
|
||||||
|
auto shouldSaveArgs = [](Function* func) {
|
||||||
|
if (!func) return false;
|
||||||
|
int instruction_count = 0;
|
||||||
|
for (const auto& bb : func->getBasicBlocks()) {
|
||||||
|
for (const auto& inst : bb->getInstructions()) {
|
||||||
|
if (dynamic_cast<CallInst*>(inst.get()) || dynamic_cast<AllocaInst*>(inst.get())) {
|
||||||
|
return true; // found a call or alloca: return true immediately
|
||||||
|
}
|
||||||
|
instruction_count++;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
// No call or alloca found: fall back to the instruction count
|
||||||
|
return instruction_count > 45;
|
||||||
|
};
|
||||||
|
|
||||||
|
if (optLevel > 0 && shouldSaveArgs(F)) {
|
||||||
|
if (F && !F->getBasicBlocks().empty()) {
|
||||||
|
// Locate the first MachineBasicBlock, i.e. the function entry
|
||||||
|
BasicBlock* first_ir_block = F->getBasicBlocks_NoRange().front().get();
|
||||||
|
CurMBB = bb_map.at(first_ir_block);
|
||||||
|
|
||||||
|
int int_arg_idx = 0;
|
||||||
|
int fp_arg_idx = 0;
|
||||||
|
|
||||||
|
for (Argument* arg : F->getArguments()) {
|
||||||
|
Type* arg_type = arg->getType();
|
||||||
|
|
||||||
|
// --- Integer/pointer arguments ---
|
||||||
|
if (!arg_type->isFloat() && int_arg_idx < 8) {
|
||||||
|
// 1. Get the argument's original vreg, which will be precolored to a0-a7
|
||||||
|
unsigned original_vreg = getVReg(arg);
|
||||||
|
|
||||||
|
// 2. Create a new, safe vreg to hold the argument's value
|
||||||
|
unsigned saved_vreg = getNewVReg(arg_type);
|
||||||
|
|
||||||
|
// 3. Emit mv saved_vreg, original_vreg
|
||||||
|
auto mv = std::make_unique<MachineInstr>(RVOpcodes::MV);
|
||||||
|
mv->addOperand(std::make_unique<RegOperand>(saved_vreg));
|
||||||
|
mv->addOperand(std::make_unique<RegOperand>(original_vreg));
|
||||||
|
CurMBB->addInstruction(std::move(mv));
|
||||||
|
|
||||||
|
MFunc->addProtectedArgumentVReg(saved_vreg);
|
||||||
|
// 4. [Key step] Update the vreg map so that arg points to the new, safe vreg.
// All later getVReg(arg) calls for this argument then return saved_vreg,
// so the function body uses the saved copy.
|
||||||
|
vreg_map[arg] = saved_vreg;
|
||||||
|
int_arg_idx++;
|
||||||
|
}
|
||||||
|
// --- Floating-point arguments ---
|
||||||
|
else if (arg_type->isFloat() && fp_arg_idx < 8) {
|
||||||
|
unsigned original_vreg = getVReg(arg);
|
||||||
|
unsigned saved_vreg = getNewVReg(arg_type);
|
||||||
|
|
||||||
|
// For floating-point values, use the fmv.s instruction
|
||||||
|
auto fmv = std::make_unique<MachineInstr>(RVOpcodes::FMV_S);
|
||||||
|
fmv->addOperand(std::make_unique<RegOperand>(saved_vreg));
|
||||||
|
fmv->addOperand(std::make_unique<RegOperand>(original_vreg));
|
||||||
|
CurMBB->addInstruction(std::move(fmv));
|
||||||
|
|
||||||
|
MFunc->addProtectedArgumentVReg(saved_vreg);
|
||||||
|
vreg_map[arg] = saved_vreg;
|
||||||
|
fp_arg_idx++;
|
||||||
|
}
|
||||||
|
// Stack-passed arguments need no handling
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
// Iterate over the basic blocks and perform instruction selection
|
||||||
for (const auto& bb_ptr : F->getBasicBlocks()) {
|
for (const auto& bb_ptr : F->getBasicBlocks()) {
|
||||||
selectBasicBlock(bb_ptr.get());
|
selectBasicBlock(bb_ptr.get());
|
||||||
@@ -501,6 +576,14 @@ void RISCv64ISel::selectNode(DAGNode* node) {
|
|||||||
CurMBB->addInstruction(std::move(instr));
|
CurMBB->addInstruction(std::move(instr));
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
case BinaryInst::kMulh: {
|
||||||
|
auto instr = std::make_unique<MachineInstr>(RVOpcodes::MULH);
|
||||||
|
instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
|
||||||
|
instr->addOperand(std::make_unique<RegOperand>(lhs_vreg));
|
||||||
|
instr->addOperand(std::make_unique<RegOperand>(rhs_vreg));
|
||||||
|
CurMBB->addInstruction(std::move(instr));
|
||||||
|
break;
|
||||||
|
}
|
||||||
case Instruction::kDiv: {
|
case Instruction::kDiv: {
|
||||||
auto instr = std::make_unique<MachineInstr>(RVOpcodes::DIVW);
|
auto instr = std::make_unique<MachineInstr>(RVOpcodes::DIVW);
|
||||||
instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
|
instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
|
||||||
@@ -612,6 +695,22 @@ void RISCv64ISel::selectNode(DAGNode* node) {
|
|||||||
CurMBB->addInstruction(std::move(xori));
|
CurMBB->addInstruction(std::move(xori));
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
case BinaryInst::kAnd: {
|
||||||
|
auto instr = std::make_unique<MachineInstr>(RVOpcodes::AND);
|
||||||
|
instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
|
||||||
|
instr->addOperand(std::make_unique<RegOperand>(lhs_vreg));
|
||||||
|
instr->addOperand(std::make_unique<RegOperand>(rhs_vreg));
|
||||||
|
CurMBB->addInstruction(std::move(instr));
|
||||||
|
break;
|
||||||
|
}
|
||||||
|
case BinaryInst::kOr: {
|
||||||
|
auto instr = std::make_unique<MachineInstr>(RVOpcodes::OR);
|
||||||
|
instr->addOperand(std::make_unique<RegOperand>(dest_vreg));
|
||||||
|
instr->addOperand(std::make_unique<RegOperand>(lhs_vreg));
|
||||||
|
instr->addOperand(std::make_unique<RegOperand>(rhs_vreg));
|
||||||
|
CurMBB->addInstruction(std::move(instr));
|
||||||
|
break;
|
||||||
|
}
|
||||||
default:
|
default:
|
||||||
throw std::runtime_error("Unsupported binary instruction in ISel");
|
throw std::runtime_error("Unsupported binary instruction in ISel");
|
||||||
}
|
}
|
||||||
@@ -1257,6 +1356,7 @@ void RISCv64ISel::selectNode(DAGNode* node) {
|
|||||||
auto gep = dynamic_cast<GetElementPtrInst*>(node->value);
|
auto gep = dynamic_cast<GetElementPtrInst*>(node->value);
|
||||||
auto result_vreg = getVReg(gep);
|
auto result_vreg = getVReg(gep);
|
||||||
|
|
||||||
|
if (optLevel == 0) {
|
||||||
// --- Step 1: Get the base address (this logic is correct and left unchanged) ---
|
||||||
auto base_ptr_node = node->operands[0];
|
auto base_ptr_node = node->operands[0];
|
||||||
auto current_addr_vreg = getNewVReg(gep->getType());
|
auto current_addr_vreg = getNewVReg(gep->getType());
|
||||||
@@ -1363,6 +1463,106 @@ void RISCv64ISel::selectNode(DAGNode* node) {
|
|||||||
final_mv->addOperand(std::make_unique<RegOperand>(current_addr_vreg));
|
final_mv->addOperand(std::make_unique<RegOperand>(current_addr_vreg));
|
||||||
CurMBB->addInstruction(std::move(final_mv));
|
CurMBB->addInstruction(std::move(final_mv));
|
||||||
break;
|
break;
|
||||||
|
} else {
|
||||||
|
// Handling for -O1
// --- Step 1: Get the base address ---
|
||||||
|
auto base_ptr_node = node->operands[0];
|
||||||
|
auto base_ptr_val = base_ptr_node->value;
|
||||||
|
|
||||||
|
// last_step_addr_vreg holds the result of the previous step.
// It is first initialized to the GEP's base address.
|
||||||
|
unsigned last_step_addr_vreg;
|
||||||
|
|
||||||
|
if (auto alloca_base = dynamic_cast<AllocaInst*>(base_ptr_val)) {
|
||||||
|
last_step_addr_vreg = getNewVReg(gep->getType());
|
||||||
|
auto frame_addr_instr = std::make_unique<MachineInstr>(RVOpcodes::FRAME_ADDR);
|
||||||
|
frame_addr_instr->addOperand(std::make_unique<RegOperand>(last_step_addr_vreg));
|
||||||
|
frame_addr_instr->addOperand(std::make_unique<RegOperand>(getVReg(alloca_base)));
|
||||||
|
CurMBB->addInstruction(std::move(frame_addr_instr));
|
||||||
|
} else if (auto global_base = dynamic_cast<GlobalValue*>(base_ptr_val)) {
|
||||||
|
last_step_addr_vreg = getNewVReg(gep->getType());
|
||||||
|
auto la_instr = std::make_unique<MachineInstr>(RVOpcodes::LA);
|
||||||
|
la_instr->addOperand(std::make_unique<RegOperand>(last_step_addr_vreg));
|
||||||
|
la_instr->addOperand(std::make_unique<LabelOperand>(global_base->getName()));
|
||||||
|
CurMBB->addInstruction(std::move(la_instr));
|
||||||
|
} else {
|
||||||
|
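// For function arguments or pointers produced by other instructions, use their vreg directly.
// This vreg must be protected: it must not be modified during the computation.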
|
||||||
|
last_step_addr_vreg = getVReg(base_ptr_val);
|
||||||
|
}
|
||||||
|
|
||||||
|
// --- Step 2: Compute the address iteratively, following LLVM GEP semantics ---
|
||||||
|
Type* current_type = gep->getBasePointer()->getType()->as<PointerType>()->getBaseType();
|
||||||
|
|
||||||
|
for (size_t i = 0; i < gep->getNumIndices(); ++i) {
|
||||||
|
Value* indexValue = gep->getIndex(i);
|
||||||
|
unsigned stride = getTypeSizeInBytes(current_type);
|
||||||
|
|
||||||
|
if (stride != 0) {
|
||||||
|
// --- Emit the offset computation for the current index and stride ---
|
||||||
|
auto offset_vreg = getNewVReg(Type::getIntType());
|
||||||
|
|
||||||
|
unsigned index_vreg;
|
||||||
|
if (auto const_index = dynamic_cast<ConstantValue*>(indexValue)) {
|
||||||
|
index_vreg = getNewVReg(Type::getIntType());
|
||||||
|
auto li = std::make_unique<MachineInstr>(RVOpcodes::LI);
|
||||||
|
li->addOperand(std::make_unique<RegOperand>(index_vreg));
|
||||||
|
li->addOperand(std::make_unique<ImmOperand>(const_index->getInt()));
|
||||||
|
CurMBB->addInstruction(std::move(li));
|
||||||
|
} else {
|
||||||
|
index_vreg = getVReg(indexValue);
|
||||||
|
}
|
||||||
|
|
||||||
|
if (stride == 1) {
|
||||||
|
auto mv = std::make_unique<MachineInstr>(RVOpcodes::MV);
|
||||||
|
mv->addOperand(std::make_unique<RegOperand>(offset_vreg));
|
||||||
|
mv->addOperand(std::make_unique<RegOperand>(index_vreg));
|
||||||
|
CurMBB->addInstruction(std::move(mv));
|
||||||
|
} else {
|
||||||
|
auto size_vreg = getNewVReg(Type::getIntType());
|
||||||
|
auto li_size = std::make_unique<MachineInstr>(RVOpcodes::LI);
|
||||||
|
li_size->addOperand(std::make_unique<RegOperand>(size_vreg));
|
||||||
|
li_size->addOperand(std::make_unique<ImmOperand>(stride));
|
||||||
|
CurMBB->addInstruction(std::move(li_size));
|
||||||
|
|
||||||
|
auto mul = std::make_unique<MachineInstr>(RVOpcodes::MULW);
|
||||||
|
mul->addOperand(std::make_unique<RegOperand>(offset_vreg));
|
||||||
|
mul->addOperand(std::make_unique<RegOperand>(index_vreg));
|
||||||
|
mul->addOperand(std::make_unique<RegOperand>(size_vreg));
|
||||||
|
CurMBB->addInstruction(std::move(mul));
|
||||||
|
}
|
||||||
|
|
||||||
|
// --- Key fix ---
// Create a new vreg to hold the result of this addition.
|
||||||
|
unsigned current_step_addr_vreg = getNewVReg(gep->getType());
|
||||||
|
|
||||||
|
// Emit add current_step, last_step, offset.
// This ensures last_step_addr_vreg (the input) is never modified in place.
|
||||||
|
auto add = std::make_unique<MachineInstr>(RVOpcodes::ADD);
|
||||||
|
add->addOperand(std::make_unique<RegOperand>(current_step_addr_vreg));
|
||||||
|
add->addOperand(std::make_unique<RegOperand>(last_step_addr_vreg));
|
||||||
|
add->addOperand(std::make_unique<RegOperand>(offset_vreg));
|
||||||
|
CurMBB->addInstruction(std::move(add));
|
||||||
|
|
||||||
|
// This step's result becomes the input of the next step.
|
||||||
|
last_step_addr_vreg = current_step_addr_vreg;
|
||||||
|
}
|
||||||
|
|
||||||
|
// --- Update the type for the next iteration ---
|
||||||
|
if (auto array_type = current_type->as<ArrayType>()) {
|
||||||
|
current_type = array_type->getElementType();
|
||||||
|
} else if (auto ptr_type = current_type->as<PointerType>()) {
|
||||||
|
current_type = ptr_type->getBaseType();
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// --- Step 3: Move the final address into the GEP's destination virtual register ---
|
||||||
|
auto final_mv = std::make_unique<MachineInstr>(RVOpcodes::MV);
|
||||||
|
final_mv->addOperand(std::make_unique<RegOperand>(result_vreg));
|
||||||
|
final_mv->addOperand(std::make_unique<RegOperand>(last_step_addr_vreg));
|
||||||
|
CurMBB->addInstruction(std::move(final_mv));
|
||||||
|
break;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
default:
|
default:
|
||||||
|
|||||||
@@ -98,6 +98,7 @@ bool RISCv64RegAlloc::doAllocation() {
|
|||||||
precolorByCallingConvention();
|
precolorByCallingConvention();
|
||||||
analyzeLiveness();
|
analyzeLiveness();
|
||||||
build();
|
build();
|
||||||
|
protectCrossCallVRegs();
|
||||||
makeWorklist();
|
makeWorklist();
|
||||||
|
|
||||||
while (!simplifyWorklist.empty() || !worklistMoves.empty() || !freezeWorklist.empty() || !spillWorklist.empty()) {
|
while (!simplifyWorklist.empty() || !worklistMoves.empty() || !freezeWorklist.empty() || !spillWorklist.empty()) {
|
||||||
@@ -127,20 +128,46 @@ void RISCv64RegAlloc::precolorByCallingConvention() {
|
|||||||
int int_arg_idx = 0;
|
int int_arg_idx = 0;
|
||||||
int float_arg_idx = 0;
|
int float_arg_idx = 0;
|
||||||
|
|
||||||
for (Argument* arg : F->getArguments()) {
|
if (optLevel > 0)
|
||||||
unsigned vreg = ISel->getVReg(arg);
|
{
|
||||||
|
for (const auto& pair : vreg_to_value_map) {
|
||||||
|
unsigned vreg = pair.first;
|
||||||
|
Value* val = pair.second;
|
||||||
|
|
||||||
if (arg->getType()->isFloat()) {
|
// Check whether this Value* is an Argument
|
||||||
if (float_arg_idx < 8) { // fa0-fa7
|
if (auto arg = dynamic_cast<Argument*>(val)) {
|
||||||
auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::F10) + float_arg_idx);
|
// If so, vreg is the one originally assigned to this argument
|
||||||
color_map[vreg] = preg;
|
int arg_idx = arg->getIndex();
|
||||||
float_arg_idx++;
|
|
||||||
|
if (arg->getType()->isFloat()) {
|
||||||
|
if (arg_idx < 8) { // fa0-fa7
|
||||||
|
auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::F10) + arg_idx);
|
||||||
|
color_map[vreg] = preg;
|
||||||
|
}
|
||||||
|
} else { // integer or pointer
|
||||||
|
if (arg_idx < 8) { // a0-a7
|
||||||
|
auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::A0) + arg_idx);
|
||||||
|
color_map[vreg] = preg;
|
||||||
|
}
|
||||||
|
}
|
||||||
}
|
}
|
||||||
} else { // integer or pointer
|
}
|
||||||
if (int_arg_idx < 8) { // a0-a7
|
} else {
|
||||||
auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::A0) + int_arg_idx);
|
for (Argument* arg : F->getArguments()) {
|
||||||
color_map[vreg] = preg;
|
unsigned vreg = ISel->getVReg(arg);
|
||||||
int_arg_idx++;
|
|
||||||
|
if (arg->getType()->isFloat()) {
|
||||||
|
if (float_arg_idx < 8) { // fa0-fa7
|
||||||
|
auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::F10) + float_arg_idx);
|
||||||
|
color_map[vreg] = preg;
|
||||||
|
float_arg_idx++;
|
||||||
|
}
|
||||||
|
} else { // integer or pointer
|
||||||
|
if (int_arg_idx < 8) { // a0-a7
|
||||||
|
auto preg = static_cast<PhysicalReg>(static_cast<int>(PhysicalReg::A0) + int_arg_idx);
|
||||||
|
color_map[vreg] = preg;
|
||||||
|
int_arg_idx++;
|
||||||
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
@@ -159,6 +186,57 @@ void RISCv64RegAlloc::precolorByCallingConvention() {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
|
void RISCv64RegAlloc::protectCrossCallVRegs() {
|
||||||
|
// Get from ISel the set of argument-copy vregs marked as needing protection
|
||||||
|
const auto& vregs_to_protect_potentially = MFunc->getProtectedArgumentVRegs();
|
||||||
|
if (vregs_to_protect_potentially.empty()) {
|
||||||
|
return; // nothing to protect
|
||||||
|
}
|
||||||
|
|
||||||
|
// VRegSet live_across_call_vregs;
|
||||||
|
// // Walk all instructions to find which marked vregs are actually live across a call
|
||||||
|
// for (const auto& mbb_ptr : MFunc->getBlocks()) {
|
||||||
|
// for (const auto& instr_ptr : mbb_ptr->getInstructions()) {
|
||||||
|
// if (instr_ptr->getOpcode() == RVOpcodes::CALL) {
|
||||||
|
// const VRegSet& live_out_after_call = live_out_map.at(instr_ptr.get());
|
||||||
|
// for (unsigned vreg : vregs_to_protect_potentially) {
|
||||||
|
// if (live_out_after_call.count(vreg)) {
|
||||||
|
// live_across_call_vregs.insert(vreg);
|
||||||
|
// }
|
||||||
|
// }
|
||||||
|
// }
|
||||||
|
// }
|
||||||
|
// }
|
||||||
|
|
||||||
|
// if (live_across_call_vregs.empty()) {
|
||||||
|
// return; // none of the marked vregs is live across a call, nothing to do
|
||||||
|
// }
|
||||||
|
|
||||||
|
// if (DEEPDEBUG) {
|
||||||
|
// std::cerr << "--- [FIX] Applying protection for argument vregs that live across calls: ";
|
||||||
|
// for(unsigned v : live_across_call_vregs) std::cerr << regIdToString(v) << " ";
|
||||||
|
// std::cerr << "\n";
|
||||||
|
// }
|
||||||
|
|
||||||
|
// Get all caller-saved registers
|
||||||
|
const auto& caller_saved_int = getCallerSavedIntRegs();
|
||||||
|
const auto& caller_saved_fp = getCallerSavedFpRegs();
|
||||||
|
const unsigned offset = static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID);
|
||||||
|
|
||||||
|
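// For every protected argument vreg, add interference edges with all caller-saved registers.
// The liveness filter above is commented out, so every marked vreg is conservatively treated as live across a call.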
|
||||||
|
for (unsigned vreg : vregs_to_protect_potentially) {
|
||||||
|
if (isFPVReg(vreg)) { // floating-point vreg
|
||||||
|
for (auto preg : caller_saved_fp) {
|
||||||
|
addEdge(vreg, offset + static_cast<unsigned>(preg));
|
||||||
|
}
|
||||||
|
} else { // integer vreg
|
||||||
|
for (auto preg : caller_saved_int) {
|
||||||
|
addEdge(vreg, offset + static_cast<unsigned>(preg));
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
// Initialize/reset all data structures
|
||||||
void RISCv64RegAlloc::initialize() {
|
void RISCv64RegAlloc::initialize() {
|
||||||
initial.clear();
|
initial.clear();
|
||||||
@@ -477,16 +555,26 @@ void RISCv64RegAlloc::coalesce() {
|
|||||||
unsigned x = getAlias(*def.begin());
|
unsigned x = getAlias(*def.begin());
|
||||||
unsigned y = getAlias(*use.begin());
|
unsigned y = getAlias(*use.begin());
|
||||||
unsigned u, v;
|
unsigned u, v;
|
||||||
if (precolored.count(y)) { u = y; v = x; } else { u = x; v = y; }
|
|
||||||
|
// Always assign the virtual register being merged to v and the merge target to u.
// Priority: physical register (precolored) > already-colored virtual register (coloredNodes) > ordinary virtual register.
|
||||||
|
if (precolored.count(y)) {
|
||||||
|
u = y;
|
||||||
|
v = x;
|
||||||
|
} else if (precolored.count(x)) {
|
||||||
|
u = x;
|
||||||
|
v = y;
|
||||||
|
} else if (coloredNodes.count(y)) {
|
||||||
|
u = y;
|
||||||
|
v = x;
|
||||||
|
} else {
|
||||||
|
u = x;
|
||||||
|
v = y;
|
||||||
|
}
|
||||||
|
|
||||||
// Defensive check: handle moves between physical registers
|
||||||
if (precolored.count(u) && precolored.count(v)) {
|
if (precolored.count(u) && precolored.count(v)) {
|
||||||
// If u and v are both physical registers, we cannot merge them.
// This is usually a register copy such as `mv a2, a1`.
// Add it to constrainedMoves and return without further processing.
|
|
||||||
constrainedMoves.insert(move);
|
constrainedMoves.insert(move);
|
||||||
// addWorklist(u) and addWorklist(v) need not be called here either,
// because they only make sense for virtual registers.
|
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -498,9 +586,77 @@ void RISCv64RegAlloc::coalesce() {
|
|||||||
if (DEEPERDEBUG) std::cerr << " -> Trivial coalesce (u == v).\n";
|
if (DEEPERDEBUG) std::cerr << " -> Trivial coalesce (u == v).\n";
|
||||||
coalescedMoves.insert(move);
|
coalescedMoves.insert(move);
|
||||||
addWorklist(u);
|
addWorklist(u);
|
||||||
return; // done, return early
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
bool is_conflicting = false;
|
||||||
|
// Check 1: are u and v directly connected in the interference graph?
|
||||||
|
if ((adjList.count(v) && adjList.at(v).count(u)) || (adjList.count(u) && adjList.at(u).count(v))) {
|
||||||
|
if (DEEPERDEBUG) std::cerr << " -> [Check] Nodes interfere directly.\n";
|
||||||
|
is_conflicting = true;
|
||||||
|
}
|
||||||
|
// Check 2: if the nodes are not directly connected, look for an indirect color conflict
|
||||||
|
else {
|
||||||
|
// Get the colors of u and v, if they have one
|
||||||
|
unsigned u_color_id = 0, v_color_id = 0;
|
||||||
|
if (precolored.count(u)) {
|
||||||
|
u_color_id = u;
|
||||||
|
} else if (coloredNodes.count(u) || color_map.count(u)) { // color_map.count(u) is the more reliable check
|
||||||
|
u_color_id = static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID) + static_cast<unsigned>(color_map.at(u));
|
||||||
|
}
|
||||||
|
|
||||||
|
if (precolored.count(v)) {
|
||||||
|
v_color_id = v;
|
||||||
|
} else if (coloredNodes.count(v) || color_map.count(v)) {
|
||||||
|
v_color_id = static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID) + static_cast<unsigned>(color_map.at(v));
|
||||||
|
}
|
||||||
|
|
||||||
|
// If u has a color, check whether v interferes with the physical register that color denotes
|
||||||
|
if (u_color_id != 0 && adjList.count(v) && adjList.at(v).count(u_color_id)) {
|
||||||
|
if (DEEPERDEBUG) std::cerr << " -> [Check] Node " << regIdToString(v) << " interferes with the color of " << regIdToString(u) << " (" << regIdToString(u_color_id) << ").\n";
|
||||||
|
is_conflicting = true;
|
||||||
|
}
|
||||||
|
// If v has a color, check whether u interferes with the physical register that color denotes
|
||||||
|
else if (v_color_id != 0 && adjList.count(u) && adjList.at(u).count(v_color_id)) {
|
||||||
|
if (DEEPERDEBUG) std::cerr << " -> [Check] Node " << regIdToString(u) << " interferes with the color of " << regIdToString(v) << " (" << regIdToString(v_color_id) << ").\n";
|
||||||
|
is_conflicting = true;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
if (is_conflicting) {
|
||||||
|
if (DEEPERDEBUG) std::cerr << " -> Constrained (nodes interfere directly or via pre-coloring).\n";
|
||||||
|
constrainedMoves.insert(move);
|
||||||
|
addWorklist(u);
|
||||||
|
addWorklist(v);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
|
||||||
|
bool u_is_colored = precolored.count(u) || coloredNodes.count(u);
|
||||||
|
bool v_is_colored = precolored.count(v) || coloredNodes.count(v);
|
||||||
|
|
||||||
|
if (u_is_colored && v_is_colored) {
|
||||||
|
PhysicalReg u_color = precolored.count(u)
|
||||||
|
? static_cast<PhysicalReg>(u - static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID))
|
||||||
|
: color_map.at(u);
|
||||||
|
PhysicalReg v_color = precolored.count(v)
|
||||||
|
? static_cast<PhysicalReg>(v - static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID))
|
||||||
|
: color_map.at(v);
|
||||||
|
|
||||||
|
if (u_color != v_color) {
|
||||||
|
if (DEEPERDEBUG) std::cerr << " -> Constrained (move between two different precolored nodes: "
|
||||||
|
<< regToString(u_color) << " and " << regToString(v_color) << ").\n";
|
||||||
|
constrainedMoves.insert(move);
|
||||||
|
return;
|
||||||
|
} else {
|
||||||
|
if (DEEPERDEBUG) std::cerr << " -> Trivial coalesce (move between same precolored nodes).\n";
|
||||||
|
coalescedMoves.insert(move);
|
||||||
|
combine(u, v);
|
||||||
|
addWorklist(u);
|
||||||
|
return;
|
||||||
|
}
|
||||||
|
}
|
||||||
|
|
||||||
|
// Type check
|
||||||
if (isFPVReg(u) != isFPVReg(v)) {
|
if (isFPVReg(u) != isFPVReg(v)) {
|
||||||
if (DEEPERDEBUG) std::cerr << " -> Constrained (type mismatch: " << regIdToString(u) << " is "
|
if (DEEPERDEBUG) std::cerr << " -> Constrained (type mismatch: " << regIdToString(u) << " is "
|
||||||
<< (isFPVReg(u) ? "float" : "int") << ", " << regIdToString(v) << " is "
|
<< (isFPVReg(u) ? "float" : "int") << ", " << regIdToString(v) << " is "
|
||||||
@@ -508,27 +664,16 @@ void RISCv64RegAlloc::coalesce() {
|
|||||||
constrainedMoves.insert(move);
|
constrainedMoves.insert(move);
|
||||||
addWorklist(u);
|
addWorklist(u);
|
||||||
addWorklist(v);
|
addWorklist(v);
|
||||||
return; // return immediately, skip the remaining checks
|
|
||||||
}
|
|
||||||
|
|
||||||
bool pre_interfere = adjList.at(v).count(u);
|
|
||||||
|
|
||||||
if (pre_interfere) {
|
|
||||||
if (DEEPERDEBUG) std::cerr << " -> Constrained (nodes already interfere).\n";
|
|
||||||
constrainedMoves.insert(move);
|
|
||||||
addWorklist(u);
|
|
||||||
addWorklist(v);
|
|
||||||
return;
|
return;
|
||||||
}
|
}
|
||||||
|
|
||||||
bool is_u_precolored = precolored.count(u);
|
// Heuristic decision logic
|
||||||
|
bool u_is_effectively_precolored = precolored.count(u) || coloredNodes.count(u);
|
||||||
bool can_coalesce = false;
|
bool can_coalesce = false;
|
||||||
|
|
||||||
if (is_u_precolored) {
|
if (u_is_effectively_precolored) {
|
||||||
// --- Case 1: u is a physical register, use the George heuristic ---
|
if (DEEPERDEBUG) std::cerr << " -> Trying George Heuristic (u is effectively precolored)...\n";
|
||||||
if (DEEPERDEBUG) std::cerr << " -> Trying George Heuristic (u is precolored)...\n";
|
|
||||||
|
|
||||||
// Step 1: call adjacent(v) separately to get the neighbor set
|
|
||||||
VRegSet neighbors_of_v = adjacent(v);
|
VRegSet neighbors_of_v = adjacent(v);
|
||||||
if (DEEPERDEBUG) {
|
if (DEEPERDEBUG) {
|
||||||
std::cerr << " - Neighbors of " << regIdToString(v) << " to check are (" << neighbors_of_v.size() << "): { ";
|
std::cerr << " - Neighbors of " << regIdToString(v) << " to check are (" << neighbors_of_v.size() << "): { ";
|
||||||
@@ -536,48 +681,35 @@ void RISCv64RegAlloc::coalesce() {
|
|||||||
std::cerr << "}\n";
|
std::cerr << "}\n";
|
||||||
}
|
}
|
||||||
|
|
||||||
// Step 2: use an explicit for loop instead of std::all_of
|
bool george_ok = true;
|
||||||
bool george_ok = true; // assume success; any failing neighbor sets this to false
|
|
||||||
for (unsigned t : neighbors_of_v) {
|
for (unsigned t : neighbors_of_v) {
|
||||||
if (DEEPERDEBUG) {
|
if (DEEPERDEBUG) std::cerr << " - Checking neighbor " << regIdToString(t) << ":\n";
|
||||||
std::cerr << " - Checking neighbor " << regIdToString(t) << ":\n";
|
|
||||||
}
|
|
||||||
|
|
||||||
// Step 3: call the heuristic separately
|
unsigned u_phys_id = precolored.count(u) ? u : (static_cast<unsigned>(PhysicalReg::PHYS_REG_START_ID) + static_cast<unsigned>(color_map.at(u)));
|
||||||
bool heuristic_result = georgeHeuristic(t, u);
|
bool heuristic_result = georgeHeuristic(t, u_phys_id);
|
||||||
|
|
||||||
if (DEEPERDEBUG) {
|
if (DEEPERDEBUG) {
|
||||||
std::cerr << " - georgeHeuristic(" << regIdToString(t) << ", " << regIdToString(u) << ") -> " << (heuristic_result ? "OK" : "FAIL") << "\n";
|
std::cerr << " - georgeHeuristic(" << regIdToString(t) << ", " << regIdToString(u_phys_id) << ") -> " << (heuristic_result ? "OK" : "FAIL") << "\n";
|
||||||
}
|
}
|
||||||
|
|
||||||
if (!heuristic_result) {
|
if (!heuristic_result) {
|
||||||
george_ok = false; // one failing neighbor fails the whole check
|
george_ok = false;
|
||||||
break; // and we can stop checking the remaining neighbors
|
break;
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
|
||||||
if (DEEPERDEBUG) {
|
if (DEEPERDEBUG) std::cerr << " -> George Heuristic final result: " << (george_ok ? "OK" : "FAIL") << "\n";
|
||||||
std::cerr << " -> George Heuristic final result: " << (george_ok ? "OK" : "FAIL") << "\n";
|
if (george_ok) can_coalesce = true;
|
||||||
}
|
|
||||||
|
|
||||||
if (george_ok) {
|
|
||||||
can_coalesce = true;
|
|
||||||
}
|
|
||||||
|
|
||||||
} else {
|
} else {
|
||||||
// --- Case 2: u and v are both virtual registers, use the Briggs heuristic ---
|
// --- Case 2: u and v are both uncolored virtual registers, use the Briggs heuristic ---
|
||||||
if (DEEPERDEBUG) std::cerr << " -> Trying Briggs Heuristic (u and v are virtual)...\n";
|
if (DEEPERDEBUG) std::cerr << " -> Trying Briggs Heuristic (u and v are virtual)...\n";
|
||||||
|
|
||||||
bool briggs_ok = briggsHeuristic(u, v);
|
bool briggs_ok = briggsHeuristic(u, v);
|
||||||
if (DEEPERDEBUG) std::cerr << " - briggsHeuristic(" << regIdToString(u) << ", " << regIdToString(v) << ") -> " << (briggs_ok ? "OK" : "FAIL") << "\n";
|
if (DEEPERDEBUG) std::cerr << " - briggsHeuristic(" << regIdToString(u) << ", " << regIdToString(v) << ") -> " << (briggs_ok ? "OK" : "FAIL") << "\n";
|
||||||
|
if (briggs_ok) can_coalesce = true;
|
||||||
if (briggs_ok) {
|
|
||||||
can_coalesce = true;
|
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// --- Make the final decision based on the heuristic result ---
|
|
||||||
|
|
||||||
if (can_coalesce) {
|
if (can_coalesce) {
|
||||||
if (DEEPERDEBUG) std::cerr << " -> Heuristic OK. Combining " << regIdToString(v) << " into " << regIdToString(u) << ".\n";
|
if (DEEPERDEBUG) std::cerr << " -> Heuristic OK. Combining " << regIdToString(v) << " into " << regIdToString(u) << ".\n";
|
||||||
coalescedMoves.insert(move);
|
coalescedMoves.insert(move);
|
||||||
@@ -1133,7 +1265,7 @@ unsigned RISCv64RegAlloc::getAlias(unsigned n) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
void RISCv64RegAlloc::addWorklist(unsigned u) {
|
void RISCv64RegAlloc::addWorklist(unsigned u) {
|
||||||
if (precolored.count(u)) return;
|
if (precolored.count(u) || color_map.count(u)) return;
|
||||||
|
|
||||||
int K = isFPVReg(u) ? K_fp : K_int;
|
int K = isFPVReg(u) ? K_fp : K_int;
|
||||||
if (!moveRelated(u) && degree.at(u) < K) {
|
if (!moveRelated(u) && degree.at(u) < K) {
|
||||||
@@ -1208,8 +1340,8 @@ bool RISCv64RegAlloc::georgeHeuristic(unsigned t, unsigned u) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
int K = isFPVReg(t) ? K_fp : K_int;
|
int K = isFPVReg(t) ? K_fp : K_int;
|
||||||
    // adjList.at(t) is safe here, because degree.count(t) > 0 guarantees adjList.count(t) > 0
|
|
||||||
return degree.at(t) < K || precolored.count(u) || adjList.at(t).count(u);
|
return degree.at(t) < K || adjList.at(t).count(u);
|
||||||
}
|
}
|
||||||
|
|
||||||
void RISCv64RegAlloc::combine(unsigned u, unsigned v) {
|
void RISCv64RegAlloc::combine(unsigned u, unsigned v) {
|
||||||
@@ -1257,7 +1389,7 @@ void RISCv64RegAlloc::freezeMoves(unsigned u) {
|
|||||||
activeMoves.erase(move);
|
activeMoves.erase(move);
|
||||||
frozenMoves.insert(move);
|
frozenMoves.insert(move);
|
||||||
|
|
||||||
if (!precolored.count(v_alias) && nodeMoves(v_alias).empty() && degree.at(v_alias) < (isFPVReg(v_alias) ? K_fp : K_int)) {
|
if (!precolored.count(v_alias) && !coloredNodes.count(v_alias) && nodeMoves(v_alias).empty() && degree.at(v_alias) < (isFPVReg(v_alias) ? K_fp : K_int)) {
|
||||||
freezeWorklist.erase(v_alias);
|
freezeWorklist.erase(v_alias);
|
||||||
simplifyWorklist.insert(v_alias);
|
simplifyWorklist.insert(v_alias);
|
||||||
if (DEEPERDEBUG) {
|
if (DEEPERDEBUG) {
|
||||||
|
|||||||
@@ -11,6 +11,7 @@ namespace sysy {
|
|||||||
|
|
||||||
extern int DEBUG;
|
extern int DEBUG;
|
||||||
extern int DEEPDEBUG;
|
extern int DEEPDEBUG;
|
||||||
|
extern int optLevel;
|
||||||
|
|
||||||
namespace sysy {
|
namespace sysy {
|
||||||
|
|
||||||
|
|||||||
@@ -326,12 +326,19 @@ public:
|
|||||||
void addBlock(std::unique_ptr<MachineBasicBlock> block) {
|
void addBlock(std::unique_ptr<MachineBasicBlock> block) {
|
||||||
blocks.push_back(std::move(block));
|
blocks.push_back(std::move(block));
|
||||||
}
|
}
|
||||||
|
void addProtectedArgumentVReg(unsigned vreg) {
|
||||||
|
protected_argument_vregs.insert(vreg);
|
||||||
|
}
|
||||||
|
const std::set<unsigned>& getProtectedArgumentVRegs() const {
|
||||||
|
return protected_argument_vregs;
|
||||||
|
}
|
||||||
private:
|
private:
|
||||||
Function* F;
|
Function* F;
|
||||||
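    RISCv64ISel* isel; // points to the ISel that created it; used to fetch vreg mappings and similar information
|
    RISCv64ISel* isel; // points to the ISel that created it; used to fetch vreg mappings and similar information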
|
||||||
std::string name;
|
std::string name;
|
||||||
std::vector<std::unique_ptr<MachineBasicBlock>> blocks;
|
std::vector<std::unique_ptr<MachineBasicBlock>> blocks;
|
||||||
StackFrameInfo frame_info;
|
StackFrameInfo frame_info;
|
||||||
|
std::set<unsigned> protected_argument_vregs;
|
||||||
};
|
};
|
||||||
inline bool isMemoryOp(RVOpcodes opcode) {
|
inline bool isMemoryOp(RVOpcodes opcode) {
|
||||||
switch (opcode) {
|
switch (opcode) {
|
||||||
|
|||||||
@@ -12,6 +12,7 @@ extern int DEBUG;
|
|||||||
extern int DEEPDEBUG;
|
extern int DEEPDEBUG;
|
||||||
extern int DEBUGLENGTH; // limits the length of debug output
|
extern int DEBUGLENGTH; // limits the length of debug output
|
||||||
extern int DEEPERDEBUG; // enables deeper-level debug output
|
extern int DEEPERDEBUG; // enables deeper-level debug output
|
||||||
|
extern int optLevel;
|
||||||
|
|
||||||
namespace sysy {
|
namespace sysy {
|
||||||
|
|
||||||
@@ -44,12 +45,11 @@ private:
|
|||||||
void rewriteProgram();
|
void rewriteProgram();
|
||||||
bool doAllocation();
|
bool doAllocation();
|
||||||
void applyColoring();
|
void applyColoring();
|
||||||
|
|
||||||
void dumpState(const std::string &stage);
|
|
||||||
|
|
||||||
void precolorByCallingConvention();
|
void precolorByCallingConvention();
|
||||||
|
void protectCrossCallVRegs();
|
||||||
|
|
||||||
    // --- Helper functions ---
|
    // --- Helper functions ---
|
||||||
|
void dumpState(const std::string &stage);
|
||||||
void getInstrUseDef(const MachineInstr* instr, VRegSet& use, VRegSet& def);
|
void getInstrUseDef(const MachineInstr* instr, VRegSet& use, VRegSet& def);
|
||||||
void getInstrUseDef_Liveness(const MachineInstr *instr, VRegSet &use, VRegSet &def);
|
void getInstrUseDef_Liveness(const MachineInstr *instr, VRegSet &use, VRegSet &def);
|
||||||
void addEdge(unsigned u, unsigned v);
|
void addEdge(unsigned u, unsigned v);
|
||||||
|
|||||||
@@ -1007,6 +1007,7 @@ class PhiInst : public Instruction {
|
|||||||
void replaceIncomingBlock(BasicBlock *oldBlock, BasicBlock *newBlock, Value *newValue);
|
void replaceIncomingBlock(BasicBlock *oldBlock, BasicBlock *newBlock, Value *newValue);
|
||||||
void refreshMap() {
|
void refreshMap() {
|
||||||
blk2val.clear();
|
blk2val.clear();
|
||||||
|
vsize = getNumOperands() / 2;
|
||||||
for (unsigned i = 0; i < vsize; ++i) {
|
for (unsigned i = 0; i < vsize; ++i) {
|
||||||
blk2val[getIncomingBlock(i)] = getIncomingValue(i);
|
blk2val[getIncomingBlock(i)] = getIncomingValue(i);
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1,24 +0,0 @@
|
|||||||
#pragma once
|
|
||||||
|
|
||||||
#include "../Pass.h"
|
|
||||||
|
|
||||||
namespace sysy {
|
|
||||||
|
|
||||||
class LargeArrayToGlobalPass : public OptimizationPass {
|
|
||||||
public:
|
|
||||||
static void *ID;
|
|
||||||
|
|
||||||
LargeArrayToGlobalPass() : OptimizationPass("LargeArrayToGlobal", Granularity::Module) {}
|
|
||||||
|
|
||||||
bool runOnModule(Module *M, AnalysisManager &AM) override;
|
|
||||||
void *getPassID() const override {
|
|
||||||
return &ID;
|
|
||||||
}
|
|
||||||
|
|
||||||
private:
|
|
||||||
unsigned calculateTypeSize(Type *type);
|
|
||||||
void convertAllocaToGlobal(AllocaInst *alloca, Function *F, Module *M);
|
|
||||||
std::string generateUniqueGlobalName(AllocaInst *alloca, Function *F);
|
|
||||||
};
|
|
||||||
|
|
||||||
} // namespace sysy
|
|
||||||
@@ -109,6 +109,34 @@ public:
|
|||||||
}
|
}
|
||||||
|
|
||||||
|
|
||||||
|
    // Methods for eliminating Phi instructions
|
||||||
|
static bool eliminateRedundantPhisInFunction(Function* func){
|
||||||
|
bool changed = false;
|
||||||
|
std::vector<Instruction *> toDelete;
|
||||||
|
for (auto &bb : func->getBasicBlocks()) {
|
||||||
|
for (auto &inst : bb->getInstructions()) {
|
||||||
|
if (auto phi = dynamic_cast<PhiInst *>(inst.get())) {
|
||||||
|
auto incoming = phi->getIncomingValues();
|
||||||
|
if(DEBUG){
|
||||||
|
std::cout << "Checking Phi: " << phi->getName() << " with " << incoming.size() << " incoming values." << std::endl;
|
||||||
|
}
|
||||||
|
if (incoming.size() == 1) {
|
||||||
|
Value *singleVal = incoming[0].second;
|
||||||
|
inst->replaceAllUsesWith(singleVal);
|
||||||
|
toDelete.push_back(inst.get());
|
||||||
|
}
|
||||||
|
}
|
||||||
|
else
|
||||||
|
          break; // only process Phi instructions
|
||||||
|
}
|
||||||
|
}
|
||||||
|
for (auto *phi : toDelete) {
|
||||||
|
usedelete(phi);
|
||||||
|
      changed = true; // mark as changed
|
||||||
|
}
|
||||||
|
    return changed; // report whether anything was deleted
|
||||||
|
}
|
||||||
|
|
||||||
    // This implementation is based on the libdivide algorithm
|
    // This implementation is based on the libdivide algorithm
|
||||||
static std::pair<int, int> computeMulhMagicNumbers(int divisor) {
|
static std::pair<int, int> computeMulhMagicNumbers(int divisor) {
|
||||||
|
|
||||||
|
|||||||
@@ -51,6 +51,7 @@ public:
|
|||||||
Module *pModule, IRBuilder *pBuilder);
|
Module *pModule, IRBuilder *pBuilder);
|
||||||
|
|
||||||
static void initExternalFunction(Module *pModule, IRBuilder *pBuilder);
|
static void initExternalFunction(Module *pModule, IRBuilder *pBuilder);
|
||||||
|
static void modify_timefuncname(Module *pModule);
|
||||||
};
|
};
|
||||||
|
|
||||||
class SysYIRGenerator : public SysYBaseVisitor {
|
class SysYIRGenerator : public SysYBaseVisitor {
|
||||||
|
|||||||
@@ -24,7 +24,6 @@ add_library(midend_lib STATIC
|
|||||||
Pass/Optimize/InductionVariableElimination.cpp
|
Pass/Optimize/InductionVariableElimination.cpp
|
||||||
Pass/Optimize/GlobalStrengthReduction.cpp
|
Pass/Optimize/GlobalStrengthReduction.cpp
|
||||||
Pass/Optimize/BuildCFG.cpp
|
Pass/Optimize/BuildCFG.cpp
|
||||||
Pass/Optimize/LargeArrayToGlobal.cpp
|
|
||||||
Pass/Optimize/TailCallOpt.cpp
|
Pass/Optimize/TailCallOpt.cpp
|
||||||
)
|
)
|
||||||
|
|
||||||
|
|||||||
@@ -757,7 +757,7 @@ void BinaryInst::print(std::ostream &os) const {
|
|||||||
auto lhs_hash = std::hash<const void*>{}(static_cast<const void*>(getLhs()));
|
auto lhs_hash = std::hash<const void*>{}(static_cast<const void*>(getLhs()));
|
||||||
auto rhs_hash = std::hash<const void*>{}(static_cast<const void*>(getRhs()));
|
auto rhs_hash = std::hash<const void*>{}(static_cast<const void*>(getRhs()));
|
||||||
size_t combined_hash = inst_hash ^ (lhs_hash << 1) ^ (rhs_hash << 2);
|
size_t combined_hash = inst_hash ^ (lhs_hash << 1) ^ (rhs_hash << 2);
|
||||||
std::string tmpName = "tmp_icmp_" + std::to_string(combined_hash % 1000000);
|
std::string tmpName = "tmp_icmp_" + std::to_string(combined_hash % 1000000007);
|
||||||
os << "%" << tmpName << " = " << getKindString() << " " << *getLhs()->getType() << " ";
|
os << "%" << tmpName << " = " << getKindString() << " " << *getLhs()->getType() << " ";
|
||||||
printOperand(os, getLhs());
|
printOperand(os, getLhs());
|
||||||
os << ", ";
|
os << ", ";
|
||||||
@@ -772,7 +772,7 @@ void BinaryInst::print(std::ostream &os) const {
|
|||||||
auto lhs_hash = std::hash<const void*>{}(static_cast<const void*>(getLhs()));
|
auto lhs_hash = std::hash<const void*>{}(static_cast<const void*>(getLhs()));
|
||||||
auto rhs_hash = std::hash<const void*>{}(static_cast<const void*>(getRhs()));
|
auto rhs_hash = std::hash<const void*>{}(static_cast<const void*>(getRhs()));
|
||||||
size_t combined_hash = inst_hash ^ (lhs_hash << 1) ^ (rhs_hash << 2);
|
size_t combined_hash = inst_hash ^ (lhs_hash << 1) ^ (rhs_hash << 2);
|
||||||
std::string tmpName = "tmp_fcmp_" + std::to_string(combined_hash % 1000000);
|
std::string tmpName = "tmp_fcmp_" + std::to_string(combined_hash % 1000000007);
|
||||||
os << "%" << tmpName << " = " << getKindString() << " " << *getLhs()->getType() << " ";
|
os << "%" << tmpName << " = " << getKindString() << " " << *getLhs()->getType() << " ";
|
||||||
printOperand(os, getLhs());
|
printOperand(os, getLhs());
|
||||||
os << ", ";
|
os << ", ";
|
||||||
@@ -834,7 +834,7 @@ void CondBrInst::print(std::ostream &os) const {
|
|||||||
if (condName.empty()) {
|
if (condName.empty()) {
|
||||||
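    // Use the hash of the condition value's address as a unique identifier
|
    // Use the hash of the condition value's address as a unique identifier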
|
||||||
auto ptr_hash = std::hash<const void*>{}(static_cast<const void*>(condition));
|
auto ptr_hash = std::hash<const void*>{}(static_cast<const void*>(condition));
|
||||||
condName = "const_" + std::to_string(ptr_hash % 100000);
|
condName = "const_" + std::to_string(ptr_hash % 1000000007);
|
||||||
}
|
}
|
||||||
|
|
||||||
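  // Combine the hashes of the instruction, condition, and target block addresses to keep names unique
|
  // Combine the hashes of the instruction, condition, and target block addresses to keep names unique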
|
||||||
@@ -843,7 +843,7 @@ void CondBrInst::print(std::ostream &os) const {
|
|||||||
auto then_hash = std::hash<const void*>{}(static_cast<const void*>(getThenBlock()));
|
auto then_hash = std::hash<const void*>{}(static_cast<const void*>(getThenBlock()));
|
||||||
auto else_hash = std::hash<const void*>{}(static_cast<const void*>(getElseBlock()));
|
auto else_hash = std::hash<const void*>{}(static_cast<const void*>(getElseBlock()));
|
||||||
size_t combined_hash = inst_hash ^ (cond_hash << 1) ^ (then_hash << 2) ^ (else_hash << 3);
|
size_t combined_hash = inst_hash ^ (cond_hash << 1) ^ (then_hash << 2) ^ (else_hash << 3);
|
||||||
std::string uniqueSuffix = std::to_string(combined_hash % 1000000);
|
std::string uniqueSuffix = std::to_string(combined_hash % 1000000007);
|
||||||
|
|
||||||
os << "%tmp_cond_" << condName << "_" << uniqueSuffix << " = icmp ne i32 ";
|
os << "%tmp_cond_" << condName << "_" << uniqueSuffix << " = icmp ne i32 ";
|
||||||
printOperand(os, condition);
|
printOperand(os, condition);
|
||||||
|
|||||||
@@ -74,6 +74,7 @@ void DCEContext::run(Function *func, AnalysisManager *AM, bool &changed) {
|
|||||||
}
|
}
|
||||||
}
|
}
|
||||||
}
|
}
|
||||||
|
  changed |= SysYIROptUtils::eliminateRedundantPhisInFunction(func); // mark as changed if any redundant Phi was removed
|
||||||
}
|
}
|
||||||
|
|
||||||
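// Implementation of the check for whether an instruction is "inherently live"
|
// Implementation of the check for whether an instruction is "inherently live"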
|
||||||
|
|||||||
@@ -39,7 +39,7 @@ bool GVN::runOnFunction(Function *func, AnalysisManager &AM) {
|
|||||||
}
|
}
|
||||||
std::cout << "=== GVN completed for function: " << func->getName() << " ===" << std::endl;
|
std::cout << "=== GVN completed for function: " << func->getName() << " ===" << std::endl;
|
||||||
}
|
}
|
||||||
|
changed |= SysYIROptUtils::eliminateRedundantPhisInFunction(func);
|
||||||
return changed;
|
return changed;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -671,13 +671,13 @@ bool GlobalStrengthReductionContext::reduceDivision(BinaryInst *inst) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
    // x / c = x * magic_number (magic-number multiplication optimization, using the libdivide algorithm)
|
    // x / c = x * magic_number (magic-number multiplication optimization, using the libdivide algorithm)
|
||||||
if (isConstantInt(rhs, constVal) && constVal > 1 && constVal != (uint32_t)(-1)) {
|
// if (isConstantInt(rhs, constVal) && constVal > 1 && constVal != (uint32_t)(-1)) {
|
||||||
// auto magicPair = computeMulhMagicNumbers(static_cast<int>(constVal));
|
// // auto magicPair = computeMulhMagicNumbers(static_cast<int>(constVal));
|
||||||
Value* magicResult = createMagicDivisionLibdivide(inst, static_cast<int>(constVal));
|
// Value* magicResult = createMagicDivisionLibdivide(inst, static_cast<int>(constVal));
|
||||||
replaceWithOptimized(inst, magicResult);
|
// replaceWithOptimized(inst, magicResult);
|
||||||
divisionOptCount++;
|
// divisionOptCount++;
|
||||||
return true;
|
// return true;
|
||||||
}
|
// }
|
||||||
|
|
||||||
return false;
|
return false;
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -133,6 +133,7 @@ bool InductionVariableEliminationContext::run(Function* F, AnalysisManager& AM)
|
|||||||
printDebugInfo();
|
printDebugInfo();
|
||||||
}
|
}
|
||||||
|
|
||||||
|
modified |= SysYIROptUtils::eliminateRedundantPhisInFunction(F);
|
||||||
return modified;
|
return modified;
|
||||||
}
|
}
|
||||||
|
|
||||||
|
|||||||
@@ -1,145 +0,0 @@
|
|||||||
#include "../../include/midend/Pass/Optimize/LargeArrayToGlobal.h"
|
|
||||||
#include "../../IR.h"
|
|
||||||
#include <unordered_map>
|
|
||||||
#include <sstream>
|
|
||||||
#include <string>
|
|
||||||
|
|
||||||
namespace sysy {
|
|
||||||
|
|
||||||
// Helper function to convert type to string
|
|
||||||
static std::string typeToString(Type *type) {
|
|
||||||
if (!type) return "null";
|
|
||||||
|
|
||||||
switch (type->getKind()) {
|
|
||||||
case Type::kInt:
|
|
||||||
return "int";
|
|
||||||
case Type::kFloat:
|
|
||||||
return "float";
|
|
||||||
case Type::kPointer:
|
|
||||||
return "ptr";
|
|
||||||
case Type::kArray: {
|
|
||||||
auto *arrayType = type->as<ArrayType>();
|
|
||||||
return "[" + std::to_string(arrayType->getNumElements()) + " x " +
|
|
||||||
typeToString(arrayType->getElementType()) + "]";
|
|
||||||
}
|
|
||||||
default:
|
|
||||||
return "unknown";
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
void *LargeArrayToGlobalPass::ID = &LargeArrayToGlobalPass::ID;
|
|
||||||
|
|
||||||
bool LargeArrayToGlobalPass::runOnModule(Module *M, AnalysisManager &AM) {
|
|
||||||
bool changed = false;
|
|
||||||
|
|
||||||
if (!M) {
|
|
||||||
return false;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Collect all alloca instructions from all functions
|
|
||||||
std::vector<std::pair<AllocaInst*, Function*>> allocasToConvert;
|
|
||||||
|
|
||||||
for (auto &funcPair : M->getFunctions()) {
|
|
||||||
Function *F = funcPair.second.get();
|
|
||||||
if (!F || F->getBasicBlocks().begin() == F->getBasicBlocks().end()) {
|
|
||||||
continue;
|
|
||||||
}
|
|
||||||
|
|
||||||
for (auto &BB : F->getBasicBlocks()) {
|
|
||||||
for (auto &inst : BB->getInstructions()) {
|
|
||||||
if (auto *alloca = dynamic_cast<AllocaInst*>(inst.get())) {
|
|
||||||
Type *allocatedType = alloca->getAllocatedType();
|
|
||||||
|
|
||||||
// Calculate the size of the allocated type
|
|
||||||
unsigned size = calculateTypeSize(allocatedType);
|
|
||||||
if(DEBUG){
|
|
||||||
// Debug: print size information
|
|
||||||
std::cout << "LargeArrayToGlobalPass: Found alloca with size " << size
|
|
||||||
<< " for type " << typeToString(allocatedType) << std::endl;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Convert arrays of 1KB (1024 bytes) or larger to global variables
|
|
||||||
if (size >= 1024) {
|
|
||||||
if(DEBUG)
|
|
||||||
std::cout << "LargeArrayToGlobalPass: Converting array of size " << size << " to global" << std::endl;
|
|
||||||
allocasToConvert.emplace_back(alloca, F);
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
// Convert the collected alloca instructions to global variables
|
|
||||||
for (auto [alloca, F] : allocasToConvert) {
|
|
||||||
convertAllocaToGlobal(alloca, F, M);
|
|
||||||
changed = true;
|
|
||||||
}
|
|
||||||
|
|
||||||
return changed;
|
|
||||||
}
|
|
||||||
|
|
||||||
unsigned LargeArrayToGlobalPass::calculateTypeSize(Type *type) {
|
|
||||||
if (!type) return 0;
|
|
||||||
|
|
||||||
switch (type->getKind()) {
|
|
||||||
case Type::kInt:
|
|
||||||
case Type::kFloat:
|
|
||||||
return 4;
|
|
||||||
case Type::kPointer:
|
|
||||||
return 8;
|
|
||||||
case Type::kArray: {
|
|
||||||
auto *arrayType = type->as<ArrayType>();
|
|
||||||
return arrayType->getNumElements() * calculateTypeSize(arrayType->getElementType());
|
|
||||||
}
|
|
||||||
default:
|
|
||||||
return 0;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
void LargeArrayToGlobalPass::convertAllocaToGlobal(AllocaInst *alloca, Function *F, Module *M) {
|
|
||||||
Type *allocatedType = alloca->getAllocatedType();
|
|
||||||
|
|
||||||
// Create a unique name for the global variable
|
|
||||||
std::string globalName = generateUniqueGlobalName(alloca, F);
|
|
||||||
|
|
||||||
// Create the global variable - GlobalValue expects pointer type
|
|
||||||
Type *pointerType = Type::getPointerType(allocatedType);
|
|
||||||
GlobalValue *globalVar = M->createGlobalValue(globalName, pointerType);
|
|
||||||
|
|
||||||
if (!globalVar) {
|
|
||||||
return;
|
|
||||||
}
|
|
||||||
|
|
||||||
// Replace all uses of the alloca with the global variable
|
|
||||||
alloca->replaceAllUsesWith(globalVar);
|
|
||||||
|
|
||||||
// Remove the alloca instruction from its basic block
|
|
||||||
for (auto &BB : F->getBasicBlocks()) {
|
|
||||||
auto &instructions = BB->getInstructions();
|
|
||||||
for (auto it = instructions.begin(); it != instructions.end(); ++it) {
|
|
||||||
if (it->get() == alloca) {
|
|
||||||
instructions.erase(it);
|
|
||||||
break;
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
}
|
|
||||||
|
|
||||||
std::string LargeArrayToGlobalPass::generateUniqueGlobalName(AllocaInst *alloca, Function *F) {
|
|
||||||
std::string baseName = alloca->getName();
|
|
||||||
if (baseName.empty()) {
|
|
||||||
baseName = "array";
|
|
||||||
}
|
|
||||||
|
|
||||||
// Ensure uniqueness by appending function name and counter
|
|
||||||
static std::unordered_map<std::string, int> nameCounter;
|
|
||||||
std::string key = F->getName() + "." + baseName;
|
|
||||||
|
|
||||||
int counter = nameCounter[key]++;
|
|
||||||
std::ostringstream oss;
|
|
||||||
oss << key << "." << counter;
|
|
||||||
|
|
||||||
return oss.str();
|
|
||||||
}
|
|
||||||
|
|
||||||
} // namespace sysy
|
|
||||||
@@ -661,9 +661,9 @@ bool StrengthReductionContext::replaceOriginalInstruction(StrengthReductionCandi
|
|||||||
|
|
||||||
case StrengthReductionCandidate::DIVIDE_CONST: {
|
case StrengthReductionCandidate::DIVIDE_CONST: {
|
||||||
            // Division by an arbitrary constant
|
            // Division by an arbitrary constant
|
||||||
builder->setPosition(candidate->containingBlock,
|
// builder->setPosition(candidate->containingBlock,
|
||||||
candidate->containingBlock->findInstIterator(candidate->originalInst));
|
// candidate->containingBlock->findInstIterator(candidate->originalInst));
|
||||||
replacementValue = generateConstantDivisionReplacement(candidate, builder);
|
// replacementValue = generateConstantDivisionReplacement(candidate, builder);
|
||||||
break;
|
break;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -683,17 +683,19 @@ bool StrengthReductionContext::replaceOriginalInstruction(StrengthReductionCandi
|
|||||||
);
|
);
|
||||||
|
|
||||||
                // Check whether the original value is negative
|
                // Check whether the original value is negative
|
||||||
Value* zero = ConstantInteger::get(0);
|
Value* shift31condidata = builder->createBinaryInst(
|
||||||
Value* isNegative = builder->createICmpLTInst(candidate->inductionVar, zero);
|
Instruction::Kind::kSra, candidate->inductionVar->getType(),
|
||||||
|
candidate->inductionVar, ConstantInteger::get(31)
|
||||||
|
);
|
||||||
|
|
||||||
                // If it is negative, the result needs to be adjusted
|
                // If it is negative, the result needs to be adjusted
|
||||||
Value* adjustment = ConstantInteger::get(candidate->multiplier);
|
Value* adjustment = builder->createAndInst(shift31condidata, maskConstant);
|
||||||
Value* adjustedTemp = builder->createAddInst(temp, adjustment);
|
Value* adjustedTemp = builder->createAddInst(candidate->inductionVar, adjustment);
|
||||||
|
Value* adjustedResult = builder->createBinaryInst(
|
||||||
                // Use a conditional branch to emulate a select operation
|
Instruction::Kind::kAnd, candidate->inductionVar->getType(),
|
||||||
                // For simplicity, use a more complex but working approach here for now
|
adjustedTemp, maskConstant
|
||||||
                // A conditional branch should really be created, but this is simplified for now
|
);
|
||||||
                replacementValue = temp; // simplified version: assumes the value is non-negative in most cases
|
replacementValue = adjustedResult;
|
||||||
} else {
|
} else {
|
||||||
                // Modulo of a non-negative value: use bitwise AND directly
|
                // Modulo of a non-negative value: use bitwise AND directly
|
||||||
replacementValue = builder->createBinaryInst(
|
replacementValue = builder->createBinaryInst(
|
||||||
|
|||||||
@@ -70,20 +70,20 @@ void Reg2MemContext::allocateMemoryForSSAValues(Function *func) {
|
|||||||
|
|
||||||
  // 1. Allocate memory for function arguments
|
  // 1. Allocate memory for function arguments
|
||||||
  builder->setPosition(entryBlock, entryBlock->begin()); // make sure insertion happens at the start of the entry block
|
  builder->setPosition(entryBlock, entryBlock->begin()); // make sure insertion happens at the start of the entry block
|
||||||
for (auto arg : func->getArguments()) {
|
// for (auto arg : func->getArguments()) {
|
||||||
    // By default, promote all arguments to memory
|
  //   // By default, promote all arguments to memory
|
||||||
if (isPromotableToMemory(arg)) {
|
// if (isPromotableToMemory(arg)) {
|
||||||
      // The argument's type is the type the AllocaInst needs to allocate
|
  //     // The argument's type is the type the AllocaInst needs to allocate
|
||||||
AllocaInst *alloca = builder->createAllocaInst(Type::getPointerType(arg->getType()), arg->getName() + ".reg2mem");
|
// AllocaInst *alloca = builder->createAllocaInst(Type::getPointerType(arg->getType()), arg->getName() + ".reg2mem");
|
||||||
      // Store the argument value into the alloca (the key step in reversing Mem2Reg)
|
  //     // Store the argument value into the alloca (the key step in reversing Mem2Reg)
|
||||||
valueToAllocaMap[arg] = alloca;
|
// valueToAllocaMap[arg] = alloca;
|
||||||
|
|
||||||
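      // Make sure the alloca sits at the top of the entry block, before the store instructions for all arguments
|
  //     // Make sure the alloca sits at the top of the entry block, before the store instructions for all arguments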
|
||||||
      // Allocas usually sit at the very start of the entry block
|
  //     // Allocas usually sit at the very start of the entry block
|
||||||
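      // Here we only create it and let the builder decide where to insert it (usually the current insertion point)
|
  //     // Here we only create it and let the builder decide where to insert it (usually the current insertion point)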
|
||||||
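      // If strict ordering is required, it may need to be inserted into the instruction list manually
|
  //     // If strict ordering is required, it may need to be inserted into the instruction list manually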
|
||||||
}
|
// }
|
||||||
}
|
// }
|
||||||
|
|
||||||
  // 2. Allocate memory for instruction results
|
  // 2. Allocate memory for instruction results
|
||||||
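  // Walk all basic blocks and instructions to find every instruction result that needs an Alloca
|
  // Walk all basic blocks and instructions to find every instruction result that needs an Alloca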
|
||||||
@@ -123,11 +123,11 @@ void Reg2MemContext::allocateMemoryForSSAValues(Function *func) {
|
|||||||
}
|
}
|
||||||
|
|
||||||
  // Insert the initial Store instructions for all arguments
|
  // Insert the initial Store instructions for all arguments
|
||||||
for (auto arg : func->getArguments()) {
|
// for (auto arg : func->getArguments()) {
|
||||||
    if (valueToAllocaMap.count(arg)) { // check whether an alloca was allocated for it
|
  //   if (valueToAllocaMap.count(arg)) { // check whether an alloca was allocated for it
|
||||||
builder->createStoreInst(arg, valueToAllocaMap[arg]);
|
// builder->createStoreInst(arg, valueToAllocaMap[arg]);
|
||||||
}
|
// }
|
||||||
}
|
// }
|
||||||
|
|
||||||
builder->setPosition(entryBlock, entryBlock->terminator());
|
builder->setPosition(entryBlock, entryBlock->terminator());
|
||||||
}
|
}
|
||||||
|
|||||||
@@ -1357,9 +1357,8 @@ void SCCPContext::run(Function *func, AnalysisManager &AM) {
|
|||||||
bool changed_control_flow = SimplifyControlFlow(func);
|
bool changed_control_flow = SimplifyControlFlow(func);
|
||||||
|
|
||||||
  // If either phase modified the IR, mark the analysis results as invalid
|
  // If either phase modified the IR, mark the analysis results as invalid
|
||||||
if (changed_constant_propagation || changed_control_flow) {
|
bool changed = changed_constant_propagation || changed_control_flow;
|
||||||
    // AM.invalidate(); // assuming such a method exists to invalidate all analysis results
|
changed |= SysYIROptUtils::eliminateRedundantPhisInFunction(func);
|
||||||
}
|
|
||||||
}
|
}
|
||||||
|
|
||||||
// SCCP Pass methods
|
// SCCP Pass methods
|
||||||
|
|||||||
@@ -13,7 +13,6 @@
|
|||||||
#include "GVN.h"
|
#include "GVN.h"
|
||||||
#include "SCCP.h"
|
#include "SCCP.h"
|
||||||
#include "BuildCFG.h"
|
#include "BuildCFG.h"
|
||||||
#include "LargeArrayToGlobal.h"
|
|
||||||
#include "LoopNormalization.h"
|
#include "LoopNormalization.h"
|
||||||
#include "LICM.h"
|
#include "LICM.h"
|
||||||
#include "LoopStrengthReduction.h"
|
#include "LoopStrengthReduction.h"
|
||||||
@@ -61,8 +60,6 @@ void PassManager::runOptimizationPipeline(Module* moduleIR, IRBuilder* builderIR
|
|||||||
|
|
||||||
  // Register the optimization passes
|
  // Register the optimization passes
|
||||||
registerOptimizationPass<BuildCFG>();
|
registerOptimizationPass<BuildCFG>();
|
||||||
registerOptimizationPass<LargeArrayToGlobalPass>();
|
|
||||||
|
|
||||||
registerOptimizationPass<GVN>();
|
registerOptimizationPass<GVN>();
|
||||||
|
|
||||||
registerOptimizationPass<SysYDelInstAfterBrPass>();
|
registerOptimizationPass<SysYDelInstAfterBrPass>();
|
||||||
@@ -98,7 +95,6 @@ void PassManager::runOptimizationPipeline(Module* moduleIR, IRBuilder* builderIR
|
|||||||
|
|
||||||
this->clearPasses();
|
this->clearPasses();
|
||||||
this->addPass(&BuildCFG::ID);
|
this->addPass(&BuildCFG::ID);
|
||||||
this->addPass(&LargeArrayToGlobalPass::ID);
|
|
||||||
this->run();
|
this->run();
|
||||||
|
|
||||||
this->clearPasses();
|
this->clearPasses();
|
||||||
@@ -185,19 +181,19 @@ void PassManager::runOptimizationPipeline(Module* moduleIR, IRBuilder* builderIR
|
|||||||
printPasses();
|
printPasses();
|
||||||
}
|
}
|
||||||
|
|
||||||
// this->clearPasses();
|
this->clearPasses();
|
||||||
// this->addPass(&LoopStrengthReduction::ID);
|
this->addPass(&LoopStrengthReduction::ID);
|
||||||
// this->run();
|
this->run();
|
||||||
|
|
||||||
if(DEBUG) {
|
if(DEBUG) {
|
||||||
std::cout << "=== IR After Loop Normalization, and Strength Reduction Optimizations ===\n";
|
std::cout << "=== IR After Loop Normalization, and Strength Reduction Optimizations ===\n";
|
||||||
printPasses();
|
printPasses();
|
||||||
}
|
}
|
||||||
|
|
||||||
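  // // Global strength reduction, including algebraic optimizations and magic-number division
|
  // Global strength reduction, including algebraic optimizations and magic-number division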
|
||||||
// this->clearPasses();
|
this->clearPasses();
|
||||||
// this->addPass(&GlobalStrengthReduction::ID);
|
this->addPass(&GlobalStrengthReduction::ID);
|
||||||
// this->run();
|
this->run();
|
||||||
|
|
||||||
if(DEBUG) {
|
if(DEBUG) {
|
||||||
std::cout << "=== IR After Global Strength Reduction Optimizations ===\n";
|
std::cout << "=== IR After Global Strength Reduction Optimizations ===\n";
|
||||||
|
|||||||
@@ -674,6 +674,8 @@ std::any SysYIRGenerator::visitCompUnit(SysYParser::CompUnitContext *ctx) {
|
|||||||
pModule->enterNewScope();
|
pModule->enterNewScope();
|
||||||
visitChildren(ctx);
|
visitChildren(ctx);
|
||||||
pModule->leaveScope();
|
pModule->leaveScope();
|
||||||
|
|
||||||
|
Utils::modify_timefuncname(pModule);
|
||||||
return pModule;
|
return pModule;
|
||||||
}
|
}
|
||||||
|
|
||||||
@@ -2403,4 +2405,12 @@ void Utils::initExternalFunction(Module *pModule, IRBuilder *pBuilder) {
|
|||||||
|
|
||||||
}
|
}
|
||||||
|
|
||||||
|
void Utils::modify_timefuncname(Module *pModule){
|
||||||
|
auto starttimeFunc = pModule->getExternalFunction("starttime");
|
||||||
|
auto stoptimeFunc = pModule->getExternalFunction("stoptime");
|
||||||
|
starttimeFunc->setName("_sysy_starttime");
|
||||||
|
stoptimeFunc->setName("_sysy_stoptime");
|
||||||
|
|
||||||
|
}
|
||||||
|
|
||||||
} // namespace sysy
|
} // namespace sysy
|
||||||
@@ -28,7 +28,7 @@ static string argStopAfter;
|
|||||||
static string argInputFile;
|
static string argInputFile;
|
||||||
static bool argFormat = false; // currently unused, but kept
|
static bool argFormat = false; // currently unused, but kept
|
||||||
static string argOutputFilename;
|
static string argOutputFilename;
|
||||||
static int optLevel = 0; // optimization level; defaults to 0 (when no -O flag is given)
|
int optLevel = 0; // optimization level; defaults to 0 (when no -O flag is given)
|
||||||
|
|
||||||
void usage(int code) {
|
void usage(int code) {
|
||||||
const char *msg = "Usage: sysyc [options] inputfile\n\n"
|
const char *msg = "Usage: sysyc [options] inputfile\n\n"
|
||||||
|
|||||||