x86 LVI hardening #58

jethrogb · 2020-05-12T11:45:23Z

These patches are needed to mitigate the Load Value Injection (LVI) vulnerability in certain Intel processors.

The patches add several off-by-default passes to the x86 backend. The passes may be enabled as follows:
x86 Target features: lvi-cfi, lvi-load-hardening
LLVM argument: -x86-experimental-lvi-inline-asm-hardening

This is cherry-picking commits 080dd10 71e8021 b1d5810 5b519cf f95a67d e97a3e5 8ce078c 08b8b72 from upstream.

cc @raoulstrackx

llvm/lib/Target/X86/X86IndirectThunks.cpp

jethrogb · 2020-05-13T12:20:36Z

This is fixing a known, public, security vulnerability so I'd appreciate a quick review.

jethrogb · 2020-05-14T14:08:57Z

cc @nikomatsakis

nikic · 2020-05-14T14:48:14Z

@cuviper Does RedHat backport this?

I think this needs to bake upstream some more, this has landed upstream only a couple of days ago (after sitting in review for a few months). I'm also still waiting for some compile-time regression mitigations for this change (llvm@de92dc2 is a partial mitigation).

jethrogb · 2020-05-14T15:00:09Z

bake upstream some more

Can you elaborate on the timeline you have in mind?

The changes just landed upstream but the review process was very thorough. The patches were posted publicly in March but they have been developed and tested privately months prior. We've been using some version of these patches on a fork of Rust for a couple of months without issues.

Since these passes are opt-in, most users will not be affected by landing this. On the other hand, users who want to enable these passes need these patches yesterday to patch known, public, security vulnerabilities.

The patch you linked affects a file that doesn't exist in LLVM 9.

cuviper · 2020-05-14T20:51:09Z

@cuviper Does RedHat backport this?

No. Our official statement on CVE-2020-0551 is that "no additional changes are deemed necessary or practical to address this flaw at the software layer."

But that's specific to our products, and I do see how a project like Fortanix would need extra care.

llvm/test/CodeGen/X86/O3-pipeline.ll

cuviper · 2020-05-25T15:17:10Z

Please note that we've moved to LLVM 10 since rust-lang/rust#67759, so we'll need this PR rebased onto rustc/10.0-2020-05-05 to reach rust master.

nikic · 2020-05-25T15:18:05Z

Nightly has recently updated to LLVM 10, so this would need to go into the https://github.com/rust-lang/llvm-project/tree/rustc/10.0-2020-05-05 branch now.

@cuviper Does RedHat backport this?

No. Our official statement on CVE-2020-0551 is that "no additional changes are deemed necessary or practical to address this flaw at the software layer."

But that's specific to our products, and I do see how a project like Fortanix would need extra care.

That's about what I expected. Can you check with Tom Stellard whether there's any plans to backport this upstream for the LLVM 10.0.1 release? (Independently of RHs own policy on this issue.)

Apart from that, I guess it would be fine to merge this. This has been in LLVM master for a while now without any issues I'm aware of, and most people will never run this code. The only impact for most people would be a slight compile-time regression for debug builds.

RDF is designed to be target agnostic. Therefore it would be useful to have it available for other targets, such as X86. Based on a previous patch by Krzysztof Parzyszek Differential Revision: https://reviews.llvm.org/D75932

…de to "Indirect Thunks" There are applications for indirect call/branch thunks other than retpoline for Spectre v2, e.g., https://software.intel.com/security-software-guidance/software-guidance/load-value-injection Therefore it makes sense to refactor X86RetpolineThunks as a more general capability. Differential Revision: https://reviews.llvm.org/D76810

… than Retpoline Introduce a ThunkInserter CRTP base class from which new thunk types can inherit, e.g., thunks to mitigate https://software.intel.com/security-software-guidance/software-guidance/load-value-injection. Differential Revision: https://reviews.llvm.org/D76811

…ion (LVI) This pass replaces each indirect call/jump with a direct call to a thunk that looks like: lfence jmpq *%r11 This ensures that if the value in register %r11 was loaded from memory, then the value in %r11 is (architecturally) correct prior to the jump. Also adds a new target feature to X86: +lvi-cfi ("cfi" meaning control-flow integrity) The feature can be added via clang CLI using -mlvi-cfi. This is an alternate implementation to https://reviews.llvm.org/D75934 That merges the thunk insertion functionality with the existing X86 retpoline code. Differential Revision: https://reviews.llvm.org/D76812

Adding a pass that replaces every ret instruction with the sequence: pop <scratch-reg> lfence jmp *<scratch-reg> where <scratch-reg> is some available scratch register, according to the calling convention of the function being mitigated. Differential Revision: https://reviews.llvm.org/D75935

… (LVI) Gadgets Adds a new data structure, ImmutableGraph, and uses RDF to find LVI gadgets and add them to a MachineGadgetGraph. More specifically, a new X86 machine pass finds Load Value Injection (LVI) gadgets consisting of a load from memory (i.e., SOURCE), and any operation that may transmit the value loaded from memory over a covert channel, or use the value loaded from memory to determine a branch/call target (i.e., SINK). Also adds a new target feature to X86: +lvi-load-hardening The feature can be added via the clang CLI using -mlvi-hardening. Differential Revision: https://reviews.llvm.org/D75936

… (LVI) After finding all such gadgets in a given function, the pass minimally inserts LFENCE instructions in such a manner that the following property is satisfied: for all SOURCE+SINK pairs, all paths in the CFG from SOURCE to SINK contain at least one LFENCE instruction. The algorithm that implements this minimal insertion is influenced by an academic paper that minimally inserts memory fences for high-performance concurrent programs: http://www.cs.ucr.edu/~lesani/companion/oopsla15/OOPSLA15.pdf The algorithm implemented in this pass is as follows: 1. Build a condensed CFG (i.e., a GadgetGraph) consisting only of the following components: -SOURCE instructions (also includes function arguments) -SINK instructions -Basic block entry points -Basic block terminators -LFENCE instructions 2. Analyze the GadgetGraph to determine which SOURCE+SINK pairs (i.e., gadgets) are already mitigated by existing LFENCEs. If all gadgets have been mitigated, go to step 6. 3. Use a heuristic or plugin to approximate minimal LFENCE insertion. 4. Insert one LFENCE along each CFG edge that was cut in step 3. 5. Go to step 2. 6. If any LFENCEs were inserted, return true from runOnFunction() to tell LLVM that the function was modified. By default, the heuristic used in Step 3 is a greedy heuristic that avoids inserting LFENCEs into loops unless absolutely necessary. There is also a CLI option to load a plugin that can provide even better optimization, inserting fewer fences, while still mitigating all of the LVI gadgets. The plugin can be found here: https://github.com/intel/lvi-llvm-optimization-plugin, and a description of the pass's behavior with the plugin can be found here: https://software.intel.com/security-software-guidance/insights/optimized-mitigation-approach-load-value-injection. Differential Revision: https://reviews.llvm.org/D75937

…jection (LVI) Added code to X86AsmParser::emitInstruction() to add an LFENCE after each instruction that may load, and emit a warning if it encounters an instruction that may be vulnerable, but cannot be automatically mitigated. Differential Revision: https://reviews.llvm.org/D76158

jethrogb · 2020-05-25T17:20:45Z

Updated. Nice bonus is that LLVM 10 is on C++14 so the backport was easier.

jethrogb · 2020-05-25T17:21:29Z

llvm/test/CodeGen/X86/O3-pipeline.ll

@@ -138,9 +138,11 @@
 ; CHECK-NEXT:       Machine Loop Invariant Code Motion
 ; CHECK-NEXT:       Bundle Machine CFG Edges
 ; CHECK-NEXT:       X86 FP Stackifier
+; CHECK-NEXT:       MachineDominator Tree Construction


This is moved from a couple of lines down, so I don't think something like de92dc2 is needed.

jethrogb · 2020-05-26T08:37:11Z

So can I get a review then?

tstellar · 2020-05-26T17:53:50Z

@nikic If you file a bug requesting this backport at bugs.llvm.org and mark it as a blocker of release-10.0.1, I will add it to my review queue.

tstellar · 2020-05-26T17:55:12Z

@nikic I should add this doesn't guarantee I'll merge it. The patch needs to be reviewed to make sure it meets the release criteria and get the necessary approvals. But step 1 for this process is filing the bug.

…callback The `TypeSystemMap::m_mutex` guards against concurrent modifications of members of `TypeSystemMap`. In particular, `m_map`. `TypeSystemMap::ForEach` iterates through the entire `m_map` calling a user-specified callback for each entry. This is all done while `m_mutex` is locked. However, there's nothing that guarantees that the callback itself won't call back into `TypeSystemMap` APIs on the same thread. This lead to double-locking `m_mutex`, which is undefined behaviour. We've seen this cause a deadlock in the swift plugin with following backtrace: ``` int main() { std::unique_ptr<int> up = std::make_unique<int>(5); volatile int val = *up; return val; } clang++ -std=c++2a -g -O1 main.cpp ./bin/lldb -o “br se -p return” -o run -o “v *up” -o “expr *up” -b ``` ``` frame rust-lang#4: std::lock_guard<std::mutex>::lock_guard frame rust-lang#5: lldb_private::TypeSystemMap::GetTypeSystemForLanguage <<<< Lock rust-lang#2 frame rust-lang#6: lldb_private::TypeSystemMap::GetTypeSystemForLanguage frame rust-lang#7: lldb_private::Target::GetScratchTypeSystemForLanguage ... frame rust-lang#26: lldb_private::SwiftASTContext::LoadLibraryUsingPaths frame rust-lang#27: lldb_private::SwiftASTContext::LoadModule frame rust-lang#30: swift::ModuleDecl::collectLinkLibraries frame rust-lang#31: lldb_private::SwiftASTContext::LoadModule frame rust-lang#34: lldb_private::SwiftASTContext::GetCompileUnitImportsImpl frame rust-lang#35: lldb_private::SwiftASTContext::PerformCompileUnitImports frame rust-lang#36: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetSwiftASTContext frame rust-lang#37: lldb_private::TypeSystemSwiftTypeRefForExpressions::GetPersistentExpressionState frame rust-lang#38: lldb_private::Target::GetPersistentSymbol frame rust-lang#41: lldb_private::TypeSystemMap::ForEach <<<< Lock #1 frame rust-lang#42: lldb_private::Target::GetPersistentSymbol frame rust-lang#43: lldb_private::IRExecutionUnit::FindInUserDefinedSymbols frame rust-lang#44: lldb_private::IRExecutionUnit::FindSymbol frame rust-lang#45: lldb_private::IRExecutionUnit::MemoryManager::GetSymbolAddressAndPresence frame rust-lang#46: lldb_private::IRExecutionUnit::MemoryManager::findSymbol frame rust-lang#47: non-virtual thunk to lldb_private::IRExecutionUnit::MemoryManager::findSymbol frame rust-lang#48: llvm::LinkingSymbolResolver::findSymbol frame rust-lang#49: llvm::LegacyJITSymbolResolver::lookup frame rust-lang#50: llvm::RuntimeDyldImpl::resolveExternalSymbols frame rust-lang#51: llvm::RuntimeDyldImpl::resolveRelocations frame rust-lang#52: llvm::MCJIT::finalizeLoadedModules frame rust-lang#53: llvm::MCJIT::finalizeObject frame rust-lang#54: lldb_private::IRExecutionUnit::ReportAllocations frame rust-lang#55: lldb_private::IRExecutionUnit::GetRunnableInfo frame rust-lang#56: lldb_private::ClangExpressionParser::PrepareForExecution frame rust-lang#57: lldb_private::ClangUserExpression::TryParse frame rust-lang#58: lldb_private::ClangUserExpression::Parse ``` Our solution is to simply iterate over a local copy of `m_map`. **Testing** * Confirmed on manual reproducer (would reproduce 100% of the time before the patch) Differential Revision: https://reviews.llvm.org/D149949

) Currently, process of replacing bitwise operations consisting of `LSR`/`LSL` with `And` is performed by `DAGCombiner`. However, in certain cases, the `AND` generated by this process can be removed. Consider following case: ``` lsr x8, x8, rust-lang#56 and x8, x8, #0xfc ldr w0, [x2, x8] ret ``` In this case, we can remove the `AND` by changing the target of `LDR` to `[X2, X8, LSL rust-lang#2]` and right-shifting amount change to 56 to 58. after changed: ``` lsr x8, x8, rust-lang#58 ldr w0, [x2, x8, lsl rust-lang#2] ret ``` This patch checks to see if the `SHIFTING` + `AND` operation on load target can be optimized and optimizes it if it can.

jethrogb force-pushed the lvi-hardening branch from 93a608a to 12ae505 Compare May 13, 2020 12:17

jethrogb commented May 13, 2020

View reviewed changes

llvm/lib/Target/X86/X86IndirectThunks.cpp Outdated Show resolved Hide resolved

jethrogb force-pushed the lvi-hardening branch from 12ae505 to 9dbdda5 Compare May 14, 2020 18:06

jethrogb commented May 15, 2020

View reviewed changes

llvm/test/CodeGen/X86/O3-pipeline.ll Show resolved Hide resolved

scottconstable and others added 8 commits May 25, 2020 16:10

Move RDF from Hexagon to Codegen

31537e6

RDF is designed to be target agnostic. Therefore it would be useful to have it available for other targets, such as X86. Based on a previous patch by Krzysztof Parzyszek Differential Revision: https://reviews.llvm.org/D75932

jethrogb force-pushed the lvi-hardening branch from 9dbdda5 to fb13965 Compare May 25, 2020 17:20

jethrogb changed the base branch from rustc/9.0-2019-12-19 to rustc/10.0-2020-05-05 May 25, 2020 17:20

jethrogb commented May 25, 2020

View reviewed changes

nikic merged commit e6e8011 into rust-lang:rustc/10.0-2020-05-05 May 26, 2020

jethrogb mentioned this pull request May 27, 2020

Manually patch ret instruction for LVI rust-lang/compiler-builtins#359

Merged

sbellem mentioned this pull request Mar 5, 2022

nix: pass compiler args sbellem/cipher-paratime#4

Open

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

x86 LVI hardening #58

x86 LVI hardening #58

jethrogb commented May 12, 2020

jethrogb commented May 13, 2020

jethrogb commented May 14, 2020

nikic commented May 14, 2020

jethrogb commented May 14, 2020 •

edited

Loading

cuviper commented May 14, 2020

cuviper commented May 25, 2020

nikic commented May 25, 2020

jethrogb commented May 25, 2020

jethrogb May 25, 2020

jethrogb commented May 26, 2020

tstellar commented May 26, 2020

tstellar commented May 26, 2020

x86 LVI hardening #58

x86 LVI hardening #58

Conversation

jethrogb commented May 12, 2020

jethrogb commented May 13, 2020

jethrogb commented May 14, 2020

nikic commented May 14, 2020

jethrogb commented May 14, 2020 • edited Loading

cuviper commented May 14, 2020

cuviper commented May 25, 2020

nikic commented May 25, 2020

jethrogb commented May 25, 2020

jethrogb May 25, 2020

Choose a reason for hiding this comment

jethrogb commented May 26, 2020

tstellar commented May 26, 2020

tstellar commented May 26, 2020

jethrogb commented May 14, 2020 •

edited

Loading