Fix AMD boot freeze #1015

serban300 · 2019-03-15T15:43:49Z

Issue #, if available: #815

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

andreeaflorescu · 2019-03-18T13:30:11Z

cpuid/src/common.rs

+use kvm::CpuId;
+use kvm_bindings::kvm_cpuid_entry2;
+
+const INTEL: &[u8; 12] = b"GenuineIntel";


nit: I would prefix these two with VENDOR_ID_* so it is clearer what we are using them for.

andreeaflorescu · 2019-03-18T13:33:18Z

cpuid/src/common.rs

+const INTEL: &[u8; 12] = b"GenuineIntel";
+const AMD: &[u8; 12] = b"AuthenticAMD";
+
+const EXT_FUNCTION: u32 = 0x80000000;


This is HIGHEST_EXTENDED_FUNCTION. We should name constants here as close as possible to the names that they have in the Intel/AMD manual so it is easier to correlate the code with the software development manuals.

andreeaflorescu · 2019-03-18T13:41:50Z

cpuid/src/common.rs

+        }
+
+    // this is safe because the host supports the `cpuid` instruction
+    let max_function = unsafe { __get_cpuid_max(function & EXT_FUNCTION).0 };


I am wondering whether, instead of always querying the maximum cpuid leaf supported, we can save max_function somewhere and then just do the comparison from the following line.

I don't know if it's worth it. This is really fast. The entire call to filter_cpuid takes less than 15 microseconds.
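For reference, the caching idea raised above could look roughly like the sketch below. It is not code from this PR: `query_max_leaf`, `cached_max_leaf`, and `is_leaf_supported` are hypothetical names, and the fallback for non-x86_64 targets exists only so the sketch compiles everywhere. The only real API used is `std::arch::x86_64::__get_cpuid_max`.

```rust
use std::sync::OnceLock;

/// Base of the extended CPUID range (named as suggested in the review).
const HIGHEST_EXTENDED_FUNCTION: u32 = 0x8000_0000;

/// Hypothetical helper: asks the host for the highest leaf of the range
/// starting at `base` (0 for basic leaves, 0x8000_0000 for extended ones).
#[cfg(target_arch = "x86_64")]
#[allow(unused_unsafe)]
fn query_max_leaf(base: u32) -> u32 {
    // Every x86_64 host supports the `cpuid` instruction.
    unsafe { std::arch::x86_64::__get_cpuid_max(base).0 }
}

/// Fallback so the sketch also builds on other targets (an assumption,
/// not something the PR does): only the base leaf is reported.
#[cfg(not(target_arch = "x86_64"))]
fn query_max_leaf(base: u32) -> u32 {
    base
}

// One cache slot per range; each is filled on first use.
static MAX_BASIC: OnceLock<u32> = OnceLock::new();
static MAX_EXTENDED: OnceLock<u32> = OnceLock::new();

/// Returns the cached maximum leaf for the range `function` belongs to.
fn cached_max_leaf(function: u32) -> u32 {
    let base = function & HIGHEST_EXTENDED_FUNCTION;
    let cell = if base == 0 { &MAX_BASIC } else { &MAX_EXTENDED };
    *cell.get_or_init(|| query_max_leaf(base))
}

/// The comparison from the line following the query in the diff.
fn is_leaf_supported(function: u32) -> bool {
    function <= cached_max_leaf(function)
}
```

Given the timing quoted in the reply, the cache mainly saves a handful of `cpuid` executions, so keeping the direct query is a reasonable choice.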

andreeaflorescu · 2019-03-18T13:48:33Z

cpuid/src/common.rs

+}
+
+#[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
+fn get_cpuid(function: u32, count: u32) -> Result<CpuidResult, Error> {


nit: In Intel terminology I believe count is actually sub_leaf.

I would keep count since it is named the same in multiple places, for example in the kernel and in GCC.

andreeaflorescu · 2019-03-18T14:15:00Z

cpuid/src/common.rs

+
+    #[test]
+    #[cfg(any(target_arch = "x86", target_arch = "x86_64"))]
+    fn get_cpu_id_test() {


Please prefix test functions with the word test so it is easier to mentally separate the tests from the helper functions.

andreeaflorescu · 2019-03-18T14:19:48Z

cpuid/src/common.rs

@@ -16,32 +16,28 @@ use std::arch::x86::__cpuid_count;
 #[cfg(target_arch = "x86_64")]


Looks like this commit (ba3d225) is just moving the code from the previous commit. I understand that the previous commit was already reviewed as part of the #882 PR, but other than that is there any reason for not squashing these two commits together?

There's no other reason. I squashed the commits.

andreeaflorescu · 2019-03-18T14:25:19Z

cpuid/src/cpu_leaf.rs

@@ -3,6 +3,8 @@

 // Basic CPUID Information
 pub mod leaf_0x1 {
+    pub const FUNCTION: u32 = 0x1;


I know leaf and function are used interchangeably everywhere, but to not confuse people that look at this code for the first time, I would say to stick with only one name: either leaf or function.

If we still want to call them leaves, then we can rename FUNCTION to LEAF_NR. What do you say?

P.S. Nice improvement from having the values of the leaves hardcoded.

andreeaflorescu · 2019-03-18T14:27:18Z

cpuid/src/lib.rs

+    let cpuid_transformer: &dyn CpuidTransformer = match &vendor_id {
+        INTEL => &intel::IntelCpuidTransformer {},
+        AMD => &amd::AmdCpuidTransformer {},
+        _ => &other::OtherCpuidTransformer {},


What is the purpose of having OtherCpuidTransformer? When we add a new platform shouldn't we just add the corresponding transformer and not have a generic one here?

I removed it
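A minimal sketch of vendor dispatch without a generic fallback, using the VENDOR_ID_* naming suggested earlier in the review. How the merged code handles an unknown vendor is not shown in this thread; returning an error is an assumption, and `Error::UnsupportedVendor`, `pick_transformer`, and the `name` method are hypothetical.

```rust
const VENDOR_ID_INTEL: &[u8; 12] = b"GenuineIntel";
const VENDOR_ID_AMD: &[u8; 12] = b"AuthenticAMD";

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq)]
enum Error {
    UnsupportedVendor,
}

/// Simplified stand-in for the CpuidTransformer trait in the diff.
trait CpuidTransformer {
    fn name(&self) -> &'static str;
}

struct IntelCpuidTransformer;
struct AmdCpuidTransformer;

impl CpuidTransformer for IntelCpuidTransformer {
    fn name(&self) -> &'static str {
        "intel"
    }
}

impl CpuidTransformer for AmdCpuidTransformer {
    fn name(&self) -> &'static str {
        "amd"
    }
}

/// Picks a transformer for a known vendor and fails loudly otherwise,
/// instead of silently falling back to a generic transformer.
fn pick_transformer(vendor_id: &[u8; 12]) -> Result<Box<dyn CpuidTransformer>, Error> {
    match vendor_id {
        VENDOR_ID_INTEL => Ok(Box::new(IntelCpuidTransformer)),
        VENDOR_ID_AMD => Ok(Box::new(AmdCpuidTransformer)),
        _ => Err(Error::UnsupportedVendor),
    }
}
```

With this shape, adding a new platform means adding one constant and one match arm, which is the workflow the review comment asks for.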

andreeaflorescu · 2019-03-18T14:28:23Z

cpuid/src/transformer/mod.rs

@@ -66,7 +69,23 @@ pub enum Error {
    VcpuCountOverflow,
    /// The max size has been exceeded
    SizeLimitExceeded,
+    /// A call to an internal helper method failed
+    InternalError(super::common::Error),


nit: use super::common and then just write common::Error here.

If I do this there will be conflicts between the Error defined here and common::Error, and I would have to use self::Error instead of Error. I would keep it as it is.
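The form that was kept can be illustrated with the sketch below, where the inner error is referenced by its full path so the local `Error` name stays unambiguous. The module layout is reduced to two inline modules, and `get_cpuid`, `process`, the `NotSupported` variant, and the `From` impl are assumptions added for illustration.

```rust
mod common {
    /// Stand-in for the helper error type in cpuid/src/common.rs.
    #[derive(Debug, PartialEq)]
    pub enum Error {
        NotSupported,
    }

    /// Hypothetical helper: only leaf 0 "succeeds" in this sketch.
    pub fn get_cpuid(function: u32) -> Result<u32, Error> {
        if function == 0 {
            Ok(0)
        } else {
            Err(Error::NotSupported)
        }
    }
}

mod transformer {
    #[derive(Debug, PartialEq)]
    pub enum Error {
        /// The max size has been exceeded
        SizeLimitExceeded,
        /// A call to an internal helper method failed
        InternalError(super::common::Error),
    }

    // Assumption: a From impl lets `?` wrap helper errors automatically.
    impl From<super::common::Error> for Error {
        fn from(e: super::common::Error) -> Self {
            Error::InternalError(e)
        }
    }

    /// Inside this module, the bare name `Error` always means transformer::Error.
    pub fn process(function: u32) -> Result<u32, Error> {
        let value = super::common::get_cpuid(function)?;
        Ok(value)
    }
}
```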

serban300 · 2019-03-18T16:29:53Z

@andreeaflorescu I addressed all the comments. Please take another look.

dhrgit · 2019-03-19T14:44:32Z

cpuid/src/cpu_leaf.rs

@@ -3,6 +3,8 @@

 // Basic CPUID Information
 pub mod leaf_0x1 {
+    pub const LEAF_NR: u32 = 0x1;


Nit: "nr" is not short for number. I suggest either "num" or "no".

dhrgit · 2019-03-19T15:43:02Z

cpuid/src/cpu_leaf.rs

@@ -63,19 +65,41 @@ pub mod leaf_0x1 {
    }
 }

-// Deterministic Cache Parameters Leaf
-pub mod leaf_0x4 {
+pub mod leaf_cache_parameters {


Shouldn't this node have a leaf number const as well?

I believe this is in fact a sub-leaf of the 0x4 leaf.

This is not an actual leaf. Since 0x4 and 0x8000_001d are very similar, but not completely the same (see 0x4, page 32, and 0x8000_001d, page 76), I created this "parent leaf" in order to hold the common properties. 0x4 and 0x8000_001d inherit these common properties, and then add their specific ones.
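One way to read the "parent leaf" idea is a shared module whose constants are re-exported by each concrete leaf module. The sketch below is an illustration under assumptions, not the PR's code: `BitRange` here is a simplified hypothetical helper, the constant names are invented, and `LEAF_NUM` follows the naming suggestion from the review. The bit positions are those documented for EAX of leaf 0x4 (Intel) and leaf 0x8000_001d (AMD).

```rust
/// Simplified, hypothetical inclusive bit range [msb:lsb] within a register.
pub struct BitRange {
    pub msb: u32,
    pub lsb: u32,
}

impl BitRange {
    /// Extracts the field described by this range from `reg`.
    pub fn read(&self, reg: u32) -> u32 {
        let width = self.msb - self.lsb + 1;
        let mask = if width == 32 { u32::MAX } else { (1u32 << width) - 1 };
        (reg >> self.lsb) & mask
    }
}

/// Not a real leaf: holds what 0x4 and 0x8000_001d have in common.
pub mod leaf_cache_parameters {
    pub mod eax {
        use super::super::BitRange;
        // EAX[7:5]: cache level, same layout on both vendors.
        pub const CACHE_LEVEL_BITRANGE: BitRange = BitRange { msb: 7, lsb: 5 };
        // EAX[25:14]: logical processors sharing this cache (minus one).
        pub const MAX_CPUS_PER_CACHE_BITRANGE: BitRange = BitRange { msb: 25, lsb: 14 };
    }
}

/// Intel: Deterministic Cache Parameters Leaf.
pub mod leaf_0x4 {
    pub const LEAF_NUM: u32 = 0x4;
    pub mod eax {
        pub use super::super::leaf_cache_parameters::eax::*;
        use super::super::BitRange;
        // EAX[31:26] is defined for leaf 0x4 only; reserved in 0x8000_001d.
        pub const MAX_CORES_PER_PACKAGE_BITRANGE: BitRange = BitRange { msb: 31, lsb: 26 };
    }
}

/// AMD: Cache Topology Information.
pub mod leaf_0x8000001d {
    pub const LEAF_NUM: u32 = 0x8000_001d;
    pub mod eax {
        pub use super::super::leaf_cache_parameters::eax::*;
    }
}
```

Because the parent module has no leaf number of its own, only the two concrete modules carry a `LEAF_NUM` constant, which matches the answer given to the reviewer's question.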

dhrgit · 2019-03-19T16:07:05Z

cpuid/src/transformer/common.rs

+    }
+
+    let cpuid2 = CpuId::from_entries(&entries);
+    *cpuid = cpuid2;


This function completely replaces the cpuid struct with a new copy, which the caller might not expect. I.e. if, in the future, we decide to extend the CpuId struct to hold more data, that data would be lost upon returning from this function.

I created #1019 in order to address this issue. I would like to tackle it later with lower priority. Also see here: #1015 (comment)
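The concern can be reproduced with a small stand-in type. In the sketch below, `CpuId`, `Entry`, and the `extra` field are hypothetical simplifications (the real type is `kvm::CpuId`); the point is only the difference between rebuilding the struct and mutating its entry list in place.

```rust
#[derive(Clone, Debug, PartialEq)]
struct Entry {
    function: u32,
    eax: u32,
}

/// Simplified stand-in for kvm::CpuId with one hypothetical extra field.
struct CpuId {
    entries: Vec<Entry>,
    extra: &'static str,
}

impl CpuId {
    fn from_entries(entries: &[Entry]) -> Self {
        CpuId { entries: entries.to_vec(), extra: "" }
    }
}

/// Mirrors the pattern under review: build a new struct and overwrite the old one.
/// Anything not carried by the entry list (here `extra`) is silently reset.
fn filter_by_replacing(cpuid: &mut CpuId, keep: impl Fn(&Entry) -> bool) {
    let entries: Vec<Entry> = cpuid.entries.iter().filter(|e| keep(*e)).cloned().collect();
    *cpuid = CpuId::from_entries(&entries);
}

/// Alternative: touch only the entry list and leave other fields alone.
fn filter_in_place(cpuid: &mut CpuId, keep: impl Fn(&Entry) -> bool) {
    cpuid.entries.retain(|e| keep(e));
}
```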

dhrgit · 2019-03-19T16:32:10Z

cpuid/src/transformer/mod.rs

+    ) -> Result<(), Error> {
+        Ok(())
+    }
+}


I find this two-step logic (pre-process entry list + transform individual entries) a bit difficult to follow.

Why not have a single transforming / filtering trait, that ingests a CpuId struct and outputs a "fixed" CpuId? Then, different vendors / templates / whatever else might affect cpuid, could provide an implementation for that trait, and we could chain these transformations together, to produce the final emulated cpuid.

I.e. the flow I'm imagining looks something like this:

vcpu::configure()
    ...
    vendor_filter <- pick_vendor_filter_based_on_host_cpu()
    template_filter <- pick_template_filter_based_on_microvm_config()
    emulated_cpuid <- template_filter.apply( vendor_filter.apply( get_emulated_cpuid_from_kvm() ) )
    ...
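The flow proposed above could be sketched in Rust as a single trait whose implementations are chained. Everything here is hypothetical: `CpuidFilter`, `VendorFilter`, `TemplateFilter`, and `apply_all` are illustrative names, `CpuId` is reduced to a list of (function, eax) pairs, and the filter bodies are placeholders rather than real vendor or template logic.

```rust
/// Simplified stand-in for the CpuId struct: (function, eax) pairs.
#[derive(Clone, Debug, PartialEq)]
struct CpuId {
    entries: Vec<(u32, u32)>,
}

/// A single transforming / filtering trait: takes a CpuId, returns a "fixed" one.
trait CpuidFilter {
    fn apply(&self, cpuid: CpuId) -> CpuId;
}

/// Placeholder vendor filter: drops one leaf, purely for illustration.
struct VendorFilter;

/// Placeholder template filter: masks EAX of every entry.
struct TemplateFilter {
    mask: u32,
}

impl CpuidFilter for VendorFilter {
    fn apply(&self, mut cpuid: CpuId) -> CpuId {
        cpuid.entries.retain(|&(function, _)| function != 0x4);
        cpuid
    }
}

impl CpuidFilter for TemplateFilter {
    fn apply(&self, mut cpuid: CpuId) -> CpuId {
        for entry in cpuid.entries.iter_mut() {
            entry.1 &= self.mask;
        }
        cpuid
    }
}

/// Applies the filters left to right, feeding each output into the next filter.
fn apply_all(filters: &[&dyn CpuidFilter], cpuid: CpuId) -> CpuId {
    filters.iter().fold(cpuid, |acc, filter| filter.apply(acc))
}
```

A chain like this does not by itself express constraints such as a template being valid for one vendor only; that check would have to live in whatever code selects the filters.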

In general I agree with having a single transforming / filtering trait that ingests a CpuId struct and outputs a "fixed" CpuId. It seems like a more consistent approach, but I'm not sure it would be that easy. There are more things to take into account here, such as the fact that certain templates can be applied only to certain vendors; for example, C3 can only be applied to Intel. I would like to dive deeper into it and take my time implementing this. Also, since I have limited access to AMD hardware, I need this PR merged as soon as possible in order to continue the work on CPU feature masking for AMD.

So is it OK if I create another issue for this and address it, probably next week? We should also update the CPU templates to use the BitRange structure; I can do that as part of the same issue.

Okay, then. We can take it offline and discuss this approach - I can't see many issues arising.

I created #1025

serban300 · 2019-03-20T08:34:28Z

@dhrgit I replied to all the comments. Please take another look.

serban300 added the Feature: CPU Support: AMD label Mar 15, 2019

serban300 self-assigned this Mar 15, 2019

andreeaflorescu self-requested a review March 15, 2019 16:04

serban300 force-pushed the serban300-amd branch from 2af8500 to e8e5bfd Compare March 16, 2019 13:53

serban300 added this to the AMD Support milestone Mar 18, 2019

serban300 requested a review from dhrgit March 18, 2019 11:05

andreeaflorescu reviewed Mar 18, 2019

serban300 force-pushed the serban300-amd branch 2 times, most recently from 37a38dc to 7b86f2e Compare March 18, 2019 16:27

andreeaflorescu previously approved these changes Mar 18, 2019

serban300 mentioned this pull request Mar 18, 2019

Scope use_host_cpuid_function refactoring #1019

Closed

dhrgit reviewed Mar 19, 2019

Serban Iorga added 4 commits March 20, 2019 10:12

implement cpuid helper methods

90ddb76

copied from firecracker-microvm#882 Signed-off-by: Serban Iorga <[email protected]>

use constants for cpuid functions

dbf62ba

Signed-off-by: Serban Iorga <[email protected]>

add support for processing AMD cpuid

eea90f6

Signed-off-by: Serban Iorga <[email protected]>

fix AMD boot freeze

be42217

Signed-off-by: Serban Iorga <[email protected]>

serban300 dismissed andreeaflorescu’s stale review via be42217 March 20, 2019 08:33

serban300 force-pushed the serban300-amd branch from 7b86f2e to be42217 Compare March 20, 2019 08:33

dhrgit approved these changes Mar 20, 2019

andreeaflorescu approved these changes Mar 20, 2019

andreeaflorescu merged commit 5e06e11 into firecracker-microvm:master Mar 20, 2019

serban300 mentioned this pull request Mar 20, 2019

Refactor CPU templates logic #1025

Closed
