This repository has been archived by the owner on Dec 3, 2024. It is now read-only.
generated from riscv-admin/template-group-admin
-
Notifications
You must be signed in to change notification settings - Fork 0
a review of v0.1 focusing on VMs and forward compatibility #2
Comments
This is great, detailed feedback, thanks very much. Some responses below.
(looks like embedded comments mess up the numbering, sorry)
One broader response: your feedback has made me realize that we have some
inconsistencies in the spec w.r.t. virtualization. As specified,
vsctrcontrol is a physically distinct register, unlike sctrcontrol which
simply provides S-mode access to a subset of mctrcontrol. It is active
when V=1, meaning the CTR config in vsctrcontrol is used in place of that
in mctrcontrol. Thus if, say, M-mode wants to use CTR to record all
transfers in all modes, it will need to replicate the CTR config in both
mctrcontrol and vsctrcontrol. We can consider whether that's the behavior
we want, but that's what we have now.
1.
"CTR" is likely to be confused with "counter" in the abbreviations of
various CSRs. I suppose it's too late to use "CXR" or something else?
FWIW, RISC-V tends to shorten counter to cntr. E.g., Zicntr, Smcntrpmf.
I can add this to our list of opens, to see if others share this concern.
1.
We need clear guidance about whether bits of mctrcontrol which are not
defined in the current specification should be set to 0 or 1 by software.
The forward compatibility requirements are presently inconsistent between
hypothetical new control transfer types (which are 1 to ignore) and
hypothetical new modes (0 to ignore). Current examples of unused portions
of multi-function bitfield registers are either WPRI (mstatus, hstatus,
menvcfg, mseccfg, fcsr) or have ad-hoc reserved rules (effectively WLRL);
WARL seems inappropriate in either case.
Default behavior (which includes behavior if transfer type filtering is
not implemented) is to record all control flow transfers within enabled
privilege modes. If a new transfer type is defined, it will be recorded by
default but may warrant a new (optional) inhibit bit. So the spec assumes
software will set all bits to zero unless it wants some non-default
behavior. I can make that more explicit in non-normative text.
It's true that the External Trap Enable (ETEN) bit is opt-in, unlike the
other types, which are opt-out. This is because recording transfers that
lead to another privilege mode is not default behavior. So, once again, a
zero is expected for most cases.
1.
SEN has the function of a mstateen bit and should be required to be 0
at reset for the reasons described in the Smstateen specification.
I can call that out. I suppose we should also consider whether SEN would
be better placed in mstateen0, and VSEN in hstateen0?
1.
VSTE is the only control bit not allocated an index in mctrcontrol. If
it had one, M-mode code would not need to reference vsctrcontrol for
emulation or context swaps.
Since vsctrcontrol is a separate register, I think M-mode would need to
swap it even if we put VSTE in mctrcontrol. Otherwise the CTR config for
while V=1 would be lost.
1.
We don't need names for VS or VU in sctrcontrol since they are also
available in vsctrcontrol. Bits [4:3] in both CSRs can be defined as
read-only zero, saving two of the three fake-virtualization bits.
Good catch! In fact, since vsctrcontrol is what is active when V=1, I
think we *need* to remove VS/VU from mctrcontrol. Setting
mctrcontrol.V[US] will have no effect.
1.
sctrcontrol.VSEN should always exist as a writable field, regardless
of the H extension, as part of the minimal support required to emulate the
H-extension using M-mode primitives. Alternatively, VSEN can be moved
to a CSR accessible only in HS-mode (a new hctrcontrol, or vsctrcontrol,
assuming in the latter case that the new bit is hidden when it is accessed
as sctrcontrol in VS-mode).
That's reasonable. I already specify that for VSEN, VS, and VU in
vsctrcontrol, for nested virtualization. Can do the same for mctrcontrol.
If we move VSEN to hstateen0 I assume that also resolves it, since those
writes will trap.
1.
I think that the current handling of DEPTH in vsctrcontrol and
sctrcontrol is fine, since it can be configured in a privilege mode
iff filtering can be configured in that privilege mode.
2.
Clarify accessibility of vsctrcontrol. Since it contains the only name
of VSTE, it must be accessible for higher modes even if access to the
CTR unit is disabled or mediated by VSEN=0. I think the correct check is prv
= M OR (prv = HS AND mctrcontrol.SEN).
The priv spec dictates that vs* CSRs are always accessible from M-mode and
HS-mode.
1.
We should specify the behavior of attempting to access entry registers
using vsireg. (Always illegal instruction?)
Another good catch. Yes, we'll use the usual "will typically raise an
illegal instruction exception" language.
1.
WPRI should only apply to the undefined portions of the Entry
Registers, not the whole registers.
Agreed, will fix.
1.
Why are ctrsource.V and ctrtarget.MISP in address fields when we have
a perfectly good ctrdata?
Bit 0 in ctrsource and ctrtarget will always be 0 otherwise, so I figured
we might as well use them for something else. And since V is required,
this allows an implementation that supports only 2 registers (ctrsource and
ctrtarget) per entry. (ctrdata must be accessible, but it can be hardwired
to 0 for all entries)
1.
Is there a difference between TYPE 0 or 4 (Undefined) and 7 (Reserved)?
Not really. In the E-trace itype definition, 4 corresponds to not-taken
branches (which CTR doesn't record), while 7 is reserved. I should
probably just call them both reserved in the CTR spec.
1.
It could be a little clearer (in the ctrdata and mctrcontrol
definitions) that the numeric values and precise definitions of the control
transfer types are from "4.1D Instruction Trace Interface requirements" and
"Table 4.4: Instruction Type (itype) encoding" in the E-trace spec,
although the names are slightly different. In particular, there is no
indication in this draft of which registers are used to distinguish calls,
returns, and jumps.
In section 4.2 (Transfer Type Filtering) I make the following reference:
The transfer type inhibit bits leverage the type definitions specified in
Table 4.4, and described in Section 4.1.1, of the RISC-V Efficient Trace
Spec v2.0. An exception is the ETEN bit, discussed in External Traps below.
I link to that section in the text at the bottom of the mctrcontrol
definition, but perhaps it would be more clear if there were a link to it
in the desc for each *INH bit.
For ctrdata.TYPE, there is this non-normative text immediately following
the table with bit definitions:
Like the transfer type inhibit bits in mctrcontrol, the ctrdata.TYPE bits
leverage the E-trace itype encodings, with the addition of External Trap
1.
If one of ctrtarget and ctrdata.CC is going to be optional, only one
of them can be inferred from ctrsource in most dynamical cases.
I'm not sure I follow. Only MISP in ctrtarget is optional, whereas all of
ctrdata is optional. I *think* you're saying that the target can be
inferred from the transfer instruction some of the time (for direct
transfers), while the cycle time can't ever be inferred? True, but for our
primary usage (FDO/PGO, per the charter
<https://github.com/riscv-admin/control-transfer-records/blob/main/CHARTER.md>),
the source & target are required while cycle time is not.
1.
Clarify that unsupported fields will read as zero in newly added
entries; the text as written implies that fields could be picked up from an
unrelated entry.
ctrtarget says: MISP is read-only 0 when not supported.
ctrdata says: Unsupported fields are read-only 0.
1.
"Target Mode > Source Mode" column header is unclear, suggest explicit
column headers.
Agreed, will fix
1.
Are the filtering bits in mctrcontrol honored when RASEMU=1?
No, assuming you mean transfer type filter bits (not priv mode
"filtering", which is really enable bits). I'll make that more explicit.
1.
A single trap can service multiple interrupts (e.g. using a loop on
mtopi). At what point do we consider "a LCOFI is taken"? Would it be
more appropriate for the freeze to occur at some point after the LCOFI
becomes pending?
Assuming LCOFI is not delegated, I would expect the hardware to set FROZEN
when a trap to M-mode is taken as a result of mip.LCOFI=mie.LCOFI=1. I
would expect software to clear FROZEN around the time that it clears
mip.LCOFI, once the sample data, including CTR, has been collected. I
would not want hardware to set FROZEN on mip.LCOFI assertion, because any
delay in the trap (due to uarch skid, or mie.LCOFI=0, or mstatus.MIE=0)
could result in not recording some transfers that immediately precede the
sampled instruction (mepc).
1.
If an ebreak instruction is executed with dcsr.ebreakx=1 and Debug
Mode is entered, is this considered a "breakpoint exception" for the
purposes of BPFRZ?
No. BPFRZ causes FROZEN to be set on breakpoint exception. This case
does not cause a breakpoint exception.
…
1.
"Writing all 1s to the Entry Registers at entry 0" is not permissible
because the Entry Registers are defined as WPRI, and therefore software may
only change the defined portions. "Writing 1s to all defined fields of the
Entry Registers at entry 0"?
2.
Discovering xctrcontrol features by writing all 1s is similarly
problematic if future fields have effects when written. Write all 1s to
defined fields of xctrcontrol?
Good points, will fix both.
|
status in 0.1.2:
new and continuing issues:
proposed clarifications:
proposals:
|
Hi Stefan, thanks again for the detailed and very good feedback.
Responses below.
new and continuing issues:
1. Smstateen requires extensions to document what types of implicit
accesses are allowed at lower privilege levels. We must document that
control transfers are recorded (affecting CTR state), using the current
mctrcontrol and vsctrcontrol, regardless of whether explicit access to CTRs
is allowed at the current privilege level.
Good suggestion, added to draft.
1. "Privilege mode transitions" does not address the issue of whether
traps and trap returns between VS-mode and V=0 modes use the transfer type
filtering bits from the new mode's ctrcontrol or the old mode's ctrcontrol.
Freeze likewise does not address whether the new or old mode's BPFRZ or
LCOFIFRZ is used.
Another good catch. Added some text saying that the ctrcontrol from the
originating mode is used, with some specific exceptions (e.g., priv mode
enables, *TE bits).
1. We document the effects of accessing mireg* in M-mode, sireg* in
HS-mode, and vsireg* (via sireg*) in VS-mode. Access to the vsireg* or
mireg* in VS-mode, or mireg* in M-mode, is forbidden by the rules in
Smcsrind. We do not clearly state that using vsireg* to access CTR entries
in HS-mode has exactly the same effect as using sireg*.
Added
1. "Behavior" should specify that when a new record is created, all
implemented fields are written with values for the new record. The current
text of "will update at least" does not make clear that the set of fields
with valid values is consistent and the same as the set of fields
determined by the discovery procedure.
Added
1. Defining ctrsource.PC and ctrtarget.PC fields as being XLEN wide is
problematic for profiling workloads which mix RV64 and RV32; when control
transfers into RV32, XLEN becomes 32, and the entry bits are truncated to
32 bits (CSR Width Modulation) before being re-zero-extended on entry to
RV64. Proposing that the width of PC fields in entry registers is the
current or maximum supported MXLEN, they are truncated on reads, and zero
extended on implicit writes. Do RV32 writes to sireg* preserve the upper
bits of ctr*?
Good point, though xireg* CSRs are XLEN that doesn't mean the underlying
registers have to be. We could say that ctrsource, ctrtarget, and ctrdata
are MXLEN, and if XLEN<MXLEN then register accesses access only the lower
bits. For implicit writes, I agree we should zero-extend.
1. We have a CSR number listed for xctrcontrol but not xctrcontrolh.
Added.
1. The semantics of WRAP seem not very useful for determining when
stitching is possible, since you cannot distinguish "recorded one value,
wrapped around" from "recorded many values, overwrote entire entry file".
Propose a 2 bit field encoding -1, 0, +1, sticky overflow.
I don't follow how you propose to use these 2 bits.
1. The statement about read-modify-writes to xctrstatus should only
apply to multi-instruction sequences. A control transfer is an
architectural event corresponding to an instruction and therefore cannot
occur in the middle of a single csrrs/csrrc/csrrw instruction.
Added
proposed clarifications:
1. Breakpoint exception refers only to architectural traps with a
cause of "breakpoint exception" as defined in the privileged spec,
regardless of source (ebreak, c.ebreak, Sdtrig); it does not include entry
into Debug Mode, even in cores where this is implemented as an exception.
Added this (non-normative): Breakpoint exception refers to synchronous
exceptions with a cause value of Breakpoint (3), regardless of source
(ebreak, c.ebreak, Sdtrig); it does not include entry into Debug Mode, even
in cores where this is implemented as an exception.
1. LCOFI interrupt refers only to architectural traps directly caused
by a local counter overflow interrupt. If a local counter overflow
interrupt is recognized without a trap, for instance by reading mip, FROZEN
is not automatically set.
Added
proposals:
1. VM migrations are extremely rare events compared to profile
samples, and profile samples are expected to occasionally be lost anyway,
so rather than writable DEPTH I would like to suggest a model where DEPTH
is not software writable at any mode but allowed to spontaneously change as
viewed by modes lower than M. The visible effect of a DEPTH change must be
specified in a way that allows software to identify potentially lost
events; I propose to redefine nonexistent CTR entries as ctrsource.PC=1,
ctrsource.V=everything else=0, and specify that all CTR entries keep their
values through a sequence of DEPTH changes unless they become nonexistent
or temporarily nonexistent, in which case they change to PC=1 V=0.
As mentioned in the email thread, I'm concerned about relying on tools to
handle depth changes, in particular because they are so rare, and will
never occur in development environments. That said, something that will
happen (occasionally) is that tools find that the CTR array isn't full.
This is because, when RASEMU=0, Linux will clear CTR on context switch. So
if we made reduced depth look like such a case, maybe that would work. But
that would mean that unimplemented vs unwritten entries have to look the
same.
1. Requiring WRPTR to always be 8 bits would simplify the handing of
DEPTH changes and wrapping detection.
I don't see how a WRPTR value >DEPTH-1 helps. Though I could see how it
could be useful for stack stitching.
1. We could avoid a number of edge cases in the privilege mode change
and freeze definitions if all fields in vsctrcontrol other than VS, VU, and
VSTE were made read-write aliases of the identically named fields in
mctrcontrol.
Yes, I would like that better. I didn't define it that way because,
AFAICT, all existing standard CSRs are either pure aliases (sometimes a
subset) or purely new state. But that may not be a hard-and-fast rule.
I'll check.
1. WPRI is normally used for fields that contain, or might contain in
the future, control fields, where writing garbage to them will cause the
hart to behave in unpredictable ways. The entry registers shouldn't behave
like that - writing anything to them should have no effect other than to
cause a value to be read back in the future - so it might make more sense
to make them WARL as a whole.
I may not be using WPRI correctly, but my goal is to indicate that new
fields may show up in these registers, and old software that isn't familiar
with them should ignore them (but preserve them). Which seems like the
definition of WPRI?
…
1. This year marks 20 years since the first sale of x86_64 processors.
Custom extensions that allow IALIGN=8 will happen with or without RVI's
blessing, all we can do is not make them needlessly painful by using pc[0]
for other purposes without a better reason than "it's free real estate". I
am unsure that allowing implementations to not decode ctrdata counts as a
better reason.
You're saying you think we need to support 1-byte instructions?
1. PGO uses both ctrsource and ctrtarget on x86_64 but I'd like to do
tests on a ctrsource-only PGO pipeline. I suspect it will be nearly
indistinguishable from the case with both.
That would be interesting. I suppose then ctrtarget could be optional.
—
Reply to this email directly, view it on GitHub
<#2 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AXFCKADIVDA7PBB7I4DFTVDXQD3CBANCNFSM6AAAAAAYY7T5Q4>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
responses below.
On Mon, Jul 24, 2023, at 7:39 PM, Beeman Strong wrote:
> new and continuing issues:
> 1. Defining ctrsource.PC and ctrtarget.PC fields as being XLEN wide is
> problematic for profiling workloads which mix RV64 and RV32; when control
> transfers into RV32, XLEN becomes 32, and the entry bits are truncated to
> 32 bits (CSR Width Modulation) before being re-zero-extended on entry to
> RV64. Proposing that the width of PC fields in entry registers is the
> current or maximum supported MXLEN, they are truncated on reads, and zero
> extended on implicit writes. Do RV32 writes to sireg* preserve the upper
> bits of ctr*?
Good point, though xireg* CSRs are XLEN that doesn't mean the underlying
registers have to be. We could say that ctrsource, ctrtarget, and ctrdata
are MXLEN, and if XLEN<MXLEN then register accesses access only the lower
bits. For implicit writes, I agree we should zero-extend.
Sounds good. For explicit writes, do we preserve the upper bits like we do when there exists a high register?
> The semantics of WRAP seem not very useful for determining when
> stitching is possible, since you cannot distinguish "recorded one value,
> wrapped around" from "recorded many values, overwrote entire entry file".
> Propose a 2 bit field encoding -1, 0, +1, sticky overflow.
I don't follow how you propose to use these 2 bits.
WRAP is a 2-bit signed integer. When WRPTR changes from DEPTH-1 to 0, increment WRAP (11 -> 00, 00 -> 01, 01 -> 10, 10 -> 10). When WRPTR changes from 0 to DEPTH-1, decrement WRAP (01 -> 00, 00 -> 11, 11 -> 10, 10 -> 10). 10 indicates data loss and is not automatically cleared.
> proposals:
>
> 1. VM migrations are extremely rare events compared to profile
> samples, and profile samples are expected to occasionally be lost anyway,
> so rather than writable DEPTH I would like to suggest a model where DEPTH
> is not software writable at any mode but allowed to spontaneously change as
> viewed by modes lower than M. The visible effect of a DEPTH change must be
> specified in a way that allows software to identify potentially lost
> events; I propose to redefine nonexistent CTR entries as ctrsource.PC=1,
> ctrsource.V=everything else=0, and specify that all CTR entries keep their
> values through a sequence of DEPTH changes unless they become nonexistent
> or temporarily nonexistent, in which case they change to PC=1 V=0.
>
> As mentioned in the email thread, I'm concerned about relying on tools to
handle depth changes, in particular because they are so rare, and will
never occur in development environments.
That said, something that will
happen (occasionally) is that tools find that the CTR array isn't full.
This is because, when RASEMU=0, Linux will clear CTR on context switch. So
if we made reduced depth look like such a case, maybe that would work. But
that would mean that unimplemented vs unwritten entries have to look the
same.
This is essentially what I was proposing - making reduced depth look like a random loss of CTR data, which would happen anyway if CTR entry migration is not implemented.
I think I misunderstood the purpose of CLR. I was thinking CLR was intended to create a _known_ empty state, from which 0 to DEPTH CTR entries can be created without resulting in ambiguity between new and existing entries; in which case "lost" entries must look different to avoid reintroducing the ambiguity. But if CLR is primarily used to sanitize state, making it unknown, then yes, the unimplemented and post-CLR CTR entry values should be identical.
Another option would be to ignore the issue entirely and make DEPTH an implementation-defined constant with all that entails. There are already a large number of implementation-defined parameters visible to VS-mode (likely including, for CTR use cases, the performance counter definitions), and migrating VMs between cores from different vendors or generations IMO realistically requires a future S-mode defeaturing extension, which could include a writable DEPTH at that time.
> 1. Requiring WRPTR to always be 8 bits would simplify the handing of
> DEPTH changes and wrapping detection.
>
> I don't see how a WRPTR value >DEPTH-1 helps. Though I could see how it
could be useful for stack stitching.
While the additional bits benefit stack stitching, the main purpose is so that if WRPTR does not need to change when DEPTH does and DEPTH changes can be ignored when accessing WRPTR.
> 1. We could avoid a number of edge cases in the privilege mode change
> and freeze definitions if all fields in vsctrcontrol other than VS, VU, and
> VSTE were made read-write aliases of the identically named fields in
> mctrcontrol.
>
> Yes, I would like that better. I didn't define it that way because,
AFAICT, all existing standard CSRs are either pure aliases (sometimes a
subset) or purely new state. But that may not be a hard-and-fast rule.
I'll check.
Let me know how that goes.
> 1. WPRI is normally used for fields that contain, or might contain in
> the future, control fields, where writing garbage to them will cause the
> hart to behave in unpredictable ways. The entry registers shouldn't behave
> like that - writing anything to them should have no effect other than to
> cause a value to be read back in the future - so it might make more sense
> to make them WARL as a whole.
>
> I may not be using WPRI correctly, but my goal is to indicate that new
fields may show up in these registers, and old software that isn't familiar
with them should ignore them (but preserve them). Which seems like the
definition of WPRI?
That's the definition, I'm just not sure if "preserve fields" makes sense for our use case. As far as I know the only reasons to write to entry registers are VM migration and synthesizing entries in privileged code, and in both cases the objective is to create an entirely fresh entry; preserving fields from the previous entry at the same index makes little sense to me.
> 1. This year marks 20 years since the first sale of x86_64 processors.
> Custom extensions that allow IALIGN=8 will happen with or without RVI's
> blessing, all we can do is not make them needlessly painful by using pc[0]
> for other purposes without a better reason than "it's free real estate". I
> am unsure that allowing implementations to not decode ctrdata counts as a
> better reason.
You're saying you think we need to support 1-byte instructions?
That's the argument, yes. Might be another ARC thing, they might have an alternate idea in mind for IALIGN=8 that doesn't use bit 0 of various U-mode address CSRs.
|
On Tue, Jul 25, 2023 at 2:40 PM sorear ***@***.***> wrote:
responses below.
On Mon, Jul 24, 2023, at 7:39 PM, Beeman Strong wrote:
>> new and continuing issues:
>> 1. Defining ctrsource.PC and ctrtarget.PC fields as being XLEN wide is
>> problematic for profiling workloads which mix RV64 and RV32; when
control
>> transfers into RV32, XLEN becomes 32, and the entry bits are truncated
to
>> 32 bits (CSR Width Modulation) before being re-zero-extended on entry to
>> RV64. Proposing that the width of PC fields in entry registers is the
>> current or maximum supported MXLEN, they are truncated on reads, and
zero
>> extended on implicit writes. Do RV32 writes to sireg* preserve the upper
>> bits of ctr*?
>
> Good point, though xireg* CSRs are XLEN that doesn't mean the underlying
> registers have to be. We could say that ctrsource, ctrtarget, and ctrdata
> are MXLEN, and if XLEN<MXLEN then register accesses access only the lower
> bits. For implicit writes, I agree we should zero-extend.
Sounds good. For explicit writes, do we preserve the upper bits like we do
when there exists a high register?
That's what I was suggesting, yes.
>> The semantics of WRAP seem not very useful for determining when
>> stitching is possible, since you cannot distinguish "recorded one value,
>> wrapped around" from "recorded many values, overwrote entire entry
file".
>> Propose a 2 bit field encoding -1, 0, +1, sticky overflow.
>
> I don't follow how you propose to use these 2 bits.
WRAP is a 2-bit signed integer. When WRPTR changes from DEPTH-1 to 0,
increment WRAP (11 -> 00, 00 -> 01, 01 -> 10, 10 -> 10). When WRPTR changes
from 0 to DEPTH-1, decrement WRAP (01 -> 00, 00 -> 11, 11 -> 10, 10 -> 10).
10 indicates data loss and is not automatically cleared.
>> proposals:
>>
>> 1. VM migrations are extremely rare events compared to profile
>> samples, and profile samples are expected to occasionally be lost
anyway,
>> so rather than writable DEPTH I would like to suggest a model where
DEPTH
>> is not software writable at any mode but allowed to spontaneously
change as
>> viewed by modes lower than M. The visible effect of a DEPTH change must
be
>> specified in a way that allows software to identify potentially lost
>> events; I propose to redefine nonexistent CTR entries as ctrsource.PC=1,
>> ctrsource.V=everything else=0, and specify that all CTR entries keep
their
>> values through a sequence of DEPTH changes unless they become
nonexistent
>> or temporarily nonexistent, in which case they change to PC=1 V=0.
>>
>> As mentioned in the email thread, I'm concerned about relying on tools
to
> handle depth changes, in particular because they are so rare, and will
> never occur in development environments.
>
> That said, something that will
> happen (occasionally) is that tools find that the CTR array isn't full.
> This is because, when RASEMU=0, Linux will clear CTR on context switch.
So
> if we made reduced depth look like such a case, maybe that would work.
But
> that would mean that unimplemented vs unwritten entries have to look the
> same.
This is essentially what I was proposing - making reduced depth look like
a random loss of CTR data, which would happen anyway if CTR entry migration
is not implemented.
I think I misunderstood the purpose of CLR. I was thinking CLR was
intended to create a _known_ empty state, from which 0 to DEPTH CTR entries
can be created without resulting in ambiguity between new and existing
entries; in which case "lost" entries must look different to avoid
reintroducing the ambiguity. But if CLR is primarily used to sanitize
state, making it unknown, then yes, the unimplemented and post-CLR CTR
entry values should be identical.
Another option would be to ignore the issue entirely and make DEPTH an
implementation-defined constant with all that entails. There are already a
large number of implementation-defined parameters visible to VS-mode
(likely including, for CTR use cases, the performance counter definitions),
and migrating VMs between cores from different vendors or generations IMO
realistically requires a future S-mode defeaturing extension, which could
include a writable DEPTH at that time.
>> 1. Requiring WRPTR to always be 8 bits would simplify the handing of
>> DEPTH changes and wrapping detection.
>>
>> I don't see how a WRPTR value >DEPTH-1 helps. Though I could see how it
> could be useful for stack stitching.
While the additional bits benefit stack stitching, the main purpose is so
that if WRPTR does not need to change when DEPTH does and DEPTH changes can
be ignored when accessing WRPTR.
Last week we discussed what I think is a good solution for dealing with
DEPTH and VM migration. We'll just make vsctrcontrol.DEPTH a read-only
alias of mctrcontrol.DEPTH. So the guest can only use the depth set by the
host, without requiring any new CSRs. Then there's no worry about changes
to DEPTH, WRPTR, WRAP, etc on VM migration. This assumes the server
platform spec defines some required CTR depth, to ensure there is some
common value supported.
I think this addresses a few of the concerns above, because it eliminates
unexpected depth changes. We'll want to make sure that WRPTR behaves the
same, for a given DEPTH value, regardless of the max depth supported. That
means it needs to be a fixed width per DEPTH value, at least.
As for the 2-bit WRAP scheme, I see that you want to be able to tell when
the entire buffer has been overwritten, which I presume is for
stack-stitching purposes (since a fully overwritten stack can't be stitched
with the last sample). We'll discuss it in the meeting this week.
>> 1. We could avoid a number of edge cases in the privilege mode change
>> and freeze definitions if all fields in vsctrcontrol other than VS, VU,
and
>> VSTE were made read-write aliases of the identically named fields in
>> mctrcontrol.
>>
>> Yes, I would like that better. I didn't define it that way because,
> AFAICT, all existing standard CSRs are either pure aliases (sometimes a
> subset) or purely new state. But that may not be a hard-and-fast rule.
> I'll check.
Let me know how that goes.
ARC agreed that it was fine, and I made the change. Now I'm not so sure.
I suspect KVM will want to support host + guest use of CTR, where the host
uses it when V=0 and guest uses it when V=1. Support for such usages is
already being added to KVM for x86. That can be accomplished by context
switching all the CTR CSRs on VM-exit/entry, but that's expensive. A
potential future extension could allow the hypervisor to split the HW
buffer, so guest gets half and host gets half. Then no switching of the
entries is needed. If we make vsctrcontrol a separate CTR config, we don't
need to switch it either. (the extension would have to add a
vsctrstatus CSR to completely avoid any context switching)
Then the question is whether we should duplicate all those ctrcontrol
fields just for this usage.
>> 1. WPRI is normally used for fields that contain, or might contain in
>> the future, control fields, where writing garbage to them will cause the
>> hart to behave in unpredictable ways. The entry registers shouldn't
behave
>> like that - writing anything to them should have no effect other than to
>> cause a value to be read back in the future - so it might make more
sense
>> to make them WARL as a whole.
>>
>> I may not be using WPRI correctly, but my goal is to indicate that new
> fields may show up in these registers, and old software that isn't
familiar
> with them should ignore them (but preserve them). Which seems like the
> definition of WPRI?
That's the definition, I'm just not sure if "preserve fields" makes sense
for our use case. As far as I know the only reasons to write to entry
registers are VM migration and synthesizing entries in privileged code, and
in both cases the objective is to create an entirely fresh entry;
preserving fields from the previous entry at the same index makes little
sense to me.
There's also context switch. But I agree that software that is
interpreting the entries is not likely to write them. I'll see what ARC
says about it.
>> 1. This year marks 20 years since the first sale of x86_64 processors.
>> Custom extensions that allow IALIGN=8 will happen with or without RVI's
>> blessing, all we can do is not make them needlessly painful by using
pc[0]
>> for other purposes without a better reason than "it's free real
estate". I
>> am unsure that allowing implementations to not decode ctrdata counts as
a
>> better reason.
>
> You're saying you think we need to support 1-byte instructions?
That's the argument, yes. Might be another ARC thing, they might have an
alternate idea in mind for IALIGN=8 that doesn't use bit 0 of various
U-mode address CSRs.
Will be discussed in the meeting this week.
… —
Reply to this email directly, view it on GitHub
<#2 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AXFCKAFYNP6KIGVI4Q7T6WDXSA4NFANCNFSM6AAAAAAYY7T5Q4>
.
You are receiving this because you commented.Message ID:
***@***.***>
|
Sign up for free
to subscribe to this conversation on GitHub.
Already have an account?
Sign in.
"CTR" is likely to be confused with "counter" in the abbreviations of various CSRs. I suppose it's too late to use "CXR" or something else?
We need clear guidance about whether bits of
mctrcontrol
which are not defined in the current specification should be set to 0 or 1 by software. The forward compatibility requirements are presently inconsistent between hypothetical new control transfer types (which are 1 to ignore) and hypothetical new modes (0 to ignore). Current examples of unused portions of multi-function bitfield registers are either WPRI (mstatus, hstatus, menvcfg, mseccfg, fcsr) or have ad-hoc reserved rules (effectively WLRL); WARL seems inappropriate in either case.SEN
has the function of amstateen
bit and should be required to be 0 at reset for the reasons described in the Smstateen specification.VSTE
is the only control bit not allocated an index inmctrcontrol
. If it had one, M-mode code would not need to referencevsctrcontrol
for emulation or context swaps.We don't need names for
VS
orVU
insctrcontrol
since they are also available invsctrcontrol
. Bits[4:3]
in both CSRs can be defined as read-only zero, saving two of the three fake-virtualization bits.sctrcontrol.VSEN
should always exist as a writable field, regardless of the H extension, as part of the minimal support required to emulate the H-extension using M-mode primitives. Alternatively,VSEN
can be moved to a CSR accessible only in HS-mode (a newhctrcontrol
, orvsctrcontrol
, assuming in the latter case that the new bit is hidden when it is accessed assctrcontrol
in VS-mode).I think that the current handling of
DEPTH
invsctrcontrol
andsctrcontrol
is fine, since it can be configured in a privilege mode iff filtering can be configured in that privilege mode.Clarify accessibility of
vsctrcontrol
. Since it contains the only name ofVSTE
, it must be accessible for higher modes even if access to the CTR unit is disabled or mediated by VSEN=0. I think the correct check isprv = M OR (prv = HS AND mctrcontrol.SEN)
.We should specify the behavior of attempting to access entry registers using
vsireg
. (Always illegal instruction?)WPRI should only apply to the undefined portions of the Entry Registers, not the whole registers.
Why are
ctrsource.V
andctrtarget.MISP
in address fields when we have a perfectly goodctrdata
?Is there a difference between TYPE 0 or 4 (Undefined) and 7 (Reserved)?
It could be a little clearer (in the
ctrdata
andmctrcontrol
definitions) that the numeric values and precise definitions of the control transfer types are from "4.1D Instruction Trace Interface requirements" and "Table 4.4: Instruction Type (itype) encoding" in the E-trace spec, although the names are slightly different. In particular, there is no indication in this draft of which registers are used to distinguish calls, returns, and jumps.If one of
ctrtarget
andctrdata.CC
is going to be optional, only one of them can be inferred fromctrsource
in most dynamical cases.Clarify that unsupported fields will read as zero in newly added entries; the text as written implies that fields could be picked up from an unrelated entry.
"Target Mode > Source Mode" column header is unclear, suggest explicit column headers.
Are the filtering bits in
mctrcontrol
honored whenRASEMU
=1?A single trap can service multiple interrupts (e.g. using a loop on
mtopi
). At what point do we consider "a LCOFI is taken"? Would it be more appropriate for the freeze to occur at some point after the LCOFI becomes pending?If an
ebreak
instruction is executed withdcsr.ebreakx
=1 and Debug Mode is entered, is this considered a "breakpoint exception" for the purposes ofBPFRZ
?"Writing all 1s to the Entry Registers at entry 0" is not permissible because the Entry Registers are defined as WPRI, and therefore software may only change the defined portions. "Writing 1s to all defined fields of the Entry Registers at entry 0"?
Discovering xctrcontrol features by writing all 1s is similarly problematic if future fields have effects when written. Write all 1s to defined fields of xctrcontrol?
The text was updated successfully, but these errors were encountered: