From 19397bcecf212481f4098b6d5f81680d718eecf1 Mon Sep 17 00:00:00 2001
From: Lennart Van Hirtum <lennart.vanhirtum@gmail.com>
Date: Thu, 19 Oct 2023 16:29:08 +0200
Subject: [PATCH] Add text on latency and tensions

---
 philosophy/state.md    | 11 +++++++++++
 philosophy/tensions.md | 19 +++++++++++++++++++
 2 files changed, 30 insertions(+)
 create mode 100644 philosophy/tensions.md

diff --git a/philosophy/state.md b/philosophy/state.md
index 8acbb96..457c44c 100644
--- a/philosophy/state.md
+++ b/philosophy/state.md
@@ -32,6 +32,17 @@ However, state registers should not count towards the latency count. So specifyi
 
 If this rule holds for all possible hardware designs is up for further research. 
 
+### Maximum Latency Requirements
+It's the intention of the language to hide fixed-size latency as much as possible, making it easy to create pipelined designs. 
+
+Often, however, there are limits on how long a latency is allowed to be. The most common case is a state register that feeds back into itself. If a state register must be updated every cycle and it depends on itself, the loopback computation path may not include any latency.
+
+For example, a FIFO with an almost_full threshold of _N_ may have a `ready_out -> valid_in` latency of at most _N_.
+
+For state-to-state paths, this could be relaxed in several ways:
+- If it is proven the register won't be read for some cycles, then the latency can be hidden in these cycles. (Requires complex validity checking)
+- Slow the rate of state updates to match the maximum latency, possibly allowing automatic C-Slowing
+
 ## On State
 State goes hand-in-hand with the flow descriptors on the ports of modules. Without state all a module could represent is a simple flow-through pipeline. 
 
diff --git a/philosophy/tensions.md b/philosophy/tensions.md
new file mode 100644
index 0000000..d78c6df
--- /dev/null
+++ b/philosophy/tensions.md
@@ -0,0 +1,19 @@
+# Implementation Tensions
+
+## HW Design wants as much templating as possible --- Turing-Complete code generation can't be generically checked
+### Solutions
+- Don't analyze Templated Code before instantiation (C++ way)
+- Require Default Args, and do the user-facing compile based on these.
+- Limit Code Generation to a limited subset that can be analyzed generically. (Lot of work, will eliminate otherwise valid code)
+
+## Compilation Ordering: Code Generation --- Flow Analysis --- Latency Counting
+Most of the time, Latency Counting is dependent on Template Instantiation. For example, a larger Memory may incur more latency overhead for reads and writes. 
+
+On the other hand, one could want a measured latency count to be usable at compile time, to generate hardware that can specifically deal with this latency. For example, a FIFO's almostFull threshold. 
+
+Another important case: Automatic Generation of compact latency using BRAM shift registers. The compiler could instantiate a user-provided module with the latency between the bridging wires as its template argument, but the module's own latency may then vary depending on the memory block size, requiring the compiler to adjust the template argument again.
+
+### Solutions
+- Always compile in the order Template Instantiation -> Flow Analysis & Latency Counting. Explicitly state template args. Add asserts for latencies that are too high and reject designs that violate them.
+- For each nested module separately, perform Template Instantiation & Latency Counting first, and allow higher modules to use the measured latency.
+- Add latency template arguments, allowing higher modules to set the latency of lower modules. Reject lower modules that violate this latency.