diff --git a/_posts/cs_idols/2019-09-13-peter-john-landin.md b/_posts/cs_idols/2019-09-13-peter-john-landin.md deleted file mode 100644 index b982c98..0000000 --- a/_posts/cs_idols/2019-09-13-peter-john-landin.md +++ /dev/null @@ -1,33 +0,0 @@ ---- -title: "Peter John Landin" -subtitle: "「计算机科学偶像」- 彼得·约翰·兰丁" -layout: post -author: "Hux" -published: false -header-style: text -tags: - - CS Idols ---- - -> - [wiki](https://en.wikipedia.org/wiki/Peter_Landin) -> - [维基](https://zh.wikipedia.org/wiki/%E5%BD%BC%E5%BE%97%C2%B7%E5%85%B0%E4%B8%81) - -I was long curious about how does λ calculus become the foundation of formalizaing programming languages. It's strange that I haven't look up the answer until today: It's invented so early by Alonzo Church (whom I will write another post for) as an alternative mathematic foundation in 1930s and its relation with programming language was re-discoverred in 1960s. - -From the "Lambda calculus and programming languages" section of wikipedia page: - -> As pointed out by Peter Landin's 1965 paper "A Correspondence between ALGOL 60 and Church's Lambda-notation" - -I found this name quite familiar since I read his paper "The mechanical evaluation of expressions" before, in which he introduced the first abstract machine for functional programming language, namely [SECD machine](https://en.wikipedia.org/wiki/SECD_machine). This paper also define the term [Closure](https://en.wikipedia.org/wiki/Closure_(computer_programming)) which becomes a prevalent notion in computer programming nowadays. - -Besides of that, his contributions also include: - -- on ALGO definition -- [ISWIM](https://en.wikipedia.org/wiki/ISWIM) programming language -- [off-side rule](https://en.wikipedia.org/wiki/Off-side_rule), known as "indentation-based" syntax nowadays, popularized by Miranda, Haskell, Python, etc. -- coin the term [syntactic sugar](https://en.wikipedia.org/wiki/Syntactic_sugar) - -He was much influenced by a study of McCarthy's LISP and taught [Tony Hoare](https://en.wikipedia.org/wiki/Tony_Hoare) ALGO with Peter Naur and Edsger W. Dijkstra. (Oh yes, definitely 4 more people to write). - -I have just download his old, influential paper "The next 700 programming languages". -I am sure it will be an enjoyable read. diff --git a/_posts/data_rep/2020-06-19-data-rep-int.md b/_posts/data_rep/2020-06-19-data-rep-int.md deleted file mode 100644 index a953d6f..0000000 --- a/_posts/data_rep/2020-06-19-data-rep-int.md +++ /dev/null @@ -1,333 +0,0 @@ ---- -title: "Data Representation - Integer" -subtitle: "「数据表示」整数" -layout: post -author: "Hux" -header-style: text -hidden: true -tags: - - 笔记 - - 基础 - - C - - C++ ---- - -Integers, or _whole number_ from elemental mathematics, are the most common and -fundamental numbers used in the computers. It's represented as -_fixed-point numbers_, contrast to _floating-point numbers_ in the machine. -Today we are going to learn a whole bunch of way to encode it. - -There are mainly two properties to make a integer representation different: - -1. **Size, of the number of bits used**. -usually the power of 2. e.g. 8-bit, 16-bit, 32-bit, 64-bit. - -2. **Signed or unsigned**. -there are also multiple schemas to encode a signed integers. - -We are also gonna use the below terminologies throughout the post: - -- _MSB_: Most Significant Bit -- _LSB_: Least Significant Bit - - -Prerequisite - `printf` Recap ----------------------------------------- - -We will quickly recap the integers subset of usages of `printf`. -Basically, we used _format specifier_ to interpolate values into strings: - -### [Format Specifier](http://www.cplusplus.com/reference/cstdio/printf/) - -> `%[flags][width][.precision][length]specifier` - -- `specifier` - - `d`, `i` : signed decimal - - `u` : unsigned decimal - - `c` : char - - `p`: pointer addr - - `x` / `X` : lower/upper unsigned hex -- `length` - - `l` : long (at least 32) - - `ll` : long long (at least 64) - - `h` : short (usually 16) - - `hh` : short short (usually 8) - -```cpp -using namespace std; -int main() { - cout << "Size of int = "<< sizeof(int) << endl; - cout << "Size of long = " << sizeof(long) << endl; - cout << "Size of long long = " << sizeof(long long); -} -Output in 32 bit gcc compiler: 4 4 8 -Output in 64 bit gcc compiler: 4 8 8 -``` - -### [`inttypes.h` from C99](http://www.qnx.com/developers/docs/6.5.0/index.jsp?topic=%2Fcom.qnx.doc.dinkum_en_c99%2Finttypes.html) - -Also in [cppreference.com](https://en.cppreference.com/w/c/types/integer) - -```cpp -// signed int (d or i) -#define PRId8 "hhd" -#define PRId16 "hd" -#define PRId32 "ld" -#define PRId64 "lld" - -// unsigned int (u) -#define PRIu8 "hhd" -#define PRIu16 "hd" -#define PRIu32 "ld" -#define PRIu64 "lld" - -// unsigned hex -#define PRIx8 "hhu" -#define PRIx16 "hu" -#define PRIx32 "lu" -#define PRIx64 "llu" - -// uintptr_t (64 bit machine word len) -#define PRIxPTR "llx" -``` - - -Unsigned Integers ------------------ - -The conversion between unsigned integers and binaries are trivial. -Here, we can represent 8 bits (i.e. a _byte_) as a _hex pair_, e.g. -`255 == 0xff == 0b11111111`. - -```cpp -#include // uintN_t -#include // PRI macros - -uint8_t u8 = 255; -printf("0x%02" PRIx8 "\n", u8); // 0xff -printf( "%" PRId8 "\n", u8); // 255 -``` - - -Signed Integers ------------------ - -Signed integers are more complicated. We need to cut those bits to halves -to represent both positive and negative integers somehow. - -There are four well-known schemas to encode it, according to -[signed number representation of wikipedia](https://en.wikipedia.org/wiki/Signed_number_representations). - -### Sign magnitude 原码 - -It's also called _"sign and magnitude"_. From the name we can see how straightforward it is: -it's basically put one bit (often the _MSB_) as the _sign bit_ to represent _sign_ and the remaining bits indicating -the magnitude (or absolute value), e.g. - -```cpp - binary | sign-magn | unsigned ------------|-----------|------------ -0 000 0000 | +0 | 0 -0 111 1111 | 127 | 127 -... -1 000 0000 | -0 | 128 -1 111 1111 | -127 | 255 -``` - -It was used in early computer (IBM 7090) and now mainly used in the -_significand_ part in floating-point number - -Pros: -- simple and nature for human - -Cons: -- 2 way to represent zeros (`+0` and `-0`) -- not as good for machine - - add/sub/cmp require knowing the sign - - complicate CPU ALU design; potentially more cycles - - -### [Ones' complement](https://en.wikipedia.org/wiki/Ones%27_complement) 反码 - -It form a negative integers by applying a _bitwise NOT_ -i.e. _complement_ of its positive counterparts. - -```cpp - binary | 1s comp | unsigned ------------|-----------|------------ -0000 0000 | 0 | 0 -0000 0001 | 1 | 1 -... -0111 1111 | 127 | 127 -1000 0000 | -127 | 128 -... -1111 1110 | -1 | 254 -1111 1111 | -0 | 255 -``` - -N.B. _MSB_ can still be signified by MSB. - -It's referred to as _ones'_ complement because the negative can be formed -by subtracting the positive **from** _ones_: `1111 1111 (-0)` - -```cpp - 1111 1111 -0 -- 0111 1111 127 ---------------------- - 1000 0000 -127 -``` - -The benefits of the complement nature is that adding becomes simple, -except we need to do an _end-around carry_ to add resulting carry -back to get the correct result. - -```cpp - 0111 1111 127 -+ 1000 0001 -126 ---------------------- -1 0000 0000 0 - 1 +1 <- add carry "1" back ---------------------- - 0000 0001 1 -``` - -Pros: -- Arithmetics on machien are fast. - -Cons: -- still 2 zeros! - - -### [Twos' complement](https://en.wikipedia.org/wiki/Two%27s_complement) 补码 - -Most of the current architecture adopted this, including x86, MIPS, ARM, etc. -It differed with one's complement by one. - -```cpp - binary | 2s comp | unsigned ------------|-----------|------------ -0000 0000 | 0 | 0 -0000 0001 | 1 | 1 -... -0111 1111 | 127 | 127 -1000 0000 | -128 | 128 -1000 0001 | -127 | 129 -... -1111 1110 | -2 | 254 -1111 1111 | -1 | 255 -``` - -N.B. _MSB_ can still be signified by MSB. - -It's referred to as _twos'_ complement because the negative can be formed -by subtracting the positive **from** `2 ** N` (congruent to `0000 0000 (+0)`), -where `N` is the number of bits. - -E.g., for a `uint8_t`, the _sum_ of any number and it's twos' complement would -be `256 (1 0000 0000)`: - -```cpp -1 0000 0000 256 = 2 ** 8 -- 0111 1111 127 ---------------------- - 1000 0001 -127 -``` - -Becuase of this, arithmetics becomes really easier, for any number `x` e.g. `127` -we can get its twos' complement by: - -1. `~x => 1000 0000` bitwise NOT (like ones' complement) -2. `+1 => 1000 0001` add 1 (the one differed from ones' complement) - -Cons: -- bad named? - -Pros: -- fast machine arithmatics. -- only 1 zeros! -- the minimal negative is `-128` - - -### [Offset binary](https://en.wikipedia.org/wiki/Offset_binary) 移码 - -It's also called _excess-K_ (偏移 K) or _biased representation_, where `K` is -the _biasing value_ (the new `0`), e.g. in _excess-128_: - -```cpp - binary | K = 128 | unsigned ------------|-----------|------------ -0000 0000 | -128(-K)| 0 -0000 0001 | -127 | 1 -... -0111 1111 | -1 | 127 -1000 0000 | 0 | 128 (K) -1000 0001 | 1 | 129 -... -1111 1111 | 127 | 255 -``` - -It's now mainly used for the _exponent_ part of floating-point number. - - -Type Conversion & `Printf` ----------------------------------------------- - -This might be a little bit off topic, but I want to note down what I observed -from experimenting. Basically, `printf` would not perform an implicit type -conversion but merely _interpret_ the bits arrangement of your arguments as you -told it. - -- _UB!_ stands for _undefined behaviors_ - -```cpp -uint8_t u8 = 0b10000000; // 128 - int8_t s8 = 0b10000000; // -128 - -printf("%"PRIu8 "\n", u8); // 128 -printf("%"PRId8 "\n", u8); // 128 (UB! but somehow it's got right) -printf("%"PRId8 "\n", (int8_t)u8); // -128 - -printf("%"PRId8 "\n", s8); // -128 -printf("%"PRIu8 "\n", s8); // 4294967168 (UB!) -printf("%"PRId8 "\n", (uint8_t)s8); // 128 - -printf("%"PRIxPTR "\n", s8); // ffffff80 -printf("%"PRIxPTR "\n", (uintptr_t)s8); // ffffffffffffff80 -``` - - -Char & [ASCII](https://en.wikipedia.org/wiki/ASCII) ------------------ - -Traditionally, `char` is represented in the computer as 8 bits as well. And -really, ASCII is only defined between `0` and `127` and require 7 bits. -(8-bit Extended ASCII is not quite well popularized and supported.) - -It's more complicated in extension such as _Unicode_ nowadays, but we'll ignore -it for future posts dedicated for char and string representation. - -So how is a `char` different with a _byte_? - -Well, the answer is whether a `char` is a `signed char` (backed by `int8_t`) -or a `unsigned char` (backed by `uint8_t`) is... _implementaton-defined_. -And most systems made it _signed_ since most types (e.g. `int`) were signed -by default. - -N.B. `int` is standard-defined to be equivalent to `signed int`. This is -not the case of `char`. - -That's why you often see such `typedef` such as: - -```cpp -typedef unsigned char Byte_t; -typedef uint8_t byte_t; -``` - -to emphysize the nature of byte should be just plain, unsigned, bits. - - -References ----------- - -- -- diff --git a/_posts/data_rep/2020-06-21-data-rep-float.md b/_posts/data_rep/2020-06-21-data-rep-float.md deleted file mode 100644 index 60df834..0000000 --- a/_posts/data_rep/2020-06-21-data-rep-float.md +++ /dev/null @@ -1,331 +0,0 @@ ---- -title: "Data Representation - Floating Point Numbers" -subtitle: "「数据表示」浮点数" -layout: post -author: "Hux" -header-style: text -hidden: true -tags: - - 笔记 - - 基础 - - C - - C++ ---- - -In the last episode we talked about the data representation of integer, a kind -of fixed-point numbers. Today we're going to learn about floating-point numbers. - -Floating-point numbers are used to _approximate_ real numbers. Because of the -fact that all the stuffs in computers are, eventually, just a limited sequence -of bits. The representation of floating-point number had to made trade-offs -between _ranges_ and _precision_. - -Due to its computational complexities, CPU also have a dedicated set of -instructions to accelerate on floating-point arithmetics. - - -Terminologies -------------- - -The terminologies of floating-point number is coming from the -[_scientific notation_](https://en.wikipedia.org/wiki/Scientific_notation), -where a real number can be represented as such: - -``` -1.2345 = 12345 × 10 ** -4 - ----- -- -- - significand^ ^base ^exponent -``` - -- _significand_, or _mantissa_, 有效数字, 尾数 -- _base_, or _radix_ 底数 -- _exponent_, 幂 - -So where is the _floating point_? It's the `.` of `1.2345`. Imaging the dot -can be float to the left by one to make the representation `.12345`. - -The dot is called _radix point_, because to us it's seem to be a _decimal point_, -but it's really a _binary point_ in the computers. - -Now it becomes clear that, to represent a floating-point number in computers, -we will simply assign some bits for _significand_ and some for _exponent_, and -potentially a bit for _sign_ and that's it. - - -IEEE-754 32-bits Single-Precision Floats 单精度浮点数 ----------------------------------------- - -- - -It was called **single** back to IEEE-754-1985 and now **binary32** in the -relatively new IEEE-754-2008 standard. - -```cpp - (8 bits) (23 bits) -sign exponent fraction - 0 011 1111 1 000 0000 0000 0000 0000 0000 - - 31 30 .... 23 22 ....................... 0 -``` - -- The _sign_ part took 1 bit to indicate the sign of the floats. (`0` for `+` -and `1` for `-`. This is the same treatment as the [sign magnitute](2020-06-19-data-rep-int.md##sign-magnitude-原码). -- The _exponent_ part took 8 bits and used [_offset-binary (biased) form_](2020-06-19-data-rep-int.md#offset-binary-移码) to represent a signed integer. -It's a variant form since it took out the `-127` (all 0s) for zero and `+128` -(all 1s) for non-numbers, thus it ranges only `[-126, 127]` instead of -`[-127, 128]`. Then, it choose the zero offset of `127` in these 254 bits (like -using `128` in _excess-128_), a.k.a the _exponent bias_ in the standard. -- The _fraction_ part took 23 bits with an _implicit leading bit_ `1` and -represent the actual _significand_ in total precision of 24-bits. - -Don't be confused by why it's called _fraction_ instead of _significand_! -It's all because that the 23 bits in the representation is indeed, representing -the fraction part of the real significand in the scientific notation. - -The floating-point version of "scientific notation" is more like: - -```cpp -(leading 1) - 1. fraction × 2 ^ exponent × sign - (base-2) (base-2) -``` - -So what number does the above bits represent? - -```cpp -S F × E = R -+ 1.(0) × 0 = 1 -``` - -Aha! It's the real number `1`! -Recall that the `E = 0b0111 1111 = 0` because it used a biased representation! - -We will add more non-trivial examples later. - - -Demoing Floats in C/C++ ------------------------ - -Writing sample code converting between binaries (in hex) and floats are not -as straightforward as it for integers. Luckily, there are still some hacks to -perform it: - -### C - Unsafe Cast - -We unsafely cast a pointer to enable reinterpretation of the same binaries. - -```cpp -float f1 = 0x3f800000; // C doesn't have a floating literal taking hex. -printf("%f \n", f1); // 1065353216.000000 (???) - -uint32_t u2 = 0x3f800000; -float* f2 = (float*)&u2; // unsafe cast -printf("%f \n", *f2); // 1.000000 -``` - -### C - Union Trick - -Oh I really enjoyed this one...Union in C is not only untagged union, but also -share the exact same chunk of memory. So we are doing the same reinterpretation, -but in a more structural and technically fancier way. - -```cpp -#include -#include -#include - -float pi = (float)M_PI; -union { - float f; - uint32_t u; -} f2u = { .f = pi }; // we took the data as float - -printf ("pi : %f\n : 0x%" PRIx32 "\n", pi, f2u.u); // but interpret as uint32_t -pi : 3.141593 - : 0x40490fdb -``` - -N.B. this trick is well-known as [type punning](https://en.wikipedia.org/wiki/Type_punning): - -> In computer science, type punning is a common term for any programming technique that subverts or circumvents the type system of a programming language in order to achieve an effect that would be difficult or impossible to achieve within the bounds of the formal language. - -### C++ - `reinterpret_cast` - -C++ does provide such type punning to the standard language: - -```cpp -uint32_t u = 0x40490fdb; -float a = *reinterpret_cast(&u); -std::cout << a; // 3.14159 -``` - -N.B. it still need to be a conversion between pointers, -see . - -Besides, C++ 17 does add a floating point literal that can take hex, but it -works in a different way, using an explicit radix point in the hex: - -```cpp -float f = 0x1.2p3; // 1.2 by 2^3 -std::cout << f; // 9 -``` - -That's try with another direction: - -```cpp -#include -#include -#include - -int main() { - double qNan = std::numeric_limits::quiet_NaN(); - printf("0x%" PRIx64 "\n", *reinterpret_cast(&qNan)); - // 0x7ff8000000000000, the canonical qNaN! -} -``` - - -Representation of Non-Numbers ------------------------------ - -There are more in the IEEE-754! - -Real numbers doesn't satisfy [closure property](https://en.wikipedia.org/wiki/Closure_(mathematics)) -as integers does. Notably, the set of real numbers is NOT closed under the -division! It could produce non-number results such as **infinity** (e.g. `1/0`) -and [**NaN (Not-a-Number)**](https://en.wikipedia.org/wiki/NaN) (e.g. taking -a square root of a negative number). - -It would be algebraically ideal if the set of floating-point numbers can be -closed under all floating-point arithmetics. That would made many people's life -easier. So the IEEE made it so! Non-numeber values are squeezed in. - -We will also include the two zeros (`+0`/`-0`) into the comparison here, -since they are also special by being the only two demanding an `0x00` exponent: - -```cpp - binary | hex | --------------------------------------------------------- -0 00000000 00000000000000000000000 = 0000 0000 = +0 -1 00000000 00000000000000000000000 = 8000 0000 = −0 - -0 11111111 00000000000000000000000 = 7f80 0000 = +infinity -1 11111111 00000000000000000000000 = ff80 0000 = −infinity - -_ 11111111 10000000000000000000000 = _fc0 0000 = qNaN (canonical) -_ 11111111 00000000000000000000001 = _f80 0001 = sNaN (one of them) -``` - -```cpp - (8 bits) (23 bits) -sign exponent fraction - 0 00 0 ...0 0 = +0 - 1 00 0 ...0 0 = -0 - 0 FF 0 ...0 0 = +infinity - 1 FF 0 ...0 0 = -infinity - _ FF 1 ...0 0 = qNaN (canonical) - _ FF 0 ...0 1 = sNaN (one of them) -``` - -Encodings of qNaN and sNaN are not specified in IEEE 754 and implemented -differently on different processors. Luckily, both x86 and ARM family use the -"most significant bit of fraction" to indicate whether it's quite. - -### More on NaN - -If we look carefully into the IEEE 754-2008 spec, in the _page35, 6.2.1_, it -actually defined anything with exponent `FF` and not a infinity (i.e. with -all the fraction bits being `0`), a NaN! - -> All binary NaN bit strings have all the bits of the biased exponent field E set to 1 (see 3.4). A quiet NaN bit string should be encoded with the first bit (d1) of the trailing significand field T being 1. A signaling NaN bit string should be encoded with the first bit of the trailing significand field being 0. - -That implies, we actually had `2 ** 24 - 2` of NaNs in a 32-bits float! -The `24` came from the `1` sign bit plus `23` fractions and the `2` excluded -were the `+/- inf`. - -The continuous 22 bits inside the fraction looks quite a waste, and there -would be even 51 bits of them in the `double`! We will see how to made them useful -in later episodes (spoiler: they are known as _NaN payload_). - -It's also worth noting that it's weird that the IEEE choose to use the MSB -instead of the sign bit for NaN quiteness/signalness: - -> It seems strange to me that the bit which signifies whether or not the NaN is signaling is the top bit of the mantissa rather than the sign bit; perhaps something about how floating point pipelines are implemented makes it less natural to use the sign bit to decide whether or not to raise a signal. -> -- - -I guess it might be something related to the CPU pipeline? I don't know yet. - - -### Equality of NaNs and Zeros. - -The spec defined a comparison with NaNs to return an **unordered result**, that -means any comparison operation except `!=`, i.e. `>=, <=, >, <, =` between a -NaN and any other floating-point number would return `false`. - -No surprised that most (if not every) language implemented such behaviours, e.g. -in JavaScript: - -```js -NaN !== NaN // true -NaN === NaN // false -NaN > 1 // false -NaN < 1 // false -``` - -Position and negative zeros, however, are defined to be equal! - -```js -+0 === -0 // true, using the traditional JS equality -Object.is(+0, -0) // false, using the "SameValue" equality -``` - -In Cpp, we can tell them apart by looking at its sign bit: - -```cpp -#include // signbit - -cout << (+0.0f == -0.0f); // 1 -cout << std::signbit(-0.0f); // 1 -cout << std::signbit(+0.0f); // 0 -``` - - - - -IEEE-754 64-bits Double-Precision Floats ----------------------------------------- - -- - -Now, the 64-bit versions floating-point number, known as `double`, is just a -matter of scale: - -```cpp - (11 bits) (52 bits) -sign exponent fraction - 0 - - 63 62 .... 52 51 ....................... 0 -``` - - -IEEE-754-2008 16-bits Short Floats ----------------------------------------- - -The 2008 edition of IEEE-754 also standardize the `short float`, which is -neither in C or C++ standard. Though compiler extension might include it. - -It looks like: - -```cpp -1 sign bit | 5 exponent bits | 10 fraction bits -S E E E E E M M M M M M M M M M -``` - - - -References ----------- - -- -- diff --git a/_posts/data_rep/2020-06-21-data-rep-todo.md b/_posts/data_rep/2020-06-21-data-rep-todo.md deleted file mode 100644 index 81978fc..0000000 --- a/_posts/data_rep/2020-06-21-data-rep-todo.md +++ /dev/null @@ -1,22 +0,0 @@ ---- -title: "Data Representation - TODO" -subtitle: "「数据表示」待写" -layout: post -author: "Hux" -header-style: text -published: false -tags: - - 笔记 - - 基础 - - C - - C++ ---- - -- Endianness -- String (Char Sequence e.g. NULL `0x00`) -- Unicode / UTF8 -- Struct and Alignment -- Tagging - - Tagged Pointer - - NaN tagging - - Tagged Integer (SMI) diff --git a/_posts/hidden/2020-05-05-pl-chart.md b/_posts/hidden/2020-05-05-pl-chart.md deleted file mode 100644 index fac9638..0000000 --- a/_posts/hidden/2020-05-05-pl-chart.md +++ /dev/null @@ -1,18 +0,0 @@ ---- -title: "My Programming Languages Spectrum" -subtitle: "我的编程语言光谱" -layout: post -author: "Hux" -header-style: text -hidden: true -plchart: true -tags: ---- - - diff --git a/_posts/read_sf_lf/2019-01-14-sf-lf-14-imp-ceval.md b/_posts/read_sf_lf/2019-01-14-sf-lf-14-imp-ceval.md deleted file mode 100644 index d370371..0000000 --- a/_posts/read_sf_lf/2019-01-14-sf-lf-14-imp-ceval.md +++ /dev/null @@ -1,104 +0,0 @@ ---- -title: "「SF-LC」14 ImpCEvalFun" -subtitle: "Logical Foundations - An Evaluation Function For Imp" -layout: post -author: "Hux" -header-style: text -hidden: true -tags: - - LF (逻辑基础) - - SF (软件基础) - - Coq - - 笔记 ---- - - -Step-Indexed Evaluator ----------------------- - -...Copied from `12-imp.md`: - -> Chapter `ImpCEvalFun` provide some workarounds to make functional evalution works: -> 1. _step-indexed evaluator_, i.e. limit the recursion depth. (think about _Depth-Limited Search_). -> 2. return `option` to tell if it's a normal or abnormal termination. -> 3. use `LETOPT...IN...` to reduce the "optional unwrapping" (basicaly Monadic binding `>>=`!) -> this approach of `let-binding` became so popular in ML family. - - -```coq -Notation "'LETOPT' x <== e1 'IN' e2" - := (match e1 with - | Some x ⇒ e2 - | None ⇒ None - end) - (right associativity, at level 60). - -Open Scope imp_scope. -Fixpoint ceval_step (st : state) (c : com) (i : nat) - : option state := - match i with - | O ⇒ None (* depth-limit hit! *) - | S i' ⇒ - match c with - | SKIP ⇒ - Some st - | l ::= a1 ⇒ - Some (l !-> aeval st a1 ; st) - | c1 ;; c2 ⇒ - LETOPT st' <== ceval_step st c1 i' IN (* option bind *) - ceval_step st' c2 i' - | TEST b THEN c1 ELSE c2 FI ⇒ - if (beval st b) - then ceval_step st c1 i' - else ceval_step st c2 i' - | WHILE b1 DO c1 END ⇒ - if (beval st b1) - then LETOPT st' <== ceval_step st c1 i' IN - ceval_step st' c i' - else Some st - end - end. -Close Scope imp_scope. -``` - - - -Relational vs. Step-Indexed Evaluation --------------------------------------- - -Prove `ceval_step` is equiv to `ceval` - - -### -> - -```coq -Theorem ceval_step__ceval: forall c st st', - (exists i, ceval_step st c i = Some st') -> - st =[ c ]=> st'. -``` - -The critical part of proof: - -- `destruct` for the `i`. -- `induction i`, generalize on all `st st' c`. - 1. `i = 0` case contradiction - 2. `i = S i'` case; - `destruct c`. - - `destruct (ceval_step ...)` for the `option` - 1. `None` case contradiction - 2. `Some` case, use induction hypothesis... - - -### <- - -```coq -Theorem ceval__ceval_step: forall c st st', - st =[ c ]=> st' -> - exists i, ceval_step st c i = Some st'. -Proof. - intros c st st' Hce. - induction Hce. -``` - - - diff --git a/_posts/read_sf_lf/2019-01-15-sf-lf-15-extraction.md b/_posts/read_sf_lf/2019-01-15-sf-lf-15-extraction.md deleted file mode 100644 index 9e5d542..0000000 --- a/_posts/read_sf_lf/2019-01-15-sf-lf-15-extraction.md +++ /dev/null @@ -1,156 +0,0 @@ ---- -title: "「SF-LC」15 Extraction" -subtitle: "Logical Foundations - Extracting ML From Coq" -layout: post -author: "Hux" -header-style: text -hidden: true -tags: - - LF (逻辑基础) - - SF (软件基础) - - Coq - - 笔记 ---- - - -Basic Extraction ----------------- - -- OCaml (most mature) -- Haskell (mostly works) -- Scheme (a bit out of date) - -```coq -Extraction "imp1.ml" ceval_step. -``` - -When Coq processes this command: - -``` -The file imp1.ml has been created by extraction. -The file imp1.mli has been created by extraction. -``` - - - -Controlling Extraction of Specific Types ----------------------------------------- - -如果不做任何处理的话...生成的 `ml` 里的 `nat` 则都会是 Church Numeral... - -> We can tell Coq how to extract certain `Inductive` definitions to specific OCaml types. -> we must say: -> 1. how the Coq type itself should be represented in OCaml -> 2. how each constructor should be translated - -```coq -Extract Inductive bool ⇒ "bool" [ "true" "false" ]. -``` - -> also, for non-enumeration types (where the constructors take arguments), -> we give an OCaml expression that can be used as a _"recursor"_ over elements of the type. (Think Church numerals.) - -```coq -Extract Inductive nat ⇒ "int" - [ "0" "(fun x → x + 1)" ] - "(fun zero succ n → - if n=0 then zero () else succ (n-1))". -``` - -```coq -Extract Constant plus ⇒ "( + )". -Extract Constant mult ⇒ "( * )". -Extract Constant eqb ⇒ "( = )". -``` - -> 注意:保证提取结果的合理性是你的责任。 - -```coq -Extract Constant minus ⇒ "( - )". -``` - -比如这么做很诱人……但是我们 Coq 的定义里 `0 - 1 = 0`, OCaml 的 `int` 则会有负数... - - - -### Recursor 的理论与实现 - a "encoding" of case expression and sum type - -```coq -Fixpoint ceval_step (st : state) (c : com) (i : nat) - : option state := - match i with - | O => None - | S i' => - match c with -``` -```ocaml -let rec ceval_step st c = function - | O -> None - | S i' -> - (match c with -``` -```ocaml -let rec ceval_step st c i = - (fun zero succ n -> if n=0 then zero () else succ (n-1)) - (fun _ -> None) (* zero *) - (fun i' -> (* succ *) - match c with -``` - -注意我们是如何使用 "recursor" 来替代 `case`, `match`, pattern matching 得。 - -recall _sum type_ 在 PLT 中的语法与语义: - -```coq -T ::= - T + T - -e ::= - case e of - | L(e) => e - | R(e) => e - -``` -``` - e → e' - ------------- (work inside constructor) - C(e) -> C(e') - - e → e' - ------------------------------- (work on the expr match against) - case e of ... → case e' of ... - - ----------------------------------------------- (match Left constructor, substitute) - case L(v) of L(x) => e1 | R(y) => e2 → e1 [v/x] - - ----------------------------------------------- (match Right constructor, substitute) - case R(v) of L(x) => e1 | R(y) => e2 → e1 [v/x] -``` - -可以发现 `case` 表达式可以理解为一种特殊的 application,会将其 argument 根据某种 tag (这里为构造函数) apply 到对应的 case body 上, -每个 case body 都是和 lambda abstraction 同构的一种 binder: - - L(x) => e1 === λx.e1 - R(x) => e2 === λx.e2 - - case v e1|e2 === (λx.e1|e2) v -- `e1` or `e2` depends on the _tag_ wrapped on `v` - -这个角度也解释了 Haskell/SML 在申明函数时直接对参数写 pattern match 的理论合理性. - -根据经验几乎所有的 _binding_ 都可以被 desugar 成函数(即 lambda expression). -难点在于我们如何 re-implement 这个 _tag_ 的 _switch_ 机制? - -对于 `Inductive nat` 翻译到 OCaml `int` 时,这个机制可以用 `v =? 0` 来判断,因此我们的 _recursor_ 实现为 - -```ocaml -fun zero succ (* partial application *) - n -> if n=0 (* 判断 tag ... *) - then zero () (* 0 case => (λx.e1) v *) - else succ (n-1) (* S n case => (λx.e2) v *) -``` - - - - - - diff --git a/_posts/read_sf_lf/2019-01-16-sf-lf-16-auto.md b/_posts/read_sf_lf/2019-01-16-sf-lf-16-auto.md deleted file mode 100644 index 2c7463d..0000000 --- a/_posts/read_sf_lf/2019-01-16-sf-lf-16-auto.md +++ /dev/null @@ -1,223 +0,0 @@ ---- -title: "「SF-LC」16 Auto" -subtitle: "Logical Foundations - More Automation" -layout: post -author: "Hux" -header-style: text -hidden: true -tags: - - LF (逻辑基础) - - SF (软件基础) - - Coq - - 笔记 ---- - -- `auto` - proof search -- `Ltac` - automated forward reasoning (hypothesis matching machinery) -- `eauto`, `eapply` - deferred instantiation of existentials - - - -`Ltac` macro ------------- - -```coq -Ltac inv H := inversion H; subst; clear H. - -(** later in the proof... **) -inv H5. -``` - - - -The `auto` Tactic ------------------ - -> `auto` can free us by _searching_ for a sequence of applications that will prove the goal: - -```coq -intros P Q R H1 H2 H3. -apply H2. apply H1. assumption. - - -(** can be replaced by... **) -auto. -``` - -`auto` solves goals that are solvable by _any combination_ of -- `intros` -- `apply` (of hypotheses from the _local_ context, by default) - - -> 使用 auto 一定是“安全”的,它不会失败,也不会改变当前证明的状态: auto 要么完全解决它,要么什么也不做。 - -> Proof search could, in principle, take an arbitrarily long time, -> so there are limits to how far auto will search by default. (i.e. `5`) - -```coq -Example auto_example_3 : ∀(P Q R S T U: Prop), - (P → Q) → - (Q → R) → - (R → S) → - (S → T) → - (T → U) → - P → - U. -Proof. - (* 当 auto 无法解决此目标时,它就什么也不做 *) - auto. - (* 可选的参数用来控制它的搜索深度(默认为 5), 6 就刚好能解决了! *) - auto 6. -Qed. -``` - - -### Hint Database 提示数据库 - -> `auto` auto considers a __hint database__ of other lemmas and constructors. -> common lemmas about _equality_ and _logical operators_ are installed by default. - -> just for the purposes of one application of `auto` -> 我们可以为某次 `auto` 的调用扩展提示数据库,`auto using ...` - -```coq -Example auto_example_6 : ∀n m p : nat, - (n ≤ p → (n ≤ m ∧ m ≤ n)) → - n ≤ p → - n = m. -Proof. - intros. - auto using le_antisym. -Qed. -``` - - -### Global Hint Database 添加到全局提示数据库 - -```coq -Hint Resolve T. - -Hint Constructors c. - -Hint Unfold d. -``` - - -### `Proof with auto.` - -Under `Proof with t`, `t1...` == `t1; t`. - - - - -Searching For Hypotheses ------------------------- - -对于很常见的一种矛盾情形: - -```coq -H1: beval st b = false -H2: beval st b = true -``` - -`contradiction` 并不能解决,必须 `rewrite H1 in H2; inversion H2`. - -1. 宏: - -```coq -Ltac rwinv H1 H2 := rewrite H1 in H2; inv H2. - -(** later in the proof... **) -rwinv H H2. -``` - -2. `match goal` 调用宏 - -```coq -Ltac find_rwinv := - match goal with - H1: ?E = true, - H2: ?E = false - ⊢ _ ⇒ rwinv H1 H2 - end. - -(** later in the proof... **) -induction E1; intros st2 E2; inv E2; try find_rwinv; auto. (** 直接解决所有矛盾 case **) -- (* E_Seq *) - rewrite (IHE1_1 st'0 H1) in *. auto. -- (* E_WhileTrue *) - + (* b 求值为 true *) - rewrite (IHE1_1 st'0 H3) in *. auto. Qed. -``` - -可以看到最后只剩这种改写形式...我们也把他们自动化了: - -```coq -Ltac find_eqn := - match goal with - H1: ∀x, ?P x → ?L = ?R, - H2: ?P ?X - ⊢ _ ⇒ rewrite (H1 X H2) in * - end. -``` - -配合上 `repeat`...我们可以 keep doing useful rewrites until only trivial ones are left. -最终效果: - -```coq -Theorem ceval_deterministic''''': ∀c st st1 st2, - st =[ c ]⇒ st1 → - st =[ c ]⇒ st2 → - st1 = st2. -Proof. - intros c st st1 st2 E1 E2. - generalize dependent st2; - induction E1; intros st2 E2; inv E2; - try find_rwinv; - repeat find_eqn; auto. -Qed. -``` - -即使我们给 IMP 加上一个 `CRepeat`(其实就是 `DO c WHILE b`), -会发现颠倒一下自动化的顺序就能 work 了 - -```coq - induction E1; intros st2 E2; inv E2; - repeat find_eqn; - try find_rwinv; auto. -``` - -当然,这种「超级自动化」(hyper-automation) 并不总是现实,也不好调试... - - - -### The `eapply` and `eauto` variants - -> 推迟量词的实例化 - -比如对于 - -```coq -Example ceval_example1: - empty_st =[ - X ::= 2;; - TEST X ≤ 1 - THEN Y ::= 3 - ELSE Z ::= 4 - FI - ]⇒ (Z !-> 4 ; X !-> 2). -Proof. - (* 我们补充了中间状态 st'... *) - apply E_Seq with (X !-> 2). - - apply E_Ass. reflexivity. - - apply E_IfFalse. reflexivity. apply E_Ass. reflexivity. -Qed. -``` - -没有 `with` 就会 `Error: Unable to find an instance for the variable st'` - -但其实 `st'` 的取值在后面的步骤是很明显(很好 infer/unify)的,所以 `eapply` works. - - - - diff --git a/_posts/read_sf_plf/2019-03-17-sf-plf-17-use-tactics.md b/_posts/read_sf_plf/2019-03-17-sf-plf-17-use-tactics.md deleted file mode 100644 index 3c3beb3..0000000 --- a/_posts/read_sf_plf/2019-03-17-sf-plf-17-use-tactics.md +++ /dev/null @@ -1,160 +0,0 @@ ---- -title: "「SF-PLF」17 UseTactics" -subtitle: "Programming Language Foundations - Tactic Library For Coq" -layout: post -author: "Hux" -header-style: text -hidden: true -tags: - - SF (软件基础) - - PLF (编程语言基础) - - Coq - - 笔记 ---- - -```coq -From PLF Require Import LibTactics. -``` - -`LibTactics` vs. `SSReflect` (another tactics package) - -- for PL vs. for math -- traditional vs. rethinks..so harder - - -Tactics for Naming and Performing Inversion -------------------------------------------- - -### `introv` - -```coq -Theorem ceval_deterministic: ∀c st st1 st2, - st =[ c ]⇒ st1 → - st =[ c ]⇒ st2 → - st1 = st2. -intros c st st1 st2 E1 E2. (* 以往如果想给 Hypo 命名必须说全 *) -introv E1 E2. (* 现在可以忽略 forall 的部分 *) -``` - -### `inverts` - -```coq -(* was... 需要 subst, clear *) -- inversion H. subst. inversion H2. subst. -(* now... *) -- inverts H. inverts H2. - - -(* 可以把 invert 出来的东西放在 goal 的位置让你自己用 intro 命名!*) -inverts E2 as. -``` - - - - - - - -Tactics for N-ary Connectives ------------------------------ - -> Because Coq encodes conjunctions and disjunctions using binary constructors ∧ and ∨... -> to work with a `N`-ary logical connectives... - -### `splits` - -> n-ary conjunction - -n-ary `split` - - -### `branch` - -> n-ary disjunction - -faster `destruct`? - - - - - - -Tactics for Working with Equality ---------------------------------- - - -### `asserts_rewrite` and `cuts_rewrite` - - -### `substs` - -better `subst` - not fail on circular eq - - -### `fequals` - -vs `f_equal`? - - -### `applys_eq` - -variant of `eapply` - - - - - -Some Convenient Shorthands --------------------------- - - -### `unfolds` - -better `unfold` - - -### `false` and `tryfalse` - -better `exfalso` - - -### `gen` - -shorthand for `generalize dependent`, multiple arg. - -```coq -(* old *) -intros Gamma x U v t S Htypt Htypv. -generalize dependent S. generalize dependent Gamma. - -(* new...so nice!!! *) -introv Htypt Htypv. gen S Gamma. -``` - - -### `admits`, `admit_rewrite` and `admit_goal` - -wrappers around `admit` - - -### `sort` - -> proof context more readable - -vars -> top -hypotheses -> bottom - - - - - - - -Tactics for Advanced Lemma Instantiation ----------------------------------------- - - -### Working on `lets` - -### Working on `applys`, `forwards` and `specializes` - diff --git a/_posts/read_sf_plf/2019-03-18-sf-plf-18-use-auto.md b/_posts/read_sf_plf/2019-03-18-sf-plf-18-use-auto.md deleted file mode 100644 index 0287e48..0000000 --- a/_posts/read_sf_plf/2019-03-18-sf-plf-18-use-auto.md +++ /dev/null @@ -1,70 +0,0 @@ ---- -title: "「SF-PLF」18 UseAuto" -subtitle: "Programming Language Foundations - Theory And Practice Of Automation In Coq Proofs" -layout: post -author: "Hux" -header-style: text -hidden: true -tags: - - SF (软件基础) - - PLF (编程语言基础) - - Coq - - 笔记 ---- - - - -## Basic Features of Proof Search - -### Strength of Proof Search - -> four proof-search tactics: `auto`, `eauto`, `iauto` and `jauto`. - - - - ---- - - -## How Proof Search Works - -### Search Depth - -### Backtracking - -### Adding Hints - -### Integration of Automation in Tactics - - - ---- - - - -## Example Proofs - - - ---- - - - -## Advanced Topics in Proof Search - - -### - - ---- - - -## Decision Procedures - - -### Omega - -### Ring - -### Congurence - diff --git a/_posts/read_sf_plf/2019-03-19-sf-plf-19-partial-eval.md b/_posts/read_sf_plf/2019-03-19-sf-plf-19-partial-eval.md deleted file mode 100644 index 47666ea..0000000 --- a/_posts/read_sf_plf/2019-03-19-sf-plf-19-partial-eval.md +++ /dev/null @@ -1,15 +0,0 @@ ---- -title: "「SF-PLF」19 PE" -subtitle: "Programming Language Foundations - Partial Evaluation" -layout: post -author: "Hux" -header-style: text -hidden: true -tags: - - SF (软件基础) - - PLF (编程语言基础) - - Coq - - 笔记 ---- - -TBD diff --git a/_posts/read_sf_qc/2019-09-02-sf-qc-02-typeclasses.md b/_posts/read_sf_qc/2019-09-02-sf-qc-02-typeclasses.md deleted file mode 100644 index 79acd51..0000000 --- a/_posts/read_sf_qc/2019-09-02-sf-qc-02-typeclasses.md +++ /dev/null @@ -1,796 +0,0 @@ ---- -title: "「SF-QC」2 TypeClasses" -subtitle: "Quickcheck - A Tutorial on Typeclasses in Coq" -layout: post -author: "Hux" -header-style: text -hidden: true -tags: - - SF (软件基础) - - QC (Quickcheck) - - Coq - - 笔记 ---- - -Considerring printing different types with this common idiom: - -```coq -showBool : bool → string -showNat : nat → string -showList : {A : Type} (A → string) → (list A) → string -showPair : {A B : Type} (A → string) → (B → string) → A * B → string - -Definition showListOfPairsOfNats := showList (showPair showNat showNat) (* LOL *) -``` - -> The designers of Haskell addressed this clunkiness through _typeclasses_, a mechanism by which the typechecker is instructed to automatically construct "type-driven" functions [Wadler and Blott 1989]. - -Coq followed Haskell's lead as well, but - -> because Coq's type system is so much richer than that of Haskell, and because typeclasses in Coq are used to automatically construct not only programs but also proofs, Coq's presentation of typeclasses is quite a bit less "transparent" - - -Basics ------- - -### Classes and Instances - -```coq -Class Show A : Type := { - show : A → string -}. - -Instance showBool : Show bool := { - show := fun b:bool ⇒ if b then "true" else "false" -}. -``` - -Comparing with Haskell: - -```haskell -class Show a where - show :: a -> string - --- you cannot override a `instance` so in reality you need a `newtype` wrapper to do this -instance Show Bool where - show b = if b then "True" else "Fasle" -``` - -> The show function is sometimes said to be overloaded, since it can be applied to arguments of many types, with potentially radically different behavior depending on the type of its argument. - - -Next, we can define functions that use the overloaded function show like this: - -```coq -Definition showOne {A : Type} `{Show A} (a : A) : string := - "The value is " ++ show a. - -Compute (showOne true). -Compute (showOne 42). - -Definition showTwo {A B : Type} - `{Show A} `{Show B} (a : A) (b : B) : string := - "First is " ++ show a ++ " and second is " ++ show b. - -Compute (showTwo true 42). -Compute (showTwo Red Green). -``` - -> The parameter `` `{Show A}`` is a _class constraint_, which states that the function showOne is expected to be applied only to types A that belong to the Show class. - -> Concretely, this constraint should be thought of as an _extra parameter_ to showOne supplying _evidence_ that A is an instance of Show — i.e., it is essentially just a show function for A, which is implicitly invoked by the expression show a. - -读时猜测(后来发现接下来有更正确的解释):`show` 在 name resolution 到 `class Show` 时就可以根据其参数的 type(比如 `T`)infer 出「我们需要一个 `Show T` 的实现(`instance`,其实就是个 table)」,在 Haskell/Rust 中这个 table 会在 lower 到 IR 时才 made explicit,而 Coq 这里的语法就已经强调了这里需要 implicitly-and-inferred `{}` 一个 table,这个 table 的名字其实不重要,只要其 type 是被 `A` parametrized 的 `Show` 就好了,类似 ML 的 `functor` 或者 Java 的 generic `interface`。 - -This is _Ad-hoc polymorphism_. - - -#### Missing Constraint - -What if we forget the class constrints: - -```coq -Error: -Unable to satisfy the following constraints: -In environment: -A : Type -a : A - -?Show : "Show A" -``` - - -#### Class `Eq` - -```coq -Class Eq A := - { - eqb: A → A → bool; - }. - -Notation "x =? y" := (eqb x y) (at level 70). - -Instance eqBool : Eq bool := - { - eqb := fun (b c : bool) ⇒ - match b, c with - | true, true ⇒ true - | true, false ⇒ false - | false, true ⇒ false - | false, false ⇒ true - end - }. - -Instance eqNat : Eq nat := - { - eqb := Nat.eqb - }. -``` - -> Why should we need to define a typeclass for boolean equality when _Coq's propositional equality_ (`x = y`) is completely generic? -> while it makes sense to _claim_ that two values `x` and `y` are equal no matter what their type is, it is not possible to write a _decidable equality checker_ for arbitrary types. In particular, equality at types like `nat → nat` is undecidable. - -`x = y` 返回一个需要去证的 `Prop` (relational) 而非 executable `Fixpoint` (functional) -因为 function 的 equality 有时候会 undeciable,所以才需要加 Functional Extensionality `Axiom`(见 LF-06) - -```coq -Instance eqBoolArrowBool: Eq (bool -> bool) := - { - eqb := fun (f1 f2 : bool -> bool) => - (f1 true) =? (f2 true) && (f1 false) =? (f2 false) - }. - -Compute (id =? id). (* ==> true *) -Compute (negb =? negb). (* ==> true *) -Compute (id =? negb). (* ==> false *) -``` - -这里这个 `eqb` 的定义也是基于 extensionality 的定义,如果考虑到 effects(divergence、IO)是很容易 break 的(类似 parametricity) - - - -### Parameterized Instances: New Typeclasses from Old - -Structural recursion - -```coq -Instance showPair {A B : Type} `{Show A} `{Show B} : Show (A * B) := - { - show p := - let (a,b) := p in - "(" ++ show a ++ "," ++ show b ++ ")" - }. -Compute (show (true,42)). -``` - -Structural equality - -```coq -Instance eqPair {A B : Type} `{Eq A} `{Eq B} : Eq (A * B) := - { - eqb p1 p2 := - let (p1a,p1b) := p1 in - let (p2a,p2b) := p2 in - andb (p1a =? p2a) (p1b =? p2b) - }. -``` - -Slightly more complicated example: typical list: - -```coq -(* the book didn't use any from ListNotation *) -Fixpoint showListAux {A : Type} (s : A → string) (l : list A) : string := - match l with - | nil ⇒ "" - | cons h nil ⇒ s h - | cons h t ⇒ append (append (s h) ", ") (showListAux s t) - end. -Instance showList {A : Type} `{Show A} : Show (list A) := - { - show l := append "[" (append (showListAux show l) "]") - }. - -(* I used them though *) -Fixpoint eqListAux {A : Type} `{Eq A} (l1 l2 : list A) : bool := - match l1, l2 with - | nil, nil => true - | (h1::t1), (h2::t2) => (h1 =? h2) && (eqListAux t1 t2) - | _, _ => false - end. - -Instance eqList {A : Type} `{Eq A} : Eq (list A) := - { - eqb l1 l2 := eqListAux l1 l2 - }. -``` - - - -### Class Hierarchies - -> we might want a typeclass `Ord` for "ordered types" that support both equality and a less-or-equal comparison operator. - -A bad way would be declare a new class with two func `eq` and `le`. - -It's better to establish dependencies between typeclasses, similar with OOP `class` inheritence and subtyping (but better!), this gave good code reuses. - -> We often want to organize typeclasses into hierarchies. - -```coq -Class Ord A `{Eq A} : Type := - { - le : A → A → bool - }. -Check Ord. (* ==> -Ord - : forall A : Type, Eq A -> Type -*) -``` - -class `Eq` is a "super(type)class" of `Ord` (not to be confused with OOP superclass) - -This is _Sub-typeclassing_. - -```coq -Fixpoint listOrdAux {A : Type} `{Ord A} (l1 l2 : list A) : bool := - match l1, l2 with - | [], _ => true - | _, [] => false - | h1::t1, h2::t2 => if (h1 =? h2) - then (listOrdAux t1 t2) - else (le h1 h2) - end. - -Instance listOrd {A : Type} `{Ord A} : Ord (list A) := - { - le l1 l2 := listOrdAux l1 l2 - }. - -(* truthy *) -Compute (le [1] [2]). -Compute (le [1;2] [2;2]). -Compute (le [1;2;3] [2]). - -(* falsy *) -Compute (le [1;2;3] [1]). -Compute (le [2] [1;2;3]). -``` - - - -How It works ------------- - -### Implicit Generalization - -所以 `` `{...}`` 这个 "backtick" notation is called _implicit generalization_,比 implicit `{}` 多做了一件自动 generalize 泛化 free varabile 的事情。 - -> that was added to Coq to support typeclasses but that can also be used to good effect elsewhere. - -```coq -Definition showOne1 `{Show A} (a : A) : string := - "The value is " ++ show a. - -Print showOne1. -(* ==> - showOne1 = - fun (A : Type) (H : Show A) (a : A) => "The value is " ++ show a - : forall A : Type, Show A -> A -> string - - Arguments A, H are implicit and maximally inserted -*) -``` - -> notice that the occurrence of `A` inside the `` `{...}`` is unbound and automatically insert the binding that we wrote explicitly before. - -> The "implicit and maximally generalized" annotation on the last line means that the automatically inserted bindings are treated (注:printed) as if they had been written with `{...}`, rather than `(...)`. - -> The "implicit" part means that the type argument `A` and the `Show` witness `H` are usually expected to be left implicit -> whenever we write `showOne1`, Coq will automatically insert two _unification variables_ as the first two arguments. - -> This automatic insertion can be disabled by writing `@`, so a bare occurrence of `showOne1` means the same as `@showOne1 _ _` - -这里的 witness `H` 即 `A` implements `Show` 的 evidence,本质就是个 table or record,可以 written more explicitly: - -```coq -Definition showOne2 `{_ : Show A} (a : A) : string := - "The value is " ++ show a. - -Definition showOne3 `{H : Show A} (a : A) : string := - "The value is " ++ show a. -``` - -甚至 - -```coq -Definition showOne4 `{Show} a : string := - "The value is " ++ show a. -``` - -```coq -showOne = -fun (A : Type) (H : Show A) (a : A) => "The value is " ++ show a - : forall A : Type, Show A -> A -> string - -Set Printing Implicit. - -showOne = -fun (A : Type) (H : Show A) (a : A) => "The value is " ++ @show A H a (* <-- 注意这里 *) - : forall A : Type, Show A -> A -> string -``` - -#### vs. Haskell - -顺便,Haskell 的话,`Show` 是可以直接 inferred from the use of `show` 得 - -```haskell -Prelude> showOne a = show a -Prelude> :t showOne -showOne :: Show a => a -> String -``` - -但是 Coq 不行,会退化上「上一个定义的 instance Show」,还挺奇怪的( - -```coq -Definition showOne5 a : string := (* not generalized *) - "The value is " ++ show a. -``` - -#### Free Superclass Instance - -``{Ord A}` led Coq to fill in both `A` and `H : Eq A` because it's the superclass of `Ord` (appears as the second argument). - -```coq -Definition max1 `{Ord A} (x y : A) := - if le x y then y else x. - -Set Printing Implicit. -Print max1. -(* ==> - max1 = - fun (A : Type) (H : Eq A) (H0 : @Ord A H) (x y : A) => - if @le A H H0 x y then y else x - - : forall (A : Type) (H : Eq A), - @Ord A H -> A -> A -> A -*) -Check Ord. -(* ==> Ord : forall A : Type, Eq A -> Type *) -``` - -`Ord` type 写详细的话可以是: - -```coq -Ord : forall (A : Type), (H: Eq A) -> Type -``` - - -#### Other usages of `` `{} `` - -Implicit generalized `Prop` mentioning free vars. - -```coq -Generalizable Variables x y. - -Lemma commutativity_property : `{x + y = y + x}. -Proof. intros. omega. Qed. - -Check commutativity_property. -``` - -Implicit generalized `fun`/`λ`, however... - -```coq -Definition implicit_fun := `{x + y}. -Compute (implicit_fun 2 3) (* ==> Error *) -Compute (@implicit_fun 2 3) -``` - -Implicitly-generalized but inserted as explicit via `` `(...)`` - -```coq -Definition implicit_fun := `(x + y). -Compute (implicit_fun 2 3) -``` - -这里可以看到 Coq 的所有语法都是正交的(非常牛逼……) -- `()`/`{}` 控制是否是 implicit argument -- `` ` ``-prefix 控制是否做 implicit generalization - - N.B. 可能你忘记了但是 `→` is degenerated `∀` (`Π`),所以 generalization 自然会生成 `fun` - - -### Records are Products - -> Record types must be declared before they are used. For example: - -```coq -Record Point := - Build_Point - { - px : nat; - py : nat - }. - -(* built with constructor *) -Check (Build_Point 2 4). - -(* built with record syntax *) -Check {| px := 2; py := 4 |}. -Check {| py := 2; px := 4 |}. - -(* field access, with a clunky "dot notation" *) -Definition r : Point := {| px := 2; py := 4 |}. -Compute (r.(px) + r.(py)). -``` - -和 OCaml 一样是 nominal typing 而非 structural typing。 -类似于 OCaml 中的 record 其实到 backend 了就会和 tuple 等价:都会 lower 到 Heap Block), -Coq 中的 Record 其实和 Pair/Product 也是等价:都是 arity 为 2 的 Inductive type: - -```coq -Inductive Point : Set := - | Build_Point : nat → nat → Point. -``` - -我仿造 `Print px.` 输出的定义模拟了一下: - -```coq -Inductive Point2 : Set := - | Build_Point2 (px2:nat) (py2:nat). -Definition px2 := fun p : Point2 => let (px, _) := p in px. -Definition py2 := fun p : Point2 => let (_, py) := p in py. - -Definition r2 : Point2 := Build_Point2 2 4. -Compute (r2.(px2) + r2.(py2)). (* => 6 *) - -Definition r2 : Point2 := {| px2 := 2; py2 := 4 |}. (* Error: px2 is not a projection *) -``` - -可以发现 dot notation 是可以工作的,`.` 应该只是一个 pipe -但是 `{|...|}` 不知道为什么这里会认为 `px2` 不是一个 record projection. - - -> Note that the field names have to be different. Any given field name can belong to only one record type. -> This greatly simplifies type inference! - - -### Typeclasses are Records - -> Typeclasses and instances, in turn, are basically just syntactic sugar for record types and values (together with a bit of magic for using proof search to fill in appropriate instances during typechecking... - -> Internally, a typeclass declaration is elaborated into a _parameterized_ `Record` declaration: - -```coq -Class Show A : Type := { show : A → string }. - -Print Show. -Record Show (A : Type) : Type := - Build_Show { show : A -> string } - -Set Printing All. -Print Show. -Variant Show (A : Type) : Type := - Build_Show : forall _ : forall _ : A, string, Show A - -(* to make it more clear... *) -Inductive Show (A : Type) : Type := - | Build_Show : ∀(show : ∀(a : A), string), Show A - -(* or more GADT looking, i.e., implicit generalized *) -Inductive Show (A : Type) : Type := - | Build_Show : (A -> string) -> Show A -``` - -Coq actually call a single-field record `Variant`. -Well actually, I found it's for any single-constructor `Inductive`ly constructed type. -You can even use `Variant` nonchangbly with `Inductive` as a keyword... - -```coq -Set Printing All. -Print Point. -Variant Point : Set := - Build_Point : forall (_ : nat) (_ : nat), Point -``` - -> Analogously, Instance declarations become record values: - -```coq -Print showNat. -showNat = {| show := string_of_nat |} - : Show nat -``` - -> Similarly, overloaded functions like show are really just _record projections_, which in turn are just functions that select a particular argument of a one-constructor Inductive type. - -```coq -Print show. -show = - fun (A : Type) (Show0 : Show A) => - let (show) := Show0 in show - : forall A : Type, Show A -> A -> string - -Set Printing All. -Print show. -show = - fun (A : Type) (Show0 : Show A) => - match Show0 return (forall _ : A, string) with - | Build_Show _ show => show - end - : forall (A : Type) (_ : Show A) (_ : A), string -``` - - -### Inferring Instances - -> appropriate instances are automatically inferred (and/or constructed!) during typechecking. - -```coq -Definition eg42 := show 42. - -Set Printing Implicit. -Print eg42. -eg42 = @show nat showNat 42 : string -``` - -different with `Compute`, `Print` 居然还可以这么把所有 implicit argument (after inferred) 都给 print 出来…… - -type inferrence: - -- `show` is expanded to `@show _ _ 42` -- obviously it's `@show nat __42` -- obviously it's `@show nat (?H : Show Nat) 42` - -Okay now where to find this witness/evidence/instance/record/table/you-name-it `?H` - -> It attempts to find or construct such a value using a _variant of the `eauto` proof search_ procedure that refers to a "hint database" called `typeclass_instances`. - -```coq -Print HintDb typeclass_instances. (* too much to be useful *) -``` - -"hint database" to me is better understood as a reverse of environment or typing context `Γ`. Though specialized with only `Instance` there. -(这么一看实现一个 Scala 的 `Implicit` 也不难啊) - -Coq can even print what's happening during this proof search! - -```coq -Set Typeclasses Debug. -Check (show 42). -(* ==> - Debug: 1: looking for (Show nat) without backtracking - Debug: 1.1: exact showNat on (Show nat), 0 subgoal(s) -*) - -Check (show (true,42)). -(* ==> - Debug: 1: looking for (Show (bool * nat)) without backtracking - Debug: 1.1: simple apply @showPair on (Show (bool * nat)), 2 subgoal(s) - Debug: 1.1.3 : (Show bool) - Debug: 1.1.3: looking for (Show bool) without backtracking - Debug: 1.1.3.1: exact showBool on (Show bool), 0 subgoal(s) - Debug: 1.1.3 : (Show nat) - Debug: 1.1.3: looking for (Show nat) without backtracking - Debug: 1.1.3.1: exact showNat on (Show nat), 0 subgoal(s) *) -Unset Typeclasses Debug. -``` - -> In summary, here are the steps again: - -```coq -show 42 - ===> { Implicit arguments } -@show _ _ 42 - ===> { Typing } -@show (?A : Type) (?Show0 : Show ?A) 42 - ===> { Unification } -@show nat (?Show0 : Show nat) 42 - ===> { Proof search for Show Nat returns showNat } -@show nat showNat 42 -``` - - -Typeclasses and Proofs ----------------------- - -### Propositional Typeclass Members - -```coq -Class EqDec (A : Type) {H : Eq A} := - { - eqb_eq : ∀ x y, x =? y = true ↔ x = y - }. -``` - -```coq -Instance eqdecNat : EqDec nat := - { - eqb_eq := Nat.eqb_eq - }. -``` - -这里可以用于抽象 LF-07 的 reflection - - -### Substructures - -> Naturally, it is also possible to have typeclass instances as members of other typeclasses: these are called _substructures_. - -这里的 `relation` 来自 Prelude 不过和 LF-11 用法一样: - -```coq -Require Import Coq.Relations.Relation_Definitions. -Class Reflexive (A : Type) (R : relation A) := - { - reflexivity : ∀ x, R x x - }. -Class Transitive (A : Type) (R : relation A) := - { - transitivity : ∀ x y z, R x y → R y z → R x z - }. -``` - -```coq -Class PreOrder (A : Type) (R : relation A) := - { PreOrder_Reflexive :> Reflexive A R ; - PreOrder_Transitive :> Transitive A R }. -``` - -> The syntax `:>` indicates that each `PreOrder` can be seen as a `Reflexive` and `Transitive` relation, so that, any time a reflexive relation is needed, a preorder can be used instead. - -这里的 `:>` 方向和 subtyping 的 _subsumption_ 是反着的……跟 SML 的 ascription `:>` 一样…… - -- subtyping `T :> S` : value of `S` can safely be used as value of `T` -- ascription `P :> R` : value of `P` can safely be used as value of `R` - -Why? - - - -Some Useful Typeclasses ------------------------ - -### `Dec` - -> The `ssreflect` library defines what it means for a proposition `P` to be _decidable_ like this... - -```coq -Require Import ssreflect ssrbool. -Print decidable. -(* ==> - decidable = fun P : Prop => {P} + {~ P} -*) -``` - -> .. where `{P} + {¬ P}` is an "informative disjunction" of `P` and `¬P`. - -即两个 evidence(参考 LF-07) - -```coq -Class Dec (P : Prop) : Type := - { - dec : decidable P - }. -``` - -### Monad - -> In Haskell, one place typeclasses are used very heavily is with the Monad typeclass, especially in conjunction with Haskell's "do notation" for monadic actions. - -> Monads are an extremely powerful tool for organizing and streamlining code in a wide range of situations where computations can be thought of as yielding a result along with some kind of "effect." - -说话很严谨「in a wide range of situations where ... "effect"」 - -> most older projects simply define their own monads and monadic notations — sometimes typeclass-based, often not — while newer projects use one of several generic libraries for monads. Our current favorite (as of Summer 2017) is the monad typeclasses in Gregory Malecha's `ext-lib` package: - - - -```coq -Require Export ExtLib.Structures.Monads. -Export MonadNotation. -Open Scope monad_scope. -``` - -```coq -Class Monad (M : Type → Type) : Type := { - ret : ∀ {T : Type}, T → M T ; - bind : ∀ {T U : Type}, M T → (T → M U) → M U -}. - -Instance optionMonad : Monad option := { - ret T x := Some x ; - bind T U m f := - match m with - None ⇒ None - | Some x ⇒ f x - end -}. -``` - -Compare with Haskell: - -```haskell -class Applicative m => Monad (m :: * -> *) where - return :: a -> m a - (>>=) :: m a -> (a -> m b) -> m b - -instance Monad Maybe where - return = Just - (>>=) = (>>=) - where - (>>=) :: Maybe a -> (a -> Maybe b) -> Maybe b - Nothing >>= _ = Nothing - (Just x) >>= f = f x -``` - -After mimic `do` notation: (as PLF-11) - -```coq -Definition sum3 (l : list nat) : option nat := - x0 <- nth_opt 0 l ;; - x1 <- nth_opt 1 l ;; - x2 <- nth_opt 2 l ;; - ret (x0 + x1 + x2). -``` - - -Controlling Instantiation -------------------------- - -### "Defaulting" - -Would better explicitly typed. searching can be stupid - -### Manipulating the Hint Database - -> One of the ways in which Coq's typeclasses differ most from Haskell's is the lack, in Coq, of an automatic check for "overlapping instances." - -在 Haskell 中一大 use case 是可以做类似 C++ 的 partial specification(偏特化) - -- Check out [this](https://kseo.github.io/posts/2017-02-05-avoid-overlapping-instances-with-closed-type-families.html) on the pros and cons of overlapping instances in Haskell -- Check out [this] (https://www.ibm.com/developerworks/community/blogs/12bb75c9-dfec-42f5-8b55-b669cc56ad76/entry/c__e6_a8_a1_e6_9d_bf__e7_a9_b6_e7_ab_9f_e4_bb_80_e4_b9_88_e6_98_af_e7_89_b9_e5_8c_96?lang=en) on template partial specification in C++ - -> That is, it is completely legal to define a given type to be an instance of a given class in two different ways. -> When this happens, it is unpredictable which instance will be found first by the instance search process; - -Workarounds in Coq when this happen: -1. removing instances from hint database -2. priorities - - - -Debugging ---------- - -TBD. - -- Instantiation Failures -- Nontermination - - -Alternative Structuring Mechanisms ----------------------------------- - -_large-scale structuring mechanisms_ - -> Typeclasses are just one of several mechanisms that can be used in Coq for structuring large developments. Others include: -> -> - canonical structures -> - bare dependent records -> - modules and functors - -Module and functors is very familiar! - - -Further Reading ----------------------------------- - -On the origins of typeclasses in Haskell: - -- How to make ad-hoc polymorphism less ad hoc Philip Wadler and Stephen Blott. 16'th Symposium on Principles of Programming Languages, ACM Press, Austin, Texas, January 1989. - - -The original paper on typeclasses In Coq: - -- Matthieu Sozeau and Nicolas Oury. First-Class Type Classes. TPHOLs 2008. - -