diff --git a/content/assignments/async-web-server/src/http-parser/README.md b/content/assignments/async-web-server/src/http-parser/README.md
index 57b3831c19..e7069ecf6a 100644
--- a/content/assignments/async-web-server/src/http-parser/README.md
+++ b/content/assignments/async-web-server/src/http-parser/README.md
@@ -10,23 +10,22 @@ stream (in a web server that is per connection).
 
 Features:
 
-  * No dependencies
-  * Handles persistent streams (keep-alive).
-  * Decodes chunked encoding.
-  * Upgrade support
-  * Defends against buffer overflow attacks.
+- No dependencies
+- Handles persistent streams (keep-alive).
+- Decodes chunked encoding.
+- Upgrade support
+- Defends against buffer overflow attacks.
 
 The parser extracts the following information from HTTP messages:
 
-  * Header fields and values
-  * Content-Length
-  * Request method
-  * Response status code
-  * Transfer-Encoding
-  * HTTP version
-  * Request path, query string, fragment
-  * Message body
-
+- Header fields and values
+- Content-Length
+- Request method
+- Response status code
+- Transfer-Encoding
+- HTTP version
+- Request path, query string, fragment
+- Message body
 
 Usage
 -----
@@ -35,37 +34,41 @@ One `http_parser` object is used per TCP connection. Initialize the struct
 using `http_parser_init()` and set the callbacks. That might look something
 like this for a request parser:
 
-    http_parser_settings settings;
-    settings.on_path = my_path_callback;
-    settings.on_header_field = my_header_field_callback;
-    /* ... */
+```c
+http_parser_settings settings;
+settings.on_path = my_path_callback;
+settings.on_header_field = my_header_field_callback;
+/* ... */
 
-    http_parser *parser = malloc(sizeof(http_parser));
-    http_parser_init(parser, HTTP_REQUEST);
-    parser->data = my_socket;
+http_parser *parser = malloc(sizeof(http_parser));
+http_parser_init(parser, HTTP_REQUEST);
+parser->data = my_socket;
+```
 
 When data is received on the socket execute the parser and check for errors.
 
-    size_t len = 80*1024, nparsed;
-    char buf[len];
-    ssize_t recved;
+```c
+size_t len = 80*1024, nparsed;
+char buf[len];
+ssize_t recved;
 
-    recved = recv(fd, buf, len, 0);
+recved = recv(fd, buf, len, 0);
 
-    if (recved < 0) {
-      /* Handle error. */
-    }
+if (recved < 0) {
+  /* Handle error. */
+}
 
-    /* Start up / continue the parser.
-     * Note we pass recved==0 to signal that EOF has been received.
-     */
-    nparsed = http_parser_execute(parser, &settings, buf, recved);
+/* Start up / continue the parser.
+ * Note we pass recved==0 to signal that EOF has been received.
+ */
+nparsed = http_parser_execute(parser, &settings, buf, recved);
 
-    if (parser->upgrade) {
-      /* handle new protocol */
-    } else if (nparsed != recved) {
-      /* Handle error. Usually just close the connection. */
-    }
+if (parser->upgrade) {
+  /* handle new protocol */
+} else if (nparsed != recved) {
+  /* Handle error. Usually just close the connection. */
+}
+```
 
 HTTP needs to know where the end of the stream is. For example, sometimes
 servers send responses without Content-Length and expect the client to
@@ -84,7 +87,6 @@ The parser decodes the transfer-encoding for both requests and responses
 transparently. That is, a chunked encoding is decoded before being sent to
 the on_body callback.
 
-
 The Special Problem of Upgrade
 ------------------------------
 
@@ -92,42 +94,41 @@ HTTP supports upgrading the connection to a different protocol. An
 increasingly common example of this is the Web Socket protocol which sends
 a request like
 
-    GET /demo HTTP/1.1
-    Upgrade: WebSocket
-    Connection: Upgrade
-    Host: example.com
-    Origin: http://example.com
-    WebSocket-Protocol: sample
+```plain
+GET /demo HTTP/1.1
+Upgrade: WebSocket
+Connection: Upgrade
+Host: example.com
+Origin: http://example.com
+WebSocket-Protocol: sample
+```
 
 followed by non-HTTP data.
 
-(See http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-75 for more
+(See <http://tools.ietf.org/html/draft-hixie-thewebsocketprotocol-75> for more
 information the Web Socket protocol.)
 
-To support this, the parser will treat this as a normal HTTP message without a
-body. Issuing both on_headers_complete and on_message_complete callbacks. However
-http_parser_execute() will stop parsing at the end of the headers and return.
-
-The user is expected to check if `parser->upgrade` has been set to 1 after
-`http_parser_execute()` returns. Non-HTTP data begins at the buffer supplied
-offset by the return value of `http_parser_execute()`.
+To support this, the parser will treat this as a normal HTTP message without a body.
+Issuing both on_headers_complete and on_message_complete callbacks.
+However http_parser_execute() will stop parsing at the end of the headers and return.
+The user is expected to check if `parser->upgrade` has been set to 1 after `http_parser_execute()` returns.
+Non-HTTP data begins at the buffer supplied offset by the return value of `http_parser_execute()`.
 
 
 Callbacks
 ---------
 
-During the `http_parser_execute()` call, the callbacks set in
-`http_parser_settings` will be executed. The parser maintains state and
-never looks behind, so buffering the data is not necessary. If you need to
-save certain data for later usage, you can do that from the callbacks.
+During the `http_parser_execute()` call, the callbacks set in `http_parser_settings` will be executed.
+The parser maintains state and never looks behind, so buffering the data is not necessary.
+If you need to save certain data for later usage, you can do that from the callbacks.
 
 There are two types of callbacks:
 
-* notification `typedef int (*http_cb) (http_parser*);`
-    Callbacks: on_message_begin, on_headers_complete, on_message_complete.
-* data `typedef int (*http_data_cb) (http_parser*, const char *at, size_t length);`
-    Callbacks: (requests only) on_path, on_query_string, on_uri, on_fragment,
-               (common) on_header_field, on_header_value, on_body;
+- notification `typedef int (*http_cb) (http_parser*);`
+  Callbacks: on_message_begin, on_headers_complete, on_message_complete.
+- data `typedef int (*http_data_cb) (http_parser*, const char *at, size_t length);`
+  Callbacks: (requests only) on_path, on_query_string, on_uri, on_fragment,
+  (common) on_header_field, on_header_value, on_body;
 
 Callbacks must return 0 on success. Returning a non-zero value indicates
 error to the parser, making it exit immediately.
@@ -142,30 +143,29 @@ Reading headers may be a tricky task if you read/parse headers partially.
 Basically, you need to remember whether last header callback was field or value
 and apply following logic:
 
-    (on_header_field and on_header_value shortened to on_h_*)
-     ------------------------ ------------ --------------------------------------------
-    | State (prev. callback) | Callback   | Description/action                         |
-     ------------------------ ------------ --------------------------------------------
-    | nothing (first call)   | on_h_field | Allocate new buffer and copy callback data |
-    |                        |            | into it                                    |
-     ------------------------ ------------ --------------------------------------------
-    | value                  | on_h_field | New header started.                        |
-    |                        |            | Copy current name,value buffers to headers |
-    |                        |            | list and allocate new buffer for new name  |
-     ------------------------ ------------ --------------------------------------------
-    | field                  | on_h_field | Previous name continues. Reallocate name   |
-    |                        |            | buffer and append callback data to it      |
-     ------------------------ ------------ --------------------------------------------
-    | field                  | on_h_value | Value for current header started. Allocate |
-    |                        |            | new buffer and copy callback data to it    |
-     ------------------------ ------------ --------------------------------------------
-    | value                  | on_h_value | Value continues. Reallocate value buffer   |
-    |                        |            | and append callback data to it             |
-     ------------------------ ------------ --------------------------------------------
-
+(on_header_field and on_header_value shortened to on_h_*)
+ ------------------------ ------------ --------------------------------------------
+| State (prev. callback) | Callback   | Description/action                         |
+ ------------------------ ------------ --------------------------------------------
+| nothing (first call)   | on_h_field | Allocate new buffer and copy callback data |
+|                        |            | into it                                    |
+ ------------------------ ------------ --------------------------------------------
+| value                  | on_h_field | New header started.                        |
+|                        |            | Copy current name,value buffers to headers |
+|                        |            | list and allocate new buffer for new name  |
+ ------------------------ ------------ --------------------------------------------
+| field                  | on_h_field | Previous name continues. Reallocate name   |
+|                        |            | buffer and append callback data to it      |
+ ------------------------ ------------ --------------------------------------------
+| field                  | on_h_value | Value for current header started. Allocate |
+|                        |            | new buffer and copy callback data to it    |
+ ------------------------ ------------ --------------------------------------------
+| value                  | on_h_value | Value continues. Reallocate value buffer   |
+|                        |            | and append callback data to it             |
+ ------------------------ ------------ --------------------------------------------
 
 See examples of reading in headers:
 
-* [partial example](http://gist.github.com/155877) in C
-* [from http-parser tests](http://github.com/ry/http-parser/blob/37a0ff8928fb0d83cec0d0d8909c5a4abcd221af/test.c#L403) in C
-* [from Node library](http://github.com/ry/node/blob/842eaf446d2fdcb33b296c67c911c32a0dabc747/src/http.js#L284) in Javascript
+- [partial example](http://gist.github.com/155877) in C
+- [from http-parser tests](http://github.com/ry/http-parser/blob/37a0ff8928fb0d83cec0d0d8909c5a4abcd221af/test.c#L403) in C
+- [from Node library](http://github.com/ry/node/blob/842eaf446d2fdcb33b296c67c911c32a0dabc747/src/http.js#L284) in Javascript