CHANGELOG.md

4.0.5

September 06, 2018

Special thanks to Kirill Zhumarin for PRs.

4.0.4

January 08, 2018

  • Updated CMakeLists.txt for the CMake build. Added support for generating a Visual Studio solution and for creating packages for Linux systems. lexborisov#116
  • Fixed a segfault when a closing tag is present without a matching opening tag. lexborisov#124
  • Fixed cmake install path. lexborisov#126
  • Fixed rpm changelog date
  • Minor bug fixes

4.0.3

January 08, 2018

  • Delete

4.0.2

November 07, 2017

  • Spelling: renamed functions, _pasition => _position
  • Fixed an infinite loop when the HTML file is too big (queue rounding did not work properly). lexborisov#117
  • Added a new function, myhtml_node_is_void_element, to check whether a node is a void element. lexborisov#119
  • Fixed a potential loss of the pointer on systems other than x86, x86_64 (Misaligned Integer Pointer)
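
As a rough illustration of what a void-element check involves, the sketch below tests a tag name against the void elements listed in the WHATWG HTML Standard. It is not MyHTML's implementation: the helper name is_void_element_name and the name table are assumptions made for this example, and myhtml_node_is_void_element presumably takes a tree node rather than a string.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* Void elements as listed in the WHATWG HTML Standard.
 * Assumption: this list is for illustration only; the table
 * used inside MyHTML may differ. */
const char *const VOID_ELEMENT_NAMES[] = {
    "area", "base", "br", "col", "embed", "hr", "img",
    "input", "link", "meta", "source", "track", "wbr"
};

/* Hypothetical helper (not part of the MyHTML API): returns true if
 * `tag_name`, given in lowercase, names a void element. */
bool is_void_element_name(const char *tag_name)
{
    if (tag_name == NULL)
        return false;

    size_t count = sizeof(VOID_ELEMENT_NAMES) / sizeof(VOID_ELEMENT_NAMES[0]);
    for (size_t i = 0; i < count; i++) {
        if (strcmp(tag_name, VOID_ELEMENT_NAMES[i]) == 0)
            return true;
    }
    return false;
}
```

With this sketch, is_void_element_name("br") is true and is_void_element_name("div") is false. A serializer would typically use such a check to decide whether to emit a closing tag.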

4.0.1

June 16, 2017

  • Fix for creating a spinlock on builds without spinlock support lexborisov#103
  • Added two functions for detecting the encoding that also return the found position: myencoding_prescan_stream_to_determine_encoding_with_found and myencoding_extracting_character_encoding_from_charset_with_found lexborisov#107
  • Added automated package build and publication on PackageCloud.io (packagecloud.io/modest/myhtml)
  • Minor bug fixes

Special thanks to Alexander Fedyashov for help with the automated package build.

4.0.0

March 21, 2017

  • API breaking changes!!!
  • MyHTML is now split into MyCORE, MyHTML, and MyENCODING. MyCORE is a base module that includes functions shared by all other modules.
  • Removed all functions that print to a file: myhtml_tree_print_by_node, myhtml_tree_print_node_children, myhtml_tree_print_node; use the serialization functions instead
  • If you use an encoding enum such as MyHTML_ENCODING_UTF8, it is now MyENCODING_UTF_8, i.e. MyHTML_ENCODING_* => MyENCODING_*
  • Functions migrated to MyCORE from MyHTML: myhtml_incoming_buffer_* => mycore_incoming_buffer_*, myhtml_string* => mycore_string*, myhtml_utils* => mycore_utils*
  • Fully refactored the build system with GNU Make (Makefile); it now accepts the generally expected parameters and rules, such as install, clean, library, and more
  • Tested building a DLL for Windows
  • Added support for creating ports for different operating systems, or for simply changing how memory, I/O, and threads are handled (threads are used if built with thread support, which is the default)
  • Added support for adding your own modules to the library build
  • All return status types, such as myhtml_status_t and mycss_status_t, are now replaced by the global mystatus_t (unsigned int)
  • Added the missing '\0' when a text node ends with '\r' #91
  • Removed CMakeLists.txt
  • Added PKG-CONFIG *.pc files, generated after the make command

3.0.1

February 17, 2017

  • API breaking changes!!! See api_breaking_changes.md file
  • Sync with Specification (https://html.spec.whatwg.org/multipage/)
  • Fixed a problem with the closing token position in the title tag (the inner essence)
  • Fixed a problem with detecting the SHIFT_JIS encoding
  • Added function myhtml_encoding_prescan_stream_to_determine_encoding to prescan a byte stream to determine its encoding. In other words, it detects the encoding in a meta tag before HTML parsing starts. See the example
  • Added function myhtml_encoding_name_by_id to get an encoding name by id
  • Added function myhtml_encoding_extracting_character_encoding_from_charset
  • Added utils/mhash.* for creating a hash table
  • Added function myhtml_node_tree to get the current Tree from a node
  • Consumes less memory when initializing, 3MB => 1MB, with no negative impact on performance. Memory consumption will be reduced further in the future.
  • MyHTML_INSTALL_HEADER in the cmake options is now set to ON by default
  • Fixed broken mapping for the encoding conversion functions after release 3.0.0

3.0.0

February 17, 2017

...

2.0.1

January 08, 2017

  • Fixed a very serious problem with MyHTML::Collection in function myhtml_collection_check_size lexborisov#84

2.0.0

December 22, 2016

  • API Breaking Changes: Removed all functions associated with the tag index: myhtml_tree_get_tag_index, myhtml_tag_index_*
  • Changed how the library works with threads
  • Removed example replacing_node_attributes_low_level.c. The example does not work correctly; it is left for the future
  • Fix for the myhtml_string_destroy function. Sometimes resources were not freed.
  • Fixed a problem with serialization in UTF-8 (0xC2 0xA0)
  • Added an AVL tree to utils

1.0.4

November 15, 2016

1.0.3

September 24, 2016

1.0.2

July 14, 2016

  • Fixed a bug that in some cases could lead to an infinite loop (lexborisov#49)
  • Fixed a bug with broken tags (like <div/===>); lexborisov#50
  • Added function myhtml_version to get the current version

1.0.1

July 13, 2016

  • First Release
  • Remove deprecated functions

1.0.0-rc

June 23, 2016

  • Synchronized with the specification as of 19.06.2016

  • Reworked a lot of "strange" code. Improved code stability and readability

  • Consumes less RAM

  • Added interesting examples:

    • tokenizer_colorize_high_level.c: colorizes input HTML (works with callbacks)
    • parse_without_whitespace.c: parses and builds a tree without whitespace tokens (works with parse flags)
    • nodes_by_attr_key_high_level.c: gets nodes by attribute key
    • nodes_by_attr_value_high_level.c: gets nodes by attribute value (by key); an interesting example that is worth a look
  • Added API to work with the Incoming Buffer

  • Added API for tokens

  • Added original positions in html for tokens

  • Added parsing flags (their names speak for themselves):

    • MyHTML_TREE_PARSE_FLAGS_WITHOUT_BUILD_TREE
    • MyHTML_TREE_PARSE_FLAGS_WITHOUT_PROCESS_TOKEN
    • MyHTML_TREE_PARSE_FLAGS_SKIP_WHITESPACE_TOKEN
    • MyHTML_TREE_PARSE_FLAGS_WITHOUT_DOCTYPE_IN_TREE
  • Added functions to find nodes by attributes

    • myhtml_get_nodes_by_attribute_key (like CSS [foo])
    • myhtml_get_nodes_by_attribute_value (like CSS [foo="bar"])
    • myhtml_get_nodes_by_attribute_value_whitespace_separated (like CSS [class~="footer"])
    • myhtml_get_nodes_by_attribute_value_begin (like CSS [foo^="bar"])
    • myhtml_get_nodes_by_attribute_value_end (like CSS [foo$="bar"])
    • myhtml_get_nodes_by_attribute_value_contain (like CSS [foo*="bar"])
    • myhtml_get_nodes_by_attribute_value_hyphen_separated (like CSS [foo|="bar"])
  • Added callbacks for tokens

    • myhtml_callback_before_token_done_set
    • myhtml_callback_after_token_done_set
  • Added conditions in the source code to build with MinGW

  • All html_parser* functions now return a real status

  • Redesigned the chunk handler and chunk global position, and the Incoming Buffer for UTF-16

  • myhtml_attribute_name is now deprecated; use myhtml_attribute_key

  • Tested the myhtml_t object for thread safety

  • Changed the number of arguments of myhtml_string_realloc

  • Renamed function myhtml_tree_print_node_childs to myhtml_tree_print_node_children

  • Renamed function myhtml_node_insert_append_child to myhtml_node_append_child

  • Changed tag_ctx_idx on tokens to tag_id

  • Changed tag_idx on tree nodes to tag_id

  • Changed my_str_tm to str on tokens and my_namespace to ns

  • Changed attributes: the key and the value now use separate strings instead of one combined string as before.

  • Changed input to tokenizer stages

  • Fixed handling of strings in encodings other than UTF-8

  • Fixed a hypothetical possibility of going beyond the limits of the buffer in strings

  • Reworked the token parsers; they are now clear and obvious

  • Fixed a significant bug with memory allocated for strings

  • Fixed the cache for strings. Previously, due to an error, it did not work

  • Fixed a bug that caused incorrect handling of UTF-16 documents

  • Deleted function myhtml_token_is_whithspace

  • Deleted function myhtml_tree_incomming_buffer_get_last

  • Deleted myhtml_tree_temp_stream_t structure and everything connected with it
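
The attribute search functions added in 1.0.0-rc are each described by analogy with a CSS attribute selector. The sketch below spells out what those selector forms mean for a single attribute value, using only the C standard library. It is not MyHTML code: the attr_match_* names are hypothetical, and the comparison is assumed to be case-sensitive, which may differ from what the library does.

```c
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

/* [foo="bar"]: the value is exactly the needle. */
bool attr_match_exact(const char *value, const char *needle)
{
    return strcmp(value, needle) == 0;
}

/* [foo^="bar"]: the value begins with the needle (empty needle never matches). */
bool attr_match_begin(const char *value, const char *needle)
{
    size_t n = strlen(needle);
    return n > 0 && strncmp(value, needle, n) == 0;
}

/* [foo$="bar"]: the value ends with the needle (empty needle never matches). */
bool attr_match_end(const char *value, const char *needle)
{
    size_t v = strlen(value), n = strlen(needle);
    return n > 0 && n <= v && memcmp(value + (v - n), needle, n) == 0;
}

/* [foo*="bar"]: the value contains the needle somewhere. */
bool attr_match_contain(const char *value, const char *needle)
{
    return needle[0] != '\0' && strstr(value, needle) != NULL;
}

/* [foo|="bar"]: the value is the needle, or the needle followed by '-'. */
bool attr_match_hyphen_separated(const char *value, const char *needle)
{
    size_t n = strlen(needle);
    /* If the first n bytes match, value has at least n bytes, so value[n] is valid. */
    return strncmp(value, needle, n) == 0 &&
           (value[n] == '\0' || value[n] == '-');
}

/* [foo~="bar"]: one of the whitespace-separated words in the value is the needle. */
bool attr_match_whitespace_separated(const char *value, const char *needle)
{
    static const char ws[] = " \t\n\f\r";
    size_t n = strlen(needle);

    if (n == 0 || strpbrk(needle, ws) != NULL)
        return false;

    const char *p = value;
    while (*p != '\0') {
        p += strspn(p, ws);              /* skip whitespace before the word */
        size_t len = strcspn(p, ws);     /* length of the current word */
        if (len == n && memcmp(p, needle, n) == 0)
            return true;
        p += len;
    }
    return false;
}
```

For example, attr_match_whitespace_separated("header footer main", "footer") is true while the same call with "footer-left" as the value is false, and attr_match_hyphen_separated("en-US", "en") is true while "english" does not match.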