diff --git a/changelog/index.html b/changelog/index.html index ae355e8..0c266b5 100755 --- a/changelog/index.html +++ b/changelog/index.html @@ -1114,6 +1114,13 @@
iscc_core\codec.py
Development Setup Development Tasks Tests, coverage, code formatting and other tasks can be run with the Use @titusz "},{"location":"#contributing","title":"Contributing","text":"Pull requests are welcome. For significant changes, please open an issue first to discuss your plans. Please make sure to update tests as appropriate. You may also want to join our developer chat on Telegram at https://t.me/iscc_dev. "},{"location":"changelog/","title":"Changelog","text":""},{"location":"changelog/#108-2024-01-30","title":"[1.0.8] - 2024-01-30","text":"
An application that claims ISCC conformance MUST pass all core functions from the ISCC conformance test suite. The test suite is available as JSON data on GitHub. Test data is structured as follows: Inputs that are expected to be Example Byte-stream outputs in JSON test data: "},{"location":"conformance/#iscc_core.conformance.conformance_testdata","title":"conformance_testdata() ","text":"Yield tuples of test data. Returns: Type DescriptionGenerator[Tuple[str, Callable, List[Any], List[Any]]] Tuple with testdata (test_name, func_obj, inputs, outputs) "},{"location":"conformance/#iscc_core.conformance.conformance_selftest","title":"conformance_selftest() ","text":"Run conformance tests. Returns: Type Descriptionbool whether all tests passed "},{"location":"constants/","title":"ISCC - Types and Constants","text":""},{"location":"constants/#iscc_core.constants.MT","title":"MT ","text":""},{"location":"constants/#iscc_core.constants.MT--mt-maintypes","title":"MT - MainTypes","text":"Uint Symbol Bits Purpose 0 META 0000 Match on metadata similarity 1 SEMANTIC 0001 Match on semantic content similarity 2 CONTENT 0010 Match on perceptual content similarity 3 DATA 0011 Match on data similarity 4 INSTANCE 0100 Match on data identity 5 ISCC 0101 Composite of two or more ISCC-UNITs with common header"},{"location":"constants/#iscc_core.constants.ST","title":"ST ","text":""},{"location":"constants/#iscc_core.constants.ST--st-subtypes","title":"ST - SubTypes","text":"Uint Symbol Bits Purpose 0 NONE 0000 For MainTypes that do not specify SubTypes"},{"location":"constants/#iscc_core.constants.ST_CC","title":"ST_CC ","text":""},{"location":"constants/#iscc_core.constants.ST_CC--st_cc","title":"ST_CC","text":"SubTypes for ST_ISCC ","text":""},{"location":"constants/#iscc_core.constants.ST_ISCC--st_iscc","title":"ST_ISCC","text":"SubTypes for VS ","text":""},{"location":"constants/#iscc_core.constants.VS--vs-version","title":"VS - Version","text":"Code Version Uint Symbol Bits 
Purpose 0 V0 0000 Initial Version of Code without breaking changes"},{"location":"constants/#iscc_core.constants.LN","title":"LN ","text":""},{"location":"constants/#iscc_core.constants.LN--ln-length","title":"LN - Length","text":"Valid lengths for hash-digests.
MULTIBASE ","text":"Supported Multibase encodings.
A multi-component identifier for digital media assets. An ISCC-CODE can be generated from the concatenation of the digests of the following five ISCC-UNITs together with a single common header:
The following sequences of ISCC-UNITs are possible:
gen_iscc_code_v0(codes) ","text":"Combine multiple ISCC-UNITS to an ISCC-CODE with a common header using algorithm v0. Parameters: Name Type Description Defaultcodes Sequence[str] A valid sequence of singular ISCC-UNITS. requiredReturns: Type Descriptiondict An ISCC object with ISCC-CODE Source code iniscc_core\\iscc_code.py "},{"location":"iscc_id/","title":"ISCC-ID","text":"A decentralized, owned, and short identifier for digital assets. The ISCC-ID is generated from a similarity-hash of the units of an ISCC-CODE together with a blockchain wallet address. Its SubType designates the blockchain from which the ISCC-ID was minted. The similarity-hash is always at least 64 bits and optionally suffixed with a gen_iscc_id(iscc_code, chain_id, wallet, uc = 0) ","text":"Generate ISCC-ID from ISCC-CODE with the latest standard algorithm. Parameters: Name Type Description Defaultiscc_code str The ISCC-CODE from which to mint the ISCC-ID. requiredchain_id int Chain-ID of blockchain from which the ISCC-ID is minted. requiredwallet str The wallet address that signs the ISCC declaration requireduc int Uniqueness counter of ISCC-ID. 0 Returns: Type Descriptiondict ISCC object with an ISCC-ID "},{"location":"iscc_id/#iscc_core.iscc_id.gen_iscc_id_v0","title":"gen_iscc_id_v0(iscc_code, chain_id, wallet, uc = 0) ","text":"Generate an ISCC-ID from an ISCC-CODE with uniqueness counter 'uc' with algorithm v0. Parameters: Name Type Description Defaultiscc_code str The ISCC-CODE from which to mint the ISCC-ID. requiredchain_id int Chain-ID of blockchain from which the ISCC-ID is minted. requiredwallet str The wallet address that signs the ISCC declaration requireduc int Uniqueness counter of ISCC-ID. 0 Returns: Type Descriptiondict ISCC object with an ISCC-ID "},{"location":"iscc_id/#iscc_core.iscc_id.soft_hash_iscc_id_v0","title":"soft_hash_iscc_id_v0(iscc_code, wallet, uc = 0) ","text":"Calculate ISCC-ID hash digest from ISCC-CODE with algorithm v0. 
Accepts an ISCC-CODE or any sequence of ISCC-UNITs. Parameters: Name Type Description Defaultiscc_code str ISCC-CODE requiredwallet str The wallet address that signs the ISCC declaration requireduc int Uniqueness counter for ISCC-ID. 0 Returns: Type Descriptionbytes Digest for ISCC-ID without header but including uniqueness counter. "},{"location":"iscc_id/#iscc_core.iscc_id.iscc_id_incr","title":"iscc_id_incr(iscc_id) ","text":"Increment uniqueness counter of an ISCC-ID with the latest standard algorithm. Parameters: Name Type Description Defaultiscc_id str Base32-encoded ISCC-ID. requiredReturns: Type Descriptionstr Base32-encoded ISCC-ID with counter incremented by one. "},{"location":"iscc_id/#iscc_core.iscc_id.iscc_id_incr_v0","title":"iscc_id_incr_v0(iscc_id) ","text":"Increment uniqueness counter of an ISCC-ID with algorithm v0. Parameters: Name Type Description Defaultiscc_id str Base32-encoded ISCC-ID. requiredReturns: Type Descriptionstr Base32-encoded ISCC-ID with counter incremented by one (without \"ISCC:\" prefix). "},{"location":"iscc_id/#iscc_core.iscc_id.alg_simhash_from_iscc_id","title":"alg_simhash_from_iscc_id(iscc_id, wallet) ","text":"Extract similarity preserving hex-encoded hash digest from ISCC-ID. We need to un-xor the ISCC-ID hash digest with the wallet address hash to obtain the similarity preserving bytestring. "},{"location":"iso-reference/","title":"ISCC - ISO Reference","text":"The following functions are the reference implementations of ISO 24138: "},{"location":"iso-reference/#iso-24138-51-meta-code","title":"ISO 24138 / 5.1 Meta-Code","text":"gen_meta_code_v0(name, description=None, meta=None, bits=ic.core_opts.meta_bits) # Create an ISCC Meta-Code with the algorithm version 0. 
Parameters: Name Type Description Defaultname str Name or title of the work manifested by the digital asset requireddescription Optional[str] Optional description for disambiguation None meta Optional[Union[dict,str]] Dict or Data-URL string with extended metadata None bits int Bit-length of resulting Meta-Code (multiple of 64) ic.core_opts.meta_bits Returns: Type Descriptiondict ISCC object with possible fields: iscc, name, description, metadata, metahash Source code iniscc_core\\code_meta.py "},{"location":"iso-reference/#iso-24138-53-text-code","title":"ISO 24138 / 5.3 Text-Code","text":"gen_text_code_v0(text, bits=ic.core_opts.text_bits) # Create an ISCC Text-Code with algorithm v0. Note Any markup (like HTML tags or markdown) should be removed from the plain-text before passing it to this function. Parameters: Name Type Description Defaulttext str Text for Text-Code creation requiredbits int Bit-length of ISCC Code Hash (default 64) ic.core_opts.text_bits Returns: Type Descriptiondict ISCC schema instance with Text-Code and an additional property iscc_core\\code_content_text.py "},{"location":"iso-reference/#iso-24138-54-image-code","title":"ISO 24138 / 5.4 Image-Code","text":"gen_image_code_v0(pixels, bits=ic.core_opts.image_bits) # Create an ISCC Content-Code Image with algorithm v0. Parameters: Name Type Description Defaultpixels Sequence[int] Normalized image pixels (32x32 flattened gray values) requiredbits int Bit-length of ISCC Content-Code Image (default 64). ic.core_opts.image_bits Returns: Type DescriptionISCC ISCC object with Content-Code Image. Source code iniscc_core\\code_content_image.py "},{"location":"iso-reference/#iso-24138-55-audio-code","title":"ISO 24138 / 5.5 Audio-Code","text":"gen_audio_code_v0(cv, bits=ic.core_opts.audio_bits) # Create an ISCC Content-Code Audio with algorithm v0. 
Parameters: Name Type Description Defaultcv Iterable[int] Chromaprint vector requiredbits int Bit-length resulting Content-Code Audio (multiple of 64) ic.core_opts.audio_bits Returns: Type Descriptiondict ISCC object with Content-Code Audio Source code iniscc_core\\code_content_audio.py "},{"location":"iso-reference/#iso-24138-56-video-code","title":"ISO 24138 / 5.6 Video-Code","text":"gen_video_code_v0(frame_sigs, bits=ic.core_opts.video_bits) # Create an ISCC Video-Code with algorithm v0. Parameters: Name Type Description Defaultframe_sigs ic.FrameSig Sequence of MP7 frame signatures requiredbits int Bit-length resulting Video-Code (multiple of 64) ic.core_opts.video_bits Returns: Type Descriptiondict ISCC object with Video-Code Source code iniscc_core\\code_content_video.py "},{"location":"iso-reference/#iso-24138-57-mixed-code","title":"ISO 24138 / 5.7 Mixed-Code","text":"gen_mixed_code_v0(codes, bits=ic.core_opts.mixed_bits) # Create an ISCC Content-Code-Mixed with algorithm v0. If the provided codes are of mixed length they are stripped to Parameters: Name Type Description Defaultcodes Iterable[str] a list of Content-Codes. requiredbits int Target bit-length of generated Content-Code-Mixed. ic.core_opts.mixed_bits Returns: Type Descriptiondict ISCC object with Content-Code Mixed. Source code iniscc_core\\code_content_mixed.py "},{"location":"iso-reference/#iso-24138-58-data-code","title":"ISO 24138 / 5.8 Data-Code","text":"gen_data_code_v0(stream, bits=ic.core_opts.data_bits) # Create an ISCC Data-Code with algorithm v0. Parameters: Name Type Description Defaultstream Stream Input data stream. requiredbits int Bit-length of ISCC Data-Code (default 64). 
ic.core_opts.data_bits Returns: Type Descriptiondict ISCC object with Data-Code Source code iniscc_core\\code_data.py "},{"location":"iso-reference/#iso-24138-59-instance-code","title":"ISO 24138 / 5.9 Instance-Code","text":"gen_instance_code_v0(stream, bits=ic.core_opts.instance_bits) # Create an ISCC Instance-Code with algorithm v0. Parameters: Name Type Description Defaultstream Stream Binary data stream for Instance-Code generation requiredbits int Bit-length of resulting Instance-Code (multiple of 64) ic.core_opts.instance_bits Returns: Type Descriptiondict ISCC object with Instance-Code and properties: datahash, filesize Source code iniscc_core\\code_instance.py "},{"location":"iso-reference/#iso-24138-60-iscc-code","title":"ISO 24138 / 6.0 ISCC-CODE","text":"gen_iscc_code_v0(codes) # Combine multiple ISCC-UNITS to an ISCC-CODE with a common header using algorithm v0. Parameters: Name Type Description Defaultcodes Sequence[str] A valid sequence of singular ISCC-UNITS. requiredReturns: Type Descriptiondict An ISCC object with ISCC-CODE Source code iniscc_core\\iscc_code.py "},{"location":"algorithms/cdc/","title":"ISCC - Content Defined Chunking","text":"Compatible with fastcdc "},{"location":"algorithms/cdc/#iscc_core.cdc.alg_cdc_chunks","title":"alg_cdc_chunks(data, utf32, avg_chunk_size = ic.core_opts.data_avg_chunk_size) ","text":"A generator that yields data-dependent chunks for Usage Example: Parameters: Name Type Description Defaultdata bytes Raw data for variable sized chunking. requiredutf32 bool If true assume we are chunking text that is utf32 encoded. requiredavg_chunk_size int Target chunk size in number of bytes. ic.core_opts.data_avg_chunk_size Returns: Type DescriptionGenerator[bytes] A generator that yields data chunks of variable sizes. Source code iniscc_core\\cdc.py "},{"location":"algorithms/cdc/#iscc_core.cdc.alg_cdc_offset","title":"alg_cdc_offset(buffer, mi, ma, cs, mask_s, mask_l) ","text":"Find breakpoint offset for a given buffer. 
Parameters: Name Type Description Defaultbuffer Data The data to be chunked. requiredmi int Minimum chunk size. requiredma int Maximum chunk size. requiredcs int Center size. requiredmask_s int Small mask. requiredmask_l int Large mask. requiredReturns: Type Descriptionint Offset of dynamic cutpoint in number of bytes. Source code iniscc_core\\cdc.py "},{"location":"algorithms/cdc/#iscc_core.cdc.alg_cdc_params","title":"alg_cdc_params(avg_size: int) -> tuple ","text":"Calculate CDC parameters Parameters: Name Type Description Defaultavg_size int Target average size of chunks in number of bytes. requiredReturns: Type Descriptiontuple Tuple of (min_size, max_size, center_size, mask_s, mask_l). Source code iniscc_core\\cdc.py "},{"location":"algorithms/dct/","title":"ISCC - Discrete Cosine Transform","text":""},{"location":"algorithms/dct/#iscc_core.dct.alg_dct","title":"alg_dct(v) ","text":"Discrete cosine transform. See: nayuki.io. Parameters: Name Type Description Defaultv Sequence[float] Input vector for DCT calculation. requiredReturns: Type DescriptionList DCT Transformed vector. Source code iniscc_core\\dct.py "},{"location":"algorithms/minhash/","title":"ISCC - Minhash","text":""},{"location":"algorithms/minhash/#iscc_core.minhash.alg_minhash","title":"alg_minhash(features) ","text":"Calculate a 64 dimensional minhash integer vector. Parameters: Name Type Description Defaultfeatures List[int] List of integer features requiredReturns: Type DescriptionList[int] Minhash vector Source code iniscc_core\\minhash.py "},{"location":"algorithms/minhash/#iscc_core.minhash.alg_minhash_64","title":"alg_minhash_64(features) ","text":"Create 64-bit minimum hash digest. 
Parameters: Name Type Description Defaultfeatures List[int] List of integer features requiredReturns: Type Descriptionbytes 64-bit binary from the least significant bits of the minhash values Source code iniscc_core\\minhash.py "},{"location":"algorithms/minhash/#iscc_core.minhash.alg_minhash_256","title":"alg_minhash_256(features) ","text":"Create 256-bit minimum hash digest. Parameters: Name Type Description Defaultfeatures List[int] List of integer features requiredReturns: Type Descriptionbytes 256-bit binary from the least significant bits of the minhash values Source code iniscc_core\\minhash.py "},{"location":"algorithms/minhash/#iscc_core.minhash.alg_minhash_compress","title":"alg_minhash_compress(mhash, lsb = 4) ","text":"Compress minhash vector to byte hash-digest. Concatenates Parameters: Name Type Description Defaultmhash List[int] List of minhash integer features requiredlsb int Number of the least significant bits to retain 4 Returns: Type Descriptionbytes 256-bit binary from the least significant bits of the minhash values Source code iniscc_core\\minhash.py "},{"location":"algorithms/simhash/","title":"ISCC - Simhash","text":""},{"location":"algorithms/simhash/#iscc_core.simhash.alg_simhash","title":"alg_simhash(hash_digests) ","text":"Creates a similarity preserving hash from a sequence of equal sized hash digests. Parameters: Name Type Description Defaulthash_digests list A sequence of equally sized byte-hashes. requiredReturns: Type Descriptionbytes Similarity byte-hash Source code iniscc_core\\simhash.py "},{"location":"algorithms/wtahash/","title":"ISCC - Winner Takes All Hash","text":""},{"location":"algorithms/wtahash/#iscc_core.wtahash.alg_wtahash","title":"alg_wtahash(vec: Sequence[float], bits: Sequence[float]) -> bytes ","text":"Calculate WTA Hash for vector with 380 values (MP7 frame signature). 
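The similarity-preserving hash described for alg_simhash above can be illustrated with a generic bitwise majority vote. This is a minimal sketch under stated assumptions, not the iscc-core implementation: the function name `simhash_sketch`, the most-significant-bit-first ordering, and the rule that a tie sets the bit are all assumptions chosen for illustration.

```python
def simhash_sketch(hash_digests):
    """Generic simhash: per-bit majority vote over equal-sized byte digests.

    Hypothetical helper for illustration; bit order and tie-breaking are
    assumptions and may differ from iscc_core.simhash.alg_simhash.
    """
    n_bytes = len(hash_digests[0])
    n_bits = n_bytes * 8
    counts = [0] * n_bits
    for digest in hash_digests:
        if len(digest) != n_bytes:
            raise ValueError("all digests must have the same length")
        value = int.from_bytes(digest, "big")
        for i in range(n_bits):
            # i counts bit positions starting from the most significant bit
            if (value >> (n_bits - 1 - i)) & 1:
                counts[i] += 1
    threshold = len(hash_digests) / 2
    result = 0
    for i, count in enumerate(counts):
        # a bit is set when at least half of the digests have it set
        if count >= threshold:
            result |= 1 << (n_bits - 1 - i)
    return result.to_bytes(n_bytes, "big")
```

Digests that share most of their bits produce the same or a nearby output, which is the property that makes the result usable for similarity matching.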
Source code iniscc_core\\wtahash.py "},{"location":"codec/","title":"ISCC - Codec","text":"This module implements encoding, decoding and transcoding functions of ISCC "},{"location":"codec/#codec-overview","title":"Codec Overview","text":""},{"location":"codec/#codec-functions","title":"Codec Functions","text":""},{"location":"codec/#iscc_core.codec.encode_component","title":"encode_component(mtype, stype, version, bit_length, digest) ","text":"Encode an ISCC-UNIT including header and body with standard base32 encoding. Note The Parameters: Name Type Description Defaultmtype MainType Maintype of unit (0-6) requiredstype SubType SubType of unit depending on MainType (0-5) requiredversion Version Version of unit algorithm (0). requiredbit_length length Length of unit, in number of bits (multiple of 32) requireddigest bytes The hash digest of the unit. requiredReturns: Type Descriptionstr Base32 encoded ISCC-UNIT. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_header","title":"encode_header(mtype, stype, version = 0, length = 1) ","text":"Encodes header values with nibble-sized (4-bit) variable-length encoding. The result is minimum 2 and maximum 8 bytes long. If the final count of nibbles is uneven it is padded with 4-bit Warning The length value must be encoded beforehand because its semantics depend on the MainType (see Parameters: Name Type Description Defaultmtype MainType MainType of unit. requiredstype SubType SubType of unit. requiredversion Version Version of component algorithm. 0 length Length length value of unit (1 means 64-bits for standard units) 1 Returns: Type Descriptionbytes Varnibble stream encoded ISCC header as bytes. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_header","title":"decode_header(data) ","text":"Decodes varnibble encoded header and returns it together with Tail data is included to enable decoding of sequential ISCCs. 
The returned tail data must be truncated to decode_length(r[0], r[3]) bits to recover the actual hash-bytes. Parameters: Name Type Description Defaultdata bytes ISCC bytes requiredReturns: Type DescriptionIsccTuple (MainType, SubType, Version, length, TailData) Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_varnibble","title":"encode_varnibble(n) ","text":"Writes integer to variable length sequence of 4-bit chunks. Variable-length encoding scheme: prefix bits nibbles data bits unsigned range 0 1 3 0 - 7 10 2 6 8 - 71 110 3 9 72 - 583 1110 4 12 584 - 4679Parameters: Name Type Description Defaultn int Positive integer to be encoded as varnibble (0-4679) requiredReturns: Type Descriptionbitarray Varnibble encoded integer Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_varnibble","title":"decode_varnibble(b) ","text":"Reads first varnibble, returns its integer value and remaining bits. Parameters: Name Type Description Defaultb bitarray Array of header bits requiredReturns: Type DescriptionTuple[int, bitarray] A tuple of the integer value of first varnibble and the remaining bits. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_units","title":"encode_units(units) ","text":"Encodes a combination of ISCC units to an integer between 0-7 to be used as length value for the final encoding of MT.ISCC Parameters: Name Type Description Defaultunits Tuple A tuple of a MainType combination (can be empty) requiredReturns: Type Descriptionint Integer value to be used as length-value for header encoding Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_units","title":"decode_units(unit_id) ","text":"Decodes an ISCC header length value that has been encoded with a unit_id to an ordered tuple of MainTypes. 
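The variable-length scheme tabulated for encode_varnibble above (prefix bits, nibble count, data bits, unsigned range) can be sketched in plain Python. This is an illustrative sketch only: it works on strings of "0" and "1" instead of the bitarray type the documented functions use, and the names `varnibble_encode` and `varnibble_decode` are hypothetical.

```python
def varnibble_encode(n):
    """Encode an integer (0-4679) as a bit string, following the table above."""
    if 0 <= n < 8:
        return format(n, "04b")                  # prefix 0, 3 data bits
    if 8 <= n < 72:
        return "10" + format(n - 8, "06b")       # prefix 10, 6 data bits
    if 72 <= n < 584:
        return "110" + format(n - 72, "09b")     # prefix 110, 9 data bits
    if 584 <= n < 4680:
        return "1110" + format(n - 584, "012b")  # prefix 1110, 12 data bits
    raise ValueError("value must be between 0 and 4679")


def varnibble_decode(bits):
    """Read the first varnibble; return (value, remaining bit string)."""
    if len(bits) >= 4 and bits[0] == "0":
        return int(bits[:4], 2), bits[4:]
    if len(bits) >= 8 and bits[:2] == "10":
        return int(bits[2:8], 2) + 8, bits[8:]
    if len(bits) >= 12 and bits[:3] == "110":
        return int(bits[3:12], 2) + 72, bits[12:]
    if len(bits) >= 16 and bits[:4] == "1110":
        return int(bits[4:16], 2) + 584, bits[16:]
    raise ValueError("invalid varnibble bit string")
```

Each range starts where the previous one ends, so the offset subtracted on encoding (8, 72, 584) is added back on decoding and every value has exactly one representation.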
Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_length","title":"encode_length(mtype, length) ","text":"Encode length to integer value for header encoding. The For MainTypes For MainType For MainType Parameters: Name Type Description Defaultmtype MainType The MainType for which to encode the length value. requiredlength Length The length expressed according to the semantics of the type requiredReturns: Type Descriptionint The length value encoded as integer for use with write_header. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_length","title":"decode_length(mtype, length) ","text":"Decode raw length value from ISCC header to length of digest in number of bits. Decodes a raw header integer value into its semantically meaningful value (e.g. number of bits) Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_base32","title":"encode_base32(data) ","text":"Standard RFC4648 base32 encoding without padding. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_base32","title":"decode_base32(code) ","text":"Standard RFC4648 base32 decoding without padding and with casefolding. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_decompose","title":"iscc_decompose(iscc_code) ","text":"Decompose a normalized ISCC-CODE or any valid ISCC sequence into a list of ISCC-UNITS. A valid ISCC sequence is a string concatenation of ISCC-UNITS optionally separated by a hyphen. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_normalize","title":"iscc_normalize(iscc_code) ","text":"Normalize an ISCC to its canonical form. 
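The padding-free, case-insensitive base32 handling described for encode_base32 and decode_base32 above can be approximated with Python's standard library. This is a sketch rather than the iscc-core source: the function names `b32_encode_nopad` and `b32_decode_nopad` are hypothetical, and only the documented `base64.b32encode` and `base64.b32decode` calls are used.

```python
import base64


def b32_encode_nopad(data: bytes) -> str:
    """RFC 4648 base32 with the trailing '=' padding stripped."""
    return base64.b32encode(data).decode("ascii").rstrip("=")


def b32_decode_nopad(code: str) -> bytes:
    """Decode unpadded base32, accepting lower-case input."""
    # base32 works in blocks of 8 characters, so restore the missing padding
    padding = "=" * (-len(code) % 8)
    return base64.b32decode(code + padding, casefold=True)
```

Stripping the padding is lossless because the number of padding characters follows from the length of the remaining text.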
The canonical form of an ISCC is its shortest base32 encoded representation prefixed with the string Possible valid inputs: Info A concatenated sequence of codes will be composed into a single ISCC of MainType Example Parameters: Name Type Description Defaultiscc_code str Any valid ISCC string requiredReturns: Type Descriptionstr Normalized ISCC Source code iniscc_core\\codec.py "},{"location":"codec/#alternate-encodings","title":"Alternate Encodings","text":""},{"location":"codec/#iscc_core.codec.encode_base64","title":"encode_base64(data) ","text":"Standard RFC4648 base64url encoding without padding. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_base64","title":"decode_base64(code) ","text":"Standard RFC4648 base64url decoding without padding. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_base32hex","title":"encode_base32hex(data) ","text":"RFC4648 Base32hex encoding without padding see: https://tools.ietf.org/html/rfc4648#page-10 Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_base32hex","title":"decode_base32hex(code) ","text":"RFC4648 Base32hex decoding without padding see: https://tools.ietf.org/html/rfc4648#page-10 Source code iniscc_core\\codec.py "},{"location":"codec/#helper-functions","title":"Helper Functions","text":""},{"location":"codec/#iscc_core.codec.iscc_decode","title":"iscc_decode(iscc) ","text":"Decode ISCC to an IsccTuple Parameters: Name Type Description Defaultiscc str ISCC string requiredReturns: Type DescriptionIsccTuple ISCC decoded to a tuple Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_explain","title":"iscc_explain(iscc) ","text":"Convert ISCC to a human-readable representation Parameters: Name Type Description Defaultiscc str ISCC string requiredReturns: Type Descriptionstr Human-readable representation of ISCC Source code iniscc_core\\codec.py 
"},{"location":"codec/#iscc_core.codec.iscc_type_id","title":"iscc_type_id(iscc) ","text":"Extract and convert ISCC HEADER to a readable Type-ID string. Type-ids can be used as names in databases to index ISCC-UNITs separately. Parameters: Name Type Description Defaultiscc str ISCC string requiredReturns: Type Descriptionstr Unique Type-ID string Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_validate","title":"iscc_validate(iscc, strict = True) ","text":"Validate that a given string is a strictly well-formed ISCC. A strictly well-formed ISCC is:
Parameters: Name Type Description Defaultiscc str ISCC string requiredstrict bool Raise an exception if validation fails (default True) True Returns: Type Descriptionbool True if string is valid else False. (raises ValueError in strict mode) Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_clean","title":"iscc_clean(iscc) ","text":"Cleanup ISCC string. Removes leading scheme, dashes, leading/trailing whitespace. Parameters: Name Type Description Defaultiscc str Any valid ISCC string requiredReturns: Type Descriptionstr Cleaned ISCC string. Source code iniscc_core\\codec.py "},{"location":"options/options/","title":"ISCC-CORE - Configuration Options","text":"Options for the iscc-core package can be configured using environment variables. Variables are loaded as class-attributes on the Example how to access configuration options "},{"location":"options/options/#iscc_core.options.CoreOptions","title":"CoreOptions","text":"Parameters with defaults for ISCC calculations. 
"},{"location":"options/options/#iscc_core.options.CoreOptions.meta_bits","title":"meta_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.meta_trim_name","title":"meta_trim_name instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.meta_trim_description","title":"meta_trim_description instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.meta_ngram_size_text","title":"meta_ngram_size_text instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.meta_ngram_size_bytes","title":"meta_ngram_size_bytes instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.text_bits","title":"text_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.text_ngram_size","title":"text_ngram_size instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.text_unicode_filter","title":"text_unicode_filter instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.text_newlines","title":"text_newlines instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.image_bits","title":"image_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.audio_bits","title":"audio_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.video_bits","title":"video_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.data_bits","title":"data_bits instance-attribute class-attribute ","text":" 
"},{"location":"options/options/#iscc_core.options.CoreOptions.data_avg_chunk_size","title":"data_avg_chunk_size instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.instance_bits","title":"instance_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.mixed_bits","title":"mixed_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.io_read_size","title":"io_read_size instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.cdc_gear","title":"cdc_gear instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.conformanc_critical","title":"conformanc_critical module-attribute ","text":" "},{"location":"options/options/#iscc_core.options.has_logged_confromance","title":"has_logged_confromance module-attribute ","text":" "},{"location":"options/options/#iscc_core.options.conformance_check_options","title":"conformance_check_options","text":" Check and log if options have non-default conformance critical values "},{"location":"options/options/#iscc_core.options.core_opts","title":"core_opts module-attribute ","text":" "},{"location":"options/options/#iscc_core.options.conformant_options","title":"conformant_options module-attribute ","text":" "},{"location":"units/","title":"ISCC - UNITs","text":"A standard ISCC-CODE is built from multiple ISCC-UNITs. Each unit serves a different purpose. "},{"location":"units/code_data/","title":"ISCC - Data-Code","text":"A similarity preserving hash for binary data (soft hash). "},{"location":"units/code_data/#iscc_core.code_data.gen_data_code","title":"gen_data_code(stream, bits = ic.core_opts.data_bits) ","text":"Create a similarity preserving ISCC Data-Code with the latest standard algorithm. Parameters: Name Type Description Defaultstream Stream Input data stream. 
requiredbits int Bit-length of ISCC Data-Code (default 64). ic.core_opts.data_bits Returns: Type Descriptiondict ISCC Data-Code "},{"location":"units/code_data/#iscc_core.code_data.gen_data_code_v0","title":"gen_data_code_v0(stream, bits = ic.core_opts.data_bits) ","text":"Create an ISCC Data-Code with algorithm v0. Parameters: Name Type Description Defaultstream Stream Input data stream. requiredbits int Bit-length of ISCC Data-Code (default 64). ic.core_opts.data_bits Returns: Type Descriptiondict ISCC object with Data-Code "},{"location":"units/code_data/#iscc_core.code_data.soft_hash_data_v0","title":"soft_hash_data_v0(stream) ","text":"Create a similarity preserving Data-Hash digest Parameters: Name Type Description Defaultstream Stream Input data stream. requiredReturns: Type Descriptionbytes 256-bit Data-Hash (soft-hash) digest used as body for Data-Code "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0","title":"DataHasherV0 ","text":"Incremental Data-Hash generator. "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0.__init__","title":"__init__(data = None) ","text":"Create a DataHasher Parameters: Name Type Description Defaultdata Optional[Data] initial payload for hashing. None "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0.push","title":"push(data) ","text":"Push data to the Data-Hash generator. "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0.digest","title":"digest() ","text":"Calculate 256-bit minhash digest from feature hashes. "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0.code","title":"code(bits = ic.core_opts.data_bits) ","text":"Encode digest as an ISCC Data-Code unit. 
Parameters: Name Type Description Defaultbits int Number of bits for the ISCC Data-Code ic.core_opts.data_bits Returns: Type Descriptionstr ISCC Data-Code "},{"location":"units/code_flake/","title":"ISCC - Flake-Code","text":"A unique, time-sorted identifier composed of a 48-bit timestamp and 16 to 208 bits of randomness. The ISCC Flake-Code is a unique identifier for distributed ID generation. The 64-bit version can be used as efficient surrogate key in database systems. It has guaranteed uniqueness if generated from a single process and is time sortable in integer and base32hex representation. The 128-bit version is a K-sortable, globally unique identifier for use in distributed systems and is compatible with UUID. Example "},{"location":"units/code_flake/#iscc_core.code_flake.gen_flake_code","title":"gen_flake_code(bits = ic.core_opts.flake_bits) ","text":"Create an ISCC Flake-Code with the latest standard algorithm Parameters: Name Type Description Defaultbits int Target bit-length of generated Flake-Code ic.core_opts.flake_bits Returns: Type Descriptiondict ISCC object with Flake-Code "},{"location":"units/code_flake/#iscc_core.code_flake.gen_flake_code_v0","title":"gen_flake_code_v0(bits = ic.core_opts.flake_bits) ","text":"Create an ISCC Flake-Code with algorithm v0 Parameters: Name Type Description Defaultbits int Target bit-length of generated Flake-Code ic.core_opts.flake_bits Returns: Type Descriptiondict ISCC object with Flake-Code "},{"location":"units/code_flake/#iscc_core.code_flake.uid_flake_v0","title":"uid_flake_v0(ts = None, bits = ic.core_opts.flake_bits) ","text":"Generate time and randomness based Flake-Hash Parameters: Name Type Description Defaultts Optional[float] Unix timestamp (defaults to current time) None bits int Bit-length of resulting Flake-Code (multiple of 32) ic.core_opts.flake_bits Returns: Type Descriptionbytes Flake-Hash digest "},{"location":"units/code_instance/","title":"ISCC - Instance-Code","text":"A data checksum. 
"},{"location":"units/code_instance/#iscc_core.code_instance.gen_instance_code","title":"gen_instance_code(stream, bits = ic.core_opts.instance_bits) ","text":"Create an ISCC Instance-Code with the latest standard algorithm. Parameters: Name Type Description Defaultstream Stream Binary data stream for Instance-Code generation requiredbits int Bit-length resulting Instance-Code (multiple of 64) ic.core_opts.instance_bits Returns: Type Descriptiondict ISCC object with properties: iscc, datahash, filesize "},{"location":"units/code_instance/#iscc_core.code_instance.gen_instance_code_v0","title":"gen_instance_code_v0(stream, bits = ic.core_opts.instance_bits) ","text":"Create an ISCC Instance-Code with algorithm v0. Parameters: Name Type Description Defaultstream Stream Binary data stream for Instance-Code generation requiredbits int Bit-length of resulting Instance-Code (multiple of 64) ic.core_opts.instance_bits Returns: Type Descriptiondict ISCC object with Instance-Code and properties: datahash, filesize "},{"location":"units/code_instance/#iscc_core.code_instance.hash_instance_v0","title":"hash_instance_v0(stream) ","text":"Create 256-bit hash digest for the Instance-Code body Parameters: Name Type Description Defaultstream Stream Binary data stream for hash generation. requiredReturns: Type Descriptionbytes 256-bit Instance-Hash digest used as body of Instance-Code "},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0","title":"InstanceHasherV0 ","text":"Incremental Instance-Hash generator. "},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0.push","title":"push(data) ","text":"Push data to the Instance-Hash generator. 
Parameters: Name Type Description Defaultdata Data Data to be hashed required"},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0.digest","title":"digest() ","text":"Return Instance-Hash Returns: Type Descriptionbytes Instance-Hash digest "},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0.multihash","title":"multihash() ","text":"Return blake3 multihash Returns: Type Descriptionstr Blake3 hash as 256-bit multihash "},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0.code","title":"code(bits = ic.core_opts.instance_bits) ","text":"Encode digest as an ISCC Instance-Code unit. Parameters: Name Type Description Defaultbits int Number of bits for the ISCC Instance-Code ic.core_opts.instance_bits Returns: Type Descriptionstr ISCC Instance-Code "},{"location":"units/code_meta/","title":"ISCC - Meta-Code","text":"A similarity preserving hash for digital asset metadata. "},{"location":"units/code_meta/#purpose","title":"Purpose","text":"The Meta-Code is the first possible (optional) unit of an ISCC-CODE. It is calculated from the metadata of a digital asset. The primary purpose of the Meta-Code is to aid the discovery of digital assets with similar metadata and the detection of metadata anomalies. As a secondary function, Meta-Code processing also creates a secure Meta-Hash for cryptographic binding purposes. "},{"location":"units/code_meta/#inputs","title":"Inputs","text":"The metadata supplied for Meta-Code calculation is called Seed-Metadata. Seed-Metadata has 3 possible elements:
Note Due to the broad applicability of the ISCC we do not prescribe a particular schema for the Data-URL Examples:
Data-URLs are also supported by all major internet browsers. "},{"location":"units/code_meta/#processing","title":"Processing","text":""},{"location":"units/code_meta/#meta-code","title":"Meta-Code","text":"The first 32-bits of a Meta-Code are calculated as a similarity hash from the Note To support automation and reproducibility, applications that generate ISCCs should prioritize metadata that is automatically extracted from the digital asset. If embedded metadata is not available or known to be unreliable an application should rely on external metadata or explicitly ask users to supply at least the If neither embedded nor external metadata is available, the application may resort to using the filename of the digital asset as value for the In addition to the Meta-Code we also create a cryptographic hash (the Meta-Hash) of the supplied Seed-Metadata. It is used to securely bind metadata to the digital asset. "},{"location":"units/code_meta/#functions","title":"Functions","text":""},{"location":"units/code_meta/#iscc_core.code_meta.gen_meta_code_v0","title":"gen_meta_code_v0(name, description = None, meta = None, bits = ic.core_opts.meta_bits) ","text":"Create an ISCC Meta-Code with the algorithm version 0. Parameters: Name Type Description Defaultname str Name or title of the work manifested by the digital asset requireddescription Optional[str] Optional description for disambiguation None meta Optional[Union[dict,str] Dict or Data-URL string with extended metadata None bits int Bit-length of resulting Meta-Code (multiple of 64) ic.core_opts.meta_bits Returns: Type Descriptiondict ISCC object with possible fields: iscc, name, description, metadata, metahash Source code iniscc_core\\code_meta.py "},{"location":"units/code_meta/#iscc_core.code_meta.soft_hash_meta_v0","title":"soft_hash_meta_v0(name, extra = None) ","text":"Calculate similarity preserving 256-bit hash digest from asset metadata. 
Textual input should be stripped of markup, normalized and trimmed before hashing. Bytes input can be any serialized metadata (JSON, XML, Image...). Metadata should be serialized in a canonical form (for example JCS for JSON) Note The processing algorithm depends on the type of the
Parameters: Name Type Description Defaultname str Title of the work manifested in the digital asset requiredextra Union[str,bytes,None] Additional metadata for disambiguation None Returns: Type Descriptionbytes 256-bit simhash digest for Meta-Code Source code iniscc_core\\code_meta.py "},{"location":"units/code_meta/#iscc_core.code_meta.text_clean","title":"text_clean(text) ","text":"Clean text for display.
iscc_core\\code_meta.py "},{"location":"units/code_meta/#iscc_core.code_meta.text_remove_newlines","title":"text_remove_newlines(text) ","text":"Remove newlines. The Parameters: Name Type Description Defaulttext Text for newline removal requiredReturns: Type Descriptionstr Single line of text Source code iniscc_core\\code_meta.py "},{"location":"units/code_meta/#iscc_core.code_meta.text_trim","title":"text_trim(text, nbytes) ","text":"Trim text such that its utf-8 encoded size does not exceed iscc_core\\code_meta.py "},{"location":"units/content/","title":"ISCC - Content-Codes","text":""},{"location":"units/content/code_content_audio/","title":"ISCC - Audio-Code","text":"A similarity preserving hash for audio content (soft hash). Creates an ISCC object that provides an The Content-Code Audio is generated from a Chromaprint fingerprint provided as a vector of 32-bit signed integers. The iscc-sdk uses fpcalc to extract Chromaprint vectors with the following command line parameters:
gen_audio_code(cv, bits = ic.core_opts.audio_bits) ","text":"Create an ISCC Content-Code Audio with the latest standard algorithm. Parameters: Name Type Description Defaultcv Iterable[int] Chromaprint vector requiredbits int Bit-length resulting Content-Code Audio (multiple of 64) ic.core_opts.audio_bits Returns: Type Descriptiondict ISCC object with Content-Code Audio "},{"location":"units/content/code_content_audio/#iscc_core.code_content_audio.gen_audio_code_v0","title":"gen_audio_code_v0(cv, bits = ic.core_opts.audio_bits) ","text":"Create an ISCC Content-Code Audio with algorithm v0. Parameters: Name Type Description Defaultcv Iterable[int] Chromaprint vector requiredbits int Bit-length resulting Content-Code Audio (multiple of 64) ic.core_opts.audio_bits Returns: Type Descriptiondict ISCC object with Content-Code Audio "},{"location":"units/content/code_content_audio/#iscc_core.code_content_audio.soft_hash_audio_v0","title":"soft_hash_audio_v0(cv, bits = ic.core_opts.audio_bits) ","text":"Create audio similarity hash from a chromaprint vector. Parameters: Name Type Description Defaultcv Iterable[int] Chromaprint vector requiredbits int Bit-length resulting similarity hash (multiple of 32) ic.core_opts.audio_bits Returns: Type Descriptionbytes Audio-Hash digest "},{"location":"units/content/code_content_image/","title":"ISCC - Image-Code","text":"A similarity preserving perceptual hash for images. The ISCC Content-Code Image is created by calculating a discrete cosine transform on normalized image-pixels and comparing the values from the upper left area of the dct-matrix against their median values to set the hash-bits. Images must be normalized before using gen_image_code. Prepare images as follows:
gen_image_code(pixels, bits = ic.core_opts.image_bits) ","text":"Create an ISCC Content-Code Image with the latest standard algorithm. Parameters: Name Type Description Defaultpixels Sequence[int] Normalized image pixels (32x32 flattened gray values). requiredbits int Bit-length of ISCC Content-Code Image (default 64). ic.core_opts.image_bits Returns: Type DescriptionISCC ISCC object with Content-Code Image. "},{"location":"units/content/code_content_image/#iscc_core.code_content_image.gen_image_code_v0","title":"gen_image_code_v0(pixels, bits = ic.core_opts.image_bits) ","text":"Create an ISCC Content-Code Image with algorithm v0. Parameters: Name Type Description Defaultpixels Sequence[int] Normalized image pixels (32x32 flattened gray values) requiredbits int Bit-length of ISCC Content-Code Image (default 64). ic.core_opts.image_bits Returns: Type DescriptionISCC ISCC object with Content-Code Image. "},{"location":"units/content/code_content_image/#iscc_core.code_content_image.soft_hash_image_v0","title":"soft_hash_image_v0(pixels, bits = ic.core_opts.image_bits) ","text":"Calculate image hash from normalized grayscale pixel sequence of length 1024. Parameters: Name Type Description Defaultpixels Sequence[int] required bits int Bit-length of image hash (default 64). ic.core_opts.image_bits Returns: Type Descriptionbytes Similarity preserving Image-Hash digest. "},{"location":"units/content/code_content_mixed/","title":"ISCC - Mixed Code","text":"A similarity hash for mixed media content. Creates an ISCC object that provides a Many digital assets embed multiple assets of different mediatypes in a single file. Text documents may include images, video includes audio in most cases. The ISCC Content-Code-Mixed encodes the similarity of a collection of assets of the same or different mediatypes that may occur in a multimedia asset. Applications that create mixed Content-Codes must be capable to extract embedded assets and create individual Content-Codes per asset. 
"},{"location":"units/content/code_content_mixed/#iscc_core.code_content_mixed.gen_mixed_code","title":"gen_mixed_code(codes, bits = ic.core_opts.mixed_bits) ","text":"Create an ISCC Content-Code Mixed with the latest standard algorithm. Parameters: Name Type Description Defaultcodes Iterable[str] a list of Content-Codes. requiredbits int Target bit-length of generated Content-Code-Mixed. ic.core_opts.mixed_bits Returns: Type Descriptiondict ISCC object with Content-Code Mixed. "},{"location":"units/content/code_content_mixed/#iscc_core.code_content_mixed.gen_mixed_code_v0","title":"gen_mixed_code_v0(codes, bits = ic.core_opts.mixed_bits) ","text":"Create an ISCC Content-Code-Mixed with algorithm v0. If the provided codes are of mixed length they are stripped to Parameters: Name Type Description Defaultcodes Iterable[str] a list of Content-Codes. requiredbits int Target bit-length of generated Content-Code-Mixed. ic.core_opts.mixed_bits Returns: Type Descriptiondict ISCC object with Content-Code Mixed. "},{"location":"units/content/code_content_mixed/#iscc_core.code_content_mixed.soft_hash_codes_v0","title":"soft_hash_codes_v0(cc_digests, bits = ic.core_opts.mixed_bits) ","text":"Create a similarity hash from multiple Content-Code digests. The similarity hash is created from the bodies of the input codes with the first byte of the code-header prepended. All codes must be of main-type CONTENT and have a minimum length of Parameters: Name Type Description Defaultcc_digests Sequence[bytes] a list of Content-Code digests. requiredbits int Target bit-length of generated Content-Code-Mixed. ic.core_opts.mixed_bits Returns: Type Descriptionbytes Similarity preserving byte hash. "},{"location":"units/content/code_content_text/","title":"ISCC - Text Code","text":"A similarity preserving hash for plain-text content (soft hash). The ISCC Text-Code is generated from plain-text that has been extracted from a media assets. 
Warning Plain-text extraction from documents in various formats (especially PDF) may yield very different results depending on the extraction tools being used. The iscc-sdk uses Apache Tika to extract text from documents for Text-Code generation. Algorithm overview
gen_text_code_v0(text, bits = ic.core_opts.text_bits) ","text":"Create an ISCC Text-Code with algorithm v0. Note Any markup (like HTML tags or markdown) should be removed from the plain-text before passing it to this function. Parameters: Name Type Description Defaulttext str Text for Text-Code creation requiredbits int Bit-length of ISCC Code Hash (default 64) ic.core_opts.text_bits Returns: Type Descriptiondict ISCC schema instance with Text-Code and an additional property iscc_core\\code_content_text.py "},{"location":"units/content/code_content_text/#iscc_core.code_content_text.text_collapse","title":"text_collapse(text) ","text":"Normalize and simplify text for similarity hashing.
Note See: Unicode normalization. Parameters: Name Type Description Defaulttext str Plain text to be collapsed. requiredReturns: Type Descriptionstr Collapsed plain text. Source code iniscc_core\\code_content_text.py "},{"location":"units/content/code_content_text/#iscc_core.code_content_text.soft_hash_text_v0","title":"soft_hash_text_v0(text) ","text":"Creates a 256-bit similarity preserving hash for text input with algorithm v0.
Note Before passing text to this function it must be:
Parameters: Name Type Description Defaulttext str Plain text to be hashed. requiredReturns: Type Descriptionbytes 256-bit similarity preserving byte hash. Source code iniscc_core\\code_content_text.py "},{"location":"units/content/code_content_video/","title":"ISCC - Video-Code","text":"A similarity preserving hash for video content The Content-Code Video is generated from MPEG-7 video frame signatures. The iscc-sdk uses ffmpeg to extract frame signatures with the following command line parameters:
The relevant frame signatures can be parsed from the following elements in sig.xml:
Tip It is also possible to extract the signatures in a more compact binary format. But the format requires a custom binary parser to decode the frame signatures. "},{"location":"units/content/code_content_video/#iscc_core.code_content_video.gen_video_code","title":"gen_video_code(frame_sigs, bits = ic.core_opts.video_bits) ","text":"Create an ISCC Video-Code with the latest standard algorithm. Parameters: Name Type Description Defaultframe_sigs ic.FrameSig Sequence of MP7 frame signatures requiredbits int Bit-length of resulting Video-Code (multiple of 64) ic.core_opts.video_bits Returns: Type Descriptiondict ISCC object with Video-Code "},{"location":"units/content/code_content_video/#iscc_core.code_content_video.gen_video_code_v0","title":"gen_video_code_v0(frame_sigs, bits = ic.core_opts.video_bits) ","text":"Create an ISCC Video-Code with algorithm v0. Parameters: Name Type Description Defaultframe_sigs ic.FrameSig Sequence of MP7 frame signatures requiredbits int Bit-length of resulting Video-Code (multiple of 64) ic.core_opts.video_bits Returns: Type Descriptiondict ISCC object with Video-Code "},{"location":"units/content/code_content_video/#iscc_core.code_content_video.soft_hash_video_v0","title":"soft_hash_video_v0(frame_sigs, bits = ic.core_opts.video_bits) ","text":"Compute video hash v0 from MP7 frame signatures. Parameters: Name Type Description Defaultframe_sigs ic.FrameSig 2D matrix of MP7 frame signatures requiredbits int Bit-length of resulting Video-Code (multiple of 64) ic.core_opts.video_bits "},{"location":"utilities/utils/","title":"ISCC - Utilities","text":""},{"location":"utilities/utils/#iscc_core.utils.json_canonical","title":"json_canonical(obj) ","text":"Canonical, deterministic serialization of ISCC metadata. We serialize ISCC metadata in a deterministic/reproducible manner by using JCS (RFC 8785) canonicalization. 
Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.sliding_window","title":"sliding_window(seq, width) ","text":"Generate a sequence of equal \"width\" slices each advancing by one element. All types that have a length and can be sliced are supported (list, tuple, str ...). The result type matches the type of the input sequence. Fragment slices smaller than the width at the end of the sequence are not produced. If the input sequence is shorter than \"width\", a single element that is shorter than the requested width is returned. Parameters: Name Type Description Defaultseq Sequence Sequence of values to slide over requiredwidth int Width of sliding window in number of items requiredReturns: Type DescriptionGenerator A generator of window sized items Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_compare","title":"iscc_compare(a, b) ","text":"Calculate separate hamming distances of compatible components of two ISCCs Returns: Type Descriptiondict A dict with keys meta_dist, semantic_dist, content_dist, data_dist, instance_match Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_similarity","title":"iscc_similarity(a, b) ","text":"Calculate similarity of ISCC codes as a percentage value (0-100). MainType, SubType, Version and Length of the codes must be the same. Parameters: Name Type Description Defaulta ISCC a requiredb ISCC b requiredReturns: Type Descriptionint Similarity of ISCC a and b in percent (based on hamming distance) Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_distance","title":"iscc_distance(a, b) ","text":"Calculate hamming distance of ISCC codes. MainType, SubType, Version and Length of the codes must be the same. Parameters: Name Type Description Defaulta ISCC a requiredb ISCC b requiredReturns: Type Descriptionint Hamming distance in number of bits. 
Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_distance_bytes","title":"iscc_distance_bytes(a, b) ","text":"Calculate hamming distance for binary hash digests of equal length. Parameters: Name Type Description Defaulta bytes binary hash digest requiredb bytes binary hash digest requiredReturns: Type Descriptionint Hamming distance in number of bits. Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_pair_unpack","title":"iscc_pair_unpack(a, b) ","text":"Unpack two ISCC codes and return their body hash digests if their headers match. Headers match if their MainType, SubType, and Version are identical. Parameters: Name Type Description Defaulta ISCC a requiredb ISCC b requiredReturns: Type DescriptionTuple[bytes, bytes] Tuple with hash digests of a and b Raises: Type DescriptionValueError If ISCC headers don\u00b4t match Source code iniscc_core\\utils.py "}]}
\ No newline at end of file
+{"config":{"lang":["en"],"separator":"[\\s\\-]+","pipeline":["stopWordFilter"]},"docs":[{"location":"","title":"ISCC - Codec & Algorithms","text":"
The ISCC is a similarity preserving fingerprint and identifier for digital media assets. ISCCs are generated algorithmically from digital content, just like cryptographic hashes. However, instead of using a single cryptographic hash function to identify data only, the ISCC uses various algorithms to create a composite identifier that exhibits similarity-preserving properties (soft hash). The component-based structure of the ISCC identifies content at multiple levels of abstraction. Each component is self-describing, modular, and can be used separately or with others to aid in various content identification tasks. The algorithmic design supports content deduplication, database synchronization, indexing, integrity verification, timestamping, versioning, data provenance, similarity clustering, anomaly detection, usage tracking, allocation of royalties, fact-checking and general digital asset management use-cases. "},{"location":"#what-is-iscc-core","title":"What isiscc-core ","text":"
Tip This is a low-level reference implementation that does not include features like mediatype detection, metadata extraction or file format specific content extraction. Please have a look at iscc-sdk which adds those higher level features on top of the For reproducible installation of the reference implementation we included a "},{"location":"#testing-conformance","title":"Testing & Conformance","text":"The reference implementation comes with 100% test coverage. To run the conformance selftest from the repository root use To build a conformant implementation work through the following top level entrypoint functions: The corresponding test vectors can be found in Use the package manager pip to install "},{"location":"#quick-start","title":"Quick Start","text":" The output of this example is as follows: "},{"location":"#documentation","title":"Documentation","text":"Documentation is published at https://core.iscc.codes "},{"location":"#development","title":"Development","text":"Requirements
Development Setup Development Tasks Tests, coverage, code formatting and other tasks can be run with the Use @titusz "},{"location":"#contributing","title":"Contributing","text":"Pull requests are welcome. For significant changes, please open an issue first to discuss your plans. Please make sure to update tests as appropriate. You may also want to join our developer chat on Telegram at https://t.me/iscc_dev. "},{"location":"changelog/","title":"Changelog","text":""},{"location":"changelog/#109-2024-03-17","title":"[1.0.9] - 2024-03-17","text":"
An application that claims ISCC conformance MUST pass all core functions from the ISCC conformance test suite. The test suite is available as JSON data on GitHub. Test data is structured as follows: Inputs that are expected to be Example Byte-stream outputs in JSON test data: "},{"location":"conformance/#iscc_core.conformance.conformance_testdata","title":"conformance_testdata() ","text":"Yield tuples of test data. Returns: Type DescriptionGenerator[Tuple[str, Callable, List[Any], List[Any]]] Tuple with testdata (test_name, func_obj, inputs, outputs) "},{"location":"conformance/#iscc_core.conformance.conformance_selftest","title":"conformance_selftest() ","text":"Run conformance tests. Returns: Type Descriptionbool whether all tests passed "},{"location":"constants/","title":"ISCC - Types and Constants","text":""},{"location":"constants/#iscc_core.constants.MT","title":"MT ","text":""},{"location":"constants/#iscc_core.constants.MT--mt-maintypes","title":"MT - MainTypes","text":"Uint Symbol Bits Purpose 0 META 0000 Match on metadata similarity 1 SEMANTIC 0001 Match on semantic content similarity 2 CONTENT 0010 Match on perceptual content similarity 3 DATA 0011 Match on data similarity 4 INSTANCE 0100 Match on data identity 5 ISCC 0101 Composite of two or more ISCC-UNITs with common header"},{"location":"constants/#iscc_core.constants.ST","title":"ST ","text":""},{"location":"constants/#iscc_core.constants.ST--st-subtypes","title":"ST - SubTypes","text":"Uint Symbol Bits Purpose 0 NONE 0000 For MainTypes that do not specify SubTypes"},{"location":"constants/#iscc_core.constants.ST_CC","title":"ST_CC ","text":""},{"location":"constants/#iscc_core.constants.ST_CC--st_cc","title":"ST_CC","text":"SubTypes for ST_ISCC ","text":""},{"location":"constants/#iscc_core.constants.ST_ISCC--st_iscc","title":"ST_ISCC","text":"SubTypes for VS ","text":""},{"location":"constants/#iscc_core.constants.VS--vs-version","title":"VS - Version","text":"Code Version Uint Symbol Bits 
Purpose 0 V0 0000 Initial Version of Code without breaking changes"},{"location":"constants/#iscc_core.constants.LN","title":"LN ","text":""},{"location":"constants/#iscc_core.constants.LN--ln-length","title":"LN - Length","text":"Valid lengths for hash-digests.
MULTIBASE ","text":"Supported Multibase encodings.
A multi-component identifier for digital media assets. An ISCC-CODE can be generated from the concatenation of the digests of the following five ISCC-UNITs together with a single common header:
The following sequences of ISCC-UNITs are possible:
gen_iscc_code_v0(codes) ","text":"Combine multiple ISCC-UNITS to an ISCC-CODE with a common header using algorithm v0. Parameters: Name Type Description Defaultcodes Sequence[str] A valid sequence of singular ISCC-UNITS. requiredReturns: Type Descriptiondict An ISCC object with ISCC-CODE Source code iniscc_core\\iscc_code.py "},{"location":"iscc_id/","title":"ISCC-ID","text":"A decentralized, owned, and short identifier for digital assets. The ISCC-ID is generated from a similarity-hash of the units of an ISCC-CODE together with a blockchain wallet address. Its SubType designates the blockchain from which the ISCC-ID was minted. The similarity-hash is always at least 64-bits and optionally suffixed with a gen_iscc_id(iscc_code, chain_id, wallet, uc = 0) ","text":"Generate ISCC-ID from ISCC-CODE with the latest standard algorithm. Parameters: Name Type Description Defaultiscc_code str The ISCC-CODE from which to mint the ISCC-ID. requiredchain_id int Chain-ID of blockchain from which the ISCC-ID is minted. requiredwallet str The wallet address that signs the ISCC declaration requireduc int Uniqueness counter of ISCC-ID. 0 Returns: Type Descriptiondict ISCC object with an ISCC-ID "},{"location":"iscc_id/#iscc_core.iscc_id.gen_iscc_id_v0","title":"gen_iscc_id_v0(iscc_code, chain_id, wallet, uc = 0) ","text":"Generate an ISCC-ID from an ISCC-CODE with uniqueness counter 'uc' with algorithm v0. Parameters: Name Type Description Defaultiscc_code str The ISCC-CODE from which to mint the ISCC-ID. requiredchain_id int Chain-ID of blockchain from which the ISCC-ID is minted. requiredwallet str The wallet address that signs the ISCC declaration requireduc int Uniqueness counter of ISCC-ID. 0 Returns: Type Descriptiondict ISCC object with an ISCC-ID "},{"location":"iscc_id/#iscc_core.iscc_id.soft_hash_iscc_id_v0","title":"soft_hash_iscc_id_v0(iscc_code, wallet, uc = 0) ","text":"Calculate ISCC-ID hash digest from ISCC-CODE with algorithm v0. 
Accepts an ISCC-CODE or any sequence of ISCC-UNITs. Parameters: Name Type Description Defaultiscc_code str ISCC-CODE requiredwallet str The wallet address that signs the ISCC declaration requireduc int Uniqueness counter for ISCC-ID. 0 Returns: Type Descriptionbytes Digest for ISCC-ID without header but including uniqueness counter. "},{"location":"iscc_id/#iscc_core.iscc_id.iscc_id_incr","title":"iscc_id_incr(iscc_id) ","text":"Increment uniqueness counter of an ISCC-ID with latest standard algorithm. Parameters: Name Type Description Defaultiscc_id str Base32-encoded ISCC-ID. requiredReturns: Type Descriptionstr Base32-encoded ISCC-ID with counter incremented by one. "},{"location":"iscc_id/#iscc_core.iscc_id.iscc_id_incr_v0","title":"iscc_id_incr_v0(iscc_id) ","text":"Increment uniqueness counter of an ISCC-ID with algorithm v0. Parameters: Name Type Description Defaultiscc_id str Base32-encoded ISCC-ID. requiredReturns: Type Descriptionstr Base32-encoded ISCC-ID with counter incremented by one (without \"ISCC:\" prefix). "},{"location":"iscc_id/#iscc_core.iscc_id.alg_simhash_from_iscc_id","title":"alg_simhash_from_iscc_id(iscc_id, wallet) ","text":"Extract similarity preserving hex-encoded hash digest from ISCC-ID We need to un-xor the ISCC-ID hash digest with the wallet address hash to obtain the similarity preserving bytestring. "},{"location":"iso-reference/","title":"ISCC - ISO Reference","text":"The following functions are the reference implementations of ISO 24138: "},{"location":"iso-reference/#iso-24138-51-meta-code","title":"ISO 24138 / 5.1 Meta-Code","text":"gen_meta_code_v0(name, description=None, meta=None, bits=ic.core_opts.meta_bits) # Create an ISCC Meta-Code with the algorithm version 0. 
Parameters: Name Type Description Defaultname str Name or title of the work manifested by the digital asset requireddescription Optional[str] Optional description for disambiguation None meta Optional[Union[dict,str] Dict or Data-URL string with extended metadata None bits int Bit-length of resulting Meta-Code (multiple of 64) ic.core_opts.meta_bits Returns: Type Descriptiondict ISCC object with possible fields: iscc, name, description, metadata, metahash Source code iniscc_core\\code_meta.py "},{"location":"iso-reference/#iso-24138-53-text-code","title":"ISO 24138 / 5.3 Text-Code","text":"gen_text_code_v0(text, bits=ic.core_opts.text_bits) # Create an ISCC Text-Code with algorithm v0. Note Any markup (like HTML tags or markdown) should be removed from the plain-text before passing it to this function. Parameters: Name Type Description Defaulttext str Text for Text-Code creation requiredbits int Bit-length of ISCC Code Hash (default 64) ic.core_opts.text_bits Returns: Type Descriptiondict ISCC schema instance with Text-Code and an additional property iscc_core\\code_content_text.py "},{"location":"iso-reference/#iso-24138-54-image-code","title":"ISO 24138 / 5.4 Image-Code","text":"gen_image_code_v0(pixels, bits=ic.core_opts.image_bits) # Create an ISCC Content-Code Image with algorithm v0. Parameters: Name Type Description Defaultpixels Sequence[int] Normalized image pixels (32x32 flattened gray values) requiredbits int Bit-length of ISCC Content-Code Image (default 64). ic.core_opts.image_bits Returns: Type DescriptionISCC ISCC object with Content-Code Image. Source code iniscc_core\\code_content_image.py "},{"location":"iso-reference/#iso-24138-55-audio-code","title":"ISO 24138 / 5.5 Audio-Code","text":"gen_audio_code_v0(cv, bits=ic.core_opts.audio_bits) # Create an ISCC Content-Code Audio with algorithm v0. 
Parameters: Name Type Description Defaultcv Iterable[int] Chromaprint vector requiredbits int Bit-length resulting Content-Code Audio (multiple of 64) ic.core_opts.audio_bits Returns: Type Descriptiondict ISCC object with Content-Code Audio Source code iniscc_core\\code_content_audio.py "},{"location":"iso-reference/#iso-24138-56-video-code","title":"ISO 24138 / 5.6 Video-Code","text":"gen_video_code_v0(frame_sigs, bits=ic.core_opts.video_bits) # Create an ISCC Video-Code with algorithm v0. Parameters: Name Type Description Defaultframe_sigs ic.FrameSig Sequence of MP7 frame signatures requiredbits int Bit-length resulting Video-Code (multiple of 64) ic.core_opts.video_bits Returns: Type Descriptiondict ISCC object with Video-Code Source code iniscc_core\\code_content_video.py "},{"location":"iso-reference/#iso-24138-57-mixed-code","title":"ISO 24138 / 5.7 Mixed-Code","text":"gen_mixed_code_v0(codes, bits=ic.core_opts.mixed_bits) # Create an ISCC Content-Code-Mixed with algorithm v0. If the provided codes are of mixed length they are stripped to Parameters: Name Type Description Defaultcodes Iterable[str] a list of Content-Codes. requiredbits int Target bit-length of generated Content-Code-Mixed. ic.core_opts.mixed_bits Returns: Type Descriptiondict ISCC object with Content-Code Mixed. Source code iniscc_core\\code_content_mixed.py "},{"location":"iso-reference/#iso-24138-58-data-code","title":"ISO 24138 / 5.8 Data-Code","text":"gen_data_code_v0(stream, bits=ic.core_opts.data_bits) # Create an ISCC Data-Code with algorithm v0. Parameters: Name Type Description Defaultstream Stream Input data stream. requiredbits int Bit-length of ISCC Data-Code (default 64). 
ic.core_opts.data_bits Returns: Type Descriptiondict ISCC object with Data-Code Source code iniscc_core\\code_data.py "},{"location":"iso-reference/#iso-24138-59-instance-code","title":"ISO 24138 / 5.9 Instance-Code","text":"gen_instance_code_v0(stream, bits=ic.core_opts.instance_bits) # Create an ISCC Instance-Code with algorithm v0. Parameters: Name Type Description Defaultstream Stream Binary data stream for Instance-Code generation requiredbits int Bit-length of resulting Instance-Code (multiple of 64) ic.core_opts.instance_bits Returns: Type Descriptiondict ISCC object with Instance-Code and properties: datahash, filesize Source code iniscc_core\\code_instance.py "},{"location":"iso-reference/#iso-24138-60-iscc-code","title":"ISO 24138 / 6.0 ISCC-CODE","text":"gen_iscc_code_v0(codes) # Combine multiple ISCC-UNITS to an ISCC-CODE with a common header using algorithm v0. Parameters: Name Type Description Defaultcodes Sequence[str] A valid sequence of singular ISCC-UNITS. requiredReturns: Type Descriptiondict An ISCC object with ISCC-CODE Source code iniscc_core\\iscc_code.py "},{"location":"algorithms/cdc/","title":"ISCC - Content Defined Chunking","text":"Compatible with fastcdc "},{"location":"algorithms/cdc/#iscc_core.cdc.alg_cdc_chunks","title":"alg_cdc_chunks(data, utf32, avg_chunk_size = ic.core_opts.data_avg_chunk_size) ","text":"A generator that yields data-dependent chunks for Usage Example: Parameters: Name Type Description Defaultdata bytes Raw data for variable sized chunking. requiredutf32 bool If true assume we are chunking text that is utf32 encoded. requiredavg_chunk_size int Target chunk size in number of bytes. ic.core_opts.data_avg_chunk_size Returns: Type DescriptionGenerator[bytes] A generator that yields data chunks of variable sizes. Source code iniscc_core\\cdc.py "},{"location":"algorithms/cdc/#iscc_core.cdc.alg_cdc_offset","title":"alg_cdc_offset(buffer, mi, ma, cs, mask_s, mask_l) ","text":"Find breakpoint offset for a given buffer. 
Parameters: Name Type Description Defaultbuffer Data The data to be chunked. requiredmi int Minimum chunk size. requiredma int Maximum chunk size. requiredcs int Center size. requiredmask_s int Small mask. requiredmask_l int Large mask. requiredReturns: Type Descriptionint Offset of dynamic cutpoint in number of bytes. Source code iniscc_core\\cdc.py "},{"location":"algorithms/cdc/#iscc_core.cdc.alg_cdc_params","title":"alg_cdc_params(avg_size: int) -> tuple ","text":"Calculate CDC parameters Parameters: Name Type Description Defaultavg_size int Target average size of chunks in number of bytes. requiredReturns: Type Descriptiontuple Tuple of (min_size, max_size, center_size, mask_s, mask_l). Source code iniscc_core\\cdc.py "},{"location":"algorithms/dct/","title":"ISCC - Discrete Cosine Transform","text":""},{"location":"algorithms/dct/#iscc_core.dct.alg_dct","title":"alg_dct(v) ","text":"Discrete cosine transform. See: nayuki.io. Parameters: Name Type Description Defaultv Sequence[float] Input vector for DCT calculation. requiredReturns: Type DescriptionList DCT Transformed vector. Source code iniscc_core\\dct.py "},{"location":"algorithms/minhash/","title":"ISCC - Minhash","text":""},{"location":"algorithms/minhash/#iscc_core.minhash.alg_minhash","title":"alg_minhash(features) ","text":"Calculate a 64 dimensional minhash integer vector. Parameters: Name Type Description Defaultfeatures List[int] List of integer features requiredReturns: Type DescriptionList[int] Minhash vector Source code iniscc_core\\minhash.py "},{"location":"algorithms/minhash/#iscc_core.minhash.alg_minhash_64","title":"alg_minhash_64(features) ","text":"Create 64-bit minimum hash digest. 
Parameters: Name Type Description Defaultfeatures List[int] List of integer features requiredReturns: Type Descriptionbytes 64-bit binary from the least significant bits of the minhash values Source code iniscc_core\\minhash.py "},{"location":"algorithms/minhash/#iscc_core.minhash.alg_minhash_256","title":"alg_minhash_256(features) ","text":"Create 256-bit minimum hash digest. Parameters: Name Type Description Defaultfeatures List[int] List of integer features requiredReturns: Type Descriptionbytes 256-bit binary from the least significant bits of the minhash values Source code iniscc_core\\minhash.py "},{"location":"algorithms/minhash/#iscc_core.minhash.alg_minhash_compress","title":"alg_minhash_compress(mhash, lsb = 4) ","text":"Compress minhash vector to byte hash-digest. Concatenates Parameters: Name Type Description Defaultmhash List[int] List of minhash integer features requiredlsb int Number of the least significant bits to retain 4 Returns: Type Descriptionbytes 256-bit binary from the least significant bits of the minhash values Source code iniscc_core\\minhash.py "},{"location":"algorithms/simhash/","title":"ISCC - Simhash","text":""},{"location":"algorithms/simhash/#iscc_core.simhash.alg_simhash","title":"alg_simhash(hash_digests) ","text":"Creates a similarity preserving hash from a sequence of equal sized hash digests. Parameters: Name Type Description Defaulthash_digests list A sequence of equally sized byte-hashes. requiredReturns: Type Descriptionbytes Similarity byte-hash Source code iniscc_core\\simhash.py "},{"location":"algorithms/wtahash/","title":"ISCC - Winner Takes All Hash","text":""},{"location":"algorithms/wtahash/#iscc_core.wtahash.alg_wtahash","title":"alg_wtahash(vec: Sequence[float], bits: Sequence[float]) -> bytes ","text":"Calculate WTA Hash for vector with 380 values (MP7 frame signature). 
Source code iniscc_core\\wtahash.py "},{"location":"codec/","title":"ISCC - Codec","text":"This module implements encoding, decoding and transcoding functions of ISCC "},{"location":"codec/#codec-overview","title":"Codec Overview","text":""},{"location":"codec/#codec-functions","title":"Codec Functions","text":""},{"location":"codec/#iscc_core.codec.encode_component","title":"encode_component(mtype, stype, version, bit_length, digest) ","text":"Encode an ISCC-UNIT including header and body with standard base32 encoding. Note The Parameters: Name Type Description Defaultmtype MainType Maintype of unit (0-6) requiredstype SubType SubType of unit depending on MainType (0-5) requiredversion Version Version of unit algorithm (0). requiredbit_length length Length of unit, in number of bits (multiple of 32) requireddigest bytes The hash digest of the unit. requiredReturns: Type Descriptionstr Base32 encoded ISCC-UNIT. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_header","title":"encode_header(mtype, stype, version = 0, length = 1) ","text":"Encodes header values with nibble-sized (4-bit) variable-length encoding. The result is minimum 2 and maximum 8 bytes long. If the final count of nibbles is uneven it is padded with 4-bit Warning The length value must be encoded beforehand because its semantics depend on the MainType (see Parameters: Name Type Description Defaultmtype MainType MainType of unit. requiredstype SubType SubType of unit. requiredversion Version Version of component algorithm. 0 length Length length value of unit (1 means 64-bits for standard units) 1 Returns: Type Descriptionbytes Varnibble stream encoded ISCC header as bytes. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_header","title":"decode_header(data) ","text":"Decodes varnibble encoded header and returns it together with Tail data is included to enable decoding of sequential ISCCs. 
The returned tail data must be truncated to decode_length(r[0], r[3]) bits to recover the actual hash-bytes. Parameters: Name Type Description Defaultdata bytes ISCC bytes requiredReturns: Type DescriptionIsccTuple (MainType, SubType, Version, length, TailData) Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_varnibble","title":"encode_varnibble(n) ","text":"Writes integer to variable length sequence of 4-bit chunks. Variable-length encoding scheme: prefix bits nibbles data bits unsigned range 0 1 3 0 - 7 10 2 6 8 - 71 110 3 9 72 - 583 1110 4 12 584 - 4679Parameters: Name Type Description Defaultn int Positive integer to be encoded as varnibble (0-4679) requiredReturns: Type Descriptionbitarray Varnibble encoded integer Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_varnibble","title":"decode_varnibble(b) ","text":"Reads first varnibble, returns its integer value and remaining bits. Parameters: Name Type Description Defaultb bitarray Array of header bits requiredReturns: Type DescriptionTuple[int, bitarray] A tuple of the integer value of first varnibble and the remaining bits. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_units","title":"encode_units(units) ","text":"Encodes a combination of ISCC units to an integer between 0-7 to be used as length value for the final encoding of MT.ISCC Parameters: Name Type Description Defaultunits Tuple A tuple of a MainType combination (can be empty) requiredReturns: Type Descriptionint Integer value to be used as length-value for header encoding Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_units","title":"decode_units(unit_id) ","text":"Decodes an ISCC header length value that has been encoded with a unit_id to an ordered tuple of MainTypes. 
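For illustration, the varnibble scheme tabulated under encode_varnibble can be sketched in plain Python. This is a minimal sketch under stated assumptions, not the iscc-core implementation: it works on strings of '0' and '1' characters instead of a bitarray, and the names varnibble_encode_sketch and varnibble_decode_sketch are hypothetical.

```python
# Each row of the table: (prefix bits, number of data bits, first value of the range).
# The offsets 0, 8, 72 and 584 are the lower bounds of the unsigned ranges.
VARNIBBLE_ROWS = (('0', 3, 0), ('10', 6, 8), ('110', 9, 72), ('1110', 12, 584))


def varnibble_encode_sketch(n):
    # Pick the first row whose range contains n and emit prefix + data bits.
    for prefix, data_bits, offset in VARNIBBLE_ROWS:
        if offset <= n < offset + 2 ** data_bits:
            return prefix + format(n - offset, '0{}b'.format(data_bits))
    raise ValueError('value must be in the range 0-4679')


def varnibble_decode_sketch(bits):
    # Read the first varnibble and return (value, remaining bits).
    for prefix, data_bits, offset in VARNIBBLE_ROWS:
        if bits.startswith(prefix):
            size = len(prefix) + data_bits
            return int(bits[len(prefix):size], 2) + offset, bits[size:]
    raise ValueError('invalid varnibble prefix')


value, rest = varnibble_decode_sketch(varnibble_encode_sketch(600) + '0101')
print(value, rest)
```

Every encoded value occupies a whole number of nibbles (4, 8, 12 or 16 bits), which is why a header built from several varnibbles only needs padding when the total nibble count is odd.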
Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_length","title":"encode_length(mtype, length) ","text":"Encode length to integer value for header encoding. The For MainTypes For MainType For MainType Parameters: Name Type Description Defaultmtype MainType The MainType for which to encode the length value. requiredlength Length The length expressed according to the semantics of the type requiredReturns: Type Descriptionint The length value encoded as integer for use with write_header. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_length","title":"decode_length(mtype, length) ","text":"Decode raw length value from ISCC header to length of digest in number of bits. Decodes a raw header integer value into its semantically meaningful value (e.g. number of bits) Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_base32","title":"encode_base32(data) ","text":"Standard RFC4648 base32 encoding without padding. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_base32","title":"decode_base32(code) ","text":"Standard RFC4648 base32 decoding without padding and with casefolding. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_decompose","title":"iscc_decompose(iscc_code) ","text":"Decompose a normalized ISCC-CODE or any valid ISCC sequence into a list of ISCC-UNITS. A valid ISCC sequence is a string concatenation of ISCC-UNITS optionally separated by a hyphen. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_normalize","title":"iscc_normalize(iscc_code) ","text":"Normalize an ISCC to its canonical form. 
The canonical form of an ISCC is its shortest base32 encoded representation prefixed with the string Possible valid inputs: Info A concatenated sequence of codes will be composed into a single ISCC of MainType Example Parameters: Name Type Description Defaultiscc_code str Any valid ISCC string requiredReturns: Type Descriptionstr Normalized ISCC Source code iniscc_core\\codec.py "},{"location":"codec/#alternate-encodings","title":"Alternate Encodings","text":""},{"location":"codec/#iscc_core.codec.encode_base64","title":"encode_base64(data) ","text":"Standard RFC4648 base64url encoding without padding. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_base64","title":"decode_base64(code) ","text":"Standard RFC4648 base64url decoding without padding. Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.encode_base32hex","title":"encode_base32hex(data) ","text":"RFC4648 Base32hex encoding without padding see: https://tools.ietf.org/html/rfc4648#page-10 Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.decode_base32hex","title":"decode_base32hex(code) ","text":"RFC4648 Base32hex decoding without padding see: https://tools.ietf.org/html/rfc4648#page-10 Source code iniscc_core\\codec.py "},{"location":"codec/#helper-functions","title":"Helper Functions","text":""},{"location":"codec/#iscc_core.codec.iscc_decode","title":"iscc_decode(iscc) ","text":"Decode ISCC to an IsccTuple Parameters: Name Type Description Defaultiscc str ISCC string requiredReturns: Type DescriptionIsccTuple ISCC decoded to a tuple Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_explain","title":"iscc_explain(iscc) ","text":"Convert ISCC to a human-readable representation Parameters: Name Type Description Defaultiscc str ISCC string requiredReturns: Type Descriptionstr Human-readable representation of ISCC Source code iniscc_core\\codec.py 
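The unpadded RFC 4648 encodings described for encode_base32, decode_base32, encode_base64 and decode_base64 can be approximated with the Python standard library. This is a hedged sketch rather than the library's own code; the function names ending in _sketch are hypothetical, and restoring the padding before decoding is an assumption about how such a decoder is typically written.

```python
import base64


def b32_encode_sketch(data):
    # RFC 4648 base32, with the trailing '=' padding stripped.
    return base64.b32encode(data).decode('ascii').rstrip('=')


def b32_decode_sketch(code):
    # Restore padding to a multiple of 8 characters; casefold accepts lowercase input.
    padded = code + '=' * (-len(code) % 8)
    return base64.b32decode(padded, casefold=True)


def b64url_encode_sketch(data):
    # RFC 4648 base64url, with the trailing '=' padding stripped.
    return base64.urlsafe_b64encode(data).decode('ascii').rstrip('=')


def b64url_decode_sketch(code):
    # Restore padding to a multiple of 4 characters before decoding.
    padded = code + '=' * (-len(code) % 4)
    return base64.urlsafe_b64decode(padded)


digest = bytes(range(9))
assert b32_decode_sketch(b32_encode_sketch(digest).lower()) == digest
assert b64url_decode_sketch(b64url_encode_sketch(digest)) == digest
```

Stripping the padding is lossless because the number of missing characters can always be recomputed from the length of the encoded string.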
"},{"location":"codec/#iscc_core.codec.iscc_type_id","title":"iscc_type_id(iscc) ","text":"Extract and convert ISCC HEADER to a readable Type-ID string. Type-ids can be used as names in databases to index ISCC-UNITs seperatly. Parameters: Name Type Description Defaultiscc str ISCC string requiredReturns: Type Descriptionstr Unique Type-ID string Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_validate","title":"iscc_validate(iscc, strict = True) ","text":"Validate that a given string is a strictly well-formed ISCC. A strictly well-formed ISCC is:
Parameters: Name Type Description Defaultiscc str ISCC string requiredstrict bool Raise an exception if validation fails (default True) True Returns: Type Descriptionbool True if string is valid, else False. (raises ValueError in strict mode) Source code iniscc_core\\codec.py "},{"location":"codec/#iscc_core.codec.iscc_clean","title":"iscc_clean(iscc) ","text":"Cleanup ISCC string. Removes leading scheme, dashes, leading/trailing whitespace. Parameters: Name Type Description Defaultiscc str Any valid ISCC string requiredReturns: Type Descriptionstr Cleaned ISCC string. Source code iniscc_core\\codec.py "},{"location":"options/options/","title":"ISCC-CORE - Configuration Options","text":"Options for the iscc-core package can be configured using environment variables. Variables are loaded as class-attributes on the Example how to access configuration options "},{"location":"options/options/#iscc_core.options.CoreOptions","title":"CoreOptions","text":"Parameters with defaults for ISCC calculations. 
"},{"location":"options/options/#iscc_core.options.CoreOptions.meta_bits","title":"meta_bitsinstance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.meta_trim_name","title":"meta_trim_name instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.meta_trim_description","title":"meta_trim_description instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.meta_ngram_size_text","title":"meta_ngram_size_text instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.meta_ngram_size_bytes","title":"meta_ngram_size_bytes instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.text_bits","title":"text_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.text_ngram_size","title":"text_ngram_size instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.text_unicode_filter","title":"text_unicode_filter instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.text_newlines","title":"text_newlines instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.image_bits","title":"image_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.audio_bits","title":"audio_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.video_bits","title":"video_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.data_bits","title":"data_bits instance-attribute class-attribute ","text":" 
"},{"location":"options/options/#iscc_core.options.CoreOptions.data_avg_chunk_size","title":"data_avg_chunk_size instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.instance_bits","title":"instance_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.mixed_bits","title":"mixed_bits instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.io_read_size","title":"io_read_size instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.CoreOptions.cdc_gear","title":"cdc_gear instance-attribute class-attribute ","text":" "},{"location":"options/options/#iscc_core.options.conformanc_critical","title":"conformanc_critical module-attribute ","text":" "},{"location":"options/options/#iscc_core.options.has_logged_confromance","title":"has_logged_confromance module-attribute ","text":" "},{"location":"options/options/#iscc_core.options.conformance_check_options","title":"conformance_check_options","text":" Check and log if options have non-default conformance critical values "},{"location":"options/options/#iscc_core.options.core_opts","title":"core_optsmodule-attribute ","text":" "},{"location":"options/options/#iscc_core.options.conformant_options","title":"conformant_options module-attribute ","text":" "},{"location":"units/","title":"ISCC - UNITs","text":"A standard ISCC-CODE is build from multiple ISCC-UNITs. Each unit serve a different purpose. "},{"location":"units/code_data/","title":"ISCC - Data-Code","text":"A similarity perserving hash for binary data (soft hash). "},{"location":"units/code_data/#iscc_core.code_data.gen_data_code","title":"gen_data_code(stream, bits = ic.core_opts.data_bits) ","text":"Create a similarity preserving ISCC Data-Code with the latest standard algorithm. Parameters: Name Type Description Defaultstream Stream Input data stream. 
requiredbits int Bit-length of ISCC Data-Code (default 64). ic.core_opts.data_bits Returns: Type Descriptiondict ISCC Data-Code "},{"location":"units/code_data/#iscc_core.code_data.gen_data_code_v0","title":"gen_data_code_v0(stream, bits = ic.core_opts.data_bits) ","text":"Create an ISCC Data-Code with algorithm v0. Parameters: Name Type Description Defaultstream Stream Input data stream. requiredbits int Bit-length of ISCC Data-Code (default 64). ic.core_opts.data_bits Returns: Type Descriptiondict ISCC object with Data-Code "},{"location":"units/code_data/#iscc_core.code_data.soft_hash_data_v0","title":"soft_hash_data_v0(stream) ","text":"Create a similarity preserving Data-Hash digest Parameters: Name Type Description Defaultstream Stream Input data stream. requiredReturns: Type Descriptionbytes 256-bit Data-Hash (soft-hash) digest used as body for Data-Code "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0","title":"DataHasherV0 ","text":"Incremental Data-Hash generator. "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0.__init__","title":"__init__(data = None) ","text":"Create a DataHasher Parameters: Name Type Description Defaultdata Optional[Data] initial payload for hashing. None "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0.push","title":"push(data) ","text":"Push data to the Data-Hash generator. "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0.digest","title":"digest() ","text":"Calculate 256-bit minhash digest from feature hashes. "},{"location":"units/code_data/#iscc_core.code_data.DataHasherV0.code","title":"code(bits = ic.core_opts.data_bits) ","text":"Encode digest as an ISCC Data-Code unit. 
Parameters: Name Type Description Defaultbits int Number of bits for the ISCC Data-Code ic.core_opts.data_bits Returns: Type Descriptionstr ISCC Data-Code "},{"location":"units/code_flake/","title":"ISCC - Flake-Code","text":"A unique, time-sorted identifier composed of a 48-bit timestamp and 16 to 208 bits of randomness. The ISCC Flake-Code is a unique identifier for distributed ID generation. The 64-bit version can be used as an efficient surrogate key in database systems. It has guaranteed uniqueness if generated from a single process and is time sortable in integer and base32hex representation. The 128-bit version is a K-sortable, globally unique identifier for use in distributed systems and is compatible with UUID. Example "},{"location":"units/code_flake/#iscc_core.code_flake.gen_flake_code","title":"gen_flake_code(bits = ic.core_opts.flake_bits) ","text":"Create an ISCC Flake-Code with the latest standard algorithm Parameters: Name Type Description Defaultbits int Target bit-length of generated Flake-Code ic.core_opts.flake_bits Returns: Type Descriptiondict ISCC object with Flake-Code "},{"location":"units/code_flake/#iscc_core.code_flake.gen_flake_code_v0","title":"gen_flake_code_v0(bits = ic.core_opts.flake_bits) ","text":"Create an ISCC Flake-Code with algorithm v0 Parameters: Name Type Description Defaultbits int Target bit-length of generated Flake-Code ic.core_opts.flake_bits Returns: Type Descriptiondict ISCC object with Flake-Code "},{"location":"units/code_flake/#iscc_core.code_flake.uid_flake_v0","title":"uid_flake_v0(ts = None, bits = ic.core_opts.flake_bits) ","text":"Generate time and randomness based Flake-Hash Parameters: Name Type Description Defaultts Optional[float] Unix timestamp (defaults to current time) None bits int Bit-length resulting Flake-Code (multiple of 32) ic.core_opts.flake_bits Returns: Type Descriptionbytes Flake-Hash digest "},{"location":"units/code_instance/","title":"ISCC - Instance-Code","text":"A data checksum. 
"},{"location":"units/code_instance/#iscc_core.code_instance.gen_instance_code","title":"gen_instance_code(stream, bits = ic.core_opts.instance_bits) ","text":"Create an ISCC Instance-Code with the latest standard algorithm. Parameters: Name Type Description Defaultstream Stream Binary data stream for Instance-Code generation requiredbits int Bit-length resulting Instance-Code (multiple of 64) ic.core_opts.instance_bits Returns: Type Descriptiondict ISCC object with properties: iscc, datahash, filesize "},{"location":"units/code_instance/#iscc_core.code_instance.gen_instance_code_v0","title":"gen_instance_code_v0(stream, bits = ic.core_opts.instance_bits) ","text":"Create an ISCC Instance-Code with algorithm v0. Parameters: Name Type Description Defaultstream Stream Binary data stream for Instance-Code generation requiredbits int Bit-length of resulting Instance-Code (multiple of 64) ic.core_opts.instance_bits Returns: Type Descriptiondict ISCC object with Instance-Code and properties: datahash, filesize "},{"location":"units/code_instance/#iscc_core.code_instance.hash_instance_v0","title":"hash_instance_v0(stream) ","text":"Create 256-bit hash digest for the Instance-Code body Parameters: Name Type Description Defaultstream Stream Binary data stream for hash generation. requiredReturns: Type Descriptionbytes 256-bit Instance-Hash digest used as body of Instance-Code "},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0","title":"InstanceHasherV0 ","text":"Incremental Instance-Hash generator. "},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0.push","title":"push(data) ","text":"Push data to the Instance-Hash generator. 
Parameters: Name Type Description Defaultdata Data Data to be hashed required"},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0.digest","title":"digest() ","text":"Return Instance-Hash Returns: Type Descriptionbytes Instance-Hash digest "},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0.multihash","title":"multihash() ","text":"Return blake3 multihash Returns: Type Descriptionstr Blake3 hash as 256-bit multihash "},{"location":"units/code_instance/#iscc_core.code_instance.InstanceHasherV0.code","title":"code(bits = ic.core_opts.instance_bits) ","text":"Encode digest as an ISCC Instance-Code unit. Parameters: Name Type Description Defaultbits int Number of bits for the ISCC Instance-Code ic.core_opts.instance_bits Returns: Type Descriptionstr ISCC Instance-Code "},{"location":"units/code_meta/","title":"ISCC - Meta-Code","text":"A similarity preserving hash for digital asset metadata. "},{"location":"units/code_meta/#purpose","title":"Purpose","text":"The Meta-Code is the first possible (optional) unit of an ISCC-CODE. It is calculated from the metadata of a digital asset. The primary purpose of the Meta-Code is to aid the discovery of digital assets with similar metadata and the detection of metadata anomalies. As a secondary function, Meta-Code processing also creates a secure Meta-Hash for cryptographic binding purposes. "},{"location":"units/code_meta/#inputs","title":"Inputs","text":"The metadata supplied for Meta-Code calculation is called Seed-Metadata. Seed-Metadata has 3 possible elements:
Note Due to the broad applicability of the ISCC we do not prescribe a particular schema for the Data-URL Examples:
Data-URLs are also supported by all major internet browsers. "},{"location":"units/code_meta/#processing","title":"Processing","text":""},{"location":"units/code_meta/#meta-code","title":"Meta-Code","text":"The first 32-bits of a Meta-Code are calculated as a similarity hash from the Note To support automation and reproducibility, applications that generate ISCCs should prioritize metadata that is automatically extracted from the digital asset. If embedded metadata is not available or known to be unreliable an application should rely on external metadata or explicitly ask users to supply at least the If neither embedded nor external metadata is available, the application may resort to using the filename of the digital asset as value for the In addition to the Meta-Code we also create a cryptographic hash (the Meta-Hash) of the supplied Seed-Metadata. It is used to securely bind metadata to the digital asset. "},{"location":"units/code_meta/#functions","title":"Functions","text":""},{"location":"units/code_meta/#iscc_core.code_meta.gen_meta_code_v0","title":"gen_meta_code_v0(name, description = None, meta = None, bits = ic.core_opts.meta_bits) ","text":"Create an ISCC Meta-Code with the algorithm version 0. Parameters: Name Type Description Defaultname str Name or title of the work manifested by the digital asset requireddescription Optional[str] Optional description for disambiguation None meta Optional[Union[dict,str]] Dict or Data-URL string with extended metadata None bits int Bit-length of resulting Meta-Code (multiple of 64) ic.core_opts.meta_bits Returns: Type Descriptiondict ISCC object with possible fields: iscc, name, description, metadata, metahash Source code iniscc_core\\code_meta.py "},{"location":"units/code_meta/#iscc_core.code_meta.soft_hash_meta_v0","title":"soft_hash_meta_v0(name, extra = None) ","text":"Calculate similarity preserving 256-bit hash digest from asset metadata. 
Textual input should be stripped of markup, normalized and trimmed before hashing. Bytes input can be any serialized metadata (JSON, XML, Image...). Metadata should be serialized in a canonical form (for example JCS for JSON) Note The processing algorithm depends on the type of the
Parameters: Name Type Description Defaultname str Title of the work manifested in the digital asset requiredextra Union[str,bytes,None] Additional metadata for disambiguation None Returns: Type Descriptionbytes 256-bit simhash digest for Meta-Code Source code iniscc_core\\code_meta.py "},{"location":"units/code_meta/#iscc_core.code_meta.text_clean","title":"text_clean(text) ","text":"Clean text for display.
iscc_core\\code_meta.py "},{"location":"units/code_meta/#iscc_core.code_meta.text_remove_newlines","title":"text_remove_newlines(text) ","text":"Remove newlines. The Parameters: Name Type Description Defaulttext Text for newline removal requiredReturns: Type Descriptionstr Single line of text Source code iniscc_core\\code_meta.py "},{"location":"units/code_meta/#iscc_core.code_meta.text_trim","title":"text_trim(text, nbytes) ","text":"Trim text such that its utf-8 encoded size does not exceed iscc_core\\code_meta.py "},{"location":"units/content/","title":"ISCC - Content-Codes","text":""},{"location":"units/content/code_content_audio/","title":"ISCC - Audio-Code","text":"A similarity preserving hash for audio content (soft hash). Creates an ISCC object that provides an The Content-Code Audio is generated from a Chromaprint fingerprint provided as a vector of 32-bit signed integers. The iscc-sdk uses fpcalc to extract Chromaprint vectors with the following command line parameters:
gen_audio_code(cv, bits = ic.core_opts.audio_bits) ","text":"Create an ISCC Content-Code Audio with the latest standard algorithm. Parameters: Name Type Description Defaultcv Iterable[int] Chromaprint vector requiredbits int Bit-length resulting Content-Code Audio (multiple of 64) ic.core_opts.audio_bits Returns: Type Descriptiondict ISCC object with Content-Code Audio "},{"location":"units/content/code_content_audio/#iscc_core.code_content_audio.gen_audio_code_v0","title":"gen_audio_code_v0(cv, bits = ic.core_opts.audio_bits) ","text":"Create an ISCC Content-Code Audio with algorithm v0. Parameters: Name Type Description Defaultcv Iterable[int] Chromaprint vector requiredbits int Bit-length resulting Content-Code Audio (multiple of 64) ic.core_opts.audio_bits Returns: Type Descriptiondict ISCC object with Content-Code Audio "},{"location":"units/content/code_content_audio/#iscc_core.code_content_audio.soft_hash_audio_v0","title":"soft_hash_audio_v0(cv, bits = ic.core_opts.audio_bits) ","text":"Create audio similarity hash from a chromaprint vector. Parameters: Name Type Description Defaultcv Iterable[int] Chromaprint vector requiredbits int Bit-length resulting similarity hash (multiple of 32) ic.core_opts.audio_bits Returns: Type Descriptionbytes Audio-Hash digest "},{"location":"units/content/code_content_image/","title":"ISCC - Image-Code","text":"A similarity preserving perceptual hash for images. The ISCC Content-Code Image is created by calculating a discrete cosine transform on normalized image-pixels and comparing the values from the upper left area of the dct-matrix against their median values to set the hash-bits. Images must be normalized before using gen_image_code. Prepare images as follows:
gen_image_code(pixels, bits = ic.core_opts.image_bits) ","text":"Create an ISCC Content-Code Image with the latest standard algorithm. Parameters: Name Type Description Defaultpixels Sequence[int] Normalized image pixels (32x32 flattened gray values). requiredbits int Bit-length of ISCC Content-Code Image (default 64). ic.core_opts.image_bits Returns: Type DescriptionISCC ISCC object with Content-Code Image. "},{"location":"units/content/code_content_image/#iscc_core.code_content_image.gen_image_code_v0","title":"gen_image_code_v0(pixels, bits = ic.core_opts.image_bits) ","text":"Create an ISCC Content-Code Image with algorithm v0. Parameters: Name Type Description Defaultpixels Sequence[int] Normalized image pixels (32x32 flattened gray values) requiredbits int Bit-length of ISCC Content-Code Image (default 64). ic.core_opts.image_bits Returns: Type DescriptionISCC ISCC object with Content-Code Image. "},{"location":"units/content/code_content_image/#iscc_core.code_content_image.soft_hash_image_v0","title":"soft_hash_image_v0(pixels, bits = ic.core_opts.image_bits) ","text":"Calculate image hash from normalized grayscale pixel sequence of length 1024. Parameters: Name Type Description Defaultpixels Sequence[int] required bits int Bit-length of image hash (default 64). ic.core_opts.image_bits Returns: Type Descriptionbytes Similarity preserving Image-Hash digest. "},{"location":"units/content/code_content_mixed/","title":"ISCC - Mixed Code","text":"A similarity hash for mixed media content. Creates an ISCC object that provides a Many digital assets embed multiple assets of different mediatypes in a single file. Text documents may include images, video includes audio in most cases. The ISCC Content-Code-Mixed encodes the similarity of a collection of assets of the same or different mediatypes that may occur in a multimedia asset. Applications that create mixed Content-Codes must be capable to extract embedded assets and create individual Content-Codes per asset. 
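One common way to condense a collection of equal-sized digests into a single similarity hash, as the Mixed-Code description above requires, is a bitwise majority vote (simhash). The following is a minimal sketch under stated assumptions and not the iscc-core implementation: the name simhash_sketch is hypothetical, and setting a bit when exactly half of the digests have it set is an assumed tie-breaking rule.

```python
def simhash_sketch(hash_digests):
    # All digests are assumed to have the same length in bytes.
    n_bytes = len(hash_digests[0])
    n_bits = n_bytes * 8
    counts = [0] * n_bits

    # Count, per bit position, how many digests have that bit set.
    for digest in hash_digests:
        value = int.from_bytes(digest, 'big')
        for i in range(n_bits):
            if (value >> (n_bits - 1 - i)) & 1:
                counts[i] += 1

    # A result bit is set when at least half of the digests have it set.
    threshold = len(hash_digests) / 2
    result = 0
    for i, count in enumerate(counts):
        if count >= threshold:
            result |= 1 << (n_bits - 1 - i)
    return result.to_bytes(n_bytes, 'big')


digests = [bytes([0b11110000]), bytes([0b11000000]), bytes([0b10000001])]
print(simhash_sketch(digests).hex())
```

Because each output bit depends only on the majority at its position, replacing one digest out of many usually flips only a few bits of the result, which is what makes the hash similarity preserving.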
"},{"location":"units/content/code_content_mixed/#iscc_core.code_content_mixed.gen_mixed_code","title":"gen_mixed_code(codes, bits = ic.core_opts.mixed_bits) ","text":"Create an ISCC Content-Code Mixed with the latest standard algorithm. Parameters: Name Type Description Defaultcodes Iterable[str] a list of Content-Codes. requiredbits int Target bit-length of generated Content-Code-Mixed. ic.core_opts.mixed_bits Returns: Type Descriptiondict ISCC object with Content-Code Mixed. "},{"location":"units/content/code_content_mixed/#iscc_core.code_content_mixed.gen_mixed_code_v0","title":"gen_mixed_code_v0(codes, bits = ic.core_opts.mixed_bits) ","text":"Create an ISCC Content-Code-Mixed with algorithm v0. If the provided codes are of mixed length they are stripped to Parameters: Name Type Description Defaultcodes Iterable[str] a list of Content-Codes. requiredbits int Target bit-length of generated Content-Code-Mixed. ic.core_opts.mixed_bits Returns: Type Descriptiondict ISCC object with Content-Code Mixed. "},{"location":"units/content/code_content_mixed/#iscc_core.code_content_mixed.soft_hash_codes_v0","title":"soft_hash_codes_v0(cc_digests, bits = ic.core_opts.mixed_bits) ","text":"Create a similarity hash from multiple Content-Code digests. The similarity hash is created from the bodies of the input codes with the first byte of the code-header prepended. All codes must be of main-type CONTENT and have a minimum length of Parameters: Name Type Description Defaultcc_digests Sequence[bytes] a list of Content-Code digests. requiredbits int Target bit-length of generated Content-Code-Mixed. ic.core_opts.mixed_bits Returns: Type Descriptionbytes Similarity preserving byte hash. "},{"location":"units/content/code_content_text/","title":"ISCC - Text Code","text":"A similarity preserving hash for plain-text content (soft hash). The ISCC Text-Code is generated from plain-text that has been extracted from a media assets. 
Warning Plain-text extraction from documents in various formats (especially PDF) may yield very different results depending on the extraction tools being used. The iscc-sdk uses Apache Tika to extract text from documents for Text-Code generation. Algorithm overview
gen_text_code_v0(text, bits = ic.core_opts.text_bits) ","text":"Create an ISCC Text-Code with algorithm v0. Note Any markup (like HTML tags or markdown) should be removed from the plain-text before passing it to this function. Parameters: Name Type Description Defaulttext str Text for Text-Code creation requiredbits int Bit-length of ISCC Code Hash (default 64) ic.core_opts.text_bits Returns: Type Descriptiondict ISCC schema instance with Text-Code and an additional property iscc_core\\code_content_text.py "},{"location":"units/content/code_content_text/#iscc_core.code_content_text.text_collapse","title":"text_collapse(text) ","text":"Normalize and simplify text for similarity hashing.
Note See: Unicode normalization. Parameters: Name Type Description Defaulttext str Plain text to be collapsed. requiredReturns: Type Descriptionstr Collapsed plain text. Source code iniscc_core\\code_content_text.py "},{"location":"units/content/code_content_text/#iscc_core.code_content_text.soft_hash_text_v0","title":"soft_hash_text_v0(text) ","text":"Creates a 256-bit similarity preserving hash for text input with algorithm v0.
Note Before passing text to this function it must be:
Parameters: Name Type Description Defaulttext str Plain text to be hashed. requiredReturns: Type Descriptionbytes 256-bit similarity preserving byte hash. Source code iniscc_core\\code_content_text.py "},{"location":"units/content/code_content_video/","title":"ISCC - Video-Code","text":"A similarity preserving hash for video content. The Content-Code Video is generated from MPEG-7 video frame signatures. The iscc-sdk uses ffmpeg to extract frame signatures with the following command line parameters:
The relevant frame signatures can be parsed from the following elements in sig.xml:
Tip It is also possible to extract the signatures in a more compact binary format, but that format requires a custom binary parser to decode the frame signatures. "},{"location":"units/content/code_content_video/#iscc_core.code_content_video.gen_video_code","title":"gen_video_code(frame_sigs, bits = ic.core_opts.video_bits) ","text":"Create an ISCC Video-Code with the latest standard algorithm. Parameters: Name Type Description Defaultframe_sigs ic.FrameSig Sequence of MP7 frame signatures requiredbits int Bit-length resulting Video-Code (multiple of 64) ic.core_opts.video_bits Returns: Type Descriptiondict ISCC object with Video-Code "},{"location":"units/content/code_content_video/#iscc_core.code_content_video.gen_video_code_v0","title":"gen_video_code_v0(frame_sigs, bits = ic.core_opts.video_bits) ","text":"Create an ISCC Video-Code with algorithm v0. Parameters: Name Type Description Defaultframe_sigs ic.FrameSig Sequence of MP7 frame signatures requiredbits int Bit-length resulting Video-Code (multiple of 64) ic.core_opts.video_bits Returns: Type Descriptiondict ISCC object with Video-Code "},{"location":"units/content/code_content_video/#iscc_core.code_content_video.soft_hash_video_v0","title":"soft_hash_video_v0(frame_sigs, bits = ic.core_opts.video_bits) ","text":"Compute video hash v0 from MP7 frame signatures. Parameters: Name Type Description Defaultframe_sigs ic.FrameSig 2D matrix of MP7 frame signatures requiredbits int Bit-length of resulting Video-Code (multiple of 64) ic.core_opts.video_bits "},{"location":"utilities/utils/","title":"ISCC - Utilities","text":""},{"location":"utilities/utils/#iscc_core.utils.json_canonical","title":"json_canonical(obj) ","text":"Canonical, deterministic serialization of ISCC metadata. We serialize ISCC metadata in a deterministic/reproducible manner by using JCS (RFC 8785) canonicalization. 
Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.sliding_window","title":"sliding_window(seq, width) ","text":"Generate a sequence of equal \"width\" slices each advancing by one element. All types that have a length and can be sliced are supported (list, tuple, str ...). The result type matches the type of the input sequence. Fragment slices smaller than the width at the end of the sequence are not produced. If the input sequence is shorter than \"width\", then one element will be returned that is shorter than the requested width. Parameters: Name Type Description Defaultseq Sequence Sequence of values to slide over requiredwidth int Width of sliding window in number of items requiredReturns: Type DescriptionGenerator A generator of window sized items Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_compare","title":"iscc_compare(a, b) ","text":"Calculate separate hamming distances of compatible components of two ISCCs Returns: Type Descriptiondict A dict with keys meta_dist, semantic_dist, content_dist, data_dist, instance_match Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_similarity","title":"iscc_similarity(a, b) ","text":"Calculate similarity of ISCC codes as a percentage value (0-100). MainType, SubType, Version and Length of the codes must be the same. Parameters: Name Type Description Defaulta ISCC a requiredb ISCC b requiredReturns: Type Descriptionint Similarity of ISCC a and b in percent (based on hamming distance) Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_distance","title":"iscc_distance(a, b) ","text":"Calculate hamming distance of ISCC codes. MainType, SubType, Version and Length of the codes must be the same. Parameters: Name Type Description Defaulta ISCC a requiredb ISCC b requiredReturns: Type Descriptionint Hamming distance in number of bits. 
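The Hamming distance described above can be illustrated on raw digests. The following is a minimal sketch and not the iscc_core implementation: the helper name hamming_distance_bytes is hypothetical, and it assumes both digests have already been unpacked from their ISCC headers and are of equal length.

```python
def hamming_distance_bytes(a: bytes, b: bytes) -> int:
    # Hypothetical helper for illustration only (not part of iscc_core).
    if len(a) != len(b):
        raise ValueError('digests must have equal length')
    # XOR the digests as big-endian integers; differing bits become 1.
    diff = int.from_bytes(a, 'big') ^ int.from_bytes(b, 'big')
    # Count the set bits to get the distance in number of bits.
    return bin(diff).count('1')


# 0b1010 and 0b0110 differ in two bit positions.
distance = hamming_distance_bytes(bytes([0b1010]), bytes([0b0110]))
```

Treating the digests as integers keeps the sketch short; a byte-wise loop would give the same count.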
Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_distance_bytes","title":"iscc_distance_bytes(a, b) ","text":"Calculate hamming distance for binary hash digests of equal length. Parameters: Name Type Description Defaulta bytes binary hash digest requiredb bytes binary hash digest requiredReturns: Type Descriptionint Hamming distance in number of bits. Source code iniscc_core\\utils.py "},{"location":"utilities/utils/#iscc_core.utils.iscc_pair_unpack","title":"iscc_pair_unpack(a, b) ","text":"Unpack two ISCC codes and return their body hash digests if their headers match. Headers match if their MainType, SubType, and Version are identical. Parameters: Name Type Description Defaulta ISCC a requiredb ISCC b requiredReturns: Type DescriptionTuple[bytes, bytes] Tuple with hash digests of a and b Raises: Type DescriptionValueError If ISCC headers don't match Source code iniscc_core\\utils.py "}]}
\ No newline at end of file
diff --git a/sitemap.xml b/sitemap.xml
index 2743f7a..0332f99 100755
--- a/sitemap.xml
+++ b/sitemap.xml
@@ -2,137 +2,137 @@
|