Class SmileConstants

java.lang.Object
com.fasterxml.jackson.dataformat.smile.SmileConstants

public final class SmileConstants extends Object
Constants used by SmileGenerator and SmileParser
  • Field Details

    • MAX_SHORT_VALUE_STRING_BYTES

      public static final int MAX_SHORT_VALUE_STRING_BYTES
      The encoding has special "short" forms for String values whose UTF-8 representation is 64 bytes or fewer.
      See Also:
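As a rough illustration of the limit described above, the following sketch checks whether a String value would be eligible for a short form. The class and method names are hypothetical, and the constant is a local copy of the 64-byte limit from the description, not a reference to the library itself.

```java
import java.nio.charset.StandardCharsets;

final class ShortValueCheck {
    // Local copy of the documented limit; real code would presumably use
    // SmileConstants.MAX_SHORT_VALUE_STRING_BYTES instead.
    static final int MAX_SHORT_VALUE_STRING_BYTES = 64;

    // Returns true if the UTF-8 form of the value is at most 64 bytes long.
    // Hypothetical helper for illustration only.
    static boolean fitsShortForm(String value) {
        int byteLength = value.getBytes(StandardCharsets.UTF_8).length;
        return byteLength <= MAX_SHORT_VALUE_STRING_BYTES;
    }
}
```

Note that the check is on encoded bytes, not on the number of chars, so a String of 33 two-byte characters already exceeds the limit.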
    • MAX_SHORT_NAME_ASCII_BYTES

      public static final int MAX_SHORT_NAME_ASCII_BYTES
      Maximum byte length for short ASCII names is 64.
      See Also:
    • MAX_SHORT_NAME_UNICODE_BYTES

      public static final int MAX_SHORT_NAME_UNICODE_BYTES
      The maximum byte length for short non-ASCII names is slightly lower than for ASCII names, because byte values 0xF8 and above must be reserved (although one more is gained because values 0 and 1 are not valid).
      See Also:
    • MAX_SHORT_NAME_ANY_BYTES

      public static final int MAX_SHORT_NAME_ANY_BYTES
      Regardless of ASCII/non-ASCII aspect, maximum byte length for any short name is then 64 bytes.
      See Also:
    • MAX_SHARED_NAMES

      public static final int MAX_SHARED_NAMES
      The longest back reference used for field names is 10 bits, so there is no point in keeping many more names around.
      See Also:
    • MAX_SHARED_STRING_VALUES

      public static final int MAX_SHARED_STRING_VALUES
      Longest back reference we use for short shared String values is 10 bits, so up to (1 << 10) values to keep track of.
      See Also:
    • MAX_SHARED_STRING_LENGTH_BYTES

      public static final int MAX_SHARED_STRING_LENGTH_BYTES
      Whereas names of any length can be referenced, only text values that count as "tiny" or "short" (ones encoded with a length prefix) are considered for sharing. This value therefore has to be the maximum length of Strings that can be encoded as such.
      See Also:
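The interplay of the 10-bit back-reference limit and the length restriction can be sketched as a small lookup table. This is a simplified illustration, not the library's implementation: the class and method names are hypothetical, the 64-byte length limit is taken from the "short" definition used elsewhere on this page, and clearing the table when it is full is an assumption about eviction.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

final class SharedStringTable {
    // 10-bit back references allow (1 << 10) = 1024 tracked values.
    static final int MAX_SHARED_STRING_VALUES = 1 << 10;
    // Assumed length limit for shareable values ("short" Strings).
    static final int MAX_SHARED_STRING_LENGTH_BYTES = 64;

    private final Map<String, Integer> indexByValue = new HashMap<>();

    // Returns the back-reference index if the value was seen before.
    // Otherwise returns -1, remembering the value if it is short enough.
    int lookupOrAdd(String value) {
        if (value.getBytes(StandardCharsets.UTF_8).length > MAX_SHARED_STRING_LENGTH_BYTES) {
            return -1; // too long to be shared at all
        }
        Integer existing = indexByValue.get(value);
        if (existing != null) {
            return existing;
        }
        if (indexByValue.size() >= MAX_SHARED_STRING_VALUES) {
            indexByValue.clear(); // assumption: start over when the table is full
        }
        indexByValue.put(value, indexByValue.size());
        return -1;
    }
}
```

A generator and a parser would both need to apply the same rules in the same order so that their indexes stay in sync.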
    • MIN_BUFFER_FOR_POSSIBLE_SHORT_STRING

      public static final int MIN_BUFFER_FOR_POSSIBLE_SHORT_STRING
      To keep the encoding logic tight and simple, the output buffer is always required to have this much space available before a possibly short String is encoded (3 bytes per char, since the longest UTF-8 encoding of a Java char is 3 bytes). Two extra bytes must be reserved as well: one for the token indicator and one for the terminating null byte (in case the value turns out not to be a short String after all).
      See Also:
    • INT_MARKER_END_OF_STRING

      public static final int INT_MARKER_END_OF_STRING
      A byte marker is needed to denote the end of variable-length Strings. Although the null byte is commonly used for this, it is avoided here because it cannot be embedded in Web Sockets content (and neither can 0xFF). There are multiple candidate byte values that cannot occur in UTF-8; 0xFC is chosen to allow a reasonable ordering (the highest values carry the most significant framing function, with 0xFF being end-of-content, and so on).
      See Also:
    • BYTE_MARKER_END_OF_STRING

      public static final byte BYTE_MARKER_END_OF_STRING
      See Also:
    • BYTE_MARKER_END_OF_CONTENT

      public static final byte BYTE_MARKER_END_OF_CONTENT
      In addition, a marker can be used to allow simple framing, that is, splitting physical data (such as a file) into distinct logical sections such as JSON documents. 0xFF makes sense here since it is also used as the end marker for Web Sockets.
      See Also:
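The framing idea can be sketched as a scan for the 0xFF marker. The class and method names below are hypothetical, and the sketch is only valid when the content contains no raw binary values, since raw binary may itself contain 0xFF (see HEADER_BIT_HAS_RAW_BINARY).

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

final class FrameSplitter {
    // Local copy of the documented end-of-content marker value.
    static final byte END_OF_CONTENT = (byte) 0xFF;

    // Splits the data into sections at each 0xFF byte; the marker bytes
    // themselves are not included in the returned sections.
    static List<byte[]> split(byte[] data) {
        List<byte[]> sections = new ArrayList<>();
        int start = 0;
        for (int i = 0; i < data.length; i++) {
            if (data[i] == END_OF_CONTENT) {
                sections.add(Arrays.copyOfRange(data, start, i));
                start = i + 1;
            }
        }
        if (start < data.length) {
            // Trailing section without a closing marker.
            sections.add(Arrays.copyOfRange(data, start, data.length));
        }
        return sections;
    }
}
```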
    • HEADER_BYTE_1

      public static final byte HEADER_BYTE_1
      First byte of data header (0x3A)
      See Also:
    • HEADER_BYTE_2

      public static final byte HEADER_BYTE_2
      Second byte of data header (0x29)
      See Also:
    • HEADER_BYTE_3

      public static final byte HEADER_BYTE_3
      Third byte of data header (0x0A, linefeed)
      See Also:
    • HEADER_VERSION_0

      public static final int HEADER_VERSION_0
      Current version consists of four zero bits (nibble)
      See Also:
    • HEADER_BYTE_4

      public static final byte HEADER_BYTE_4
      Fourth byte of data header; contains version nibble, may have flags
      See Also:
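A header check based on the four bytes described above might look like the following sketch. The class and method names are hypothetical. The first two byte values come from the descriptions; the third byte value (0x0A, linefeed) and the placement of the version in the upper nibble of the fourth byte are assumptions not stated on this page.

```java
final class SmileHeaderCheck {
    static final byte HEADER_BYTE_1 = 0x3A; // ':'
    static final byte HEADER_BYTE_2 = 0x29; // ')'
    static final byte HEADER_BYTE_3 = 0x0A; // '\n' (assumed value)

    // Returns true if the data starts with the three signature bytes
    // and has room for the fourth (version and flags) byte.
    static boolean hasHeader(byte[] data) {
        return data.length >= 4
                && data[0] == HEADER_BYTE_1
                && data[1] == HEADER_BYTE_2
                && data[2] == HEADER_BYTE_3;
    }

    // Extracts the version nibble, assuming it occupies the upper 4 bits
    // of the fourth header byte. Call only after hasHeader() returned true.
    static int version(byte[] data) {
        return (data[3] >> 4) & 0x0F;
    }
}
```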
    • HEADER_BIT_HAS_SHARED_NAMES

      public static final int HEADER_BIT_HAS_SHARED_NAMES
      Indicator bit that specifies whether encoded content may have shared names (back references to recently encoded field names). If no header is available, content must be processed as if this bit were set to true. If (and only if) the header exists and the bit value is 0, the parser can omit storing seen names, as it is guaranteed that no back references exist.
      See Also:
    • HEADER_BIT_HAS_SHARED_STRING_VALUES

      public static final int HEADER_BIT_HAS_SHARED_STRING_VALUES
      Indicator bit that specifies whether encoded content may have shared String values (back references to recently encoded 'short' String values, where short is defined as 64 bytes or less). If no header is available, the bit can be assumed to be 0 (false). If the header exists and the bit value is 1, the parser has to store up to 1024 most recently seen distinct short String values.
      See Also:
    • HEADER_BIT_HAS_RAW_BINARY

      public static final int HEADER_BIT_HAS_RAW_BINARY
      Indicator bit that specifies whether encoded content may contain raw (unquoted) binary values. If no header is available, the bit can be assumed to be 0 (false). If the header exists and the bit value is 1, the parser cannot assume that specific byte values always have their default meaning (specifically, the content end marker 0xFF and the header signature can be contained in binary values).

      Note that this bit being true does not automatically mean that such raw binary content indeed exists, just that it may exist. This is because the header is written before any binary data may be written.

      See Also:
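The three indicator bits and their no-header defaults can be summarized in a small decoder. This is a sketch under stated assumptions: the class is hypothetical, and the mask values 0x01, 0x02 and 0x04 are assumed, since this page does not list them. The defaults follow the descriptions above.

```java
final class SmileHeaderFlags {
    // Assumed mask values for the three indicator bits.
    static final int HAS_SHARED_NAMES = 0x01;
    static final int HAS_SHARED_STRING_VALUES = 0x02;
    static final int HAS_RAW_BINARY = 0x04;

    final boolean sharedNames;
    final boolean sharedStringValues;
    final boolean rawBinary;

    private SmileHeaderFlags(boolean sharedNames, boolean sharedStringValues, boolean rawBinary) {
        this.sharedNames = sharedNames;
        this.sharedStringValues = sharedStringValues;
        this.rawBinary = rawBinary;
    }

    // Defaults when no header is present: shared names must be assumed
    // possible, the other two features can be assumed absent.
    static SmileHeaderFlags withoutHeader() {
        return new SmileHeaderFlags(true, false, false);
    }

    // Decodes the flags from the fourth header byte.
    static SmileHeaderFlags fromFourthByte(byte b) {
        return new SmileHeaderFlags(
                (b & HAS_SHARED_NAMES) != 0,
                (b & HAS_SHARED_STRING_VALUES) != 0,
                (b & HAS_RAW_BINARY) != 0);
    }
}
```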
    • TOKEN_PREFIX_INTEGER

      public static final int TOKEN_PREFIX_INTEGER
      See Also:
    • TOKEN_PREFIX_FP

      public static final int TOKEN_PREFIX_FP
      See Also:
    • TOKEN_PREFIX_SHARED_STRING_SHORT

      public static final int TOKEN_PREFIX_SHARED_STRING_SHORT
      See Also:
    • TOKEN_PREFIX_SHARED_STRING_LONG

      public static final int TOKEN_PREFIX_SHARED_STRING_LONG
      See Also:
    • TOKEN_PREFIX_TINY_ASCII

      public static final int TOKEN_PREFIX_TINY_ASCII
      See Also:
    • TOKEN_PREFIX_SMALL_ASCII

      public static final int TOKEN_PREFIX_SMALL_ASCII
      See Also:
    • TOKEN_PREFIX_TINY_UNICODE

      public static final int TOKEN_PREFIX_TINY_UNICODE
      See Also:
    • TOKEN_PREFIX_SHORT_UNICODE

      public static final int TOKEN_PREFIX_SHORT_UNICODE
      See Also:
    • TOKEN_PREFIX_SMALL_INT

      public static final int TOKEN_PREFIX_SMALL_INT
      See Also:
    • TOKEN_PREFIX_MISC_OTHER

      public static final int TOKEN_PREFIX_MISC_OTHER
      See Also:
    • TOKEN_LITERAL_EMPTY_STRING

      public static final byte TOKEN_LITERAL_EMPTY_STRING
      See Also:
    • TOKEN_LITERAL_NULL

      public static final byte TOKEN_LITERAL_NULL
      See Also:
    • TOKEN_LITERAL_FALSE

      public static final byte TOKEN_LITERAL_FALSE
      See Also:
    • TOKEN_LITERAL_TRUE

      public static final byte TOKEN_LITERAL_TRUE
      See Also:
    • TOKEN_LITERAL_START_ARRAY

      public static final byte TOKEN_LITERAL_START_ARRAY
      See Also:
    • TOKEN_LITERAL_END_ARRAY

      public static final byte TOKEN_LITERAL_END_ARRAY
      See Also:
    • TOKEN_LITERAL_START_OBJECT

      public static final byte TOKEN_LITERAL_START_OBJECT
      See Also:
    • TOKEN_LITERAL_END_OBJECT

      public static final byte TOKEN_LITERAL_END_OBJECT
      See Also:
    • INT_MISC_BINARY_7BIT

      public static final int INT_MISC_BINARY_7BIT
      See Also:
    • INT_MISC_BINARY_RAW

      public static final int INT_MISC_BINARY_RAW
      See Also:
    • TOKEN_MISC_LONG_TEXT_ASCII

      public static final byte TOKEN_MISC_LONG_TEXT_ASCII
      Type (for misc, other) used for variable-length UTF-8 encoded text when it is known to contain only ASCII chars. Note: the 2 LSB are reserved for future use and must be zeroes for now.
      See Also:
    • TOKEN_MISC_LONG_TEXT_UNICODE

      public static final byte TOKEN_MISC_LONG_TEXT_UNICODE
      Type (for misc, other) used for variable-length UTF-8 encoded text when it is NOT known to contain only ASCII chars (which means it MAY have multi-byte characters). Note: the 2 LSB are reserved for future use and must be zeroes for now.
      See Also:
    • TOKEN_MISC_BINARY_7BIT

      public static final byte TOKEN_MISC_BINARY_7BIT
      Type (for misc, other) used for "safe" binary data (encoded using only the 7 LSB of each byte, giving an 8/7 expansion ratio). This is usually done to ensure that certain byte values (like 0xFF) are never included in encoded data. Note: the 2 LSB are reserved for future use and must be zeroes for now.
      See Also:
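To make the 8/7 expansion ratio concrete, here is a generic 7-bit packing sketch. It is not necessarily the exact bit layout the Smile format uses; the class and method names are hypothetical. It only demonstrates that 7 input bytes become 8 output bytes whose high bit is always clear, so a value like 0xFF can never appear in the output.

```java
final class SevenBitCodec {
    // Packs 8-bit input bytes into output bytes that use only their 7 LSB.
    static byte[] encode(byte[] input) {
        byte[] out = new byte[(input.length * 8 + 6) / 7];
        int acc = 0;   // bit accumulator
        int bits = 0;  // number of pending bits in acc
        int o = 0;
        for (byte b : input) {
            acc = (acc << 8) | (b & 0xFF);
            bits += 8;
            while (bits >= 7) {
                bits -= 7;
                out[o++] = (byte) ((acc >> bits) & 0x7F);
            }
            acc &= (1 << bits) - 1; // keep only the leftover bits
        }
        if (bits > 0) {
            // Left-align the remaining bits in a final 7-bit group.
            out[o] = (byte) ((acc << (7 - bits)) & 0x7F);
        }
        return out;
    }

    // Reverses encode(); the original length must be known to the caller.
    static byte[] decode(byte[] encoded, int originalLength) {
        byte[] out = new byte[originalLength];
        int acc = 0;
        int bits = 0;
        int o = 0;
        for (byte b : encoded) {
            acc = (acc << 7) | (b & 0x7F);
            bits += 7;
            if (bits >= 8 && o < originalLength) {
                bits -= 8;
                out[o++] = (byte) (acc >> bits);
                acc &= (1 << bits) - 1;
            }
        }
        return out;
    }
}
```

The decoder takes the original length as a parameter because the padded final group alone does not reveal how many input bytes there were.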
    • TOKEN_MISC_BINARY_RAW

      public static final byte TOKEN_MISC_BINARY_RAW
      Raw binary data marker is specifically chosen as separate from other types, since it can have significant impact on framing (or rather fast scanning based on structure and framing markers).
      See Also:
    • TOKEN_MISC_INTEGER_32

      public static final int TOKEN_MISC_INTEGER_32
      Numeric subtype (2 LSB) indicating 32-bit integer (int)
      See Also:
    • TOKEN_MISC_INTEGER_64

      public static final int TOKEN_MISC_INTEGER_64
      Numeric subtype (2 LSB) indicating 64-bit integer (long)
      See Also:
    • TOKEN_MISC_INTEGER_BIG

      public static final int TOKEN_MISC_INTEGER_BIG
      Numeric subtype (2 LSB) for indicating BigInteger type.
      See Also:
    • TOKEN_MISC_FLOAT_32

      public static final int TOKEN_MISC_FLOAT_32
      Numeric subtype (2 LSB) for indicating 32-bit IEEE single precision floating point number.
      See Also:
    • TOKEN_MISC_FLOAT_64

      public static final int TOKEN_MISC_FLOAT_64
      Numeric subtype (2 LSB) indicating 64-bit IEEE double precision floating point number.
      See Also:
    • TOKEN_MISC_FLOAT_BIG

      public static final int TOKEN_MISC_FLOAT_BIG
      Numeric subtype (2 LSB) for indicating BigDecimal type.
      See Also:
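Since the numeric subtype lives in the 2 LSB of the token byte, decoding can be sketched as a mask plus a lookup. All numeric values below are assumptions, because this page does not list them: the integer and floating-point prefixes are taken to be 0x24 and 0x28, and the subtypes 0, 1 and 2 are taken to mean 32-bit, 64-bit and "big" respectively. The class and method names are hypothetical.

```java
final class NumericSubtype {
    static final int SUBTYPE_MASK = 0x03;        // the 2 LSB
    static final int PREFIX_INTEGER = 0x24;      // assumed value
    static final int PREFIX_FP = 0x28;           // assumed value

    // Maps a token byte (0..255) to the Java type it would denote,
    // or "unknown" if it is not a recognized numeric token.
    static String describe(int token) {
        int prefix = token & ~SUBTYPE_MASK;
        int subtype = token & SUBTYPE_MASK;
        if (prefix == PREFIX_INTEGER) {
            switch (subtype) {
                case 0: return "int";
                case 1: return "long";
                case 2: return "BigInteger";
                default: return "unknown";
            }
        }
        if (prefix == PREFIX_FP) {
            switch (subtype) {
                case 0: return "float";
                case 1: return "double";
                case 2: return "BigDecimal";
                default: return "unknown";
            }
        }
        return "unknown";
    }
}
```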
    • TOKEN_KEY_EMPTY_STRING

      public static final byte TOKEN_KEY_EMPTY_STRING
      Uses the same code for an empty key as for an empty String value.
      See Also:
    • TOKEN_PREFIX_KEY_SHARED_LONG

      public static final int TOKEN_PREFIX_KEY_SHARED_LONG
      See Also:
    • TOKEN_KEY_LONG_STRING

      public static final byte TOKEN_KEY_LONG_STRING
      See Also:
    • TOKEN_PREFIX_KEY_SHARED_SHORT

      public static final int TOKEN_PREFIX_KEY_SHARED_SHORT
      See Also:
    • TOKEN_PREFIX_KEY_ASCII

      public static final int TOKEN_PREFIX_KEY_ASCII
      See Also:
    • TOKEN_PREFIX_KEY_UNICODE

      public static final int TOKEN_PREFIX_KEY_UNICODE
      See Also:
    • sUtf8UnitLengths

      public static final int[] sUtf8UnitLengths
      Additionally we can combine UTF-8 decoding info into similar data table. Values indicate "byte length - 1"; meaning -1 is used for invalid bytes, 0 for single-byte codes, 1 for 2-byte codes and 2 for 3-byte codes.
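A table following the "byte length - 1" convention can be reconstructed from the UTF-8 lead-byte patterns, as in the sketch below. This is an independent reconstruction, not a copy of the library's table: the class name is hypothetical, and the value 3 for 4-byte lead bytes is an assumption that extends the convention beyond what the description states.

```java
final class Utf8LengthTable {
    // Builds a 256-entry table indexed by unsigned byte value.
    static int[] build() {
        int[] table = new int[256];
        for (int b = 0; b < 256; b++) {
            if (b < 0x80) {
                table[b] = 0;   // single-byte (ASCII)
            } else if ((b & 0xE0) == 0xC0) {
                table[b] = 1;   // lead byte of a 2-byte sequence
            } else if ((b & 0xF0) == 0xE0) {
                table[b] = 2;   // lead byte of a 3-byte sequence
            } else if ((b & 0xF8) == 0xF0) {
                table[b] = 3;   // lead byte of a 4-byte sequence (assumed value)
            } else {
                table[b] = -1;  // continuation bytes and 0xF8..0xFF: invalid as lead
            }
        }
        return table;
    }
}
```

This sketch classifies by bit pattern only, so it does not reject 0xC0 and 0xC1, which can only start overlong encodings.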
  • Constructor Details

    • SmileConstants

      public SmileConstants()