Class CharsetUTF8

All Implemented Interfaces:
Comparable<Charset>
Direct Known Subclasses:
CharsetCESU8

class CharsetUTF8 extends CharsetICU
  • Field Details

    • fromUSubstitution

      private static final byte[] fromUSubstitution
    • BITMASK_FROM_UTF8

      private static final int[] BITMASK_FROM_UTF8
    • isCESU8

      private final boolean isCESU8
  • Constructor Details

    • CharsetUTF8

      public CharsetUTF8(String icuCanonicalName, String javaCanonicalName, String[] aliases)
  • Method Details

    • encodeHeadOf1

      private static final byte encodeHeadOf1(int char32)
    • encodeHeadOf2

      private static final byte encodeHeadOf2(int char32)
    • encodeHeadOf3

      private static final byte encodeHeadOf3(int char32)
    • encodeHeadOf4

      private static final byte encodeHeadOf4(int char32)
    • encodeThirdToLastTail

      private static final byte encodeThirdToLastTail(int char32)
    • encodeSecondToLastTail

      private static final byte encodeSecondToLastTail(int char32)
    • encodeLastTail

      private static final byte encodeLastTail(int char32)
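
      The seven private helpers above are undocumented. Their names suggest they compute the lead ("head") byte and the trailing bytes of a UTF-8 sequence. The following is a minimal sketch of that arithmetic as defined by the UTF-8 encoding form, not the ICU implementation; the class name Utf8BytesSketch and all of its method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration of UTF-8 head/tail byte arithmetic.
// Not the ICU code; names are chosen to mirror the helper names only.
final class Utf8BytesSketch {
    // Lead byte of a 1-byte sequence (U+0000..U+007F): 0xxxxxxx
    static byte headOf1(int c) { return (byte) (c & 0x7F); }

    // Lead byte of a 2-byte sequence (U+0080..U+07FF): 110xxxxx
    static byte headOf2(int c) { return (byte) (0xC0 | (c >>> 6)); }

    // Lead byte of a 3-byte sequence (U+0800..U+FFFF): 1110xxxx
    static byte headOf3(int c) { return (byte) (0xE0 | (c >>> 12)); }

    // Lead byte of a 4-byte sequence (U+10000..U+10FFFF): 11110xxx
    static byte headOf4(int c) { return (byte) (0xF0 | (c >>> 18)); }

    // Trail bytes all have the form 10xxxxxx and carry six payload bits each.
    static byte thirdToLastTail(int c)  { return (byte) (0x80 | ((c >>> 12) & 0x3F)); }
    static byte secondToLastTail(int c) { return (byte) (0x80 | ((c >>> 6) & 0x3F)); }
    static byte lastTail(int c)         { return (byte) (0x80 | (c & 0x3F)); }

    // Encodes one Unicode scalar value; rejects surrogates and out-of-range input.
    static byte[] encode(int c) {
        if (c < 0 || c > 0x10FFFF || (c >= 0xD800 && c <= 0xDFFF)) {
            throw new IllegalArgumentException("not a Unicode scalar value: " + c);
        }
        if (c <= 0x7F) {
            return new byte[] { headOf1(c) };
        }
        if (c <= 0x7FF) {
            return new byte[] { headOf2(c), lastTail(c) };
        }
        if (c <= 0xFFFF) {
            return new byte[] { headOf3(c), secondToLastTail(c), lastTail(c) };
        }
        return new byte[] { headOf4(c), thirdToLastTail(c), secondToLastTail(c), lastTail(c) };
    }

    public static void main(String[] args) {
        // Compare the sketch against the JDK's own UTF-8 encoder.
        int[] samples = { 0x41, 0xE9, 0x20AC, 0x1F600 };
        for (int cp : samples) {
            byte[] expected = new String(Character.toChars(cp)).getBytes(StandardCharsets.UTF_8);
            System.out.println("U+" + Integer.toHexString(cp).toUpperCase()
                    + " matches JDK: " + Arrays.equals(encode(cp), expected));
        }
    }
}
```

      The sketch rejects surrogate code points. A CESU-8 variant such as the CharsetCESU8 subclass would instead encode each surrogate of a pair as its own 3-byte sequence, which is presumably what the isCESU8 field selects.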
    • newDecoder

      public CharsetDecoder newDecoder()
      Specified by:
      newDecoder in class Charset
    • newEncoder

      public CharsetEncoder newEncoder()
      Specified by:
      newEncoder in class Charset
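
      Because newDecoder() and newEncoder() are specified by java.nio.charset.Charset, callers use them the same way as for any other charset. The sketch below shows a round trip through a fresh encoder and decoder. It uses the JDK's built-in StandardCharsets.UTF_8 as a stand-in, since this page does not show how a CharsetUTF8 instance is obtained; the class name CoderRoundTripSketch and its methods are hypothetical.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical usage sketch for Charset.newEncoder() / Charset.newDecoder().
final class CoderRoundTripSketch {
    // Encodes text with a fresh encoder; malformed input is reported, not replaced.
    static ByteBuffer encode(Charset cs, String text) {
        CharsetEncoder encoder = cs.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return encoder.encode(CharBuffer.wrap(text));
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("text cannot be encoded", e);
        }
    }

    // Decodes bytes with a fresh decoder; malformed input is reported, not replaced.
    static String decode(Charset cs, ByteBuffer bytes) {
        CharsetDecoder decoder = cs.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        try {
            return decoder.decode(bytes).toString();
        } catch (CharacterCodingException e) {
            throw new IllegalArgumentException("bytes cannot be decoded", e);
        }
    }

    public static void main(String[] args) {
        Charset cs = StandardCharsets.UTF_8; // stand-in for an ICU-provided charset
        String text = "h\u00e9llo \u20ac";
        ByteBuffer bytes = encode(cs, text);
        System.out.println("encoded length: " + bytes.remaining());
        System.out.println("round trip ok: " + text.equals(decode(cs, bytes)));
    }
}
```

      Encoders and decoders are stateful and not safe for concurrent use, so the sketch creates a new one per call rather than sharing instances.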
    • getUnicodeSetImpl

      void getUnicodeSetImpl(UnicodeSet setFillIn, int which)
      Description copied from class: CharsetICU
      Follows the ucnv.c function ucnv_detectUnicodeSignature() in detecting a signature at the start of the stream, for example U+FEFF (the Unicode BOM/signature character), which can be ignored. Detects Unicode signature byte sequences at the start of the byte stream and returns the number of bytes in the BOM of the indicated Unicode charset. Returns 0 when no Unicode signature is recognized.
      Specified by:
      getUnicodeSetImpl in class CharsetICU
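
      The inherited description talks about recognizing a Unicode signature at the start of a byte stream and reporting its length, with 0 meaning no signature. A minimal sketch of that check follows, under the assumption that only the common UTF-8, UTF-16 and UTF-32 signatures matter. It is not the ICU implementation, and SignatureSketch and signatureLength are hypothetical names.

```java
import java.nio.ByteBuffer;

// Hypothetical sketch of byte order mark (BOM) detection at the start of a buffer.
final class SignatureSketch {
    // Candidate signatures. The 4-byte UTF-32LE signature starts with the
    // UTF-16LE signature, so the UTF-32 entries are listed before UTF-16.
    private static final int[][] SIGNATURES = {
        { 0xEF, 0xBB, 0xBF },       // UTF-8
        { 0x00, 0x00, 0xFE, 0xFF }, // UTF-32BE
        { 0xFF, 0xFE, 0x00, 0x00 }, // UTF-32LE
        { 0xFE, 0xFF },             // UTF-16BE
        { 0xFF, 0xFE },             // UTF-16LE
    };

    // Returns the number of signature bytes at the buffer's current position,
    // or 0 if no signature is recognized. The buffer position is not changed.
    static int signatureLength(ByteBuffer source) {
        int start = source.position();
        for (int[] signature : SIGNATURES) {
            if (source.remaining() < signature.length) {
                continue;
            }
            boolean match = true;
            for (int i = 0; i < signature.length; i++) {
                // Absolute get() reads without advancing the position.
                if ((source.get(start + i) & 0xFF) != signature[i]) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return signature.length;
            }
        }
        return 0;
    }

    public static void main(String[] args) {
        ByteBuffer withBom = ByteBuffer.wrap(new byte[] { (byte) 0xEF, (byte) 0xBB, (byte) 0xBF, 'a' });
        ByteBuffer withoutBom = ByteBuffer.wrap(new byte[] { 'a', 'b' });
        System.out.println("with BOM: " + signatureLength(withBom));
        System.out.println("without BOM: " + signatureLength(withoutBom));
    }
}
```

      A caller that wants to ignore the signature can advance the buffer by the returned length before decoding.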