Package org.jcodings.unicode
Class UnicodeEncoding
- java.lang.Object
  - org.jcodings.Encoding
    - org.jcodings.AbstractEncoding
      - org.jcodings.MultiByteEncoding
        - org.jcodings.unicode.UnicodeEncoding
- All Implemented Interfaces: java.lang.Cloneable
- Direct Known Subclasses: BaseUTF8Encoding, FixedWidthUnicodeEncoding, UTF16BEEncoding, UTF16LEEncoding
public abstract class UnicodeEncoding extends MultiByteEncoding
-
-
Nested Class Summary
Nested Classes (Modifier and Type, Class):
(package private) static class UnicodeEncoding.CodeRangeEntry
-
Field Summary
Fields (Modifier and Type, Field):
(package private) static int[] CaseFold_From
(package private) static int[] CaseFold_Locale_From
(package private) static int[][] CaseFold_Locale_To
(package private) static int[][] CaseFold_To
(package private) static int[] CaseUnfold_11_From
(package private) static int[] CaseUnfold_11_Locale_From
(package private) static int[][] CaseUnfold_11_Locale_To
(package private) static int[][] CaseUnfold_11_To
(package private) static int[][] CaseUnfold_12
(package private) static int[][] CaseUnfold_12_Locale
(package private) static int[][] CaseUnfold_13
(package private) static CaseInsensitiveBytesHash<java.lang.Integer> CTypeNameHash
(package private) static IntHash<int[]> FoldHash
private static int PROPERTY_NAME_MAX_SIZE
(package private) static IntHash<int[]> Unfold1Hash
(package private) static IntArrayHash<int[]> Unfold2Hash
(package private) static IntArrayHash<int[]> Unfold3Hash
(package private) static short[] UNICODE_ISO_8859_1_CTypeTable
-
Fields inherited from class org.jcodings.AbstractEncoding
EMPTY_FOLD_CODES
-
Fields inherited from class org.jcodings.Encoding
CHAR_INVALID, charset, hashCode, isAsciiCompatible, isDummy, isFixedWidth, isSingleByte, maxLength, minLength, name, NEW_LINE
-
-
Constructor Summary
Constructors (Modifier, Constructor):
protected UnicodeEncoding(java.lang.String name, int minLength, int maxLength, int[] EncLen)
protected UnicodeEncoding(java.lang.String name, int minLength, int maxLength, int[] EncLen, int[][] Trans)
-
Method Summary
Methods (Modifier and Type, Method, Description):
void applyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, java.lang.Object arg)
    onigenc_ascii_apply_all_case_fold / used also by multibyte encodings
CaseFoldCodeItem[] caseFoldCodesByString(int flag, byte[] bytes, int p, int end)
    onigenc_ascii_get_case_fold_codes_by_str / used also by multibyte encodings
protected int[] ctypeCodeRange(int ctype)
java.lang.String getCharsetName()
private static CaseInsensitiveBytesHash<java.lang.Integer> initializeCTypeNameTable()
private static IntHash<int[]> initializeFoldHash()
private static IntHash<int[]> initializeUnfold1Hash()
private static IntArrayHash<int[]> initializeUnfold2Hash()
private static IntArrayHash<int[]> initializeUnfold3Hash()
boolean isCodeCType(int code, int ctype)
    Checks whether the given code is of the given character type (used, for example, by isWord(someByte) and similar methods).
int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold)
    onigenc_ascii_mbc_case_fold
int propertyNameToCType(byte[] name, int p, int end)
    onigenc_minimum_property_name_to_ctype; notably overridden by Unicode encodings
-
Methods inherited from class org.jcodings.MultiByteEncoding
length, lengthForTwoUptoFour, mb2CodeToMbc, mb2CodeToMbcLength, mb2IsCodeCType, mb4CodeToMbc, mb4CodeToMbcLength, mb4IsCodeCType, mbnMbcCaseFold, mbnMbcToCode, missing, missing, safeLengthForUptoFour, safeLengthForUptoFourGreatedThan127, safeLengthForUptoThree, safeLengthForUptoTwo, strCodeAt, strLength
-
Methods inherited from class org.jcodings.AbstractEncoding
asciiApplyAllCaseFold, asciiCaseFoldCodesByString, asciiMbcCaseFold, isCodeCTypeInternal, isNewLine
-
Methods inherited from class org.jcodings.Encoding
asciiToLower, asciiToUpper, codeToMbc, codeToMbcLength, ctypeCodeRange, digitVal, equals, getCharset, getIndex, getName, hashCode, isAlnum, isAlpha, isAscii, isAscii, isAsciiCompatible, isBlank, isCntrl, isDigit, isDummy, isFixedWidth, isGraph, isLower, isMbcAscii, isMbcCrnl, isMbcHead, isMbcWord, isNewLine, isPrint, isPunct, isReverseMatchAllowed, isSbWord, isSingleByte, isSpace, isUpper, isWord, isWordGraphPrint, isXDigit, leftAdjustCharHead, length, load, maxLength, maxLengthDistance, mbcodeStartPosition, mbcToCode, minLength, odigitVal, prevCharHead, replicate, rightAdjustCharHead, rightAdjustCharHeadWithPrev, setName, setName, step, stepBack, strByteLengthNull, strLengthNull, strNCmp, toLowerCaseTable, toString, xdigitVal
-
Field Detail
-
PROPERTY_NAME_MAX_SIZE
private static final int PROPERTY_NAME_MAX_SIZE
See Also: Constant Field Values
-
UNICODE_ISO_8859_1_CTypeTable
static final short[] UNICODE_ISO_8859_1_CTypeTable
-
CTypeNameHash
static final CaseInsensitiveBytesHash<java.lang.Integer> CTypeNameHash
-
CaseFold_From
static final int[] CaseFold_From
-
CaseFold_To
static final int[][] CaseFold_To
-
CaseFold_Locale_From
static final int[] CaseFold_Locale_From
-
CaseFold_Locale_To
static final int[][] CaseFold_Locale_To
-
CaseUnfold_11_From
static final int[] CaseUnfold_11_From
-
CaseUnfold_11_To
static final int[][] CaseUnfold_11_To
-
CaseUnfold_11_Locale_From
static final int[] CaseUnfold_11_Locale_From
-
CaseUnfold_11_Locale_To
static final int[][] CaseUnfold_11_Locale_To
-
CaseUnfold_12
static final int[][] CaseUnfold_12
-
CaseUnfold_12_Locale
static final int[][] CaseUnfold_12_Locale
-
CaseUnfold_13
static final int[][] CaseUnfold_13
-
FoldHash
static final IntHash<int[]> FoldHash
-
Unfold1Hash
static final IntHash<int[]> Unfold1Hash
-
Unfold2Hash
static final IntArrayHash<int[]> Unfold2Hash
-
Unfold3Hash
static final IntArrayHash<int[]> Unfold3Hash
-
-
Method Detail
-
getCharsetName
public java.lang.String getCharsetName()
Overrides: getCharsetName in class Encoding
-
isCodeCType
public boolean isCodeCType(int code, int ctype)
Description copied from class: Encoding
Checks whether the given code is of the given character type (used, for example, by isWord(someByte) and similar methods).
Specified by: isCodeCType in class Encoding
Parameters:
code - a code point of a character
ctype - a character type to check against
Oniguruma equivalent: is_code_ctype
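To illustrate the kind of membership test a character-type check can reduce to, the following minimal sketch looks up a code point in a sorted table of inclusive ranges by binary search. The table layout (a range count followed by from/to pairs, modeled on Oniguruma-style code range tables), the class name CodeRangeSketch, and the sample ranges are assumptions made for illustration. This is not the jcodings implementation or its data.

```java
// Hypothetical sketch: not the jcodings implementation.
final class CodeRangeSketch {
    // Assumed layout: element 0 holds the number of ranges, followed by
    // inclusive (from, to) pairs sorted in ascending order.
    static final int[] SAMPLE_UPPER = {
        2,
        0x41, 0x5A,   // A..Z
        0xC0, 0xD6    // U+00C0..U+00D6
    };

    // Returns true if code falls inside one of the ranges in table.
    static boolean isInCodeRange(int[] table, int code) {
        int low = 0;
        int high = table[0];
        while (low < high) {
            int mid = (low + high) >>> 1;
            if (code > table[mid * 2 + 2]) {
                // code lies above the upper bound of range mid
                low = mid + 1;
            } else {
                high = mid;
            }
        }
        // low is the first range whose upper bound is >= code, if any
        return low < table[0] && code >= table[low * 2 + 1];
    }

    public static void main(String[] args) {
        System.out.println(isInCodeRange(SAMPLE_UPPER, 'Q'));
        System.out.println(isInCodeRange(SAMPLE_UPPER, 'q'));
    }
}
```

Binary search keeps the lookup logarithmic in the number of ranges, which matters when a character type covers many disjoint ranges.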
-
ctypeCodeRange
protected final int[] ctypeCodeRange(int ctype)
-
propertyNameToCType
public int propertyNameToCType(byte[] name, int p, int end)
Description copied from class: AbstractEncoding
onigenc_minimum_property_name_to_ctype; notably overridden by Unicode encodings
Overrides: propertyNameToCType in class AbstractEncoding
-
mbcCaseFold
public int mbcCaseFold(int flag, byte[] bytes, IntHolder pp, int end, byte[] fold)
Description copied from class: AbstractEncoding
onigenc_ascii_mbc_case_fold
Overrides: mbcCaseFold in class AbstractEncoding
Parameters:
flag - case fold flag
pp - an IntHolder that points at the character head
fold - a buffer that receives the case-folded character
Oniguruma equivalent: mbc_case_fold
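The parameter shape above (a position holder plus an output buffer) can be pictured with the following minimal sketch of table-driven case folding. It works on code points rather than encoded bytes to stay short. The class name CaseFoldSketch, the HashMap standing in for a fold table, the one-element int[] standing in for IntHolder, and the two sample mappings are all assumptions for illustration. It does not reproduce the jcodings implementation.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: not the jcodings implementation.
final class CaseFoldSketch {
    // Assumed stand-in for a fold table: code point -> folded code points.
    // The value is an array because one code point may fold to several.
    static final Map<Integer, int[]> FOLD = new HashMap<>();
    static {
        FOLD.put(0x41, new int[] {0x61});        // 'A' -> 'a'
        FOLD.put(0xDF, new int[] {0x73, 0x73});  // sharp s -> "ss"
    }

    // Folds the code point at codes[pos[0]], advances pos[0] past it,
    // writes the result into fold, and returns how many codes were written.
    static int caseFold(int[] codes, int[] pos, int[] fold) {
        int code = codes[pos[0]++];
        int[] to = FOLD.get(code);
        if (to == null) {
            fold[0] = code;  // no mapping: the code point folds to itself
            return 1;
        }
        System.arraycopy(to, 0, fold, 0, to.length);
        return to.length;
    }
}
```

Passing the position in a mutable holder lets the method both report the folded length and advance the caller's cursor in a single call.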
-
applyAllCaseFold
public void applyAllCaseFold(int flag, ApplyAllCaseFoldFunction fun, java.lang.Object arg)
Description copied from class: AbstractEncoding
onigenc_ascii_apply_all_case_fold / used also by multibyte encodings
Overrides: applyAllCaseFold in class AbstractEncoding
Parameters:
flag - case fold flag
fun - case folding functor (see ApplyCaseFold)
arg - case folding functor argument (see ApplyCaseFoldArg)
Oniguruma equivalent: apply_all_case_fold
-
caseFoldCodesByString
public CaseFoldCodeItem[] caseFoldCodesByString(int flag, byte[] bytes, int p, int end)
Description copied from class: AbstractEncoding
onigenc_ascii_get_case_fold_codes_by_str / used also by multibyte encodings
Overrides: caseFoldCodesByString in class AbstractEncoding
-
initializeCTypeNameTable
private static CaseInsensitiveBytesHash<java.lang.Integer> initializeCTypeNameTable()
-
initializeFoldHash
private static IntHash<int[]> initializeFoldHash()
-
initializeUnfold1Hash
private static IntHash<int[]> initializeUnfold1Hash()
-
initializeUnfold2Hash
private static IntArrayHash<int[]> initializeUnfold2Hash()
-
initializeUnfold3Hash
private static IntArrayHash<int[]> initializeUnfold3Hash()
-
-