Class MatchRatingApproachEncoder
java.lang.Object
org.apache.commons.codec.language.MatchRatingApproachEncoder
- All Implemented Interfaces:
Encoder, StringEncoder
Match Rating Approach phonetic algorithm, developed by Western Airlines in 1977.
This class is immutable and thread-safe.
- Since:
- 1.8
Field Summary
Fields -
Constructor Summary
Constructors -
Method Summary
Modifier and Type | Method | Description
(package private) String | cleanName(String name) | Cleans up a name.
final Object | encode(Object pObject) | Encodes an Object using the Match Rating Approach algorithm.
final String | encode(String name) | Encodes a String using the Match Rating Approach (MRA) algorithm.
(package private) String | getFirst3Last3(String name) | Gets the first and last 3 letters of a name if it is longer than 6 characters; otherwise returns the name unchanged.
(package private) int | getMinRating(int sumLength) | Obtains the minimum rating for the summed length of the 2 names.
boolean | isEncodeEquals(String name1, String name2) | Determines if two names are homophonous via the Match Rating Approach (MRA) algorithm.
(package private) boolean | isVowel(String letter) | Determines if a letter is a vowel.
(package private) int | leftToRightThenRightToLeftProcessing(String name1, String name2) | Processes the names from left to right (first), then right to left, removing identical letters in the same positions.
(package private) String | removeAccents(String accentedWord) | Removes accented letters and replaces them with their non-accented ASCII equivalents. Case is preserved.
(package private) String | removeDoubleConsonants(String name) | Replaces any double consonant pair with the single letter equivalent.
(package private) String | removeVowels(String name) | Deletes all vowels unless the vowel begins the word.
-
Field Details
-
SPACE
EMPTY
PLAIN_ASCII
The plain letter equivalent of the accented letters.
UNICODE
Unicode characters corresponding to various accented letters, for example Ú (U acute).
DOUBLE_CONSONANT
-
-
Constructor Details
-
MatchRatingApproachEncoder
public MatchRatingApproachEncoder()
-
-
Method Details
-
cleanName
Cleans up a name:
1. Upper-cases everything
2. Removes some common punctuation
3. Removes accents
4. Removes any spaces
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name - The name to be cleaned
- Returns:
- The cleaned name
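The four cleaning steps listed above can be sketched as follows. This is a minimal illustration, not the library's implementation: the class name `CleanNameSketch`, the exact punctuation set (`-`, `&`, `'`, `.`, `,`), and the use of `java.text.Normalizer` for accent removal are all assumptions made for this example.

```java
import java.text.Normalizer;
import java.util.Locale;

// Hypothetical helper; illustrates the documented cleaning order only.
final class CleanNameSketch {

    static String cleanName(String name) {
        // 1. Upper-case everything.
        String cleaned = name.toUpperCase(Locale.ENGLISH);
        // 2. Remove some common punctuation (assumed set: - & ' . ,).
        cleaned = cleaned.replaceAll("[-&'.,]", "");
        // 3. Remove accents: decompose, then drop the combining marks.
        cleaned = Normalizer.normalize(cleaned, Normalizer.Form.NFD)
                .replaceAll("\\p{M}+", "");
        // 4. Remove any whitespace.
        return cleaned.replaceAll("\\s+", "");
    }
}
```

Under these assumptions, `CleanNameSketch.cleanName("O'Brien-Smith, Jr.")` yields `OBRIENSMITHJR`.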
-
encode
Encodes an Object using the Match Rating Approach algorithm. This method exists to satisfy the requirements of the Encoder interface. It throws an EncoderException if the input object is not of type java.lang.String.
- Specified by:
encode in interface Encoder
- Parameters:
pObject - Object to encode
- Returns:
- An object (of type java.lang.String) containing the Match Rating Approach code that corresponds to the String supplied.
- Throws:
EncoderException - if the parameter supplied is not of type java.lang.String
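The type check described above follows a common delegation pattern, sketched here without any dependency on the library. `ObjectEncodeSketch`, `SketchEncoderException`, and `encodeString` are hypothetical names for illustration, and `encodeString` is only a placeholder, not the MRA encoding.

```java
import java.util.Locale;

// Hypothetical sketch of an Object-accepting encode method that
// rejects anything other than a String.
final class ObjectEncodeSketch {

    // Stand-in for the library's checked exception (assumed name).
    static final class SketchEncoderException extends Exception {
        SketchEncoderException(String message) {
            super(message);
        }
    }

    // Placeholder for the real String encoding; it only upper-cases.
    static String encodeString(String name) {
        return name.toUpperCase(Locale.ROOT);
    }

    static Object encode(Object input) throws SketchEncoderException {
        if (!(input instanceof String)) {
            throw new SketchEncoderException(
                    "Parameter supplied is not of type java.lang.String");
        }
        return encodeString((String) input);
    }
}
```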
-
encode
Encodes a String using the Match Rating Approach (MRA) algorithm.
- Specified by:
encode in interface StringEncoder
- Parameters:
name - String object to encode
- Returns:
- The MRA code corresponding to the String supplied
-
getFirst3Last3
Gets the first and last 3 letters of a name if it is longer than 6 characters; otherwise returns the name unchanged.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name - The string to get the substrings from
- Returns:
- The first 3 and last 3 letters of the input word, joined together.
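The truncation rule described above can be sketched in a few lines. The class and method names are hypothetical and the code is only an illustration of the rule as documented.

```java
// Hypothetical helper illustrating the "first 3 + last 3" rule.
final class First3Last3Sketch {

    static String firstThreeLastThree(String name) {
        int length = name.length();
        if (length > 6) {
            // Keep the first three and the last three characters.
            return name.substring(0, 3) + name.substring(length - 3);
        }
        // Six characters or fewer: return the name as-is.
        return name;
    }
}
```

For example, an 8-letter input such as `ABCDEFGH` becomes `ABCFGH`, while `SMITH` is returned unchanged.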
-
getMinRating
int getMinRating(int sumLength)
Obtains the minimum rating for the summed length of the 2 names. In essence, the larger the summed length, the smaller the minimum rating. Values are taken strictly from the MRA documentation.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
sumLength - The combined length of the 2 strings passed in
- Returns:
- The min rating value
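As an illustration of how the minimum rating shrinks as the summed length grows, here is a sketch based on the commonly published MRA threshold table (sum up to 4 gives 5, 5 to 7 gives 4, 8 to 11 gives 3, 12 gives 2). The class name is hypothetical, and the value returned for sums above 12 is an assumption, since that table stops at 12.

```java
// Hypothetical lookup of the MRA minimum rating for a summed length.
final class MinRatingSketch {

    static int minRating(int sumLength) {
        if (sumLength <= 4) {
            return 5;
        } else if (sumLength <= 7) {   // 5 to 7
            return 4;
        } else if (sumLength <= 11) {  // 8 to 11
            return 3;
        } else if (sumLength == 12) {
            return 2;
        }
        // Assumption: sums above 12 fall back to the lowest rating.
        return 1;
    }
}
```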
-
isEncodeEquals
Determines if two names are homophonous via the Match Rating Approach (MRA) algorithm. Note that the strings are cleaned in the same way as in encode(String).
- Parameters:
name1 - First of the 2 strings (names) to compare
name2 - Second of the 2 names to compare
- Returns:
true if the encodings are identical, false otherwise.
-
isVowel
Determines if a letter is a vowel.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
letter - The letter under investigation
- Returns:
- True if a vowel, else false
-
leftToRightThenRightToLeftProcessing
Processes the names from left to right (first), then right to left, removing identical letters in the same positions. It then subtracts the length of the longer remaining string from 6 and returns the result.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name1 - The first name to compare
name2 - The second name to compare
- Returns:
- 6 minus the length of the longer string that remains
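The two-pass comparison described above can be sketched as below. This follows the classic MRA description (left-to-right pass first, then a right-to-left pass over what remains); the library's exact traversal may differ, and the class and method names are hypothetical.

```java
// Hypothetical sketch of the MRA similarity rating for two encoded names.
final class MraSimilaritySketch {

    static int similarity(String name1, String name2) {
        StringBuilder rest1 = new StringBuilder(name1);
        StringBuilder rest2 = new StringBuilder(name2);

        // Pass 1 (left-aligned): drop letters that match at the same index.
        // Walking from the highest shared index down keeps the lower
        // indices valid while characters are deleted.
        int shared = Math.min(name1.length(), name2.length());
        for (int i = shared - 1; i >= 0; i--) {
            if (name1.charAt(i) == name2.charAt(i)) {
                rest1.deleteCharAt(i);
                rest2.deleteCharAt(i);
            }
        }

        // Pass 2 (right-aligned): compare the leftovers from the end.
        String r1 = rest1.toString();
        String r2 = rest2.toString();
        int sharedTail = Math.min(r1.length(), r2.length());
        for (int k = 0; k < sharedTail; k++) {
            int i1 = r1.length() - 1 - k;
            int i2 = r2.length() - 1 - k;
            if (r1.charAt(i1) == r2.charAt(i2)) {
                rest1.deleteCharAt(i1);
                rest2.deleteCharAt(i2);
            }
        }

        // Rating: 6 minus the length of the longer remaining string.
        return 6 - Math.max(rest1.length(), rest2.length());
    }
}
```

With the encoded pair `BYRN` and `BRN`, pass 1 removes the shared `B`, pass 2 removes `N` and `R`, and the single leftover `Y` gives a rating of 5.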
-
removeAccents
Removes accented letters and replaces them with their non-accented ASCII equivalents. Case is preserved.
http://www.codecodex.com/wiki/Remove_accent_from_letters_%28ex_.%C3%A9_to_e%29
- Parameters:
accentedWord - The word that may have accents in it.
- Returns:
- De-accented word
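One way to replace accented letters while preserving case is a lookup across two parallel strings, in the spirit of the UNICODE and PLAIN_ASCII fields. The sketch below uses a deliberately tiny, hypothetical table (À à É é Ú ú Ñ ñ); it is not the library's table, and the class name is an assumption.

```java
// Hypothetical table-driven accent removal using two parallel strings.
final class AccentTableSketch {

    // Accented characters (illustrative subset only).
    private static final String ACCENTED =
            "\u00C0\u00E0\u00C9\u00E9\u00DA\u00FA\u00D1\u00F1";
    // Plain ASCII equivalents, position by position.
    private static final String PLAIN = "AaEeUuNn";

    static String removeAccents(String word) {
        if (word == null) {
            return null;
        }
        StringBuilder out = new StringBuilder(word.length());
        for (int i = 0; i < word.length(); i++) {
            char c = word.charAt(i);
            int pos = ACCENTED.indexOf(c);
            // Substitute when the character is in the table; keep it otherwise.
            out.append(pos >= 0 ? PLAIN.charAt(pos) : c);
        }
        return out.toString();
    }
}
```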
-
removeDoubleConsonants
Replaces any double consonant pair with the single letter equivalent.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name - String to have double consonants removed
- Returns:
- Single consonant word
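Collapsing doubled consonants can be sketched with a regular-expression backreference. This is an illustrative approach, not necessarily how the library does it; the class name is hypothetical, and treating every letter other than A, E, I, O, U as a consonant is an assumption.

```java
import java.util.Locale;
import java.util.regex.Pattern;

// Hypothetical sketch: replace each doubled consonant pair with one letter.
final class DoubleConsonantSketch {

    // A consonant followed by the same consonant (vowels are excluded).
    private static final Pattern DOUBLED =
            Pattern.compile("([B-DF-HJ-NP-TV-Z])\\1");

    static String collapse(String name) {
        // Upper-case first so the character class applies to every letter.
        String upper = name.toUpperCase(Locale.ENGLISH);
        return DOUBLED.matcher(upper).replaceAll("$1");
    }
}
```

For instance, `HARRISSON` becomes `HARISON`, while the doubled vowel in `AARON` is left alone.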
-
removeVowels
Deletes all vowels unless the vowel begins the word.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name - The name to have vowels removed
- Returns:
- De-voweled word
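The vowel rule above can be sketched by always keeping the first character and filtering the rest. The class name is hypothetical, and the sketch assumes the input has already been upper-cased (as a cleaned name would be), so only A, E, I, O, U are treated as vowels.

```java
// Hypothetical sketch: drop vowels everywhere except the first position.
final class RemoveVowelsSketch {

    static String removeVowels(String name) {
        if (name.isEmpty()) {
            return name;
        }
        // The first character is always kept, vowel or not.
        StringBuilder out = new StringBuilder().append(name.charAt(0));
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if ("AEIOU".indexOf(c) < 0) {
                out.append(c);
            }
        }
        return out.toString();
    }
}
```

Under that assumption, `EWELL` becomes `EWLL` (the leading vowel survives) and `SMITH` becomes `SMTH`.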
-