Class CzechStemmer


  • public class CzechStemmer
    extends java.lang.Object
    Light Stemmer for Czech.

    Implements the algorithm described in "Indexing and stemming approaches for the Czech language" (http://portal.acm.org/citation.cfm?id=1598600).

    • Constructor Summary

      Constructors 
      Constructor Description
      CzechStemmer()  
    • Method Summary

      Modifier and Type Method Description
      private int normalize​(char[] s, int len)  
      private int removeCase​(char[] s, int len)  
      private int removePossessives​(char[] s, int len)  
      int stem​(char[] s, int len)
      Stem an input buffer of Czech text.
      • Methods inherited from class java.lang.Object

        clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
    • Constructor Detail

      • CzechStemmer

        public CzechStemmer()
    • Method Detail

      • stem

        public int stem​(char[] s,
                        int len)
        Stem an input buffer of Czech text.
        Parameters:
        s - input buffer holding the word to stem
        len - number of valid characters in s
        Returns:
        new length of the valid content in s after stemming

        NOTE: Input is expected to be lowercase, with diacritical marks preserved.
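
        The buffer-and-length contract can be illustrated with a minimal caller-side sketch. Everything named here is hypothetical: BufferStemmer is a stand-in interface with the same (char[], int) -> int shape as stem, and the toy rule in main is a placeholder, not the Czech algorithm. The sketch assumes the stemmer edits the buffer in place and that only the first returned-length characters are meaningful afterwards.

```java
import java.util.Locale;

public class StemBufferContractSketch {

    /** Hypothetical stand-in with the same shape as stem(char[] s, int len). */
    interface BufferStemmer {
        int stem(char[] s, int len);
    }

    /** Lowercases a word, stems it in a char buffer, and returns the stem as a String. */
    static String stemWord(BufferStemmer stemmer, String word) {
        // Lowercase first; diacritical marks are left untouched.
        char[] buffer = word.toLowerCase(Locale.ROOT).toCharArray();
        // The stemmer returns the new logical length of the buffer.
        int newLen = stemmer.stem(buffer, buffer.length);
        // Only the first newLen characters are part of the result.
        return new String(buffer, 0, newLen);
    }

    public static void main(String[] args) {
        // Toy rule (assumption, NOT the Czech algorithm): drop one trailing 'y'.
        BufferStemmer toy = (s, len) -> (len > 3 && s[len - 1] == 'y') ? len - 1 : len;
        System.out.println(stemWord(toy, "Stromy"));
    }
}
```

        With the library on the classpath, the method reference new CzechStemmer()::stem should fit the same interface, since stem is declared as public int stem(char[] s, int len).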

      • removeCase

        private int removeCase​(char[] s,
                               int len)
      • removePossessives

        private int removePossessives​(char[] s,
                                      int len)
      • normalize

        private int normalize​(char[] s,
                              int len)