public class ADNameSampleStream extends Object implements ObjectStream<NameSample>
The data contains ten named entity types: Person, Organization, Group,
Place, Event, ArtProd, Abstract, Thing, Time and Numeric.
Data can be found on this web site:
http://www.linguateca.pt/floresta/corpus.html
Information about the format:
Susana Afonso. "Árvores deitadas: Descrição do formato e das opções de análise na Floresta Sintáctica". 12 February 2006.
http://www.linguateca.pt/documentos/Afonso2006ArvoresDeitadas.pdf
Detailed info about the NER tagset: http://beta.visl.sdu.dk/visl/pt/info/portsymbol.html#semtags_names
Note: Do not use this class, internal use only!
Constructor and Description |
---|
ADNameSampleStream(InputStream in, String charsetName, boolean splitHyphenatedTokens): Creates a new NameSample stream from an InputStream. |
ADNameSampleStream(ObjectStream<String> lineStream, boolean splitHyphenatedTokens): Creates a new NameSample stream from a line stream, i.e. an ObjectStream<String>, that could be a PlainTextByLineStream object. |
Modifier and Type | Method and Description |
---|---|
void | close(): Closes the ObjectStream and releases all allocated resources. |
NameSample | read(): Returns the next object. |
void | reset(): Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly. |
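The three methods summarized above are usually consumed in a read-until-null loop: in the OpenNLP ObjectStream contract, read() returns null once the source is exhausted. The following is a minimal sketch of that pattern only. `SketchStream`, `ListBackedStream` and `ReadLoopSketch` are hypothetical stand-ins written for this illustration, not OpenNLP types, and throwing an IOException after close() is an assumption of the stand-in rather than documented behavior of this class.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Iterator;
import java.util.List;

// Illustrative sketch of the read / reset / close lifecycle.
final class ReadLoopSketch {

    // Reads until read() returns null and collects every object seen.
    static <T> List<T> drain(SketchStream<T> stream) throws IOException {
        List<T> out = new ArrayList<>();
        T item;
        while ((item = stream.read()) != null) {
            out.add(item);
        }
        return out;
    }

    public static void main(String[] args) throws IOException {
        try (SketchStream<String> stream =
                 new ListBackedStream<>(Arrays.asList("sample 1", "sample 2"))) {
            System.out.println(drain(stream)); // first pass over the data
            stream.reset();                    // reposition at the beginning
            System.out.println(drain(stream)); // the same sequence again
        }
    }
}

// Hypothetical stand-in for the ObjectStream contract (not the OpenNLP interface).
interface SketchStream<T> extends AutoCloseable {
    T read() throws IOException;     // next object, or null when exhausted
    void reset() throws IOException; // start again from the first object
    @Override
    void close() throws IOException; // release resources
}

// Hypothetical in-memory implementation used only to make the sketch runnable.
final class ListBackedStream<T> implements SketchStream<T> {
    private final List<T> items;
    private Iterator<T> iterator;
    private boolean closed;

    ListBackedStream(List<T> items) {
        this.items = new ArrayList<>(items);
        this.iterator = this.items.iterator();
    }

    @Override
    public T read() throws IOException {
        if (closed) {
            throw new IOException("stream is closed"); // assumption of this sketch
        }
        return iterator.hasNext() ? iterator.next() : null;
    }

    @Override
    public void reset() throws IOException {
        if (closed) {
            throw new IOException("stream is closed"); // assumption of this sketch
        }
        iterator = items.iterator();
    }

    @Override
    public void close() {
        closed = true;
    }
}
```

With a real ADNameSampleStream the loop body would receive NameSample objects instead of strings; the control flow stays the same.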
public ADNameSampleStream(ObjectStream<String> lineStream, boolean splitHyphenatedTokens)
Creates a new NameSample stream from a line stream, i.e. an ObjectStream<String>, that could be a PlainTextByLineStream object.
Parameters:
lineStream - a stream of lines as String
splitHyphenatedTokens - if true, hyphenated tokens will be separated: "carros-monstro" > "carros" "-" "monstro"
public ADNameSampleStream(InputStream in, String charsetName, boolean splitHyphenatedTokens)
Creates a new NameSample stream from an InputStream.
Parameters:
in - the corpus InputStream
charsetName - the charset of the Arvores Deitadas corpus
splitHyphenatedTokens - if true, hyphenated tokens will be separated: "carros-monstro" > "carros" "-" "monstro"
public NameSample read() throws IOException
Description copied from interface: ObjectStream
Returns the next object.
Specified by:
read in interface ObjectStream<NameSample>
Throws:
IOException
public void reset() throws IOException, UnsupportedOperationException
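Description copied from interface: ObjectStream
Repositions the stream at the beginning and the previously seen object sequence will be repeated exactly.
Specified by:
reset in interface ObjectStream<NameSample>
Throws:
IOException
UnsupportedOperationException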
public void close() throws IOException
Description copied from interface: ObjectStream
Closes the ObjectStream and releases all allocated resources. After close has been called, it is not allowed to call read or reset.
Specified by:
close in interface ObjectStream<NameSample>
Throws:
IOException
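To make the splitHyphenatedTokens flag of both constructors concrete, here is a small sketch of the documented effect, "carros-monstro" > "carros" "-" "monstro". It is not the code this class uses internally: `HyphenSplitSketch` is a hypothetical helper, and it assumes every hyphen is a split point, so tokens with leading or trailing hyphens may be handled differently by the real stream.

```java
import java.util.ArrayList;
import java.util.List;

// Illustrative sketch only; not the implementation inside ADNameSampleStream.
final class HyphenSplitSketch {

    // Splits a token at each hyphen and keeps the hyphen as a token of its own.
    // Assumption: every '-' is a split point.
    static List<String> split(String token) {
        List<String> parts = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (char c : token.toCharArray()) {
            if (c == '-') {
                if (current.length() > 0) {
                    parts.add(current.toString());
                    current.setLength(0);
                }
                parts.add("-");
            } else {
                current.append(c);
            }
        }
        if (current.length() > 0) {
            parts.add(current.toString());
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("carros-monstro")); // prints [carros, -, monstro]
    }
}
```

A token without a hyphen comes back unchanged as a single-element list, which corresponds to what one would expect when the flag has nothing to split.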
Copyright © 2018 The Apache Software Foundation. All rights reserved.