public abstract class AbstractBatchedObjectColumnProcessor<T extends Context> extends AbstractObjectProcessor<T> implements Processor<T>
Processor implementation for converting batches of rows extracted from any implementation of AbstractParser into columns of objects.
This uses the value conversions provided by Conversion instances.
For each row processed, a sequence of conversions will be executed to generate the appropriate object. Each resulting object will then be stored in a list that contains the values of the corresponding column.
During the execution of the process, the batchProcessed(int) method will be invoked after a given number of rows has been processed.
The user can access the lists with values parsed for all columns using the methods getColumnValuesAsList(), getColumnValuesAsMapOfIndexes() and getColumnValuesAsMapOfNames().
After batchProcessed(int) is invoked, all values will be discarded and the next batch of column values will be accumulated.
This process will repeat until there are no more rows in the input.
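The accumulate, callback, discard cycle described above can be sketched with plain Java collections. This is a minimal illustration, not the library's implementation: BatchingSketch and BatchingSketchDemo are hypothetical stand-ins, rows are assumed to all have the same number of values, and reporting a trailing partial batch when the input ends is an assumption.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in, NOT the library class: it only mimics the batching cycle.
abstract class BatchingSketch {
    private final int rowsPerBatch;
    private final List<List<Object>> columns = new ArrayList<>();
    private int rowsInBatch = 0;
    private int batchesProcessed = 0;

    BatchingSketch(int rowsPerBatch) {
        this.rowsPerBatch = rowsPerBatch;
    }

    // Callback to the user, mirroring the role of batchProcessed(int).
    abstract void batchProcessed(int rowsInThisBatch);

    // Stores each value of an already converted row in the list of its column.
    void rowProcessed(Object[] row) {
        for (int i = 0; i < row.length; i++) {
            while (columns.size() <= i) {
                columns.add(new ArrayList<>());
            }
            columns.get(i).add(row[i]);
        }
        rowsInBatch++;
        if (rowsInBatch == rowsPerBatch) {
            flush();
        }
    }

    // Assumption: a trailing partial batch is also reported when the input ends.
    void processEnded() {
        if (rowsInBatch > 0) {
            flush();
        }
    }

    List<List<Object>> getColumnValuesAsList() {
        return columns;
    }

    int getBatchesProcessed() {
        return batchesProcessed;
    }

    // Invokes the callback, then discards the accumulated values.
    private void flush() {
        batchProcessed(rowsInBatch);
        batchesProcessed++;
        for (List<Object> column : columns) {
            column.clear();
        }
        rowsInBatch = 0;
    }
}

class BatchingSketchDemo {
    public static void main(String[] args) {
        BatchingSketch sketch = new BatchingSketch(2) {
            @Override
            void batchProcessed(int rowsInThisBatch) {
                // The column lists are only valid inside the callback; copy them if needed later.
                System.out.println(rowsInThisBatch + " rows: " + getColumnValuesAsList());
            }
        };
        sketch.rowProcessed(new Object[]{1, "a"});
        sketch.rowProcessed(new Object[]{2, "b"});
        sketch.rowProcessed(new Object[]{3, "c"});
        sketch.processEnded();
    }
}
```

Because the values are discarded after each callback, anything that must outlive a batch should be copied out of the column lists inside the callback.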
See Also:
AbstractParser, Processor, BatchedColumnReader, Conversion, AbstractObjectProcessor
Constructor and Description |
---|
AbstractBatchedObjectColumnProcessor(int rowsPerBatch)
Constructs an abstract batched column processor configured to invoke the
batchProcessed(int) method after a given number of rows has been processed. |
Modifier and Type | Method and Description |
---|---|
abstract void |
batchProcessed(int rowsInThisBatch)
Callback to the user, where the lists with values parsed for all columns can be accessed using the methods getColumnValuesAsList(), getColumnValuesAsMapOfIndexes() and getColumnValuesAsMapOfNames(). |
int |
getBatchesProcessed()
Returns the number of batches already processed
|
List<Object> |
getColumn(int columnIndex)
Returns the values of a given column.
|
<V> List<V> |
getColumn(int columnIndex,
Class<V> columnType)
Returns the values of a given column.
|
List<Object> |
getColumn(String columnName)
Returns the values of a given column.
|
<V> List<V> |
getColumn(String columnName,
Class<V> columnType)
Returns the values of a given column.
|
List<List<Object>> |
getColumnValuesAsList()
Returns the values processed for each column
|
Map<Integer,List<Object>> |
getColumnValuesAsMapOfIndexes()
Returns a map of column indexes and their respective list of values parsed from the input.
|
Map<String,List<Object>> |
getColumnValuesAsMapOfNames()
Returns a map of column names and their respective list of values parsed from the input.
|
String[] |
getHeaders()
Returns the column headers.
|
int |
getRowsPerBatch()
Returns the number of rows processed in each batch
|
void |
processEnded(T context)
This method will be invoked by the parser once, after the parsing process stopped and all resources were closed.
|
void |
processStarted(T context)
This method will be invoked by the parser once, when it is ready to start processing the input.
|
void |
putColumnValuesInMapOfIndexes(Map<Integer,List<Object>> map)
Fills a given map associating each column index to its list of values
|
void |
putColumnValuesInMapOfNames(Map<String,List<Object>> map)
Fills a given map associating each column name to its list of values
|
void |
rowProcessed(Object[] row,
T context)
Invoked by the processor after all values of a valid record have been processed and converted into an Object array.
|
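The different views over the accumulated columns listed above (by index, by name, and typed) can be pictured with a small sketch. ColumnViewsSketch and its methods are hypothetical helpers written for illustration, not part of the library. In particular, failing with a ClassCastException on a type mismatch is an assumption about how a typed view might behave, not documented behavior of getColumn(int, Class).

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of the library: shows three views over column lists.
final class ColumnViewsSketch {
    private ColumnViewsSketch() {
    }

    // Associates each 0-based column index with its list of values.
    static Map<Integer, List<Object>> byIndex(List<List<Object>> columns) {
        Map<Integer, List<Object>> map = new LinkedHashMap<>();
        for (int i = 0; i < columns.size(); i++) {
            map.put(i, columns.get(i));
        }
        return map;
    }

    // Associates each header name with the list of values at the same position.
    static Map<String, List<Object>> byName(String[] headers, List<List<Object>> columns) {
        Map<String, List<Object>> map = new LinkedHashMap<>();
        for (int i = 0; i < columns.size() && i < headers.length; i++) {
            map.put(headers[i], columns.get(i));
        }
        return map;
    }

    // Copies a column into a typed list; Class.cast throws ClassCastException on a
    // mismatch and passes null values through unchanged.
    static <V> List<V> typedColumn(List<Object> column, Class<V> columnType) {
        List<V> typed = new ArrayList<>(column.size());
        for (Object value : column) {
            typed.add(columnType.cast(value));
        }
        return typed;
    }
}
```

A typed view only makes sense when the conversions applied to that column produce a single type, for example a column converted entirely to Integer.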
rowProcessed
applyConversions, convertAll, convertFields, convertIndexes, convertType, handleConversionError, initializeConversions, reverseConversions, toDataProcessingException
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait
rowProcessed
public AbstractBatchedObjectColumnProcessor(int rowsPerBatch)
Constructs an abstract batched column processor configured to invoke the batchProcessed(int) method after a given number of rows has been processed.
Parameters:
rowsPerBatch - the number of rows to process in each batch.
public void processStarted(T context)
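Description copied from interface: Processor
This method will be invoked by the parser once, when it is ready to start processing the input.
Specified by:
processStarted in interface Processor<T extends Context>
Overrides:
processStarted in class AbstractObjectProcessor<T extends Context>
Parameters:
context - A contextual object with information and controls over the current state of the parsing process
public void rowProcessed(Object[] row, T context)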
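Description copied from class: AbstractObjectProcessor
Invoked by the processor after all values of a valid record have been processed and converted into an Object array.
Specified by:
rowProcessed in class AbstractObjectProcessor<T extends Context>
Parameters:
row - object array created with the information extracted by the parser and then converted.
context - A contextual object with information and controls over the current state of the parsing process
public void processEnded(T context)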
Description copied from interface: Processor
This method will be invoked by the parser once, after the parsing process stopped and all resources were closed.
It will always be called by the parser: in case of errors, if the end of the input is reached, or if the user stopped the process manually using Context.stop().
Specified by:
processEnded in interface Processor<T extends Context>
Overrides:
processEnded in class AbstractObjectProcessor<T extends Context>
Parameters:
context - A contextual object with information and controls over the state of the parsing process
public final String[] getHeaders()
Returns the column headers. These are either the headers defined in CommonSettings.getHeaders() or the headers parsed in the input when CommonSettings.getHeaders() equals to true
public final List<List<Object>> getColumnValuesAsList()
public final void putColumnValuesInMapOfNames(Map<String,List<Object>> map)
Parameters:
map - the map to hold the values of each column
public final void putColumnValuesInMapOfIndexes(Map<Integer,List<Object>> map)
Parameters:
map - the map to hold the values of each column
public final Map<String,List<Object>> getColumnValuesAsMapOfNames()
public final Map<Integer,List<Object>> getColumnValuesAsMapOfIndexes()
public List<Object> getColumn(String columnName)
Parameters:
columnName - the name of the column in the input.
public List<Object> getColumn(int columnIndex)
Parameters:
columnIndex - the position of the column in the input (0-based).
public <V> List<V> getColumn(String columnName, Class<V> columnType)
Type Parameters:
V - the type of data in that column
Parameters:
columnName - the name of the column in the input.
columnType - the type of data in that column
public <V> List<V> getColumn(int columnIndex, Class<V> columnType)
Type Parameters:
V - the type of data in that column
Parameters:
columnIndex - the position of the column in the input (0-based).
columnType - the type of data in that column
public int getRowsPerBatch()
public int getBatchesProcessed()
public abstract void batchProcessed(int rowsInThisBatch)
Callback to the user, where the lists with values parsed for all columns can be accessed using the methods getColumnValuesAsList(), getColumnValuesAsMapOfIndexes() and getColumnValuesAsMapOfNames().
Parameters:
rowsInThisBatch - the number of rows processed in the current batch. This corresponds to the number of elements of each list of each column.
Copyright © 2019 uniVocity Software Pty Ltd. All rights reserved.