Installation
Install the Excel Extractor plugin using npm:Configuration & Parameters
The Excel Extractor accepts the following configuration options:Basic Options
Parameter | Type | Default | Description |
---|---|---|---|
raw | boolean | false | Extract raw, underlying cell values instead of formatted text |
rawNumbers | boolean | false | Extract raw numeric values instead of formatted numbers |
dateNF | string | undefined | Specific date format string for interpreting dates |
chunkSize | number | 10000 | Number of records to process in each batch |
parallel | number | 1 | Number of chunks to process concurrently |
skipEmptyLines | boolean | false | Skip rows that are entirely empty |
debug | boolean | false | Enable verbose logging for troubleshooting |
Advanced Options
Parameter | Type | Default | Description |
---|---|---|---|
cascadeRowValues | boolean | false | Fill empty cells with values from the cell above |
cascadeHeaderValues | boolean | false | Fill empty header cells with values from the cell to the left |
headerDetectionOptions | object | See below | Configure header row detection |
mergedCellOptions | object | undefined | Define how to handle merged cells |
Header Detection Options
Default configuration:'default'
- Scans first 10 rows and selects the one with most non-empty cells'explicitHeaders'
- Use when headers are explicitly defined'specificRows'
- Define specific row numbers as headers'dataRowAndSubHeaderDetection'
- Advanced detection for complex header structures
Merged Cell Options
Configure treatment of merged cells with these options:Treatment | Description |
---|---|
applyToAll | Copy merged cell value to all cells in the range |
applyToTopLeft | Keep value only in top-left cell |
coalesce | Keep first row/column and remove others |
concatenate | Combine values with a separator |
Usage Examples
Configuration Example
Advanced Header Detection
Merged Cell Handling
Direct Parser Usage
Troubleshooting
Large File Handling
Debug Mode
Enable debug mode for detailed logging:Notes
Default Behavior
- Header Detection: By default, the plugin scans the first 10 rows and selects the one with the most non-empty cells as the header row
- Empty Rows: Empty rows are included as empty records unless
skipEmptyLines
is set totrue
- Merged Cells: Handled by the underlying library’s default behavior unless custom options are provided
- Chunk Processing: Data is processed in batches of 10,000 records by default
Important Considerations
- Server-Side Only: This plugin is designed to run in a server-side environment and should be used within a Flatfile listener
- Duplicate Headers: If a sheet contains duplicate column headers, the plugin automatically makes them unique by appending a suffix (e.g., ‘Name’, ‘Name_1’, ‘Name_2’)
- Empty Headers: Empty header cells are renamed to ‘empty’ (e.g., ‘empty’, ‘empty_1’)
- Trailing Empty Rows: The parser automatically trims any fully empty rows from the end of a sheet before processing
- Memory Limitations: The plugin has built-in handling for extremely large files that can cause memory issues, throwing a user-friendly error when files are too large
Cascading Behavior
- Row Cascading: When
cascadeRowValues
is enabled, empty cells are filled with values from the cell above. The cascade resets on a completely blank row or a new value in the column - Header Cascading: When
cascadeHeaderValues
is enabled, empty header cells are filled with values from the cell to the left. The cascade resets on a blank column or a new value