[AURON #1850] Introduce Flink RowData to Arrow conversion#1930
[AURON #1850] Introduce Flink RowData to Arrow conversion#1930x-tong wants to merge 2 commits intoapache:masterfrom
Conversation
Implement Flink RowData to Arrow format conversion for Auron-Flink integration. Key components: - FlinkArrowUtils: Type conversion between Flink LogicalType and Arrow types - FlinkArrowWriter: Converts Flink RowData to Arrow VectorSchemaRoot - FlinkArrowFieldWriter: Field-level writers for all supported types - FlinkArrowFFIExporter: Exports Arrow data via FFI for native consumption Supported types: - Primitive: Boolean, TinyInt, SmallInt, Int, BigInt, Float, Double - String/Binary: VarChar, Char, VarBinary, Binary - Temporal: Date, Time, Timestamp, LocalZonedTimestamp - Complex: Array, Map, Row/Struct - Decimal (128-bit)
There was a problem hiding this comment.
Pull request overview
This PR implements Flink RowData to Arrow format conversion for the Auron-Flink integration. It introduces new utilities and writers to convert Flink's table data structures to Apache Arrow format, enabling efficient data exchange between Flink and native code via the Arrow C Data Interface.
Changes:
- Added FlinkArrowUtils for type conversion between Flink LogicalType and Arrow types
- Implemented FlinkArrowWriter and FlinkArrowFieldWriter for converting RowData to Arrow vectors
- Added FlinkArrowFFIExporter for asynchronous FFI-based data export with producer-consumer pattern
- Included comprehensive unit tests for all conversion and export functionality
Reviewed changes
Copilot reviewed 8 out of 9 changed files in this pull request and generated 11 comments.
Show a summary per file
| File | Description |
|---|---|
| FlinkArrowUtils.java | Provides utilities for converting Flink types to Arrow types and creating Arrow schemas |
| FlinkArrowFieldWriter.java | Implements field writers for all supported Flink types with recursive handling for complex types |
| FlinkArrowWriter.java | Main writer class that orchestrates conversion of RowData to VectorSchemaRoot |
| FlinkArrowFFIExporter.java | Asynchronous exporter using double-queue pattern for safe FFI data export |
| FlinkArrowUtilsTest.java | Tests type conversion logic for all supported types |
| FlinkArrowWriterTest.java | Tests data writing for basic, complex, and edge cases |
| FlinkArrowFFIExporterTest.java | Tests FFI export functionality with native library availability checks |
| pom.xml | Adds required Arrow and Flink dependencies |
| .gitignore | Adds IDE/LSP configuration patterns |
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
...xtension/auron-flink-runtime/src/main/java/org/apache/auron/flink/arrow/FlinkArrowUtils.java
Show resolved
Hide resolved
...ion/auron-flink-runtime/src/test/java/org/apache/auron/flink/arrow/FlinkArrowWriterTest.java
Outdated
Show resolved
Hide resolved
...uron-flink-runtime/src/test/java/org/apache/auron/flink/arrow/FlinkArrowFFIExporterTest.java
Outdated
Show resolved
Hide resolved
...on/auron-flink-runtime/src/main/java/org/apache/auron/flink/arrow/FlinkArrowFFIExporter.java
Show resolved
Hide resolved
...ion/auron-flink-runtime/src/test/java/org/apache/auron/flink/arrow/FlinkArrowWriterTest.java
Outdated
Show resolved
Hide resolved
...on/auron-flink-runtime/src/main/java/org/apache/auron/flink/arrow/FlinkArrowFFIExporter.java
Show resolved
Hide resolved
...on/auron-flink-runtime/src/main/java/org/apache/auron/flink/arrow/FlinkArrowFFIExporter.java
Outdated
Show resolved
Hide resolved
...tension/auron-flink-runtime/src/main/java/org/apache/auron/flink/arrow/FlinkArrowWriter.java
Show resolved
Hide resolved
...on/auron-flink-runtime/src/main/java/org/apache/auron/flink/arrow/FlinkArrowFFIExporter.java
Show resolved
Hide resolved
...on/auron-flink-runtime/src/main/java/org/apache/auron/flink/arrow/FlinkArrowFFIExporter.java
Show resolved
Hide resolved
- Add try-finally resource protection in producer thread - Return false on InterruptedException to avoid deadlock - Fix comments: clarify nanoseconds vs microseconds - Add Javadoc for resource management
|
@x-tong is it possible to split this up into separate PRs to make it easier to review? |
I will do this this week. |
|
@ShreyeshArangath I've split this PR into 3 smaller PRs for easier review:
I'll keep this PR open for reference until all parts are merged. |
) ## Summary Part 1/3 of Flink RowData to Arrow conversion implementation (split from #1930). This PR adds the foundational type conversion utilities: - `FlinkArrowUtils`: Conversion form Flink RowData to Arrow types - Support for all common Flink types including primitives, temporal, and complex types - Comprehensive unit tests for type conversion ## Follow-up PRs - Part 2: FlinkArrowFieldWriter + FlinkArrowWriter - Part 3: FlinkArrowFFIExporter ## Test plan - [x] Unit tests for FlinkArrowUtils type conversion - [x] Build passes with `./auron-build.sh --pre --sparkver 3.5 --scalaver 2.12 -Pflink-1.18 -DskipBuildNative` - [x] Code formatted with `./dev/reformat` Related: #1850
Summary
Test plan
./auron-build.sh --pre --sparkver 3.5 --scalaver 2.12 -DskipBuildNative./dev/reformatCloses #1850