How to iterate over a record batch? #468

@tisonkun

Description

Describe the usage question you have. Please include as many useful details as possible.

I have a VectorSchemaRoot. I can see how to iterate over the batch: call getVector for each column, then getObject for each row.

But the return value is of type Object, and I wonder how to downcast it to a useful class from which I can retrieve the real value (string, int, float, etc.).

I know each vector carries its field info, but I don't know the mapping from field type to the actual Java class. Memorizing the whole mapping by reverse engineering the code seems too challenging, and the mapping may change as versions evolve.

I checked https://arrow.apache.org/docs/java/index.html but all the pages describe constructing a batch and moving it from one place to another, rather than how to read a batch and dump it into a typed two-dimensional matrix.

The most trivial usage, contentToTSVString, calls Object::toString on each cell. But I don't think we should convert every value to a String and then reparse it into its concrete type.
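To make the question concrete, here is a rough sketch of the kind of typed read I have in mind. It is not an official recipe. It assumes arrow-vector plus one memory implementation (for example arrow-memory-netty or arrow-memory-unsafe) on the classpath, and recent JDKs may also need `--add-opens=java.base/java.nio=ALL-UNNAMED`. The class name `TypedBatchReader` and the helpers `readCell`, `toMatrix` and `sampleRoot` are hypothetical names for illustration. Only four types are handled explicitly; everything else falls back to getObject.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;
import java.util.List;

import org.apache.arrow.memory.BufferAllocator;
import org.apache.arrow.memory.RootAllocator;
import org.apache.arrow.vector.BigIntVector;
import org.apache.arrow.vector.FieldVector;
import org.apache.arrow.vector.Float8Vector;
import org.apache.arrow.vector.IntVector;
import org.apache.arrow.vector.VarCharVector;
import org.apache.arrow.vector.VectorSchemaRoot;

// Hypothetical helper class, not part of Arrow.
class TypedBatchReader {

    // Reads one cell as a plain Java value by switching on the vector's minor type.
    static Object readCell(FieldVector vector, int row) {
        if (vector.isNull(row)) {
            return null; // check first: the primitive getters below reject null slots
        }
        switch (vector.getMinorType()) {
            case INT:
                return ((IntVector) vector).get(row);     // int, boxed to Integer
            case BIGINT:
                return ((BigIntVector) vector).get(row);  // long, boxed to Long
            case FLOAT8:
                return ((Float8Vector) vector).get(row);  // double, boxed to Double
            case VARCHAR:
                // get() returns the raw UTF-8 bytes of the value
                return new String(((VarCharVector) vector).get(row), StandardCharsets.UTF_8);
            default:
                // Fallback for types this sketch does not handle explicitly
                return vector.getObject(row);
        }
    }

    // Dumps the whole batch into a row-major matrix of Java values.
    static Object[][] toMatrix(VectorSchemaRoot root) {
        List<FieldVector> columns = root.getFieldVectors();
        Object[][] rows = new Object[root.getRowCount()][columns.size()];
        for (int c = 0; c < columns.size(); c++) {
            FieldVector column = columns.get(c);
            for (int r = 0; r < rows.length; r++) {
                rows[r][c] = readCell(column, r);
            }
        }
        return rows;
    }

    // Builds a tiny two-column batch for demonstration; the caller closes the root.
    static VectorSchemaRoot sampleRoot(BufferAllocator allocator) {
        IntVector ids = new IntVector("id", allocator);
        ids.allocateNew(2);
        ids.set(0, 1);
        ids.setNull(1);
        ids.setValueCount(2);

        VarCharVector names = new VarCharVector("name", allocator);
        names.allocateNew(2);
        names.setSafe(0, "a".getBytes(StandardCharsets.UTF_8));
        names.setSafe(1, "b".getBytes(StandardCharsets.UTF_8));
        names.setValueCount(2);

        return VectorSchemaRoot.of(ids, names);
    }

    public static void main(String[] args) {
        try (BufferAllocator allocator = new RootAllocator();
             VectorSchemaRoot root = sampleRoot(allocator)) {
            System.out.println(Arrays.deepToString(toMatrix(root)));
        }
    }
}
```

This works for the handful of types listed, but it still requires knowing which vector class goes with which minor type, which is exactly the mapping I would like to see documented.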
