-
Notifications
You must be signed in to change notification settings - Fork 107
Description
Describe the usage question you have. Please include as many useful details as possible.
Platform: Ubuntu 20.04.6 LTS
Arrow version: 15.
Client: C++
Server: Java
I have a Java arrow flight server that uses doExchange to read the context, my server code is as follows, I have identified in multiple thread environment, in the tests many calls would stuck at the line FlightDescriptor descriptor = reader.getDescriptor(); in the second time(Which means the first call in the same thread usually is not blocked), and which makes the following requests also stucks.
log.info("Start to call doExchange.........");
// Trying to
try (BufferAllocator allocator = allocatorPool.submit(
() -> this.allocator.newChildAllocator("exchange", 0, Long.MAX_VALUE)).get()) {
FlightDescriptor descriptor = reader.getDescriptor();
List<String> path = descriptor.getPath();
String type = path.get(0);
String funcSignature = path.get(1);
...
The client code is as follows
std::vector<std::shared_ptr<RecordBatch>> UDFClient::Call(
const std::vector<std::string>& paths,
std::shared_ptr<RecordBatch>& batch) const {
// Create a FlightDescriptor using a path
FlightDescriptor descriptor = FlightDescriptor::Path(paths);
auto exchange_result = Client_->DoExchange(descriptor);
if (!exchange_result.ok()) {
throw std::runtime_error("Do exchange descriptor: "
+ exchange_result.status().ToString());
}
auto exchange = std::move(exchange_result.ValueUnsafe());
...
I understand reader.getDescriptor(); is a blocking call that will use future to wait the descriptor sent. Since doExchange in the server has been called, I assume the client auto exchange_result = Client_->DoExchange(descriptor); has been received. I can't figure what could make reader.getDescriptor(); stuck, I am reporting this to the community that want to know is this a bug or there is something I did wrong. Thank you
This happened in the multiple threads environment. We also check the lock with jstack that shows SettableFuture<FlightDescriptor> descriptor is the one not set that caused the lock.
"pool-1-thread-16" #31 prio=5 os_prio=0 tid=0x00007f6d34009000 nid=0x3a19 waiting on condition [0x00007f6d253ea000]
java.lang.Thread.State: WAITING (parking)
at sun.misc.Unsafe.park(Native Method)
- parking to wait for <0x000000076d596180> (a com.google.common.util.concurrent.SettableFuture)
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:175)
at com.google.common.util.concurrent.AbstractFuture.get(AbstractFuture.java:563)
at com.google.common.util.concurrent.AbstractFuture$TrustedFuture.get(AbstractFuture.java:110)
at org.apache.arrow.flight.FlightStream.getDescriptor(FlightStream.java:158)
at com.bytedance.dp.udf.UDFProducer.doExchange(UDFProducer.java:266)
at org.apache.arrow.flight.FlightService.lambda$doExchangeCustom$2(FlightService.java:382)
at org.apache.arrow.flight.FlightService$$Lambda$65/298187580.run(Unknown Source)
at io.grpc.Context$1.run(Context.java:566)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)
Component(s)
C++, FlightRPC, Java