@pgoron pgoron commented Jul 28, 2021

When importing a large table, md files can be removed before they are processed, resulting in a NoSuchFileException. BulkLoader.openFile() is supposed to return null when it encounters an IOException, so that the file is skipped instead of an unhandled exception being thrown.

    public static SSTableReader openFile(Pair<Descriptor, Set<Component>> p, CFMetaData cfm) {
        try {
            // To conserve memory, open SSTableReaders without bloom
            // filters and discard
            // the index summary after calculating the file sections to
            // stream and the estimated
            // number of keys for each endpoint. See CASSANDRA-5555 for
            // details.
            return openForBatch(p.left, p.right, cfm);
        } catch (Exception e) {
            logger.warn("Skipping file {}, error opening it: {}", p.left.baseFilename(), e.getMessage());
        }
        return null;
    }

Unfortunately, org.apache.cassandra.io.util.ChannelProxy.openChannel() wraps the IOException in a RuntimeException, which makes the exception handler in BulkLoader.openFile() useless.

    public static FileChannel openChannel(File file)
    {
        try
        {
            return FileChannel.open(file.toPath(), StandardOpenOption.READ);
        }
        catch (IOException e)
        {
            throw new RuntimeException(e);
        }
    }
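To illustrate why the wrapping matters, here is a minimal standalone sketch, not the actual BulkLoader or Cassandra code. It assumes a handler that only anticipates IOException. The class WrappedIoExceptionDemo and the methods openReader and openCatchingIoOnly are hypothetical names made up for the example. Only openChannel mirrors the snippet quoted above.

```java
import java.io.File;
import java.io.IOException;
import java.nio.channels.FileChannel;
import java.nio.file.StandardOpenOption;

public class WrappedIoExceptionDemo {

    // Mirrors the wrapping behaviour quoted above: the checked IOException
    // leaves this method as an unchecked RuntimeException.
    static FileChannel openChannel(File file) {
        try {
            return FileChannel.open(file.toPath(), StandardOpenOption.READ);
        } catch (IOException e) {
            throw new RuntimeException(e);
        }
    }

    // Hypothetical stand-in for a reader factory that declares IOException
    // but delegates to the wrapping helper.
    static FileChannel openReader(File file) throws IOException {
        return openChannel(file);
    }

    // Hypothetical handler that only anticipates IOException. The wrapped
    // exception is not an IOException, so the catch clause never matches
    // and the RuntimeException propagates to the caller.
    static FileChannel openCatchingIoOnly(File file) {
        try {
            return openReader(file);
        } catch (IOException e) {
            return null; // intended "skip this file" path, never reached here
        }
    }

    public static void main(String[] args) {
        // A path that is assumed not to exist on the machine running the demo.
        File missing = new File("no-such-dir-for-demo", "md-0-big-Index.db");
        try {
            openCatchingIoOnly(missing);
            System.out.println("file was skipped");
        } catch (RuntimeException e) {
            System.out.println("escaped the handler, cause: " + e.getCause());
        }
    }
}
```

In this sketch the missing file is reported through the RuntimeException branch in main, not through the null return that was meant to skip it.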

Below is an example of the uncaught exception interrupting the import:

java.lang.RuntimeException: java.nio.file.NoSuchFileException: /var/opt/cassandra/data/biggraphite/datapoints_360p_3600s_aggr-6c402f30e40311e7b12e356deff79235/md-8887-big-Index.db
        at org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:55)
        at org.apache.cassandra.io.util.ChannelProxy.<init>(ChannelProxy.java:66)
        at org.apache.cassandra.io.util.RandomAccessReader.open(RandomAccessReader.java:315)
        at org.apache.cassandra.io.sstable.format.SSTableReader.buildSummary(SSTableReader.java:840)
        at org.apache.cassandra.io.sstable.format.SSTableReader.openForBatch(SSTableReader.java:451)
        at com.scylladb.tools.BulkLoader.openFile(BulkLoader.java:1520)
        at com.scylladb.tools.BulkLoader.process(BulkLoader.java:1565)
        at com.scylladb.tools.BulkLoader.lambda$main$1(BulkLoader.java:1367)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
        at java.lang.Thread.run(Thread.java:748)
Caused by: java.nio.file.NoSuchFileException: /var/opt/cassandra/data/biggraphite/datapoints_360p_3600s_aggr-6c402f30e40311e7b12e356deff79235/md-8887-big-Index.db
        at sun.nio.fs.UnixException.translateToIOException(UnixException.java:86)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:102)
        at sun.nio.fs.UnixException.rethrowAsIOException(UnixException.java:107)
        at sun.nio.fs.UnixFileSystemProvider.newFileChannel(UnixFileSystemProvider.java:177)
        at java.nio.channels.FileChannel.open(FileChannel.java:287)
        at java.nio.channels.FileChannel.open(FileChannel.java:335)
        at org.apache.cassandra.io.util.ChannelProxy.openChannel(ChannelProxy.java:51)
        ... 12 more
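
In the trace above, the NoSuchFileException is only visible as the cause of the RuntimeException. One possible way for a caller to recognise this situation is to walk the cause chain and look for an IOException. The sketch below is an illustration under that assumption, not the change made in this pull request. CauseChainDemo and findCause are hypothetical names, and the file name is borrowed from the trace for readability.

```java
import java.io.IOException;
import java.nio.file.NoSuchFileException;

public class CauseChainDemo {

    // Hypothetical helper: returns the first throwable in the cause chain
    // that is an instance of the given type, or null if there is none.
    static <T extends Throwable> T findCause(Throwable t, Class<T> type) {
        // Bound the walk so a cyclic cause chain cannot loop forever.
        for (int depth = 0; t != null && depth < 32; depth++, t = t.getCause()) {
            if (type.isInstance(t)) {
                return type.cast(t);
            }
        }
        return null;
    }

    public static void main(String[] args) {
        // Same shape as the trace: a RuntimeException whose cause is a
        // NoSuchFileException (a subclass of IOException).
        Throwable wrapped =
                new RuntimeException(new NoSuchFileException("md-8887-big-Index.db"));

        IOException io = findCause(wrapped, IOException.class);
        if (io != null) {
            System.out.println("would skip file, I/O cause: " + io);
        } else {
            System.out.println("would rethrow, no I/O cause found");
        }
    }
}
```

A handler built this way can still skip files that disappeared while rethrowing runtime failures that have nothing to do with I/O.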
