Add pg_clickhouse #729
Conversation
Force-pushed from 42bc98b to ea70055.
It duplicates the ClickHouse benchmark config to load the data and the Postgres config to run the queries. All queries push down to ClickHouse without error.
Force-pushed from ea70055 to 454e65e.
```sh
then
    cd /tmp || exit
    curl https://clickhouse.com/ | sh
    sudo ./clickhouse install --noninteractive
```
I think a ClickHouse installation isn't necessary (in fact, all the `sudo` calls mess with the system). The first run of `./clickhouse` will unpack the binary into the local directory. We

- can remove l. 9,
- create `compression.yaml` in subfolder `config.d/` of the local working directory (l. 20),
- remove `sudo` from l. 23 and run the database in the background (append `&`),
- move the files into a subfolder `user_files` of the local working directory without `sudo` (l. 45). I think `chown` (l. 46) isn't needed as well.
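A minimal sketch of the sudo-free layout suggested above, with everything under the current working directory. The directory names follow the comment; the contents of `compression.yaml` are a placeholder, and the network-dependent steps are left commented out:

```shell
#!/bin/sh
# Sketch of the suggested sudo-free setup (assumptions: the clickhouse
# binary is fetched into the working directory and unpacks itself on
# first run, so no `install` step and no system paths are needed).

# curl https://clickhouse.com/ | sh   # drops ./clickhouse here (needs network)

mkdir -p config.d user_files          # local stand-ins for /etc/... and /var/lib/...

# Server overrides go into ./config.d instead of /etc/clickhouse-server/config.d.
# The body below is illustrative; the real settings come from the script.
cat > config.d/compression.yaml <<'EOF'
compression:
EOF

# ./clickhouse server &               # background server, no sudo, no `install`
# mv hits_*.parquet user_files/       # same user owns the files, so no chown
```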
Do you want to make the same changes to `clickhouse/benchmark.sh`? I just copied that code.
Hmm, okay, then such changes would have too large a blast radius.
I'll run this script tomorrow or on Friday in our automation. This will generate the results file that can then be uploaded to https://benchmark.clickhouse.com/
So, this?
```diff
--- a/pg_clickhouse/clickhouse.sh
+++ b/pg_clickhouse/clickhouse.sh
@@ -6,7 +6,6 @@ if [ ! -x /usr/bin/clickhouse ]
 then
     cd /tmp || exit
     curl https://clickhouse.com/ | sh
-    sudo ./clickhouse install --noninteractive
     rm clickhouse
     cd - || exit
 fi
@@ -20,7 +19,7 @@ compression:
 " | sudo tee /etc/clickhouse-server/config.d/compression.yaml
 fi;
 
-sudo clickhouse start
+clickhouse start &
 
 for _ in {1..300}
 do
@@ -42,8 +41,7 @@ clickhouse-client < create"$SUFFIX".sql
 # seq 1 | xargs -P100 -I{} bash -c 'wget --continue --progress=dot:giga https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_{}.parquet'
 seq 0 99 | xargs -P100 -I{} bash -c 'wget --continue --progress=dot:giga https://datasets.clickhouse.com/hits_compatible/athena_partitioned/hits_{}.parquet'
 
-sudo mv hits_*.parquet /var/lib/clickhouse/user_files/
-sudo chown clickhouse:clickhouse /var/lib/clickhouse/user_files/hits_*.parquet
+mv hits_*.parquet /var/lib/clickhouse/user_files/
 
 echo -n "Load time: "
 clickhouse-client --time --query "INSERT INTO hits SELECT * FROM file('hits_*.parquet')" --max-insert-threads $(( $(nproc) / 4 ))
```
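For context, the `for _ in {1..300}` loop the hunks skip over is a readiness poll: try the server once per second until it answers. Its generic shape, with the `clickhouse-client` probe abstracted into a parameter (a sketch, not the script's exact code):

```shell
#!/bin/bash
# Poll a probe command once per second until it succeeds, for up to
# `max` attempts (default 300); returns 0 on success, 1 on timeout.
wait_ready() {
    local probe=$1 max=${2:-300}
    for _ in $(seq 1 "$max"); do
        if $probe >/dev/null 2>&1; then
            return 0
        fi
        sleep 1
    done
    return 1
}

# In the benchmark script the probe would be something like:
#   wait_ready "clickhouse-client --version"
```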
Resolves ClickHouse/pg_clickhouse#82.