diff --git a/website/docs/engine-flink/ddl.md b/website/docs/engine-flink/ddl.md
index 324c753050..b333a14001 100644
--- a/website/docs/engine-flink/ddl.md
+++ b/website/docs/engine-flink/ddl.md
@@ -27,7 +27,7 @@ The following properties can be set if using the Fluss catalog:
 | bootstrap.servers | required | (none) | Comma separated list of Fluss servers. |
 | default-database | optional | fluss | The default database to use when switching to this catalog. |
 | client.security.protocol | optional | PLAINTEXT | The security protocol used to communicate with brokers. Currently, only `PLAINTEXT` and `SASL` are supported, the configuration value is case insensitive. |
-| `client.security.{protocol}.*` | optional | (none) | Client-side configuration properties for a specific authentication protocol. E.g., client.security.sasl.jaas.config. More Details in [authentication](../security/authentication.md) |
+| `client.security.{protocol}.*` | optional | (none) | Client-side configuration properties for a specific authentication protocol. E.g., client.security.sasl.jaas.config. More Details in [authentication](security/authentication.md) |
 | `{lake-format}.*` | optional | (none) | Extra properties to be passed to the lake catalog. This is useful for configuring sensitive settings, such as the username and password required for lake catalog authentication. E.g., `paimon.jdbc.password = pass`. |

 The following statements assume that the current catalog has been switched to the Fluss catalog using the `USE CATALOG ` statement.
@@ -62,7 +62,7 @@ DROP DATABASE my_db;

 ### Primary Key Table

-The following SQL statement will create a [Primary Key Table](table-design/table-types/pk-table/index.md) with a primary key consisting of shop_id and user_id.
+The following SQL statement will create a [Primary Key Table](table-design/table-types/pk-table.md) with a primary key consisting of shop_id and user_id.
 ```sql title="Flink SQL"
 CREATE TABLE my_pk_table (
   shop_id BIGINT,
diff --git a/website/docs/engine-flink/delta-joins.md b/website/docs/engine-flink/delta-joins.md
index deeb9ed74f..96f0974963 100644
--- a/website/docs/engine-flink/delta-joins.md
+++ b/website/docs/engine-flink/delta-joins.md
@@ -130,7 +130,7 @@ For example:
 - Full primary key: `(city_id, order_id)`
 - Bucket key: `city_id`

-This yields an **index** on the prefix key `city_id`, so that you can perform [Prefix Key Lookup](/docs/engine-flink/lookups/#prefix-lookup) by the `city_id`.
+This yields an **index** on the prefix key `city_id`, so that you can perform [Prefix Key Lookup](/engine-flink/lookups.md#prefix-lookup) by the `city_id`.

 In this setup:
 * The delta join operator uses the prefix key (`city_id`) to retrieve only relevant right-side records matching each left-side event.
diff --git a/website/docs/engine-flink/options.md b/website/docs/engine-flink/options.md
index a7114af492..3c06c3ccbe 100644
--- a/website/docs/engine-flink/options.md
+++ b/website/docs/engine-flink/options.md
@@ -83,7 +83,7 @@ See more details about [ALTER TABLE ... SET](engine-flink/ddl.md#set-properties)
 | table.datalake.freshness | Duration | 3min | It defines the maximum amount of time that the datalake table's content should lag behind updates to the Fluss table. Based on this target freshness, the Fluss service automatically moves data from the Fluss table and updates to the datalake table, so that the data in the datalake table is kept up to date within this target. If the data does not need to be as fresh, you can specify a longer target freshness time to reduce costs. |
 | table.datalake.auto-compaction | Boolean | false | If true, compaction will be triggered automatically when tiering service writes to the datalake. It is disabled by default. |
 | table.datalake.auto-expire-snapshot | Boolean | false | If true, snapshot expiration will be triggered automatically when tiering service commits to the datalake.
It is disabled by default. |
-| table.merge-engine | Enum | (None) | Defines the merge engine for the primary key table. By default, primary key table uses the [default merge engine(last_row)](table-design/table-types/pk-table/merge-engines/default.md). It also supports two merge engines are `first_row`, `versioned` and `aggregation`. The [first_row merge engine](table-design/table-types/pk-table/merge-engines/first-row.md) will keep the first row of the same primary key. The [versioned merge engine](table-design/table-types/pk-table/merge-engines/versioned.md) will keep the row with the largest version of the same primary key. The `aggregation` merge engine will aggregate rows with the same primary key using field-level aggregate functions. |
+| table.merge-engine | Enum | (None) | Defines the merge engine for the primary key table. By default, primary key table uses the [default merge engine(last_row)](table-design/merge-engines/default.md). It also supports two merge engines are `first_row`, `versioned` and `aggregation`. The [first_row merge engine](table-design/merge-engines/first-row.md) will keep the first row of the same primary key. The [versioned merge engine](table-design/merge-engines/versioned.md) will keep the row with the largest version of the same primary key. The `aggregation` merge engine will aggregate rows with the same primary key using field-level aggregate functions. |
 | table.merge-engine.versioned.ver-column | String | (None) | The column name of the version column for the `versioned` merge engine. If the merge engine is set to `versioned`, the version column must be set. |
 | table.delete.behavior | Enum | ALLOW | Controls the behavior of delete operations on primary key tables. Three modes are supported: `ALLOW` (default for default merge engine) - allows normal delete operations; `IGNORE` - silently ignores delete requests without errors; `DISABLE` - rejects delete requests and throws explicit errors.
This configuration provides system-level guarantees for some downstream pipelines (e.g., Flink Delta Join) that must not receive any delete events in the changelog of the table. For tables with `first_row` or `versioned` or `aggregation` merge engines, this option is automatically set to `IGNORE` and cannot be overridden. Note: For `aggregation` merge engine, when set to `allow`, delete operations will remove the entire record. This configuration only applicable to primary key tables. |
 | table.changelog.image | Enum | FULL | Defines the changelog image mode for primary key tables. This configuration is inspired by similar settings in database systems like MySQL's `binlog_row_image` and PostgreSQL's `replica identity`. Two modes are supported: `FULL` (default) - produces both UPDATE_BEFORE and UPDATE_AFTER records for update operations, capturing complete information about updates and allowing tracking of previous values; `WAL` - does not produce UPDATE_BEFORE records. Only INSERT, UPDATE_AFTER (and DELETE if allowed) records are emitted. When WAL mode is enabled with default merge engine (no merge engine configured) and full row updates (not partial update), an optimization is applied to skip looking up old values, and in this case INSERT operations are converted to UPDATE_AFTER events. This mode reduces storage and transmission costs but loses the ability to track previous values. Only applicable to primary key tables. |
@@ -157,4 +157,4 @@ See more details about [ALTER TABLE ... SET](engine-flink/ddl.md#set-properties)
 | client.filesystem.security.token.renewal.time-ratio | Double | 0.75 | Ratio of the token's expiration time when new credentials for access filesystem should be re-obtained. |
 | client.metrics.enabled | Boolean | false | Enable metrics for client. When metrics is enabled, the client will collect metrics and report by the JMX metrics reporter.
|
 | client.security.protocol | String | PLAINTEXT | The security protocol used to communicate with brokers. Currently, only `PLAINTEXT` and `SASL` are supported, the configuration value is case insensitive. |
-| client.security.\{protocol\}.* | optional | (none) | Client-side configuration properties for a specific authentication protocol. E.g., client.security.sasl.jaas.config. More Details in [authentication](../security/authentication.md) |
+| client.security.\{protocol\}.* | optional | (none) | Client-side configuration properties for a specific authentication protocol. E.g., client.security.sasl.jaas.config. More Details in [authentication](security/authentication.md) |
diff --git a/website/docs/engine-flink/procedures.md b/website/docs/engine-flink/procedures.md
index ac63113346..95e8724eb0 100644
--- a/website/docs/engine-flink/procedures.md
+++ b/website/docs/engine-flink/procedures.md
@@ -18,7 +18,7 @@ SHOW PROCEDURES;

 ## Access Control Procedures

-Fluss provides procedures to manage Access Control Lists (ACLs) for security and authorization. See the [Security](../security/overview.md) documentation for more details.
+Fluss provides procedures to manage Access Control Lists (ACLs) for security and authorization. See the [Security](/security/overview.md) documentation for more details.
 ### add_acl

diff --git a/website/docs/maintenance/operations/graceful-shutdown.md b/website/docs/maintenance/operations/graceful-shutdown.md
index feb4120cb5..423409fc79 100644
--- a/website/docs/maintenance/operations/graceful-shutdown.md
+++ b/website/docs/maintenance/operations/graceful-shutdown.md
@@ -131,6 +131,6 @@ Monitor shutdown-related metrics:

 ## See Also

-- [Configuration](../configuration.md)
-- [Monitoring and Observability](../observability/monitor-metrics.md)
+- [Configuration](maintenance/configuration.md)
+- [Monitoring and Observability](maintenance/observability/monitor-metrics.md)
 - [Upgrading Fluss](upgrading.md)
\ No newline at end of file
diff --git a/website/docs/maintenance/operations/updating-configs.md b/website/docs/maintenance/operations/updating-configs.md
index 8b7a6a1e10..22cc77fa95 100644
--- a/website/docs/maintenance/operations/updating-configs.md
+++ b/website/docs/maintenance/operations/updating-configs.md
@@ -18,7 +18,7 @@ Currently, the supported dynamically updatable server configurations include:

 - `kv.rocksdb.shared-rate-limiter.bytes-per-sec`: Control RocksDB flush and compaction write rate shared across all RocksDB instances on the TabletServer. The rate limiter is always enabled. Set to a lower value (e.g., 100MB) to limit the rate, or a very high value to effectively disable rate limiting.

-You can update the configuration of a cluster with [Java client](#using-java-client) or [Flink Procedures](../../engine-flink/procedures.md#cluster-configuration-procedures).
+You can update the configuration of a cluster with [Java client](#using-java-client) or [Flink Procedures](engine-flink/procedures.md#cluster-configuration-procedures).
 ### Using Java Client

diff --git a/website/docs/streaming-lakehouse/overview.md b/website/docs/streaming-lakehouse/overview.md
index 1b6f9088f5..626c6ae0ef 100644
--- a/website/docs/streaming-lakehouse/overview.md
+++ b/website/docs/streaming-lakehouse/overview.md
@@ -43,4 +43,4 @@ Some powerful features it provides are:
 - **Analytical Streams**: The union reads help data streams to have the powerful analytics capabilities. This reduces complexity when developing streaming applications, simplifies debugging, and allows for immediate access to live data insights.
 - **Connect to Lakehouse Ecosystem**: Fluss keeps the table metadata in sync with data lake catalogs while compacting data into Lakehouse. As a result, external engines like Spark, StarRocks, Flink, and Trino can read the data directly. They simply connect to the data lake catalog.

-Currently, Fluss supports [Paimon](integrate-data-lakes/paimon.md), [Iceberg](integrate-data-lakes/iceberg.md), and [Lance](integrate-data-lakes/lance.md) as Lakehouse Storage, more kinds of data lake formats are on the roadmap.
+Currently, Fluss supports [Paimon](streaming-lakehouse/integrate-data-lakes/paimon.md), [Iceberg](streaming-lakehouse/integrate-data-lakes/iceberg.md), and [Lance](streaming-lakehouse/integrate-data-lakes/lance.md) as Lakehouse Storage, more kinds of data lake formats are on the roadmap.
diff --git a/website/docs/table-design/data-types.md b/website/docs/table-design/data-types.md
index 0cbf9b9e61..9a20550080 100644
--- a/website/docs/table-design/data-types.md
+++ b/website/docs/table-design/data-types.md
@@ -1,6 +1,6 @@
 ---
 title: "Data Types"
-sidebar_position: 10
+sidebar_position: 5
 ---

 # Data Types
diff --git a/website/docs/table-design/table-types/pk-table/merge-engines/_category_.json b/website/docs/table-design/merge-engines/_category_.json
similarity index 66%
rename from website/docs/table-design/table-types/pk-table/merge-engines/_category_.json
rename to website/docs/table-design/merge-engines/_category_.json
index 1fd102371d..12edf31d62 100644
--- a/website/docs/table-design/table-types/pk-table/merge-engines/_category_.json
+++ b/website/docs/table-design/merge-engines/_category_.json
@@ -1,4 +1,4 @@
 {
   "label": "Merge Engines",
-  "position": 2
+  "position": 3
 }
diff --git a/website/docs/table-design/table-types/pk-table/merge-engines/aggregation.md b/website/docs/table-design/merge-engines/aggregation.md
similarity index 98%
rename from website/docs/table-design/table-types/pk-table/merge-engines/aggregation.md
rename to website/docs/table-design/merge-engines/aggregation.md
index ed44e2e256..c416c8a399 100644
--- a/website/docs/table-design/table-types/pk-table/merge-engines/aggregation.md
+++ b/website/docs/table-design/merge-engines/aggregation.md
@@ -1012,8 +1012,8 @@ For detailed information about Exactly-Once implementation, please refer to: [FI

 ## See Also

-- [Default Merge Engine](./default.md)
-- [FirstRow Merge Engine](./first-row.md)
-- [Versioned Merge Engine](./versioned.md)
-- [Primary Key Tables](../index.md)
-- [Fluss Client API](../../../../apis/java-client.md)
+- [Default Merge Engine](table-design/merge-engines/default.md)
+- [FirstRow Merge Engine](table-design/merge-engines/first-row.md)
+- [Versioned Merge Engine](table-design/merge-engines/versioned.md)
+- [Primary Key
Tables](table-design/table-types/pk-table.md)
+- [Fluss Client API](apis/java-client.md)
diff --git a/website/docs/table-design/table-types/pk-table/merge-engines/default.md b/website/docs/table-design/merge-engines/default.md
similarity index 93%
rename from website/docs/table-design/table-types/pk-table/merge-engines/default.md
rename to website/docs/table-design/merge-engines/default.md
index 189582f9c2..d4bc4c8c65 100644
--- a/website/docs/table-design/table-types/pk-table/merge-engines/default.md
+++ b/website/docs/table-design/merge-engines/default.md
@@ -9,7 +9,7 @@ sidebar_position: 2
 ## Overview

 The **Default Merge Engine** behaves as a LastRow merge engine that retains the latest record for a given primary key. It supports all the operations: `INSERT`, `UPDATE`, `DELETE`.
-Additionally, the default merge engine supports [Partial Update](table-design/table-types/pk-table/index.md#partial-update), which preserves the latest values for the specified update columns.
+Additionally, the default merge engine supports [Partial Update](table-design/table-types/pk-table.md#partial-update), which preserves the latest values for the specified update columns.

 If the `'table.merge-engine'` property is not explicitly defined in the table properties when creating a Primary Key Table, the default merge engine will be applied automatically.
diff --git a/website/docs/table-design/table-types/pk-table/merge-engines/first-row.md b/website/docs/table-design/merge-engines/first-row.md
similarity index 100%
rename from website/docs/table-design/table-types/pk-table/merge-engines/first-row.md
rename to website/docs/table-design/merge-engines/first-row.md
diff --git a/website/docs/table-design/table-types/pk-table/merge-engines/index.md b/website/docs/table-design/merge-engines/index.md
similarity index 57%
rename from website/docs/table-design/table-types/pk-table/merge-engines/index.md
rename to website/docs/table-design/merge-engines/index.md
index dfb6798853..1fc7f9bb13 100644
--- a/website/docs/table-design/table-types/pk-table/merge-engines/index.md
+++ b/website/docs/table-design/merge-engines/index.md
@@ -11,7 +11,7 @@ However, users can specify a different merge engine to customize the merging beh

 The following merge engines are supported:

-1. [Default Merge Engine (LastRow)](table-design/table-types/pk-table/merge-engines/default.md)
-2. [FirstRow Merge Engine](table-design/table-types/pk-table/merge-engines/first-row.md)
-3. [Versioned Merge Engine](table-design/table-types/pk-table/merge-engines/versioned.md)
-4. [Aggregation Merge Engine](table-design/table-types/pk-table/merge-engines/aggregation.md)
+1. [Default Merge Engine (LastRow)](default.md)
+2. [FirstRow Merge Engine](first-row.md)
+3. [Versioned Merge Engine](versioned.md)
+4.
[Aggregation Merge Engine](aggregation.md)
diff --git a/website/docs/table-design/table-types/pk-table/merge-engines/versioned.md b/website/docs/table-design/merge-engines/versioned.md
similarity index 100%
rename from website/docs/table-design/table-types/pk-table/merge-engines/versioned.md
rename to website/docs/table-design/merge-engines/versioned.md
diff --git a/website/docs/table-design/overview.md b/website/docs/table-design/overview.md
index 700d40749c..99997d244c 100644
--- a/website/docs/table-design/overview.md
+++ b/website/docs/table-design/overview.md
@@ -1,7 +1,7 @@
 ---
 sidebar_label: Overview
 title: Table Overview
-sidebar_position: 2
+sidebar_position: 1
 ---

 # Table Overview
@@ -20,7 +20,7 @@ Tables are classified into two types based on the presence of a primary key:
 - Used for updating and managing data in business databases.
 - Support INSERT, UPDATE, and DELETE operations based on the defined primary key.

-A Table becomes a [Partitioned Table](data-distribution/partitioning.md) when a partition column is defined. Data with the same partition value is stored in the same partition. Partition columns can be applied to both Log Tables and Primary Key Tables, but with specific considerations:
+A Table becomes a [Partitioned Table](/table-design/data-distribution/partitioning.md) when a partition column is defined. Data with the same partition value is stored in the same partition. Partition columns can be applied to both Log Tables and Primary Key Tables, but with specific considerations:

 - **For Log Tables**, partitioning is commonly used for log data, typically based on date columns, to facilitate data separation and cleaning.
 - **For Primary Key Tables**, the partition column must be a subset of the primary key to ensure uniqueness.
diff --git a/website/docs/table-design/table-types/pk-table/index.md b/website/docs/table-design/table-types/pk-table.md
similarity index 89%
rename from website/docs/table-design/table-types/pk-table/index.md
rename to website/docs/table-design/table-types/pk-table.md
index 261424b5b9..7331d76198 100644
--- a/website/docs/table-design/table-types/pk-table/index.md
+++ b/website/docs/table-design/table-types/pk-table.md
@@ -31,7 +31,7 @@ In Fluss primary key table, each row of data has a unique primary key.
 If multiple entries with the same primary key are written to the Fluss primary key table, only the last entry will be retained.

-For [Partitioned Primary Key Table](table-design/data-distribution/partitioning.md), the primary key must contain the
+For [Partitioned Primary Key Table](/table-design/data-distribution/partitioning.md), the primary key must contain the
 partition key.

 ## Bucket Assigning
@@ -82,10 +82,10 @@ However, users can specify a different merge engine to customize the merging beh

 The following merge engines are supported:

-1. [Default Merge Engine (LastRow)](merge-engines/default.md)
-2. [FirstRow Merge Engine](merge-engines/first-row.md)
-3. [Versioned Merge Engine](merge-engines/versioned.md)
-4. [Aggregation Merge Engine](merge-engines/aggregation.md)
+1. [Default Merge Engine (LastRow)](/table-design/merge-engines/default.md)
+2. [FirstRow Merge Engine](/table-design/merge-engines/first-row.md)
+3. [Versioned Merge Engine](/table-design/merge-engines/versioned.md)
+4. [Aggregation Merge Engine](/table-design/merge-engines/aggregation.md)

 ## Changelog Generation

@@ -147,13 +147,13 @@ For primary key tables, Fluss supports various kinds of querying abilities.
 For a primary key table, the default read method is a full snapshot followed by incremental data. First, the snapshot data of the table is consumed, followed by the changelog data of the table.

-It is also possible to only consume the changelog data of the table.
For more details, please refer to the [Flink Reads](../../../engine-flink/reads.md)
+It is also possible to only consume the changelog data of the table. For more details, please refer to the [Flink Reads](/engine-flink/reads.md)

 ### Lookup

-Fluss primary key table can lookup data by the primary keys. If the key exists in Fluss, lookup will return a unique row. It is always used in [Flink Lookup Join](../../../engine-flink/lookups.md#lookup).
+Fluss primary key table can lookup data by the primary keys. If the key exists in Fluss, lookup will return a unique row. It is always used in [Flink Lookup Join](/engine-flink/lookups.md#lookup).

 ### Prefix Lookup

 Fluss primary key table can also do prefix lookup by the prefix subset primary keys. Unlike lookup, prefix lookup
-will scan data based on the prefix of primary keys and may return multiple rows. It is always used in [Flink Prefix Lookup Join](../../../engine-flink/lookups.md#prefix-lookup).
+will scan data based on the prefix of primary keys and may return multiple rows. It is always used in [Flink Prefix Lookup Join](/engine-flink/lookups.md#prefix-lookup).
diff --git a/website/docs/table-design/table-types/pk-table/_category_.json b/website/docs/table-design/table-types/pk-table/_category_.json
deleted file mode 100644
index 7374558c6a..0000000000
--- a/website/docs/table-design/table-types/pk-table/_category_.json
+++ /dev/null
@@ -1,4 +0,0 @@
-{
-  "label": "Primary Key Table",
-  "position": 1
-}