Binary file added docs/assets/images/admin/trino/query-history.png
Binary file added docs/assets/images/admin/trino/trino-cluster.png
Binary file added docs/assets/images/admin/trino/worker-status.png
Binary file added docs/assets/images/admin/trino/workers.png
Binary file added docs/assets/images/guides/trino/queries.png
Binary file added docs/assets/images/guides/trino/query-engine.png
Binary file added docs/assets/images/guides/trino/sql-runner.png
85 changes: 85 additions & 0 deletions docs/setup_installation/admin/operationLogs.md
@@ -0,0 +1,85 @@
---
description: Guide on how to manage async service operations as a Hopsworks administrator
---

# Service Operations

Service Operations provides a centralized view of asynchronous operations executed across Hopsworks services. Operations are handled by a timer-based system that manages execution, retries, and tracking across multiple service handlers.

## Overview

Asynchronous operations in Hopsworks are processed by background timers rather than executing immediately. Each operation can be handled by multiple services, with each service registering its own handler. The Service Operations interface allows administrators to:

- Monitor ongoing and completed operations
- View operation status per service
- Track retry attempts and failures
- Reset failed operations
- View operation history

<figure>
<img src="../../../assets/images/admin/operation-logs/operation-logs.png" alt="Service Operations" />
<figcaption>Service Operations</figcaption>
</figure>

## Operation Status

Each operation displays status information for every service that has registered a handler. The status table shows:

- **Operation ID**: Unique identifier for the operation
- **Service**: The service handling this operation
- **Status**: Current state (pending, running, completed, failed)
- **Retry count**: Number of retry attempts made
- **Last execution**: Timestamp of the most recent execution attempt

Click on an operation to view detailed status messages and execution logs.

<figure>
<img src="../../../assets/images/admin/operation-logs/operation-logs-msg.png" alt="Service Operations status message" />
<figcaption>Service Operations status message</figcaption>
</figure>

## Retry Mechanism

When an operation fails for a service, the system retries it automatically according to the following rules:

- **Exponential backoff**: Retry intervals increase exponentially (e.g., 1 min, 2 min, 4 min, 8 min, etc.)
- **Daily limit**: Operations are retried up to a maximum number of times per day
- **Successful operations**: Automatically removed from the active list and moved to history

This approach prevents overwhelming services with continuous retries while allowing temporary failures to resolve automatically.
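
The retry schedule described above can be sketched as follows. This is only an illustration of exponential backoff, not the Hopsworks implementation: the one-minute base delay and the doubling factor are assumptions taken from the "1 min, 2 min, 4 min, 8 min" example, and `backoff_delays` is a hypothetical helper name.

```python
from datetime import timedelta


def backoff_delays(attempts, base=timedelta(minutes=1), factor=2):
    """Return the wait before each retry: base, base*factor, base*factor**2, ...

    The base delay and factor are assumptions for illustration only.
    """
    return [base * (factor ** n) for n in range(attempts)]


# Waits before the first four retries, expressed in minutes.
first_four = [d.total_seconds() / 60 for d in backoff_delays(4)]
print(first_four)
```

Under these assumed values the wait doubles on every attempt, so after ten failed retries the next wait would already be 1024 minutes (roughly 17 hours).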

## Resetting Backoff

For operations with more than 10 retry attempts, administrators can manually reset the backoff:

1. Locate the failed operation in the Service Operations list
2. Click on the operation to view details
3. If the retry count exceeds 10, a "Reset Backoff" button will be available
4. Click to reset the backoff to its initial value

This allows the operation to retry sooner after you've resolved the underlying issue.

## Operation History

Completed and failed operations are retained in the system for a configurable number of days. After the retention period expires, operations are automatically purged from history. This keeps the operation log manageable while preserving recent historical data for troubleshooting and auditing.

## Configuration

Service Operations behavior can be customized through cluster configuration variables. To modify these settings, navigate to **Cluster Settings** → **Configuration** and search for the variable name.

**Available Variables:**

- **async_services_timer_enabled**: Enable or disable the async services timer (default: `true`)
- **async_services_timer_interval_ms**: Timer execution interval in milliseconds (default: `15000` = 15 seconds)
- **async_services_timer_delete_history_after_days**: Days to retain operation history (default: `7`)
- **async_services_timer_batch_size**: Maximum operations processed per timer execution (default: `1000`)

Adjust these values to balance system responsiveness, resource usage, and historical data retention.
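
As a rough guide to how the interval and batch size interact, the sketch below computes an upper bound on the number of operations the timer can pick up per minute. It assumes that each timer execution processes at most one batch and that executions do not overlap; `max_operations_per_minute` is a hypothetical helper, not part of Hopsworks.

```python
# Defaults listed above.
TIMER_INTERVAL_MS = 15000  # async_services_timer_interval_ms
BATCH_SIZE = 1000          # async_services_timer_batch_size


def max_operations_per_minute(interval_ms, batch_size):
    """Upper bound on operations processed per minute (assumes one batch per run)."""
    runs_per_minute = 60_000 / interval_ms
    return runs_per_minute * batch_size


print(max_operations_per_minute(TIMER_INTERVAL_MS, BATCH_SIZE))
```

With the defaults this gives four timer runs per minute and therefore at most 4000 operations per minute. Halving the interval or doubling the batch size doubles that bound, at the cost of more frequent or heavier timer runs.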

## Best Practices

- **Monitor regularly**: Check Service Operations to identify recurring failures
- **Investigate failures**: Click on failed operations to review error messages and identify root causes
- **Reset when appropriate**: Use backoff reset after fixing issues to speed up recovery
- **Configure retention**: Set operation history retention based on compliance and troubleshooting needs
- **Track patterns**: Look for patterns in failures that may indicate systemic issues
100 changes: 100 additions & 0 deletions docs/setup_installation/admin/trino.md
@@ -0,0 +1,100 @@
---
description: Guide on how to manage Trino as a Hopsworks administrator
---

# Query Engine (Trino)

As a Hopsworks administrator, you can monitor and manage the Trino cluster used for query execution across all projects. The admin interface provides cluster-wide visibility into resources, performance, and worker health.

## Cluster Overview

The cluster overview provides a comprehensive view of your Trino deployment, including:

- **Cluster status**: Overall health and availability
- **Active queries**: Total number of running queries across all projects
- **Worker nodes**: Number of active and total workers
- **Resource utilization**: Cluster-wide CPU and memory usage
- **Query throughput**: Average query execution times and data processed

Use this dashboard to monitor overall cluster health and identify capacity issues.

<figure>
<img src="../../../assets/images/admin/trino/trino-cluster.png" alt="cluster overview" />
<figcaption>Trino cluster overview</figcaption>
</figure>

## Query History

The query history shows all queries executed across the Trino cluster, regardless of project. This centralized view helps administrators:

- **Monitor usage patterns**: Identify peak usage times and resource-intensive queries
- **Troubleshoot issues**: Investigate failed or slow queries
- **Audit activity**: Track query execution by project and user
- **Optimize performance**: Identify queries that may need optimization

Each query entry displays:

- Query ID and text
- Project and user who executed it
- Status (running, completed, failed)
- Execution time and resources consumed
- Timestamp

Click any query to view detailed execution information.

<figure>
<img src="../../../assets/images/admin/trino/query-history.png" alt="query history" />
<figcaption>Trino query history</figcaption>
</figure>

## Managing Workers

The workers view displays all Trino nodes in the cluster. For each node, you can see:

- **Node IP**: IP address of the node
- **Node version**: Trino version running on the node
- **Coordinator or worker**: Whether the node acts as the coordinator or as a worker
- **State**: Current state of the node (active, idle, or offline)

This view helps you monitor the cluster topology and identify any nodes that may be offline or experiencing issues.

<figure>
<img src="../../../assets/images/admin/trino/workers.png" alt="workers" />
<figcaption>Trino workers</figcaption>
</figure>

### Worker Status Details

Click on a worker to view detailed status information:

- **Resource metrics**: Detailed CPU, memory, and network usage over time
- **Task breakdown**: Types and number of tasks being executed
- **Error logs**: Any errors or warnings from the worker
- **Configuration**: Worker settings and assigned resources
- **Performance history**: Historical performance trends

Use this detailed view to diagnose worker-specific issues and optimize resource allocation.

<figure>
<img src="../../../assets/images/admin/trino/worker-status.png" alt="worker status" />
<figcaption>Trino worker status</figcaption>
</figure>

## Configuration

Trino behavior can be customized through cluster configuration variables. To modify these settings, navigate to **Cluster Settings** → **Configuration** and search for the variable name.

**Available Variables:**

- **trino_enabled**: Enable or disable Trino cluster-wide (default: `true`)
- **trino_default_catalog**: Default catalog used for Superset queries (default: `hive`)

These settings control the availability and default behavior of the Trino query engine across your Hopsworks cluster.

## Best Practices for Trino Management

- **Monitor regularly**: Check cluster overview daily to spot trends and issues early
- **Review slow queries**: Investigate queries with long execution times in the query history
- **Balance workload**: Ensure load is spread evenly across workers and that no single worker is overloaded
- **Scale appropriately**: Add workers during peak usage periods if resources are constrained
- **Track growth**: Monitor query volume trends to plan for future capacity needs
155 changes: 155 additions & 0 deletions docs/user_guides/projects/trino/query_engine.md
@@ -0,0 +1,155 @@
---
description: Guide on how to use Query Engine as a Hopsworks user
---

# Query Engine (Trino)

The Query Engine in Hopsworks is powered by Trino, a distributed SQL query engine that allows you to run interactive analytics on your data. Use it to explore feature groups, run ad-hoc queries, and analyze data across your project.

## Accessing the Query Engine

Navigate to the Query Engine from your project's left sidebar. The Query Engine interface provides access to the SQL runner, cluster information, and query history.

<figure>
<img src="../../../../assets/images/guides/trino/query-engine.png" alt="Query Engine" />
<figcaption>Query Engine</figcaption>
</figure>

## SQL Runner

The SQL runner is where you write and execute SQL queries against your data.

**To run a query:**

1. Write your SQL query in the editor
2. Select the database/catalog you want to query
3. Click "Run" to execute the query
4. View results in the table below the editor

The SQL runner supports standard SQL syntax and provides auto-completion for databases, tables, and columns.

<figure>
<img src="../../../../assets/images/guides/trino/sql-runner.png" alt="SQL runner" />
<figcaption>SQL runner</figcaption>
</figure>

### SQL Statement Syntax Help

Need help with SQL syntax? Click the help icon in the SQL runner to open the reference for all supported SQL statements, including SELECT statements, functions, data types, operators, and more. The reference opens inside the query interface, so you do not need to leave the SQL runner.

<figure>
<img src="../../../../assets/images/guides/trino/sql-statement-syntax.png" alt="SQL statement syntax" />
<figcaption>SQL statement syntax</figcaption>
</figure>

## Cluster Overview

The cluster overview shows the health and status of your Trino cluster. Here you can monitor:

- **Active workers**: Number of workers currently processing queries
- **Running queries**: Queries currently being executed
- **Resource utilization**: CPU and memory usage across the cluster
- **Worker status**: Health status of individual worker nodes

This information helps you understand cluster performance and capacity.

<figure>
<img src="../../../../assets/images/guides/trino/cluster-overview.png" alt="cluster overview" />
<figcaption>Query Engine cluster overview</figcaption>
</figure>

## Queries

The Queries tab displays a history of all executed queries. For each query, you can see:

- **Query ID**: Unique identifier for the query
- **Status**: Completed, failed, or running
- **Duration**: How long the query took to execute
- **User**: Who submitted the query
- **Timestamp**: When the query was run

Click on any query to view detailed execution information.

<figure>
<img src="../../../../assets/images/guides/trino/queries.png" alt="queries" />
<figcaption>Queries</figcaption>
</figure>

## Query Details

Clicking on a query opens the detailed view with comprehensive execution information.

### Overview

The overview tab shows query metadata, execution timeline, and performance metrics including:

- Query text
- Execution time
- Data processed
- Rows returned
- Resource consumption

<figure>
<img src="../../../../assets/images/guides/trino/query-details.png" alt="Query details" />
<figcaption>Query details</figcaption>
</figure>

### Live Plan

The live plan visualizes the query execution plan in real time, showing how Trino processes your query across different stages and operators.

<figure>
<img src="../../../../assets/images/guides/trino/query-details-live-plan.png" alt="Query details live plan" />
<figcaption>Query details: live plan</figcaption>
</figure>

### Stages

The stages view breaks down query execution into individual stages, showing:

- Stage dependencies
- Data flow between stages
- Resource usage per stage
- Execution time for each stage

This helps identify performance bottlenecks in complex queries.

<figure>
<img src="../../../../assets/images/guides/trino/query-details-stage.png" alt="Query details stages" />
<figcaption>Query details: stages</figcaption>
</figure>

### Splits

Splits show how Trino parallelizes query execution. Each split represents a portion of data processed by a worker. View split-level metrics to understand query parallelism and data distribution.

<figure>
<img src="../../../../assets/images/guides/trino/query-details-split.png" alt="Query details split" />
<figcaption>Query details: split</figcaption>
</figure>

### References

The references tab lists all tables and data sources accessed by the query, helping you understand data dependencies.

<figure>
<img src="../../../../assets/images/guides/trino/query-details-ref.png" alt="Query details references" />
<figcaption>Query details: references</figcaption>
</figure>

### JSON

The JSON view provides the complete query execution plan and statistics in JSON format, useful for programmatic analysis or debugging.

<figure>
<img src="../../../../assets/images/guides/trino/query-details-json.png" alt="Query details json" />
<figcaption>Query details: JSON</figcaption>
</figure>
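
As an example of the programmatic analysis mentioned above, the sketch below loads a JSON document copied from this view and pulls out a few fields. The field names (`queryId`, `state`, `queryStats`, `elapsedTime`, `totalSplits`) are assumptions modeled on Trino's query info format and may not match what your Hopsworks version shows. The sample document is made up for illustration, and `summarize` is a hypothetical helper.

```python
import json

# Made-up sample; paste the real document from the JSON view instead.
SAMPLE = """
{
  "queryId": "example_query_id",
  "state": "FINISHED",
  "queryStats": {"elapsedTime": "2.31s", "totalSplits": 48}
}
"""


def summarize(document):
    """Extract a small summary; every field name here is an assumption."""
    info = json.loads(document)
    stats = info.get("queryStats", {})
    return {
        "id": info.get("queryId"),
        "state": info.get("state"),
        "elapsed": stats.get("elapsedTime"),
        "splits": stats.get("totalSplits"),
    }


print(summarize(SAMPLE))
```

Using `.get` keeps the helper from failing when a field is absent, which is useful because the exact schema is not documented here.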

## Best Practices

- **Limit result sets**: Use `LIMIT` clauses for exploratory queries to reduce resource usage
- **Filter early**: Apply `WHERE` clauses to reduce data scanned
- **Monitor query performance**: Check the Queries tab to identify slow or failed queries
- **Use the live plan**: For complex queries, review the execution plan to optimize performance
- **Check cluster status**: Ensure adequate resources are available before running large queries
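
The first two practices can be combined in one query. The sketch below is a small, hypothetical Python helper that builds such a query string for pasting into the SQL runner. The table and column names are placeholders, not objects that exist in your project.

```python
def exploratory_query(table, columns, where=None, limit=100):
    """Build a SELECT that filters early (WHERE) and caps the result set (LIMIT).

    Plain string formatting is used for readability; do not pass untrusted input.
    """
    sql = f"SELECT {', '.join(columns)} FROM {table}"
    if where:
        sql += f" WHERE {where}"
    return f"{sql} LIMIT {int(limit)}"


# Placeholder table and columns, for illustration only.
query = exploratory_query(
    "transactions_1", ["customer_id", "amount"], where="amount > 1000", limit=50
)
print(query)
```

The resulting statement reads `SELECT customer_id, amount FROM transactions_1 WHERE amount > 1000 LIMIT 50`. The `WHERE` clause reduces the data scanned, and `LIMIT` bounds the number of rows returned to the editor.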
3 changes: 3 additions & 0 deletions mkdocs.yml
@@ -190,6 +190,7 @@ nav:
- Api Keys:
- Create API Key: user_guides/projects/api_key/create_api_key.md
- AWS IAM Roles: user_guides/projects/iam_role/iam_role_chaining.md
- Query Engine: user_guides/projects/trino/query_engine.md
- MLOps:
- user_guides/mlops/index.md
- Model Registry:
@@ -263,6 +264,8 @@ nav:
- Audit:
- Access Audit Logs: setup_installation/admin/audit/audit-logs.md
- Export Audit Logs: setup_installation/admin/audit/export-audit-logs.md
- Service Operations: setup_installation/admin/operationLogs.md
- Query Engine: setup_installation/admin/trino.md
- ArrowFlight Server with DuckDB: setup_installation/common/arrow_flight_duckdb.md
- Python API: python-api
- Java API: javadoc