[Bug][JobHistory] Global history download does not filter runtype parameter causing data inconsistency #5310

@v-kkhuang

Description

Linkis Component

linkis-public-enhancements/linkis-jobhistory

What happened

English:

The global history download function does not filter by the runtype parameter, so the downloaded data can differ from the query results shown in the UI. This hurts both user experience and data accuracy.

Problem Description:

  1. The global history query supports filtering by the runtype parameter.
  2. The download function does not apply the runtype filter, so the downloaded file can include records that do not match the filter criteria.
  3. Downloaded data must strictly match the query results, which is currently not the case.

Impact:

  • Data inconsistency between UI display and downloaded files
  • User confusion and loss of trust in the system
  • Potential data analysis errors from inconsistent datasets

What you expected to happen

English:

When users filter global history by runtype parameter and download the results, the downloaded data should exactly match the filtered query results:

  1. Download function should apply the same runtype filter as the query
  2. Downloaded data should contain only records matching the filter criteria
  3. Parameter passing mechanism should be robust to avoid parameter loss
  4. Consistent behavior between query display and download
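
One way to get consistent behavior between display and download is to route both paths through a single filter value, so a parameter cannot be dropped on one side only. The sketch below is a minimal in-memory illustration; `JobRecord`, `HistoryFilter`, `queryForDisplay`, and `queryForDownload` are hypothetical names, not existing Linkis classes or methods.

```scala
object SharedFilterSketch {
  // Hypothetical record type; the real job history entity may have other fields.
  final case class JobRecord(id: Long, user: String, runtype: String)

  // Hypothetical filter: an unset field (None) means "do not filter on this field".
  final case class HistoryFilter(user: Option[String] = None,
                                 runtype: Option[String] = None) {
    // A record matches when every filter that is set agrees with it.
    def matches(r: JobRecord): Boolean =
      user.forall(_ == r.user) && runtype.forall(_ == r.runtype)
  }

  // Single place where the filter is applied.
  def select(all: Seq[JobRecord], filter: HistoryFilter): Seq[JobRecord] =
    all.filter(filter.matches)

  // Both entry points delegate to `select`, so they cannot diverge.
  def queryForDisplay(all: Seq[JobRecord], f: HistoryFilter): Seq[JobRecord] =
    select(all, f)

  def queryForDownload(all: Seq[JobRecord], f: HistoryFilter): Seq[JobRecord] =
    select(all, f)
}
```

With this shape, adding a new filter field only requires a change in `HistoryFilter.matches`, and both the page and the export pick it up.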

How to reproduce

English:

  1. Access global history page in Linkis
  2. Apply runtype filter (e.g., runtype=sql)
  3. Verify filtered results display correctly (e.g., only SQL tasks shown)
  4. Click download button to export filtered results
  5. Open downloaded file
  6. Observe that downloaded data includes records with different runtype values

Expected: Downloaded file contains only runtype=sql records
Actual: Downloaded file contains all records regardless of runtype


Anything else

English:

Root Cause Analysis:

// Current implementation (buggy)
def downloadGlobalHistory(request: DownloadRequest): Unit = {
  val query = buildQuery(
    startDate = request.startDate,
    endDate = request.endDate,
    user = request.user
    // runtype parameter is missing!
  )
  val results = executeQuery(query)
  exportToFile(results)
}

Suggested Fix:

// Fixed implementation
def downloadGlobalHistory(request: DownloadRequest): Unit = {
  val query = buildQuery(
    startDate = request.startDate,
    endDate = request.endDate,
    user = request.user,
    runtype = request.runtype  // Add runtype parameter
  )
  val results = executeQuery(query)

  // Validate result count matches UI display
  validateResultCount(results.size, request.expectedCount)

  exportToFile(results)
}
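
The fix above calls `validateResultCount` without defining it. A minimal sketch of what it could look like follows, assuming the client optionally sends the total it saw in the query view. The `Option[Int]` type for the expected count and the `Either` return type are assumptions made for illustration.

```scala
object ResultCountCheck {
  // Hypothetical helper: `expected` is the count the UI displayed, if the client sent one.
  // Returns Right(actual) when the counts agree or nothing was expected,
  // and Left(message) when they differ.
  def validateResultCount(actual: Int, expected: Option[Int]): Either[String, Int] =
    expected match {
      case Some(e) if e != actual =>
        Left(s"Download has $actual records but the query view reported $e")
      case _ =>
        Right(actual)
    }
}
```

Jobs may be created between the query and the download, so a mismatch is not always a bug. Logging the `Left` case may be a safer default than failing the download.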

Additional Improvements:

  1. Parameter Validation:

    def validateDownloadRequest(request: DownloadRequest): Unit = {
      require(request.startDate != null, "Start date is required")
      require(request.endDate != null, "End date is required")
      // Validate runtype if provided
      if (request.runtype != null) {
        require(VALID_RUNTYPES.contains(request.runtype),
                s"Invalid runtype: ${request.runtype}")
      }
    }

  2. Consistent Query Building:

    // Reuse same query builder for display and download
    def buildHistoryQuery(params: QueryParams): String = {
      // NOTE: values are interpolated here only for illustration; real code
      // should bind them as query parameters to avoid SQL injection.
      val conditions = scala.collection.mutable.ListBuffer[String]()

      conditions += s"created_time >= '${params.startDate}'"
      conditions += s"created_time <= '${params.endDate}'"

      if (params.runtype != null) {
        conditions += s"runtype = '${params.runtype}'"
      }

      if (params.user != null) {
        conditions += s"execute_user = '${params.user}'"
      }

      s"SELECT * FROM job_history WHERE ${conditions.mkString(" AND ")}"
    }

  3. Testing:

    import org.scalatest.funsuite.AnyFunSuite

    class GlobalHistoryDownloadTest extends AnyFunSuite {
      test("download should respect runtype filter") {
        // Setup: Insert test data with different runtypes
        insertTestData(runtype = "sql", count = 10)
        insertTestData(runtype = "spark", count = 5)

        // Filter by runtype=sql and download
        val request = DownloadRequest(runtype = "sql")
        downloadGlobalHistory(request)

        // downloadGlobalHistory returns Unit, so read the exported file back
        // (readExportedRecords is a hypothetical test helper)
        val downloadedRecords = readExportedRecords()

        // Verify only SQL records are downloaded
        assert(downloadedRecords.size == 10)
        assert(downloadedRecords.forall(_.runtype == "sql"))
      }
    }
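
The query builder in item 2 concatenates user-supplied values such as runtype directly into the SQL text, which leaves it open to SQL injection. A hedged alternative is to emit `?` placeholders and return the values separately, so the caller can bind them through a prepared statement. The sketch below reuses the table and column names from the snippet above. The `QueryParams` shape with `Option` fields is an assumption, and how the statement is executed is left out.

```scala
object ParameterizedHistoryQuery {
  // Hypothetical parameter holder; optional filters are modeled as Option instead of null.
  final case class QueryParams(startDate: String,
                               endDate: String,
                               runtype: Option[String] = None,
                               user: Option[String] = None)

  // Returns the SQL text with `?` placeholders plus the values to bind, in order.
  def buildHistoryQuery(params: QueryParams): (String, Seq[String]) = {
    // Conditions that are always present.
    val required: Seq[(String, String)] = Seq(
      "created_time >= ?" -> params.startDate,
      "created_time <= ?" -> params.endDate
    )
    // Conditions that are added only when the filter is set.
    val optional: Seq[(String, String)] = Seq(
      params.runtype.map(rt => "runtype = ?" -> rt),
      params.user.map(u => "execute_user = ?" -> u)
    ).flatten

    val (conditions, values) = (required ++ optional).unzip
    val where = conditions.mkString(" AND ")
    (s"SELECT * FROM job_history WHERE $where", values)
  }
}
```

If both the display query and the download call this one function with the same `QueryParams`, the runtype condition is either present in both or absent in both.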

Implementation Checklist:

  • Add runtype parameter to download request
  • Update query builder to include runtype filter
  • Add parameter validation
  • Implement result count verification
  • Add unit tests for runtype filtering
  • Add integration tests for download consistency
  • Update API documentation

Are you willing to submit a PR?

  • Yes I am willing to submit a PR!
