Skip to content

Conversation

@bosd
Copy link
Member

@bosd bosd commented Sep 28, 2025

No description provided.

@gemini-code-assist
Copy link

Summary of Changes

Hello @bosd, I'm Gemini Code Assist1! I'm currently reviewing this pull request and will post my feedback shortly. In the meantime, here's a summary to help you and other reviewers quickly get up to speed!

This pull request significantly enhances the data import capabilities by fortifying its resilience against common data quality issues, especially type mismatches and encoding problems. It introduces intelligent pre-processing and error recovery mechanisms that automatically detect and correct problematic data types before they reach Odoo, and provides more informative feedback when issues arise. Furthermore, it refines the handling of relational fields by supporting diverse ID formats and improving the efficiency and logging of relational updates.

Highlights

  • Enhanced Type Conversion for Odoo Imports: Introduced a new _safe_convert_field_value function to pre-process data, converting float strings (e.g., "1.0") to integers for Odoo's integer fields and handling empty numeric values, preventing common type-related errors during record creation.
  • Robust Error Handling and Fallback: Implemented specific error detection for 'tuple index out of range' and 'does not seem to be an integer' errors during both bulk load and individual create operations. If bulk load fails due to these type errors, the system now gracefully falls back to processing records individually.
  • Improved CSV Reading with Encoding Fallback: The _read_data_file function now includes an automatic fallback mechanism to try multiple common encodings (utf-8, latin-1, cp1252, iso-8859-1) if a UnicodeDecodeError occurs, making CSV parsing more resilient.
  • Pre-flight Type Correction using Polars: Added a new _auto_correct_field_types function that uses Polars to proactively identify and correct float-like string values in integer columns within the source CSV, saving a corrected file for import.
  • Flexible External ID Resolution: The _resolve_related_ids function now supports resolving both numeric database IDs (passed directly) and traditional XML IDs (e.g., module.name) for relational fields, improving flexibility and reducing resolution failures.
  • Refined Relational Import Logic: The run_write_tuple_import function has been refactored for better ID normalization, explicit handling of /id suffixes in column names, and delegation of Odoo write operations to a new _execute_write_tuple_updates helper, enhancing clarity and error reporting.
Using Gemini Code Assist

The full guide for Gemini Code Assist can be found on our documentation page, here are some quick tips.

Invoking Gemini

You can request assistance from Gemini at any point by creating a comment using either /gemini <command> or @gemini-code-assist <command>. Below is a summary of the supported commands on the current page.

Feature Command Description
Code Review /gemini review Performs a code review for the current pull request in its current state.
Pull Request Summary /gemini summary Provides a summary of the current pull request in its current state.
Comment @gemini-code-assist Responds in comments when explicitly tagged, both in pull request comments and review comments.
Help /gemini help Displays a list of available commands.

Customization

To customize Gemini Code Assist for GitHub experience, repository maintainers can create a configuration file and/or provide a custom code review style guide (such as PEP-8 for Python) by creating and adding files to a .gemini/ folder in the base of the repository. Detailed instructions can be found here.

Limitations & Feedback

Gemini Code Assist may make mistakes. Please leave feedback on any instances where its feedback is incorrect or counter productive. You can react with 👍 and 👎 on @gemini-code-assist comments. If you're interested in giving your feedback about your experience with Gemini Code Assist for Github and other Google products, sign up here.

You can also get AI-powered code generation, chat, as well as code reviews directly in the IDE at no cost with the Gemini Code Assist IDE Extension.

Footnotes

  1. Review the Privacy Notices, Generative AI Prohibited Use Policy, Terms of Service, and learn how to configure Gemini Code Assist in GitHub here. Gemini can make mistakes, so double check it and use code with caution.

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant improvements to data type handling and error resilience, particularly for relational fields and numeric conversions. The changes correctly handle external IDs suffixed with /id by treating them as strings during the import process, which is a crucial fix. The introduction of _safe_convert_field_value and pre-flight type correction using Polars are excellent additions for preventing common type-related errors. However, the implementation has introduced considerable code duplication, especially in import_threaded.py for retrieving model field metadata and performing type conversions. Refactoring these duplicated blocks into centralized helper functions would greatly improve the code's maintainability and readability. Additionally, there are a few areas where logic could be simplified or made more robust, such as in CSV reading and redundant connection fetching.

bosd and others added 23 commits September 28, 2025 17:30
…ed branch onto latest master\n- Resolved merge conflicts in import_threaded.py, importer.py, and relational_import.py\n- Maintained enhanced debug logging and field handling functionality\n- Preserved fixes for direct database ID handling and connection management
…rors that were being masked by generic IndexError handler
…ors by safely converting field values before sending to Odoo
…ex out of range errors occur in bulk load operations
…ers for integer fields to prevent tuple index out of range errors
…hen _fields is a function/method instead of a dictionary
…n _fields is a method vs dictionary property, and add safer IndexError catching in both batch and individual record processing
…ex out of range errors caused by float strings in integer fields
… errors: broaden detection to catch 'does not seem to be an integer' errors alongside 'tuple index out of range'
…atch and add proper IndexError handling with fallback to individual record processing
…oat strings in integer fields before import using Polars casting
…y checking model._fields before accessing it
…s in load batch processing and fall back to individual record creation
@bosd bosd force-pushed the fix-o2m-id-field-handling-rebased branch from 9abd787 to 0914ff5 Compare September 28, 2025 15:36
bosd and others added 4 commits September 28, 2025 17:36
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
Co-authored-by: gemini-code-assist[bot] <176961590+gemini-code-assist[bot]@users.noreply.github.com>
…ed branch onto latest master\n- Resolved merge conflicts in import_threaded.py, importer.py, and relational_import.py\n- Maintained enhanced debug logging and field handling functionality\n- Preserved fixes for direct database ID handling and connection management
bosd added 17 commits September 28, 2025 17:47
…ers for integer fields to prevent tuple index out of range errors
…hen _fields is a function/method instead of a dictionary
…n _fields is a method vs dictionary property, and add safer IndexError catching in both batch and individual record processing
…ex out of range errors caused by float strings in integer fields
… errors: broaden detection to catch 'does not seem to be an integer' errors alongside 'tuple index out of range'
…atch and add proper IndexError handling with fallback to individual record processing
…oat strings in integer fields before import using Polars casting
…y checking model._fields before accessing it
…s in load batch processing and fall back to individual record creation
…taFlow/odoo-data-flow into fix-o2m-id-field-handling-rebased
@bosd
Copy link
Member Author

bosd commented Sep 28, 2025

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant improvements to the data import process, focusing on robustness and error handling, particularly for data type mismatches that cause "tuple index out of range" errors. Key changes include refactoring CSV reading to handle encoding issues, adding safe type conversion logic before sending data to Odoo, and enhancing relational ID resolution to support both database and XML IDs.

My review focuses on several areas of code duplication that have been introduced, potential performance bottlenecks in file parsing and database updates, and the use of fixed delays (time.sleep) which could be replaced with more dynamic solutions. Addressing these points will further improve the maintainability and performance of this already much-improved import engine.

Comment on lines +511 to +521
elif field_type == "float":
try:
# Convert numeric strings to float
if str_value.replace(".", "").replace("-", "").isdigit():
return float(str_value)
else:
# Non-numeric string - leave as-is
return str_value
except (ValueError, TypeError):
# Conversion failed - leave as original string
return field_value

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The logic for converting a string to a float is a bit complex and can be simplified. The if str_value.replace(".", "").replace("-", "").isdigit(): check is not fully robust (e.g., it would fail for scientific notation or incorrectly handle multiple decimal points) and is somewhat redundant given the surrounding try-except block.

A more direct and robust approach would be to attempt the conversion and let the except block handle failures.

    elif field_type == "float":
        try:
            # Attempt to convert directly to float.
            return float(str_value)
        except (ValueError, TypeError):
            # Conversion failed - leave as original string for Odoo to handle.
            return field_value

Comment on lines +890 to +918
# PRE-PROCESSING: Clean up field values to prevent type errors
# This prevents "tuple index out of range" errors in Odoo server processing
model_fields = _get_model_fields(model)
if model_fields:
processed_load_lines = []
for row in load_lines:
processed_row = []
for i, value in enumerate(row):
if i < len(load_header):
field_name = load_header[i]
clean_field_name = field_name.split("/")[0]

field_type = "unknown"
if clean_field_name in model_fields:
field_info = model_fields[clean_field_name]
field_type = field_info.get("type", "unknown")

converted_value = _safe_convert_field_value(
field_name, value, field_type
)
processed_row.append(converted_value)
else:
processed_row.append(value)
processed_load_lines.append(processed_row)
load_lines = processed_load_lines
else:
log.debug(
"Model has no _fields attribute, using raw values for load method"
)

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

The pre-processing logic here to safely convert field values is a duplicate of the logic found in _create_batch_individually (lines 662-674).

To adhere to the DRY (Don't Repeat Yourself) principle and improve maintainability, this logic should be extracted into a single helper function. This function could take the header, data rows, and model fields as input and return the processed rows, and then be called from both _execute_load_batch and _create_batch_individually.

Comment on lines 675 to 698
# 4. Execute the updates
success = _execute_write_tuple_updates(
config, model, field, link_df, id_map, related_model_fk, original_filename
)

# Count successful updates - get from link_df
if link_df.height > 0:
successful_count = len(
[
row["external_id"]
for row in link_df.iter_rows(named=True)
if id_map.get(row["external_id"])
]
)
else:
successful_count = 0
failed_count = 0 if success else "unknown"

log.info(
f"Finished 'Write Tuple' for '{field}': "
f"{successful_count} successful, {failed_count} failed."
)

return success

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

medium

After this refactoring, the _create_relational_records function is no longer called from run_write_tuple_import and appears to be unused throughout the codebase. If it is indeed no longer needed, it should be removed to eliminate dead code and improve maintainability.

@bosd
Copy link
Member Author

bosd commented Sep 28, 2025

/gemini review

Copy link

@gemini-code-assist gemini-code-assist bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code Review

This pull request introduces significant enhancements to the data import process, focusing on robustness and error handling. Key changes include proactive data type conversion to prevent common errors with numeric fields, more resilient logic for reading CSV files with various encodings, and a major refactoring of how relational fields (especially many-to-many) are handled. The new approach for relational imports is more idiomatic to Odoo and more efficient.

My review focuses on a few areas for improvement:

  • A potential performance issue with large files due to in-memory file reading.
  • A resource leak from a temporary file that is not being cleaned up.
  • Opportunities to reduce code duplication and remove dead code for better maintainability.

Overall, this is a strong set of changes that will make the import tool much more reliable.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants