| 1 | +# Copilot Instructions |
| 2 | + |
| 3 | +## Python Code Style |
| 4 | + |
| 5 | +When generating Python code for the DataFusion Python project, adhere to the following style guidelines: |
| 6 | + |
| 7 | +1. Follow the [PEP 8](https://www.python.org/dev/peps/pep-0008/) style guide |
| 8 | +2. Comply with Ruff lint rules, specifically: |
| 9 | + |
| 10 | + - Use double quotes for string literals |
| 11 | + - Use explicit relative imports (e.g., `from .module import Class`) |
| 12 | + - Limit line length to 88 characters, the Black formatter default (prevents E501, W505) |
| 13 | + - Use type hints for function parameters and return values |
| 14 | + - Avoid unused imports |
| 15 | + - Use f-strings instead of `.format()` or `%` formatting |
| 16 | + - Avoid unnecessary `else` statements after `return` |
| 17 | + - Use `isinstance()` instead of comparing types directly |
| 18 | + - Keep docstrings concise and follow Google-style docstring format |
| 19 | + - Assign exception message strings to variables before raising (prevent EM101) |
| 20 | + |
| 21 | + ```python |
| 22 | + # Correct: |
| 23 | + msg = "Invalid input value" |
| 24 | + raise ValueError(msg) |
| 25 | + |
| 26 | + # Incorrect: |
| 27 | + raise ValueError("Invalid input value") |
| 28 | + ``` |
| 29 | + |
| 30 | +3. Include comprehensive docstrings for: |
| 31 | + |
| 32 | + - Modules |
| 33 | + - Classes |
| 34 | + - Functions/methods |
| 35 | + |
| 36 | +4. For tests: |
| 37 | + - Use meaningful test function names that describe what is being tested |
| 38 | + - Group related tests in classes |
| 39 | + - Use pytest-style assertions (plain `assert`) instead of unittest-style assertion methods |
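Several of the Ruff-oriented rules in item 2 can be seen together in one short function. The sketch below is illustrative only: `describe_value` is a hypothetical helper, not a function from the DataFusion Python codebase.

```python
def describe_value(value: object) -> str:
    """Return a short description of a value.

    Args:
        value: Any Python object.

    Returns:
        A human-readable description of the value.

    Raises:
        ValueError: If the value is None.
    """
    if value is None:
        # Assign the message to a variable first (avoids EM101)
        msg = "value must not be None"
        raise ValueError(msg)

    # Use isinstance() rather than comparing type(value) to str
    if isinstance(value, str):
        return f"string of length {len(value)}"

    # No `else` is needed here because the branch above already returned
    return f"{type(value).__name__}: {value!r}"
```

Each branch returns or raises directly, so the function needs no `else`, and both messages are built with f-strings or plain double-quoted literals.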
| 40 | + |
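The test guidelines in item 4 might look like the following in practice. This is a minimal sketch: `normalize` is a hypothetical function under test, defined inline so the example is self-contained.

```python
def normalize(name: str) -> str:
    """Trim and lowercase a name (hypothetical function under test)."""
    return name.strip().lower()


class TestNormalize:
    """Group the related tests for `normalize` in one class."""

    def test_strips_surrounding_whitespace(self) -> None:
        # Plain `assert` instead of unittest's self.assertEqual(...)
        assert normalize("  abc  ") == "abc"

    def test_lowercases_mixed_case_input(self) -> None:
        assert normalize("AbC") == "abc"

    def test_returns_empty_string_for_blank_input(self) -> None:
        assert normalize("   ") == ""
```

Each test name states the behavior being checked, so a failing test identifies the broken behavior without reading its body.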
| 41 | +## Rust Code Style |
| 42 | + |
| 43 | +When generating Rust code for DataFusion, follow these guidelines: |
| 44 | + |
| 45 | +1. Do not add unnecessary parentheses around `if` conditions: |
| 46 | + |
| 47 | + - Correct: `if some_condition` |
| 48 | + - Incorrect: `if (some_condition)` |
| 49 | + |
| 50 | +2. Follow the standard Rust style conventions from rustfmt |
| 51 | + |
| 52 | +### Code Organization |
| 53 | + |
| 54 | +- Keep functions concise and focused on a single task |
| 55 | +- Aim to keep each function to roughly 40-50 lines of code at most |
| 56 | +- Break long or complex operations into smaller, well-named helper functions |
| 57 | +- Before creating new utility functions, check if existing helper functions in the codebase can be reused or extended |
| 58 | +- Consider adding parameters to existing functions rather than creating similar parallel implementations |
| 59 | +- Don't overengineer; avoid creating abstractions that are not needed |
| 60 | +- Look for similar patterns in the codebase and follow established conventions |
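The advice to extend an existing function rather than add a near-duplicate can be sketched as follows. The example is written in Python to match the example at the end of this file, though the idea is language-independent. `clean_names` and its `lowercase` parameter are hypothetical names chosen for illustration.

```python
from typing import List


def clean_names(names: List[str], *, lowercase: bool = False) -> List[str]:
    """Strip whitespace from names and drop blank entries.

    Args:
        names: Raw names to clean.
        lowercase: Whether to also lowercase each name.

    Returns:
        The cleaned names in their original order.
    """
    stripped = [name.strip() for name in names]
    cleaned = [name for name in stripped if name]
    if lowercase:
        # One optional parameter replaces a parallel
        # `clean_names_lowercase` implementation
        return [name.lower() for name in cleaned]
    return cleaned
```

Existing callers keep working because `lowercase` defaults to `False`, and making it keyword-only keeps call sites readable: `clean_names(raw, lowercase=True)`.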
| 61 | + |
| 62 | +## Comments |
| 63 | + |
| 64 | +- Add meaningful comments for complex logic |
| 65 | +- Avoid obvious comments |
| 66 | +- Start inline comments with a capital letter |
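As a rough illustration of these comment guidelines, the sketch below comments only the step that is not obvious from the code. `median` is a hypothetical example function, not part of the project.

```python
from typing import List


def median(values: List[float]) -> float:
    """Return the median of a non-empty list of numbers.

    Args:
        values: Numbers to summarize.

    Returns:
        The median value.

    Raises:
        ValueError: If values is empty.
    """
    if not values:
        msg = "values must not be empty"
        raise ValueError(msg)

    # An obvious comment such as "Sort the values" would add nothing here
    ordered = sorted(values)
    middle = len(ordered) // 2

    # An even-length list has no single middle element, so average the
    # two values on either side of the midpoint
    if len(ordered) % 2 == 0:
        return (ordered[middle - 1] + ordered[middle]) / 2
    return ordered[middle]
```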
| 67 | + |
| 68 | +## Example Style |
| 69 | + |
| 70 | +```python |
| 71 | +from typing import List, Optional |
| 72 | + |
| 73 | +def process_data(data: List[str], max_length: Optional[int] = None) -> List[str]: |
| 74 | +    """Process a list of string data. |
| 75 | + |
| 76 | +    Args: |
| 77 | +        data: List of strings to process |
| 78 | +        max_length: Optional maximum length for each string |
| 79 | + |
| 80 | +    Returns: |
| 81 | +        List of processed strings |
| 82 | + |
| 83 | +    Raises: |
| 84 | +        ValueError: If data is empty |
| 85 | +    """ |
| 86 | +    if not data: |
| 87 | +        msg = "Empty data list provided" |
| 88 | +        raise ValueError(msg) |
| 89 | + |
| 90 | +    result = [] |
| 91 | +    for item in data: |
| 92 | +        processed = item.strip().lower() |
| 93 | +        # Skip items that are blank after stripping |
| 94 | +        if not processed: |
| 95 | +            continue |
| 96 | +        if max_length is not None and len(processed) > max_length: |
| 97 | +            processed = processed[:max_length] |
| 98 | + |
| 99 | +        result.append(processed) |
| 100 | + |
| 101 | +    return result |
| 102 | +``` |