| 1 | +# Instructions for GitHub Copilot |
| 2 | + |
| 3 | +Welcome to the laygo data processing library project! Your main goal is to help write clean, consistent, and well-tested Python code. Please adhere strictly to the following principles and conventions. |
| 4 | + |
| 5 | +## 1. General Coding Style & Principles |
| 6 | +* Language Version: All code should be compatible with Python 3.12+. This means you should use modern features like the | union operator for type hints and match statements where appropriate. |
| 7 | + |
| 8 | +* Formatting: Strictly follow the PEP 8 style guide. Use a linter like ruff or flake8 to enforce this. |
| 9 | + |
| 10 | +* Type Hinting: This is mandatory. |
| 11 | + |
| 12 | + * Provide type hints for all function/method arguments and return values. |
| 13 | + |
| 14 | + * Use modern type unions: int | None is preferred over Union[int, None]. |
| 15 | + |
| 16 | + * Use types from collections.abc (e.g., Iterable, Callable, Iterator) for abstract collections. |
| 17 | + |
| 18 | + * Utilize the existing type aliases defined at the top of the file (e.g., PipelineFunction, InternalTransformer) for consistency. |
| 19 | + |
| 20 | +* Docstrings: All public classes, methods, and functions must have Google-style docstrings. |
| 21 | + |
| 22 | + * Include a brief one-line summary. |
| 23 | + |
| 24 | + * Provide a more detailed explanation if necessary. |
| 25 | + |
| 26 | + * Use Args:, Returns:, and Raises: sections to document parameters, return values, and exceptions. |
| 27 | + |
| 28 | +Example Docstring Template: |
| 29 | + |
| 30 | +```py |
| 31 | +def my_method(self, parameter_a: str, parameter_b: int | None = None) -> bool: |
| 32 | +    """A brief summary of what this method does. |
| 33 | + |
| 34 | +    A more detailed explanation of the method's behavior, its purpose, |
| 35 | +    and any important side effects or notes for the user. |
| 36 | + |
| 37 | +    Args: |
| 38 | +        parameter_a: Description of the first parameter. |
| 39 | +        parameter_b: Description of the optional second parameter. |
| 40 | + |
| 41 | +    Returns: |
| 42 | +        A description of the return value, explaining what True or False means. |
| 43 | + |
| 44 | +    Raises: |
| 45 | +        ValueError: If parameter_a has an invalid format. |
| 46 | +    """ |
| 47 | +    # ... implementation ... |
| 48 | +``` |
| 49 | + |
| 50 | +* When checking code, do not worry about whitespace. A formatter is in place that handles it for you. |
| 51 | + |
| 52 | +* Don't add obvious comments. For example, avoid comments like # This is a loop or # Increment i by 1. Instead, focus on explaining why something is done, not what is done. |
| 53 | + |
| 54 | +* Avoid using comments to disable code. If a piece of code is not needed, it should be removed entirely. Use version control to track changes instead. |
| 55 | + |
| 56 | +* `# type: ignore` comments are acceptable only when there is no other option, for example when you know the underlying code works correctly and the error comes from a limitation of Python's typing. |
| 57 | + |
| 58 | +## 2. Naming Conventions |
| 59 | +* Consistency in naming is crucial for readability. |
| 60 | + |
| 61 | +* Functions & Methods: Use snake_case (e.g., build_chunk_generator, short_circuit). |
| 62 | + |
| 63 | +* Variables: Use snake_case (e.g., chunk_size, prev_transformer, loop_transformer). |
| 64 | + |
| 65 | +* Classes: Use PascalCase (e.g., Transformer, ErrorHandler, PipelineContext). |
| 66 | + |
| 67 | +* Constants: Use UPPER_SNAKE_CASE (e.g., DEFAULT_CHUNK_SIZE). |
| 68 | + |
| 69 | +* Internal Methods/Attributes: Prefix with a single underscore _ (e.g., _pipe). |
| 70 | + |
| 71 | +* Descriptiveness: |
| 72 | + |
| 73 | + * Functions used for filtering should be named predicate. |
| 74 | + |
| 75 | + * Functions passed to loop should be named condition. |
| 76 | + |
| 77 | + * Transformers passed into methods like loop or tap should be named loop_transformer or tapped_transformer. |
| 78 | + |
| 79 | +## 3. Transformer Class Specifics |
| 80 | +* Chainability: Every pipeline operation (map, filter, loop, etc.) must return self to allow for method chaining. |
| 81 | + |
| 82 | +* Immutability of Logic: Operations should not modify the Transformer instance in place but rather compose a new self.transformer function by wrapping the previous one. The _pipe method is the primary mechanism for this. |
| 83 | + |
| 84 | +* Context Awareness: When adding a new method that accepts a function (like map or filter), always check whether that function is "context-aware" using the is_context_aware helper. Provide separate execution paths for context-aware and non-context-aware functions. |
| 85 | + |
| 86 | +* Overloading: For methods that can accept multiple distinct types (like tap accepting a Callable or a Transformer), use the @overload decorator to provide clear type hints for each signature. |
| 87 | + |
| 88 | +## 4. Writing and Adding Tests |
| 89 | + |
| 90 | +* All new functionality must be accompanied by comprehensive tests using pytest. |
| 91 | + |
| 92 | +* File Location: Tests for laygo/transformers/transformer.py are located in tests/test_transformer.py. |
| 93 | + |
| 94 | +* Test Organization: |
| 95 | + |
| 96 | + * Group related tests into classes. The class name should follow the pattern Test<FeatureGroup>, for example: TestTransformerBasics, TestTransformerOperations, TestTransformerContextSupport, TestTransformerErrorHandling |
| 97 | + |
| 98 | + * When adding tests for a new method, add them to the most relevant existing test class. If the method introduces a new category of functionality, create a new Test... class for it. |
| 99 | + |
| 100 | +* Test Naming: Test methods must be descriptive and follow the pattern test_<method>_<scenario>. Examples: test_map_simple_transformation, test_loop_with_max_iterations, test_filter_with_empty_list, test_catch_with_context_aware_error. |
| 101 | + |
| 102 | +* Test Structure (Arrange-Act-Assert): |
| 103 | + |
| 104 | + * Arrange: Set up all necessary data, including input lists, PipelineContext objects, and the Transformer instance itself. |
| 105 | + |
| 106 | + * Act: Execute the transformer on the data. The result should usually be materialized into a list, e.g., result = list(transformer(data)). |
| 107 | + |
| 108 | + * Assert: Check that the output is correct. If the operation has side effects (like in tap or loop), assert that the side effects are also correct. |
| 109 | + |
| 110 | +* Coverage for New Methods: When adding tests for a new method (e.g., a hypothetical my_new_op), ensure you cover: |
| 111 | + |
| 112 | + * Basic functionality (the "happy path"). |
| 113 | + |
| 114 | + * Context-aware version of the functionality. |
| 115 | + |
| 116 | + * Edge cases, such as an empty input list ([]), a list with a single element, and a case where the operation results in an empty list. |
| 117 | + |
| 118 | + * Interaction with other operations in a chain. |
| 119 | + |
| 120 | + * Behavior with different chunk sizes to ensure chunking does not affect the outcome. |
| 121 | + |
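The organization, naming, and Arrange-Act-Assert rules above might look like this in practice. To stay self-contained, the sketch tests `map_in_chunks`, a hypothetical stand-in written for this example; real tests would exercise the Transformer class instead. pytest collects the `Test...` class and its `test_...` methods without any extra imports.

```python
from collections.abc import Callable, Iterable, Iterator
from itertools import islice


def map_in_chunks(
    data: Iterable[int], function: Callable[[int], int], chunk_size: int
) -> Iterator[int]:
    """Apply function to data chunk by chunk (hypothetical stand-in, not laygo)."""
    iterator = iter(data)
    while chunk := list(islice(iterator, chunk_size)):
        yield from (function(item) for item in chunk)


class TestMapInChunks:
    def test_map_simple_transformation(self) -> None:
        # Arrange
        data = [1, 2, 3]

        # Act
        result = list(map_in_chunks(data, lambda x: x * 2, chunk_size=2))

        # Assert
        assert result == [2, 4, 6]

    def test_map_with_empty_list(self) -> None:
        result = list(map_in_chunks([], lambda x: x * 2, chunk_size=2))

        assert result == []

    def test_map_with_different_chunk_sizes(self) -> None:
        data = list(range(10))

        results = [
            list(map_in_chunks(data, lambda x: x + 1, chunk_size=size))
            for size in (1, 3, 10, 50)
        ]

        # Chunking is an implementation detail, so every size must agree.
        assert all(result == results[0] for result in results)
```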
| 122 | +## 5. Documentation |
| 123 | +* Whenever you're adding new functionality, make sure you create documentation in the wiki folder and link it in the Home.md file. |
| 124 | + |
| 125 | +* Do not go overboard with examples. The goal is to give a clear understanding of how to use the new functionality, not to provide exhaustive examples. |
| 126 | + |
| 127 | +## 6. Examples |
| 128 | +* There should be an examples folder in the root of the repository. |
| 129 | + |
| 130 | +* The folder should contain example scripts with clear names that indicate their purpose, such as example_basic_pipeline.py, example_context_aware_operations.py, and example_error_handling.py. |
| 131 | + |
| 132 | +* Each example script should include a brief comment at the top explaining what the script demonstrates. |
| 133 | + |
| 134 | +* The examples should be runnable as standalone scripts, meaning they should not rely on any external setup or configuration. |