-
Notifications
You must be signed in to change notification settings - Fork 1k
feat: add optional notebook cell output inclusion to process_notebook utility
#128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
|
This addresses #119 |
cyclotruc
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tested this and seems very good thank you, the notebooks are now spot on!
|
Thanks @filipchristiansen !! This PR was merged way faster lol , I would recommend that we try add cell metadata # Cell: cellNumber ; Type: cellType
cellData
# Output
"""
Output also inside doc string
""" Although this is great work too !! According to the diff , we haven't added functionality to query-parsing for include output , we can create a issue for that and include this feature request in that issue |
Could you please specify exactly what metadata you want included and why? Example
Is there any other metadata you want included? |
I don't intent on sounding rude , but suppose we have a notebook cell 1 we have some markdown **Welcome to Gitingest**consider empty or non parse-able cells here cell 4 """
This Doc-string is just for showcase
"""
Class Hi:
def init(self):
print("hi")Then its py-script according to current merge would be """
**Welcome to Gitingest**
"""
"""
This Doc-string is just for showcase
"""
Class Hi:
def init(self):
print("hi")
# Output
# hiwhat I suggest is # Cell 1 ; Type: Markdown
"""
**Welcome to Gitingest**
"""
# Cell 2 ; Type: Code
"""
This Doc-string is just for showcase
"""
Class Hi:
def init(self):
print("hi")
# Output
"""
hi # output should also be enclosed in docstring , It looks more neat and is genrally used for multi-line comment
""" Then I would leave it up to you and @cyclotruc to decide which is more suited and would further increase UX of GitIngest As for regard to include other metadata Not in my Also as for Cell number , Increment only when you parse it , if its empty or un parse-able don't increment cellNumber |
|
I meant to tag you and not @cyclotruc in my previous comment.
What do you mean? Why would it be rude to suggest a feature? :)
I agree.
I still do not understand what you mean by "query-parsing for include output"? The whole purpose of this PR was to include output, and it has now been merged. Please explain what you mean. You still did not mention the motivation for including cell numbers and cell types. I am not saying we should not include it—I am just wondering whether it is motivated. |
This PR introduces an optional parameter (
include_output) to theprocess_notebookfunction, enabling control over whether code cell outputs are included in the generated Python script.Key Changes
InvalidNotebookError: Added a new custom exception to better handle invalid or non-decodable notebook files.process_notebookenhancements:include_output) defaulting toTrue.# Output: ...) in the generated script ifinclude_output=True._process_cell,_extract_output) for better readability.InvalidNotebookErroris now caught alongsideOSErrorwhen reading notebook files inquery_ingestion.py.test_notebook_utils.py(including tests for both enabling and disabling outputs).Why This Change?
Previously, the script generation process did not account for cell outputs, which can be important for understanding context in code cells or for debugging. By making this feature optional, users can choose to omit outputs in large or potentially sensitive notebooks.
Example
A single-cell notebook with:
will produce:
when
include_outputisTrue(default).