From d81cd87411eec07972abc73177c7af39ec317963 Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Thu, 6 Feb 2025 15:05:18 -0500 Subject: [PATCH 1/5] Add github action to codespell master on push and PRs --- .github/workflows/codespell.yml | 25 +++++++++++++++++++++++++ 1 file changed, 25 insertions(+) create mode 100644 .github/workflows/codespell.yml diff --git a/.github/workflows/codespell.yml b/.github/workflows/codespell.yml new file mode 100644 index 0000000..b026c85 --- /dev/null +++ b/.github/workflows/codespell.yml @@ -0,0 +1,25 @@ +# Codespell configuration is within .codespellrc +--- +name: Codespell + +on: + push: + branches: [master] + pull_request: + branches: [master] + +permissions: + contents: read + +jobs: + codespell: + name: Check for spelling errors + runs-on: ubuntu-latest + + steps: + - name: Checkout + uses: actions/checkout@v4 + - name: Annotate locations with typos + uses: codespell-project/codespell-problem-matcher@v1 + - name: Codespell + uses: codespell-project/actions-codespell@v2 From 8e08655460cef0f036bc51fa78d48389b7ddd8e8 Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Thu, 6 Feb 2025 15:05:18 -0500 Subject: [PATCH 2/5] Add rudimentary codespell config --- .codespellrc | 6 ++++++ 1 file changed, 6 insertions(+) create mode 100644 .codespellrc diff --git a/.codespellrc b/.codespellrc new file mode 100644 index 0000000..4389e72 --- /dev/null +++ b/.codespellrc @@ -0,0 +1,6 @@ +[codespell] +# Ref: https://github.com/codespell-project/codespell#using-a-config-file +skip = .git*,*.pdf,.codespellrc +check-hidden = true +# ignore-regex = +# ignore-words-list = From e7d82bf19a98ca35f1ad1ec8484427d289f53d06 Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Thu, 6 Feb 2025 15:05:18 -0500 Subject: [PATCH 3/5] Add pre-commit definition for codespell --- .pre-commit-config.yaml | 6 ++++++ 1 file changed, 6 insertions(+) diff --git a/.pre-commit-config.yaml b/.pre-commit-config.yaml index 5436120..dab27a2 100644 --- 
a/.pre-commit-config.yaml +++ b/.pre-commit-config.yaml @@ -13,3 +13,9 @@ repos: rev: 19.3b0 hooks: - id: black + +- repo: https://github.com/codespell-project/codespell + # Configuration for codespell is in .codespellrc + rev: v2.4.0 + hooks: + - id: codespell From 7de24db23eff80bc983f061ade353e3cc52002d3 Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Thu, 6 Feb 2025 15:05:28 -0500 Subject: [PATCH 4/5] [DATALAD RUNCMD] run codespell throughout fixing typos automagically (but ignoring overall fail due to ambiguous ones) === Do not change lines below === { "chain": [], "cmd": "codespell -w || :", "exit": 0, "extra_inputs": [], "inputs": [], "outputs": [], "pwd": "." } ^^^ Do not change lines above ^^^ --- Brainhack_2019_Issue.md | 4 ++-- README.md | 6 +++--- segstats_jsonld/fs_to_nidm.py | 2 +- segstats_jsonld/mapping_data/FreeSurferColorLUT.txt | 4 ++-- 4 files changed, 8 insertions(+), 8 deletions(-) diff --git a/Brainhack_2019_Issue.md b/Brainhack_2019_Issue.md index d820a7b..c75f97f 100644 --- a/Brainhack_2019_Issue.md +++ b/Brainhack_2019_Issue.md @@ -1,7 +1,7 @@ # Name of your awesome project Script to Export Freesurfer-based Parcellation/Segmentation Stats and Provenance as JSON-LD and NIDM ## Project Description -This project ultimately aims to facilitate both query and analysis of parcellation/segmentation based regional statistics across popular softwares such as [Freesurfer](https://surfer.nmr.mgh.harvard.edu/), [FSL](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki), and [ANTS](http://stnava.github.io/ANTs/). Currently each software produces its own output format and brain region labels are specific to the atlas used in generating the regional statistics. This makes life difficult when trying to search for "nucleaus accumbens" volume, for example, across the different software products. Further, knowing which version of the software tool used and what atlas and version of the atlas in a structured representation facilitating query is lacking. 
To this end we propose augmenting the various segmentation tools with scripts that will: (1) map atlas-specific anatomical nomeclature to anatomical concepts hosted in terminology resources (e.g. InterLex); (2) capture better structured provenance about the input image(s) and the atlases used for the segmentation; (3) export the segmentation results and the provenance as either [JSON-LD](https://json-ld.org/), [NIDM](http://nidm.nidash.org/) which can then link the derived data to broader records of the original project metadata, or as an additional component of a BIDS derivative. +This project ultimately aims to facilitate both query and analysis of parcellation/segmentation based regional statistics across popular software such as [Freesurfer](https://surfer.nmr.mgh.harvard.edu/), [FSL](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki), and [ANTS](http://stnava.github.io/ANTs/). Currently each software produces its own output format and brain region labels are specific to the atlas used in generating the regional statistics. This makes life difficult when trying to search for "nucleaus accumbens" volume, for example, across the different software products. Further, knowing which version of the software tool used and what atlas and version of the atlas in a structured representation facilitating query is lacking. To this end we propose augmenting the various segmentation tools with scripts that will: (1) map atlas-specific anatomical nomeclature to anatomical concepts hosted in terminology resources (e.g. InterLex); (2) capture better structured provenance about the input image(s) and the atlases used for the segmentation; (3) export the segmentation results and the provenance as either [JSON-LD](https://json-ld.org/), [NIDM](http://nidm.nidash.org/) which can then link the derived data to broader records of the original project metadata, or as an additional component of a BIDS derivative. We aim to tackle this problem in steps. 
For this hackathon project we'll be focusing on conversion from Freesurfer's [mri_segstats](https://surfer.nmr.mgh.harvard.edu/fswiki/mri_segstat) program output along with some additional parsing/conversion of Freesurfer log files. @@ -9,7 +9,7 @@ We aim to tackle this problem in steps. For this hackathon project we'll be foc Python and structural neuroimaging experience. If one has experience with [rdflib](https://github.com/RDFLib/rdflib) or [PROV](https://github.com/trungdong/prov) that would also be helpful. A good appreciation of Japanese sake may also help for late night discussion. ## Integration -This project will need expertise in programming, structural neuroimaging, and anatomy. To make this project sucessful we need individuals who have skills in any of these domains to help with: (1) understand Freesurfer's segmentation results format and log files; (2) programming up a script in Python; (3) understand anatomy well enough to select the proper anatomical concept that maps to a specific atlas designation of a region and ***can define new anatomy terms where needed, linking them to broader concepts*** to facilitate segmentation results queries across softwares. +This project will need expertise in programming, structural neuroimaging, and anatomy. To make this project successful we need individuals who have skills in any of these domains to help with: (1) understand Freesurfer's segmentation results format and log files; (2) programming up a script in Python; (3) understand anatomy well enough to select the proper anatomical concept that maps to a specific atlas designation of a region and ***can define new anatomy terms where needed, linking them to broader concepts*** to facilitate segmentation results queries across software. 
## Preparation material * [Freesurfer](https://surfer.nmr.mgh.harvard.edu/) diff --git a/README.md b/README.md index 351e8f1..4ca00ee 100644 --- a/README.md +++ b/README.md @@ -1,7 +1,7 @@ # Making Freesurfer FAIR Script to Export Freesurfer-based Parcellation/Segmentation Stats and Provenance as JSON-LD and NIDM ## Project Description -This project ultimately aims to facilitate both query and analysis of parcellation/segmentation based regional statistics across popular softwares such as [Freesurfer](https://surfer.nmr.mgh.harvard.edu/), [FSL](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki), and [ANTS](http://stnava.github.io/ANTs/). Currently each software produces its own output format and brain region labels are specific to the atlas used in generating the regional statistics. This makes life difficult when trying to search for "nucleus accumbens" volume, for example, across the different software products. Further, knowing which version of the software tool used and what atlas and version of the atlas in a structured representation facilitating query is lacking. To this end we propose augmenting the various segmentation tools with scripts that will: (1) map atlas-specific anatomical nomeclature to anatomical concepts hosted in terminology resources (e.g. InterLex); (2) capture better structured provenance about the input image(s) and the atlases used for the segmentation; (3) export the segmentation results and the provenance as either [JSON-LD](https://json-ld.org/), [NIDM](http://nidm.nidash.org/) which can then link the derived data to broader records of the original project metadata, or as an additional component of a BIDS derivative. +This project ultimately aims to facilitate both query and analysis of parcellation/segmentation based regional statistics across popular software such as [Freesurfer](https://surfer.nmr.mgh.harvard.edu/), [FSL](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki), and [ANTS](http://stnava.github.io/ANTs/). 
Currently each software produces its own output format and brain region labels are specific to the atlas used in generating the regional statistics. This makes life difficult when trying to search for "nucleus accumbens" volume, for example, across the different software products. Further, knowing which version of the software tool used and what atlas and version of the atlas in a structured representation facilitating query is lacking. To this end we propose augmenting the various segmentation tools with scripts that will: (1) map atlas-specific anatomical nomeclature to anatomical concepts hosted in terminology resources (e.g. InterLex); (2) capture better structured provenance about the input image(s) and the atlases used for the segmentation; (3) export the segmentation results and the provenance as either [JSON-LD](https://json-ld.org/), [NIDM](http://nidm.nidash.org/) which can then link the derived data to broader records of the original project metadata, or as an additional component of a BIDS derivative. We aim to tackle this problem in steps. For this hackathon project we'll be focusing on conversion from Freesurfer's [mri_segstats](https://surfer.nmr.mgh.harvard.edu/fswiki/mri_segstat) program output along with some additional parsing/conversion of Freesurfer log files. The conversion is driven by a function which queries InterLex and develops a JSON structure which defines the atlas terminology and the measures being output. @@ -17,7 +17,7 @@ Python and structural neuroimaging experience. If one has experience with [rdfl - JB Poline ## Integration -This project will need expertise in programming, structural neuroimaging, and anatomy. 
To make this project sucessful we need individuals who have skills in any of these domains to help with: (1) understand Freesurfer's segmentation results format and log files; (2) programming up a script in Python; (3) understand anatomy well enough to select the proper anatomical concept that maps to a specific atlas designation of a region and ***can define new anatomy terms where needed, linking them to broader concepts*** to facilitate segmentation results queries across softwares. +This project will need expertise in programming, structural neuroimaging, and anatomy. To make this project successful we need individuals who have skills in any of these domains to help with: (1) understand Freesurfer's segmentation results format and log files; (2) programming up a script in Python; (3) understand anatomy well enough to select the proper anatomical concept that maps to a specific atlas designation of a region and ***can define new anatomy terms where needed, linking them to broader concepts*** to facilitate segmentation results queries across software. ## Preparation material * [Freesurfer](https://surfer.nmr.mgh.harvard.edu/) @@ -62,7 +62,7 @@ optional arguments: sidecar file with those mappings for automated runs of future CSV files with the same set of variables. 
-subjid SUBJID, --subjid SUBJID If a path to a URL or a stats fileis supplied via the -f/--seg_file parameters then -subjid parameter must be set - withthe subject identifier to be used in the NIDM files + with the subject identifier to be used in the NIDM files -o OUTPUT_DIR, --output OUTPUT_DIR Output filename with full path -j, --jsonld If flag set then NIDM file will be written as JSONLD instead of TURTLE diff --git a/segstats_jsonld/fs_to_nidm.py b/segstats_jsonld/fs_to_nidm.py index a4910a9..fb4b623 100755 --- a/segstats_jsonld/fs_to_nidm.py +++ b/segstats_jsonld/fs_to_nidm.py @@ -326,7 +326,7 @@ def map_csv_variables_to_freesurfer_cdes(df,id_field,outdir,csv_file,json_map=No def url_validator(url): ''' - Tests whether url is a valide url + Tests whether url is a valid url :param url: url to test :return: True for valid url else False ''' diff --git a/segstats_jsonld/mapping_data/FreeSurferColorLUT.txt b/segstats_jsonld/mapping_data/FreeSurferColorLUT.txt index d579323..c23b164 100644 --- a/segstats_jsonld/mapping_data/FreeSurferColorLUT.txt +++ b/segstats_jsonld/mapping_data/FreeSurferColorLUT.txt @@ -173,7 +173,7 @@ 169 Left-Basal-Ganglia 236 13 127 0 176 Right-Basal-Ganglia 236 13 126 0 -# Label names and colors for Brainstem consituents +# Label names and colors for Brainstem constituents # No. Label Name: R G B A 170 brainstem 119 159 176 0 171 DCG 119 0 176 0 @@ -432,7 +432,7 @@ # created by mri_aparc2aseg in which the aseg cortex label is replaced # by the labels in the aparc. It also supports wm labels that will # eventually be created by mri_aparc2aseg. Otherwise, the aseg labels -# do not change from above. The cortical lables are the same as in +# do not change from above. The cortical labels are the same as in # colortable_desikan_killiany.txt, except that left hemisphere has # 1000 added to the index and the right has 2000 added. The label # names are also prepended with ctx-lh or ctx-rh. 
The white matter From 6ef381fe94dbbde98f8cf33e7508d7da7d6bae0b Mon Sep 17 00:00:00 2001 From: Yaroslav Halchenko Date: Thu, 6 Feb 2025 15:05:45 -0500 Subject: [PATCH 5/5] [DATALAD RUNCMD] Do interactive fixing of some ambiguous typos === Do not change lines below === { "chain": [], "cmd": "codespell -w -i 3 -C 2", "exit": 0, "extra_inputs": [], "inputs": [], "outputs": [], "pwd": "." } ^^^ Do not change lines above ^^^ --- segstats_jsonld/mapping_data/FreeSurferColorLUT.txt | 2 +- 1 file changed, 1 insertion(+), 1 deletion(-) diff --git a/segstats_jsonld/mapping_data/FreeSurferColorLUT.txt b/segstats_jsonld/mapping_data/FreeSurferColorLUT.txt index c23b164..2cfe156 100644 --- a/segstats_jsonld/mapping_data/FreeSurferColorLUT.txt +++ b/segstats_jsonld/mapping_data/FreeSurferColorLUT.txt @@ -439,7 +439,7 @@ # labels are the same as in colortable_desikan_killiany.txt, except # that left hemisphere has 3000 added to the index and the right has # 4000 added. The label names are also prepended with wm-lh or wm-rh. -# Centrum semiovale is also labled with 5001 (left) and 5002 (right). +# Centrum semiovale is also labeled with 5001 (left) and 5002 (right). # Even further below are the color tables for aparc.a2005s and aparc.a2009s. #No. Label Name: R G B A