Commit 7edc9e5

Merge branch 'master' into phylogenetics-fix
2 parents afc3e92 + 14c7f5c commit 7edc9e5

18 files changed, +148 -102 lines changed

.gitattributes

Lines changed: 1 addition & 0 deletions
@@ -0,0 +1 @@
+example*/*.html linguist-vendored

.travis.yml

Lines changed: 8 additions & 0 deletions
@@ -0,0 +1,8 @@
+language: python
+python:
+- "3.6"
+cache: pip
+install:
+- pip install -r requirements.txt
+script:
+- python tests.py

README.md

Lines changed: 10 additions & 8 deletions
@@ -1,4 +1,6 @@
11
# Ten Simple Rules for Reproducible Research in Jupyter Notebooks
2+
[![Build Status](https://api.travis-ci.com/jupyter-guide/ten-rules-jupyter.svg?branch=master)](https://www.travis-ci.org/jupyter-guide/ten-rules-jupyter)
3+
[![GitHub License](https://img.shields.io/github/license/jupyter-guide/ten-rules-jupyter.svg)](https://github.com/sbl-sdsc/mmtf-spark/blob/master/LICENSE)
24

35
This repository is a supplement to the "Ten Simple Rules for Reproducible Research in Jupyter Notebook" paper [ref].
46

@@ -16,11 +18,11 @@ notebooks in your web browser using the Binder ([mybinder.org](https://mybinder.
1618

1719
| Nbviewer | Jupyter Notebook | Jupyter Lab | HTML |
1820
| --- | -- | --- | --- |
19-
| [0-Workflow.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F0-Workflow.ipynb) | [HTML](https://htmlpreview.github.io?https://github.com/jupyter-guide/ten-rules-jupyter/blob/master/example1/0-Workflow.html) |
20-
| [1-CreateDataset.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/1-CreateDataset.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F1-CreateDataset.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F1-CreateDataset.ipynb) | [HTML](https://htmlpreview.github.io?https://github.com/jupyter-guide/ten-rules-jupyter/blob/master/example1/1-CreateDataset.html) |
21-
| [2-CalculateFeatures.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/2-CalculateFeatures.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F2-CalculateFeatures.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F2-CalculateFeatures.ipynb) | [HTML](https://htmlpreview.github.io?https://github.com/jupyter-guide/ten-rules-jupyter/blob/master/example1/2-CalculateFeatures.html) |
22-
| [3-FitModel.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/3-FitModel.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F3-FitModel.ipynb) |[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F3-FitModel.ipynb) | [HTML](https://htmlpreview.github.io?https://github.com/jupyter-guide/ten-rules-jupyter/blob/master/example1/3-FitModel.html) |
23-
| [4-Predict.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/4-Predict.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F4-Predict.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F4-Predict.ipynb)| [HTML](https://htmlpreview.github.io?https://github.com/jupyter-guide/ten-rules-jupyter/blob/master/example1/4-Predict.html) |
21+
| [0-Workflow.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F0-Workflow.ipynb) | [HTML](https://cdn.rawgit.com/jupyter-guide/ten-rules-jupyter/dd3a89ad/example1/0-Workflow.html) |
22+
| [1-CreateDataset.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/1-CreateDataset.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F1-CreateDataset.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F1-CreateDataset.ipynb) | [HTML](https://cdn.rawgit.com/jupyter-guide/ten-rules-jupyter/dd3a89ad/example1/1-CreateDataset.html) |
23+
| [2-CalculateFeatures.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/2-CalculateFeatures.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F2-CalculateFeatures.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F2-CalculateFeatures.ipynb) | [HTML](https://cdn.rawgit.com/jupyter-guide/ten-rules-jupyter/dd3a89ad/example1/2-CalculateFeatures.html) |
24+
| [3-FitModel.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/3-FitModel.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F3-FitModel.ipynb) |[![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F3-FitModel.ipynb) | [HTML](https://cdn.rawgit.com/jupyter-guide/ten-rules-jupyter/dd3a89ad/example1/3-FitModel.html) |
25+
| [4-Predict.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example1/4-Predict.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example1%2F4-Predict.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example1%2F4-Predict.ipynb)| [HTML](https://cdn.rawgit.com/jupyter-guide/ten-rules-jupyter/dd3a89ad/example1/4-Predict.html) |
2426

2527
---
2628

@@ -34,6 +36,6 @@ This example demonstrates a reproducible 2-step workflow for simulating a phylog
3436

3537
| Nbviewer | Jupyter Notebook | Jupyter Lab | HTML |
3638
| --- | -- | --- | --- |
37-
| [0-Workflow.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example2/0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example2%2F0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example2%2F0-Workflow.ipynb) | [HTML](https://htmlpreview.github.io?https://github.com/jupyter-guide/ten-rules-jupyter/blob/master/example2/0-Workflow.html) |
38-
| [1-SimulateTree.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example2/1-SimulateTree.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example2%2F1-SimulateTree.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example2%2F1-SimulateTree.ipynb) | [HTML](https://htmlpreview.github.io?https://github.com/jupyter-guide/ten-rules-jupyter/blob/master/example2/1-SimulateTree.html) |
39-
| [2-SimulateSequences.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example2/2-SimulateSequences.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example2%2F2-SimulateSequences.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example2%2F2-SimulateSequences.ipynb) | [HTML](https://htmlpreview.github.io?https://github.com/jupyter-guide/ten-rules-jupyter/blob/master/example2/2-SimulateSequences.html) |
39+
| [0-Workflow.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example2/0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example2%2F0-Workflow.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example2%2F0-Workflow.ipynb) | [HTML](https://cdn.rawgit.com/jupyter-guide/ten-rules-jupyter/dd3a89ad/example2/0-Workflow.html) |
40+
| [1-SimulateTree.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example2/1-SimulateTree.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example2%2F1-SimulateTree.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example2%2F1-SimulateTree.ipynb) | [HTML](https://cdn.rawgit.com/jupyter-guide/ten-rules-jupyter/dd3a89ad/example2/1-SimulateTree.html) |
41+
| [2-SimulateSequences.ipynb](https://nbviewer.jupyter.org/github/jupyter-guide/ten-rules-jupyter/blob/master/example2/2-SimulateSequences.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?filepath=example2%2F2-SimulateSequences.ipynb) | [![Binder](https://mybinder.org/badge.svg)](https://mybinder.org/v2/gh/jupyter-guide/ten-rules-jupyter/master?urlpath=lab/tree/example2%2F2-SimulateSequences.ipynb) | [HTML](https://cdn.rawgit.com/jupyter-guide/ten-rules-jupyter/dd3a89ad/example2/2-SimulateSequences.html) |

example1/0-Workflow.html

Lines changed: 8 additions & 8 deletions
@@ -11783,12 +11783,12 @@ <h1 id="Predict-Fold-Type-of-a-Protein-from-Protein-Sequence">Predict Fold Type
1178311783
</div>
1178411784
<div class="inner_cell">
1178511785
<div class="text_cell_render border-box-sizing rendered_html">
11786-
<p><strong>The notebooks in this directory demonstrate the "Ten Rules for Reproducible Research in Jupyter Notebooks". Throughout the notebooks we refer to some the rules we applied.</strong></p>
11786+
<p><strong>The notebooks in this directory demonstrate and apply the "Ten Rules for Reproducible Research in Jupyter Notebooks". Throughout the notebooks we refer to some the rules we applied.</strong></p>
1178711787
<p><strong>For example, this notebook demonstrates:</strong></p>
1178811788
<hr>
11789-
<p><strong>Rule 1: Tell a Story for an Audience.</strong> This notebook was developed for biologists to learn how to apply a simple machine learning model to protein sequences.</p>
11789+
<p><strong>Rule 1: Tell a Story for an Audience.</strong> This notebook was developed to learn how to apply a simple machine learning model to predict protein features based on protein sequences.</p>
1179011790
<p><strong>Rule 3: Build a Pipeline.</strong> This notebook describes the entire workflow from data preparation, feature calculation, model fitting, to prediction. The modularity makes it easy to replace one of the steps, for example, use a different method to calculate features or apply a different machine learning model.</p>
11791-
<p><strong>Rule 5: Use Cell, Section adn Notebook Divisions to Make Steps Clear.</strong> We broke the workflow into separate notebooks and use this top-level notebook to explain and orchestrate the workflow.</p>
11791+
<p><strong>Rule 5: Use Cell, Section and Notebook Divisions to Make Steps Clear.</strong> We broke the workflow into separate notebooks and use this top-level notebook to explain and organize the workflow.</p>
1179211792
<hr>
1179311793

1179411794
</div>
@@ -11806,7 +11806,7 @@ <h2 id="Introduction">Introduction<a class="anchor-link" href="#Introduction">&#
1180611806
</div>
1180711807
<div class="inner_cell">
1180811808
<div class="text_cell_render border-box-sizing rendered_html">
11809-
<p>Protein chains fold in regular patterns. Secondary structure describes the geometry of segments of a protein chain. The most common secondary structure elements are</p>
11809+
<p>Proteins have four different levels of structure – primary, secondary, tertiary and quaternary. Secondary structure describes the geometry of segments of a protein chain. The most common secondary structure elements are:</p>
1181011810
<ul>
1181111811
<li>Alpha helices</li>
1181211812
<li>Beta sheets</li>
@@ -11819,7 +11819,7 @@ <h2 id="Introduction">Introduction<a class="anchor-link" href="#Introduction">&#
1181911819
</div>
1182011820
<div class="inner_cell">
1182111821
<div class="text_cell_render border-box-sizing rendered_html">
11822-
<p>We can classify proteins into three major fold classes based on their predominant secondary structure content</p>
11822+
<p>We can classify proteins into three major fold classes based on their predominant secondary structure content:</p>
1182311823
<ul>
1182411824
<li>alpha: contains predominantly alpha helices</li>
1182511825
<li>beta: contains predominantly beta sheets</li>
@@ -11833,7 +11833,7 @@ <h2 id="Introduction">Introduction<a class="anchor-link" href="#Introduction">&#
1183311833
</div>
1183411834
<div class="inner_cell">
1183511835
<div class="text_cell_render border-box-sizing rendered_html">
11836-
<h2 id="Goal">Goal<a class="anchor-link" href="#Goal">&#182;</a></h2><p>This notebook demonstrates how to create a reproducible record to create a machine learning model. We train a simple model to predict the fold class of a protein given its protein sequence using a representative set of 3D structures from the Protein Data Bank.</p>
11836+
<h2 id="Goal">Goal<a class="anchor-link" href="#Goal">&#182;</a></h2><p>This notebook demonstrates how to create a reproducible record using a machine learning model. We train the model to predict the fold class of a protein given its amino acid sequence using a representative set of 3D structures from the Protein Data Bank.</p>
1183711837
<p><strong>Run the following notebooks and explore how we applied the Ten Simple Rules.</strong></p>
1183811838

1183911839
</div>
@@ -11887,7 +11887,7 @@ <h2 id="2.-Calculate-Features">2. Calculate Features<a class="anchor-link" href=
1188711887
</div>
1188811888
<div class="inner_cell">
1188911889
<div class="text_cell_render border-box-sizing rendered_html">
11890-
<p>Protein sequences cannot be directly used for machine learning. Here use the Word2vec method to calculate a fixed-sized feature vector for each protein sequence.</p>
11890+
<p>Protein sequences cannot be directly used for machine learning. Here we use the Word2vec method to calculate a fixed-sized feature vector for each protein sequence.</p>
1189111891
<p>Run the following notebook to calculate feature vectors.</p>
1189211892

1189311893
</div>
@@ -12044,7 +12044,7 @@ <h2 id="Version-and-Hardware-Information">Version and Hardware Information<a cla
1204412044
<div class="inner_cell">
1204512045
<div class="text_cell_render border-box-sizing rendered_html">
1204612046
<hr>
12047-
<p><strong>Authors:</strong> Peter W. Rose, Shih-Cheng Huang, UC San Diego, October 1, 2018</p>
12047+
<p><strong>Authors:</strong> <a href="mailto:pwrose.ucsd@gmail.com">Peter W. Rose</a>, Shih-Cheng Huang, UC San Diego, October 1, 2018</p>
1204812048
<hr>
1204912049

1205012050
</div>
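
Note on the fold-class text in this file: the revised Introduction says proteins fall into three major fold classes according to their predominant secondary structure content. The sketch below shows one way such a rule could be written. It assumes the alpha and beta content are available as fractions between 0 and 1. The function name `fold_class` and the cutoffs `major` and `minor` are hypothetical values chosen for illustration; they are not taken from this commit.

```python
def fold_class(alpha_fraction, beta_fraction, major=0.25, minor=0.05):
    """Assign a fold class from secondary structure fractions.

    `major` and `minor` are illustrative cutoffs (assumptions), not
    values confirmed by the notebooks in this repository.
    """
    if alpha_fraction > major and beta_fraction < minor:
        return "alpha"        # predominantly alpha helices
    if beta_fraction > major and alpha_fraction < minor:
        return "beta"         # predominantly beta sheets
    if alpha_fraction > major and beta_fraction > major:
        return "alpha+beta"   # substantial content of both
    return "other"            # no clear predominant class


print(fold_class(0.60, 0.01))
print(fold_class(0.02, 0.45))
print(fold_class(0.35, 0.30))
```

Proteins that match none of the three rules fall into an extra "other" bucket here; how such cases are handled in the actual dataset is not shown in this diff.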

example1/0-Workflow.ipynb

Lines changed: 8 additions & 8 deletions
@@ -11,17 +11,17 @@
1111
"cell_type": "markdown",
1212
"metadata": {},
1313
"source": [
14-
"**The notebooks in this directory demonstrate the \"Ten Rules for Reproducible Research in Jupyter Notebooks\". Throughout the notebooks we refer to some the rules we applied.**\n",
14+
"**The notebooks in this directory demonstrate and apply the \"Ten Rules for Reproducible Research in Jupyter Notebooks\". Throughout the notebooks we refer to some the rules we applied.**\n",
1515
"\n",
1616
"**For example, this notebook demonstrates:**\n",
1717
"\n",
1818
"---\n",
1919
"\n",
20-
"**Rule 1: Tell a Story for an Audience.** This notebook was developed for biologists to learn how to apply a simple machine learning model to protein sequences.\n",
20+
"**Rule 1: Tell a Story for an Audience.** This notebook was developed to learn how to apply a simple machine learning model to predict protein features based on protein sequences.\n",
2121
"\n",
2222
"**Rule 3: Build a Pipeline.** This notebook describes the entire workflow from data preparation, feature calculation, model fitting, to prediction. The modularity makes it easy to replace one of the steps, for example, use a different method to calculate features or apply a different machine learning model.\n",
2323
"\n",
24-
"**Rule 5: Use Cell, Section adn Notebook Divisions to Make Steps Clear.** We broke the workflow into separate notebooks and use this top-level notebook to explain and orchestrate the workflow.\n",
24+
"**Rule 5: Use Cell, Section and Notebook Divisions to Make Steps Clear.** We broke the workflow into separate notebooks and use this top-level notebook to explain and organize the workflow.\n",
2525
"\n",
2626
"---"
2727
]
@@ -37,7 +37,7 @@
3737
"cell_type": "markdown",
3838
"metadata": {},
3939
"source": [
40-
"Protein chains fold in regular patterns. Secondary structure describes the geometry of segments of a protein chain. The most common secondary structure elements are\n",
40+
"Proteins have four different levels of structure – primary, secondary, tertiary and quaternary. Secondary structure describes the geometry of segments of a protein chain. The most common secondary structure elements are:\n",
4141
"* Alpha helices\n",
4242
"* Beta sheets"
4343
]
@@ -46,7 +46,7 @@
4646
"cell_type": "markdown",
4747
"metadata": {},
4848
"source": [
49-
"We can classify proteins into three major fold classes based on their predominant secondary structure content\n",
49+
"We can classify proteins into three major fold classes based on their predominant secondary structure content:\n",
5050
"* alpha: contains predominantly alpha helices\n",
5151
"* beta: contains predominantly beta sheets\n",
5252
"* alpha+beta: contains alpha helices and beta sheets"
@@ -57,7 +57,7 @@
5757
"metadata": {},
5858
"source": [
5959
"## Goal\n",
60-
"This notebook demonstrates how to create a reproducible record to create a machine learning model. We train a simple model to predict the fold class of a protein given its protein sequence using a representative set of 3D structures from the Protein Data Bank.\n",
60+
"This notebook demonstrates how to create a reproducible record using a machine learning model. We train the model to predict the fold class of a protein given its amino acid sequence using a representative set of 3D structures from the Protein Data Bank.\n",
6161
"\n",
6262
"**Run the following notebooks and explore how we applied the Ten Simple Rules.**"
6363
]
@@ -103,7 +103,7 @@
103103
"cell_type": "markdown",
104104
"metadata": {},
105105
"source": [
106-
"Protein sequences cannot be directly used for machine learning. Here use the Word2vec method to calculate a fixed-sized feature vector for each protein sequence.\n",
106+
"Protein sequences cannot be directly used for machine learning. Here we use the Word2vec method to calculate a fixed-sized feature vector for each protein sequence.\n",
107107
"\n",
108108
"Run the following notebook to calculate feature vectors. "
109109
]
@@ -230,7 +230,7 @@
230230
"source": [
231231
"---\n",
232232
"\n",
233-
"**Authors:** Peter W. Rose, Shih-Cheng Huang, UC San Diego, October 1, 2018\n",
233+
"**Authors:** [Peter W. Rose](mailto:pwrose.ucsd@gmail.com), Shih-Cheng Huang, UC San Diego, October 1, 2018\n",
234234
"\n",
235235
"---"
236236
]
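
Note on the "Calculate Features" cell changed above: it says that protein sequences cannot be used directly for machine learning and that a fixed-size feature vector is calculated for each sequence. The sketch below illustrates why a fixed size is achievable for sequences of any length. It splits a sequence into overlapping 3-mers and averages one vector per 3-mer. The function `toy_embedding` is a hypothetical stand-in for a trained Word2vec lookup (it returns deterministic pseudo-random vectors), and the 3-mer tokenization and averaging are assumptions; this diff does not show how the notebooks tokenize or pool.

```python
import hashlib
import random


def kmers(sequence, k=3):
    """Split a protein sequence into overlapping k-mers ("words")."""
    return [sequence[i:i + k] for i in range(len(sequence) - k + 1)]


def toy_embedding(word, dim=8):
    """Hypothetical stand-in for a trained Word2vec lookup table.

    Returns a deterministic pseudo-random vector per k-mer; a real
    model would return learned vectors instead.
    """
    seed = int.from_bytes(hashlib.sha256(word.encode()).digest()[:8], "big")
    rng = random.Random(seed)
    return [rng.uniform(-1.0, 1.0) for _ in range(dim)]


def sequence_vector(sequence, k=3, dim=8):
    """Average the k-mer vectors so every sequence maps to `dim` numbers."""
    words = kmers(sequence, k)
    if not words:
        return [0.0] * dim  # sequence shorter than k: no words to average
    total = [0.0] * dim
    for word in words:
        for i, value in enumerate(toy_embedding(word, dim)):
            total[i] += value
    return [value / len(words) for value in total]


short_vec = sequence_vector("MKTAYIAK")
long_vec = sequence_vector("MKTAYIAKQRQISFVKSHFSRQ")
print(len(short_vec), len(long_vec))  # both vectors have length 8
```

Because the per-word vectors are averaged, the output length depends only on `dim`, not on the sequence length, which is the property a downstream classifier needs.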
