## FRNN - PPPL deep learning disruption prediction package

The FRNN code workflow is similar to that of typical distributed deep learning projects.
First, the raw data is preprocessed and normalized. The pre-processing step involves cutting, resampling,

This is a pure Python implementation for Python versions 2.6 and 2.7.

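The cutting, resampling, and normalization steps described above can be sketched for a single 1-D signal as follows; the function name, the use of linear interpolation, and z-score normalization are illustrative assumptions, not necessarily FRNN's exact implementation:

```python
import numpy as np

def preprocess_signal(t, x, t_start, t_end, dt):
    """Cut a raw signal to [t_start, t_end], resample it onto a
    uniform time base with spacing dt, and z-score normalize it."""
    # Cut: keep only samples inside the window of interest
    mask = (t >= t_start) & (t <= t_end)
    t_cut, x_cut = t[mask], x[mask]
    # Resample: linearly interpolate onto a uniform time grid
    t_uniform = np.arange(t_start, t_end, dt)
    x_resampled = np.interp(t_uniform, t_cut, x_cut)
    # Normalize: zero mean, unit variance
    x_norm = (x_resampled - x_resampled.mean()) / x_resampled.std()
    return t_uniform, x_norm

# Example: an irregularly sampled ramp signal
t = np.array([0.0, 0.1, 0.25, 0.4, 0.6, 0.9, 1.0])
x = 2.0 * t
t_u, x_n = preprocess_signal(t, x, 0.0, 1.0, 0.1)
```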
## Installation

The package comes with a standard setup script and a list of dependencies, which include mpi4py, Theano,
Keras, h5py, and Pathos. It also requires a standard set of CUDA drivers to run on GPU.

Run `python setup.py install`, with `sudo` if superuser permissions are needed, or with `--home=~` to install into a home directory. The latter option requires an appropriate `PYTHONPATH`.

## Module index

## Tutorials

### Sample usage on Tiger

```bash
module load anaconda cudatoolkit/7.5 cudann openmpi/intel-16.0/1.8.8/64
```

The Python environment named `environment` should contain the packages listed in the `requirements.txt` file.

#### Preprocessing

```bash
python guarantee_preprocessed.py
```

#### Training and inference

Use the Slurm scheduler to perform batch or interactive analysis on the Tiger cluster.

##### Batch analysis

For batch analysis, make sure to allocate 1 process per GPU:

where X is the number of nodes for distributed data-parallel training.
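The contents of `slurm.cmd` are site-specific; the following is a minimal sketch assuming 4 GPUs per node, with the wall time as an illustrative placeholder (X stands for the number of nodes used for data-parallel training):

```bash
#!/bin/bash
#SBATCH --nodes=X              # X = number of nodes for data-parallel training
#SBATCH --ntasks-per-node=4    # one MPI process per GPU
#SBATCH --gres=gpu:4           # request all 4 GPUs on each node
#SBATCH --time=01:00:00        # illustrative wall time

mpirun -npernode 4 python examples/mpi_learn.py
```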
```bash
sbatch slurm.cmd
```

##### Interactive analysis

The workflow is to first request an interactive session on a GPU node.

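Such a session can be requested with Slurm's `salloc`; the exact flags below (one node, one task per GPU, a one-hour limit) are assumptions to adapt to your allocation:

```bash
salloc --nodes=1 --ntasks-per-node=4 --gres=gpu:4 --time=01:00:00
```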
Then, from within the session, launch one MPI process per GPU: `mpirun -npernode 4 python examples/mpi_learn.py`.
Note: Theano compilation takes place during the 1st epoch, which distorts timing. It is therefore recommended to perform testing with `num_epochs >= 2` in `conf.py`.
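When benchmarking, one way to respect this is to time each epoch separately and drop the first measurement before averaging; a minimal sketch, with `train_epoch` as a hypothetical stand-in for the real training step:

```python
import time

def train_epoch():
    # Stand-in for one real training epoch; in FRNN the first call
    # would additionally pay the one-time Theano compilation cost.
    time.sleep(0.01)

num_epochs = 4  # keep num_epochs >= 2 so at least one timed epoch remains
durations = []
for epoch in range(num_epochs):
    start = time.perf_counter()
    train_epoch()
    durations.append(time.perf_counter() - start)

# Discard the first epoch (compilation overhead) before averaging
steady = durations[1:]
mean_epoch_time = sum(steady) / len(steady)
```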


## Status