
Commit 33a97b5
Update README.md
1 parent 8e35236 commit 33a97b5

File tree: 1 file changed, +57 -3 lines


README.md

Lines changed: 57 additions & 3 deletions

@@ -1,5 +1,56 @@
-# plasma-python
-PPPL deep learning disruption prediction package

FRNN - PPPL deep learning disruption prediction package
=======================================================

The FRNN code workflow is similar to that of typical distributed deep learning projects.
First, the raw data is preprocessed and normalized. The preprocessing step involves cutting, resampling,
and structuring the data, as well as determining and validating the disruptive properties of
the shots considered. Various options for normalization are implemented.
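
As a rough sketch of the cutting and resampling step (the function, the synthetic signal and the
disruption time below are purely illustrative and are not the package's actual preprocessor API):

```python
import numpy as np

def cut_and_resample(times, values, t_disrupt=None, dt=0.001):
    """Clip a raw signal at the disruption time (if any) and
    resample it onto a uniform time base with spacing dt.

    A minimal sketch only; the real preprocessor classes also handle
    normalization, shot validation and multiple signals.
    """
    if t_disrupt is not None:
        keep = times <= t_disrupt
        times, values = times[keep], values[keep]
    uniform_t = np.arange(times[0], times[-1], dt)
    uniform_v = np.interp(uniform_t, times, values)  # linear resampling
    return uniform_t, uniform_v

# Hypothetical, irregularly sampled plasma-current trace for a single shot
raw_t = np.sort(np.random.uniform(0.0, 2.0, size=500))
raw_i = 1.0e6 * np.sin(raw_t)
t, i_p = cut_and_resample(raw_t, raw_i, t_disrupt=1.5)
```
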
Second, the model is trained in a distributed data-parallel fashion, and the associated parameters are
checkpointed to disk after each epoch in HDF5 file format. Finally, for the cross-validation and prediction step on
unlabeled data, it is planned to also implement hyper-parameter tuning using a random search algorithm.
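
A minimal sketch of per-epoch checkpointing of weights to HDF5 with Keras (the model architecture and
the dummy data are illustrative only; the actual FRNN training loop is distributed over MPI and uses
stateful RNNs):

```python
import numpy as np
from keras.models import Sequential
from keras.layers import LSTM, Dense

# Dummy data: (shots, timesteps, signals) -> binary disruptive/non-disruptive label
x_train = np.random.rand(64, 100, 8)
y_train = np.random.randint(0, 2, size=(64, 1))

model = Sequential()
model.add(LSTM(32, input_shape=(100, 8)))
model.add(Dense(1, activation='sigmoid'))
model.compile(loss='binary_crossentropy', optimizer='adam')

for epoch in range(5):
    model.fit(x_train, y_train, nb_epoch=1, batch_size=16)  # Keras 1.x API
    # Checkpoint the parameters to disk in HDF5 format after each epoch.
    model.save_weights('weights_epoch_%d.h5' % epoch, overwrite=True)
```
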
The results are stored as HDF5 files, including the final neural network model parameters together with
statistical summaries of the variables used during training to allow researchers to produce learning
curves and performance summary plots.
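
As an example of how such summaries could be read back to plot a learning curve (the file name and
dataset names here are hypothetical, chosen only to illustrate the idea with h5py and matplotlib):

```python
import h5py
import matplotlib.pyplot as plt

# Hypothetical results file with per-epoch loss summaries stored as datasets.
with h5py.File('results.h5', 'r') as f:
    train_loss = f['train_loss'][:]
    val_loss = f['val_loss'][:]

plt.plot(train_loss, label='training loss')
plt.plot(val_loss, label='validation loss')
plt.xlabel('epoch')
plt.ylabel('loss')
plt.legend()
plt.savefig('learning_curve.png')
```
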
The Fusion Recurrent Neural Net (FRNN) deep learning code is implemented as a Python package
consisting of four core modules:

- models: Python classes necessary to construct, train and optimize deep RNN models, including a distributed data-parallel implementation of mini-batch gradient descent with MPI.

- preprocessors: signal preprocessing and normalization classes, including the methods necessary to prepare physical data for stateful RNN training.

- primitives: domain-specific abstractions implemented as Python classes. For instance, Shot: a measurement of plasma current as a function of time. The Shot object contains attributes corresponding to the unique identifier of a shot, the disruption time in milliseconds, the time profile of the shot converted to time-to-disruption values, the validity of the shot (whether the plasma current reaches a certain value during the shot), and so on (a minimal sketch is given after this list).

- utilities: a set of auxiliary functions for preprocessing, performance evaluation and learning-curve analysis.
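
A minimal, illustrative sketch of a Shot-style primitive (the attribute and method names are
assumptions for illustration, not the exact interface of the class in `primitives`):

```python
class Shot(object):
    """Illustrative stand-in for the package's Shot primitive."""

    def __init__(self, number, t_disrupt=None, times=None, signals=None):
        self.number = number        # unique shot identifier
        self.t_disrupt = t_disrupt  # disruption time in milliseconds, or None
        self.times = times          # time base of the measurements
        self.signals = signals      # measured signals, e.g. plasma current

    @property
    def is_disruptive(self):
        """Whether a disruption time is recorded for this shot."""
        return self.t_disrupt is not None

    def time_to_disruption(self):
        """Convert the time profile to time-to-disruption values."""
        if not self.is_disruptive:
            return None
        return [self.t_disrupt - t for t in self.times]
```
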
This is a pure Python implementation for Python versions 2.6 and 2.7.

Installation
============

The package comes with a standard setup script and a list of dependencies, which includes mpi4py, Theano,
Keras, h5py, and Pathos. It also requires a standard set of CUDA drivers to run on a GPU.

Run:
```bash
pip install -i https://testpypi.python.org/pypi plasma
```
Optionally, add `--user` to install in a home directory.

Alternatively, use the setup script:

```bash
python setup.py install
```
Add `sudo` if superuser permissions are needed, or `--home=~` to install in a home directory. The latter option requires an appropriate `PYTHONPATH`.

Module index
============

Tutorials
=========

## Sample usage on Tiger

@@ -38,7 +89,6 @@ where X is the number of nodes for distributed data parallel training.
sbatch slurm.cmd
```

#### Interactive analysis

The workflow is to request an interactive session:
@@ -57,3 +107,7 @@ mpirun -npernode 4 python examples/mpi_learn.py
```

Note: Theano compilation takes place during the first epoch, which will distort timing. It is recommended to set `num_epochs >= 2` in `conf.py` when testing.

Status
======
