Commit e8f2834

Merge branch 'stl10'
2 parents d1f9c09 + 28fbcb2 commit e8f2834

4 files changed: 68 additions, 1 deletion
README.md

Lines changed: 15 additions & 0 deletions
@@ -22,6 +22,7 @@ Hi there, and welcome to the `extra-keras-datasets` module! This extension to th
 * [KMNIST-K49](#kmnist-k49)
 * [SVHN-Normal](#svhn-normal)
 * [SVHN-Extra](#svhn-extra)
+* [STL-10](#stl-10)
 - [Contributors and other references](#contributors-and-other-references)
 - [License](#license)

@@ -154,6 +155,18 @@ from extra-keras-datasets import svhn
 
 ---
 
+### STL-10
+The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, and self-taught learning algorithms. It contains 5,000 training images and 8,000 testing images across 10 classes (airplane, bird, car, cat, deer, dog, horse, monkey, ship, truck).
+
+```
+from extra_keras_datasets import stl10
+(input_train, target_train), (input_test, target_test) = stl10.load_data()
+```
+
+<a href="./assets/stl10.png"><img src="./assets/stl10.png" width="100%" style="border: 3px solid #f6f8fa;" /></a>
+
+---
+
 ## Contributors and other references
 * **EMNIST dataset:**
 * Cohen, G., Afshar, S., Tapson, J., & van Schaik, A. (2017). EMNIST: an extension of MNIST to handwritten letters. Retrieved from http://arxiv.org/abs/1702.05373
@@ -162,6 +175,8 @@ from extra-keras-datasets import svhn
 * Clanuwat, T., Bober-Irizar, M., Kitamoto, A., Lamb, A., Yamamoto, K., & Ha, D. (2018). Deep learning for classical Japanese literature. arXiv preprint arXiv:1812.01718. Retrieved from https://arxiv.org/abs/1812.01718
 * **SVHN dataset:**
 * Netzer, Y., Wang, T., Coates, A., Bissacco, A., Wu, B., & Ng, A. Y. (2011). Reading digits in natural images with unsupervised feature learning. Retrieved from http://ufldl.stanford.edu/housenumbers/nips2011_housenumbers.pdf / http://ufldl.stanford.edu/housenumbers/
+* **STL-10 dataset:**
+* Coates, A., Ng, A., & Lee, H. (2011, June). An analysis of single-layer networks in unsupervised feature learning. In Proceedings of the fourteenth international conference on artificial intelligence and statistics (pp. 215-223). Retrieved from http://cs.stanford.edu/~acoates/papers/coatesleeng_aistats_2011.pdf
 
 ## License
 The licensable parts of this repository are licensed under an [MIT License](./LICENSE), so you're free to use this repo in your machine learning projects / blogs / exercises, and so on. Happy engineering! 🚀
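
The README example returns integer targets alongside the images. A minimal sketch of mapping those targets to the class names listed in the README follows. It assumes (this commit does not confirm it) that the targets are the 1-based labels 1..10 of the original MATLAB files and that they follow the class order given in the README. `CLASS_NAMES` and `to_zero_based` are hypothetical helper names, not part of `extra_keras_datasets`, and the small array stands in for `target_train`.

```python
import numpy as np

# Class order as listed in the README entry (assumed to match label order).
CLASS_NAMES = ["airplane", "bird", "car", "cat", "deer",
               "dog", "horse", "monkey", "ship", "truck"]


def to_zero_based(targets):
    """Shift assumed 1-based labels (1..10) to 0..9 for use as class indices."""
    targets = np.asarray(targets)
    if targets.min() < 1 or targets.max() > len(CLASS_NAMES):
        raise ValueError("expected labels in the range 1..10")
    return targets - 1


# Stand-in for `target_train`; the real array would come from stl10.load_data().
example_targets = np.array([1, 10, 4], dtype=np.uint8)
zero_based = to_zero_based(example_targets)
names = [CLASS_NAMES[i] for i in zero_based]
print(zero_based, names)
```

Zero-based labels are what Keras losses such as sparse categorical cross-entropy expect, so a shift like this would probably be needed before training if the assumption about 1-based labels holds.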

assets/stl10.png

887 KB

extra_keras_datasets/__init__.py

Lines changed: 2 additions & 1 deletion
@@ -2,4 +2,5 @@
 
 from . import emnist
 from . import kmnist
-from . import svhn
+from . import svhn
+from . import stl10

extra_keras_datasets/stl10.py

Lines changed: 51 additions & 0 deletions
@@ -0,0 +1,51 @@
+'''
+  Import the STL-10 dataset
+  Source: https://cs.stanford.edu/~acoates/stl10/
+  Description: The STL-10 dataset is an image recognition dataset for developing unsupervised feature learning, deep learning, self-taught learning algorithms.
+
+  ~~~ Important note ~~~
+  Please cite the following paper when using or referencing the dataset:
+  Coates, A., Ng, A., & Lee, H. (2011, June). An analysis of single-layer networks in unsupervised feature learning. In Proceedings of the fourteenth international conference on artificial intelligence and statistics (pp. 215-223). Retrieved from http://cs.stanford.edu/~acoates/papers/coatesleeng_aistats_2011.pdf
+
+'''
+
+from keras.utils.data_utils import get_file
+from scipy import io as sio
+import shutil
+import numpy as np
+
+def load_data(path='stl10_matlab.tar.gz'):
+    """Loads the STL-10 dataset.
+    # Arguments
+        path: path where to cache the dataset locally
+            (relative to ~/.keras/datasets).
+    # Returns
+        Tuple of Numpy arrays: `(input_train, target_train), (input_test, target_test)`.
+    """
+    path = get_file(path,
+                    origin='http://ai.stanford.edu/~acoates/stl10/stl10_matlab.tar.gz')
+
+    # Temporarily extract .tar.gz in local path
+    local_targz_path = './stl-10'
+    shutil.unpack_archive(path, local_targz_path)
+
+    # Load data from Matlab file
+    # Source: https://stackoverflow.com/a/53547262
+    train = sio.loadmat(f'{local_targz_path}/stl10_matlab/train.mat')
+    test = sio.loadmat(f'{local_targz_path}/stl10_matlab/test.mat')
+
+    # Remove temporary file
+    shutil.rmtree(local_targz_path, ignore_errors=True)
+
+    # Define training data
+    input_train = train['X'].reshape((-1, 3, 96, 96))
+    input_train = np.transpose(input_train, (0, 3, 2, 1))
+    target_train = train['y'].flatten()
+
+    # Define testing data
+    input_test = test['X'].reshape((-1, 3, 96, 96))
+    input_test = np.transpose(input_test, (0, 3, 2, 1))
+    target_test = test['y'].flatten()
+
+    # Return data
+    return (input_train, target_train), (input_test, target_test)
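
The reshape followed by a transpose in `load_data` can be checked on synthetic data. The sketch below assumes that each row of `X` holds one image flattened in column-major (MATLAB) order, which is what the `(-1, 3, 96, 96)` reshape and the `(0, 3, 2, 1)` transpose imply. The tiny dimensions and the names `N`, `H`, `W`, `C`, `images`, and `restored` are illustrative only; STL-10 itself uses 96 x 96 x 3 images.

```python
import numpy as np

# Tiny stand-in dimensions; STL-10 itself uses 96 x 96 x 3.
N, H, W, C = 2, 4, 5, 3

# Synthetic reference images in channels-last layout (N, H, W, C).
images = np.arange(N * H * W * C, dtype=np.uint8).reshape(N, H, W, C)

# Assumed storage: each row is one image flattened in column-major order,
# so the height index varies fastest, then width, then channel.
X = np.stack([img.ravel(order="F") for img in images])

# Same two steps as load_data: reshape to (N, C, W, H), then reverse the
# last three axes to obtain (N, H, W, C).
restored = np.transpose(X.reshape((-1, C, W, H)), (0, 3, 2, 1))

print(restored.shape)
```

Using unequal height and width in the check makes an accidental swap of the two spatial axes visible, which square 96 x 96 images would hide.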
