
Commit 475debf

Merge branch 'main' into add-demo

2 parents 70176d5 + 356d82b

2 files changed: +48 −9 lines changed

LICENSE

Lines changed: 25 additions & 0 deletions

````diff
@@ -0,0 +1,25 @@
+BSD 2-Clause License
+
+Copyright (c) 2020, Raivo Eli Koot
+All rights reserved.
+
+Redistribution and use in source and binary forms, with or without
+modification, are permitted provided that the following conditions are met:
+
+1. Redistributions of source code must retain the above copyright notice, this
+   list of conditions and the following disclaimer.
+
+2. Redistributions in binary form must reproduce the above copyright notice,
+   this list of conditions and the following disclaimer in the documentation
+   and/or other materials provided with the distribution.
+
+THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS"
+AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
+IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE
+DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE
+FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
+DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR
+SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER
+CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY,
+OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
+OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
````

README.md

Lines changed: 23 additions & 9 deletions

````diff
@@ -1,6 +1,8 @@
-[![Documentation Status](https://readthedocs.org/projects/video-dataset-loading-pytorch/badge/?version=latest)](https://video-dataset-loading-pytorch.readthedocs.io/en/latest/?badge=latest)
+[![Documentation Status](https://readthedocs.org/projects/video-dataset-loading-pytorch/badge/?version=latest)](https://video-dataset-loading-pytorch.readthedocs.io/en/latest/?badge=latest)
 # Efficient Video Dataset Loading, Preprocessing, and Augmentation
 Author: [Raivo Koot](https://github.com/RaivoKoot)
+https://video-dataset-loading-pytorch.readthedocs.io/en/latest/VideoDataset.html
+If you find the code useful, please star the repository.
 
 If you are completely unfamiliar with loading datasets in PyTorch using `torch.utils.data.Dataset` and `torch.utils.data.DataLoader`, I recommend
 getting familiar with these first through [this](https://pytorch.org/tutorials/beginner/data_loading_tutorial.html) or
````
````diff
@@ -10,7 +12,7 @@ getting familiar with these first through [this](https://pytorch.org/tutorials/b
 The VideoFrameDataset class serves to `easily`, `efficiently` and `effectively` load video samples from video datasets in PyTorch.
 1) Easily because this dataset class can be used with custom datasets with minimum effort and no modification. The class merely expects the
 video dataset to have a certain structure on disk and expects a .txt annotation file that enumerates each video sample. Details on this
-can be found below and at `https://video-dataset-loading-pytorch.readthedocs.io/`.
+can be found below and at `https://video-dataset-loading-pytorch.readthedocs.io/en/latest/VideoDataset.html`.
 2) Efficiently because the video loading pipeline that this class implements is very fast. This minimizes GPU waiting time during training by eliminating input bottlenecks
 that can slow down training time by several folds.
 3) Effectively because the implemented sampling strategy for video frames is very strong. Video training using the entire sequence of
````
````diff
@@ -21,7 +23,8 @@ This approach has shown to be very effective and is taken from
 
 In conjunction with PyTorch's DataLoader, the VideoFrameDataset class returns video batch tensors of size `BATCH x FRAMES x CHANNELS x HEIGHT x WIDTH`.
 
-For a demo, visit `demo.py`.
+For a demo, visit `demo.py`.
+
 ### QuickDemo (demo.py)
 ```python
 root = os.path.join(os.getcwd(), 'demo_dataset') # Folder in which all videos lie in a specific structure
````
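The `BATCH x FRAMES x CHANNELS x HEIGHT x WIDTH` shape mentioned in this hunk can be sanity-checked with a small sketch. This is plain Python with no torch dependency; the sizes are illustrative, and the helper names are hypothetical, not part of the library:

```python
# Sketch of the tensor shape produced by VideoFrameDataset + DataLoader.
# One sample holds NUM_SEGMENTS * FRAMES_PER_SEGMENT frames; the DataLoader
# then stacks samples along a new leading BATCH dimension.
def sample_shape(num_segments, frames_per_segment, channels, height, width):
    # FRAMES x CHANNELS x HEIGHT x WIDTH for a single loaded video
    return (num_segments * frames_per_segment, channels, height, width)

def batch_shape(batch_size, shape):
    # DataLoader prepends the batch dimension after per-sample loading
    return (batch_size, *shape)

shape = batch_shape(8, sample_shape(num_segments=5, frames_per_segment=1,
                                    channels=3, height=224, width=224))
assert shape == (8, 5, 3, 224, 224)  # BATCH x FRAMES x C x H x W
```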
````diff
@@ -49,15 +52,16 @@ for image in frames:
     plt.show()
     plt.pause(1)
 ```
-
+![alt text](https://github.com/RaivoKoot/images/blob/main/Action_Video.jpg "Action Video")
 # Table of Contents
 - [1. Requirements](#1-requirements)
 - [2. Custom Dataset](#2-custom-dataset)
 - [3. Video Frame Sampling Method](#3-video-frame-sampling-method)
 - [4. Alternate Video Frame Sampling Methods](#4-alternate-video-frame-sampling-methods)
 - [5. Using VideoFrameDataset for Training](#5-using-videoframedataset-for-training)
 - [6. Conclusion](#6-conclusion)
-- [7. Acknowledgements](#7-acknowledgements)
+- [7. Upcoming Features](#7-upcoming-features)
+- [8. Acknowledgements](#8-acknowledgements)
 
 ### 1. Requirements
 ```
````
```
````diff
@@ -119,12 +123,13 @@ When loading a video, only a number of its frames are loaded. They are chosen in
 1. The frame indices [1,N] are divided into NUM_SEGMENTS even segments. From each segment, FRAMES_PER_SEGMENT consecutive indices are chosen at random.
 This results in NUM_SEGMENTS*FRAMES_PER_SEGMENT chosen indices, whose frames are loaded as PIL images and put into a list and returned when calling
 `dataset[i]`.
+![alt text](https://github.com/RaivoKoot/images/blob/main/Sparse_Temporal_Sampling.jpg "Sparse-Temporal-Sampling-Strategy")
 
 ### 4. Alternate Video Frame Sampling Methods
 If you do not want to use sparse temporal sampling and instead want to sample a single N-frame continuous
 clip from a video, this is possible. Set `NUM_SEGMENTS=1` and `FRAMES_PER_SEGMENT=N`. Because VideoFrameDataset
 will choose a random start index per segment and take `FRAMES_PER_SEGMENT` continuous frames from each sampled start
-index, this will result in a single N-frame continuous clip per video. An example of this is in `demo.py`.
+index, this will result in a single N-frame continuous clip per video. An example of this is in `demo.py`.
 
 ### 5. Using VideoFrameDataset for training
 As demonstrated in `demo.py`, we can use PyTorch's `torch.utils.data.DataLoader` class with VideoFrameDataset to take care of shuffling, batching, and more.
````
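The segment-based sampling described in this hunk can be sketched in plain Python. This is a simplified stand-in for the class's internal index selection, not the library's actual code, and the function name is illustrative:

```python
import random

def sparse_sample(num_frames, num_segments, frames_per_segment, rng=random):
    """Split [0, num_frames) into num_segments even segments and pick
    frames_per_segment consecutive indices at random from each one."""
    indices = []
    for s in range(num_segments):
        seg_start = (s * num_frames) // num_segments
        seg_end = ((s + 1) * num_frames) // num_segments
        # latest start that keeps the consecutive run inside the segment
        last_start = max(seg_start, seg_end - frames_per_segment)
        start = rng.randint(seg_start, last_start)
        indices.extend(range(start, start + frames_per_segment))
    return indices

# Sparse sampling: 3 segments x 2 consecutive frames = 6 indices.
idx = sparse_sample(num_frames=300, num_segments=3, frames_per_segment=2)
assert len(idx) == 6 and all(0 <= i < 300 for i in idx)

# With NUM_SEGMENTS=1 and FRAMES_PER_SEGMENT=N, this degenerates into a
# single random N-frame continuous clip, as Section 4 describes.
clip = sparse_sample(num_frames=300, num_segments=1, frames_per_segment=16)
assert clip == list(range(clip[0], clip[0] + 16))
```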
````diff
@@ -134,12 +139,21 @@ We can further chain preprocessing and augmentation functions that act on batche
 
 As of `torchvision 0.8.0`, all torchvision transforms can now also operate on batches of images, and they apply deterministic or random transformations
 on the batch identically on all images of the batch. Therefore, any torchvision transform can be used here to apply video-uniform preprocessing and augmentation.
-
+
+REMEMBER:
+PyTorch transforms are applied to individual dataset samples (in this case a list of PIL video frames, or a frame tensor after `imglist_totensor()`) before
+batching. So, any transform used here must expect its input to be a frame tensor of shape `FRAMES x CHANNELS x HEIGHT x WIDTH`, or a list of PIL images if `imglist_totensor()` is not used.
 ### 6. Conclusion
 A proper code-based explanation of how to use VideoFrameDataset for training is provided in `demo.py`.
 
-### 7. Acknowledgements
-We thank the authors of TSN for their [codebase](https://github.com/yjxiong/tsn-pytorch), from which we took VideoFrameDataset and adapted it.
+### 7. Upcoming Features
+- [x] Add demo for sampling a single continuous-frame clip from videos.
+- [ ] Add support for arbitrary labels that are more than just a single integer.
+- [ ] Add support for specifying START_FRAME and END_FRAME for a video instead of NUM_FRAMES.
+
+### 8. Acknowledgements
+We thank the authors of TSN for their [codebase](https://github.com/yjxiong/tsn-pytorch), from which we took VideoFrameDataset and adapted it
+for general use and compatibility.
 ```
 @InProceedings{wang2016_TemporalSegmentNetworks,
   title={Temporal Segment Networks: Towards Good Practices for Deep Action Recognition},
````
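The ordering the REMEMBER note in the last hunk warns about (transform each sample first, batch second) can be illustrated with a toy pipeline. These are plain-Python stand-ins for Dataset/transform/DataLoader with no torch dependency; every name here is illustrative, not the library's API:

```python
# Toy stand-ins showing that the transform sees ONE sample (a frame list)
# and that batching happens only afterwards.
def per_sample_transform(frames):
    # operates on a single video's frames, e.g. scales pixel values to [0, 1]
    return [f / 255.0 for f in frames]

def collate(samples):
    # a DataLoader stacks already-transformed samples into a batch
    return list(samples)

dataset = [[0.0, 255.0], [51.0, 102.0]]  # 2 "videos" of 2 scalar "frames" each
batch = collate(per_sample_transform(s) for s in dataset)
assert batch == [[0.0, 1.0], [0.2, 0.4]]
```

The same order applies with real tensors: a transform placed in the dataset receives `FRAMES x CHANNELS x HEIGHT x WIDTH`, never the batched five-dimensional tensor.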
