
Commit c3fa1e4

committed: modified README
1 parent: 7497b96

2 files changed: +151 -2 lines changed

README.md

Lines changed: 151 additions & 1 deletion
@@ -1,2 +1,152 @@

# yolov5-svhn-detection

PyTorch implementation of homework 2 for the VRDL course in the 2021 fall semester at NYCU.

[![License: MIT](https://img.shields.io/badge/License-MIT-yellow.svg)](https://opensource.org/licenses/MIT)

### [Report](./REPORT.pdf)

by [Zhi-Yi Chin](https://joycenerd.github.io/)

This repository is the implementation of homework 2 for IOC5008 Selected Topics in Visual Recognition using Deep Learning, offered in the 2021 fall semester at National Yang Ming Chiao Tung University.

In this homework, we participate in the SVHN detection competition hosted on [CodaLab](https://competitions.codalab.org/competitions/35888?secret_key=7e3231e6-358b-4f06-a528-0e3c8f9e328e). The [Street View House Numbers (SVHN) dataset](http://ufldl.stanford.edu/housenumbers/) contains 33,402 training images and 13,068 testing images. We are required to train a digit detector that is not only accurate but also fast. The submission must follow the COCO results format. To test the detection model's speed, we must benchmark it in the Google Colab environment and screenshot the results.
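
For orientation, a COCO-style results file is a JSON list with one entry per detected digit. The snippet below is a minimal sketch of that structure, not the repository's actual submission code; the image-ID convention and the output file name are assumptions:

```python
import json

# Minimal sketch of the COCO "results" format (one dict per detection).
# bbox is [x_min, y_min, width, height] in pixels; image_id is assumed to come
# from the test image file name (e.g. 117.png -> 117).
detections = [
    {"image_id": 117, "category_id": 5, "bbox": [43.0, 7.0, 19.0, 30.0], "score": 0.92},
    {"image_id": 117, "category_id": 1, "bbox": [60.0, 6.0, 17.0, 31.0], "score": 0.88},
]

with open("answer.json", "w") as f:  # file name is illustrative
    json.dump(detections, f)
```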

## Getting the code

You can get a copy of all the files in this repository by cloning it:

```
git clone https://github.com/joycenerd/yolov5-svhn-detection.git
```

## Requirements

You need to have [Anaconda](https://www.anaconda.com/) or Miniconda already installed in your environment. To install the requirements:

```
conda create --name detect python=3
conda activate detect
cd yolov5
pip install -r requirements.txt
```

## Dataset

You can download the raw data after registering for the challenge mentioned above.

### Data pre-processing

#### 1. Convert the `.mat` label file into YOLO-format annotations

```
python mat2yolo.py --data-root <path_to_data_root_dir>
```
* input: your data root directory, which should contain `train/` (all the training images) and `digitStruct.mat` (the original label file).
* output: `<path_to_data_root_dir>/labels/all_train/`, which contains one YOLO-format annotation text file per training image, named after the image (see the format sketch below).
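
A YOLO-format label file has one line per digit: `class x_center y_center width height`, with all values normalized by the image size. The helper below is a minimal sketch of the conversion from an SVHN-style pixel box, not the exact code in `mat2yolo.py` (in particular, how SVHN's label `10` for the digit 0 maps to a class index is left out):

```python
def svhn_box_to_yolo(cls, left, top, box_w, box_h, img_w, img_h):
    """Convert one pixel-coordinate box to a YOLO label line (normalized coordinates)."""
    x_center = (left + box_w / 2) / img_w
    y_center = (top + box_h / 2) / img_h
    return f"{cls} {x_center:.6f} {y_center:.6f} {box_w / img_w:.6f} {box_h / img_h:.6f}"

# Example: digit class 7 at (left=10, top=5, w=20, h=40) in a 100x50 image
print(svhn_box_to_yolo(7, 10, 5, 20, 40, 100, 50))
# -> 7 0.200000 0.500000 0.200000 0.800000
```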

#### 2. Train/validation split: split the original training data into 80% training and 20% validation

```
python train_valid_split.py --data-root <path_to_data_root_dir> --ratio 0.2
```
* input: same as the last step, plus the output of the last step
* output:
  * `<path_to_data_root_dir>/images/`: contains two subfolders, `train/` (training images) and `valid/` (validation images)
  * `<path_to_data_root_dir>/labels/train/`: text files containing the training labels
  * `<path_to_data_root_dir>/labels/valid/`: text files containing the validation labels
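
Conceptually, the split shuffles the image list and moves roughly 20% of the image/label pairs into the validation folders. The snippet below is a rough sketch of that idea, not the actual `train_valid_split.py`; the paths and seed are placeholders:

```python
import random
from pathlib import Path

def split_file_list(image_paths, ratio=0.2, seed=0):
    """Shuffle the image paths and return (train, valid) lists; `ratio` is the validation share."""
    paths = sorted(image_paths)
    random.Random(seed).shuffle(paths)
    n_valid = int(len(paths) * ratio)
    return paths[n_valid:], paths[:n_valid]

# Example usage with placeholder paths
images = list(Path("/path/to/data_root/train").glob("*.png"))
train_imgs, valid_imgs = split_file_list(images, ratio=0.2)
print(f"{len(train_imgs)} training images, {len(valid_imgs)} validation images")
```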

#### 3. Data configuration

Go to `yolov5/data/custom-data.yaml` and modify the `path`, `train`, `val`, and `test` paths.
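
A YOLOv5 data config typically lists the dataset root, the image folders relative to it, and the class names. The snippet below writes such a file as a rough illustration only; the paths are placeholders and the class ordering is an assumption, so adapt it to your own layout rather than treating it as the repository's exact config:

```python
import yaml  # PyYAML

# Rough sketch of a YOLOv5-style data config; all paths are placeholders.
data_cfg = {
    "path": "/path/to/data_root",          # dataset root
    "train": "images/train",               # training images, relative to `path`
    "val": "images/valid",                 # validation images
    "test": "images/test",                 # test images
    "nc": 10,                              # ten digit classes
    "names": [str(d) for d in range(10)],  # class ordering is an assumption
}

with open("yolov5/data/custom-data.yaml", "w") as f:
    yaml.safe_dump(data_cfg, f, sort_keys=False)
```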

## Training

You need a graphics card to train the model. For reference, we trained on two NVIDIA GTX 1080 Ti GPUs for 14 hours. Before training, download `yolov5s.pt` from the [YOLOv5 v6.0 release](https://github.com/ultralytics/yolov5/releases/tag/v6.0).

Recommended training command:
```
cd yolov5
python train.py --weights <yolov5s.pt_file> --cfg models/yolov5s.yaml --data data/custom-data.yaml --epochs 150 --cache --device 0,1 --workers 4 --project <train_log_dir> --save-period 5
```
There are more arguments you can tune in `yolov5/train.py`; we recommend sticking with the default settings at first.

The logging directory will be created under the path you specify for `--project`. The first experiment goes into a subdirectory named `exp/`, the second into `exp2/`, and so on. Inside this logging directory you can find:
* `weights/`: all the training checkpoints are saved here. A checkpoint is saved every 5 epochs; `best.pt` holds the current best model and `last.pt` the latest one.
* TensorBoard event logs
* some miscellaneous information about the data and the current hyperparameters

## Testing

You can test your training results by running this command:
```
python test.py [-h] [--data-root DATA_ROOT] [--ckpt CKPT] [--img-size IMG_SIZE]
               [--num-classes NUM_CLASSES] [--net NET] [--gpu GPU]

optional arguments:
  -h, --help            show this help message and exit
  --data-root DATA_ROOT
                        data root dir
  --ckpt CKPT           checkpoint path
  --img-size IMG_SIZE   image size in model
  --num-classes NUM_CLASSES
                        number of classes
  --net NET             which model
  --gpu GPU             gpu id
```

## Submit the results

Run this command to `zip` your submission file:
```
zip answer.zip answer.txt
```
You can upload `answer.zip` to the challenge to get your testing score.

## Pre-trained models

Go to [Releases](https://github.com/joycenerd/bird-images-classification/releases). Under **EfficientNet-b4 model**, download `efficientnet-b4_best_model.pth`. This pre-trained model achieves 72.53% accuracy on the test set.

Recommended testing command:
```
python test.py --data-root <path_to_data> --ckpt <path_to_checkpoint> --img-size 380 --net efficientnet-b4 --gpu 0
```

`answer.txt` will be generated in this directory; this is the submission file.

## Inference

To reproduce our results, run this command:
```
python inference.py --data-root <path_to_data> --ckpt <pre-trained_model_path> --img-size 380 --net efficientnet-b4 --gpu 0
```
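
Since the competition also asks for a speed benchmark in Google Colab, the snippet below is a rough per-image timing sketch. It assumes the checkpoint can be loaded through `torch.hub` from the upstream `ultralytics/yolov5` repository; the checkpoint path and image list are placeholders, not the repository's actual benchmark script:

```python
import time
import torch

# Load a custom YOLOv5 checkpoint via torch.hub (checkpoint path is a placeholder).
model = torch.hub.load("ultralytics/yolov5", "custom", path="best.pt")
model.to("cuda" if torch.cuda.is_available() else "cpu")

images = ["test/117.png", "test/118.png"]  # placeholder test images

start = time.time()
for img in images:
    _ = model(img)  # run detection on one image
elapsed = time.time() - start
print(f"{elapsed / len(images):.4f} s per image on average")
```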

## Reproducing Submission

To reproduce our submission without retraining, follow these steps:

1. [Getting the code](#getting-the-code)
2. [Install the dependencies](#requirements)
3. [Download the data](#dataset)
4. [Download pre-trained models](#pre-trained-models)
5. [Inference](#inference)
6. [Submit the results](#submit-the-results)

## Results

Our model achieves the following performance:

|     | EfficientNet-b4 w/o sched | EfficientNet-b4 with sched |
|-----|---------------------------|----------------------------|
| acc | 55.29%                    | 72.53%                     |

## Citation

If you find our work useful in your project, please cite:

```bibtex
@misc{chin2021bird_image_classification,
    title = {bird_image_classification},
    author = {Zhi-Yi Chin},
    url = {https://github.com/joycenerd/bird-images-classification},
    year = {2021}
}
```

## Contributing

If you'd like to contribute or have any suggestions, you can contact us at [joycenerd.cs09@nycu.edu.tw](mailto:joycenerd.cs09@nycu.edu.tw) or open an issue on this GitHub repository.

All contributions are welcome! All content in this repository is licensed under the MIT license.

train_valid_split.py

Lines changed: 0 additions & 1 deletion
@@ -9,7 +9,6 @@
 parser=argparse.ArgumentParser()
 parser.add_argument('--data-root',type=str,default='/eva_data/zchin/vrdl_hw2_data',help='trainig image saving directory')
 parser.add_argument('--ratio',type=float,default=0.2,help='validation data ratio')
-parser.add_argument('--data-dir',type=str,default='./data',help='directory to save train valid split results')
 args=parser.parse_args()

 if __name__=='__main__':
