README.md: 23 additions & 3 deletions
@@ -3,7 +3,7 @@
The aim of this repository is to show a baseline model for text classification by implementing an LSTM-based model in PyTorch. To provide a better understanding of the model, a dataset of Tweets provided by Kaggle is used.

## 1. Data
-As it was mentioned above, the dataset we are woking with is about Tweets regarding fake news. The head of the dataset looks like this:
+As mentioned above, the dataset used here consists of Tweets regarding fake news. The ``raw`` dataset contains some unnecessary columns, which are removed in the preprocessing step; in the end, we work with a dataset whose head looks like this:

|id| text | target |
| ------------- | ------------- | ------------- |
@@ -12,7 +12,7 @@ As it was mentioned above, the dataset we are woking with is about Tweets regard
| 3 | INEC Office in Abia Set Ablaze - http://t.co/3ImaomknnA| 1 |
| 4 | Building the perfect tracklist to life leave the streets ablaze | 0 |

-This dataset can be found in ``data/tweets.csv``.
+This raw dataset can be found in ``data/tweets.csv``.

## 2. The model
As it was already commented, the aim of this repository is to provide a base line model for text classification. To that end, the model is based on two stacked LSTM layers followed by two linear layers. The dataset is preprocessed with a token-based technique, and the resulting tokens are then passed to an embedding layer. The following image describes the pipeline of the model.
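For reference, the pipeline described in section 2 (tokens, embedding layer, two stacked LSTM layers, two linear layers) can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the repository's code: the names `build_vocab`, `encode` and `TweetClassifier`, the whitespace tokenizer, the layer sizes, the ReLU between the linear layers and the sigmoid output are all hypothetical choices made for the example.

```python
import torch
import torch.nn as nn


def build_vocab(texts):
    """Map each distinct lowercase token to an integer id (0 is reserved for padding)."""
    vocab = {}
    for text in texts:
        for token in text.lower().split():
            if token not in vocab:
                vocab[token] = len(vocab) + 1
    return vocab


def encode(text, vocab, max_len):
    """Turn a text into a fixed-length list of token ids, truncating or padding with 0."""
    # Assumption: unknown tokens fall back to 0, the padding id, to keep the sketch short.
    ids = [vocab.get(token, 0) for token in text.lower().split()][:max_len]
    return ids + [0] * (max_len - len(ids))


class TweetClassifier(nn.Module):
    """Embedding -> two stacked LSTM layers -> two linear layers (hypothetical sizes)."""

    def __init__(self, vocab_size, embed_dim=32, hidden_dim=64):
        super().__init__()
        # vocab_size + 1 rows because id 0 is the padding entry.
        self.embedding = nn.Embedding(vocab_size + 1, embed_dim, padding_idx=0)
        self.lstm = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        self.fc1 = nn.Linear(hidden_dim, hidden_dim)
        self.fc2 = nn.Linear(hidden_dim, 1)

    def forward(self, token_ids):
        embedded = self.embedding(token_ids)         # (batch, seq_len, embed_dim)
        _, (hidden, _) = self.lstm(embedded)         # hidden: (num_layers, batch, hidden_dim)
        features = torch.relu(self.fc1(hidden[-1]))  # final hidden state of the top LSTM layer
        return torch.sigmoid(self.fc2(features))     # (batch, 1), probability that target == 1


# Two sample rows taken from the table in section 1.
texts = [
    "INEC Office in Abia Set Ablaze - http://t.co/3ImaomknnA",
    "Building the perfect tracklist to life leave the streets ablaze",
]
vocab = build_vocab(texts)
batch = torch.tensor([encode(t, vocab, max_len=12) for t in texts])
model = TweetClassifier(vocab_size=len(vocab))
probs = model(batch)
```

The model here is untrained, so `probs` only demonstrates the tensor shapes flowing through the layers; the values carry no meaning until the network is fitted on the labelled Tweets.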
@@ -21,13 +21,33 @@ As it was already commented, the aim of this repository is to provide a base lin
Working on

## 4. How to use it
+The model can be executed easily by typing:
```
-python -B main.py
+python main.py
```
You can define some hyperparameters manually, such as: