Empirical Validation for AI-Based Cloud Cryptojacking Detection
This repo contains the validation code for our systematic literature review:
Detecting Cryptojacking in Cloud Environments: A Systematic Review of AI-Based Defenses, Deployment Challenges, and Research Gaps
Amitabh Chakravorty, Nelly Elsayed
School of Information Technology, University of Cincinnati
International Journal of Information Security (Springer Nature)
We took representative ML models from the reviewed literature and tested them on public datasets to see how well reported results actually hold up. The short answer: they often don't. Reported accuracies are frequently inflated by data leakage and testing only on attack types the model already knows.
| Dataset | Best Model | Accuracy | F1-Score | Training Time |
|---|---|---|---|---|
| DS2OS | XGBoost | 96.26% | 0.9695 | 3.16s |
| NSL-KDD | XGBoost | 80.82% | 0.8073 | 3.67s |

| Study | Model | Dataset | Reported | Ours | Gap |
|---|---|---|---|---|---|
| Tekin et al. | RF | DS2OS | ~99.00% | 96.26% | -2.74% |
| Tiwari et al. | LightGBM | DS2OS | 98.52% | 96.26% | -2.26% |
| Safaei Pour et al. | RF | NSL-KDD | 99.60% | 77.17% | -22.43% |
| Safaei Pour et al. | Gradient Boosting | NSL-KDD | 99.60% | 78.25% | -21.35% |
Why the gap?
- DS2OS (~3% drop): Original studies kept identifier columns (timestamp, sourceID, sourceAddress) that leak the target variable. Once you remove those, the model has to actually learn behavioral patterns instead of memorizing IDs.
- NSL-KDD (~22% drop): Original studies tested on random splits of training data, so models only ever saw attack types they'd been trained on. We used the official KDDTest+ holdout, which includes novel attacks (mscan, saint, apache2, processtable) absent from training. That's a much harder test, and it's closer to what real deployment looks like.
Note: Both datasets are proxies for cloud cryptojacking. No public dataset captures actual cloud VM, container, or Kubernetes telemetry with labeled cryptomining activity. That's one of the biggest findings from our review.
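To make the NSL-KDD evaluation concrete: the key methodological choice is fitting on KDDTrain+ and scoring on the official KDDTest+ holdout, rather than on a random split of the training file. A minimal sketch of that evaluation shape (the function name `evaluate_on_holdout` is illustrative, not code from this repo):

```python
import pandas as pd
from sklearn.ensemble import RandomForestClassifier

def evaluate_on_holdout(model, train_df, test_df, label_col):
    """Fit on the training set, score on a separate holdout.

    For NSL-KDD this means fitting on KDDTrain+ and scoring on
    KDDTest+, which contains attack types absent from training.
    A random split of KDDTrain+ alone only ever tests on familiar
    attacks and overstates accuracy.
    """
    X_train = train_df.drop(columns=[label_col])
    y_train = train_df[label_col]
    X_test = test_df.drop(columns=[label_col])
    y_test = test_df[label_col]
    model.fit(X_train, y_train)
    return model.score(X_test, y_test)
```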
```text
cryptojacking-validation/
├── README.md
├── requirements.txt
├── LICENSE
│
├── notebooks/                 # Run these in order
│   ├── 1_Master.ipynb         # Setup + data download
│   ├── 2_Exploration.ipynb    # Dataset exploration
│   ├── 3_Preprocessing.ipynb  # Cleaning, SMOTE, scaling
│   └── 4_Models.ipynb         # Training + evaluation
│
├── data/
│   ├── raw/                   # Downloaded datasets
│   └── processed/             # Preprocessed arrays
│
├── models/                    # Saved .pkl files
│
├── results/
│   ├── figures/               # Plots and visualizations
│   └── metrics/               # CSV metrics
│
├── scripts/
│   └── utils.py               # Helper functions
│
└── docs/
    └── METHODOLOGY.md         # Detailed methodology
```
- Click the Colab badge above
- Run notebooks in order: `1_Master` → `2_Exploration` → `3_Preprocessing` → `4_Models`
- You'll need a Kaggle account and API key for the DS2OS download
```bash
git clone https://github.com/AmitabhCh822/cryptojacking-validation.git
cd cryptojacking-validation
python -m venv venv
source venv/bin/activate  # Windows: venv\Scripts\activate
pip install -r requirements.txt
jupyter notebook
```
- Source: Kaggle
- Samples: 357,952
- Original features: 12 (IoT device telemetry)
- After preprocessing: 5 features (sourceType, sourceLocation, destinationServiceAddress, destinationServiceType, destinationLocation)
- Removed for leakage: sourceID, sourceAddress, timestamp, value, accessedNodeAddress, accessedNodeType, operation
- Class split: 97.2% normal, 2.8% attack
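The leakage fix above amounts to dropping the identifier-like columns before training. A minimal sketch (the function name `drop_leaky_columns` is illustrative; the column names match the DS2OS Kaggle CSV):

```python
import pandas as pd

# Columns flagged as leaking the target: identifiers and
# near-unique values a model can memorize instead of learning
# behavioral patterns.
LEAKY_COLUMNS = [
    "sourceID", "sourceAddress", "timestamp", "value",
    "accessedNodeAddress", "accessedNodeType", "operation",
]

def drop_leaky_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Remove identifier-like columns so models must learn
    behavior rather than memorize record IDs."""
    return df.drop(columns=[c for c in LEAKY_COLUMNS if c in df.columns])
```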
- Source: UNB CIC
- Train: 125,973 samples (KDDTrain+)
- Test: 22,544 samples (KDDTest+, includes novel attack types not seen in training)
- Features: 41 (network traffic patterns)
- Class split: ~53% normal, ~47% attack
Our review found zero public datasets with real cloud cryptojacking telemetry. The closest options (CREMEv2, VIKRANT honeypot, AWS simulation repo) only capture host-level sequences or network flows. None of them include hypervisor metrics, Kubernetes pod stats, or container runtime telemetry. See Section 4.4.1 of the paper for the full breakdown.
We picked six model families based on what shows up most in the reviewed literature. Classical ML covers 57% of the studies we reviewed.
| Model | Config |
|---|---|
| Random Forest | 100 estimators, max_depth=20 |
| XGBoost | 100 estimators, max_depth=10, lr=0.1 |
| LightGBM | 100 estimators, max_depth=10 |
| Decision Tree | max_depth=15 |
| KNN | 5 neighbors |
| Gradient Boosting | 100 estimators, max_depth=5, lr=0.1 |
We used configs commonly reported in the literature. No automated tuning. The point here is reproducibility, not chasing the highest number.
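The fixed configurations above can be sketched as a model dictionary. This is a scikit-learn-only sketch, not the notebook code; the XGBoost and LightGBM entries are shown as comments because they require their own packages:

```python
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.tree import DecisionTreeClassifier

# Fixed, literature-derived configs; no automated tuning.
MODELS = {
    "Random Forest": RandomForestClassifier(
        n_estimators=100, max_depth=20, random_state=42),
    "Decision Tree": DecisionTreeClassifier(max_depth=15, random_state=42),
    "KNN": KNeighborsClassifier(n_neighbors=5),
    "Gradient Boosting": GradientBoostingClassifier(
        n_estimators=100, max_depth=5, learning_rate=0.1, random_state=42),
    # Requires the xgboost / lightgbm packages:
    # "XGBoost": xgboost.XGBClassifier(n_estimators=100, max_depth=10, learning_rate=0.1),
    # "LightGBM": lightgbm.LGBMClassifier(n_estimators=100, max_depth=10),
}
```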
| Model | Accuracy | F1-Score | Precision | Time |
|---|---|---|---|---|
| Random Forest | 96.26% | 0.9695 | 0.9830 | 21.92s |
| XGBoost | 96.26% | 0.9695 | 0.9830 | 3.16s |
| LightGBM | 96.26% | 0.9695 | 0.9830 | 3.78s |
| Gradient Boosting | 96.23% | 0.9693 | 0.9829 | 45.31s |
| Decision Tree | 96.26% | 0.9695 | 0.9830 | 0.55s |
| KNN | 99.21% | 0.9915 | 0.9921 | 1.19s |
All tree-based models land at basically the same accuracy (~96.26%). That happens because only 5 low-cardinality features survive after you strip out the leaky columns. KNN hits 99.21% but trades off attack recall to get there.
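The KNN caveat is why accuracy alone is not enough on an imbalanced dataset: overall accuracy can rise while recall on the attack class falls. A minimal check (the helper `attack_recall` is illustrative, not repo code):

```python
from sklearn.metrics import recall_score

def attack_recall(y_true, y_pred, attack_label=1):
    """Recall on the attack class only: of all true attacks,
    what fraction did the model flag? High overall accuracy on
    an imbalanced dataset can coexist with poor attack recall."""
    return recall_score(y_true, y_pred, pos_label=attack_label)
```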
| Model | Accuracy | F1-Score | Precision | Time |
|---|---|---|---|---|
| Random Forest | 77.17% | 0.7686 | 0.8345 | 8.61s |
| XGBoost | 80.82% | 0.8073 | 0.8527 | 3.67s |
| LightGBM | 80.35% | 0.8023 | 0.8503 | 3.44s |
| Gradient Boosting | 78.25% | 0.7802 | 0.8391 | 47.81s |
| Decision Tree | 77.66% | 0.7740 | 0.8354 | 1.01s |
| KNN | 76.76% | 0.7639 | 0.8347 | 0.12s |
Accuracies land at 77-81%, far below the ~99% typically reported in published papers. The difference comes down to one thing: we tested on attack types the models had never seen during training. That is the reality of deploying a cryptojacking detector against attackers who keep changing their techniques.
- Data leakage matters. Removing identifier columns from DS2OS drops accuracy by ~3%. Studies that kept those columns were essentially memorizing record IDs, not learning attack behavior.
- Generalization is the real test. The 22% accuracy drop on NSL-KDD shows that models tested only on familiar attacks massively overstate how well they'll work in production.
- XGBoost and LightGBM hit the sweet spot. Both achieve top accuracy in 3-4 seconds. Gradient Boosting takes 45+ seconds for no real improvement.
- Cross-dataset transfer doesn't work. 5 IoT features and 41 network features are fundamentally different modalities. You can't just move a model from one to the other.
- Class imbalance needs handling. Without SMOTE on DS2OS, models hit 97% accuracy by predicting everything as normal. Zero attack recall.
```text
Raw Data
│
├── Check for data leakage
│   └── Remove identifiers correlated with the target
│
├── Label encode categorical features
│
├── Stratified 70/30 train/test split
│
├── SMOTE on training set (DS2OS only, 1:1 ratio)
│   └── NSL-KDD is already ~53/47, no resampling needed
│
└── StandardScaler (zero mean, unit variance)
```
KDDTest+ is used as-is for the NSL-KDD test set. Resampling it would defeat the purpose of testing on novel attacks.
We report single stratified train-test splits to match how the primary studies we're comparing against ran their experiments. That means there's some partition-dependent variance. Treat the numbers as point estimates, not guarantees.
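The pipeline above can be sketched end to end. This is an illustrative condensation, not the notebook code: the notebooks use imblearn's SMOTE for oversampling, while this sketch substitutes plain random oversampling via `sklearn.utils.resample` to stay dependency-light; the function name `preprocess` is hypothetical. Note that oversampling and scaler fitting happen on the training split only, so no test-set information leaks backward.

```python
import numpy as np
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler
from sklearn.utils import resample

def preprocess(df, label_col, oversample=False, seed=42):
    """Label-encode categoricals, stratified 70/30 split, optional
    minority oversampling on the training set only, then scale."""
    df = df.copy()
    for col in df.select_dtypes(include="object").columns:
        df[col] = LabelEncoder().fit_transform(df[col])
    X, y = df.drop(columns=[label_col]), df[label_col]
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.3, stratify=y, random_state=seed)
    if oversample:
        # Stand-in for SMOTE: upsample the minority class to a 1:1 ratio.
        minority = y_tr.value_counts().idxmin()
        X_min, y_min = X_tr[y_tr == minority], y_tr[y_tr == minority]
        n_extra = int((y_tr != minority).sum()) - len(y_min)
        X_up, y_up = resample(X_min, y_min, replace=True,
                              n_samples=n_extra, random_state=seed)
        X_tr = pd.concat([X_tr, X_up])
        y_tr = pd.concat([y_tr, y_up])
    # Fit the scaler on the training set only.
    scaler = StandardScaler().fit(X_tr)
    return scaler.transform(X_tr), scaler.transform(X_te), y_tr, y_te
```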
Environment: Google Colab standard runtime, Python 3.10, scikit-learn 1.3.0, XGBoost 2.0.0, LightGBM 4.0.0.
```bibtex
@software{chakravorty2026cryptojacking_code,
  title     = {Cryptojacking Validation: AI Against Cloud Cryptojacking (Replication Package)},
  author    = {Chakravorty, Amitabh},
  year      = {2026},
  version   = {v1.0.0},
  publisher = {Zenodo},
  doi       = {10.5281/zenodo.18565269},
  url       = {https://github.com/AmitabhCh822/cryptojacking-validation}
}
```
MIT License. See LICENSE for details.
- University of Cincinnati CECH Impact Accelerator Grant
- Canadian Institute for Cybersecurity (NSL-KDD dataset)
- DS2OS dataset contributors
- Amitabh Chakravorty - chakraa4@mail.uc.edu
- Nelly Elsayed - elsayeny@ucmail.uc.edu
This repo is part of a systematic literature review. The main takeaway: high accuracy on proxy datasets doesn't mean much if it's driven by data leakage or testing only on known attacks. The field needs public cloud-specific cryptojacking datasets before any of these detection approaches can be taken seriously in production.