Commit 669e130

Update README and data location.
1 parent 14217df commit 669e130

4 files changed

+28
-22
lines changed

README.md

Lines changed: 3 additions & 3 deletions
@@ -2,16 +2,16 @@
 
 With businesses moving online, fraud and abuse in online systems is constantly increasing as well. Traditionally, rule-based fraud detection systems are used to combat online fraud, but these rely on a static set of rules created by human experts. This project uses machine learning to create models for fraud detection that are dynamic, self-improving and maintainable. Importantly, they can scale with the online business.
 
-Specifically, we show how to use Amazon SageMaker to train supervised and unsupervised machine learning models on historical transactions, so that they can predict the likelihood of incoming transactions being fraudulent or not. We also show how to deploy the models, once trained, to a REST API that can be integrated into an existing business software infracture. This project includes a demonstration of this process using a public, anonymized credit card transactions [dataset provided by ULB](https://www.kaggle.com/mlg-ulb/creditcardfraud), but can be easily modified to work with custom labelled or unlaballed data provided as a relational table in csv format.
+Specifically, we show how to use Amazon SageMaker to train supervised and unsupervised machine learning models on historical transactions, so that they can predict the likelihood of incoming transactions being fraudulent or not. We also show how to deploy the models, once trained, to a REST API that can be integrated into an existing business software infrastructure. This project includes a demonstration of this process using a public, anonymized credit card transactions [dataset provided by ULB](https://www.kaggle.com/mlg-ulb/creditcardfraud), but can be easily modified to work with custom labelled or unlaballed data provided as a relational table in csv format.
 
 ## Getting Started
 
 To get started quickly, use the following quick-launch link to launch a CloudFormation Stack create form and follow the instructions below to deploy the resources in this project.
 
 | Region | Stack |
 | ---- | ---- |
-|US East (N. Virginia) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://sagemaker-solutions-us-east-1.s3-us-east-1.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
-|US East (Ohio) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-2.console.aws.amazon.com/cloudformation/home?region=us-east-2#/stacks/create/review?templateURL=https://sagemaker-solutions-us-east-2.s3-us-east-2.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
+|US East (N. Virginia) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://sagemaker-solutions-us-east-1.s3.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
+|US East (Ohio) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-2.console.aws.amazon.com/cloudformation/home?region=us-east-2#/stacks/create/review?templateURL=https://sagemaker-solutions-us-east-2.s3.us-east-2.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
 |US West (Oregon) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-west-2.console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/create/review?templateURL=https://sagemaker-solutions-us-west-2.s3-us-west-2.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
 
 
deployment/fraud-detection-sagemaker-notebook-instance.yaml

Lines changed: 5 additions & 4 deletions
@@ -54,8 +54,9 @@ Resources:
 sudo -u ec2-user -i <<EOF
 cd /home/ec2-user/SageMaker
 # copy source files
-aws s3 sync s3://${SolutionsS3BucketNamePrefix}-${AWS::Region}/${SolutionName}/ .
-unzip creditcardfraud.zip -d ./source/notebooks/
+aws s3 sync s3://${SolutionsS3BucketNamePrefix}-${AWS::Region}/${SolutionName}/source .
+unzip ./creditcardfraud.zip -d ./notebooks/
+rm ./creditcardfraud.zip
 # create stack_outputs.json with stack resources that are required in notebook(s)
 touch stack_outputs.json
 echo '{' >> stack_outputs.json
@@ -80,9 +81,9 @@ Resources:
 set -e
 # perform following actions as ec2-user
 sudo -u ec2-user -i <<EOF
-/home/ec2-user/anaconda3/envs/python3/bin/python /home/ec2-user/SageMaker/source/env_setup.py --force --log-level DEBUG
+/home/ec2-user/anaconda3/envs/python3/bin/python /home/ec2-user/SageMaker/env_setup.py --force --log-level DEBUG
 EOF
 Outputs:
   SageMakerNotebook:
     Description: "Opens the Jupyter notebook to get started with model training"
-    Value: !Sub "https://${SolutionPrefix}-notebook-instance.notebook.${AWS::Region}.sagemaker.aws/notebooks/source/notebooks/sagemaker_fraud_detection.ipynb"
+    Value: !Sub "https://${SolutionPrefix}-notebook-instance.notebook.${AWS::Region}.sagemaker.aws/notebooks/notebooks/sagemaker_fraud_detection.ipynb"
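
The repeated `notebooks/notebooks` segment in the new output value can look like a mistake. It follows from the layout change in this commit: the first `notebooks` is the Jupyter route for opening a notebook, and the second is the folder that now sits directly under `/home/ec2-user/SageMaker`. A minimal Python sketch of the same string substitution, where the function name and the example prefix are hypothetical and not part of the template:

```python
def notebook_url(solution_prefix: str, region: str,
                 notebook_path: str = "notebooks/sagemaker_fraud_detection.ipynb") -> str:
    """Mirror the template's !Sub string for the SageMakerNotebook output.

    notebook_path is relative to the Jupyter root (/home/ec2-user/SageMaker
    on the notebook instance). All argument values are caller-supplied.
    """
    host = f"{solution_prefix}-notebook-instance.notebook.{region}.sagemaker.aws"
    # "/notebooks/" is the Jupyter route; notebook_path follows it.
    return f"https://{host}/notebooks/{notebook_path.lstrip('/')}"


# Hypothetical prefix, used only for illustration.
url = notebook_url("sagemaker-soln-fraud", "us-east-2")
```

Before this commit the path under the Jupyter root was `source/notebooks/...`; after it the `source/` level is gone because the sync copies the contents of `source` rather than its parent.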

source/env_setup.py

Lines changed: 6 additions & 1 deletion
@@ -4,6 +4,7 @@
 import subprocess
 import logging
 import sys
+from zipfile import ZipFile
 
 CURRENT_FILE = Path(__file__).resolve()
 CURRENT_FOLDER = CURRENT_FILE.parent
@@ -15,7 +16,7 @@
 # Common setup
 
 def get_sagemaker_mode() -> str:
-    stack_outputs_file = Path(CURRENT_FOLDER.parent, 'stack_outputs.json')
+    stack_outputs_file = Path(CURRENT_FOLDER, 'stack_outputs.json')
     with open(stack_outputs_file) as f:
         outputs = json.load(f)
     sagemaker_mode = outputs['SagemakerMode']
@@ -167,6 +168,10 @@ def env_setup_notebook_instance() -> None:
 def env_setup_studio() -> None:
     logging.info('Starting environment setup for Studio.')
     py_exec = get_executable()
+    logging.info('Extracting data.')
+    with ZipFile(f"{CURRENT_FOLDER}/creditcardfraud.zip", 'r') as zf:
+        zf.extractall(path=f"{CURRENT_FOLDER}/notebooks")
+
     logging.info('Upgrading pip packages.')
     bash(f"""
     export PIP_DISABLE_PIP_VERSION_CHECK=1
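
The Studio path added above extracts the dataset in Python, while the notebook-instance template does the same with `unzip` followed by `rm`. A minimal sketch of one helper that could cover both behaviours is shown below. The name `extract_data` and the `remove_archive` flag are hypothetical and do not exist in `env_setup.py`; only `zipfile.ZipFile` and `pathlib.Path` from the standard library are used.

```python
from pathlib import Path
from typing import List
from zipfile import ZipFile


def extract_data(archive: Path, target: Path, remove_archive: bool = False) -> List[str]:
    """Extract every member of archive into target and return the member names.

    Hypothetical helper: with remove_archive=True it also deletes the zip,
    matching the `rm ./creditcardfraud.zip` step in the template.
    """
    if not archive.exists():
        raise FileNotFoundError(f"Could not find archive at {archive}")
    target.mkdir(parents=True, exist_ok=True)
    with ZipFile(archive, "r") as zf:
        names = zf.namelist()
        zf.extractall(path=target)
    if remove_archive:
        archive.unlink()
    return names
```

A call such as `extract_data(CURRENT_FOLDER / "creditcardfraud.zip", CURRENT_FOLDER / "notebooks")` would correspond to the added lines in `env_setup_studio`, assuming `CURRENT_FOLDER` is the `Path` defined at the top of the module.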
Lines changed: 14 additions & 14 deletions
@@ -1,21 +1,21 @@
-from dotenv import load_dotenv
-import os
+import json
 from pathlib import Path
 
 from package import utils
 
 current_folder = utils.get_current_folder(globals())
-env_location = '../../../../.env'
-dotenv_filepath = Path(current_folder, env_location).resolve()
-assert dotenv_filepath.exists(), "Could not find .env file at {}".format(str(dotenv_filepath))
+cfn_stack_outputs_filepath = Path(current_folder, '../../../stack_outputs.json').resolve()
+assert cfn_stack_outputs_filepath.exists(), "Could not find stack_outputs.json file at {}".format(
+    str(cfn_stack_outputs_filepath))
 
-load_dotenv()
+with open(cfn_stack_outputs_filepath) as f:
+    cfn_stack_outputs = json.load(f)
 
-STACK_NAME = os.environ['FRAUD_STACK_NAME']
-AWS_ACCOUNT_ID = os.environ['AWS_ACCOUNT_ID']
-AWS_REGION = os.environ['AWS_REGION']
-SAGEMAKER_IAM_ROLE = os.environ['SAGEMAKER_IAM_ROLE']
-SOLUTIONS_S3_BUCKET = os.environ['SOLUTIONS_S3_BUCKET']
-
-MODEL_DATA_S3_BUCKET = os.environ['MODEL_DATA_S3_BUCKET']
-REST_API_GATEWAY = os.environ['REST_API_GATEWAY']
+STACK_NAME = cfn_stack_outputs['FraudStackName']
+SOLUTION_PREFIX = cfn_stack_outputs['SolutionPrefix']
+AWS_ACCOUNT_ID = cfn_stack_outputs['AwsAccountId']
+AWS_REGION = cfn_stack_outputs['AwsRegion']
+SAGEMAKER_IAM_ROLE = cfn_stack_outputs['IamRole']
+MODEL_DATA_S3_BUCKET = cfn_stack_outputs['ModelDataBucket']
+SOLUTIONS_S3_BUCKET = cfn_stack_outputs['SolutionsS3Bucket']
+REST_API_GATEWAY = cfn_stack_outputs['RESTAPIGateway']
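
The new config module reads each setting with a direct dictionary lookup, so a `stack_outputs.json` that lacks one key fails with a bare `KeyError` on whichever line is reached first. The existence check also uses `assert`, which Python skips when run with `-O`. A minimal sketch of a loader that reports all missing keys at once is given below. `load_stack_outputs` and `REQUIRED_KEYS` are hypothetical names; the key list is copied from the lookups in this diff.

```python
import json
from pathlib import Path
from typing import Dict, Iterable, Union

# Keys read by the config module in this commit.
REQUIRED_KEYS = (
    'FraudStackName', 'SolutionPrefix', 'AwsAccountId', 'AwsRegion',
    'IamRole', 'ModelDataBucket', 'SolutionsS3Bucket', 'RESTAPIGateway',
)


def load_stack_outputs(path: Union[str, Path],
                       required: Iterable[str] = REQUIRED_KEYS) -> Dict[str, str]:
    """Load stack_outputs.json and check that every required key is present."""
    path = Path(path).resolve()
    if not path.exists():
        # Explicit exception instead of assert, so the check survives `python -O`.
        raise FileNotFoundError(f"Could not find stack_outputs.json file at {path}")
    with open(path) as f:
        outputs = json.load(f)
    missing = [key for key in required if key not in outputs]
    if missing:
        raise KeyError(f"stack_outputs.json is missing keys: {missing}")
    return outputs
```

With such a helper, the module-level constants could be assigned from the returned dictionary exactly as in the diff, for example `STACK_NAME = outputs['FraudStackName']`.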
