
Commit 8b87a3a

Add better integration with SageMaker capabilities and improve regional support.
1 parent 669e130 commit 8b87a3a

19 files changed, +1657 -182 lines changed

.gitignore

Lines changed: 3 additions & 0 deletions
@@ -0,0 +1,3 @@
+.DS_Store
+
+.ipynb_checkpoints/

README.md

Lines changed: 11 additions & 7 deletions
@@ -10,9 +10,9 @@ To get started quickly, use the following quick-launch link to launch a CloudFor
 
 | Region | Stack |
 | ---- | ---- |
-|US East (N. Virginia) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://sagemaker-solutions-us-east-1.s3.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
-|US East (Ohio) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-2.console.aws.amazon.com/cloudformation/home?region=us-east-2#/stacks/create/review?templateURL=https://sagemaker-solutions-us-east-2.s3.us-east-2.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
-|US West (Oregon) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-west-2.console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/create/review?templateURL=https://sagemaker-solutions-us-west-2.s3-us-west-2.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
+|US East (N. Virginia) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-1.console.aws.amazon.com/cloudformation/home?region=us-east-1#/stacks/create/review?templateURL=https://sagemaker-solutions-prod-us-east-1.s3.us-east-1.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
+|US East (Ohio) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-east-2.console.aws.amazon.com/cloudformation/home?region=us-east-2#/stacks/create/review?templateURL=https://sagemaker-solutions-prod-us-east-2.s3.us-east-2.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
+|US West (Oregon) | [<img src="https://s3.amazonaws.com/cloudformation-examples/cloudformation-launch-stack.png">](https://us-west-2.console.aws.amazon.com/cloudformation/home?region=us-west-2#/stacks/create/review?templateURL=https://sagemaker-solutions-prod-us-west-2.s3.us-west-2.amazonaws.com/Fraud-detection-using-machine-learning/deployment/fraud-detection-using-machine-learning.yaml&stackName=SageMaker-Fraud-Machine-Learning) |
 
 
 ### Additional Instructions
@@ -38,7 +38,7 @@ Both of the trained models are deployed to Amazon SageMaker managed real-time en
 
 The model training and endpoint deployment is orchestrated by running a [jupyter notebook](source/notebooks/sagemaker_fraud_detection.ipynb) on a SageMaker Notebook instance. The jupyter notebook runs a demonstration of the project using the aforementioned anonymized credit card dataset that is automatically downloaded to the Amazon S3 Bucket created when you launch the solution. However, the notebook can be modified to run the project on a custom dataset in S3. The notebook instance also contains some example code that shows how to invoke the REST API for inference.
 
-In order to encapsulate the project as a stand-alone microservice, Amazon API Gateway is used to provide a REST API, that is backed by an AWS Lambda function. The Lambda function runs the [code](https://github.com/awslabs/fraud-detection-using-machine-learning/blob/master/source/fraud_detection/index.py) to preprocess incoming transactions, invoke sagemaker endpoints, merge results from both endpoints if necessary, store the model inputs and model predictions in S3 via Kinesis Firehose, and provide a response to the client.
+In order to encapsulate the project as a stand-alone microservice, Amazon API Gateway is used to provide a REST API, that is backed by an AWS Lambda function. The Lambda function runs the code necessary to preprocess incoming transactions, invoke sagemaker endpoints, merge results from both endpoints if necessary, store the model inputs and model predictions in S3 via Kinesis Firehose, and provide a response to the client.
 
 ## Data
 
@@ -78,12 +78,16 @@ We cite the following works:
 * `notebooks/`
 * `src`
 * `package`
-* `config.py`: Read in the environment variables set by cloudformation stack creation
+* `config.py`: Read in the environment variables set during the AWS CloudFormation stack creation
 * `generate_endpoint_traffic.py`: Custom script to show how to send transaction traffic to REST API for inference
 * `util.py`: Helper function and utilities
 * `sagemaker_fraud_detection.ipynb`: Orchestrates the solution. Trains the models and deploys the trained model
-* `setup/`
-* `on-start.sh`: Bash script to setup sagemaker notebook environment with necessary dependencies
+* `endpoint_demo.ipynb`: A small notebook that demonstrates how one can use the solution's endpoint to make predictions.
+* `scripts/`
+* `set_kernelspec.py`: Used to update the kernelspec name at deployment.
+* `test/`
+* Files that are used to automatically test the solution
+
 
 ## License

Lines changed: 99 additions & 0 deletions
@@ -0,0 +1,99 @@
+AWSTemplateFormatVersion: "2010-09-09"
+Description: "((SO0056)) - fraud-detection-using-machine-learning demo stack"
+Parameters:
+  SolutionPrefix:
+    Description: The name of the prefix for the solution used for naming resources.
+    Type: String
+  SolutionsBucket:
+    Description: The bucket that contains the solution files.
+    Type: String
+  SolutionName:
+    Type: String
+  ExecutionRoleArn:
+    Description: The role used when invoking the endpoint.
+    Type: String
+
+Mappings:
+  RegionMap:
+    "us-west-1":
+      "XGBoost": "746614075791.dkr.ecr.us-west-1.amazonaws.com"
+    "us-west-2":
+      "XGBoost": "246618743249.dkr.ecr.us-west-2.amazonaws.com"
+    "us-east-1":
+      "XGBoost": "683313688378.dkr.ecr.us-east-1.amazonaws.com"
+    "us-east-2":
+      "XGBoost": "257758044811.dkr.ecr.us-east-2.amazonaws.com"
+    "ap-northeast-1":
+      "XGBoost": "354813040037.dkr.ecr.ap-northeast-1.amazonaws.com"
+    "ap-northeast-2":
+      "XGBoost": "366743142698.dkr.ecr.ap-northeast-2.amazonaws.com"
+    "ap-southeast-1":
+      "XGBoost": "121021644041.dkr.ecr.ap-southeast-1.amazonaws.com"
+    "ap-southeast-2":
+      "XGBoost": "783357654285.dkr.ecr.ap-southeast-2.amazonaws.com"
+    "ap-south-1":
+      "XGBoost": "720646828776.dkr.ecr.ap-south-1.amazonaws.com"
+    "ap-east-1":
+      "XGBoost": "651117190479.dkr.ecr.ap-east-1.amazonaws.com"
+    "ca-central-1":
+      "XGBoost": "341280168497.dkr.ecr.ca-central-1.amazonaws.com"
+    "cn-north-1":
+      "XGBoost": "450853457545.dkr.ecr.cn-north-1.amazonaws.com.cn"
+    "cn-northwest-1":
+      "XGBoost": "451049120500.dkr.ecr.cn-northwest-1.amazonaws.com.cn"
+    "eu-central-1":
+      "XGBoost": "492215442770.dkr.ecr.eu-central-1.amazonaws.com"
+    "eu-north-1":
+      "XGBoost": "662702820516.dkr.ecr.eu-north-1.amazonaws.com"
+    "eu-south-1":
+      "XGBoost": "048378556238.dkr.ecr.eu-south-1.amazonaws.com"
+    "eu-west-1":
+      "XGBoost": "141502667606.dkr.ecr.eu-west-1.amazonaws.com"
+    "eu-west-2":
+      "XGBoost": "764974769150.dkr.ecr.eu-west-2.amazonaws.com"
+    "eu-west-3":
+      "XGBoost": "659782779980.dkr.ecr.eu-west-3.amazonaws.com"
+    "me-south-1":
+      "XGBoost": "801668240914.dkr.ecr.me-south-1.amazonaws.com"
+    "sa-east-1":
+      "XGBoost": "737474898029.dkr.ecr.sa-east-1.amazonaws.com"
+    "us-gov-west-1":
+      "XGBoost": "414596584902.dkr.ecr.us-gov-west-1.amazonaws.com"
+
+Resources:
+  FraudClassificationModel:
+    Type: "AWS::SageMaker::Model"
+    Properties:
+      ExecutionRoleArn: !Ref ExecutionRoleArn
+      PrimaryContainer:
+        Image: !Sub
+          - "${ContainerLocation}/sagemaker-xgboost:0.90-2-cpu-py3"
+          - ContainerLocation:
+              Fn::FindInMap: [RegionMap, !Ref "AWS::Region", "XGBoost"]
+        ModelDataUrl: !Sub "s3://${SolutionsBucket}/${SolutionName}/artifacts/xgboost-model.tar.gz"
+      ModelName: !Sub "${SolutionPrefix}-demo"
+  FraudClassificationEndpointConfig:
+    Type: "AWS::SageMaker::EndpointConfig"
+    Properties:
+      ProductionVariants:
+        - InitialInstanceCount: 1
+          InitialVariantWeight: 1.0
+          InstanceType: ml.m5.xlarge
+          ModelName: !GetAtt FraudClassificationModel.ModelName
+          VariantName: !GetAtt FraudClassificationModel.ModelName
+      EndpointConfigName: !Sub "${SolutionPrefix}-demo"
+    Metadata:
+      cfn_nag:
+        rules_to_suppress:
+          - id: W1200
+            reason: Demo endpoint not given a KmsID
+  FraudClassificationEndpoint:
+    Type: "AWS::SageMaker::Endpoint"
+    Properties:
+      EndpointName: !Sub "${SolutionPrefix}-demo"
+      EndpointConfigName: !GetAtt FraudClassificationEndpointConfig.EndpointConfigName
+
+Outputs:
+  EndpointName:
+    Description: Name of the demo XGBoost fraud classification endpoint
+    Value: !GetAtt FraudClassificationEndpoint.EndpointName
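
The `Image` property in the template above is built by looking up the region's ECR registry in `RegionMap` and substituting it into the image name. The following is a minimal Python sketch of that lookup, not part of the commit. It copies only two entries from the map, and the helper names (`check_registry`, `image_uri`) and the registry-host pattern are assumptions made for illustration. The check flags stray whitespace and a host region that differs from its map key, both of which are easy slips in a hand-maintained map.

```python
import re

# Two entries copied from the RegionMap in the template; the rest are omitted here.
REGION_MAP = {
    "us-east-1": "683313688378.dkr.ecr.us-east-1.amazonaws.com",
    "us-west-2": "246618743249.dkr.ecr.us-west-2.amazonaws.com",
}

# Image name and tag used by the template's !Sub expression.
IMAGE_TAG = "sagemaker-xgboost:0.90-2-cpu-py3"

# Assumed shape of a registry host: <12-digit account>.dkr.ecr.<region>.amazonaws.com[.cn]
_REGISTRY = re.compile(r"^(\d{12})\.dkr\.ecr\.([a-z0-9-]+)\.amazonaws\.com(\.cn)?$")


def check_registry(region, registry):
    """Return a list of problems found in one RegionMap entry (empty if none)."""
    problems = []
    if registry != registry.strip():
        problems.append("surrounding whitespace")
    match = _REGISTRY.match(registry.strip())
    if match is None:
        problems.append("not an ECR registry host")
    elif match.group(2) != region:
        problems.append(f"host region {match.group(2)!r} != key {region!r}")
    return problems


def image_uri(region, region_map=REGION_MAP):
    """Mimic Fn::FindInMap followed by !Sub for the model image."""
    registry = region_map[region]  # KeyError for an unmapped region
    problems = check_registry(region, registry)
    if problems:
        raise ValueError(f"{region}: {', '.join(problems)}")
    return f"{registry}/{IMAGE_TAG}"
```

Running `check_registry` over every entry of the full map before deploying would catch such entries early; `123456789012` in a test entry would be a made-up account ID, not a real registry.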

deployment/fraud-detection-sagemaker-notebook-instance.yaml

Lines changed: 60 additions & 7 deletions
@@ -4,7 +4,6 @@ Description: >-
 Parameters:
   SolutionPrefix:
     Type: String
-    Default: "sm-soln-fraud-detection"
   ParentStackName:
     Type: String
   SolutionName:
@@ -17,19 +16,62 @@ Parameters:
     Type: String
   RESTAPIGateway:
     Type: String
+  TestOutputsS3Bucket:
+    Type: String
 
 Mappings:
   SolutionsS3BucketName:
     development:
-      Prefix: sagemaker-solutions-build
+      Prefix: sagemaker-solutions-devo
     release:
-      Prefix: sagemaker-solutions
+      Prefix: sagemaker-solutions-prod
+  NotebookInstanceType:
+    "af-south-1":
+      Type: ml.t3.medium
+    "ap-east-1":
+      Type: ml.t3.medium
+    "ap-northeast-1":
+      Type: ml.t3.medium
+    "ap-northeast-2":
+      Type: ml.t2.medium
+    "ap-south-1":
+      Type: ml.t2.medium
+    "ap-southeast-1":
+      Type: ml.t3.medium
+    "ap-southeast-2":
+      Type: ml.t3.medium
+    "ca-central-1":
+      Type: ml.t3.medium
+    "eu-central-1":
+      Type: ml.t3.medium
+    "eu-north-1":
+      Type: ml.t3.medium
+    "eu-south-1":
+      Type: ml.t3.medium
+    "eu-west-1":
+      Type: ml.t3.medium
+    "eu-west-2":
+      Type: ml.t3.medium
+    "eu-west-3":
+      Type: ml.t3.medium
+    "me-south-1":
+      Type: ml.t3.medium
+    "sa-east-1":
+      Type: ml.t3.medium
+    "us-east-1":
+      Type: ml.t3.medium
+    "us-east-2":
+      Type: ml.t3.medium
+    "us-west-1":
+      Type: ml.t3.medium
+    "us-west-2":
+      Type: ml.t3.medium
 
 Resources:
   BasicNotebookInstance:
     Type: 'AWS::SageMaker::NotebookInstance'
     Properties:
-      InstanceType: ml.t3.medium
+      InstanceType: !FindInMap [NotebookInstanceType, !Ref "AWS::Region", Type]
       NotebookInstanceName: !Sub "${SolutionPrefix}-notebook-instance"
       RoleArn: !Ref NotebookInstanceExecutionRoleArn
       LifecycleConfigName: !GetAtt
@@ -55,8 +97,8 @@ Resources:
               cd /home/ec2-user/SageMaker
               # copy source files
               aws s3 sync s3://${SolutionsS3BucketNamePrefix}-${AWS::Region}/${SolutionName}/source .
-              unzip ./creditcardfraud.zip -d ./notebooks/
-              rm ./creditcardfraud.zip
+              # copy test files
+              aws s3 sync s3://${SolutionsS3BucketNamePrefix}-${AWS::Region}/${SolutionName}/test ./test
               # create stack_outputs.json with stack resources that are required in notebook(s)
               touch stack_outputs.json
               echo '{' >> stack_outputs.json
@@ -66,22 +108,33 @@ Resources:
               echo ' "AwsRegion": "${AWS::Region}",' >> stack_outputs.json
               echo ' "IamRole": "${NotebookInstanceExecutionRoleArn}",' >> stack_outputs.json
               echo ' "ModelDataBucket": "${ModelDataBucket}",' >> stack_outputs.json
-              echo ' "SolutionsS3Bucket": "${SolutionsS3BucketNamePrefix}-${AWS::Region}",' >> stack_outputs.json
+              echo ' "SolutionsS3Bucket": "${SolutionsS3BucketNamePrefix}",' >> stack_outputs.json
               echo ' "RESTAPIGateway": "${RESTAPIGateway}",' >> stack_outputs.json
+              echo ' "TestOutputsS3Bucket": "${TestOutputsS3Bucket}",' >> stack_outputs.json
+              echo ' "SolutionName": "${SolutionName}",' >> stack_outputs.json
               echo ' "SagemakerMode": "NotebookInstance"' >> stack_outputs.json
               echo '}' >> stack_outputs.json
               echo "stack_outputs.json created:"
               cat stack_outputs.json
+              # Replace placeholders
+              cd /home/ec2-user/SageMaker/notebooks
+              sed -s -i 's/HUB_1P_IMAGE/conda_python3/g' *.ipynb
               EOF
             - SolutionsS3BucketNamePrefix:
                 Fn::FindInMap: [SolutionsS3BucketName, Ref: StackVersion, Prefix]
       OnStart:
         - Content:
             Fn::Base64: |
+              #!/bin/bash
               set -e
               # perform following actions as ec2-user
               sudo -u ec2-user -i <<EOF
               /home/ec2-user/anaconda3/envs/python3/bin/python /home/ec2-user/SageMaker/env_setup.py --force --log-level DEBUG
+              cd /home/ec2-user/SageMaker
+              # \$nb is escaped so the unquoted heredoc passes it to the ec2-user shell unexpanded
+              for nb in notebooks/*.ipynb; do python ./scripts/set_kernelspec.py --notebook "\$nb" --kernel "conda_python3" --display-name "conda_python3"; done
+              # Optionally run the solution's notebook if this was an integration test launch
+              nohup /home/ec2-user/anaconda3/envs/python3/bin/python ./test/run_notebook.py > ./test/run_notebook.log 2>&1 &
+              echo "OnStart script completed!"
               EOF
 Outputs:
   SageMakerNotebook:
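
The lifecycle script in this diff writes `stack_outputs.json` line by line with `echo`, so a missing comma or key only shows up when a notebook tries to read the file. As a hedged illustration, not code from the commit, a notebook could load the file and fail early as sketched below. The function name `load_stack_outputs` and the `EXPECTED_KEYS` tuple are assumptions. The tuple lists only the keys visible in the hunks shown, and the real file may contain more, since part of the script lies outside them.

```python
import json
from pathlib import Path

# Keys the lifecycle script is seen writing in the shown hunks (assumed incomplete).
EXPECTED_KEYS = (
    "AwsRegion",
    "IamRole",
    "ModelDataBucket",
    "SolutionsS3Bucket",
    "RESTAPIGateway",
    "TestOutputsS3Bucket",
    "SolutionName",
    "SagemakerMode",
)


def load_stack_outputs(path="stack_outputs.json", expected=EXPECTED_KEYS):
    """Parse stack_outputs.json and raise if the JSON is invalid or a key is missing."""
    outputs = json.loads(Path(path).read_text())  # json.JSONDecodeError on malformed output
    missing = [key for key in expected if key not in outputs]
    if missing:
        raise KeyError(f"stack_outputs.json is missing: {', '.join(missing)}")
    return outputs
```

Because the commit changes `SolutionsS3Bucket` to hold the bucket prefix without the region suffix, a consumer that needs the regional bucket name would have to append `AwsRegion` itself, for example `f"{outputs['SolutionsS3Bucket']}-{outputs['AwsRegion']}"`.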
