Using Data Job Properties vs Secrets
This article outlines when and how you should use Data Job Properties or Secrets. While both mechanisms can be used somewhat interchangeably, there are certain things you should be aware of:
- Properties are used to store state and non-sensitive data and are generally faster to access and modify. If you need to overwrite a value often, sometimes several times during a single data job execution, properties are the way to go. They are stored in plain text in the VDK Control Service database.
- Secrets are generally fast to access (somewhat slower than Properties), but slow to modify, as they are encrypted on storage and decrypted on retrieval. They are best suited for storing sensitive data: secrets, passwords, credentials, tokens, API keys, etc. They are stored in an encrypted state in a secure storage, for example a HashiCorp Vault instance.
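The split described above can be sketched as follows. This is a minimal illustration, not the VDK implementation: `InMemoryJobInput` is a hypothetical stand-in that only mimics the `get_all_properties`, `set_all_properties` and `get_secret` methods used later in this article, so the example can run without a Control Service. The key names `last_run_date` and `run_count` are also assumptions made for the example.

```python
from datetime import date


class InMemoryJobInput:
    """Hypothetical stand-in for the job input, for local experiments only."""

    def __init__(self, properties=None, secrets=None):
        self._properties = dict(properties or {})
        self._secrets = dict(secrets or {})

    def get_all_properties(self):
        return dict(self._properties)

    def set_all_properties(self, properties):
        self._properties = dict(properties)

    def get_secret(self, name, default_value=None):
        return self._secrets.get(name, default_value)


def run(job_input):
    # Sensitive value: read from Secrets, never copied into Properties or logs.
    api_key = job_input.get_secret("api_key")
    if api_key is None:
        raise ValueError("Secret 'api_key' is not set for this job")

    # Frequently updated, non-sensitive state: kept in Properties.
    properties = job_input.get_all_properties()
    properties["last_run_date"] = str(date.today())
    properties["run_count"] = int(properties.get("run_count", 0)) + 1
    job_input.set_all_properties(properties)


job_input = InMemoryJobInput(secrets={"api_key": "dummy-key"})
run(job_input)
run(job_input)
print(job_input.get_all_properties()["run_count"])  # prints 2
```

The state that changes on every run lives in Properties, while the credential is only ever read from Secrets. Keeping the two apart avoids both slow secret writes and plain-text storage of sensitive values.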
You can use the "vdk properties" command to store and retrieve properties via the command line. To set a property, use the "--set" option:
vdk properties -n my-job -t my-team --set "key1" "value1"
You can get the value of a single property with the "--get" option:
vdk properties -n my-job -t my-team --get "key1"
Or get all the properties via the "--list" option:
vdk properties -n my-job -t my-team --list
Finally, you can delete a property, using the "--delete" option:
In a data job, you can access Job Properties via the JobInput's properties methods. In the following example we get all the properties, modify some of them and store them back, to save the last date when we processed data:
import logging
from datetime import date


def run(job_input):
    # get the properties
    properties = job_input.get_all_properties()
    current_date = str(date.today())
    logging.info("Current date is %s", current_date)
    if 'last_ingested_timestamp' in properties:
        logging.info("Last ingested timestamp is %s", properties['last_ingested_timestamp'])
    if ('last_ingested_timestamp' not in properties) or current_date != properties['last_ingested_timestamp']:
        logging.info("Getting data from Influx")
        # some very complex processing goes here...
        # update the property value and store it
        properties['last_ingested_timestamp'] = current_date
        job_input.set_all_properties(properties)
    else:
        logging.info("Skipped ingestion")

You can use the "vdk secrets" command to store and retrieve secrets via the command line. If you are using the vdk cli on a private/secure console, you can directly set a secret via the following command:
vdk secrets -n my-job -t my-team --set "api_key" "<your API Key goes here>"
Alternatively, you can pass just the key for your secret to the command. You will then be prompted to enter the value, so it is not kept in your console's history:
vdk secrets -n my-job -t my-team --set "api_key"
You can get the value of a single secret with the "--get" option:
vdk secrets -n my-job -t my-team --get "key1"
Or get all the secrets via the "--list" option:
vdk secrets -n my-job -t my-team --list
Finally, you can delete a secret, using the "--delete" option:
In a data job, you can access Job Secrets via the JobInput's secrets methods. In the following example we'll get the value of a single secret and use it to make an authenticated REST call:
import requests
from datetime import date, timedelta

from vdk.api.job_input import IJobInput


def run(job_input: IJobInput):
    # Get the API Key from the Job Secrets
    api_key = job_input.get_secret('api_key')
    # Get yesterday's date
    yesterday_date = date.today() - timedelta(days=1)
    # Get the data
    url = "https://newsapi.org/v2/everything"
    params = {
        "q": "Taylor Swift",
        "from": yesterday_date.strftime("%Y-%m-%d"),
        "sortBy": "popularity",
        "language": "en",
        "apiKey": api_key,
    }
    response = requests.get(url, params=params)
    response.raise_for_status()
    data = response.json()
    # Process the data...

Congratulations, you've reached the end of this tutorial!