Commit ba832b3

Feature/integ 1064 multi checkpoint (#106)
1 parent dedba05 commit ba832b3

26 files changed

+400
-404
lines changed

CHANGELOG.md

Lines changed: 10 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -8,6 +8,16 @@ and this project adheres to [Semantic Versioning](https://semver.org/spec/v2.0.0
88
The intended audience of this file is for py42 consumers -- as such, changes that don't affect
99
how a consumer would use the library (e.g. adding unit tests, updating documentation, etc) are not captured here.
1010

11+
## Unreleased
12+
13+
### Changed
14+
15+
- `-i` (`--incremental`) has been removed, use `-c` (`--use-checkpoint`) with a string name for the checkpoint instead.
16+
17+
### Added
18+
19+
- Profile can now save multiple alert and file event checkpoints. The name of the checkpoint to be used for a given query should be passed to `-c` (`--use-checkpoint`).
20+
1121
## 0.7.3 - 2020-06-23
1222

1323
### Fixed

README.md

Lines changed: 0 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -61,10 +61,6 @@ To see all your profiles, do:
6161
code42 profile list
6262
```
6363

64-
A separate profile would be needed in order to keep the incremental checkpoints separate for different queries.
65-
i.e User needs to maintain separate profiles for file event queries and saved search queries as only one checkpoint
66-
is supported per profile.
67-
6864
## Security Data and Alerts
6965

7066
Using the CLI, you can query for security events and alerts and send them to three possible destination types:

docs/commands/alerts.md

Lines changed: 6 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -30,7 +30,7 @@ Search args are shared between `print`, `write-to`, and `send-to` commands.
3030
Available choices=['FedEndpointExfiltration', 'FedCloudSharePermissions', 'FedFileTypeMismatch'].
3131
* `--description`: Filter alerts by description. Does fuzzy search by default.
3232
* `-f`, `--format` (optional): The format used for outputting file events. Available choices= [CEF,JSON,RAW-JSON].
33-
* `-i`, `--incremental` (optional): Only get file events that were not previously retrieved.
33+
* `-c`, `--use-checkpoint` (optional): Get only file events that were not previously retrieved by writing the timestamp of the last event retrieved to a named checkpoint.
3434

3535
## print
3636

@@ -73,9 +73,12 @@ code42 alerts send-to <server> <optional-args> <args>
7373

7474
## clear-checkpoint
7575

76-
Remove the saved file event checkpoint from 'incremental' (-i) mode.
76+
Arguments:
77+
* `name`: The name to save this checkpoint as for later reuse.
78+
79+
Remove the saved file event checkpoint from 'use-checkpoint' (-c) mode.
7780

7881
Usage:
7982
```bash
80-
code42 alerts clear-checkpoint
83+
code42 alerts clear-checkpoint <name>
8184
```

docs/commands/securitydata.md

Lines changed: 7 additions & 4 deletions
Original file line numberDiff line numberDiff line change
@@ -6,7 +6,7 @@ Search args are shared between `print`, `write-to`, and `send-to` commands.
66

77
* `--advanced-query` (optional): A raw JSON file events query. Useful for when the provided query parameters do not
88
satisfy your requirements. WARNING: Using advanced queries is incompatible with other query-building args.
9-
* `-b`, `--begin` (required except for non-first runs in incremental mode): The beginning of the date range in which to
9+
* `-b`, `--begin` (required except for non-first runs in checkpoint mode): The beginning of the date range in which to
1010
look for file events, can be a date/time in yyyy-MM-dd (UTC) or yyyy-MM-dd HH:MM:SS (UTC+24-hr time) format where
1111
the 'time' portion of the string can be partial (e.g. '2020-01-01 12' or '2020-01-01 01:15') or a short value
1212
representing days (30d), hours (24h) or minutes (15m) from current time.
@@ -26,7 +26,7 @@ Search args are shared between `print`, `write-to`, and `send-to` commands.
2626
* `--tab-url` (optional): Limits events to be exposure events with one of these destination tab URLs.
2727
* `--include-non-exposure` (optional): Get all events including non-exposure events.
2828
* `-f`, `--format` (optional): The format used for outputting file events. Available choices= [CEF,JSON,RAW-JSON].
29-
* `-i`, `--incremental` (optional): Only get file events that were not previously retrieved.
29+
* `-c`, `--use-checkpoint` (optional): Get only file events that were not previously retrieved by writing the timestamp of the last event retrieved to a named checkpoint.
3030

3131

3232
## print
@@ -70,9 +70,12 @@ code42 security-data send-to <server> <optional-server-args> <args>
7070

7171
## clear-checkpoint
7272

73-
Remove the saved file event checkpoint from 'incremental' (-i) mode.
73+
Arguments:
74+
* `name`: The name to save this checkpoint as for later reuse.
75+
76+
Remove the saved file event checkpoint from 'use-checkpoint' (-c) mode.
7477

7578
Usage:
7679
```bash
77-
code42 security-data clear-checkpoint
80+
code42 security-data clear-checkpoint <name>
7881
```

docs/userguides/siemexample.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -32,7 +32,7 @@ code42 security-data send-to "https://syslog.example.com:514" -p TCP --profile p
3232
```
3333

3434
Note that it is best practice to use a separate profile when executing a scheduled task. This way, it is harder to
35-
accidentally mess up your stored checkpoints by running `--incremental` adhoc queries.
35+
accidentally mess up your stored checkpoints by running `--use-checkpoint` adhoc queries.
3636

3737
This query will send to the syslog server only the new security event data since the previous request.
3838

src/code42cli/cmds/alerts/extraction.py

Lines changed: 2 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -35,8 +35,8 @@ def extract(sdk, profile, output_logger, args):
3535
send-to: uses a logger that sends logs to a server.
3636
args: Command line args used to build up alert query filters.
3737
"""
38-
store = AlertCursorStore(profile.name) if args.incremental else None
39-
handlers = create_handlers(sdk, AlertExtractor, output_logger, store)
38+
store = AlertCursorStore(profile.name) if args.use_checkpoint else None
39+
handlers = create_handlers(sdk, AlertExtractor, output_logger, store, args.use_checkpoint)
4040
extractor = AlertExtractor(sdk, handlers)
4141
if args.advanced_query:
4242
extractor.extract_advanced(args.advanced_query)

src/code42cli/cmds/alerts/main.py

Lines changed: 5 additions & 7 deletions
Original file line numberDiff line numberDiff line change
@@ -11,9 +11,7 @@
1111
RuleType,
1212
)
1313
from code42cli.cmds.search_shared.cursor_store import AlertCursorStore
14-
from code42cli.cmds.search_shared.args import (
15-
create_incompatible_search_args, SEARCH_FOR_ALERTS
16-
)
14+
from code42cli.cmds.search_shared.args import create_incompatible_search_args, SEARCH_FOR_ALERTS
1715

1816

1917
class MainAlertsSubcommandLoader(SubcommandLoader):
@@ -55,20 +53,20 @@ def load_commands(self):
5553

5654
clear = Command(
5755
self.CLEAR_CHECKPOINT,
58-
u"Remove the saved alert checkpoint from 'incremental' (-i) mode.",
56+
u"Remove the saved alert checkpoint from 'use-checkpoint' (-c) mode.",
5957
u"{} {}".format(usage_prefix, u"clear-checkpoint <optional-args>"),
6058
handler=clear_checkpoint,
6159
)
6260

6361
return [print_func, write, send, clear]
6462

6563

66-
def clear_checkpoint(sdk, profile):
64+
def clear_checkpoint(sdk, profile, cursor_name):
6765
"""Removes the stored checkpoint that keeps track of the last alert retrieved for the given profile.
6866
To use, run `code42 alerts clear-checkpoint`.
69-
This affects `incremental` mode by causing it to behave like it has never been run before.
67+
This affects `use-checkpoint` mode by resetting the checkpoint, causing it to behave like it has never been run before.
7068
"""
71-
AlertCursorStore(profile.name).replace_stored_cursor_timestamp(None)
69+
AlertCursorStore(profile.name).delete(cursor_name)
7270

7371

7472
def _validate_args(args):

src/code42cli/cmds/search_shared/args.py

Lines changed: 6 additions & 6 deletions
Original file line numberDiff line numberDiff line change
@@ -39,8 +39,9 @@ def _saved_search_args():
3939
saved_search = ArgConfig(
4040
u"--saved-search",
4141
help=u"Limits events to those discoverable with the saved search "
42-
u"filters for the saved search with the given ID.\n"
43-
u"WARNING: Using saved search is incompatible with other query-building args.")
42+
u"filters for the saved search with the given ID.\n"
43+
u"WARNING: Using saved search is incompatible with other query-building args.",
44+
)
4445
return {u"saved_search": saved_search}
4546

4647

@@ -65,10 +66,9 @@ def create_incompatible_search_args(search_for=None):
6566
help=u"The end of the date range in which to look for {0}, "
6667
u"argument format options are the same as --begin.".format(search_for),
6768
),
68-
u"incremental": ArgConfig(
69-
u"-i",
70-
u"--incremental",
71-
action=u"store_true",
69+
u"use_checkpoint": ArgConfig(
70+
u"-c",
71+
u"--use-checkpoint",
7272
help=u"Only get {0} that were not previously retrieved.".format(search_for),
7373
),
7474
}
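
To illustrate the flag change in this hunk, here is a minimal sketch using the standard library's `argparse` directly. It is not the project's `ArgConfig` wrapper, and the program name and checkpoint name are made up for the example. With `action="store_true"` removed, `-c` / `--use-checkpoint` consumes a string value (the checkpoint name) rather than acting as a boolean switch.

```python
import argparse

# Hypothetical parser; the real CLI builds its arguments through ArgConfig.
parser = argparse.ArgumentParser(prog="example")

# Old form (removed in this commit): a boolean switch.
#   parser.add_argument("-i", "--incremental", action="store_true")
# New form: the option takes the name of the checkpoint to use.
parser.add_argument(
    "-c",
    "--use-checkpoint",
    help="Only get events that were not previously retrieved.",
)

# With a name supplied, the attribute holds that name.
args = parser.parse_args(["-c", "nightly"])
print(args.use_checkpoint)

# Without the option, the attribute is None, so `if args.use_checkpoint` is falsy.
no_checkpoint = parser.parse_args([])
print(no_checkpoint.use_checkpoint)
```

Because the attribute is either a name or `None`, a truthiness check such as the one in `extraction.py` (`if args.use_checkpoint else None`) keeps working unchanged.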

src/code42cli/cmds/search_shared/cursor_store.py

Lines changed: 59 additions & 105 deletions
Original file line numberDiff line numberDiff line change
@@ -1,126 +1,80 @@
11
from __future__ import with_statement
22

3-
import sqlite3
43
import os
4+
from os import path
55

6+
from code42cli.errors import Code42CLIError
67
from code42cli.util import get_user_project_path
78

89

10+
class Cursor(object):
11+
def __init__(self, location):
12+
self._location = location
13+
self._name = path.basename(location)
14+
15+
@property
16+
def name(self):
17+
return self._name
18+
19+
@property
20+
def value(self):
21+
with open(self._location) as checkpoint:
22+
return checkpoint.read()
23+
24+
925
class BaseCursorStore(object):
10-
_PRIMARY_KEY_COLUMN_NAME = u"cursor_id"
11-
_timestamp_column_name = u"OVERRIDE"
12-
_primary_key = u"OVERRIDE"
13-
14-
def __init__(self, db_table_name, db_file_path=None):
15-
self._table_name = db_table_name
16-
if db_file_path is None:
17-
db_path = get_user_project_path(u"db")
18-
db_file = u"file_event_checkpoints.db"
19-
db_file_path = os.path.join(db_path, db_file)
20-
21-
self._connection = sqlite3.connect(db_file_path)
22-
if self._is_empty():
23-
self._init_table()
24-
25-
def _get(self, columns, primary_key):
26-
query = u"SELECT {0} FROM {1} WHERE {2}=?"
27-
query = query.format(columns, self._table_name, self._PRIMARY_KEY_COLUMN_NAME)
28-
with self._connection as conn:
29-
cursor = conn.cursor()
30-
cursor.execute(query, (primary_key,))
31-
return cursor.fetchall()
32-
33-
def _set(self, column_name, new_value, primary_key):
34-
query = u"UPDATE {0} SET {1}=? WHERE {2}=?".format(
35-
self._table_name, column_name, self._PRIMARY_KEY_COLUMN_NAME
36-
)
37-
with self._connection as conn:
38-
conn.execute(query, (new_value, primary_key))
39-
40-
def _delete(self, primary_key):
41-
query = u"DELETE FROM {0} WHERE {1}=?".format(
42-
self._table_name, self._PRIMARY_KEY_COLUMN_NAME
43-
)
44-
with self._connection as conn:
45-
conn.execute(query, (primary_key,))
46-
47-
def _row_exists(self, primary_key):
48-
query = u"SELECT * FROM {0} WHERE {1}=?"
49-
query = query.format(self._table_name, self._PRIMARY_KEY_COLUMN_NAME)
50-
with self._connection as conn:
51-
cursor = conn.cursor()
52-
cursor.execute(query, (primary_key,))
53-
query_result = cursor.fetchone()
54-
if not query_result:
55-
return False
56-
return True
57-
58-
def _drop_table(self):
59-
drop_query = u"DROP TABLE {0}".format(self._table_name)
60-
with self._connection as conn:
61-
conn.execute(drop_query)
62-
63-
def _is_empty(self):
64-
table_count_query = u"""
65-
SELECT COUNT(name)
66-
FROM sqlite_master
67-
WHERE type='table' AND name=?
68-
"""
69-
with self._connection as conn:
70-
cursor = conn.cursor()
71-
cursor.execute(table_count_query, (self._table_name,))
72-
query_result = cursor.fetchone()
73-
if query_result:
74-
return int(query_result[0]) <= 0
75-
76-
def _init_table(self):
77-
columns = u"{0}, {1}".format(self._PRIMARY_KEY_COLUMN_NAME, self._timestamp_column_name)
78-
create_table_query = u"CREATE TABLE {0} ({1})".format(self._table_name, columns)
79-
with self._connection as conn:
80-
conn.execute(create_table_query)
81-
82-
def _insert_new_row(self):
83-
insert_query = u"INSERT INTO {0} VALUES(?, null)".format(self._table_name)
84-
with self._connection as conn:
85-
conn.execute(insert_query, (self._primary_key,))
86-
87-
def get_stored_cursor_timestamp(self):
88-
"""Gets the last stored date observed timestamp."""
89-
rows = self._get(self._timestamp_column_name, self._primary_key)
90-
if rows and rows[0]:
91-
return rows[0][0]
26+
def __init__(self, dir_path):
27+
self._dir_path = dir_path
9228

93-
def replace_stored_cursor_timestamp(self, new_date_observed_timestamp):
29+
def get(self, cursor_name):
30+
"""Gets the last stored date observed timestamp."""
31+
try:
32+
location = path.join(self._dir_path, cursor_name)
33+
with open(location) as checkpoint:
34+
return float(checkpoint.read())
35+
except FileNotFoundError:
36+
return None
37+
38+
def replace(self, cursor_name, new_timestamp):
9439
"""Replaces the last stored date observed timestamp with the given one."""
95-
self._set(
96-
column_name=self._timestamp_column_name,
97-
new_value=new_date_observed_timestamp,
98-
primary_key=self._primary_key,
99-
)
40+
location = path.join(self._dir_path, cursor_name)
41+
with open(location, "w") as checkpoint:
42+
return checkpoint.write(str(new_timestamp))
43+
44+
def delete(self, cursor_name):
45+
"""Removes a single cursor from the store."""
46+
try:
47+
location = path.join(self._dir_path, cursor_name)
48+
os.remove(location)
49+
except FileNotFoundError:
50+
msg = "No checkpoint named {0} exists for this profile.".format(cursor_name)
51+
raise Code42CLIError(msg)
10052

10153
def clean(self):
102-
"""Removes profile cursor data from store."""
103-
self._delete(self._primary_key)
54+
"""Removes all cursors from this store."""
55+
cursors = self.get_all_cursors()
56+
for cursor in cursors:
57+
self.delete(cursor.name)
10458

59+
def get_all_cursors(self):
60+
"""Returns a list of all cursors stored in this directory (which is typically scoped to a profile)."""
61+
dir_contents = os.listdir(self._dir_path)
62+
return [Cursor(f) for f in dir_contents if self._is_file(f)]
10563

106-
class FileEventCursorStore(BaseCursorStore):
107-
_timestamp_column_name = u"insertionTimestamp"
64+
def _is_file(self, node_name):
65+
return path.isfile(path.join(self._dir_path, node_name))
10866

109-
def __init__(self, profile_name, db_file_path=None):
110-
self._primary_key = profile_name
111-
super(FileEventCursorStore, self).__init__(u"file_event_checkpoints", db_file_path)
112-
if not self._row_exists(self._primary_key):
113-
self._insert_new_row()
11467

68+
class FileEventCursorStore(BaseCursorStore):
69+
def __init__(self, profile_name):
70+
dir_path = get_user_project_path(u"file_event_checkpoints", profile_name)
71+
super(FileEventCursorStore, self).__init__(dir_path)
11572

116-
class AlertCursorStore(BaseCursorStore):
117-
_timestamp_column_name = u"createdAt"
11873

119-
def __init__(self, profile_name, db_file_path=None):
120-
self._primary_key = profile_name
121-
super(AlertCursorStore, self).__init__(u"alert_checkpoints", db_file_path)
122-
if not self._row_exists(self._primary_key):
123-
self._insert_new_row()
74+
class AlertCursorStore(BaseCursorStore):
75+
def __init__(self, profile_name):
76+
dir_path = get_user_project_path(u"alert_checkpoints", profile_name)
77+
super(AlertCursorStore, self).__init__(dir_path)
12478

12579

12680
def get_file_event_cursor_store(profile_name):
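
The storage change above replaces one SQLite row per profile with one file per named checkpoint. The following is a simplified, self-contained sketch of that idea rather than the project's code: the class name `CheckpointDir`, the temporary directory, and the checkpoint names are assumptions made for illustration, and it targets Python 3 (where `FileNotFoundError` exists).

```python
import os
import tempfile


class CheckpointDir(object):
    """Hypothetical stand-in for the store in the diff: one file per named checkpoint."""

    def __init__(self, dir_path):
        self._dir_path = dir_path

    def _location(self, name):
        # Always resolve a checkpoint name against the store's directory.
        return os.path.join(self._dir_path, name)

    def get(self, name):
        """Return the stored timestamp as a float, or None if the checkpoint does not exist."""
        try:
            with open(self._location(name)) as checkpoint:
                return float(checkpoint.read())
        except FileNotFoundError:
            return None

    def replace(self, name, timestamp):
        """Overwrite (or create) the checkpoint file with the new timestamp."""
        with open(self._location(name), "w") as checkpoint:
            checkpoint.write(str(timestamp))

    def delete(self, name):
        """Remove a single checkpoint file."""
        os.remove(self._location(name))

    def names(self):
        """List checkpoint names, skipping anything that is not a regular file."""
        entries = os.listdir(self._dir_path)
        return sorted(n for n in entries if os.path.isfile(self._location(n)))


# Usage: two independent checkpoints under what would be one profile's directory.
store = CheckpointDir(tempfile.mkdtemp())
store.replace("file_events_daily", 1593000000.5)
store.replace("saved_search", 1593003600.0)

first = store.get("file_events_daily")
store.delete("saved_search")
missing = store.get("saved_search")
remaining = store.names()
```

The sketch builds every path by joining the directory and the checkpoint name. In the diff, `get_all_cursors` passes only the bare file name to `Cursor(f)`, so reading `Cursor.value` would presumably resolve against the current working directory rather than the store directory. Joining the path before constructing the `Cursor` would avoid that.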

src/code42cli/cmds/search_shared/enums.py

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -1,4 +1,4 @@
1-
IS_INCREMENTAL_KEY = u"incremental"
1+
IS_CHECKPOINT_KEY = u"use_checkpoint"
22

33

44
class OutputFormat(object):
