HBASE-29368 [Feature] Key management for encryption at rest (MVP changes) #7618
Open: haridsv wants to merge 1 commit into apache:master from haridsv:HBASE-29368-feature
+13,902 −316
This PR implements the key management feature for HBase encryption at rest, building on the API surface and refactoring introduced in the precursor PR (apache#7584).

Jira: [HBASE-29368](https://issues.apache.org/jira/browse/HBASE-29368)

Design doc: https://docs.google.com/document/d/1ToW_rveXHXUc1F6eFNQfu5LOeMAjzgq6FcYUDbdZrSM/edit?usp=sharing

Discussion thread: https://lists.apache.org/thread/q7g2rr2xcgl64rkn9j3mnokf6fvohp2y

Cumulative changes from the feature branch, corresponding to the following sub-tasks:

1. [Phase 1: Key caching and minimal service](https://issues.apache.org/jira/browse/HBASE-29402)
2. [Phase 2: Integrate key management with existing encryption](https://issues.apache.org/jira/browse/HBASE-29495)
3. [Phase 2: Migration path from current encryption to managed encryption](https://issues.apache.org/jira/browse/HBASE-29617)
4. [Phase 2: Admin API to trigger for System Key rotation detection as an alternative to failover.](https://issues.apache.org/jira/browse/HBASE-29643)
5. [Phase 3: Additional key management APIs](https://issues.apache.org/jira/browse/HBASE-29666)

This feature introduces a comprehensive key management system that extends HBase's existing encryption-at-rest capabilities. The implementation provides enterprise-grade key lifecycle management with support for key rotation, hierarchical namespace resolution for key lookup, key caching, and improved integration with key management systems to handle key life cycles and external key changes.

**1. Managed Keys Infrastructure**

- Introduction of the `ManagedKeyProvider` interface for pluggable key provider implementations, along the lines of the existing `KeyProvider` interface.
- The new interface can also return Data Encryption Keys (DEKs) and considerably more detail about the keys.
- Comes with the default `ManagedKeyStoreKeyProvider` implementation using Java KeyStore, similar to the existing `KeyStoreKeyProvider`.
- Enables logical key isolation for multi-tenant scenarios through custodian identifiers (future use cases) and the special default global custodian.
- Hierarchical namespace resolution for DEKs with automatic fallback: explicit CF namespace attribute → constructed `table/family` namespace → table name → global namespace

**2. System Key (STK) Management**

- Cluster-wide system key for wrapping data encryption keys (DEKs). This is equivalent to the existing master key, but better managed and more operations-friendly.
- Secure storage in HDFS with support for automatic key rotation during boot up.
- Admin API to trigger key rotation and propagation to all RegionServers without needing a rolling restart.
- Preserves the current double-wrapping architecture: DEKs wrapped by STK, STK sourced from external KMS

**3. KeymetaAdmin API**

- `enableKeyManagement(keyCust, keyNamespace)` - Enable key management for a custodian/namespace pair
- `getManagedKeys(keyCust, keyNamespace)` - Query key status and metadata
- `rotateSTK()` - Check for and propagate new system keys
- `disableKeyManagement(keyCust, keyNamespace)` - Disable all the keys for a custodian/namespace
- `disableManagedKey(keyCust, keyNamespace, keyMetadataHash)` - Disable a specific key
- `rotateManagedKey(keyCust, keyNamespace)` - Rotate the active key
- `refreshManagedKeys(keyCust, keyNamespace)` - Refresh from external KMS to validate all the keys.
- Internal cache management operations for convenience and meeting SLAs.

**4. Persistent Key Metadata Storage**

- New system table `hbase:keymeta` for storing key metadata and state, which acts as an `L2` cache.
- Tracks key lifecycle: `ACTIVE`, `INACTIVE`, `DISABLED`, `FAILED` states
- Stores wrapped DEKs and metadata for key lookup without depending on external KMS.
- Optimized for high-priority access with in-memory column families
- Key metadata tracking with cryptographic hashes for integrity verification

**5. Multi-Layer Caching**

- L1: In-memory Caffeine cache on RegionServers for hot key data
- L2: Keymeta table for persistent key metadata that is shared across all RegionServers.
- L3: Dynamic lookup from external KMS as fallback when not found in L2.
- Cache invalidation mechanism for key rotation scenarios

**6. HBase Shell Integration**

- `enable_key_management` - Enable key management for a custodian and namespace
- `show_key_status` - Display key status and metadata
- `rotate_stk` - Trigger system key rotation
- `disable_key_management` - Disable key management for a custodian and namespace
- `disable_managed_key` - Disable a specific key
- `rotate_managed_key` - Rotate the active key
- `refresh_managed_keys` - Refresh all keys for a custodian and namespace

Properties of the change:

- **Backward Compatibility:** Changes are fully compatible with existing encryption-at-rest configuration
- **Gradual step-by-step migration:** Well-defined migration path from existing configuration to new configuration
- **Performance:** Minimal overhead through efficient caching and lazy key loading
- **Security:** Cryptographic verification of key metadata, secure key wrapping
- **Operability:** Administrative tools for key life cycle and cache management
- **Extensibility:** Plugin architecture for custom key provider implementations
- **Testing:** Comprehensive unit and integration test coverage

The implementation follows a layered architecture:

1. **Provider Layer:** Pluggable `ManagedKeyProvider` for KMS integration
2. **Management Layer:** `KeymetaAdmin` API for administrative operations
3. **Persistence Layer:** `KeymetaTableAccessor` for metadata storage
4. **Cache Layer:** `ManagedKeyDataCache` and `SystemKeyCache` for performance
5. **Service Layer:** Coprocessor endpoints for client-server communication

I would particularly appreciate feedback on:

1. **API Design:** Is the `KeymetaAdmin` API intuitive and complete for common key management scenarios?
2. **Security Model:** Does the double-wrapping architecture (DEK wrapped by STK, STK from KMS) provide appropriate security guarantees?
3. **Performance:** Are there potential bottlenecks in the caching strategy or table access patterns?
4. **Operational Aspects:** Are the administrative commands sufficient for the needs of operations and monitoring?
5. **Testing Coverage:** Are there additional test scenarios we should cover?
6. **Documentation:** Is the design document clear? What additional documentation would be helpful?
7. **Compatibility:** Any concerns about interaction with existing HBase features?

After incorporating community feedback, I plan to:

1. Address any issues identified during review
2. Implement the work identified for future phases
3. Add additional documentation to the reference guide

This PR introduces changes across multiple modules, so I recommend focusing on these **core components** first:

**Core Architecture:**

1. Design document (linked above) - architectural overview
2. `ManagedKeyProvider`, `KeymetaAdmin`, `ManagedKeyData` interfaces (hbase-common)
3. `ManagedKeys.proto` - protocol definitions
4. `HMaster` and misc. procedure changes - initialization of `keymeta` in a predictable order
5. `FixedFileTrailer` + reader/writer changes - encode/decode additional encryption key in store files

**Key Implementation:**

1. `KeymetaAdminImpl`, `KeymetaTableAccessor`, `ManagedKeyUtils`, `SystemKeyManager`, `SystemKeyAccessor` - admin operations and persistence
2. `ManagedKeyDataCache`, `SystemKeyCache` - caching layer
3. `SecurityUtil` - encryption context creation

**Client & Shell:**

1. `KeymetaAdminClient` - client API
2. Shell commands and Ruby wrappers

**Tests & Examples:**

1. `TestKeymetaAdminImpl`, `TestManagedKeymeta` - for usage patterns
2. `key_provider_keymeta_migration_test.rb` - E2E migration steps
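For reviewers who want a concrete picture of the namespace fallback listed under Managed Keys Infrastructure, here is a minimal sketch of the lookup order. It is not code from this PR: the class `KeyNamespaceFallbackSketch`, its methods, and the `"*"` placeholder for the global namespace are all hypothetical, and it assumes that resolution simply picks the first candidate namespace for which a key exists.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.function.Predicate;

/** Hypothetical illustration of the DEK namespace fallback order; not the PR's code. */
final class KeyNamespaceFallbackSketch {
  /** Placeholder for the global namespace; the real constant may differ. */
  static final String GLOBAL_NAMESPACE = "*";

  /** Builds the candidate namespaces from most specific to least specific. */
  static List<String> candidates(String explicitCfNamespace, String table, String family) {
    List<String> result = new ArrayList<>();
    if (explicitCfNamespace != null && !explicitCfNamespace.isEmpty()) {
      result.add(explicitCfNamespace);   // 1. explicit CF namespace attribute
    }
    result.add(table + "/" + family);    // 2. constructed table/family namespace
    result.add(table);                   // 3. table name
    result.add(GLOBAL_NAMESPACE);        // 4. global namespace
    return result;
  }

  /** Returns the first candidate that has a key (assumed fallback rule), if any. */
  static Optional<String> resolve(List<String> candidates, Predicate<String> hasKey) {
    return candidates.stream().filter(hasKey).findFirst();
  }

  public static void main(String[] args) {
    List<String> order = candidates(null, "orders", "cf1");
    System.out.println(order);
    // Only the table-level and global namespaces have keys in this toy setup.
    System.out.println(resolve(order, Set.of("orders", GLOBAL_NAMESPACE)::contains));
  }
}
```

In this toy setup no key exists under `orders/cf1`, so the lookup falls through to the table-level namespace before it would reach the global one.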
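The L1 → L2 → L3 lookup described under Multi-Layer Caching can be sketched roughly as follows. This is an illustration under stated assumptions, not the PR's `ManagedKeyDataCache`: a `ConcurrentHashMap` stands in for the Caffeine cache, a plain `Map` stands in for the `hbase:keymeta` table, a `Function` stands in for the external KMS, and the write-back of an L3 result into L2 is assumed rather than confirmed by the description. All names are hypothetical.

```java
import java.util.Map;
import java.util.Optional;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.Function;

/** Hypothetical read-through lookup across three layers; not the PR's cache code. */
final class LayeredKeyLookupSketch {
  private final Map<String, byte[]> l1 = new ConcurrentHashMap<>(); // stand-in for Caffeine
  private final Map<String, byte[]> l2;                             // stand-in for hbase:keymeta
  private final Function<String, Optional<byte[]>> l3;              // stand-in for external KMS

  LayeredKeyLookupSketch(Map<String, byte[]> l2, Function<String, Optional<byte[]>> l3) {
    this.l2 = l2;
    this.l3 = l3;
  }

  /** Looks a key up in L1, then L2, then L3, populating the faster layers on the way back. */
  Optional<byte[]> get(String keyId) {
    byte[] hit = l1.get(keyId);
    if (hit != null) {
      return Optional.of(hit);
    }
    hit = l2.get(keyId);
    if (hit == null) {
      Optional<byte[]> fromKms = l3.apply(keyId);
      if (!fromKms.isPresent()) {
        return Optional.empty();
      }
      hit = fromKms.get();
      l2.put(keyId, hit); // assumed write-back so other RegionServers can find the key
    }
    l1.put(keyId, hit);
    return Optional.of(hit);
  }

  /** Drops a key from L1 only, e.g. after a rotation; the next get() re-reads L2. */
  void invalidate(String keyId) {
    l1.remove(keyId);
  }

  public static void main(String[] args) {
    LayeredKeyLookupSketch cache = new LayeredKeyLookupSketch(
      new ConcurrentHashMap<>(), id -> Optional.of(new byte[] { 42 }));
    System.out.println(cache.get("dek-1").isPresent());
  }
}
```

Invalidating only L1 keeps the shared layer authoritative: after a rotation, a RegionServer re-reads the current state from the persistent layer instead of contacting the KMS again.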
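To make the "DEKs wrapped by STK" part of the security-model question more tangible, the following sketch wraps and unwraps a data key with a system key using the JDK's built-in `AESWrap` cipher. This is only a stand-in: HBase has its own key-wrapping code, and this PR's wire format is not shown here. The class and method names are hypothetical, and the STK is generated locally instead of being sourced from an external KMS.

```java
import java.security.GeneralSecurityException;
import javax.crypto.Cipher;
import javax.crypto.KeyGenerator;
import javax.crypto.SecretKey;

/** Hypothetical DEK-under-STK wrapping using JDK AES key wrap; not HBase's format. */
final class DekWrappingSketch {
  /** Generates a random 128-bit AES key (used here for both STK and DEK). */
  static SecretKey newAesKey() {
    try {
      KeyGenerator gen = KeyGenerator.getInstance("AES");
      gen.init(128);
      return gen.generateKey();
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Wraps the DEK under the STK so that only the wrapped bytes need to be stored. */
  static byte[] wrap(SecretKey stk, SecretKey dek) {
    try {
      Cipher cipher = Cipher.getInstance("AESWrap");
      cipher.init(Cipher.WRAP_MODE, stk);
      return cipher.wrap(dek);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  /** Recovers the DEK from its wrapped form using the same STK. */
  static SecretKey unwrap(SecretKey stk, byte[] wrapped) {
    try {
      Cipher cipher = Cipher.getInstance("AESWrap");
      cipher.init(Cipher.UNWRAP_MODE, stk);
      return (SecretKey) cipher.unwrap(wrapped, "AES", Cipher.SECRET_KEY);
    } catch (GeneralSecurityException e) {
      throw new IllegalStateException(e);
    }
  }

  public static void main(String[] args) {
    SecretKey stk = newAesKey(); // in the real design this comes from the external KMS
    SecretKey dek = newAesKey();
    byte[] wrapped = wrap(stk, dek);
    System.out.println(java.util.Arrays.equals(dek.getEncoded(), unwrap(stk, wrapped).getEncoded()));
  }
}
```

The point of the pattern is that persistent storage only ever holds the wrapped DEK bytes, so reading store files is not enough to decrypt data without access to the system key.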
💔 -1 overall
This message was automatically generated.
This PR supersedes PR #7421, which originally contained most of the changes in this PR as well as those in PR #7584.