Use CMEK with Google Cloud Serverless for Apache Spark
By default, Google Cloud Serverless for Apache Spark encrypts customer content at
rest. Serverless for Apache Spark handles encryption for you without any
additional actions on your part. This option is called Google default encryption.
If you want to control your encryption keys, then you can use customer-managed encryption keys
(CMEKs) in Cloud KMS with CMEK-integrated services including
Serverless for Apache Spark. Using Cloud KMS keys gives you control over their protection
level, location, rotation schedule, usage and access permissions, and cryptographic boundaries.
Using Cloud KMS also lets
you track key usage, view audit logs, and
control key lifecycles.
Instead of Google owning and managing the symmetric
key encryption keys (KEKs) that protect your data, you control and
manage these keys in Cloud KMS.
After you set up your resources with CMEKs, the experience of accessing your
Serverless for Apache Spark resources is similar to using Google default encryption.
For more information about your encryption
options, see Customer-managed encryption keys (CMEK).
Use CMEK
Follow the steps in this section to use CMEK to encrypt data that Google Cloud Serverless for Apache Spark
writes to persistent disk and to the Dataproc staging bucket.
You can use Cloud Key Management Service to create and manage key rings and keys, or use
Cloud KMS Autokey for simplified
auto-creation of key rings and keys.
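As a sketch of the manual path, a key ring and a symmetric encryption key can be created with the gcloud CLI. The key ring name, key name, and region below are placeholder values, not values prescribed by this guide:

```shell
# Create a key ring in the project that runs Cloud KMS.
# "my-keyring" and the us-central1 location are example values.
gcloud kms keyrings create my-keyring \
    --project=KMS_PROJECT_ID \
    --location=us-central1

# Create a symmetric encryption key in that key ring.
gcloud kms keys create my-key \
    --project=KMS_PROJECT_ID \
    --keyring=my-keyring \
    --location=us-central1 \
    --purpose=encryption
```

Create the key in the same region where you plan to run your batch or session workloads.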
Using Cloud KMS Autokey
1. Enable Autokey on the folder that contains your project.
2. Create a key handle. When you create the key handle, specify
   dataproc.googleapis.com/Batch or dataproc.googleapis.com/Session as the
   --resource-type. Autokey generates a key and assigns it to the key handle.
3. Grant permissions to service accounts and configure your batch or session
   workload by following steps 4 and 5 in the
   Manually create and use keys section that follows.
4. When you submit your workload, specify the key handle resource name in
   place of the key resource name in the kmsKey field.
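As a sketch of the submission step, assuming the gcloud CLI's --kms-key flag populates the kmsKey field, a batch submission that passes a key handle resource name might look like the following. The project, region, key handle name, and Spark job arguments are all placeholders:

```shell
# Submit a Spark batch workload, passing the Autokey key handle resource
# name where a key resource name would otherwise go. All names are examples.
gcloud dataproc batches submit spark \
    --project=my-project \
    --region=us-central1 \
    --kms-key=projects/KMS_PROJECT_ID/locations/us-central1/keyHandles/my-key-handle \
    --class=org.apache.spark.examples.SparkPi \
    --jars=file:///usr/lib/spark/examples/jars/spark-examples.jar
```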
Manually create and use keys
Follow these steps to manually create Cloud KMS keys and use them
with Serverless for Apache Spark.
In the following steps, replace:

KMS_PROJECT_ID: the ID of the Google Cloud project that runs
Cloud KMS. This project can also be the project that runs Dataproc resources.
PROJECT_NUMBER: the project number (not the project ID) of the Google Cloud project that runs Dataproc resources.
Enable the Cloud KMS API on the project that runs Serverless for Apache Spark resources.
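You can enable the API from the gcloud CLI; PROJECT_ID below is a placeholder for the ID of the project that runs Serverless for Apache Spark resources:

```shell
# Enable the Cloud KMS API on the project that runs
# Serverless for Apache Spark resources.
gcloud services enable cloudkms.googleapis.com \
    --project=PROJECT_ID
```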
If the Dataproc Service Agent role is attached to the Dataproc Service Agent
service account, you can skip this step. Otherwise, add the
serviceusage.services.use permission to the custom role attached to the
Dataproc Service Agent service account.
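CMEK setups also require the service agent to encrypt and decrypt with the key. A minimal sketch of that grant follows; the service-agent email format shown is an assumption about your project's Dataproc Service Agent account, and the key and key ring names are the placeholder values used earlier:

```shell
# Allow the Dataproc Service Agent service account to encrypt and
# decrypt with the key. The service-agent email format is an assumption;
# PROJECT_NUMBER is the number of the project that runs Dataproc resources.
gcloud kms keys add-iam-policy-binding my-key \
    --project=KMS_PROJECT_ID \
    --keyring=my-keyring \
    --location=us-central1 \
    --member=serviceAccount:service-PROJECT_NUMBER@dataproc-accounts.iam.gserviceaccount.com \
    --role=roles/cloudkms.cryptoEncrypterDecrypter 2>/dev/null || \
gcloud kms keys add-iam-policy-binding my-key \
    --project=KMS_PROJECT_ID \
    --keyring=my-keyring \
    --location=us-central1 \
    --member=serviceAccount:service-PROJECT_NUMBER@dataproc-accounts.iam.gserviceaccount.com \
    --role=roles/cloudkms.cryptoKeyEncrypterDecrypter
```

The role that grants encrypt/decrypt access is roles/cloudkms.cryptoKeyEncrypterDecrypter; verify the exact service-agent email in your project's IAM page before running the command.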
Last updated 2026-01-21 UTC.