Varonis Threat Labs disclosed “Dataflow Rider,” a technique that lets attackers with basic write access to Google Cloud Storage buckets replace Cloud Dataflow templates or Python UDFs to hijack pipelines and execute arbitrary code on worker nodes. The attack can enable data exfiltration, credential theft (service‑account tokens), data manipulation, and lateral movement; Google VRP judged this intended behavior, so organizations must tightly restrict and monitor bucket access. #DataflowRider #CloudDataflow
Keypoints
- Varonis discovered a novel attack technique—Dataflow Rider—where adversaries modify Dataflow pipeline components stored in Google Cloud Storage to hijack running pipelines.
- The attack requires only bucket-level write permissions (e.g., storage.objects.create/update) and succeeds because Dataflow does not validate the integrity of template or UDF files in buckets.
- By altering job templates or Python UDFs, attackers can execute arbitrary code on Dataflow workers, exfiltrate service‑account access tokens via the metadata endpoint, and manipulate or steal processed data.
- Both batch and streaming Dataflow jobs are affected. To take effect, a replacement must be made before the job starts or within the first minutes of execution, or it must be in place when a new worker starts (including autoscaled workers).
- Attackers commonly obtain required access via stolen user credentials or long‑lived service account keys and can use stolen tokens for lateral movement and privilege escalation.
- Google VRP classified the behavior as intended and does not plan a fix; recommended mitigations include restricting bucket access, using dedicated buckets for pipeline components, alerting on unauthorized uploads, and applying VPC‑SC perimeters.
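Because the attack depends on Dataflow not validating the integrity of template or UDF files, one compensating control is an out-of-band hash check. The following is a minimal sketch, not a Varonis or Google recommendation: it assumes the pipeline components have been copied to a local directory, and that a known-good manifest of SHA-256 digests was recorded at review time. The function names `build_manifest` and `find_tampered` are hypothetical and introduced only for illustration.

```python
import hashlib
from pathlib import Path
from typing import Dict, Iterable, List


def sha256_of(path: Path) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def build_manifest(component_dir: Path, names: Iterable[str]) -> Dict[str, str]:
    """Record known-good digests for reviewed pipeline components."""
    return {name: sha256_of(component_dir / name) for name in names}


def find_tampered(component_dir: Path, manifest: Dict[str, str]) -> List[str]:
    """Return component names that are missing or whose digest has changed."""
    tampered = []
    for name, expected in sorted(manifest.items()):
        candidate = component_dir / name
        if not candidate.is_file() or sha256_of(candidate) != expected:
            tampered.append(name)
    return tampered
```

A check like this would typically run before a job is launched and again on a schedule while it runs. It only detects a replacement after the fact, so it complements, rather than replaces, restricting write access to the bucket.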
MITRE Techniques
- [N/A ] No MITRE ATT&CK technique identifiers were referenced in the report.
Indicators of Compromise
- [Service account access tokens ] Exfiltrated from the Dataflow worker metadata endpoint – example: a stolen service‑account access token and related service account information (e.g., for the default Compute Engine service account)
- [Cloud Storage objects (pipeline components) ] Maliciously modified files stored in customer-managed GCS buckets that define job behavior – example: pipeline-template.yaml, transform_udf.py, and other pipeline component files
- [Bucket permissions / IAM actions ] Permissive bucket ACLs and granted permissions enabling abuse – example: storage.objects.create, storage.objects.update, storage.buckets.get (used by attackers to enumerate and overwrite components)
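The indicators above suggest alerting on object writes to buckets that hold pipeline components. The sketch below illustrates one way to do that under several assumptions: audit log entries for the bucket are being collected and exported as one JSON object per line, and the fields `protoPayload.methodName`, `protoPayload.resourceName`, and `protoPayload.authenticationInfo.principalEmail` are laid out as in Cloud Audit Logs. Verify the field names against real entries before relying on it. The function name `suspicious_writes`, the bucket name, and the allowlist are hypothetical.

```python
import json
from typing import Dict, Iterable, List, Set

# Write actions named in the indicators above.
WRITE_METHODS = {"storage.objects.create", "storage.objects.update"}


def suspicious_writes(
    log_lines: Iterable[str],
    bucket: str,
    allowed_principals: Set[str],
) -> List[Dict[str, str]]:
    """Return object writes to `bucket` made by principals not on the allowlist."""
    prefix = f"projects/_/buckets/{bucket}/objects/"
    findings = []
    for line in log_lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines in the export
        payload = json.loads(line).get("protoPayload", {})
        if payload.get("methodName") not in WRITE_METHODS:
            continue
        resource = payload.get("resourceName", "")
        if not resource.startswith(prefix):
            continue
        principal = payload.get("authenticationInfo", {}).get("principalEmail", "")
        if principal not in allowed_principals:
            findings.append(
                {
                    "principal": principal,
                    "method": payload["methodName"],
                    "object": resource[len(prefix):],
                }
            )
    return findings
```

An allowlist is used here because a dedicated component bucket should have very few legitimate writers (for example, a single deployment identity), so any other writer is worth reviewing.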
Read more: https://www.varonis.com/blog/dataflow-rider