Masking Sensitive Data in Apache Airflow 2.4.3: Snowflake Connection Private Key
I’m using Apache Airflow 2.4.3 and trying to securely store a Snowflake connection with a private key inside the connection’s extras JSON field. I want to mask the sensitive private_key_content field in the Airflow UI and logs.
What I’ve Tried
I’ve set the following environment variables in my deployment:
AIRFLOW__CORE__HIDE_SENSITIVE_VAR_CONN_FIELDS=True
AIRFLOW__CORE__SENSITIVE_VAR_CONN_NAMES=private_key_content
The Problem
Despite these settings, the private_key_content value still shows unmasked in the UI under the connection details.
Questions
- Is masking of custom extras fields like private_key_content fully supported in Airflow 2.4.3?
- Does the SENSITIVE_VAR_CONN_NAMES config reliably mask arbitrary keys inside connection extras in this version?
- Are there any known workarounds or best practices for securely handling such sensitive data in connection extras with Airflow 2.4.3?
Any guidance or experience would be greatly appreciated!
In Apache Airflow 2.4.3, masking custom extras fields like private_key_content is not fully supported due to fundamental limitations in the version’s masking implementation. While you can configure AIRFLOW__CORE__SENSITIVE_VAR_CONN_NAMES, this setting only works for specific scenarios and doesn’t provide comprehensive masking for arbitrary extras fields in the UI.
Understanding Airflow 2.x Limitations
Airflow 2.x versions (including 2.4.3) take a fundamentally different approach to masking than the newer 3.x line. Notably, 2.x predates the write-only masking model introduced in 3.x and has, by design, long allowed connection secrets to be visible to users with edit permissions.
This means that even with proper configuration, the UI masking in Airflow 2.4.3 is limited in scope. The version doesn’t have the robust masking infrastructure that was developed for Airflow 3.x versions.
Configuration Analysis: SENSITIVE_VAR_CONN_NAMES
The AIRFLOW__CORE__SENSITIVE_VAR_CONN_NAMES setting you’ve configured is documented as a comma-separated list of extra sensitive keywords to look for in variable names or a connection’s extra JSON. However, its effectiveness in Airflow 2.4.3 is limited:
- Partial implementation: The setting may work for some fields but not consistently for arbitrary extras
- UI limitations: Even when configured, the masking might not appear in all UI views
- API inconsistencies: The masking behavior may differ between UI and API responses
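For reference, the two environment variables map to these airflow.cfg entries (the extra private_key keyword below is illustrative, not required):

```ini
[core]
# Mask matching values in the UI and logs (defaults to True)
hide_sensitive_var_conn_fields = True
# Comma-separated keywords matched against variable names and connection extras keys
sensitive_var_conn_names = private_key_content,private_key
```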
Several GitHub issues demonstrate these limitations, including connections whose sensitive extras fields are saved back as literal asterisks or are not masked at all.
Known Issues with Extras Masking
Several documented issues affect sensitive data masking in Airflow 2.x:
- Issue #52301: “If a connection has an extra field with a name like ‘token’ or another sensitive keyword, its value gets masked as ‘***’ when displayed. If you edit and save the connection… the extra field will…”
- Issue #48105: “Secrets Masker also applies masking to fields not in scope”, indicating inconsistent masking behavior
- Issue #47003: specifically mentions problems with the private_key_content field in Snowflake connections
These issues show that masking of custom extras fields was unreliable in Airflow 2.4.3 and often resulted in inconsistent behavior.
Alternative Approaches for Secure Storage
Since UI masking isn’t reliable in Airflow 2.4.3, consider these alternative approaches for handling sensitive private keys:
1. External Secrets Management
Instead of storing private keys in connection extras, use external secrets management systems:
- HashiCorp Vault: Store private keys in Vault and retrieve them at runtime
- AWS Secrets Manager: Store Snowflake credentials in AWS Secrets Manager
- Azure Key Vault: For Azure deployments
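As a sketch of the Vault option (the URL, mount point, and path are placeholders, and it assumes the apache-airflow-providers-hashicorp package is installed):

```shell
# Point Airflow at Vault for connection lookups; connections are then
# resolved from Vault at runtime and never stored in the metadata DB.
export AIRFLOW__SECRETS__BACKEND="airflow.providers.hashicorp.secrets.vault.VaultBackend"
export AIRFLOW__SECRETS__BACKEND_KWARGS='{"connections_path": "connections", "mount_point": "airflow", "url": "http://vault:8200"}'
```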
2. Environment Variables
Store the private key as an environment variable and reference it in your connection:
import os

# Read the PEM-encoded private key from the environment at runtime
# instead of persisting it in the connection's extras
private_key = os.environ.get("SNOWFLAKE_PRIVATE_KEY")
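Building on that, note that the Snowflake Python connector expects the key as DER bytes rather than PEM text. A minimal sketch of the conversion (the helper name pem_to_der is mine; it requires the cryptography package, which the Snowflake provider already depends on):

```python
from typing import Optional

from cryptography.hazmat.primitives import serialization


def pem_to_der(pem_text: str, passphrase: Optional[str] = None) -> bytes:
    """Convert a PEM-encoded private key into the unencrypted PKCS#8 DER
    bytes that snowflake-connector-python accepts as `private_key`."""
    key = serialization.load_pem_private_key(
        pem_text.encode(),
        password=passphrase.encode() if passphrase else None,
    )
    return key.private_bytes(
        encoding=serialization.Encoding.DER,
        format=serialization.PrivateFormat.PKCS8,
        encryption_algorithm=serialization.NoEncryption(),
    )
```

You can then pass the resulting bytes as the private_key argument when opening the Snowflake connection, keeping the PEM text itself in an environment variable or secrets backend.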
3. Encrypted Variables
Use Airflow’s encrypted variables feature to store the private key:
from airflow.models.variable import Variable

# Retrieve the private key stored as an Airflow Variable; note that
# Variables are encrypted at rest only when a Fernet key is configured
private_key = Variable.get("snowflake_private_key")
Best Practices for Airflow 2.4.3
Given the limitations in Airflow 2.4.3, follow these best practices:
- Avoid storing sensitive data in connection extras when possible
- Use external secrets management for production deployments
- Implement proper access controls to limit who can view connections
- Consider upgrading to a newer version if security is critical
- Use environment variables for sensitive configuration
Temporary Workarounds
If you must use connection extras in Airflow 2.4.3:
- Custom UI modifications: Override the connection display template to hide sensitive fields
- Proxy layer: Deploy a proxy that filters out sensitive data from API responses
- Custom authentication: Implement additional authentication layers for sensitive connections
Conclusion
Airflow 2.4.3 has significant limitations in masking sensitive data in connection extras. The SENSITIVE_VAR_CONN_NAMES configuration exists but doesn’t provide reliable masking for arbitrary fields like private_key_content.
Key recommendations:
- Avoid storing sensitive data in connection extras in Airflow 2.x versions
- Use external secrets management systems instead
- Consider upgrading to Airflow 3.x for better security features
- Implement additional security layers like proper access controls and encryption
For production environments handling sensitive data like Snowflake private keys, the most secure approach is to use dedicated secrets management systems rather than relying on Airflow’s built-in masking capabilities in older versions.
Sources
- Apache Airflow Vulnerability Lets Read-Only Users Access Sensitive Data
- Secrets Masker also applies masking to fields not in scope · Issue #48105
- Apache Airflow Bug Leaks Sensitive Details to Users with Read-Only Access
- Editing connection with sensitive extra field saves literal asterisks · Issue #52301
- Snowflake provider: The private_key_content field is no longer a multi-line text input in Connections · Issue #47003
- Beneath the Surface: A Closer Look at 4 Airflow Internals
- Apache Airflow Vulnerability Exposes Sensitive Details to Read-Only Users
- Apache Airflow: Sensitive configuration values are not masked in the logs by default