Enhance data governance with enforced metadata rules in Amazon DataZone
We’re excited to announce a new feature in Amazon DataZone that offers enhanced metadata governance for your subscription approval process. With this update, domain owners can define and enforce metadata requirements for data consumers when they request access to data assets. By making it mandatory for data consumers to provide specific metadata, domain owners can achieve compliance, meet organizational standards, and support audit and reporting needs.
Many organizations require additional metadata from data consumers during the subscription request process to align with internal workflows and regulatory requirements. With enforced metadata rules, domain unit owners can establish consistent governance practices across all data subscriptions. For example, financial services organizations can mandate specific compliance-related metadata when data consumers request access to sensitive financial data. Similarly, healthcare providers can enforce metadata requirements to align with regulatory standards for patient data access. This feature simplifies the approval process by guiding data consumers through completing mandatory fields and enabling data owners to make informed decisions, ensuring data access requests meet organizational policies.
By streamlining metadata governance, Amazon DataZone empowers customers to meet compliance standards, maintain audit readiness, and simplify access workflows for enhanced efficiency and control. For example, one of our customers, Bristol Myers Squibb (BMS), leverages Amazon DataZone to address their specific data governance needs. Sitikantha Sarangi, Director of Data Engineering and ML Ops Platform at BMS, says:
“At BMS, our teams have been leveraging Amazon DataZone’s comprehensive data governance solution to catalog and enable secure data subscriptions across the organization within governed project environments. With the new custom metadata enforcement feature, we now can more easily navigate our data catalog. This capability allows us to set specific requirements for data consumers, such as providing a compliance certification link or detailing data usage intentions, ensuring that access requests for sensitive data are thoroughly reviewed and approved in alignment with our standards. This customization helps us more efficiently ensure we are appropriately utilizing data while facilitating efficient, secure data sharing across teams.”
Key benefits
The feature benefits multiple stakeholders. Domain unit owners can ensure compliance by enforcing metadata requirements, granting access only after thorough reviews. Data consumers benefit from a streamlined subscription request process, guided by metadata requirements that reduce complexity. Data producers gain clarity with detailed subscription requests, enabling informed decisions aligned with required standards. Overall, the key benefits are:
- Enhanced control for domain owners – Admins and domain unit owners can now enforce additional metadata requirements on subscription requests, making sure that data consumers supply essential information for thorough review and compliance checks
- Custom workflow support – Organizations can build custom workflows for assets by capturing critical metadata from data consumers, such as AWS account IDs or project-specific identifiers, to fulfill access requests
In this post, we walk you through setting up and using metadata enforcement to create seamless, compliant data access workflows.
Solution overview
The solution in this post is composed of two parts. In the first part, we walk through the steps necessary to enforce metadata for subscription requests for managed assets. In the second part, we walk through the steps necessary to request subscriptions for custom assets.
Prerequisites
To follow this post, you should already have Amazon DataZone set up, with the respective projects to publish and consume assets. The publisher of the Retail project must have published a shipments data asset in Amazon DataZone. The domain owner or admin must have created a metadata form required for the subscription request.
This feature also supports metadata enforcement for subscription requests of a data product. For instructions on how to set this up, refer to Amazon DataZone data products.
Solution walkthrough: Enhance data governance with enforced metadata rules for Managed Assets
To perform the solution in this post, follow the steps in the next sections.
Metadata enforcement for subscription requests
To enforce metadata for subscription requests, use the following steps.
Step 1: Domain owner configures metadata requirements
Domain unit owners can configure metadata enforcement in Amazon DataZone as follows:
- On the Amazon DataZone console, choose Domain to open your domain or domain unit settings.
- Choose dataplatform, as shown in the following screenshot.
- To add metadata forms for subscription requests, on the RULES tab, choose ADD, as shown in the following screenshot.
- Provide a name for the metadata form rule.
- Choose ADD ANOTHER METADATA FORM.
- Choose from a list of available metadata forms within the domain or domain unit. Search options make navigation straightforward.
You can select multiple forms for enforcement on subscription requests.
- Choose Add, as shown in the following screenshot.
Create the metadata form rule as follows:
- On the next screen, you can specify additional settings. You can apply metadata forms across all asset types or limit them to specific asset types. Additionally, choose whether the rule applies to a specific project or all projects within the domain. After the scope is defined as shown in the screenshot, choose ADD RULE.
Note: You can enable metadata enforcement across child domains, with optional permissions that allow child domains to override the parent domain’s enforced forms. This option is available while defining the scope if the domain owner chooses All projects, as shown in the following screenshot.
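The same rule can in principle be created programmatically. The following is a minimal sketch, not the procedure used in this post: it assumes the boto3 DataZone client exposes create_rule and that the request fields are shaped as below (this reflects one reading of the CreateRule API and should be checked against the API reference). The function names and all identifiers passed in are hypothetical.

```python
def build_metadata_rule_request(domain_id, domain_unit_id, rule_name, form_types,
                                asset_types=None, project_ids=None,
                                include_child_domain_units=False):
    """Assemble keyword arguments for a subscription-request metadata rule.

    form_types is a list of (form type identifier, revision) pairs.
    Passing None for asset_types or project_ids means "all".
    Field names are assumptions based on the CreateRule API.
    """
    asset_scope = {"selectionMode": "ALL"}
    if asset_types:
        asset_scope = {"selectionMode": "SPECIFIC",
                       "specificAssetTypes": list(asset_types)}

    project_scope = {"selectionMode": "ALL"}
    if project_ids:
        project_scope = {"selectionMode": "SPECIFIC",
                         "specificProjects": list(project_ids)}

    return {
        "domainIdentifier": domain_id,
        "name": rule_name,
        # Enforce the forms when a subscription request is created.
        "action": "CREATE_SUBSCRIPTION_REQUEST",
        "target": {
            "domainUnitTarget": {
                "domainUnitId": domain_unit_id,
                "includeChildDomainUnits": include_child_domain_units,
            }
        },
        "scope": {"assetType": asset_scope, "project": project_scope},
        "detail": {
            "metadataFormEnforcementDetail": {
                "requiredMetadataForms": [
                    {"typeIdentifier": name, "typeRevision": revision}
                    for name, revision in form_types
                ]
            }
        },
    }


def create_metadata_rule(datazone_client, request):
    """Send the request; datazone_client is expected to be boto3.client("datazone")."""
    return datazone_client.create_rule(**request)
```

Keeping the request builder separate from the API call makes the scope logic (all versus specific asset types and projects) easy to inspect before anything is sent.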
Step 2: Data consumer submits subscription request
After metadata enforcement is configured, data consumers follow these steps to request access:
- To find and select an asset in the Amazon DataZone catalog, sign in to the Amazon DataZone console as a data consumer and choose the MARKETING project. In the search bar, enter the shipments data asset, as shown in the following screenshot.
- Choose SUBSCRIBE to open the subscription request modal, as shown in the following screenshot.
- Choose a project and provide a Reason for request, as shown in the following screenshot.
- Fill in the required metadata fields as specified by the domain unit. If mandatory fields are incomplete, they will be highlighted, and the submission will be disabled until resolved. After all the mandatory fields are entered, choose APPLY, as shown in the following screenshot.
- Choose Request to submit the subscription request, as shown in the following screenshot.
After the request is submitted, an event is generated in Amazon EventBridge, which you can use in custom workflows outside of Amazon DataZone as needed.
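The mandatory-field behavior described in the steps above (incomplete required fields are highlighted and submission stays disabled) can be illustrated with a small sketch. This is not how the console is implemented; the form definition structure and the field names are hypothetical and only model the rule.

```python
def missing_required_fields(form_fields, submitted):
    """Return the names of required fields that are absent or blank.

    form_fields: list of dicts such as {"name": "...", "required": True}
    submitted:   dict mapping field name to the value the consumer entered
    """
    missing = []
    for field in form_fields:
        value = submitted.get(field["name"])
        is_blank = value is None or str(value).strip() == ""
        if field.get("required") and is_blank:
            missing.append(field["name"])
    return missing


def can_submit(form_fields, submitted):
    """Submission is enabled only when no required field is missing."""
    return not missing_required_fields(form_fields, submitted)


# Hypothetical form with one required and one optional field.
example_form = [
    {"name": "compliance_certification_link", "required": True},
    {"name": "notes", "required": False},
]
print(missing_required_fields(example_form, {"notes": "quarterly report"}))
```

A custom workflow outside Amazon DataZone could apply the same kind of check to the metadata it receives before acting on a request.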
Step 3: Data producer (owner) approves the subscription
After a data consumer submits a subscription request, the data producer receives it with all metadata provided by the data consumer and reviews that metadata before deciding.
- Sign in to the Amazon DataZone console as a data producer. Choose RETAIL as the project.
- In the navigation pane, choose Incoming requests and find the subscription request. Choose View request, as shown in the following screenshot.
- Data producers can review the metadata, including document links and account IDs, to determine if the request meets compliance and workflow requirements before granting access, as shown in the following screenshot.
- Under Approval access, choose Full access to provide full access to data. For fine-grained access control, choose Approve with row or column filters. For this post, we choose Full access.
- Provide a Decision comment.
- Choose APPROVE, as shown in the following screenshot.
Step 4: Data consumer consumes the data
Now, data consumers follow these steps:
- After the subscription grants are approved and fulfilled, sign in to the Amazon DataZone console as a data consumer in the MARKETING project to query the subscribed data.
- Choose MARKETING. On the Environments tab, choose Query data through Amazon Athena, as shown in the following screenshot.
- Query the subscribed data asset shipments in Amazon Athena with the following query, as shown in the screenshot.
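For illustration only, a simple preview query against the subscribed table could also be issued programmatically. The sketch below is not the query from the walkthrough: it assumes a boto3 Athena client, and the database and workgroup names passed in are placeholders for the ones your Amazon DataZone environment provides. The helper names are hypothetical.

```python
import time


def build_preview_query(table, limit=10):
    """Build a simple SELECT for a table name made of letters, digits, underscores."""
    if not table.replace("_", "").isalnum():
        raise ValueError(f"unexpected table name: {table!r}")
    return f'SELECT * FROM "{table}" LIMIT {int(limit)}'


def run_query(athena_client, sql, database, workgroup, poll_seconds=1.0):
    """Start an Athena query, wait for it to finish, and return the result rows.

    athena_client is expected to be boto3.client("athena").
    """
    started = athena_client.start_query_execution(
        QueryString=sql,
        QueryExecutionContext={"Database": database},
        WorkGroup=workgroup,
    )
    query_id = started["QueryExecutionId"]

    # Poll until the query reaches a terminal state.
    while True:
        execution = athena_client.get_query_execution(QueryExecutionId=query_id)
        state = execution["QueryExecution"]["Status"]["State"]
        if state in ("SUCCEEDED", "FAILED", "CANCELLED"):
            break
        time.sleep(poll_seconds)

    if state != "SUCCEEDED":
        raise RuntimeError(f"query {query_id} ended in state {state}")
    return athena_client.get_query_results(QueryExecutionId=query_id)["ResultSet"]["Rows"]
```

The query runs with the permissions of the caller, so it only succeeds once the subscription grant has been fulfilled for the consuming environment.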
Solution walkthrough: Enhance data governance with enforced metadata rules for Custom Assets
Customers can manage access grants for unmanaged assets using Amazon DataZone. When a subscription to an asset in the business data catalog is approved by the data owner, Amazon DataZone publishes an event in Amazon EventBridge in the account along with all the necessary information in the payload that you can use to create the access grants between the source and the target. Using metadata enforcement for unmanaged assets, customers can provide all context in a single request.
Step 1: Create a custom asset type
To create a custom asset type Metrics with an attached metadata form to describe the metric asset type, follow these steps:
The following is an example of a custom asset type, “Metrics”, which has two fields: Dashboard Link and Calculation.
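A custom asset type like this can also be defined through the API. The sketch below assumes the boto3 DataZone client's create_asset_type call and assumes that a form type holding the Dashboard Link and Calculation fields already exists. The form type name MetricsForm, its revision, and the function names are hypothetical.

```python
def build_metrics_asset_type_request(domain_id, owning_project_id,
                                     form_type_name="MetricsForm",
                                     form_type_revision="1"):
    """Assemble keyword arguments for a "Metrics" custom asset type.

    The attached form type (hypothetical name) is assumed to define the
    Dashboard Link and Calculation fields.
    """
    return {
        "domainIdentifier": domain_id,
        "owningProjectIdentifier": owning_project_id,
        "name": "Metrics",
        "description": "Business metric backed by a dashboard",
        "formsInput": {
            # Key is the form name on the asset type; value points at the form type.
            form_type_name: {
                "typeIdentifier": form_type_name,
                "typeRevision": form_type_revision,
                "required": True,
            }
        },
    }


def create_metrics_asset_type(datazone_client, request):
    """Send the request; datazone_client is expected to be boto3.client("datazone")."""
    return datazone_client.create_asset_type(**request)
```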
Step 2: Data producer creates a custom asset using the “Metrics” asset type
The data producer creates a Conversion Rate Metric with all metadata along with associated metadata forms by following these steps:
The following screenshot shows the “Conversion Rate Metric” asset created in Amazon DataZone. The highlighted boxes show that it is an unmanaged asset of the “Metrics” type created in the previous step.
Step 3: Domain owner configures metadata requirements
Domain unit owners can configure metadata enforcement in Amazon DataZone as follows:
- On the Amazon DataZone console, choose Domain to open your domain or domain unit settings.
- To add metadata forms for subscription requests, on the RULES tab, choose ADD, as shown in the following screenshot.
- To select metadata forms, provide a Name for the metadata form rule.
- Choose ADD METADATA FORM, as shown in the following screenshot.
- The remaining fields can be left at their defaults. For this post, set them as shown in the following screenshot.
- In the Add metadata form pop-up, enter MetricsRequestForm, as shown in the following screenshot.
- Choose ADD RULE to create the rule for all Metrics assets. The following screenshot shows the rule after it is created.
Step 4: Admin sets up an EventBridge rule
To set up an EventBridge rule, follow these steps:
- Create an EventBridge rule to capture all new subscription requests. For setup details, refer to Amazon DataZone events and notifications.
- Create an AWS Lambda function as a target to act on the event. To set up targets, refer to Event bus targets in Amazon EventBridge.
For this post, set the following event pattern, which triggers the Lambda function only for new subscription requests.
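A pattern of this kind can be sketched as follows. It assumes Amazon DataZone events use the source aws.datazone and the detail type Subscription Request Created; confirm both against the Amazon DataZone events documentation. The rule name, target ID, and function names are hypothetical, and the matcher is a simplified illustration of top-level matching, not EventBridge's full matching logic.

```python
import json

# Assumed source and detail-type values for new subscription requests.
SUBSCRIPTION_REQUEST_PATTERN = {
    "source": ["aws.datazone"],
    "detail-type": ["Subscription Request Created"],
}


def matches(pattern, event):
    """Simplified check: every pattern key must list the event's value for that key."""
    return all(event.get(key) in allowed for key, allowed in pattern.items())


def create_subscription_request_rule(events_client, rule_name, lambda_arn):
    """Create the rule and attach the Lambda function as its target.

    events_client is expected to be boto3.client("events").
    """
    rule = events_client.put_rule(
        Name=rule_name,
        EventPattern=json.dumps(SUBSCRIPTION_REQUEST_PATTERN),
        State="ENABLED",
    )
    events_client.put_targets(
        Rule=rule_name,
        Targets=[{"Id": "subscription-request-handler", "Arn": lambda_arn}],
    )
    return rule["RuleArn"]
```

The Lambda function also needs a resource-based permission that allows EventBridge to invoke it, which is not shown here.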
Step 5: Data consumer submits subscription request
After metadata enforcement is configured, data consumers follow these steps to request access:
- To locate the asset in the Amazon DataZone catalog, sign in to the Amazon DataZone console as a data consumer from the MARKETING project. Use the search bar to find the Conversion Rate Metric asset. Choose SUBSCRIBE, as shown in the following screenshot.
- Provide details, including the Metrics Request Form associated with the Metrics asset type.
- Choose REQUEST, as shown in the following screenshot.
You will receive a notification confirming that your subscription request is submitted, as shown in the following screenshot.
For the request, EventBridge will capture the following request event and send it to the configured target:
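A Lambda target might unpack such an event along these lines. This is a sketch only: the nesting under detail.metadata and detail.data and the field names id, domain, and status are assumptions about the payload shape, so the handler reads them defensively and should be adjusted to the event you actually receive.

```python
import json


def lambda_handler(event, context):
    """Extract the identifiers a downstream approval workflow would need."""
    detail = event.get("detail", {})
    metadata = detail.get("metadata", {})  # assumed location of request and domain IDs
    data = detail.get("data", {})          # assumed location of the request status

    summary = {
        "detail_type": event.get("detail-type"),
        "domain_id": metadata.get("domain"),
        "request_id": metadata.get("id"),
        "status": data.get("status"),
    }
    # Log the summary so it can be inspected in the function's logs.
    print(json.dumps(summary))
    return summary
```

Missing keys come back as None instead of raising, which keeps the function from failing on payloads that differ from these assumptions.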
The data steward and asset owner can get details for the request with the GetSubscriptionRequestDetails API and view the asset details and form associated with the request:
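Retrieving those details from code could look like the following sketch. It assumes the boto3 DataZone client's get_subscription_request_details call and assumes that the response carries the submitted forms under metadataForms, each with a formName and a JSON string in content. Verify these field names against the API reference. The function name is hypothetical.

```python
import json


def fetch_request_forms(datazone_client, domain_id, request_id):
    """Return the metadata forms on a subscription request as {form name: fields}.

    datazone_client is expected to be boto3.client("datazone").
    """
    response = datazone_client.get_subscription_request_details(
        domainIdentifier=domain_id,
        identifier=request_id,
    )
    forms = {}
    for form in response.get("metadataForms", []):  # assumed response field
        # Form content is assumed to be a JSON-encoded string of field values.
        forms[form["formName"]] = json.loads(form.get("content") or "{}")
    return forms
```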
The data steward and asset owner can use these details to orchestrate an approval workflow using the Lambda function. After the request has been validated, the asset owner or steward can then call the AcceptSubscriptionRequest API to grant access. The data consumer will be notified after access is approved. The following screenshot shows the notification that the subscription was approved.
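The validation-then-approval flow can be sketched as below, assuming the boto3 DataZone client's accept_subscription_request call. The validation rule (every listed field must be non-blank), the shape of the forms dictionary, and the function name are hypothetical. A real workflow would apply its own checks and might route incomplete requests to a human reviewer.

```python
def approve_if_complete(datazone_client, domain_id, request_id, forms, required_fields):
    """Accept the request only when every required form field has a value.

    forms:           {form name: {field name: value}} as submitted by the consumer
    required_fields: {form name: [field names that must be non-blank]}
    """
    missing = []
    for form_name, fields in required_fields.items():
        submitted = forms.get(form_name, {})
        for field in fields:
            value = submitted.get(field)
            if value is None or not str(value).strip():
                missing.append(f"{form_name}.{field}")

    if missing:
        # Leave the request pending so a person can follow up.
        return {"decision": "NEEDS_REVIEW", "missing": missing}

    datazone_client.accept_subscription_request(
        domainIdentifier=domain_id,
        identifier=request_id,
        decisionComment="Required metadata present; approved by workflow.",
    )
    return {"decision": "ACCEPTED", "missing": []}
```

Returning the list of missing fields, instead of rejecting outright, leaves the final decision with the asset owner or steward.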
Now that the subscription is approved, users can use the dashboard URL to access the metric.
Cleanup
To make sure no additional charges are incurred after testing, delete the Amazon DataZone domain. Refer to Delete Amazon DataZone domains for the process.
Conclusion
The new metadata enforcement rule for subscription requests in Amazon DataZone strengthens data governance by empowering domain unit owners to establish clear metadata requirements for data consumers and by streamlining access requests. This feature enables organizations to align with their metadata standards, implement custom workflows, and provide a consistent, governed data access experience.
The feature is supported in all AWS Regions where Amazon DataZone is available at the time of this writing. To check which Regions are available, refer to AWS Services by Region. Check out the video below to learn more about how to set up metadata rules for subscription workflows. Get started with the technical documentation.
About the Authors
Ramesh H Singh is a Senior Product Manager Technical (External Services) at AWS in Seattle, Washington, currently with the Amazon DataZone team. He is passionate about building high-performance ML/AI and analytics products that enable enterprise customers to achieve their critical goals using cutting-edge technology. Connect with him on LinkedIn.
Pradeep Misra is a Principal Analytics Solutions Architect at AWS. He works across Amazon to architect and design modern distributed analytics and AI/ML platform solutions. He is passionate about solving customer challenges using data, analytics, and AI/ML. Outside of work, Pradeep likes exploring new places, trying new cuisines, and playing board games with his family. He also likes doing science experiments, building LEGOs and watching anime with his daughters.
Lakshmi Nair is a Senior Analytics Specialist Solutions Architect at AWS. She specializes in designing advanced analytics systems across industries. She focuses on crafting cloud-based data platforms, enabling real-time streaming, big data processing, and robust data governance.
Santhosh Padmanabhan is a Software Development Manager at AWS, leading the Amazon DataZone engineering team. His team designs, builds, and operates services specializing in data, machine learning, and AI governance. With deep expertise in building distributed data systems at scale, Santhosh plays a key role in advancing AWS’s data governance capabilities.