Version: 1.0 prerelease

Manage Validation Definitions

A Validation Definition is a fixed reference that links a Batch of data to an Expectation Suite. It can be run by itself to validate the referenced data against the associated Expectations for testing or data exploration. Multiple Validation Definitions can also be provided to a Checkpoint which, when run, executes Actions based on the Validation Results for each provided Validation Definition.

Prerequisites

Create a Validation Definition

  1. Import the ValidationDefinition class from the GX library.
Python
from great_expectations.core import ValidationDefinition
  1. Request a Data Context.

In this example the variable context is your Data Context.

  1. Get an Expectation Suite with Expectations. This can be an existing Expectation Suite retrieved from your Data Context or a new Expectation Suite in your current code.

In this example the variable suite is your Expectation Suite.

  1. Get an existing or create a new Batch Definition describing the data that will be associated with the Expectation Suite.

In this example the variable batch_definition is your Batch Definition.

  1. Create a ValidationDefinition instance using the Batch Definition, Expectation Suite, and a unique name.
Python
definition_name = "My Validation Definition"
validation_definition = ValidationDefinition(data=batch_definition, suite=suite, name=definition_name)
  1. Optional. Save the Validation Definition to your Data Context.
Python
validation_definition = context.validation_definitions.add(validation_definition)
tip

You can add a Validation Definition to your Data Context at the same time as you create it with the following code:

Python
definition_name = "My second Validation Definition"
validation_definition = context.validation_definitions.add(ValidationDefinition(data=batch_definition, suite=suite, name=definition_name))

List available Validation Definitions

  1. Request a Data Context.

In this example the variable context is your Data Context.

  1. Use the Data Context to retrieve and print the names of the available Validation Definitions:
Python
validation_definition_names = [definition.name for definition in context.validation_definitions]
print(validation_definition_names)

Get a Validation Definition by name

  1. Request a Data Context.

In this example the variable context is your Data Context.

  1. Use the Data Context to request the Validation Definition.
Python
definition_name = "My Validation Definition"
validation_definition = context.validation_definitions.get(name=definition_name)

Get Validation Definitions by attributes

  1. Request a Data Context.

In this example the variable context is your Data Context.

  1. Determine the attributes to filter on.

Validation Definitions associate an Expectation Suite with a Batch Definition. This means that valid attributes to filter on include the attributes for the Expectation Suite, as well as the attributes for the Batch Definition, the Batch Definition's Data Asset, and the Data Asset's Data Source.

  1. Use a list comprehension to return all Validation Definitions that match the filtered attributes.

For example, you can retrieve all Validation Definitions that include a specific Expectation Suite by filtering on the Expectation Suite name:

Python
existing_expectation_suite_name = "my_expectation_suite"
validation_definitions_for_suite = [
    definition for definition in context.validation_definitions
    if definition.suite.name == existing_expectation_suite_name
]

Or you could return all Validation Definitions involving a specific Data Asset by filtering on the Data Source and Data Asset names:

Python
existing_data_source_name = "my_data_source"
existing_data_asset_name = "my_data_asset"
validation_definitions_for_asset = [
    definition for definition in context.validation_definitions
    if definition.data_source.name == existing_data_source_name
    and definition.asset.name == existing_data_asset_name
]
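
The two comprehensions above can be folded into one reusable filter. The following is a minimal sketch, not part of the GX API: `filter_validation_definitions` is a hypothetical helper name, and it assumes only that each item exposes the `suite.name`, `data_source.name`, and `asset.name` attributes used in the examples above. Stand-in objects are used so the snippet runs without a Data Context. In practice you would pass `context.validation_definitions` as the first argument.

```python
from types import SimpleNamespace


def filter_validation_definitions(definitions, suite_name=None,
                                  data_source_name=None, asset_name=None):
    """Return the definitions matching every criterion that is not None.

    Hypothetical helper: relies only on the suite.name, data_source.name,
    and asset.name attributes shown in the examples above.
    """
    matches = []
    for definition in definitions:
        if suite_name is not None and definition.suite.name != suite_name:
            continue
        if data_source_name is not None and definition.data_source.name != data_source_name:
            continue
        if asset_name is not None and definition.asset.name != asset_name:
            continue
        matches.append(definition)
    return matches


def _stand_in(name, suite, data_source, asset):
    # Stand-in for a Validation Definition, used for illustration only.
    return SimpleNamespace(
        name=name,
        suite=SimpleNamespace(name=suite),
        data_source=SimpleNamespace(name=data_source),
        asset=SimpleNamespace(name=asset),
    )


example_definitions = [
    _stand_in("vd_a", "my_expectation_suite", "my_data_source", "my_data_asset"),
    _stand_in("vd_b", "other_suite", "my_data_source", "my_data_asset"),
    _stand_in("vd_c", "my_expectation_suite", "other_source", "other_asset"),
]

# Filter on the Expectation Suite name only; the other criteria are skipped.
for_suite = filter_validation_definitions(
    example_definitions, suite_name="my_expectation_suite"
)
print([definition.name for definition in for_suite])
```

Leaving a criterion as `None` means it is ignored, so the same function covers both the suite-only and the Data Source plus Data Asset cases.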

Delete a Validation Definition

  1. Request a Data Context.

In this example the variable context is your Data Context.

  1. Get the Validation Definition to delete.

In this example the variable validation_definition is the Validation Definition to delete.

  1. Use the Data Context to delete the Validation Definition:
Python
context.validation_definitions.delete(name=validation_definition.name)

You can also provide the Validation Definition's name directly as a string. However, retrieving the Validation Definition from your Data Context and passing its name attribute guards against typos and differences in capitalization, either of which would result in an attempt to delete a Validation Definition that does not exist.
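
If you do pass a plain string, one way to guard against a mismatched name is to check the available names first. The sketch below is an illustration rather than GX functionality: `delete_if_present` is a hypothetical helper, and it assumes only that the collection can be iterated for objects with a `name` attribute and offers a `delete(name=...)` method, as in the examples in this guide. A small in-memory stand-in replaces the Data Context so the snippet runs on its own.

```python
from types import SimpleNamespace


class InMemoryDefinitions:
    """Stand-in for context.validation_definitions (illustration only)."""

    def __init__(self, names):
        self._items = {name: SimpleNamespace(name=name) for name in names}

    def __iter__(self):
        # Iterate over a copy so deletions during a loop are safe.
        return iter(list(self._items.values()))

    def delete(self, name):
        del self._items[name]


def delete_if_present(definitions, name):
    """Delete the named definition and report whether anything was deleted."""
    available = {definition.name for definition in definitions}
    if name not in available:
        return False
    definitions.delete(name=name)
    return True


store = InMemoryDefinitions(["My Validation Definition"])
# The capitalization differs from the stored name, so nothing is deleted.
print(delete_if_present(store, "my validation definition"))
# The exact name matches, so the definition is deleted.
print(delete_if_present(store, "My Validation Definition"))
```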

Duplicate a Validation Definition

Validation Definitions are intended to be fixed references that link a set of data to an Expectation Suite. As such, they do not include an update method. However, multiple Validation Definitions with the same Batch Definition and Expectation Suite can exist as long as each has a unique name.

Although an existing Validation Definition cannot be renamed, you can create a duplicate of it under a different name.

  1. Import the GX library and ValidationDefinition class:
Python
import great_expectations as gx
from great_expectations.core import ValidationDefinition
  1. Request a Data Context.

In this example the variable context is your Data Context.

  1. Get the original Validation Definition.

In this example the variable original_validation_definition is the original Validation Definition.

  1. Get the Batch Definition and Expectation Suite from the original Validation Definition:
Python
original_suite = original_validation_definition.suite
original_batch = original_validation_definition.batch_definition
  1. Add a new Validation Definition to the Data Context using the same Batch Definition and Expectation Suite as the original:
Python
new_definition_name = "my_validation_definition"
new_validation_definition = ValidationDefinition(
    data=original_batch,
    suite=original_suite,
    name=new_definition_name
)
context.validation_definitions.add(new_validation_definition)
  1. Optional. Delete the original Validation Definition.
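
Because each Validation Definition needs a unique name, it can help to derive the duplicate's name from the names already in use. This is a minimal sketch: `next_available_name` and the `_copy` suffix are hypothetical conventions, not something GX provides. In practice `existing_names` could be built from `context.validation_definitions`, as in the listing example earlier in this guide.

```python
def next_available_name(base_name, existing_names):
    """Return base_name, or base_name plus a numbered suffix, that is not yet taken."""
    taken = set(existing_names)
    if base_name not in taken:
        return base_name
    counter = 1
    # Increase the suffix until an unused name is found.
    while f"{base_name}_copy{counter}" in taken:
        counter += 1
    return f"{base_name}_copy{counter}"


# Hard-coded names for illustration only.
existing_names = ["my_validation_definition", "my_validation_definition_copy1"]
new_definition_name = next_available_name("my_validation_definition", existing_names)
print(new_definition_name)
```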

Run a Validation Definition

  1. Create a new or retrieve an existing Validation Definition.

  2. Execute the Validation Definition's run() method:

Python
validation_result = validation_definition.run()

Validation Results are automatically saved in your Data Context when a Validation Definition's run() method is called. For convenience, the run() method also returns the Validation Results as an object you can review.

  3. Review the Validation Results:
Python
print(validation_result)
tip

GX Cloud users can view the Validation Results in the GX Cloud UI by following the URL provided with:

Python
print(validation_result.result_url)
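
Printing the whole result can be verbose. A shorter summary can be built from the result's `success` flag and its list of per-Expectation `results`. The sketch below is hedged accordingly: `summarize_result` is a hypothetical helper, and it assumes only that the result object has `success` and `results` attributes and that each entry in `results` carries its own `success` flag. A stand-in object is used so the snippet runs without GX installed.

```python
from types import SimpleNamespace


def summarize_result(validation_result):
    """Return overall success plus counts of evaluated and failed Expectations."""
    outcomes = [item.success for item in validation_result.results]
    return {
        "success": validation_result.success,
        "evaluated": len(outcomes),
        "failed": sum(1 for passed in outcomes if not passed),
    }


# Stand-in for the object returned by validation_definition.run().
example_result = SimpleNamespace(
    success=False,
    results=[SimpleNamespace(success=True), SimpleNamespace(success=False)],
)
summary = summarize_result(example_result)
print(summary)
```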