Great Expectations installation and configuration workflow
Setting up Great Expectations (GX) includes installing GX and initializing your deployment. Optionally, you can customize the configuration of some components, such as Stores, Data Docs, and Plugins.
After you've completed the setup for your production deployment, you can access all GX features from your Data Context (the primary entry point for a Great Expectations deployment, with configurations and methods for all supporting components). Also, your Stores (connectors that store and retrieve information about metadata in Great Expectations) and Data Docs (human-readable documentation generated from Great Expectations metadata, detailing Expectations, Validation Results, and so on) will be optimized for your business requirements.
To set up Data Sources (which provide a standard API for accessing and interacting with data from a wide variety of source systems), Expectation Suites (collections of verifiable assertions about data), and Checkpoints (the primary means for validating data in a production deployment of Great Expectations), see the specific topics for these components.
If you don't want to manage your own configurations and infrastructure, then Great Expectations Cloud might be the solution. If you're interested in participating in the Great Expectations Cloud Beta program, or you want to receive progress updates, sign up for the Beta program.
Windows support for the open source Python version of GX is currently unavailable. If you’re using GX in a Windows environment, you might experience errors or performance issues.
Before you start
Before you install and configure GX, complete the Quickstart guide and make sure the following items are available:
- A supported version of Python. GX supports Python versions 3.8 to 3.11.
- pip (the package installer for Python).
- An internet connection.
- A web browser (for Jupyter Notebooks).
- A virtual environment (recommended for your project workspace).
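To confirm the Python requirement before you install GX, you can run a quick check. The following is a minimal sketch: the helper name `is_supported_python` is hypothetical, and the 3.8 to 3.11 range is taken from the list above.

```python
import sys

# Supported range taken from the prerequisites above (3.8 to 3.11, inclusive).
MIN_SUPPORTED = (3, 8)
MAX_SUPPORTED = (3, 11)


def is_supported_python(version_info=None):
    """Return True if the major.minor version is within the supported range."""
    if version_info is None:
        version_info = sys.version_info
    major_minor = tuple(version_info[:2])
    return MIN_SUPPORTED <= major_minor <= MAX_SUPPORTED


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    status = "supported" if is_supported_python() else "outside the supported range"
    print(f"Python {current} is {status} for GX.")
```

Run the script inside the virtual environment you plan to use, so that it checks the interpreter GX will be installed into.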
Install Great Expectations
See Install Great Expectations.
Initialize a Data Context
Your Data Context contains your Great Expectations project, and it is the entry point for configuring and interacting with Great Expectations. The Data Context manages various classes and helps limit the number of objects you need to manage to get Great Expectations working.
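As an illustration, a Data Context can be requested from Python with `get_context()`. The following is a minimal sketch, not a complete setup: the wrapper name `init_data_context` is hypothetical, and the type of Data Context you get back depends on your GX version and environment.

```python
def init_data_context():
    """Return a Data Context, or None if Great Expectations is not installed."""
    try:
        import great_expectations as gx
    except ImportError:
        # GX is not available in this environment; install it first.
        return None
    # Called without arguments, get_context() chooses a Data Context for you.
    # Which type it returns is assumed to vary by GX version and environment.
    return gx.get_context()


if __name__ == "__main__":
    context = init_data_context()
    if context is None:
        print("Great Expectations is not installed in this environment.")
    else:
        print(f"Initialized a Data Context of type {type(context).__name__}.")
```

The guarded import is only there so the sketch runs in any environment. In a project where GX is installed, the two lines `import great_expectations as gx` and `context = gx.get_context()` are usually all you need.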
Optional configurations
After you've initialized your Data Context, you can start using Great Expectations. However, a few components, such as Stores, Data Docs, and Plugins, are configured by default to operate locally. You can change them to hosted configurations if that better suits your use case.
Stores
Stores are the locations where your Data Context stores information about your Expectations (verifiable assertions about data), your Validation Results (generated when data is validated against an Expectation or Expectation Suite), and your Metrics (computed attributes of data, such as the mean of a column). By default, these are stored locally. To reconfigure a Store to work with a specific backend, see Configure Expectation Stores, Configure Validation Result Stores, and Configure a MetricStore.
Data Docs
Data Docs provide human-readable renderings of your Expectation Suites and Validation Results, and they are built locally by default. To host and share Data Docs differently, see Host and share Data Docs.
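To see the default local behavior, you can build Data Docs from a Data Context. The following is a minimal sketch: the helper name `build_local_data_docs` is hypothetical, and it assumes that the Data Context method `build_data_docs()` returns a dictionary that maps each site name to the location of its index page.

```python
def build_local_data_docs(context):
    """Build Data Docs for the configured sites and return {site_name: index_location}."""
    # Assumption: build_data_docs() returns a dict of site names to index locations.
    site_index = context.build_data_docs()
    return dict(site_index or {})


if __name__ == "__main__":
    try:
        import great_expectations as gx
    except ImportError:
        print("Great Expectations is not installed in this environment.")
    else:
        # Print where each configured Data Docs site was built.
        for site_name, location in build_local_data_docs(gx.get_context()).items():
            print(f"{site_name}: {location}")
```

The helper takes the Data Context as a parameter, so you can reuse a context you already created instead of requesting a new one.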
Plugins
Python files are treated as Plugins (extensions of Great Expectations' components or functionality) when they are in the plugins directory of your project, which is created automatically when you initialize your Data Context. If you have Custom Expectations (extensions of the `Expectation` class, developed outside of the Great Expectations library) or other extensions that you want to use as Plugins with Great Expectations, add them to the plugins directory.
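To check which files are currently in place as Plugins, you can list the Python files in the plugins directory. The following is a minimal sketch that uses only the standard library: the helper name `list_plugin_files` is hypothetical, the example path is a placeholder for wherever the plugins directory lives in your project, and only top-level files are inspected.

```python
from pathlib import Path


def list_plugin_files(plugins_dir):
    """Return the sorted names of Python files found directly in plugins_dir."""
    directory = Path(plugins_dir)
    if not directory.is_dir():
        # No plugins directory at this path, so there is nothing to report.
        return []
    return sorted(path.name for path in directory.glob("*.py") if path.is_file())


if __name__ == "__main__":
    # Placeholder path; replace it with the plugins directory of your project.
    for name in list_plugin_files("path/to/your/project/plugins"):
        print(name)
```

This only reports which files are present. It does not import them or verify that they define valid Custom Expectations.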