Connect to data
Great Expectations (GX) distinguishes between the locations where data is stored and the sets of data available at those locations. Data Sources manage access to a storage location, and Data Assets define the sets of data that can be accessed through a Data Source. To connect to your data, first define a Data Source to tell GX how to access the data, then define one or more Data Assets to tell GX which sets of data to make available.
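The two-step relationship can be pictured with a small toy model. The sketch below is not the GX API: `ToyDataSource`, `ToyDataAsset`, and the file names are hypothetical and use only the Python standard library. It assumes the storage location is a directory and that each set of data is selected by a file name pattern.

```python
import tempfile
from dataclasses import dataclass, field
from pathlib import Path
from typing import Dict, List


@dataclass
class ToyDataAsset:
    """Hypothetical: a named set of data available at a location."""
    name: str
    base_directory: Path
    pattern: str

    def files(self) -> List[str]:
        # Resolve the pattern against the directory the source points at.
        return sorted(p.name for p in self.base_directory.glob(self.pattern))


@dataclass
class ToyDataSource:
    """Hypothetical: knows how to reach one storage location (a directory)."""
    name: str
    base_directory: Path
    assets: Dict[str, ToyDataAsset] = field(default_factory=dict)

    def add_asset(self, name: str, pattern: str) -> ToyDataAsset:
        # Step 2: declare which set of data at this location to expose.
        asset = ToyDataAsset(name, self.base_directory, pattern)
        self.assets[name] = asset
        return asset


def demo() -> Dict[str, List[str]]:
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        # Made-up files standing in for stored data.
        for filename in ("taxi_2023.csv", "taxi_2024.csv", "zones.parquet"):
            (root / filename).touch()

        # Step 1: define how to reach the location.
        source = ToyDataSource("local_files", root)
        # Step 2: define one or more sets of data at that location.
        trips = source.add_asset("taxi_trips", "*.csv")
        zones = source.add_asset("zones", "*.parquet")
        return {asset.name: asset.files() for asset in (trips, zones)}


if __name__ == "__main__":
    print(demo())
```

One source can expose several assets, which is why the location and the sets of data are modeled as separate objects.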
File system Data Sources
File system Data Sources connect GX to data that is stored as one or more files (such as .csv or .parquet files) within a directory hierarchy.
Basic file system
Manage data stored as files on a local or networked file system.
Amazon S3
Manage data stored as files in an Amazon S3 bucket.
Azure Blob Storage
Manage data stored as files in Azure Blob Storage.
Google Cloud Storage
Manage data stored as files in Google Cloud Storage.
In-memory Data Sources
In-memory Data Sources connect GX to data that has been read into memory.
pandas
Manage Data Sources for data that has been loaded into memory with pandas.
Spark
Manage Data Sources for data that has been loaded into memory with Spark.
SQL Data Sources
SQL Data Sources connect GX to data stored in SQL databases, with support for some specific SQL dialects.
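For a SQL database, the same split between a location and the sets of data at that location might look like the sketch below. It is an illustration only and does not use the GX API: it relies on Python's built-in `sqlite3` module, and the `orders` table, its rows, and the asset names are made up. Treating a whole table and a filtered query as two separate sets of data is an assumption made for the example.

```python
import sqlite3
from typing import Dict


def demo() -> Dict[str, int]:
    # The "source": a single connection that knows how to reach the database.
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE orders (id INTEGER PRIMARY KEY, amount REAL, status TEXT);
        INSERT INTO orders (amount, status) VALUES
            (10.0, 'paid'), (25.5, 'paid'), (7.25, 'refunded');
        """
    )

    # The "assets": named sets of data reachable through that connection.
    assets = {
        "all_orders": "SELECT id, amount, status FROM orders ORDER BY id",
        "paid_orders": (
            "SELECT id, amount, status FROM orders "
            "WHERE status = 'paid' ORDER BY id"
        ),
    }

    # Count the rows each asset makes available.
    row_counts = {
        name: len(conn.execute(sql).fetchall()) for name, sql in assets.items()
    }
    conn.close()
    return row_counts


if __name__ == "__main__":
    print(demo())
```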
Snowflake
Manage Data Sources that connect to Snowflake SQL databases.
PostgreSQL
Manage Data Sources that connect to PostgreSQL databases.
SQLite
Manage Data Sources that connect to SQLite databases.
Databricks
Manage Data Sources that connect to Databricks SQL databases.
BigQuery
Manage Data Sources that connect to BigQuery SQL databases.
Generic SQL
Manage Data Sources that connect to SQL databases without using a specific SQL dialect.