Describe the bug
We set result_format to "COMPLETE" and return_unexpected_index_query to true, because we want to use the returned unexpected index query to fetch the failing records from the DataFrame. The query that comes back appears to be broken: the double quotes around the expression are missing.
Example: unexpected index query returned by GX: df.filter(F.expr(NOT(city IS NOT NULL)))
Working query: df.filter(F.expr("NOT(city IS NOT NULL)"))
To Reproduce
import great_expectations as ge
from great_expectations.core import ExpectationSuite
from great_expectations.core.batch import RuntimeBatchRequest
from great_expectations.data_context import BaseDataContext
from great_expectations.data_context.types.base import FilesystemStoreBackendDefaults, DataContextConfig, DatasourceConfig
expectation_suite_config = {
    "expectation_suite_name": "my_expectation_suite",
    "expectations": [  # List of expectations
        {
            "expectation_type": "expect_column_values_to_not_be_null",
            "kwargs": {
                "column": "my_column",
                "result_format": {"result_format": "COMPLETE"},
            },
        }
    ],
}
my_expectation_suite = ExpectationSuite(**expectation_suite_config)
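# Define DataContext configuration
data_context_config = DataContextConfig(
    plugins_directory=None,
    config_variables_file_path=None,
    datasources={
        "my_spark_datasource": DatasourceConfig(
            class_name="Datasource",
            execution_engine={
                "class_name": "SparkDFExecutionEngine",
                "force_reuse_spark_context": True,
            },
            # Assumed: connector name and batch identifier are taken from the batch request.
            data_connectors={
                "spark_runtime_dataconnector": {
                    "class_name": "RuntimeDataConnector",
                    "batch_identifiers": ["batch_name"],
                }
            },
        )
    },
)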
# df is the Spark DataFrame under validation (not defined in this report)
batch_request = RuntimeBatchRequest(
    datasource_name="my_spark_datasource",
    data_connector_name="spark_runtime_dataconnector",
    data_asset_name="my_asset",
    runtime_parameters={"batch_data": df},
    batch_identifiers={"batch_name": "batch_run"},
)
context = ge.get_context(project_config=data_context_config)
batch_validator = context.get_validator(
    batch_request=batch_request, expectation_suite=my_expectation_suite
)
validation_result = batch_validator.validate()
print(validation_result)
validation_result contains the unexpected_index_query value "df.filter(F.expr(NOT(city IS NOT NULL)))".
When I execute this query, it fails with a syntax error (invalid syntax).
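Until the returned string is fixed, one possible workaround is to pull the condition out of it and pass that to F.expr yourself. The sketch below is a hypothetical helper, not part of Great Expectations; it assumes the returned string always has the shape df.filter(F.expr(<condition>)), which is inferred only from the example in this report.

```python
import re

# Hypothetical helper (not a Great Expectations API).
# Assumption: the returned string looks like "df.filter(F.expr(<condition>))",
# with or without quotes around <condition>.
_QUERY_PATTERN = re.compile(
    r"^\s*df\.filter\(F\.expr\((?P<condition>.*)\)\)\s*$", re.DOTALL
)


def extract_filter_condition(unexpected_index_query: str) -> str:
    """Return the bare SQL condition embedded in the unexpected index query."""
    match = _QUERY_PATTERN.match(unexpected_index_query)
    if match is None:
        raise ValueError(
            f"Unrecognized unexpected_index_query format: {unexpected_index_query!r}"
        )
    condition = match.group("condition").strip()
    # If the query is already well-formed, drop the surrounding quotes.
    if len(condition) >= 2 and condition[0] == condition[-1] and condition[0] in "\"'":
        condition = condition[1:-1]
    return condition


broken_query = "df.filter(F.expr(NOT(city IS NOT NULL)))"
condition = extract_filter_condition(broken_query)
print(condition)  # NOT(city IS NOT NULL)
```

With a Spark session available, the extracted condition can then be applied as df.filter(F.expr(condition)), where F is pyspark.sql.functions. The quote stripping is deliberately naive: a condition that itself starts and ends with a quoted literal would be mangled, so treat this as a stopgap rather than a general parser.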
Expected behavior
Executing the returned query should return the failing records from the DataFrame.
Environment (please complete the following information):