Releases: canimus/cuallee

v0.10.3

19 May 00:06
e22ea67
  • Added an approximate flag to the is_complete implementation for pyspark, to allow comparison with pydeequ
  • Resolved JOSS issues for documentation and references against other data quality frameworks
  • Updated the test/performance folder with recent versions of all frameworks and accurate docker containers for each test

v0.10.2

11 May 13:51
4b54a0a
  • Added documentation to the main classes Check and Rule
  • Changed the pyspark implementation of has_entropy to base=2, as this reflects the most common usage
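
To illustrate what the base=2 change means in practice, here is a minimal standalone sketch of Shannon entropy measured in bits. It is not cuallee's code: the function name shannon_entropy and its parameters are assumptions made for this example only.

```python
import math
from collections import Counter


def shannon_entropy(values, base=2):
    """Shannon entropy of the observed value distribution.

    Hypothetical helper for illustration; not part of cuallee.
    With base=2 the result is expressed in bits.
    """
    counts = Counter(values)
    total = sum(counts.values())
    # Sum p * log(1/p) over the relative frequency p of each distinct value.
    return sum(
        (c / total) * math.log(total / c, base)
        for c in counts.values()
    )


# Two equally likely values carry one bit of information in base 2.
balanced = shannon_entropy([0, 1, 0, 1])
# The same data measured with the natural logarithm gives ln(2) instead.
balanced_nats = shannon_entropy([0, 1, 0, 1], base=math.e)
```

With base 2, a column split evenly between two values has an entropy of 1, which is usually the figure people expect when they set a threshold for this kind of check.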

v0.10.1

30 Apr 22:09
f494dac
  • Upgrade to duckdb==0.10.2
  • Community guidelines in README. Thanks @devarops
  • Fix pipeline with new SF account

v0.10.0

27 Mar 22:20
  • Addition of daft data frame support. Attribution to @dsaad68 👏
  • @dsaad68 largest contribution to the project ever! 🏆
  • Thanks for covering all: test, docs and code 💯

v0.9.2

23 Mar 09:46
b8c79bb
  • Removal of deprecated sum(axis=1) in polars in favor of sum_horizontal()
  • Thanks @StuffbyYuki

v0.9.1

22 Mar 20:52
18a2e9c
  • Added support for spark-connect via SPARK_REMOTE environment variable

v.0.9.0

17 Mar 21:53
  • Fixed an important issue with datasets of more than 1 billion rows, where violations were present but the status was marked as PASS
  • Inclusion of new Controls
  • Structure for PDF report added

v0.8.8

10 Mar 19:25
  • JOSS submission

v0.8.7: Feature pyspark hotfix (#171)

04 Mar 22:16
c824864
  • Hotfix for pyspark reconciliation of results, which was returning only the last rule

v0.8.6

04 Mar 20:49
5e11271
Compare
Choose a tag to compare
  • Added percentage_fill for Control class and pyspark dataframes
  • Added percentage_empty for Control class and pyspark dataframes