Releases: kedro-org/kedro-plugins

kedro-datasets-9.0.0

17 Nov 16:01
eb22793

Major features and improvements

  • Removed the deprecated MatplotlibWriter dataset. Matplotlib objects can now be handled using MatplotlibDataset.
  • Grouped the dataset documentation by dependency to clean up the nav bar.
  • Added mode save argument to ibis.TableDataset, supporting "append", "overwrite", "error"/"errorifexists", and "ignore" save modes. The deprecated overwrite save argument is mapped to mode for backward compatibility and will be removed in a future release. Specifying both mode and overwrite results in an error.
  • Added credentials support to ibis.TableDataset.
  • Added the following new datasets:
    Type | Description | Location
    openxml.PptxDataset | A dataset for loading and saving .pptx files (Microsoft PowerPoint) using python-pptx | kedro_datasets.openxml
  • Graduated the following experimental datasets to core:
    Type | Description | Location
    langchain.ChatOpenAIDataset | Kedro dataset for loading a ChatOpenAI LangChain model. | kedro_datasets.langchain
    langchain.OpenAIEmbeddingsDataset | Kedro dataset for loading an OpenAIEmbeddings model. | kedro_datasets.langchain
    langchain.ChatAnthropicDataset | A dataset for loading a ChatAnthropic LangChain model. | kedro_datasets.langchain
    langchain.ChatCohereDataset | A dataset for loading a ChatCohere LangChain model. | kedro_datasets.langchain
  • Added the following new experimental datasets:
    Type | Description | Location
    langfuse.LangfuseTraceDataset | Kedro dataset to provide Langfuse tracing clients and callbacks | kedro_datasets_experimental.langfuse
    langchain.LangChainPromptDataset | Kedro dataset for loading LangChain prompts | kedro_datasets_experimental.langchain
    pypdf.PDFDataset | Kedro dataset to read PDF files and extract text using pypdf | kedro_datasets_experimental.pypdf
    langfuse.LangfusePromptDataset | Kedro dataset for managing Langfuse prompts | kedro_datasets_experimental.langfuse
    opik.OpikPromptDataset | A dataset to provide Opik integration for handling prompts | kedro_datasets_experimental.opik
    opik.OpikTraceDataset | Kedro dataset to provide Opik tracing clients and callbacks | kedro_datasets_experimental.opik
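
The backward-compatibility rule described for the ibis.TableDataset mode save argument can be illustrated with a minimal sketch. This is not the library's implementation: resolve_save_mode is a hypothetical helper, and both the fallback default and the choice to map overwrite=False to "error" are assumptions made for illustration only.

```python
import warnings

# Save modes listed in the release note.
VALID_MODES = {"append", "overwrite", "error", "errorifexists", "ignore"}


def resolve_save_mode(save_args: dict, default: str = "overwrite") -> str:
    """Hypothetical helper: derive the effective save mode from save_args.

    The `default` value and the overwrite=False -> "error" mapping are
    assumptions, not behaviour confirmed by the release note.
    """
    if "mode" in save_args and "overwrite" in save_args:
        # The release note says specifying both results in an error.
        raise ValueError("Specify either 'mode' or 'overwrite', not both.")

    if "overwrite" in save_args:
        warnings.warn(
            "'overwrite' is deprecated; use 'mode' instead.",
            DeprecationWarning,
            stacklevel=2,
        )
        return "overwrite" if save_args["overwrite"] else "error"

    mode = save_args.get("mode", default)
    if mode not in VALID_MODES:
        raise ValueError(f"Unsupported save mode: {mode!r}")
    return mode


append_mode = resolve_save_mode({"mode": "append"})
legacy_mode = resolve_save_mode({"overwrite": True})
```

Raising early when both arguments are present keeps the deprecated path from silently overriding an explicit mode.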

Bug fixes and other changes

  • Added HTMLPreview type.
  • Fixed StudyDataset to properly propagate an RDB password through the dataset's credentials.

Community contributions

Many thanks to the following Kedroids for contributing PRs to this release:

kedro-telemetry-0.6.5

17 Sep 11:18
77d3cdc

  • Disabled data collection for CI/CD environments running in kedro-org repositories to avoid capturing internal usage metrics.

kedro-datasets-8.1.0

21 Aug 09:37
24d3589

Major features and improvements

  • Added the following new experimental datasets:
    Type | Description | Location
    polars.PolarsDatabaseDataset | A dataset to load and save data to a SQL backend using Polars | kedro_datasets_experimental.polars

Bug fixes and other changes

  • Added primary key constraint to BaseTable.
  • Added support for saving and loading partitioned Parquet files with LazyPolarsDataset when use_pyarrow=True is set in save_args.
  • Updated the json schema for Kedro 1.0.0.

Breaking changes

Community contributions

kedro-telemetry-0.6.4

18 Aug 11:00
609a38c

  • Updated logic to only show the message that Kedro is sending telemetry if the user hasn't explicitly granted permission.
  • Replaced dependency on toml with tomli (before Python 3.11) and tomli-w.
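
A common way to combine tomli with newer Python versions is to prefer the standard-library tomllib (available from Python 3.11) and fall back to the tomli backport otherwise. The sketch below shows that general pattern only; it is an assumption about how such a switch is typically written, not a copy of the plugin's code, and the TOML snippet is made up for illustration.

```python
try:
    import tomllib  # standard library from Python 3.11 onwards
except ModuleNotFoundError:
    import tomli as tomllib  # backport with the same read-only API

# Hypothetical TOML content used purely for illustration.
PYPROJECT = """
[tool.kedro]
project_name = "demo"
"""

config = tomllib.loads(PYPROJECT)
project_name = config["tool"]["kedro"]["project_name"]

# Neither tomllib nor tomli can write TOML; the separate tomli-w package
# (imported as tomli_w) provides dumps()/dump() for that purpose.
```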

kedro-datasets-8.0.0

14 Jul 10:21
88769e8

Major features and improvements

  • Migrated docs to mkdocs.
  • Made kedro-datasets compatible with Kedro 1.0.0.
  • Added the following new datasets:
    Type | Description | Location
    openxml.DocxDataset | A dataset for loading and saving .docx files (Microsoft Word) using python-docx | kedro_datasets.openxml

Bug fixes and other changes

  • Fixed PartitionedDataset to reliably load newly created partitions, particularly with ParallelRunner, by ensuring load() always re-scans the filesystem.
  • Added an encoding parameter to SQLQueryDataset for choosing the encoding format of the query.
  • Corrected the APIDataset docstring to clarify that request parameters should be passed via load_args, not as top-level arguments.

Breaking changes

  • kedro-datasets now requires Kedro 1.0.0 or higher.

Community contributions

Many thanks to the following Kedroids for contributing PRs to this release:

kedro-airflow-0.10.0

04 Jun 09:05
0edb400

  • Fixed check whether a dataset is a MemoryDataset.
  • Added the option to group nodes by namespace.
  • Replaced the CLI option --group-in-memory with --group-by, which accepts the values memory or namespace. The behaviour of grouping by memory is unchanged.

kedro-telemetry-0.6.3

29 May 08:48
a6c8629

  • Updated catalog API usage to comply with both new and old catalogs.

kedro-datasets-7.0.0

25 Apr 14:57
da4af7e

Major features and improvements

  • Added a parameter to enable/disable lazy saving for PartitionedDataset.
  • Added ibis-athena and ibis-databricks extras for the backends added in Ibis 10.0.
  • Renamed MatplotlibWriter to MatplotlibDataset for consistency with other dataset naming conventions. MatplotlibWriter is deprecated and will be removed in a future release.
  • Added the following new experimental datasets:
    Type | Description | Location
    optuna.StudyDataset | A dataset for saving and loading Optuna studies. | kedro_datasets_experimental.optuna
    darts.DartsTorchModelDataset | A dataset for securely saving and loading Darts Torch Forecasting Models. | kedro_datasets_experimental.darts
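
To illustrate what a lazy-saving switch for partitioned data means in practice, here is a minimal, self-contained sketch. It does not use PartitionedDataset itself: save_partitions, the write callback, and the save_lazily flag name are all hypothetical stand-ins, and the assumed semantics are that a callable partition value is invoked at save time only when lazy saving is enabled.

```python
from typing import Any, Callable, Dict


def save_partitions(
    partitions: Dict[str, Any],
    write: Callable[[str, Any], None],
    save_lazily: bool = True,
) -> None:
    """Hypothetical helper: write each partition via `write`.

    With save_lazily=True, callable values are treated as deferred
    producers and are called just before writing. With save_lazily=False,
    values are written exactly as given, callables included.
    """
    for partition_id, data in sorted(partitions.items()):
        if save_lazily and callable(data):
            data = data()  # materialise the partition only now
        write(partition_id, data)


def build_part_a() -> list:
    # Stands in for an expensive computation deferred until save time.
    return [1, 2, 3]


lazy_store: Dict[str, Any] = {}
save_partitions({"part_a": build_part_a, "part_b": [4, 5]}, lazy_store.__setitem__)

eager_store: Dict[str, Any] = {}
save_partitions({"part_a": build_part_a}, eager_store.__setitem__, save_lazily=False)
```

Disabling the lazy path is useful when the object being saved is itself callable (for example, a model object) and should be stored as-is rather than invoked.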

Bug fixes and other changes

  • Fixed polars.CSVDataset save method on Windows using utf-8 as default encoding.
  • Made table_name a keyword argument in the ibis.FileDataset implementation to be compatible with Ibis 10.0.
  • Fixed how sessions are handled in the snowflake.SnowflakeTableDataset implementation.
  • Fixed credentials handling in pandas.GBQQueryDataset and pandas.GBQTableDataset.

Breaking changes

  • Removed tracking.MetricsDataset and tracking.JSONDataset.

Community contributions

Many thanks to the following Kedroids for contributing PRs to this release:

kedro-datasets-6.0.0

18 Dec 16:47
87d5e62

Major features and improvements

  • Added support for passing database to ibis.TableDataset for load and save operations.
  • Added functionality to save pandas DataFrames directly to Snowflake, facilitating seamless .csv ingestion.
  • Added Python 3.9, 3.10 and 3.11 support for snowflake.SnowflakeTableDataset.
  • Enabled connection sharing between ibis.FileDataset and ibis.TableDataset instances, thereby allowing nodes to save data loaded by one to the other (as long as they share the same connection configuration).
  • Added the following new experimental datasets:
    Type | Description | Location
    databricks.ExternalTableDataset | A dataset for accessing external tables in Databricks. | kedro_datasets_experimental.databricks
    safetensors.SafetensorsDataset | A dataset for securely saving and loading files in the SafeTensors format. | kedro_datasets_experimental.safetensors

Bug fixes and other changes

  • Delayed backend connection for pandas.GBQTableDataset. In practice, this means that a dataset's connection details aren't used (or validated) until the dataset is accessed. On the plus side, the cost of connecting is no longer incurred up front, whether or not the dataset is ever used. Furthermore, this makes the dataset object serializable (e.g. for use with ParallelRunner), because the unserializable client isn't part of it.
  • Removed the unused BigQuery client created in pandas.GBQQueryDataset. This makes the dataset object serializable (e.g. for use with ParallelRunner) by removing the unserializable object.
  • Implemented Snowflake's local testing framework for testing purposes.
  • Improved the dependency management for Spark-based datasets by refactoring the Spark and Databricks utility functions used across the datasets.
  • Added deprecation warning for tracking.MetricsDataset and tracking.JSONDataset.
  • Moved kedro-catalog JSON schemas from Kedro core to kedro-datasets.
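
The reasoning behind the delayed-connection change (keep the unserializable client out of the dataset object so the object can be pickled) can be shown with a small sketch. FakeClient and LazyDataset are hypothetical stand-ins, not the pandas.GBQTableDataset implementation; the lock attribute is only there to make the fake client unpicklable.

```python
import pickle
import threading


class FakeClient:
    """Hypothetical stand-in for an unserializable backend client."""

    def __init__(self, project: str) -> None:
        self.project = project
        self._lock = threading.Lock()  # lock objects cannot be pickled

    def query(self) -> str:
        return f"rows from {self.project}"


class LazyDataset:
    """Hypothetical dataset that stores only plain configuration."""

    def __init__(self, project: str) -> None:
        self._project = project  # no client is created here

    def load(self) -> str:
        # The connection is created only when the data is accessed,
        # so invalid connection details would surface here, not earlier.
        client = FakeClient(self._project)
        return client.query()


dataset = LazyDataset("my-project")

# The dataset round-trips through pickle because it holds no client.
restored = pickle.loads(pickle.dumps(dataset))
result = restored.load()

# By contrast, the client itself cannot be pickled.
try:
    pickle.dumps(FakeClient("my-project"))
    client_picklable = True
except TypeError:
    client_picklable = False
```

The trade-off is the one the release note names: configuration errors are detected later, at first access, rather than when the dataset object is constructed.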

Breaking changes

  • Demoted video.VideoDataset from core to experimental dataset.
  • Removed file handling capabilities from ibis.TableDataset. Use ibis.FileDataset to load and save files with an Ibis backend instead.

Community contributions

Many thanks to the following Kedroids for contributing PRs to this release:

kedro-telemetry-0.6.2

27 Nov 15:00
27954cd

  • Removed support for Python 3.8.
  • Added support for Python 3.13.

Thanks for supporting contributions