Summary

6+ years ago, #1243 was merged, adding "experimental" support for the Hadoop Distributed File System ("HDFS") to LightGBM. Specifically, that meant adding the ability to provide an HDFS URI to Dataset creation methods that accept files, and have LightGBM read the file from HDFS.

Since then:

- we've received very, very few issues / PRs related to that support
- 0 tests have ever been run on that build of the library
- I don't believe SynapseML depends on the HDFS build of LightGBM (GitHub search); hopefully @imatiach-msft or @mhamilton723 can confirm
- searching the rest of GitHub (GitHub search), I just see other forks of LightGBM and two recipes from repackagers

I'm all for this change; simplification is always good, and if we have neither tests nor any indication that anyone has used this (extensively), that's even more of an argument 👍

Proposal

I'm requesting comment on the following proposal:

- remove HDFS support from LightGBM, including the USE_HDFS build option
- do all of this only after issuing build-time and run-time deprecation warnings for 1 release
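For context on the footprint of this support: it essentially amounts to routing Dataset file paths by URI scheme inside the file-reading code. A minimal Python sketch of that dispatch (the helper name is hypothetical; the real check lives in LightGBM's C++ file-reading layer):

```python
def is_hdfs_path(path: str) -> bool:
    """Hypothetical helper mirroring the scheme check in LightGBM's
    C++ file-reading code: paths starting with "hdfs://" were routed
    to the HDFS client, and everything else was opened as a local file."""
    return path.startswith("hdfs://")

# The kind of URI the HDFS-enabled build accepted for Dataset creation:
print(is_hdfs_path("hdfs://namenode:9000/data/train.csv"))  # True
print(is_hdfs_path("data/train.csv"))                       # False
```

Removing the HDFS branch leaves only the local-file path, which is what makes the file-reading code simpler to maintain.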
Motivation
- Simplifies the project's public interface (build system options, installation docs, etc.).
- Simplifies file-reading C++ code, making it easier to understand and refactor in the future.
- Here in 2024, users have other and better options for using LightGBM distributed training on data in HDFS, like:
  - lightgbm.dask (Dask docs on Hadoop)
  - SynapseML (https://github.com/microsoft/SynapseML)
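For reference, the build-time switch the proposal would retire is the USE_HDFS CMake option; the HDFS-enabled build was produced roughly like this (a historical sketch — exact flags may have varied across LightGBM versions):

```shell
# Sketch of the historical build invocation: USE_HDFS is the CMake
# option that enabled the experimental HDFS reader.
mkdir build && cd build
cmake -DUSE_HDFS=ON ..
make -j4
```

Dropping the option means one fewer build variant to document, package, and (in principle) test.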