Releases: bpolaszek/bentools-etl

4.0-alpha8

13 Nov 12:00
3460279
Pre-release

What's Changed

Full Changelog: 4.0-alpha7...4.0-alpha8

4.0-alpha7

12 Nov 14:22
2adb322
Pre-release

What's Changed

Full Changelog: 4.0-alpha6...4.0-alpha7

4.0-alpha6

10 Nov 19:54
c0998ed
Pre-release

What's Changed

Full Changelog: 4.0-alpha5...4.0-alpha6

4.0-alpha5

10 Nov 11:39
b8df541
Pre-release

What's Changed

Full Changelog: 4.0-alpha4...4.0-alpha5

4.0-alpha4

08 Nov 16:19
b2525ae
Pre-release

What's Changed

  • Feat: Allow context to be initialized by user by @bpolaszek in #16
  • Refactor: Rename recipe main method by @bpolaszek in #17
  • Tests: Improve coverage by @bpolaszek in #18
  • Feat: chain transformers, chain loaders, conditional loaders by @bpolaszek in #19
  • Fix: Silently chain transformers / loaders with the EtlBuilder by @bpolaszek in #20
  • Refactor: Fix typo in root namespace 😅 by @bpolaszek in #21

Full Changelog: 4.0-alpha3...4.0-alpha4

4.0-alpha3

06 Nov 14:03
2e5f893
Pre-release

What's Changed

  • Feat: Transformer can now return single values for a better DX by @bpolaszek in #12
  • Fix: Improve object cloning performance by @bpolaszek in #14
  • Fix: Clones not being accurate by @bpolaszek in #15
  • Refactor: use holding objects instead of passing EtlState by reference by @bpolaszek in #13

Full Changelog: 4.0-alpha2...4.0-alpha3

4.0-alpha2

30 Oct 17:30
4ccf009
Pre-release

What's Changed

Full Changelog: 4.0-alpha1...4.0-alpha2

Version 4.0 is on its way!

26 Oct 08:25
9608d06
Pre-release

Hey folks! 👋

It's been more than 4 years since version 3 of bentools/etl was drafted. It never left alpha stability, mostly for lack of time but also, I have to admit, because of uncertainties about the design directions I had taken.

Introducing bentools/etl v4

PHP 8 and a lot of projects on my side came in between. I recently needed this library again, and I wanted to keep the good ideas of v3 while removing the bad ones.

So I decided that a stable v3 will never see the light of day. Because lots of classes have been renamed and most of them have become immutable, here's a brand new v4.

What's new?

  • This version requires PHP 8.2 or later, is 100% covered by tests (this wasn't the case before), and uses PHPStan at its highest level to enforce type consistency. A GitHub Actions CI has also been set up.

  • It introduces a new EtlState object, which is instantiated at the beginning of the ETL process and passed through the different steps and event listeners. The EtlExecutor (previously the Etl class) is now immutable: it only holds the Extractor, Transformer and Loader objects, fires events, and exposes the current state to you through the EtlState.

  • The EtlState is mostly readonly, but you can still call $state->skip() to skip items, $state->stop() to stop the process, $state->flush() to request an early flush, and you can use the $state->context array to pass arbitrary data between the different steps and events during the whole workflow.
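The control flow behind skip() and stop() can be pictured with a small standalone sketch. Everything in it is hypothetical: ToyState and runToyEtl() are stand-ins written for illustration, they are not the library's EtlState or EtlExecutor, and the real implementation may signal skip/stop differently. It only shows the idea of one state object that travels with each item and can veto or halt the run.

```php
<?php

declare(strict_types=1);

// Hypothetical stand-in for a per-run state object (NOT the library's EtlState).
final class ToyState
{
    /** @var array<string, mixed> Arbitrary data shared between steps. */
    public array $context = [];
    public bool $skipCurrent = false;
    public bool $stopped = false;

    // Ask the runner to drop the current item.
    public function skip(): void
    {
        $this->skipCurrent = true;
    }

    // Ask the runner to end the whole run.
    public function stop(): void
    {
        $this->stopped = true;
    }
}

/**
 * Hypothetical runner: calls the listener for each item, then honours the flags.
 *
 * @param iterable<array<string, string>> $items
 * @return list<array<string, string>>
 */
function runToyEtl(iterable $items, callable $onExtract, ToyState $state): array
{
    $loaded = [];
    foreach ($items as $item) {
        $state->skipCurrent = false; // the skip flag only applies to one item
        $onExtract($item, $state);
        if ($state->stopped) {
            break;
        }
        if ($state->skipCurrent) {
            continue;
        }
        $loaded[] = $item;
    }

    return $loaded;
}

$state = new ToyState();
$state->context['skipped'] = 0;

$cities = [
    ['name' => 'New York', 'country' => 'US'],
    ['name' => 'Tokyo', 'country' => 'JP'],
    ['name' => 'STOP', 'country' => ''],
    ['name' => 'Los Angeles', 'country' => 'US'],
];

$loaded = runToyEtl($cities, function (array $city, ToyState $state): void {
    if ($city['name'] === 'STOP') {
        $state->stop(); // nothing after this item is processed
        return;
    }
    if ($city['country'] !== 'US') {
        $state->skip();
        $state->context['skipped']++; // context carries data across items
    }
}, $state);

echo json_encode(array_column($loaded, 'name')), PHP_EOL; // ["New York"]
echo $state->context['skipped'], PHP_EOL;                 // 1
```

In this sketch the flags are plain booleans checked by the loop; that choice is only for readability and says nothing about how the library itself propagates these requests.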

How does it work?

Here's an example of the new API:

```csv
city_english_name,city_local_name,country_iso_code,continent,population
"New York","New York",US,"North America",8537673
"Los Angeles","Los Angeles",US,"North America",39776830
Tokyo,東京,JP,Asia,13929286
...
```

```php
use Bentools\ETL\EtlConfiguration;
use Bentools\ETL\EtlExecutor;
use Bentools\ETL\EventDispatcher\Event\LoadEvent;
use Bentools\ETL\Extractor\CSVExtractor;
use Bentools\ETL\Loader\JSONLoader;
use Bentools\ETL\Recipe\LoggerRecipe;
use Monolog\Logger;

$etl = (new EtlExecutor(options: new EtlConfiguration(flushEvery: 100)))
    ->extractFrom(new CSVExtractor(options: ['columns' => 'auto']))
    ->transformWith(function (array $city) {
        $city['slug'] = strtr(strtolower($city['city_english_name']), [' ' => '-']);
        yield $city;
    })
    ->loadInto(new JSONLoader())
    ->onLoad(fn (LoadEvent $event) => print("Loading city `{$event->item['slug']}`".PHP_EOL))
    ->withRecipe(new LoggerRecipe(new Logger('etl-logs')));

$report = $etl->process(
    source: 'file:///tmp/cities.csv',
    destination: 'file:///tmp/cities.json',
);

var_dump($report->output); // file:///tmp/cities.json
```

```json
[
    {
        "city_english_name": "New York",
        "city_local_name": "New York",
        "country_iso_code": "US",
        "continent": "North America",
        "population": 8537673,
        "slug": "new-york"
    },
    {
        "city_english_name": "Los Angeles",
        "city_local_name": "Los Angeles",
        "country_iso_code": "US",
        "continent": "North America",
        "population": 39776830,
        "slug": "los-angeles"
    },
    {
        "city_english_name": "Tokyo",
        "city_local_name": "東京",
        "country_iso_code": "JP",
        "continent": "Asia",
        "population": 13929286,
        "slug": "tokyo"
    }
]
```

I hope you'll enjoy this release as much as I enjoyed coding it! 😃

3.0-alpha.3

24 Sep 15:25
Pre-release
  • Etl::process() can now have no args
  • Refactored some constructors
  • Add early flush feature (loader can now flush on demand from ETL or from the event system, and will know if it's a partial flush or a full flush)
  • Handled Doctrine namespace change
  • PHP 7.4 support (sorry, we're late)

Init loader hook

08 Apr 10:57
Pre-release
  • Signature changes on some extractors / loaders (CSV, JSON).
  • It's now possible to pass arbitrary arguments to LoaderInterface::init(), which are processed just before the first item is loaded. These arbitrary, optional arguments are now part of the Etl::process() signature. This allows a single loader to have multiple options and/or targets at runtime, and to reset its state at each ETL process.
  • It's possible to hook on the loader.init event with EtlBuilder::onLoaderInit().