Releases: hitsz-ids/synthetic-data-generator

0.2.4

03 Dec 02:44
6c9ecd1

What's Changed

  • Docs - Update README News and fix the link of the Python package badge, by @jalr4ever in #243
  • Bugfix - Fix datetime formatter on small datasets and improve performance, by @cyantangerine in #244
  • Bugfix - Fixed numeric inspector error for int32/float32 types, by @cyantangerine in #247
  • Feature - Support more encoders, NormalizedFrequencyEncoder & NormalizedLabelEncoder, by @cyantangerine in #247
  • Feature - Integrate GaussianCopula model into the Synthesizer. by @jalr4ever in #241
  • Feature - Support DataFrameConnector for in-memory datasets processing, by @cyantangerine in #247
  • Feature - Support NormalizedFrequencyEncoder and NormalizedLabelEncoder for categorical encoding, by @cyantangerine in #247
  • Enhancement - Support CTGAN sample with drop_more parameter for better generation efficiency, by @cyantangerine in #247
  • Enhancement - Improved Disk_cache performance by avoiding iterative pandas (pd) concatenation, by @cyantangerine in #247
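
For readers unfamiliar with the idea behind the two new categorical encoders, here is a minimal pure-Python sketch of what "normalized label" and "normalized frequency" encoding commonly mean. This is an assumption about the general technique, not the sdgx implementation: the function names `normalized_label_encode` and `normalized_frequency_encode` are hypothetical, and the actual NormalizedLabelEncoder and NormalizedFrequencyEncoder classes may behave differently.

```python
from collections import Counter


def normalized_label_encode(values):
    """Hypothetical helper: map each category to its sorted-label index scaled into [0, 1]."""
    labels = sorted(set(values))
    if len(labels) == 1:
        # A single category has no spread, so every entry maps to 0.0.
        return [0.0 for _ in values], labels
    scale = len(labels) - 1
    index = {label: i / scale for i, label in enumerate(labels)}
    return [index[v] for v in values], labels


def normalized_frequency_encode(values):
    """Hypothetical helper: map each category to its relative frequency in the column."""
    counts = Counter(values)
    total = len(values)
    return [counts[v] / total for v in values]


if __name__ == "__main__":
    column = ["b", "a", "c", "a"]
    print(normalized_label_encode(column))
    print(normalized_frequency_encode(column))
```

Both helpers return values in [0, 1], which is the property that usually makes such encodings convenient as model inputs.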

Full Changelog: 0.2.3...0.2.4

0.2.3

18 Nov 09:31
4dd3dcc

What's Changed

  • Enhance - Handling Fixed Column Relationships using FixedCombinationInspector and FixedCombinationTransformer by @MooooCat @jalr4ever in #219
  • BugFix - Fix the type error in the query function of Metadata. by @jalr4ever in #235
  • Enhance - Handling fixed column relationships by specific_combinations and SpecificCombinationTransformer. by @jalr4ever in #236
  • chore: Drop python 3.8 support and improve ci file name by @Wh1isper in #237

Full Changelog: 0.2.2...0.2.3

0.2.2

08 Nov 02:18
8eb395b

What's Changed

  • Feature: Add progressbar for CTGAN when fitting and sampling by @cyantangerine in #228
  • Enhance: Check the type of foreign key by @Z712023 in #229
  • BugFix: Parallel Data Processing by @cyantangerine in #227
  • Enhance: Improved CONTRIBUTING Docs with 4+1 view and Overview Diagram by @jalr4ever in #226
  • BugFix: Regulate positive-negative values in the generated data by @jalr4ever in #232
  • Enhance: Tenfold performance boost by reducing the memory usage of Gaussian Copula training by @jalr4ever in #233

Full Changelog: 0.2.1...0.2.2

0.2.1

11 Oct 01:57
00685a3

What's Changed

  • Add CHN address inspector by @MooooCat in #158
  • Update inspector part in Doc(API Reference) by @MooooCat in #159
  • Add dotenv in single-table gpt model by @MooooCat in #161
  • Speed up regex inspector, Add chn/eng name inspectors by @MooooCat in #162
  • Add single table metadata example by @MooooCat in #166
  • bugfix: SingleTableGPTModel._sample_with_data "has no attribute 'result'" by @aaronrmm in #174
  • Change Metadata.column_list from Set to List by @MooooCat in #176
  • Remove unnecessary dependency torchvision by @Guo-Yunzhe in #177
  • Update pyproject.toml (joblib version) by @MooooCat in #175
  • Bugfix: fix Gaussian Copula segmentation fault error by @MooooCat in #180
  • Bugfix: fix division by zero error in numeric inspector, add comments by @MooooCat in #181
  • Intro data processor in sdgx by @MooooCat in #171
  • Intro data processor in Readme by @MooooCat in #182
  • Fix View GFI Link in Readme by @MooooCat in #183
  • Fix precision problem in metric's testcases by @MooooCat in #185
  • Use GLM-4 by @TracyWang95 in #188
  • Pin numpy<2 by @Wh1isper in #190
  • Feature: Add Email Generator (a new type of sdgx.data_processor) by @MooooCat in #184
  • Add ChnPiiGenerator and Enhance Models by @MooooCat in #191
  • Update documentation and docstrings for DataProcessors by @MooooCat in #186
  • Add live QR code by @MooooCat in #198
  • Enhance Data Handling with Empty Column Inspector and Transformer by @MooooCat in #197
  • Update NonValueTransformer's Default Setting and Handle Custom Fill Values by @MooooCat in #199
  • Enhance Chinese Name Inspector by @MooooCat in #200
  • Add Chinese Company Name Support and Inspector by @MooooCat in #201
  • Update Live QR Code Image by @MooooCat in #203
  • BugFix: base_url not included when request to gpt in SingleTableGPTModel by @jalr4ever in #205
  • Enhance: Fix Data Quality with Outlier Handling and Improved Missing Value Treatment by @MooooCat in #207
  • Typo Fix: Unified Logger Usage by @MooooCat in #209
  • Update Live QR Code Image 0730 by @MooooCat in #210
  • Bugfix: Update Fit Methods in Data Processors by @MooooCat in #211
  • Add ConstInspector and ConstValueTransformer for Handling Constant Columns by @MooooCat in #202
  • Enhance: Add NonValueTransformer Reverse Conversion with NAN_VALUE Replacement by @MooooCat in #212
  • Maintenance: Update CTGAN Example to Use Latest SDG by @MooooCat in #213
  • Fix Minor Typo by @MooooCat in #216
  • Enhance Numeric Data Inspection and Introduce Positive/Negative Filtering by @MooooCat in #217
  • Fix Division by Zero Error in Numeric Column Inspection by @MooooCat in #220
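
Two entries in this list (#181 and #220) fix division-by-zero errors in numeric column inspection. As an illustration of the kind of guard such a fix usually involves, here is a minimal sketch. It is not the sdgx code: the function `numeric_ratio` is hypothetical and only shows why an empty column needs an explicit check before computing a ratio.

```python
def numeric_ratio(values):
    """Hypothetical helper: share of entries that parse as numbers, 0.0 for an empty column."""
    total = len(values)
    if total == 0:
        # Without this guard, the division below would raise ZeroDivisionError.
        return 0.0
    numeric = 0
    for v in values:
        try:
            float(v)
            numeric += 1
        except (TypeError, ValueError):
            # None and non-numeric strings are simply not counted.
            pass
    return numeric / total


if __name__ == "__main__":
    print(numeric_ratio([]))
    print(numeric_ratio(["1", "2.5", "x", None]))
```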

Full Changelog: 0.2.0...0.2.1

0.2.0

01 Mar 03:31
e775104

What's Changed

LLM-Based SingleTable Model

This release includes an LLM-based single-table data synthesis model; a Colab example is available.

Improvements on Inspectors

  • Add Regex Inspector and Email Inspector example. by @MooooCat in #115
  • Implement datetime_formats in DatetimeInspector by @Femi-lawal in #125
  • Distinguish int/float in NumericInspector by @MooooCat in #133

Metadata

  • Bugfix: fix KeyError when metadata raises a MetadataInvalidError. by @MooooCat in #134
  • Add dict support on metadata, optimize datetime format judgment rules, add eq for combiner by @MooooCat in #135

Python 3.12 Support

Full Changelog: 0.1.5...0.2.0

0.1.5

22 Jan 08:10
60f6323

What's Changed

Full Changelog: 0.1.4...0.1.5

0.1.4

16 Jan 13:44
dae869e

What's Changed

  • [Bugfix] Add future annotations by @MooooCat in #106
  • Add testing for JSD metrics by @sjh120 in #100
  • Add base model for multi-table statistic model, change single-table base class location by @MooooCat in #102
  • Add mutual information metric by @Z712023 in #101

Full Changelog: 0.1.3...0.1.4

0.1.3

08 Jan 06:46
9479c1b

What's Changed

Full Changelog: 0.1.2...0.1.3

0.1.2

23 Dec 02:30
4ea93b0

What's Changed

Full Changelog: 0.1.1...0.1.2

0.1.1

21 Dec 05:32

What's Changed

  • Add more testing, fix some bugs, drop mem cache by @Wh1isper in #85

Full Changelog: 0.1.0...0.1.1