Commit

Merge branch 'api'
mohayemin committed Aug 3, 2024
2 parents 4538c2a + 80b77cb commit ac963fb
Showing 146 changed files with 176 additions and 13,534 deletions.
4 changes: 4 additions & 0 deletions .gitignore
@@ -0,0 +1,4 @@
secrets.json
dist/
*.egg-info/
.venv/
File renamed without changes.
22 changes: 22 additions & 0 deletions .idea/PyMigBench.iml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.


7 changes: 7 additions & 0 deletions .idea/misc.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion api/.idea/modules.xml → .idea/modules.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

2 changes: 1 addition & 1 deletion api/.idea/vcs.xml → .idea/vcs.xml

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 3 additions & 0 deletions MANIFEST.in
@@ -0,0 +1,3 @@
exclude pymigbench_tests/*
exclude data/*
include version
51 changes: 48 additions & 3 deletions README.md
@@ -1,7 +1,8 @@
PyMigBench is a benchmark of Python Library Migrations.
This repository contains the data and code for the dataset.
This repository contains the data and the code of the library that can be used to access the dataset.

## PyMigBench v2
## Dataset
### PyMigBench v2
The current version, PyMigBench-2.0, includes 3,096 migration-related code changes from 335 migrations between 141 analogous library pairs.
This includes all migrations from [PyMigBench v1](#pymigbench-v1) and additional migrations borrowed from the [SALM dataset](https://ieeexplore.ieee.org/document/10123560).
The data also includes additional information per migration-related code change compared to v1.
@@ -15,7 +16,7 @@ Use either of these links to reproduce the paper.
We may update this repository to correct mistakes or add more data, so it may not stay in sync with the paper.
For the latest data, use the [latest release](https://github.com/ualberta-smr/PyMigBench/releases/latest) in this repository.

## PyMigBench v1
### PyMigBench v1
We recommend using PyMigBench v2 for any new research.
However, if you want to use the v1 dataset, see [Release 1.0.3](https://github.com/ualberta-smr/PyMigBench/releases/v1.0.3).
Cite the paper below if you use the v1 dataset.
@@ -34,6 +35,50 @@ Cite the paper below if you use the v1 dataset.
```


## Library

### Installation
The library and the dataset should be at the same version to be compatible.
To install the library, run:
```bash
pip install pymigbench==<version>
```
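
Since the library and dataset versions need to match, it can help to confirm which library version is installed before loading any data. The following is a minimal sketch that uses only the standard library's `importlib.metadata`; the helper name `installed_version` is an illustrative assumption and is not part of pymigbench.

```python
from importlib.metadata import PackageNotFoundError, version
from typing import Optional


def installed_version(package: str = "pymigbench") -> Optional[str]:
    """Return the installed version of `package`, or None if it is not installed."""
    try:
        return version(package)
    except PackageNotFoundError:
        # The distribution is not installed in the current environment.
        return None
```

Calling `installed_version()` and comparing the result with the dataset release tag you downloaded is one way to catch a mismatch early.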

### Basic usage
To use the library, you need to have the dataset downloaded.
You can download the dataset from the [GitHub repository](https://github.com/ualberta-smr/pymigbench).

```python
from pymigbench.database import Database
from pathlib import Path

yaml_root = Path('repo-root/migration/')

db = Database.load_from_dir(yaml_root) # Load the dataset from the directory
migs = db.migs() # Get all the migrations
```
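
Before calling `Database.load_from_dir`, it may be useful to check that the directory exists and actually contains migration files. The sketch below uses only `pathlib`; the helper name `find_yaml_files` is hypothetical, and it assumes the migration files use the `.yaml` extension and may sit in nested folders.

```python
from pathlib import Path
from typing import List


def find_yaml_files(yaml_root: Path) -> List[Path]:
    """Return all .yaml files under yaml_root (recursively), sorted by path."""
    if not yaml_root.is_dir():
        raise FileNotFoundError(f"Dataset directory not found: {yaml_root}")
    # rglob also matches files directly inside yaml_root, not only in subfolders.
    return sorted(yaml_root.rglob("*.yaml"))
```

An empty result usually means `yaml_root` points at the wrong folder of the downloaded dataset.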

### The constants
There are several enums to help you work with the dataset.
They are all in the `pymigbench.constants` module. Example:
```python
from pymigbench.constants import ProgramElement
```

### The migration-related objects
There are three main classes to encapsulate the data: `Migration`, `MigrationFile`, and `CodeChange`.

`Migration` is the top-level class representing a single migration, i.e., one yaml file.
`Migration` has a list of `MigrationFile` objects, which represent the files that were changed in the migration.
`MigrationFile` has a list of `CodeChange` objects, each of which represents a single migration-related code change.
Each of these model classes has an `id()` method that returns a unique identifier for the object across the full dataset.
A `CodeChange` object additionally has an `index` property and an `id_in_file()` method, whose values are unique only within the containing file.
Each of the classes has some additional helper methods.
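
The nesting described above can be illustrated with stand-in classes. This is only a sketch of the containment hierarchy and of the difference between a dataset-wide `id()` and a per-file `index`; the `*Stub` classes, the attribute names `files` and `code_changes`, and the identifier format are all assumptions made for illustration, not the library's actual API.

```python
from dataclasses import dataclass, field
from typing import Iterable, Iterator, List


@dataclass
class ChangeStub:
    """Stand-in for CodeChange: one migration-related code change."""
    file_id: str
    index: int  # unique only within the containing file

    def id_in_file(self) -> str:
        return str(self.index)

    def id(self) -> str:
        # Assumed format: container id plus in-file id gives a dataset-wide id.
        return f"{self.file_id}#{self.id_in_file()}"


@dataclass
class FileStub:
    """Stand-in for MigrationFile: one file changed in a migration."""
    mig_id: str
    path: str
    code_changes: List[ChangeStub] = field(default_factory=list)

    def id(self) -> str:
        return f"{self.mig_id}/{self.path}"


@dataclass
class MigrationStub:
    """Stand-in for Migration: one migration, i.e., one yaml file."""
    name: str
    files: List[FileStub] = field(default_factory=list)

    def id(self) -> str:
        return self.name


def iter_code_changes(migrations: Iterable[MigrationStub]) -> Iterator[ChangeStub]:
    """Walk migration -> file -> code change and yield every code change."""
    for mig in migrations:
        for mig_file in mig.files:
            yield from mig_file.code_changes


# Build a tiny example: one migration with two files.
mig = MigrationStub("example-migration")
f1 = FileStub(mig.id(), "app/main.py")
f1.code_changes = [ChangeStub(f1.id(), 1), ChangeStub(f1.id(), 2)]
f2 = FileStub(mig.id(), "app/util.py")
f2.code_changes = [ChangeStub(f2.id(), 1)]
mig.files = [f1, f2]

all_ids = [change.id() for change in iter_code_changes([mig])]
```

In this example two code changes share `index` 1 because they live in different files, while their `id()` values stay distinct across the whole collection.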





## Contributors
- [Mohayeminul Islam](https://mohayemin.github.io/)
- [Ajay Kumar Jha](https://hifromajay.github.io/)
4 changes: 0 additions & 4 deletions api/.gitignore

This file was deleted.

1 change: 0 additions & 1 deletion api/.idea/.name

This file was deleted.

10 changes: 0 additions & 10 deletions api/.idea/misc.xml

This file was deleted.

11 changes: 0 additions & 11 deletions api/.idea/pymigbench.api.iml

This file was deleted.

3 changes: 0 additions & 3 deletions api/MANIFEST.in

This file was deleted.

7 changes: 0 additions & 7 deletions api/build.sh

This file was deleted.

1 change: 0 additions & 1 deletion api/publish.sh

This file was deleted.

138 changes: 0 additions & 138 deletions code/.gitignore

This file was deleted.

21 changes: 0 additions & 21 deletions code/LICENSE

This file was deleted.

6 changes: 0 additions & 6 deletions code/configs/config.yaml

This file was deleted.

Empty file removed code/pymigstat/__init__.py
Empty file.