Merge branch 'msr-2023-datatrack'
mohayemin committed Jan 24, 2023
2 parents 983186a + e15c5d9 commit 3ae26fd
Showing 609 changed files with 1,090 additions and 6,008 deletions.
4 changes: 2 additions & 2 deletions .github/ISSUE_TEMPLATE/query-request.md
@@ -8,12 +8,12 @@ assignees: mohayemin
 ---

 **A one-line English description of the query.**
-Ex. List all code changes having one-to-many function call replacement.
+Ex. List all migrations having code changes.
 This should be the same as what you described in the issue title.

 **(Optional) if the one-line description seems insufficient, describe the query further.**

 **Proposed query syntax**
 ```bash
-python pymigbench.py list -dt cc -f program_element="function call" cardinality="1-n"
+python pymigbench.py list -dt mg -f code_changes=<>
 ```
7 changes: 2 additions & 5 deletions CONTRIBUTING.md
@@ -5,12 +5,9 @@ Please follow this document to find the best way you can contribute.
 ## Submit new migration data
 We will be happy to include manually validated Python library migration data in PyMigBench.
 Please [submit a data request](https://github.com/ualberta-smr/PyMigBench/issues/new?template=data-request.md) and attach your dataset to the issue.
-We currently support three types of data: library pairs, migrations and code changes.
-You can submit only library pairs or library pairs and migrations or all three types of data.
+We currently support two types of data: library pairs and migrations.
 The data should be in YAML format as described in our [dataset page](https://ualberta-smr.github.io/PyMigBench/dataset).
-Alternatively, you can submit one CSV file per dataset, where each row represents one item.
-The header should be same as the properties in the YAML files.
-Please leave the IDs blank.
+Please leave the IDs blank as we will assign them.

 We will review your data and add it to the benchmark, or contact you if there are any issues.

13 changes: 3 additions & 10 deletions README.md
@@ -1,12 +1,5 @@
 # PyMigBench
-Library migration is the process of replacing one library with another in a client project.
-_PyMigBench_ is a benchmark of Python Library Migration that we developed in the paper:
-> Mohayeminul Islam, Ajay Kumar Jha, Sarah Nadi.
-> PyMigBench and PyMigTax: A Benchmark and Taxonomy for Python Library Migration.
-> _Empirical Software Engineering (Under Review)_.
+PyMigBench is a benchmark of Python Library Migrations.
+This repository contains the dataset and the tool to access the dataset.
+This repository contains the benchmark data and the source code of the tool to explore the data. Please visit [the PyMigBench website](https://ualberta-smr.github.io/PyMigBench) to learn more about the dataset and the tool.

-Other than the benchmark, we also developed _PyMigTax_,
-a taxonomy of the migration related code changes that we include in PyMigBench.
-Please read the [preprint version of the paper](https://arxiv.org/abs/2207.01124) to learn more about PyMigBench and PyMigTax.
-
-This repository contains the benchmark data and the tools to explore the data. Please visit [the PyMigBench website](https://ualberta-smr.github.io/PyMigBench) to learn more about the dataset and the tool.
2 changes: 1 addition & 1 deletion code/core/Arguments.py
@@ -35,7 +35,7 @@ def build_arguments() -> Arguments:
     parser.add_argument("-d", "-dt", "--data-types", nargs='+',
                         help="The data types that you want to fetch. "
                              "Different queries accept different numbers of arguments.",
-                        choices=["all", "lp", "mg", "cc"])
+                        choices=["all", "lp", "mg"])
     parser.add_argument("-f", "--filters", required=False, nargs='+',
                         help="Additional filters. The format varies based on the query.")
     parser.add_argument("-o", "--output-format", required=False, default="yaml",
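
With `cc` dropped from `choices`, argparse itself should reject the old data type before any query code runs. The following is a minimal sketch of that behaviour, assuming a stripped-down parser that mirrors only the `--data-types` option from this hunk; `build_parser` and `accepts` are hypothetical helpers, not part of the project's `build_arguments`.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    # Hypothetical stand-in: mirrors only the --data-types option from the hunk.
    parser = argparse.ArgumentParser(prog="pymigbench-sketch")
    parser.add_argument("-d", "-dt", "--data-types", nargs="+",
                        choices=["all", "lp", "mg"])
    return parser


def accepts(data_types: list[str]) -> bool:
    """Return True if the parser accepts every given data type key."""
    try:
        build_parser().parse_args(["-dt", *data_types])
        return True
    except SystemExit:
        # argparse reports an invalid choice by printing usage and exiting.
        return False


if __name__ == "__main__":
    print(accepts(["lp", "mg"]))
    print(accepts(["cc"]))
```

Each value passed to an option with `nargs='+'` is checked against `choices`, so a single stale `cc` in a longer list is enough to fail the whole invocation.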
6 changes: 2 additions & 4 deletions code/core/Constants.py
@@ -1,11 +1,9 @@
 LibPairKey = "lp"
 MigrationKey = "mg"
-CodeChangeKey = "cc"

-DataTypeKeys = [LibPairKey, MigrationKey, CodeChangeKey]
+DataTypeKeys = [LibPairKey, MigrationKey]

 DataTypeName = {
     LibPairKey: "library pair",
-    MigrationKey: "migration",
-    CodeChangeKey: "code change"
+    MigrationKey: "migration"
 }
15 changes: 0 additions & 15 deletions code/db/CodeChange.py

This file was deleted.

6 changes: 1 addition & 5 deletions code/db/Db.py
@@ -4,15 +4,13 @@

 import yaml

-from core.Constants import CodeChangeKey, MigrationKey, LibPairKey
-from db.CodeChange import CodeChange
+from core.Constants import MigrationKey, LibPairKey
 from db.DataItem import DataItem
 from db.LibPair import LibPair
 from db.Migration import Migration


 class Db:
-    code_changes: dict[str, CodeChange]
     migrations: dict[str, Migration]
     lib_pairs: dict[str, LibPair]
     _mapping: dict[str, dict[str, DataItem]]
@@ -21,11 +19,9 @@ def __init__(self, data_root: str):
         self.data_root = data_root

     def load(self):
-        self.code_changes = self.load_items("codechange", CodeChange)
         self.migrations = self.load_items("migration", Migration)
         self.lib_pairs = self.load_items("libpair", LibPair)
         self._mapping = {
-            CodeChangeKey: self.code_changes,
             MigrationKey: self.migrations,
             LibPairKey: self.lib_pairs,
         }
6 changes: 6 additions & 0 deletions code/db/Migration.py
@@ -8,3 +8,9 @@ class Migration(DataItem):
     commit: str
     pair_id: str
     commit_message: str
+    code_changes: list
+
+
+class CodeChange:
+    filepath: str
+    lines: list[str]
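
This hunk moves code changes from standalone items to a list nested inside each migration. The sketch below shows one way such a nested record could be turned into objects. It is an illustration only: the dataclass versions of `Migration` and `CodeChange`, the `migration_from_dict` helper, and all sample values are assumptions, and the real `Migration` extends `DataItem` and has fields not visible in this diff.

```python
from dataclasses import dataclass, field


@dataclass
class CodeChange:
    filepath: str
    lines: list[str]


@dataclass
class Migration:
    # Hypothetical simplification: only the fields visible in the hunk above.
    commit: str
    pair_id: str
    commit_message: str
    code_changes: list[CodeChange] = field(default_factory=list)


def migration_from_dict(raw: dict) -> Migration:
    """Build a Migration and its nested CodeChange entries from a plain dict."""
    changes = [CodeChange(**cc) for cc in raw.get("code_changes", [])]
    return Migration(
        commit=raw["commit"],
        pair_id=raw["pair_id"],
        commit_message=raw["commit_message"],
        code_changes=changes,
    )


if __name__ == "__main__":
    # Made-up sample record, not taken from the dataset.
    sample = {
        "commit": "abc1234",
        "pair_id": "liba,libb",
        "commit_message": "switch from liba to libb",
        "code_changes": [{"filepath": "app/main.py", "lines": ["10", "12-14"]}],
    }
    print(migration_from_dict(sample))
```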
4 changes: 2 additions & 2 deletions code/format/JSONFormat.py
@@ -5,5 +5,5 @@


 class JSONFormat(OutputFormat):
-    def format(self, result: Result):
-        return json.dumps(result, indent=2, sort_keys=False, default=vars)
+    def format_impl(self, result: Result):
+        return json.dumps(result.items, indent=2, sort_keys=False, default=vars)
7 changes: 6 additions & 1 deletion code/format/OutputFormat.py
@@ -4,6 +4,11 @@


 class OutputFormat(ABC):
-    @abstractmethod
     def format(self, result: Result):
+        count = f"{result.count} items returned"
+        output = f"{count}\n\n{self.format_impl(result)}\n{count}\n"
+        return output
+
+    @abstractmethod
+    def format_impl(self, result: Result):
         pass
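
The change to `OutputFormat` reads as a template method: `format` becomes a concrete wrapper that prints the item count above and below the body, while subclasses supply only `format_impl`. A self-contained sketch of the pattern follows. The `Result` class here is a hypothetical stand-in with `items` and `count`, since the real class is not shown in this diff.

```python
import json
from abc import ABC, abstractmethod
from dataclasses import dataclass, field


@dataclass
class Result:
    # Hypothetical stand-in: the real Result class is not part of this diff.
    items: list = field(default_factory=list)

    @property
    def count(self) -> int:
        return len(self.items)


class OutputFormat(ABC):
    def format(self, result: Result) -> str:
        # Shared wrapper: item count before and after the formatted body.
        count = f"{result.count} items returned"
        return f"{count}\n\n{self.format_impl(result)}\n{count}\n"

    @abstractmethod
    def format_impl(self, result: Result) -> str:
        """Subclasses return only the serialized items."""


class JSONFormat(OutputFormat):
    def format_impl(self, result: Result) -> str:
        return json.dumps(result.items, indent=2, sort_keys=False, default=vars)


if __name__ == "__main__":
    print(JSONFormat().format(Result(items=[{"id": "1"}])))
```

Keeping the count in the base class means every output format reports it the same way, and a new format only has to serialize `result.items`.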
4 changes: 2 additions & 2 deletions code/format/YAMLFormat.py
@@ -6,5 +6,5 @@


 class YAMLFormat(OutputFormat):
-    def format(self, result: Result):
-        return yaml.safe_dump(to_dict(result), sort_keys=False)
+    def format_impl(self, result: Result):
+        return yaml.safe_dump(to_dict(result.items), sort_keys=False)
13 changes: 0 additions & 13 deletions data/codechange/100_1.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/101_1.yaml

This file was deleted.

12 changes: 0 additions & 12 deletions data/codechange/102_1.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/103_1.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/104_1.yaml

This file was deleted.

14 changes: 0 additions & 14 deletions data/codechange/105_1.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/106_1.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/107_1.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/107_2.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/108_1.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/109_1.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/10_1.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/10_10.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/10_2.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/10_3.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/10_4.yaml

This file was deleted.

13 changes: 0 additions & 13 deletions data/codechange/10_5.yaml

This file was deleted.
