Set of methods for source code decomposition.
$ git clone [email protected]:acheshkov/program_slicing.git
$ cd program_slicing
$ git submodule update --recursive --init
$ pip3 install ./program_slicing
You should have access to global network to use pip. Python 3.9 with corresponding C compiler is required. Run Python Console to check the version of C compiler.
This project can be used via Command Line Interface, or it can be included into any other Python project as a submodule.
slice
Use this command if you want to decompose source files by complete computation slice (Nikolaos Tsantalis and Alexander Chatzigeorgiou. 2011. Identification of extract method refactoring opportunities for the decomposition of methods).
$ python cli.py slice [-h]
[-o OUTPUT]
source
Positional arguments:
source - source folder, file or url
Optional arguments:
-o, --output OUTPUT - output file or directory: depending on what you set as output, you will get folder full of slice decompositions or a single file with it. It uses stdout if not specified
-h, --help - show this help message and exit
Examples:
$ python cli.py slice MyProjectPath
$ python cli.py slice MyFile.java
$ python cli.py slice MyProjectPath --output MyResultPath
$ python cli.py slice MyFile.java --output MyResultPath
Control Dependence Graph - structure that represents Control Dependence Graph (inherited from networkx.DiGraph) with corresponding methods.
from program_slicing.graph.cdg import ControlDependenceGraph
- add_entry_point - mark specified node as entry point.
- get_entry_points - return a set of nodes that where marked as an entry point.
Control Flow Graph - structure that represents Control Flow Graph (inherited from networkx.DiGraph) with corresponding methods.
from program_slicing.graph.cfg import ControlFlowGraph
- add_entry_point - mark specified node as entry point.
- get_entry_points - return a set of nodes that where marked as an entry point.
Data Dependence Graph - structure that represents Data Dependence Graph (inherited from networkx.DiGraph) with corresponding methods.
from program_slicing.graph.ddg import DataDependenceGraph
- add_entry_point - mark specified node as entry point.
- get_entry_points - return a set of nodes that where marked as an entry point.
Program Dependence Graph - structure that represents Program Dependence Graph (inherited from networkx.DiGraph) with corresponding methods.
from program_slicing.graph.pdg import ProgramDependenceGraph
- add_entry_point - mark specified node as entry point.
- get_entry_points - return a set of nodes that where marked as an entry point.
Statement - structure that represents Control Dependence Graph, Data Dependence Graph or Program Dependence Graph nodes.
from program_slicing.graph.statement import Statement
- statement_type - StatementType object.
- start_point - line and column numbers of the Statement's start.
- end_point - line and column numbers of the Statement's end.
- affected_by - set of strings with names of variables that may affect the current Statement.
- name - string with the name of the Statement. Not all Statements are named.
- ast_node_type - string with additional information about node (e.g. an AST root's class).
StatementType - structure that enumerates Statement types.
from program_slicing.graph.statement import StatementType
- FUNCTION - function declaration Statement.
- VARIABLE - variable declaration Statement.
- ASSIGNMENT - variable assignment Statement (such as
i = 0
,i += 1
,i++
, etc). - CALL - function call Statement.
- SCOPE - scope Statement (such as braces
{}
or empty body inif (...) a = 0
). - BRANCH - Statement that branches the flow (such as
if
,try
,catch
,switch
). - LOOP - branch Statement that generates backward flow (such as
for
andwhile
). - GOTO - Statement that lead flow to a concrete Statement
(such as
break
,continue
,throw
,return
and evenelse
). - UNKNOWN - all the other Statements.
- EXIT - special Statement that is placed in the end of function declaration, has zero size and all the 'return' and 'throw' nodes leads flow into it (as well as the last Statement in the flow).
Basic Block - structure that represents Control Flow Graph nodes.
from program_slicing.graph.basic_block import BasicBlock
- statements - get the content of the Basic Block, i.e a list of Statements.
- root - get the first Statement from the Basic Block. None if it is empty.
- append - add a specified Statement to the Basic Block.
- is_empty - return True if there are no statements in the Basic Block, otherwise - False.
- split - split content by the given index, left first part of the split at the original Basic Block and return a new Basic Block which contains all the rest Statements.
Program Graphs Manager - structure that contains different types of program graphs (such as Control Flow Graph or Control Dependence Graph) based on same source code and provides a set of methods for their analysis.
from program_slicing.graph.parse import Lang
from program_slicing.graph.parse import control_dependence_graph
from program_slicing.graph.parse import control_flow_graph
from program_slicing.graph.manager import ProgramGraphsManager
manager_by_source = ProgramGraphsManager(source_code, Lang.JAVA)
manager_by_cdg = ProgramGraphsManager.from_control_dependence_graph(control_dependence_graph(source_code, Lang.JAVA))
manager_by_cfg = ProgramGraphsManager.from_control_flow_graph(control_flow_graph(source_code, Lang.JAVA))
Properties:
- control_dependence_graph - return the Control Dependence Graph.
- control_flow_graph - return the Control Flow Graph.
- data_dependence_graph - return the Data Dependence Graph.
- program_dependence_graph - return the Program Dependence Graph.
- sorted_statements - return Statements that are sorted first by increasing of their
start_point
, then by decreasing of theirend_point
. - general_statements - return general Statements
(those which are not contained in any non
SCOPE
,BRANCH
,LOOP
,FUNCTION
orEXIT
Statement). - scope_statements - return all the
SCOPE
,BRANCH
,LOOP
orFUNCTION
Statements.
Public methods:
- get_basic_block - return a Basic Block (that is a node of the Control Flow Graph) that contains a given Statement.
- get_boundary_blocks - return a set of Basic Blocks which intersection of dominated and reach blocks contain the given one block.
- get_boundary_blocks_for_statement - return a set of boundary blocks for Basic Block in which the given Statement is placed.
- get_dominated_blocks return a set of Basic Blocks that are dominated by the given one (i.e. their Statements are placed in a control Dependence Graph subtree of the root of the given Basic Block).
- get_reach_blocks - return a set of Basic Blocks that are reachable from the given one in the Control Flow Graph.
- get_statement_line_numbers - return a set of line numbers in which the given Statement is placed.
- get_function_statement - return the minimal
FUNCTION
Statement in which the given Statement is placed. - get_function_statement_by_range - return the minimal FUNCTION Statement in which the given range is placed.
- get_scope_statement - return a minimal 'scope-like' (
SCOPE
,LOOP
,BRANCH
) Statement that contains a given Statement. - get_statements_in_scope - return all the Statements in the given scope Statement.
- get_statements_in_range - return all the Statements in the given range.
- get_exit_statements - return Statements that are Flow Dependence children of the given statements but not one of them.
- get_affecting_statements - return Statements from the given set of Statements that affect some Statement not form the given set.
- get_changed_variables_statements - return
VARIABLE
Statements that represent variables changed in the given set of Statements. - get_involved_variables_statements - return
VARIABLE
Statements that represent variables involved (including usage) in the given set of Statements. - contain_redundant_statements - check if the given set of Statements contain part of some construction not fully included in the given set.
Class methods:
- from_source_code - build all the graphs by a given source code string and a language description.
- from_control_dependence_graph - build all the graphs by a given Control Dependence Graph.
- from_control_flow_graph - build all the graphs by a given Control Flow Graph.
- from_data_dependence_graph - build all the graphs by a given Data Dependence Graph.
- from_program_dependence_graph - build all the graphs by a given Program Dependence Graph.
parse - set of functions that allow to build different graphs from the specified source code string and programming language specification.
- control_dependence_graph - parse a Control Dependence Graph:
from program_slicing.graph.cdg import ControlDependenceGraph
from program_slicing.graph.parse import control_dependence_graph, Lang
cdg: ControlDependenceGraph = control_dependence_graph(source_code, Lang.JAVA)
- control_flow_graph - parse a Control Flow Graph:
from program_slicing.graph.cfg import ControlFlowGraph
from program_slicing.graph.parse import control_flow_graph, Lang
cfg: ControlFlowGraph = control_flow_graph(source_code, Lang.JAVA)
- data_dependence_graph - parse a Data Dependence Graph:
from program_slicing.graph.ddg import DataDependenceGraph
from program_slicing.graph.parse import data_dependence_graph, Lang
ddg: DataDependenceGraph = data_dependence_graph(source_code, Lang.JAVA)
- program_dependence_graph - parse a Program Dependence Graph:
from program_slicing.graph.pdg import ProgramDependenceGraph
from program_slicing.graph.parse import program_dependence_graph, Lang
pdg: ProgramDependenceGraph = program_dependence_graph(source_code, Lang.JAVA)
- tree_sitter_ast - parse an Abstract Syntax Tree:
from tree_sitter import Tree
from program_slicing.graph.parse import tree_sitter_ast, Lang
ast: Tree = tree_sitter_ast(source_code, Lang.JAVA)
convert - there is also an option to convert one type of graph to another:
from program_slicing.graph import convert
from program_slicing.graph.cdg import ControlDependenceGraph
from program_slicing.graph.cfg import ControlFlowGraph
cdg: ControlDependenceGraph = ControlDependenceGraph()
cfg: ControlFlowGraph = convert.cdg.to_cfg(cdg)
new_cdg: ControlDependenceGraph = convert.cfg.to_cdg(cfg)