Skip to content

Commit

Permalink
updated doc, more tests, little refatoring in python interface
Browse files Browse the repository at this point in the history
  • Loading branch information
rikigigi committed Jan 24, 2022
1 parent 14ec6df commit 5a042d3
Show file tree
Hide file tree
Showing 75 changed files with 967 additions and 1,021 deletions.
53 changes: 37 additions & 16 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -69,10 +69,16 @@ analisi_traj = pyanalisi.Trajectory(
pos, #shape (ntimesteps, natoms, 3)
vel, #same shape as pos
types, #shape (natoms)
box, #shape (ntimesteps, 3,3) or (ntimesteps, 6)
True, # if True, box for a single frame is 3,3,
# if False is 6
False # don't apply pbc
box, #shape (ntimesteps, 3,3)
# or (ntimesteps, 6)
# or (ntimesteps, 9)
pyanalisi.BoxFormat.CellVectors,
# input box format is
# matrix of cell vectors
False, # don't wrap atoms around the center of
# the cell
False # don't save rotation matrix that is used
# internally if the cell is triclinic
)
#do the calculation that you want
Expand Down Expand Up @@ -266,22 +272,34 @@ Where meaningful, the following arguments are shared by all calculation performe
You can create a trajectory python object to be used in this library in two ways:
- start from python arrays generated by other code
- use a LAMMPS binary file that you have on the filesystem. This is the same file that is used by the command line interface.

to access the positions, velocities and cell data you can use in both the python buffer protocol interface and the lammps binary one the functions `get_positions_copy()`, `get_velocities_copy()` and `get_box_copy()`. Note that each call of these functions allocates more memory, as needed.

### Internal format used to store the cell information

Internally the format used for storing the cell information is:

`[x_low, y_low, z_low, lx/2, ly/2, lz/2, xy, xz, yz]`

and is accessible by using the python function `get_box_copy()` common to the trajectory objects. Note that internally the first cell vector is always in the same direction of the x axis, the second one is on the xy plane while the third has an arbitrary direction. This is equivalent to requiring that the cell matrix is triangular. If necessary, this is achieved by doing a QR decomposition of the input cell matrix and then rotating everything with the Q matrix.


#### using Python arrays: the buffer protocol interface
### using Python arrays: the buffer protocol interface


You must have 4 arrays. In general the interface supports any object that supports python's buffer protocol. This include, among others, numpy arrays. Suppose you have a trajectory of `t` timesteps and `n` atoms. You need the following arrays:
- position array, of shape `(t,n,3)`
- velocity array, of shape `(t,n,3)`
- cell array, of shape `(t,3,3)` only diagonal matrices (orthorombic cells) are supported at the moment, or if a lammps formated cell is provided `(t,6)` The lammps format simply list the low and high coordinates of the orthorombic cell, as in the header of each timestep that you find in the lammps trajetory format (shown in [command line interface](#command-line-interface) section )
- cell array, of shape `(t,3,3)` or if a lammps formated cell is provided `(t,6)` for orthorombic and `(t,9)` for triclinic. The lammps format simply list the low and high coordinates for each dimension and the tilts factors as described below
- types array, of shape `(n)` (integer array)

In general no particular units of measure are required, and the output will reflect the units that you used in the input. The calculations that the program does are reported exactly in the following sections. From those formulas you can deduce the units of the output given the units that you used in the input.

Then you must decide if you want that the coordinates are rewrapped inside the cell or not. At the moment only orthorombic cells are supported in all calculations but those that need only unwrapped coordinates, like MSD.
Then you must decide if you want that the coordinates are rewrapped around the center of the cell or not. This is not equivalent to wrapping the coordinates inside the cell for the triclinic case, but generates a compact list of positions suitable for an efficient calculation of the minimum image distance.

The lammps format for the cell is `[x_lo, x_hi, y_lo, y_hi, z_lo, z_hi, xy, xz, yz]`, as described in the [lammps documentation](https://docs.lammps.org/Howto_triclinic.html).

The lammps format for the cell is `[x_lo, x_hi, y_lo, y_hi, z_lo, z_hi]`: you have to provide only the coordinates boundaries.
If the plain cell vectors are provided in the cell matrix and this matrix is not diagonal, a QR decomposition is performed to get a triangular cell matrix and the achieve the desidered internal format. In this case all velocities and positions vector are rotated with the rotation matrix Q, that can be obtained with the function `get_rotation_matrix()`.

The syntax for creating the object is
```
Expand All @@ -290,12 +308,14 @@ analisi_traj = pyanalisi.Trajectory(positions_array,
velocity_array,
types_array,
box_array,
use_matrix_or_lammps_cell_format,
wrap_atomic_coordinates)
input_box_format_id,
wrap_atomic_coordinates,
save_Q_rotation_matrix
)
```
where `use_matrix_or_lammps_cell_format` is `True` if usual matrix format for the cell is given and `False` if a lammps formatted cell is provided and `wrap_atomic_coordinates` is `True` if you want to wrap all the atomic coordinates inside the cell.
where `input_box_format_id` is one of `pyanalisi.BoxFormat.CellVectors`, `pyanalisi.BoxFormat.LammpsOrtho`, `pyanalisi.BoxFormat.LammpsTriclinic` and describe the format of the array `box_array`. If `wrap_atomic_coordinates` is `True` the code will wrap all the atomic coordinates around the center of the cell (that does not mean inside the cell in the triclinc case). This makes the code for computing the minimum image distance more efficient if the atoms were far away out of the cell, but invalidates all the calculations were the atoms are not supposed to be wrapped back inside the cell.

You can write a LAMMPS bynary trajectory (that can be used by the command line interface with mpi, for example) by calling
You can write a LAMMPS binary trajectory (that can be used by the command line interface with mpi, for example) by calling
```
analisi_traj.write_lammps_binary('output_path', start_timestep, end_timestep)
```
Expand All @@ -313,7 +333,8 @@ where `start_timestep` is the first timestep index to dump (indexes start from 0
import pyanalisi
analisi_traj = pyanalisi.Trajectory(pos, vel, types, box,
True, # matrix format for the box array
False # don't wrap the coordinates
False, # don't wrap the coordinates
False # not interested in Q matrix
)
analisi_traj.write_lammps_binary("output_filename.bin"
, 0, # starting timestep
Expand All @@ -322,7 +343,7 @@ where `start_timestep` is the first timestep index to dump (indexes start from 0
```


#### LAMMPS binary trajectory interface
### LAMMPS binary trajectory interface


This interface is a little more complicated, since it was designed for computing block averages of very big files. It can read the same files that the command line program reads. The object is created with
Expand Down Expand Up @@ -370,7 +391,7 @@ The `header_array` object must describe the columns that are stored in the bidim
## MSD

Given a trajectory <img src="svgs/995aad7488ee6146d6b411759a5db052.svg?invert_in_darkmode" align=middle width=20.789844899999988pt height=27.91243950000002pt/> where <img src="svgs/0e45672d84ee21e1464939301cc46b43.svg?invert_in_darkmode" align=middle width=137.42503499999998pt height=24.65753399999998pt/> is the atomic index and <img src="svgs/4f4f4e395762a3af4575de74c019ebb5.svg?invert_in_darkmode" align=middle width=5.936097749999991pt height=20.221802699999984pt/> is the timestep index, defining the center of mass position of the atomic species <img src="svgs/36b5afebdba34564d884d347484ac0c7.svg?invert_in_darkmode" align=middle width=7.710416999999989pt height=21.68300969999999pt/> at the timestep <img src="svgs/4f4f4e395762a3af4575de74c019ebb5.svg?invert_in_darkmode" align=middle width=5.936097749999991pt height=20.221802699999984pt/> as
<p align="center"><img src="svgs/1069af11f835cf7996677118283e621d.svg?invert_in_darkmode" align=middle width=165.4146549pt height=51.449925449999995pt/></p>
<p align="center"><img src="svgs/397343bd70e7e236981916f727d5d2a0.svg?invert_in_darkmode" align=middle width=165.4146549pt height=45.002035649999996pt/></p>
where <img src="svgs/bf032e59612f5135c5425a8a0c07f2f7.svg?invert_in_darkmode" align=middle width=19.312276499999992pt height=22.465723500000017pt/> is the number of atoms of the specie <img src="svgs/36b5afebdba34564d884d347484ac0c7.svg?invert_in_darkmode" align=middle width=7.710416999999989pt height=21.68300969999999pt/>,
the code computes the following
<p align="center"><img src="svgs/fb6fd8b28a21787d6b119a3f059a9bd3.svg?invert_in_darkmode" align=middle width=429.54891045pt height=110.10997139999999pt/></p>
Expand Down Expand Up @@ -419,7 +440,7 @@ In general the object msd_calculation supports the buffer protocol interface, so

## Green-Kubo

Given <img src="svgs/fb97d38bcc19230b0acd442e17db879c.svg?invert_in_darkmode" align=middle width=17.73973739999999pt height=22.465723500000017pt/> vector time series of length <img src="svgs/f9c4988898e7f532b9f826a75014ed3c.svg?invert_in_darkmode" align=middle width=14.99998994999999pt height=22.465723500000017pt/> <img src="svgs/6ea8ff937f14aa2a3e0622a395dee889.svg?invert_in_darkmode" align=middle width=27.224196449999987pt height=22.55708729999998pt/>, <img src="svgs/e17d8a1279f0bfa5b92c35f05c242243.svg?invert_in_darkmode" align=middle width=101.57889059999998pt height=24.65753399999998pt/>, <img src="svgs/3ef98fe3db393644cede24e594d556d5.svg?invert_in_darkmode" align=middle width=90.34214144999999pt height=24.65753399999998pt/>,
Given <img src="svgs/fb97d38bcc19230b0acd442e17db879c.svg?invert_in_darkmode" align=middle width=17.73973739999999pt height=22.465723500000017pt/> vector time series of length <img src="svgs/f9c4988898e7f532b9f826a75014ed3c.svg?invert_in_darkmode" align=middle width=14.99998994999999pt height=22.465723500000017pt/> <img src="svgs/6ea8ff937f14aa2a3e0622a395dee889.svg?invert_in_darkmode" align=middle width=27.22419314999999pt height=22.55708729999998pt/>, <img src="svgs/e17d8a1279f0bfa5b92c35f05c242243.svg?invert_in_darkmode" align=middle width=101.57889059999998pt height=24.65753399999998pt/>, <img src="svgs/3ef98fe3db393644cede24e594d556d5.svg?invert_in_darkmode" align=middle width=90.34214144999999pt height=24.65753399999998pt/>,
implements an expression equivalent to the following formula:
<p align="center"><img src="svgs/168f1aff83bb74b9c5f70ae822348485.svg?invert_in_darkmode" align=middle width=251.32803135pt height=254.2893243pt/></p>
but with the trapezoidal rule in place of the sums marked with <img src="svgs/cdcac8939f3840cd8cddf40059a4cf58.svg?invert_in_darkmode" align=middle width=6.735194399999992pt height=22.63846199999998pt/>. Note that <img src="svgs/bcd81920696e26093495d90eb4cd7b1b.svg?invert_in_darkmode" align=middle width=16.153034699999992pt height=22.465723500000017pt/> is a matrix. To get the correct units of measure, you have still to multiply all the quantities but the <img src="svgs/9442d407594839d38f91ad3d2deb134e.svg?invert_in_darkmode" align=middle width=16.71464189999999pt height=22.465723500000017pt/>s by the integration timestep. <img src="svgs/8379d82223552fdbdfd98783823c755b.svg?invert_in_darkmode" align=middle width=33.56332814999999pt height=22.465723500000017pt/> is the number of timesteps on which the code runs the average.
Expand Down
Binary file modified README.pdf
Binary file not shown.
51 changes: 36 additions & 15 deletions README_.md
Original file line number Diff line number Diff line change
Expand Up @@ -69,10 +69,16 @@ analisi_traj = pyanalisi.Trajectory(
pos, #shape (ntimesteps, natoms, 3)
vel, #same shape as pos
types, #shape (natoms)
box, #shape (ntimesteps, 3,3) or (ntimesteps, 6)
True, # if True, box for a single frame is 3,3,
# if False is 6
False # don't apply pbc
box, #shape (ntimesteps, 3,3)
# or (ntimesteps, 6)
# or (ntimesteps, 9)
pyanalisi.BoxFormat.CellVectors,
# input box format is
# matrix of cell vectors
False, # don't wrap atoms around the center of
# the cell
False # don't save rotation matrix that is used
# internally if the cell is triclinic
)
#do the calculation that you want
Expand Down Expand Up @@ -266,22 +272,34 @@ Where meaningful, the following arguments are shared by all calculation performe
You can create a trajectory python object to be used in this library in two ways:
- start from python arrays generated by other code
- use a LAMMPS binary file that you have on the filesystem. This is the same file that is used by the command line interface.

to access the positions, velocities and cell data you can use in both the python buffer protocol interface and the lammps binary one the functions `get_positions_copy()`, `get_velocities_copy()` and `get_box_copy()`. Note that each call of these functions allocates more memory, as needed.

### Internal format used to store the cell information

Internally the format used for storing the cell information is:

`[x_low, y_low, z_low, lx/2, ly/2, lz/2, xy, xz, yz]`

and is accessible by using the python function `get_box_copy()` common to the trajectory objects. Note that internally the first cell vector is always in the same direction of the x axis, the second one is on the xy plane while the third has an arbitrary direction. This is equivalent to requiring that the cell matrix is triangular. If necessary, this is achieved by doing a QR decomposition of the input cell matrix and then rotating everything with the Q matrix.


#### using Python arrays: the buffer protocol interface
### using Python arrays: the buffer protocol interface


You must have 4 arrays. In general the interface supports any object that supports python's buffer protocol. This include, among others, numpy arrays. Suppose you have a trajectory of `t` timesteps and `n` atoms. You need the following arrays:
- position array, of shape `(t,n,3)`
- velocity array, of shape `(t,n,3)`
- cell array, of shape `(t,3,3)` only diagonal matrices (orthorombic cells) are supported at the moment, or if a lammps formated cell is provided `(t,6)` The lammps format simply list the low and high coordinates of the orthorombic cell, as in the header of each timestep that you find in the lammps trajetory format (shown in [command line interface](#command-line-interface) section )
- cell array, of shape `(t,3,3)` or if a lammps formated cell is provided `(t,6)` for orthorombic and `(t,9)` for triclinic. The lammps format simply list the low and high coordinates for each dimension and the tilts factors as described below
- types array, of shape `(n)` (integer array)

In general no particular units of measure are required, and the output will reflect the units that you used in the input. The calculations that the program does are reported exactly in the following sections. From those formulas you can deduce the units of the output given the units that you used in the input.

Then you must decide if you want that the coordinates are rewrapped inside the cell or not. At the moment only orthorombic cells are supported in all calculations but those that need only unwrapped coordinates, like MSD.
Then you must decide if you want that the coordinates are rewrapped around the center of the cell or not. This is not equivalent to wrapping the coordinates inside the cell for the triclinic case, but generates a compact list of positions suitable for an efficient calculation of the minimum image distance.

The lammps format for the cell is `[x_lo, x_hi, y_lo, y_hi, z_lo, z_hi, xy, xz, yz]`, as described in the [lammps documentation](https://docs.lammps.org/Howto_triclinic.html).

The lammps format for the cell is `[x_lo, x_hi, y_lo, y_hi, z_lo, z_hi]`: you have to provide only the coordinates boundaries.
If the plain cell vectors are provided in the cell matrix and this matrix is not diagonal, a QR decomposition is performed to get a triangular cell matrix and the achieve the desidered internal format. In this case all velocities and positions vector are rotated with the rotation matrix Q, that can be obtained with the function `get_rotation_matrix()`.

The syntax for creating the object is
```
Expand All @@ -290,12 +308,14 @@ analisi_traj = pyanalisi.Trajectory(positions_array,
velocity_array,
types_array,
box_array,
use_matrix_or_lammps_cell_format,
wrap_atomic_coordinates)
input_box_format_id,
wrap_atomic_coordinates,
save_Q_rotation_matrix
)
```
where `use_matrix_or_lammps_cell_format` is `True` if usual matrix format for the cell is given and `False` if a lammps formatted cell is provided and `wrap_atomic_coordinates` is `True` if you want to wrap all the atomic coordinates inside the cell.
where `input_box_format_id` is one of `pyanalisi.BoxFormat.CellVectors`, `pyanalisi.BoxFormat.LammpsOrtho`, `pyanalisi.BoxFormat.LammpsTriclinic` and describe the format of the array `box_array`. If `wrap_atomic_coordinates` is `True` the code will wrap all the atomic coordinates around the center of the cell (that does not mean inside the cell in the triclinc case). This makes the code for computing the minimum image distance more efficient if the atoms were far away out of the cell, but invalidates all the calculations were the atoms are not supposed to be wrapped back inside the cell.

You can write a LAMMPS bynary trajectory (that can be used by the command line interface with mpi, for example) by calling
You can write a LAMMPS binary trajectory (that can be used by the command line interface with mpi, for example) by calling
```
analisi_traj.write_lammps_binary('output_path', start_timestep, end_timestep)
```
Expand All @@ -313,7 +333,8 @@ where `start_timestep` is the first timestep index to dump (indexes start from 0
import pyanalisi
analisi_traj = pyanalisi.Trajectory(pos, vel, types, box,
True, # matrix format for the box array
False # don't wrap the coordinates
False, # don't wrap the coordinates
False # not interested in Q matrix
)
analisi_traj.write_lammps_binary("output_filename.bin"
, 0, # starting timestep
Expand All @@ -322,7 +343,7 @@ where `start_timestep` is the first timestep index to dump (indexes start from 0
```


#### LAMMPS binary trajectory interface
### LAMMPS binary trajectory interface


This interface is a little more complicated, since it was designed for computing block averages of very big files. It can read the same files that the command line program reads. The object is created with
Expand Down Expand Up @@ -371,7 +392,7 @@ The `header_array` object must describe the columns that are stored in the bidim

Given a trajectory $\bf ^ix_t$ where $i\in\{1,\dots,N_{atoms}\}$ is the atomic index and $t$ is the timestep index, defining the center of mass position of the atomic species $j$ at the timestep $t$ as
$$
^{j}cm_t=\frac{1}{N_{j}}\sum_{i|type(i)=j}^i{\bf x}_{t}
^{j}cm_t=\frac{1}{N_{j}}\sum_{i|type(i)=j}{\bf x}_{t}
$$
where $N_j$ is the number of atoms of the specie $j$,
the code computes the following
Expand Down
Loading

0 comments on commit 5a042d3

Please sign in to comment.