Commit

deploy: 434f923
PicoCentauri committed Oct 11, 2023
0 parents commit c5a203c
Showing 689 changed files with 103,643 additions and 0 deletions.
9 changes: 9 additions & 0 deletions .gitignore
Original file line number Diff line number Diff line change
@@ -0,0 +1,9 @@
target/
**/*.rs.bk
Cargo.lock

.tox/
build/
dist/
*.egg-info
__pycache__/
Empty file added .nojekyll
Empty file.
9 changes: 9 additions & 0 deletions _redirect.html
@@ -0,0 +1,9 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
<meta http-equiv="refresh" content="0;URL=rascaline/index.html" />
</head>
<body></body>
</html>
9 changes: 9 additions & 0 deletions index.html
@@ -0,0 +1,9 @@
<!DOCTYPE html>
<html>
<head>
<meta charset="utf-8" />
<meta http-equiv="X-UA-Compatible" content="IE=edge,chrome=1" />
<meta http-equiv="refresh" content="0;URL=latest/index.html" />
</head>
<body></body>
</html>
4 changes: 4 additions & 0 deletions latest/.buildinfo
@@ -0,0 +1,4 @@
# Sphinx build info version 1
# This file hashes the configuration used when building these files. When it is not found, a full rebuild will be done.
config: f3b9040e567a8e9a253537fb51c0495b
tags: 645f666f9bcd5a90fca523b33c5a78b7
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/devdoc/get-started.doctree
Binary file not shown.
Binary file added latest/.doctrees/devdoc/how-to/index.doctree
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/devdoc/how-to/profiling.doctree
Binary file not shown.
Binary file added latest/.doctrees/devdoc/index.doctree
Binary file not shown.
Binary file added latest/.doctrees/environment.pickle
Binary file not shown.
Binary file added latest/.doctrees/examples/compute-soap.doctree
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/examples/index.doctree
Binary file not shown.
Binary file added latest/.doctrees/examples/keys-selection.doctree
Binary file not shown.
Binary file added latest/.doctrees/examples/profiling.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/explanations/concepts.doctree
Binary file not shown.
Binary file added latest/.doctrees/explanations/index.doctree
Binary file not shown.
Binary file added latest/.doctrees/explanations/soap.doctree
Binary file not shown.
Binary file added latest/.doctrees/get-started/index.doctree
Binary file not shown.
Binary file added latest/.doctrees/get-started/installation.doctree
Binary file not shown.
Binary file added latest/.doctrees/get-started/rascaline.doctree
Binary file not shown.
Binary file added latest/.doctrees/get-started/tutorials.doctree
Binary file not shown.
Binary file added latest/.doctrees/how-to/computing-soap.doctree
Binary file not shown.
Binary file added latest/.doctrees/how-to/index.doctree
Binary file not shown.
Binary file added latest/.doctrees/how-to/keys-selection.doctree
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/how-to/sample-selection.doctree
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/index.doctree
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/references/api/c/index.doctree
Binary file not shown.
Binary file added latest/.doctrees/references/api/c/misc.doctree
Binary file not shown.
Binary file added latest/.doctrees/references/api/c/systems.doctree
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/references/api/cxx/index.doctree
Binary file not shown.
Binary file added latest/.doctrees/references/api/cxx/misc.doctree
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/references/api/index.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/references/api/rust.doctree
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file not shown.
Binary file added latest/.doctrees/references/index.doctree
Binary file not shown.
@@ -0,0 +1,248 @@
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"\n# Property Selection\n\n.. start-body\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"import chemfiles\nimport numpy as np\nfrom metatensor import Labels, MetatensorError, TensorBlock, TensorMap\nfrom skmatter.feature_selection import FPS\n\nfrom rascaline import SoapPowerSpectrum"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"First we load the dataset with chemfiles\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"with chemfiles.Trajectory(\"dataset.xyz\") as trajectory:\n    frames = [f for f in trajectory]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"and define the hyperparameters of the representation\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"HYPER_PARAMETERS = {\n    \"cutoff\": 5.0,\n    \"max_radial\": 6,\n    \"max_angular\": 4,\n    \"atomic_gaussian_width\": 0.3,\n    \"center_atom_weight\": 1.0,\n    \"radial_basis\": {\n        \"Gto\": {},\n    },\n    \"cutoff_function\": {\n        \"ShiftedCosine\": {\"width\": 0.5},\n    },\n}\n\ncalculator = SoapPowerSpectrum(**HYPER_PARAMETERS)\n\ndescriptor = calculator.compute(frames)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The selection of features can be given as a set of ``Labels``, in which case the names\nof the labels must be a subset of the names of the properties produced by the\ncalculator. You can see the default set of names with:\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(\"property names:\", descriptor.property_names)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can use a subset of these names to define a selection. In this case, only\nproperties matching the labels in this selection will be used by rascaline\n(here, only properties with ``l = 0`` will be used)\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"selection = Labels(\n    names=[\"l\"],\n    values=np.array([[0]]),\n)\nselected_descriptor = calculator.compute(frames, selected_properties=selection)\n\nselected_descriptor = selected_descriptor.keys_to_samples(\"species_center\")\nselected_descriptor = selected_descriptor.keys_to_properties(\n    [\"species_neighbor_1\", \"species_neighbor_2\"]\n)\n\nproperties = selected_descriptor.block().properties"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We expect to get `[0]` as the list of `l` properties\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(f\"we have the following angular components: {np.unique(properties['l'])}\")"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The previous selection method uses the same selection for all blocks. If you\nwant to use a different selection for different blocks, you should use a\n``TensorMap`` to create your selection\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"selected_descriptor = calculator.compute(frames, selected_properties=selection)\ndescriptor_for_comparison = calculator.compute(\n    frames, selected_properties=selected_descriptor\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The descriptor had 180 properties stored in the first block, while\n``selected_descriptor`` had 36. So ``descriptor_for_comparison`` will also have 36\nproperties.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"print(\"shape of first block initially:\", descriptor.block(0).values.shape)\nprint(\"shape of first block of reference:\", selected_descriptor.block(0).values.shape)\nprint(\n    \"shape of first block after selection:\",\n    descriptor_for_comparison.block(0).values.shape,\n)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"The ``TensorMap`` format allows us to select different features within each\nblock, and then construct a general matrix of features. We can select the most\nsignificant features using Farthest Point Sampling (FPS), which selects\nfeatures based on the distance between them. The following code snippet selects\nthe 10 most important features in each block, then constructs a ``TensorMap``\ncontaining this selection, and calculates the final matrix of features for it.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"def fps_feature_selection(descriptor, n_to_select):\n    \"\"\"\n    Select ``n_to_select`` features block by block in the ``descriptor``, using\n    Farthest Point Sampling to do the selection; and return a ``TensorMap`` with\n    the right structure to be used as properties selection with rascaline calculators\n    \"\"\"\n    blocks = []\n    for block in descriptor:\n        # create a separate FPS selector for each block\n        fps = FPS(n_to_select=n_to_select)\n        mask = fps.fit(block.values).get_support()\n        selected_properties = Labels(\n            names=block.properties.names,\n            values=block.properties.values[mask],\n        )\n        # The only important data here is the properties, so we create empty\n        # sets of samples and components.\n        blocks.append(\n            TensorBlock(\n                values=np.empty((1, len(selected_properties))),\n                samples=Labels.single(),\n                components=[],\n                properties=selected_properties,\n            )\n        )\n\n    return TensorMap(descriptor.keys, blocks)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"We can then apply this function to subselect according to the data contained\nin a descriptor\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"selection = fps_feature_selection(descriptor, n_to_select=10)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"and use the selection with rascaline, potentially running the calculation on a\ndifferent set of systems\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"selected_descriptor = calculator.compute(frames, selected_properties=selection)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"Note that in this case it is no longer possible to have a single feature\nmatrix, because each block will have its own properties.\n\n"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {
"collapsed": false
},
"outputs": [],
"source": [
"try:\n    selected_descriptor.keys_to_samples(\"species_center\")\nexcept MetatensorError as err:\n    print(err)"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
".. end-body\n\n"
]
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.11.5"
}
},
"nbformat": 4,
"nbformat_minor": 0
}
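
The notebook above delegates Farthest Point Sampling to skmatter's `FPS` class. As a rough illustration of the idea only, the following dependency-free sketch treats each feature (column) as a point and repeatedly picks the column farthest from those already chosen. The function name `fps_columns`, the `first` argument, and the toy `features` matrix are all hypothetical; this is not skmatter's implementation, which may differ in initialization and tie-breaking.

```python
import math


def fps_columns(matrix, n_to_select, first=0):
    """Return indices of ``n_to_select`` columns chosen by farthest point sampling.

    Hypothetical helper for illustration; not part of rascaline or skmatter.
    """
    n_columns = len(matrix[0])
    if not 0 < n_to_select <= n_columns:
        raise ValueError("n_to_select must be between 1 and the number of columns")

    # Treat every column (feature) as a point whose coordinates are the samples.
    columns = [[row[j] for row in matrix] for j in range(n_columns)]

    selected = [first]
    # Distance from each column to its closest already-selected column.
    min_dist = [math.dist(col, columns[first]) for col in columns]

    while len(selected) < n_to_select:
        # Next pick: the not-yet-selected column farthest from the selection.
        candidates = (j for j in range(n_columns) if j not in selected)
        farthest = max(candidates, key=lambda j: min_dist[j])
        selected.append(farthest)
        for j, col in enumerate(columns):
            min_dist[j] = min(min_dist[j], math.dist(col, columns[farthest]))

    return selected


# Toy feature matrix: 3 samples (rows) x 4 features (columns).
features = [
    [0.0, 0.1, 5.0, 5.1],
    [0.0, 0.0, 0.0, 0.0],
    [1.0, 1.1, 1.0, 9.0],
]
picked = fps_columns(features, n_to_select=3)
print(picked)
```

Columns 0 and 1 are nearly identical in this toy matrix, so column 1 is the last one to be chosen. The same reasoning explains why FPS tends to discard redundant features. In the notebook, the analogous mask comes from `fps.fit(block.values).get_support()` and is applied to `block.properties.values`.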