diff --git a/introduction/getting-started/index.md b/introduction/getting-started/index.md index 0fe0f19..f31eebd 100644 --- a/introduction/getting-started/index.md +++ b/introduction/getting-started/index.md @@ -2,16 +2,21 @@ ## Overview -Using **`dms-viz`** involves two steps. First, using a command line tool called [`configure-dms-viz`](https://pypi.org/project/configure-dms-viz/), you specify some information about your dataset to generate a `.json` format specification file. Second, you open up the [web-based tool](https://dms-viz.github.io/) and upload your specification file to generate an interactive visualization. Below are some quickstart instructions to get you oriented. +**`dms-viz`** requires two steps: + +1. First, using a command line tool called [`configure-dms-viz`](https://pypi.org/project/configure-dms-viz/), you specify information about your dataset to generate a `.json` specification file. + +2. Second, you open up the [web tool](https://dms-viz.github.io/) and upload your `.json` specification file to generate an interactive visualization. Below are some instructions to get you oriented. ::: tip Want to Skip Ahead? If you're interested in the detailed command line API, check out the reference [here](/preparing-data/command-line-api/). If you've already formatted your data and you're ready to start visualizing it, check out the instructions for that [here](/visualizing-data/web-tool-api/). ::: -Prerequisites -To start using **`dms-viz`** with your own data, you'll need to install the command line tool [`configure-dms-viz`](https://pypi.org/project/configure-dms-viz/). To use `configure-dms-viz`, you must ensure that you have the correct version of Python (3.9 or later) installed on your system. +## Prerequisites + +To use **`dms-viz`** with your data, you'll need to install the command line tool [`configure-dms-viz`](https://pypi.org/project/configure-dms-viz/). 
To use `configure-dms-viz`, you must ensure that you have the correct version of Python (3.9 or later) installed on your system. -If you are unsure whether you have the correct version of Python installed, open a terminal window (Command Prompt in Windows, Terminal in macOS, or a terminal emulator in Linux) type the following command and press Enter: +If you are unsure whether the correct version of Python is installed, open a terminal window (Command Prompt in Windows, Terminal in macOS, or a terminal emulator in Linux) and run the following command: ```bash python --version @@ -19,7 +24,7 @@ python --version Check the version number that is displayed. It should be 3.9.x or later. If the command isn't recognized or the version is earlier than 3.9, you will need to install or update Python. -To install `configure-dms-viz`, you'll also need the package manager `pip`. Here's how to check if `pip` is installed and how to install it if it isn't. In the terminal window, type the following command and press Enter: +To install `configure-dms-viz`, you'll also need the package manager `pip`. Here's how to check if `pip` is installed and how to install it if it isn't. In the terminal window, run the following command: ```bash pip --version @@ -29,26 +34,26 @@ If `pip` is installed, the version number will be displayed. If it is not instal ## Installation -Currently, `configure-dms-viz` is distributed on [PyPI](https://pypi.org/), allowing you to install `configure-dms-viz` using `pip`. To install the latest version of `configure-dms-viz`, type the following command into the terminal: +Currently, `configure-dms-viz` is distributed on [PyPI](https://pypi.org/), allowing you to install `configure-dms-viz` using `pip`. To install the latest version of `configure-dms-viz`, run the following command in the terminal: ```bash pip install configure-dms-viz ``` -Now, `configure-dms-viz` should have been installed and you shouldn't see any error messages. 
You can double-check that the installation worked correctly by typing the following into the terminal: +Now, `configure-dms-viz` should be installed. You can double-check that the installation worked by running the following command in the terminal: ```bash configure-dms-viz --help ``` -You should see the help message for the tool printed to the terminal. +You should see the help message for the tool printed to the terminal's output. ## Basic Usage -`configure_dms_viz` is a command-line tool designed to create a `JSON` format specification file for **`dms-viz`**. You provide the data that you'd like to visualize along with additional information to customize the analysis. The resulting specification file can be uploaded to [**`dms-viz`**](https://dms-viz.github.io/) for interactive visualization of your data. Below is an overview of the process of using `configure_dms_viz`. +`configure_dms_viz` is a command-line tool designed to create a `.json` format specification file for **`dms-viz`**. You provide the data that you'd like to visualize along with additional information to customize the visualization. The resulting specification file is then uploaded to [**`dms-viz`**](https://dms-viz.github.io/) to create the visualization of your data. Below is an overview of the process of using `configure_dms_viz`. ::: tip Looking for more details? -For a detailed explanation of the features of `configure_dms_viz` check out the reference [here](/preparing-data/command-line-api/). +This example only covers the *basic* use case. You can augment `dms-viz` with custom tooltips, filters, and more. For a detailed explanation of these features, check out the reference [here](/preparing-data/command-line-api/). ::: `configure-dms-viz` has two commands, `format` and `join`. 
To format a single dataset for **`dms-viz`**, you execute the `configure-dms-viz format` command with the required and optional arguments as needed: @@ -59,29 +64,27 @@ configure-dms-viz format \ --input \ --metric \ --structure \ - --sitemap \ --output \ [optional_arguments] ``` -The information that is required to make a visualization file for **`dms-viz`** is as follows: +The **required** arguments are: 1. `--name`: The [name of your dataset](/preparing-data/command-line-api/#name) as you'd like it to appear in the visualization. 2. `--input`: The file path to your [input data](/preparing-data/command-line-api/#input). 3. `--metric`: The name of the column that contains [the metric](/preparing-data/command-line-api/#metric) you want to visualize. 4. `--structure`: The [protein structure](/preparing-data/command-line-api/#structure) that you want to use as a model. -5. `--sitemap`: [A map of the sites](/preparing-data/command-line-api/#sitemap) in your data to the sites in the reference and protein. -6. `--output`: The file path of [the output](/preparing-data/command-line-api/#output) `.json` file. +5. `--output`: The file path of [the output](/preparing-data/command-line-api/#output) `.json` file. -The remaining arguments are all _optional_ and configure the look and interaction of your final visualization. For more details on the individual arguments, check out the [API reference](/preparing-data/command-line-api/). +The remaining arguments are all _optional_ and configure the structure, appearance, and interaction of your final visualization. For more details on the individual arguments, check out the [API reference](/preparing-data/command-line-api/). ::: warning Before going any further -If you plan to use `configure-dms-viz` right away, it's crucial to make sure that your data meets some initial requirements. Please check out what these requirements are [here](/preparing-data/data-requirements/). 
+If you plan to use `configure-dms-viz`, it's _crucial_ that you familiarize yourself with the input data requirements. Please check out what these requirements are [here](/preparing-data/data-requirements/). ::: Now, let's use `configure-dms-viz` with a minimal example. This data is included in the [GitHub repository](https://github.com/dms-viz/configure_dms_viz/tree/main). If you want to follow along, clone the repository and run `configure-dms-viz` from the top of the directory. -**Input** +### Input ```bash configure-dms-viz format \ @@ -97,11 +100,11 @@ configure-dms-viz format \ --output ./REGN_escape.json ``` -Here, we've specified that we want the dataset to be called `REGN mAb Cocktail` (named after the Regeneron Antibody cocktail against SARS-CoV-2) and we've pointed to the [input data](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-RBD-REGN-DMS/input/REGN_escape.csv) containing scores detailing the degree of antibody escape from the `REGN mAb Cocktail`. We've also specified a [sitemap](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-RBD-REGN-DMS/sitemap/sitemap.csv) that tells the tool how sites in your data correspond to the sites in the protein structure. Then, we specified that we wanted to use the protein structure `6XDG` from the [RSCB PDB](https://www.rcsb.org/) and only show our data on chain `E` of that structure. The column in the input data that contains our data is called `mut_escape`, and we have different values of `mut_escape` for the same mutations depending on the `condition` (in this case, the condition refers to escape from each antibody in the cocktail). 
+Here, we've specified that we want the dataset to be called `REGN mAb Cocktail` (named after the Regeneron antibody cocktail against SARS-CoV-2) and we've pointed to the [input data](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-RBD-REGN-DMS/input/REGN_escape.csv) containing scores detailing the degree of antibody escape from the `REGN mAb Cocktail`. We've also specified a [sitemap](https://github.com/dms-viz/configure_dms_viz/blob/main/tests/SARS2-RBD-REGN-DMS/sitemap/sitemap.csv) that tells the tool how sites in your data correspond to the sites in the protein structure. Then, we specified that we wanted to use the protein structure `6XDG` from the [RCSB PDB](https://www.rcsb.org/) website and only display our data on chain `E` of that structure. The column in the input data that contains our data is called `mut_escape`, and we have different values of `mut_escape` for the same mutations depending on the `condition` (in this case, the condition refers to escape from each antibody in the cocktail). The result of this command should be a message printed to the terminal that looks like this: -**Output** +### Output ```md Formatting data for visualization using the 'mut_escape' column from 'tests/SARS2-RBD-REGN-DMS/input/REGN_escape.csv'... @@ -137,7 +140,8 @@ To upload a local file, you simply click on the `Upload Data` section and choose Since the `.json` file created above should now be stored locally on your machine, you can upload this file using this approach. ### Remote -Alternatively, if your raw `.json` file is hosted somewhere online – like on GitHub, for example – you can provide the link to this file by clicking on the `Remote` button under the `Upload Data` section. + +Alternatively, if your `.json` specification file is hosted somewhere online – like on GitHub, for example – you can provide the link to this file by clicking on the `Remote` button under the `Upload Data` section.
Remote Upload @@ -149,4 +153,6 @@ You can try it yourself by pasting the following link into the URL text box: https://raw.githubusercontent.com/dms-viz/configure_dms_viz/main/tests/SARS2-RBD-REGN-DMS/output/SARS2-RBD-REGN-DMS.json ``` -This approach has some advantages. For example, after providing a link to your data, this link is saved in the URL, allowing you to share a view of **`dms-viz`** with the data pre-loaded and ready to view. For more details on using the web-based interface of **`dms-viz`** including hosting, interacting, and sharing your files, check out the [interaction reference](/visualizing-data/web-tool-api/). +This approach has some advantages. **`dms-viz`** includes the link to your remotely stored specification in the URL, allowing you to share your visualization with the data pre-loaded. Another advantage of this approach is that changes made to the appearance of **`dms-viz`** are saved in the URL as well. + +For more details on using the web-based interface of **`dms-viz`** including hosting, interacting, and sharing your files, check out the [interaction reference](/visualizing-data/web-tool-api/). diff --git a/introduction/what-is-dms-viz/index.md b/introduction/what-is-dms-viz/index.md index eb0548a..bc08054 100644 --- a/introduction/what-is-dms-viz/index.md +++ b/introduction/what-is-dms-viz/index.md @@ -1,18 +1,22 @@ -# What is dms-viz? +# What is `dms-viz`? -Hi there 👋, if you've got some mutation-level data that you want to view on an interactive 3D protein structure, you're in the right place! **`dms-viz`** is a tool that helps you take quantitative data associated with mutations to a protein and analyze that data using intuitive visual summaries in the context of an interactive 3D protein structure. Visualizations created with **`dms-viz`** are intended to be _flexible_, _customizable_, and _shareable_. +Hi there 👋, if you have mutation-based data you want to view on an interactive 3D protein structure, you're in the right place! 
**`dms-viz`** is a tool that helps you take quantitative data associated with mutations to a protein and analyze that data with intuitive visual summaries and an interactive 3D protein structure. Visualizations created with **`dms-viz`** are _flexible_, _customizable_, and _shareable_. ::: tip Ready to use the tool? -You can skip to the [Quickstart](/introduction/getting-started/) to learn how to prepare your data, or you can see what the visualization tool looks like [here](https://dms-viz.github.io/). +Skip to the [Getting Started](/introduction/getting-started/) guide to learn how to prepare your data. ::: ## Purpose -Understanding how mutations impact a protein's functions is valuable for many types of biological questions. High-throughput techniques such as deep-mutational scanning (DMS) have greatly expanded the number of mutation-function datasets. For instance, DMS has been used to determine how mutations to viral proteins affect antibody escape, receptor affinity, and essential functions such as viral genome transcription and replication. +Many biological questions require a thorough understanding of how mutations to a protein impact its functions. High-throughput techniques such as deep-mutational scanning (DMS) have greatly expanded the number of mutation-function datasets. For instance, DMS has been used to determine how mutations to viral proteins affect antibody escape, receptor affinity, and essential functions such as viral genome transcription and replication. The mutation-based data generated by these approaches is often best understood in the context of a protein’s 3D structure; for instance, to assess questions like how mutations that affect antibody escape relate to the physical antibody binding epitope on the protein. However, current approaches for visualizing mutation data in the context of a protein’s structure are often cumbersome and require multiple steps and software. 
To streamline the visualization of mutation-associated data in the context of a protein structure, we developed a web-based tool, **`dms-viz`**. With **`dms-viz`**, users can straightforwardly visualize mutation-based data such as those from DMS experiments in the context of a 3D protein model in an interactive format. -## Why use dms-viz? +::: tip Interested in what the visualizations look like? +Check out these [examples](/visualizing-data/vignettes/) to see `dms-viz` in action. +::: + +## Why use `dms-viz`? - **Flexible Inputs** @@ -26,12 +30,12 @@ The mutation-based data generated by these approaches is often best understood i If your data is hosted online (e.g. in a [GitHub](https://github.com/) repository), you can share your data with URLs that automatically load the visualization while keeping your settings. However, if you don't want to host your data online, you can still use **`dms-viz`** with locally stored `.json` files. -## Development +## Contributing to `dms-viz` **`dms-viz`** has two components: 1. A command line interface (CLI) for formatting data that was written in `Python` using the [click](https://click.palletsprojects.com/en/8.1.x/) API. -2. A web-based visualization tool written in 'vanilla' `Javascript` using primarily the libraries [D3.js](https://d3js.org/) for making the visualizations and [NGL.js](https://nglviewer.org/#page-top) for creating interactive molecular structures. +2. A web-based visualization tool written in `Javascript` using primarily the libraries [D3.js](https://d3js.org/) for making the visualizations and [NGL.js](https://nglviewer.org/#page-top) for creating interactive molecular structures. If you're interested in contributing, check out the [Contributing Guide](/project-info/contributing-guide/) for details. @@ -40,5 +44,5 @@ If you're interested in contributing, check out the [Contributing Guide](/projec If you end up using **`dms-viz`** in your paper, please cite us! 
```md -TODO: Add citation here +Citation pending... ``` diff --git a/preparing-data/command-line-api/index.md b/preparing-data/command-line-api/index.md index 4ee9d0f..5fc80d8 100644 --- a/preparing-data/command-line-api/index.md +++ b/preparing-data/command-line-api/index.md @@ -1,8 +1,10 @@ # Command Line API +You'll need to use the command line tool `configure_dms_viz` to prepare your data for **`dms-viz`**. Follow the instructions in [Getting Started](/introduction/getting-started/) to install `configure_dms_viz` on your operating system. + ## Basic Usage -`configure_dms_viz` is a command-line tool designed to create a `JSON` format specification file for [**`dms-viz`**](https://dms-viz.github.io/). You provide the data that you'd like to visualize along with additional information to customize the analysis. The resulting specification file can be uploaded to **`dms-viz`** for interactive visualization of your data. Below is an overview of the process of using `configure_dms_viz`. +`configure_dms_viz` is a command-line tool designed to create a `.json` format specification file for [**`dms-viz`**](https://dms-viz.github.io/). You provide the data that you'd like to visualize along with additional information to customize the analysis. The resulting specification file can be uploaded to **`dms-viz`** for interactive visualization of your data. Below is an overview of the process of using `configure_dms_viz`. `configure-dms-viz` has two commands; `format` and `join`. To format your data, you execute the `configure-dms-viz format` command with the required and optional arguments as needed: @@ -12,12 +14,11 @@ configure-dms-viz format \ --input \ --metric \ --structure \ - --sitemap \ --output \ [optional_arguments] ``` -This creates a single dataset that can be loaded into **`dms-viz`**. However, in some cases, you might want to visualize multiple datasets simultaneously. To do this, you use the `configure-dms-viz join` command. 
The `join` command takes a list of formatted `.json` files and combines them into a single `.json` specification file containing each dataset. Optionally, you can also describe the file by specifying the path to a `.md` file with your desired description: +This creates a single dataset that can be loaded into **`dms-viz`**. However, in some cases, you might want to visualize multiple datasets simultaneously. To do this, you use the `configure-dms-viz join` command. The `join` command takes a list of formatted `.json` files and combines them into a single `.json` specification file containing each dataset. Optionally, you can add a markdown description of your joined datasets by specifying the path to a `.md` file with your desired description: ```bash configure-dms-viz join \ @@ -26,6 +27,26 @@ configure-dms-viz join \ --description ``` +## Advanced Usage + +This is the most basic usage of `configure-dms-viz`; however, `configure-dms-viz` is a flexible formatting tool that provides many options for customizing your analysis. In addition to the description of the command line API below, we'll detail some highlights of the customization available through `configure-dms-viz`. + +### Custom Filters + +`configure-dms-viz` allows you to specify *quantitative* columns in your [input data](/preparing-data/data-requirements/#input-data) to use as dynamic filters in **`dms-viz`**. The columns you specify will populate sliders in the sidebar under "`Filters`". By dragging the slider, you filter out the mutations or sites in the visualization with values less than the selected value for the column you specify. + +To add filters with `configure-dms-viz`, specify *quantitative* columns using the `--filter-cols` flag by providing a dictionary that establishes your chosen columns and the name that will appear in the visualization (i.e. `"{'effect': 'Functional Effect', 'times_seen': 'Times Seen'}"`). 
In this example, the columns that are used as filters are `effect` and `times_seen` in the input data, and the names that will label the filters are `Functional Effect` and `Times Seen`. + +In addition to specifying filters, you can set their default value and limits with the `--filter-limits` flag by providing a dictionary formatted like so: `"{'effect': [min, value, max], 'times_seen': [min, value, max]}"`. You can also specify *only* the min and max (i.e. `[min, max]`), but it's **highly** recommended that you set a default value for the filter that makes sense for your data. + +Check out vignette #2 in the [Vignettes](/visualizing-data/vignettes/) for an example visualization that uses filters. + +### Custom Tooltips + +As with custom filters, `configure-dms-viz` allows you to specify columns to include as tooltips. Tooltips will appear when you hover your mouse over a point in the line-point summary plot at the center of the visualization. + +Use the `--tooltip-cols` flag to specify columns that should provide information through tooltips by providing a dictionary like so: `"{'times_seen': '# Obsv', 'effect': 'Func Eff.'}"`, where the key is the column's name and the value is the label as it should appear in the tooltip. + ## `configure-dms-viz format` _This subcommand formats your data for **`dms-viz`**. Below is a description of each argument._ @@ -34,7 +55,7 @@ _This subcommand formats your data for **`dms-viz`**. Below is a description of `` - Path to a `.csv` file with site- and mutation-level data to visualize with a protein structure. [See details [here](/preparing-data/data-requirements/) for the required columns and format. + Path to a `.csv` file with site- and mutation-level data to visualize with a protein structure. [See details here](/preparing-data/data-requirements/) for the required columns and format. 
- ### `--name` diff --git a/preparing-data/data-requirements/index.md b/preparing-data/data-requirements/index.md index 451e454..6b49795 100644 --- a/preparing-data/data-requirements/index.md +++ b/preparing-data/data-requirements/index.md @@ -1,14 +1,19 @@ # Data Requirements -To use **`dms-viz`**, you'll need two files. First, you'll need some [input data](#input-data) that contains the mutation-based data you'd like to visualize. Second, you'll need [a map](#sitemap) of the sites that are mutated in your dataset to the sites in the reference and protein structure. +There are only two pieces of information you need for **`dms-viz`**: -_optionally_, if you have [additional data files](#join-data), you can join these with your input data. +1. You'll need [input data](#input-data). Your data should have a quantitative metric associated with mutations to a protein sequence. +2. You'll need a 3D structure for your protein. The structure can be associated with an [RCSB ID](https://www.rcsb.org/) or provided in a custom `.pdb` file. -Below are the detailed requirements for each data file along with example datasets. +Everything beyond these two requirements is __optional__. + +There are certain cases where you will need to provide additional information. For example, if the reference positions in your input data don't match the reference positions in your `.pdb` file, you'll need to specify a [`sitemap`](/preparing-data/data-requirements/#sitemap). Also, if you have data from another dataset that you wish to include in filters or tooltips, you can provide [`join data`](/preparing-data/data-requirements/#join-data) to merge with the input data. + +The formatting requirements for the input data and optional data are explained below in detail. ## Input Data -The **Input Data** is the data that you'd like to summarize and visualize on an interactive protein structure. It must contain a column with a quantitative metric that's associated with mutations in a protein sequence. 
For example, this data could be a fitness score associated with mutations to a protein, or a score that represents how a mutation changes antibody binding to an antigen. For detailed examples of **`dms-viz`** in action, check out these [Vignettes](/visualizing-data/vignettes/). +The **Input Data** is the mutation-based data that you'd like to summarize and visualize on an interactive protein structure. It must contain a column with a quantitative metric that's associated with mutations in a protein sequence. For example, this data could be a fitness score associated with mutations to a protein, or a score that represents how a mutation changes antibody binding to an antigen. For detailed examples of use cases for **`dms-viz`**, check out these [Vignettes](/visualizing-data/vignettes/). ::: warning Important! The input data must be in `.csv` format. If your data is tabular but in another format, please convert it to `.csv`. @@ -18,7 +23,7 @@ The input data must contain the following columns with **exactly** these names: - ### `site` or `reference_site` - This column should contain the **site** in the protein at which each measurement was made. This column can be numeric (i.e., `[1, 2, 3, 4]`) or it can contain strings (i.e., `[1, 2, 2a, 2b, 3]`). Additionally, the sites do not need to be continuous (i.e., `[1, 4, 5, 8]`). The order of your sites will be specified in the [Sitemap](#sitemap) using the `sequential_site` column. These sites will label the x-axis of all summary plots in **`dms-viz`**. + This column should contain the **site** in the protein at which each measurement was made. This column can be numeric (i.e., `[1, 2, 3, 4]`) or it can contain strings (i.e., `[1, 2, 2a, 2b, 3]`). Additionally, the sites do not need to be continuous (i.e., `[1, 4, 5, 8]`). The order of your sites is assumed to be their order in the data unless it is specified in the [Sitemap](#sitemap) using the `sequential_site` column. 
These __reference__ sites will label the x-axis of all summary plots in **`dms-viz`**. In addition, the `site` or `reference_site` in the input data is assumed to match the position in the provided protein structure. If the sites are numbered differently between your data and protein structure, you must specify the correct mapping in the `protein_site` column of the [Sitemap](#sitemap). For more details on what we mean by 'reference_site', check out the [description of the sitemap file](/preparing-data/data-requirements/#reference-site). @@ -28,11 +33,11 @@ The input data must contain the following columns with **exactly** these names: - ### `wildtype` - This column should contain the **wildtype** identity of residues at a given site in the protein. For example, if a Proline (`P`) was mutated to an Alanine (`A`) at position 120 in the protein (`P120A`), there should be a `P` in the wildtype column for every row where the value of the site column is 120. This column will also be used to check how well the protein structure you provided matches the wildtype sites in your data. Significant discrepancies can indicate that you're `reference`, `sequential`, and `protein` sites are misaligned. + This column should contain the **wildtype** identity of residues at a given site in the protein. For example, if a Proline (`P`) was mutated to an Alanine (`A`) at position 120 in the protein (`P120A`), there should be a `P` in the wildtype column for every row where the value of the site column is 120. This column will also be used to check how well the sequence of the protein structure you provided matches your data. Significant discrepancies can indicate that your `reference`, `sequential`, and `protein` sites are misaligned. --- -In addition to these three mandatory columns, you will also need to specify a `metric` column. 
The identity of this column is specified with the `--metric` flag of `configure-dms-viz`, and it can have any name: +In addition to these three mandatory columns, you will also need to specify a `metric` column. The identity of this column is specified with the [`--metric`](/preparing-data/command-line-api/#metric) flag of `configure-dms-viz`, and it can have any name: - ### `` @@ -40,11 +45,11 @@ In addition to these three mandatory columns, you will also need to specify a `m --- -_Optionally_, depending on the design of your experiment, you can also include a "_condition column_" that specifies how your data is grouped if there are multiple measurements per mutation: +_Optionally_, depending on the design of your experiment, you can also include a "_condition column_" that specifies how your data is grouped if there are multiple conditions. In other words, you are required to specify this column if there are multiple measurements for the same mutations. - ### `condition` - This column should only be included if there are multiple measurements in the [``](/preparing-data/command-line-api/#metric) column for the same `site`/`mutation` combinations. An example of this would be if you have a measurement like an antibody's escape for multiple 'epitopes' in an antigen. This column contains a unique identifier that's used to delineate between these measurements for each mutation. This 'identifier' will show up in an interactive legend next to the visualization. + This column should only be included if there are multiple measurements in the [``](/preparing-data/command-line-api/#metric) column for the same `site`/`mutation` combinations. For example, you'll need a condition column if your data contains a measurement like an antibody's escape for multiple 'epitopes' in an antigen. This column contains a unique identifier that's used to delineate between these measurements for each mutation. This 'identifier' will show up in an interactive legend next to the visualization. 
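The column requirements above can be checked before running `configure-dms-viz`. The following is a minimal sketch in Python and is not part of `configure-dms-viz` itself. The function name `check_input_rows` is hypothetical, and the `mutant` column name is an assumption (this section names `site`/`reference_site` and `wildtype` explicitly, so adjust the mutation column to match your data). The example rows are made up for illustration.

```python
import csv
import io
from collections import Counter


def check_input_rows(rows, metric, mutation_col="mutant"):
    """Return a list of problems found in rows read from an input .csv file.

    `rows` is a list of dicts, e.g. from csv.DictReader.
    `mutation_col` is an assumed column name; change it to match your data.
    """
    if not rows:
        return ["input data is empty"]
    problems = []
    columns = set(rows[0].keys())

    # The site column may be called either `site` or `reference_site`.
    site_col = next((c for c in ("site", "reference_site") if c in columns), None)
    if site_col is None:
        problems.append("missing a `site` or `reference_site` column")
    for required in ("wildtype", mutation_col, metric):
        if required not in columns:
            problems.append(f"missing required column `{required}`")
    if problems:
        return problems

    # Every row for a given site should carry the same wildtype residue.
    wildtypes = {}
    for row in rows:
        wildtypes.setdefault(row[site_col], set()).add(row["wildtype"])
    for site, residues in wildtypes.items():
        if len(residues) > 1:
            problems.append(
                f"site {site} has multiple wildtype residues: {sorted(residues)}"
            )

    # Repeated site/mutation pairs need a `condition` column to tell them apart.
    key_cols = [site_col, mutation_col]
    if "condition" in columns:
        key_cols.append("condition")
    counts = Counter(tuple(row[c] for c in key_cols) for row in rows)
    if any(n > 1 for n in counts.values()):
        if "condition" in columns:
            problems.append("duplicate site/mutation/condition rows")
        else:
            problems.append(
                "multiple measurements per mutation but no `condition` column"
            )
    return problems


# Made-up example: the same mutation is measured twice without a condition column.
example_csv = """site,wildtype,mutant,mut_escape
331,N,A,0.10
331,N,A,0.25
332,I,T,0.02
"""
rows = list(csv.DictReader(io.StringIO(example_csv)))
print(check_input_rows(rows, metric="mut_escape"))
```

An empty list means none of these checks failed. It does not guarantee that `configure-dms-viz` will accept the file, since the tool may apply checks that this sketch does not cover.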
## Sitemap diff --git a/visualizing-data/vignettes/index.md b/visualizing-data/vignettes/index.md index c33328c..c0cd461 100644 --- a/visualizing-data/vignettes/index.md +++ b/visualizing-data/vignettes/index.md @@ -57,7 +57,7 @@ Which results in the `.json` specification located [here](https://github.com/dms
-## 2. Inferring the fitness landscape of the SARS-CoV-2 proteome from pyhlogenetic data +## 2. Inferring the fitness landscape of the SARS-CoV-2 proteome from phylogenetic data The scale of genomic sequencing surveillance of SARS-CoV-2 has led to the public availability of millions of SARS-CoV-2 sequences. [Bloom and Neher](https://doi.org/10.1101/2023.01.30.526314) developed an approach that leverages this massive amount of sequencing data to estimate the fitness effects of mutations in all SARS-CoV-2 proteins. Their approach works by computing the expected count of each mutation under neutral selection and comparing this count to the observed count of mutations along the [phylogeny](https://genome.ucsc.edu/cgi-bin/hgPhyloPlace). The result is an estimate of fitness that is very helpful for understanding the evolutionary constraint on the SARS-CoV-2 proteome. This kind of data is particularly useful for assessing the constraint on possible therapeutic targets that are intractable for deep mutational scanning.