Merge pull request #592 from harvard-edge/590-some-sparse-notes
590 Some sparse notes
profvjreddi authored Jan 8, 2025
2 parents 5e1ebed + 4681b95 commit 86b8276
Showing 4 changed files with 8 additions and 8 deletions.
10 changes: 5 additions & 5 deletions contents/core/hw_acceleration/hw_acceleration.qmd
@@ -189,13 +189,13 @@ Data locality and optimizing memory hierarchy are crucial for high throughput an
+-----------------------------------------+-------------------------+
| Main memory reference | 100 ns |
+-----------------------------------------+-------------------------+
| Compress 1K bytes with Zippy | 3,000 ns (3 us) |
| Compress 1K bytes with Zippy | 3,000 ns (3 µs) |
+-----------------------------------------+-------------------------+
| Send 1 KB bytes over 1 Gbps network | 10,000 ns (10 us) |
| Send 1 KB bytes over 1 Gbps network | 10,000 ns (10 µs) |
+-----------------------------------------+-------------------------+
| Read 4 KB randomly from SSD | 150,000 ns (150 us) |
| Read 4 KB randomly from SSD | 150,000 ns (150 µs) |
+-----------------------------------------+-------------------------+
| Read 1 MB sequentially from memory | 250,000 ns (250 us) |
| Read 1 MB sequentially from memory | 250,000 ns (250 µs) |
+-----------------------------------------+-------------------------+
| Round trip within same datacenter | 500,000 ns (0.5 ms) |
+-----------------------------------------+-------------------------+
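
The scale differences in the latency table above can be made concrete with a quick ratio calculation (the figures are copied from the table; the variable names are illustrative):

```python
# Latency figures (in nanoseconds) taken from the table above.
main_memory_ref_ns = 100
ssd_random_read_4kb_ns = 150_000
datacenter_round_trip_ns = 500_000

# How many main-memory references fit in one random 4 KB SSD read?
ssd_vs_memory = ssd_random_read_4kb_ns // main_memory_ref_ns      # 1500
# ...and in one round trip within the same datacenter?
network_vs_memory = datacenter_round_trip_ns // main_memory_ref_ns  # 5000
```

Ratios like these are why keeping hot data close to the compute units pays off: a single avoidable SSD read costs as much as roughly 1,500 memory references.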
@@ -956,7 +956,7 @@ As a result, optical computing is still in the very early research stage despite

Quantum computers leverage unique phenomena of quantum physics, like superposition and entanglement, to represent and process information in ways not possible classically. Instead of binary bits, the fundamental unit is the quantum bit or qubit. Unlike classical bits, which are limited to 0 or 1, qubits can exist simultaneously in a superposition of both states due to quantum effects.

Multiple qubits can also be entangled, leading to exponential information density but introducing probabilistic results. Superposition enables parallel computation on all possible states, while entanglement allows nonlocal correlations between qubits. @fig-qubit visually conveys the differences between classical bits in computing and quantum bits (qbits).
Multiple qubits can also be entangled, leading to exponential information density but introducing probabilistic results. Superposition enables parallel computation on all possible states, while entanglement allows nonlocal correlations between qubits. @fig-qubit visually conveys the differences between classical bits in computing and quantum bits (qubits).

![Qubits, the building blocks of quantum computing. Source: [Microsoft](https://azure.microsoft.com/en-gb/resources/cloud-computing-dictionary/what-is-a-qubit)](images/png/qubit.png){#fig-qubit}
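
The superposition and measurement ideas above can be sketched with a small classical state-vector simulation (a sketch of the mathematics only, not of quantum hardware; the numpy usage here is illustrative):

```python
import numpy as np

# A qubit as a 2-component complex state vector: |0> = [1, 0].
ket0 = np.array([1.0, 0.0], dtype=complex)

# The Hadamard gate puts |0> into an equal superposition of |0> and |1>.
H = np.array([[1, 1], [1, -1]], dtype=complex) / np.sqrt(2)
state = H @ ket0

# Measurement probabilities follow the Born rule: |amplitude|^2.
probs = np.abs(state) ** 2  # ~[0.5, 0.5]

# Sampling a measurement collapses the superposition probabilistically,
# which is the source of the probabilistic results mentioned above.
rng = np.random.default_rng(0)
outcomes = rng.choice([0, 1], size=1000, p=probs)
```

Entangled multi-qubit states require a state vector whose length doubles with each added qubit, which is where the exponential information density comes from.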

2 changes: 1 addition & 1 deletion contents/core/responsible_ai/responsible_ai.qmd
@@ -58,7 +58,7 @@ Responsible AI is about developing AI that positively impacts society under huma

* **Privacy:** Protecting sensitive user data and adhering to privacy laws and ethics

Putting these principles into practice involves technical techniques, corporate policies, governance frameworks, and moral philosophy. There are also ongoing debates around defining ambiguous concepts like fairness and determining how to balance competing objectives.
Putting these principles into practice involves technical skills, corporate policies, governance frameworks, and moral philosophy. There are also ongoing debates around defining ambiguous concepts like fairness and determining how to balance competing objectives.

## Principles and Concepts

2 changes: 1 addition & 1 deletion contents/core/training/training.qmd
@@ -444,7 +444,7 @@ Instead, the data splits should be randomized or shuffled for each experimental

With different splits per experiment, the evaluation becomes more robust. Each model is tested on a wide range of test sets drawn randomly from the overall population, smoothing out variation and removing correlation between results.

Proper practice is to set a random seed before splitting the data for each experiment. Splitting should occur after shuffling/resampling as part of the experimental pipeline. Carrying out comparisons on the same splits violates the i.i.d (independent and identically distributed) assumption required for statistical validity.
Proper practice is to set a random seed before splitting the data for each experiment. Splitting should occur after shuffling/resampling as part of the experimental pipeline. Carrying out comparisons on the same splits violates the independent and identically distributed (IID) assumption required for statistical validity.

Unique splits are essential for fair model comparisons. Though more compute-intensive, randomized allocation per experiment removes sampling bias and enables valid benchmarking. This highlights the true differences in model performance irrespective of a particular split's characteristics.
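
The per-experiment re-splitting described above can be sketched as follows (the function name and parameters are illustrative, not from the text):

```python
import numpy as np

def fresh_split(n_samples, test_fraction, seed):
    """Shuffle indices with a per-experiment seed, then split.

    Re-running with a new seed yields a fresh, independent test set,
    so repeated model comparisons are not tied to one particular split.
    """
    rng = np.random.default_rng(seed)
    indices = rng.permutation(n_samples)
    n_test = int(n_samples * test_fraction)
    return indices[n_test:], indices[:n_test]  # train, test

# Each experiment gets its own seed, hence its own randomized split.
train_a, test_a = fresh_split(1000, 0.2, seed=1)
train_b, test_b = fresh_split(1000, 0.2, seed=2)
```

Setting the seed per experiment keeps each individual run reproducible while still varying the splits across experiments.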

2 changes: 1 addition & 1 deletion contents/core/workflow/workflow.qmd
@@ -40,7 +40,7 @@ The machine learning lifecycle is a systematic, interconnected process that guid

![The ML lifecycle.](images/png/ML_life_cycle.png){#fig-ml-lifecycle}

The prepared data then enters the data preparation stage, where it is transformed into machine learning-ready datasets through processes such as splitting and versioning. These datasets are used in the model training stage, where machine learning algorithms are applied to create predictive models. The resulting models are rigorously tested in the model evaluation stage, where performance metrics, such as key performance indicators (KPIs), are computed to assess reliability and effectiveness. The validated models move to the ML system validation phase, where they are verified for deployment readiness. Once validated, these models are integrated into production systems during the ML system deployment stage, ensuring alignment with operational requirements. The final stage tracks the performance of deployed systems in real time, enabling continuous adaptation to new data and evolving conditions.
The data then enters the preparation stage, where it is transformed into machine learning-ready datasets through processes such as splitting and versioning. These datasets are used in the model training stage, where machine learning algorithms are applied to create predictive models. The resulting models are rigorously tested in the model evaluation stage, where performance metrics, such as key performance indicators (KPIs), are computed to assess reliability and effectiveness. The validated models move to the ML system validation phase, where they are verified for deployment readiness. Once validated, these models are integrated into production systems during the ML system deployment stage, ensuring alignment with operational requirements. The final stage tracks the performance of deployed systems in real time, enabling continuous adaptation to new data and evolving conditions.

This general lifecycle forms the backbone of machine learning systems, with each stage contributing to the creation, validation, and maintenance of scalable and efficient solutions. While the lifecycle provides a detailed view of the interconnected processes in machine learning systems, it can be distilled into a simplified framework for practical implementation.
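
The staged flow described above can be sketched as a chain of functions (all names and the toy mean-predicting "model" are hypothetical, purely to show how each stage feeds the next):

```python
# A minimal, illustrative sketch of the lifecycle stages; not a real API.

def prepare_data(raw):
    # Data preparation: split into ML-ready datasets.
    split = int(len(raw) * 0.8)
    return {"train": raw[:split], "test": raw[split:]}

def train(train_set):
    # Model training: toy "model" that predicts the training mean.
    return sum(train_set) / len(train_set)

def evaluate(model, test_set):
    # Model evaluation: KPI here is mean absolute error.
    return sum(abs(y - model) for y in test_set) / len(test_set)

def run_lifecycle(raw_data):
    datasets = prepare_data(raw_data)
    model = train(datasets["train"])
    kpi = evaluate(model, datasets["test"])
    # System validation, deployment, and real-time monitoring
    # would follow in a production pipeline.
    return model, kpi

model, kpi = run_lifecycle([1.0, 2.0, 3.0, 4.0, 5.0])
```

Even in this toy form, the structure mirrors the lifecycle: each stage consumes the previous stage's output, which is what makes the process interconnected rather than a set of independent steps.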

