Best Practices of AI in Business

The Institute for Ethical AI & ML



Alejandro Saucedo

@AxSaucedo
in/axsaucedo

[NEXT]

Best Practices of AI in Business

The Institute for Ethical AI & ML

![portrait](images/alejandro.jpg)
Alejandro Saucedo

    <br>
    Chairman
    <br>
    <a style="color: cyan" href="http://ethical.institute">The Institute for Ethical AI & ML</a>
    <br>
    <br>
    AI Fellow / Member
    <br>
    <a style="color: cyan" href="#">The RSA & EU AI Alliance</a>
    <br>
    <br>
    Advisor
    <br>
    <a style="color: cyan" href="http://teensinai.com">TeensInAI.com initiative</a>
    <br>
    <br>
    Chief Engineer
    <br>
    <a style="color: cyan" href="http://eigentech.com">Exponential Technologies</a>
    <br>
    <br>
    
</td>

[NEXT]

Best practices of AI in business

Overview of The Institute

Core concepts

Best practices in industry

Next steps

[NEXT]

The Institute for Ethical AI & ML

<iframe style="height: 50vh; width: 100vw" src="http://ethical.institute"></iframe>

http://ethical.institute

[NEXT]

Our phased rollout plan

  • Phase 1 - Ethical ML by pledge
    • Commit as a technology leader

  • Phase 2 - Ethical ML by process
    • Implement the internal processes in your workplace

  • Phase 3 - Ethical ML by certification
    • Obtain the certifications required

  • Phase 4 - Ethical ML by regulation
    • Implement policy based on case-studies

[NEXT]

Moral-consciousness matrix

|         | Conscious | Unconscious |
|---------|-----------|-------------|
| Moral   |           |             |
| Immoral |           |             |

People can be conscious and moral

[NEXT]

#LetsDoThis

[NEXT SECTION]

1. Core concepts

[NEXT]

A. Karpathy Software 2.0 (ML)

  1. We specify some goal on the behavior of a desirable program (instead of coding it), e.g., “satisfy a dataset of input-output pairs of examples” or “win a game of Go”,
  2. write a rough skeleton of the code (e.g., a neural net architecture) that identifies a subset of program space to search, and
  3. use the computational resources at our disposal to search this space for a program that works.
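
The three steps above can be sketched in a few lines of Python. This is a toy illustration, not Karpathy's code: the dataset `EXAMPLES`, the linear skeleton `a * x + b`, and the small integer search range are all assumptions chosen to keep the example short.

```python
from itertools import product

# Step 1: the goal, stated as input-output pairs (hypothetical data: y = 2x + 1).
EXAMPLES = [(0, 1), (1, 3), (2, 5), (3, 7)]

# Step 2: a rough skeleton that restricts the program space to lines a*x + b.
def make_program(a, b):
    return lambda x: a * x + b

def error(program):
    """Sum of squared differences between the program's outputs and the goal."""
    return sum((program(x) - y) ** 2 for x, y in EXAMPLES)

# Step 3: spend compute searching that space for a program that works.
def search(values=range(-5, 6)):
    return min(product(values, values), key=lambda ab: error(make_program(*ab)))

best_a, best_b = search()
print(best_a, best_b)  # 2 1
```

Real systems replace the exhaustive search with gradient-based optimisation, but the division of labour is the same: a person states the goal and the skeleton, and the machine finds the parameters.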

[NEXT]

In essence, all Machine Learning is:

If I were to give you a set of examples,

would you be able to learn the answers?


Let's have a look at an example

[NEXT]

Given some input data, predict the correct output

shapes

Let's try to build a system to predict whether a shape is a square or a triangle

How do we do this?

[NEXT]

First, let's visualise it

  • Imagine a 2-d plot
  • The x-axis is the area of the input shape
  • The y-axis is the perimeter of the input shape

classification
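
As a rough sketch of how each shape becomes a point on that plot, the snippet below computes area and perimeter from a side length. It assumes equilateral triangles and uses hypothetical side lengths; the function names are illustrative and not taken from the talk's code.

```python
import math

def square_features(side):
    """Return (area, perimeter) of a square with the given side length."""
    return (side ** 2, 4 * side)

def triangle_features(side):
    """Return (area, perimeter) of an equilateral triangle with the given side."""
    return (math.sqrt(3) / 4 * side ** 2, 3 * side)

# Each shape becomes one point on the plot: x = area, y = perimeter.
points = [("square", square_features(s)) for s in (1, 2, 3)]
points += [("triangle", triangle_features(s)) for s in (1, 2, 3)]

for label, (area, perimeter) in points:
    print(f"{label:8s} area={area:5.2f} perimeter={perimeter:5.2f}")
```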

[NEXT]

Which function divides the data?


The line defined by the function

**$f(x̄) = mx̄ + b$**, where:

**x̄** is the input (area & perimeter)

**m** and **b** are the weights and bias

[NEXT]

Then we can predict new data


The result **$f(x̄)$** states whether it's a triangle or a square

(e.g. if it's larger than 0.5 it's a triangle, otherwise a square)
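
A minimal sketch of that prediction step, assuming the input is the pair (area, perimeter) and the weights are one number per feature. The weight values below are picked by hand for illustration only; they are not learned and not from the talk.

```python
def f(x, m, b):
    """Linear score m . x + b for an input x = (area, perimeter)."""
    return sum(m_i * x_i for m_i, x_i in zip(m, x)) + b

def predict(x, m, b, threshold=0.5):
    """Label the shape by comparing the score with the threshold."""
    return "triangle" if f(x, m, b) > threshold else "square"

# Hypothetical weights and bias, chosen by hand.
m = (-1.0, 0.5)
b = 0.0

print(predict((4.0, 8.0), m, b))   # square with side 2 -> "square"
print(predict((1.73, 6.0), m, b))  # equilateral triangle with side 2 -> "triangle"
```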



[NEXT]

So now let's start with a blank brain

[ ]

The machine knows nothing yet...

[NEXT]

Now let's take some data examples

shapes

And let the machine do the learning

[NEXT]

The machine does the learning

classification

We give it two examples (one square, one triangle)

[NEXT]

The machine does the learning

classification

We give it more examples

[NEXT]

The machine does the learning

classification

and more...

[NEXT]

Minimising loss function

We optimise the model by minimising its loss.

Keep adjusting the weights...

...until the loss stops getting any smaller.

gradient_descent
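
The loop described here can be sketched as plain gradient descent on a mean squared error. This is an assumption-laden toy: it fits a single-input line `m*x + b` to hypothetical data rather than the shape classifier, and the learning rate and stopping tolerance are arbitrary choices.

```python
def loss(m, b, data):
    """Mean squared error of the line m*x + b on (x, y) pairs."""
    return sum((m * x + b - y) ** 2 for x, y in data) / len(data)

def gradient_descent(data, learning_rate=0.05, tolerance=1e-12, max_steps=100_000):
    m, b = 0.0, 0.0  # the "blank brain": no knowledge yet
    n = len(data)
    previous = loss(m, b, data)
    for _ in range(max_steps):
        # Partial derivatives of the mean squared error with respect to m and b.
        grad_m = sum(2 * (m * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (m * x + b - y) for x, y in data) / n
        # Adjust the weights a small step against the gradient.
        m -= learning_rate * grad_m
        b -= learning_rate * grad_b
        current = loss(m, b, data)
        if previous - current < tolerance:  # the loss has stopped shrinking
            break
        previous = current
    return m, b

data = [(0.0, 1.0), (1.0, 3.0), (2.0, 5.0), (3.0, 7.0)]  # hypothetical: y = 2x + 1
m, b = gradient_descent(data)
print(round(m, 3), round(b, 3))  # close to 2.0 and 1.0
```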

[NEXT]

Finding the weights!

When it finishes, we find optimised weights and biases

i.e. $f(x̄)$ = triangle if ($0.3 x̄ + 10$) > 0.5 else square

[NEXT]

Now predict new data

classification_small

We now have a system that "knows" how to differentiate triangles from squares

[NEXT]

As your technical functions grow...

classification_large

[NEXT]

So should your infrastructure

classification_large

[NEXT]

Growing jobs

Data Scientists

In charge of development of models

Data Engineers

In charge of development of data pipelines

DataOps / ML Engineers

In charge of productionisation of models, data pipelines & products

[NEXT]

The core principles:

  • Explainability
  • Reproducibility
  • Monitoring
  • Compliance

[NEXT]

Let's see how we can implement these

[NEXT SECTION]

2. Best Practices in Industry

[NEXT]

Ethics by Pledge

The Machine Learning Pledge

The 8 commitments to ensure ethics by design

[NEXT]

1. Ensure augmented (as opposed to artificial) intelligence

Bad

Automate end-to-end process from start

Better

Understand process, add human-in-the-loop, evaluate

line

note

Bad

Have the system automatically go through all records, taking its predictions at face value without sign-off on lower-confidence fields

Better

Ensure there is a process for human sign-off based on the predictions, and a separate process for low-confidence fields
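
One way to sketch such a process is to route each prediction by its confidence. The `Prediction` type, the record IDs and the 0.9 threshold are hypothetical; a real threshold would need to be set and reviewed per use case.

```python
from dataclasses import dataclass

@dataclass
class Prediction:
    record_id: str
    value: str
    confidence: float  # model's confidence in [0, 1]

def route(predictions, threshold=0.9):
    """Split predictions into auto-accepted and those needing human sign-off."""
    accepted, needs_review = [], []
    for p in predictions:
        (accepted if p.confidence >= threshold else needs_review).append(p)
    return accepted, needs_review

batch = [
    Prediction("r1", "invoice", 0.98),
    Prediction("r2", "receipt", 0.62),
    Prediction("r3", "invoice", 0.91),
]
accepted, needs_review = route(batch)
print([p.record_id for p in accepted])      # ['r1', 'r3']
print([p.record_id for p in needs_review])  # ['r2']
```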

[NEXT]

2. Evaluation of bias in development and production

Bad

Train on ALL data, deploy, don't check

Better

Optimise training, focus on edge cases, monitor

line

note

Bad

Train the model on all previous cases and assume it works well

Better

Run in-depth analysis of distribution of data based on traits to ensure the model does not discriminate unfairly
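
A minimal sketch of such an analysis compares how the model behaves for each group. The groups, labels and rows below are made up for illustration, and positive rate and accuracy are only two of many possible fairness checks.

```python
from collections import defaultdict

def rates_by_group(records):
    """Return {group: (positive_rate, accuracy)} for (group, actual, predicted) rows."""
    counts = defaultdict(lambda: {"n": 0, "positive": 0, "correct": 0})
    for group, actual, predicted in records:
        c = counts[group]
        c["n"] += 1
        c["positive"] += predicted
        c["correct"] += int(actual == predicted)
    return {g: (c["positive"] / c["n"], c["correct"] / c["n"]) for g, c in counts.items()}

# Hypothetical rows: (group, actual label, predicted label), labels are 0 or 1.
records = [
    ("A", 1, 1), ("A", 0, 0), ("A", 1, 1), ("A", 0, 1),
    ("B", 1, 0), ("B", 0, 0), ("B", 1, 1), ("B", 0, 0),
]
for group, (positive_rate, accuracy) in rates_by_group(records).items():
    print(group, positive_rate, accuracy)
```

In this made-up data both groups get the same accuracy (0.75), yet group A receives a positive prediction three times as often as group B, which an overall accuracy figure would hide.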

[NEXT]

3. Address job displacement implications

Bad

Push for automation unconsciously

Better

Understand and address automation implications

line

note

Bad

Push for automation without taking into consideration the implications of job automation

Better

Understand the implications of both automating the process and reducing the cost of the service (which may lead to a total increase in demand)

[NEXT]

4. Practical understanding of accuracy

Bad

Always try to blindly increase accuracy number

Better

Find relevant metrics for accuracy & cost function

line

note

Bad

Take percentage accuracy increases at face value and assume a higher number is better

Better

Run consistent, cross-validated and bias-reduced sets of tests/simulations to ensure that any accuracy increase is objective
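
To illustrate why a single accuracy number can mislead, the sketch below scores a model that always predicts the negative class on hypothetical imbalanced data. The data and helper names are assumptions for the example; it covers the choice of metric only, not cross-validation.

```python
def confusion(actual, predicted):
    """Count true/false positives and negatives for binary labels."""
    pairs = list(zip(actual, predicted))
    tp = sum(a == 1 and p == 1 for a, p in pairs)
    tn = sum(a == 0 and p == 0 for a, p in pairs)
    fp = sum(a == 0 and p == 1 for a, p in pairs)
    fn = sum(a == 1 and p == 0 for a, p in pairs)
    return tp, tn, fp, fn

def metrics(actual, predicted):
    """Return (accuracy, precision, recall), using 0.0 when a ratio is undefined."""
    tp, tn, fp, fn = confusion(actual, predicted)
    accuracy = (tp + tn) / len(actual)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return accuracy, precision, recall

# Hypothetical imbalanced data: 5 positives among 100 cases.
actual = [1] * 5 + [0] * 95
always_negative = [0] * 100
print(metrics(actual, always_negative))  # (0.95, 0.0, 0.0)
```

The model is 95% accurate and still misses every positive case, so recall (or a cost function that weights missed positives) is the more relevant measure here.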

[NEXT]

5. Compliance by design

Bad

Use complex models without care

Better

Add domain knowledge to introduce transparency

line

[NEXT]

6. Reproducibility by design

Bad

Assume previous models won't be re-used

Better

Have a process to ensure reproducibility and compatibility

line
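
One small piece of such a process can be sketched as fixing the random seed and recording a fingerprint of the configuration and data used for each run. The `train` function here is a stand-in, not a real training routine, and the config keys and rows are hypothetical.

```python
import hashlib
import json
import random

def fingerprint(config, rows):
    """Stable SHA-256 hash of the configuration and the training rows."""
    payload = json.dumps({"config": config, "rows": rows}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()

def train(config, rows):
    """Stand-in for a training run: a seeded shuffle makes the result repeatable."""
    rng = random.Random(config["seed"])
    shuffled = rows[:]
    rng.shuffle(shuffled)
    return {"first_row": shuffled[0], "fingerprint": fingerprint(config, rows)}

config = {"seed": 42, "learning_rate": 0.05}  # hypothetical settings
rows = [[1.0, 4.0], [4.0, 8.0], [0.43, 3.0]]  # hypothetical data

run_a = train(config, rows)
run_b = train(config, rows)
print(run_a == run_b)  # True
```

Storing the fingerprint next to the trained model makes it possible to check later that a rerun used the same inputs.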


[NEXT]

7. Trust beyond the user

Bad

Assume stakeholders understand data usage

Better

Build and communicate processes around data/meta-data, privacy, etc

line


[NEXT]

8. Identify and address cybersecurity risks

Bad

Assume there isn't a need to protect models

Better

Identify and address the threats of someone tricking, circumventing or hacking the models you create
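
As a rough illustration of "tricking" a model, the sketch below nudges each input feature slightly in the direction that raises a linear classifier's score until the label flips. The weights, input and `epsilon` are hypothetical, and this is a simplified sign-based perturbation rather than a specific published attack.

```python
def score(x, m, b):
    """Linear score m . x + b."""
    return sum(m_i * x_i for m_i, x_i in zip(m, x)) + b

def predict(x, m, b, threshold=0.5):
    return "triangle" if score(x, m, b) > threshold else "square"

def perturb(x, m, epsilon):
    """Shift each feature by epsilon in the direction that raises the score."""
    return tuple(x_i + epsilon * ((m_i > 0) - (m_i < 0)) for x_i, m_i in zip(x, m))

m, b = (-1.0, 0.5), 0.0  # hypothetical weights and bias
x = (4.0, 8.0)           # (area, perimeter) of a square with side 2
x_adv = perturb(x, m, epsilon=0.4)

print(predict(x, m, b))      # square
print(predict(x_adv, m, b))  # triangle
```

A change of 0.4 per feature is enough to flip the prediction in this toy setup, which is why input validation and monitoring belong in the threat model.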


[NEXT]

Next steps

Applying this thinking to your actual projects

#LetsDoThis

[NEXT SECTION]

3. Final words

[NEXT]

Today we covered

AI/ML Recap

Opportunities & Risks

Ethics by design

Next steps!

[NEXT]

Code

https://github.com/axsauze/apc-2018-privacy-conference

Slides

https://axsauze.github.io/apc-2018-privacy-conference

[NEXT]

Best Practices of AI in Business

The Institute for Ethical AI & ML

![portrait](images/alejandro.jpg)
Alejandro Saucedo

    <br>
    Chairman
    <br>
    <a style="color: cyan" href="http://ethical.institute">The Institute for Ethical AI & ML</a>
    <br>
    <br>
    AI Fellow / Member
    <br>
    <a style="color: cyan" href="#">The RSA & EU AI Alliance</a>
    <br>
    <br>
    Advisor
    <br>
    <a style="color: cyan" href="http://teensinai.com">TeensInAI.com initiative</a>
    <br>
    <br>
    Chief Engineer
    <br>
    <a style="color: cyan" href="http://eigentech.com">Exponential Technologies</a>
    <br>
    <br>
    
</td>