diff --git a/totrans/prac-dl-cld_09.yaml b/totrans/prac-dl-cld_09.yaml index 0d05588..e97659e 100644 --- a/totrans/prac-dl-cld_09.yaml +++ b/totrans/prac-dl-cld_09.yaml @@ -273,6 +273,8 @@ id: totrans-42 prefs: [] type: TYPE_TB + zh: '| ~1小时 | + 使扩展训练和推理易于管理+ 在云提供商之间可移植+ 开发和生产环境一致+ 对于数据科学家,与熟悉的工具(如Jupyter Notebooks)集成,用于将模型发送到生产环境+ + 可以组合条件管道以自动化测试、级联模型+ 使用现有的手动管理的服务库– 仍在发展中– 对于初学者,托管和管理的云堆栈提供了更简单的学习曲线 |' - en: In this chapter, we explore a range of tools and scenarios. Some of these options are easy to use but limited in functionality. Others offer more granular controls and higher performance but are more involved to set up. We look at one example @@ -283,22 +285,26 @@ id: totrans-43 prefs: [] type: TYPE_NORMAL + zh: 在本章中,我们探讨了一系列工具和场景。其中一些选项易于使用,但功能有限。其他选项提供更精细的控制和更高的性能,但设置更复杂。我们将看一个每个类别的例子,并深入研究,以便了解何时使用其中之一是有意义的。然后,我们将对不同解决方案进行成本分析,并详细介绍一些解决方案在实践中的工作方式。 - en: 'Flask: Build Your Own Server' id: totrans-44 prefs: - PREF_H1 type: TYPE_NORMAL + zh: Flask:构建自己的服务器 - en: We begin with the most basic technique of *Build Your Own Server* (BYOS). From the choices presented in the first column of [Table 9-1](part0011.html#tools_to_serve_deep_learning_models_over), we’ve selected Flask. id: totrans-45 prefs: [] type: TYPE_NORMAL + zh: 我们从*构建自己的服务器*(BYOS)的最基本技术开始。从[表9-1](part0011.html#tools_to_serve_deep_learning_models_over)的第一列中选择,我们选择了Flask。 - en: Making a REST API with Flask id: totrans-46 prefs: - PREF_H2 type: TYPE_NORMAL + zh: 使用Flask创建REST API - en: Flask is a Python-based web application framework. Released in 2010 and with more than 46,000 stars on GitHub, it is under continuous development. It’s also quick and easy to set up and is really useful for prototyping. It is often the @@ -308,10 +314,12 @@ id: totrans-47 prefs: [] type: TYPE_NORMAL + zh: Flask是一个基于Python的Web应用程序框架。它于2010年发布,在GitHub上拥有超过46,000颗星,正在持续开发。它快速且易于设置,对于原型设计非常有用。当数据科学从业者想要向一组有限的用户提供他们的模型(例如,在企业网络上与同事共享)时,通常会选择这个框架,而不会有太多麻烦。 - en: 'Installing Flask with `pip` is fairly straightforward:' id: totrans-48 prefs: [] type: TYPE_NORMAL + zh: 使用`pip`安装Flask相当简单: - en: '[PRE0]' id: totrans-49 prefs: [] @@ -322,6 +330,7 @@ id: totrans-50 prefs: [] type: TYPE_NORMAL + zh: 安装后,我们应该能够运行以下简单的“Hello World”程序: - en: '[PRE1]' id: totrans-51 prefs: [] @@ -331,6 +340,7 @@ id: totrans-52 prefs: [] type: TYPE_NORMAL + zh: 以下是运行“Hello World”程序的命令: - en: '[PRE2]' id: totrans-53 prefs: [] @@ -341,17 +351,21 @@ id: totrans-54 prefs: [] type: TYPE_NORMAL + zh: 默认情况下,Flask在端口5000上运行。当我们在浏览器中打开URL *http://localhost:5000/hello*时,我们应该看到“Hello + World!”这几个字,如[图9-2](part0011.html#navigate_to_httpcolonsolidussoliduslocal)所示。 - en: '![Navigate to http://localhost:5000/hello within a web browser to view the “Hello World!” web page](../images/00132.jpeg)' id: totrans-55 prefs: [] type: TYPE_IMG + zh: '![在Web浏览器中导航到http://localhost:5000/hello,查看“Hello World!”网页](../images/00132.jpeg)' - en: Figure 9-2\. Navigate to http://localhost:5000/hello within a web browser to view the “Hello World!” web page id: totrans-56 prefs: - PREF_H6 type: TYPE_NORMAL + zh: 图9-2。在Web浏览器中导航到http://localhost:5000/hello,查看“Hello World!”网页 - en: As you can see, it takes barely more than a few lines to get a simple web application up and running. One of the most important lines in that script is `@app.route("/hello")`. 
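Since the actual listing sits behind the `[PRE1]` placeholder above, here is a minimal sketch of what such a "Hello World" Flask app plausibly looks like (the function name is illustrative):

```python
# A minimal "Hello World" Flask app, as described in the surrounding text.
from flask import Flask

app = Flask(__name__)

@app.route("/hello")
def hello():
    # Whatever this function returns becomes the HTTP response body.
    return "Hello World!"

if __name__ == "__main__":
    app.run()  # serves on http://localhost:5000 by default
```

The `@app.route("/hello")` decorator is the line in question.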
It specifies that the path `/hello` after the hostname would be served by the @@ -361,11 +375,14 @@ id: totrans-57 prefs: [] type: TYPE_NORMAL + zh: 正如您所看到的,只需几行代码就可以让一个简单的Web应用程序运行起来。在该脚本中最重要的一行是`@app.route("/hello")`。它指定了主机名后的路径`/hello`将由其下方的方法提供服务。在我们的情况下,它只是返回字符串“Hello + World!”在下一步中,我们将看看如何将Keras模型部署到Flask服务器,并创建一个路由来提供我们模型的预测。 - en: Deploying a Keras Model to Flask id: totrans-58 prefs: - PREF_H2 type: TYPE_NORMAL + zh: 将Keras模型部署到Flask - en: 'Our first step is to load our Keras model. The following lines load the model from the .*h5* file. You’ll find the scripts for this chapter on the book’s GitHub (see [*http://PracticalDeepLearning.ai*](http://PracticalDeepLearning.ai)) in @@ -373,6 +390,7 @@ id: totrans-59 prefs: [] type: TYPE_NORMAL + zh: 我们的第一步是加载我们的Keras模型。以下行从.*h5*文件中加载模型。您可以在本书的GitHub(请参阅[*http://PracticalDeepLearning.ai*](http://PracticalDeepLearning.ai))的*code/chapter-9*目录中找到本章的脚本: - en: '[PRE3]' id: totrans-60 prefs: [] @@ -383,6 +401,7 @@ id: totrans-61 prefs: [] type: TYPE_NORMAL + zh: 现在,我们创建路由*/infer*,以支持对我们的图像进行推理。自然地,我们将支持`POST`请求以接受图像: - en: '[PRE4]' id: totrans-62 prefs: [] @@ -393,6 +412,7 @@ id: totrans-63 prefs: [] type: TYPE_NORMAL + zh: 为了测试推理,让我们使用`curl`命令,在包含一只狗的示例图像上进行如下操作: - en: '[PRE5]' id: totrans-64 prefs: [] @@ -405,6 +425,7 @@ id: totrans-65 prefs: [] type: TYPE_NORMAL + zh: 正如预期的那样,我们得到了“dog”的预测。到目前为止,这个方法运行得相当顺利。此时,Flask只在本地运行;也就是说,网络上的其他人无法向该服务器发出请求。要使Flask对其他人可用,我们只需将`app.run()`更改为以下内容: - en: '[PRE6]' id: totrans-66 prefs: [] @@ -421,65 +442,79 @@ id: totrans-67 prefs: [] type: TYPE_NORMAL + zh: 在这一点上,我们可以让我们的模型对我们网络内的任何人开放。下一个问题将是——我们是否可以做同样的事情让模型对普通公众可用?对于这个问题的答案是一个坚定的否定!Flask网站上有一个显著的警告:“警告:不要在生产环境中使用开发服务器。” + Flask确实不支持开箱即用的生产工作,需要自定义代码来启用。在接下来的部分中,我们将看看如何在适用于生产使用的系统上托管我们的模型。考虑到所有这些,让我们回顾一下使用Flask的一些优缺点。 - en: Pros of Using Flask id: totrans-68 prefs: - PREF_H2 type: TYPE_NORMAL + zh: 使用Flask的优点 - en: 'Flask provides some advantages, namely:' id: totrans-69 prefs: [] type: TYPE_NORMAL + zh: Flask提供了一些优势,包括: - en: Quick to set up and to prototype id: totrans-70 prefs: - PREF_UL type: TYPE_NORMAL + zh: 快速设置和原型设计 - en: Fast development cycle id: totrans-71 prefs: - PREF_UL type: TYPE_NORMAL + zh: 快速开发周期 - en: Lightweight on resources id: totrans-72 prefs: - PREF_UL type: TYPE_NORMAL + zh: 资源占用轻 - en: Broad appeal within the Python community id: totrans-73 prefs: - PREF_UL type: TYPE_NORMAL + zh: 在Python社区中具有广泛吸引力 - en: Cons of Using Flask id: totrans-74 prefs: - PREF_H2 type: TYPE_NORMAL + zh: 使用Flask的缺点 - en: 'At the same time, Flask might not be your best choice, for the following reasons:' id: totrans-75 prefs: [] type: TYPE_NORMAL + zh: 同时,出于以下原因,Flask可能不是您的最佳选择: - en: Cannot scale; by default, it is not meant for production loads. Flask can serve only one request at one time id: totrans-76 prefs: - PREF_UL type: TYPE_NORMAL + zh: 无法扩展;默认情况下,它不适用于生产负载。Flask一次只能处理一个请求 - en: Does not handle model versioning out of the box id: totrans-77 prefs: - PREF_UL type: TYPE_NORMAL + zh: 开箱即用不支持模型版本控制 - en: Does not support batching of requests out of the box id: totrans-78 prefs: - PREF_UL type: TYPE_NORMAL + zh: 不支持批量请求处理 - en: Desirable Qualities in a Production-Level Serving System id: totrans-79 prefs: - PREF_H1 type: TYPE_NORMAL + zh: 生产级服务系统中的理想特质 - en: For any cloud service that is serving traffic from the public, there are certain attributes that we want to look for when deciding to use a solution. 
In the context of machine learning, there are additional qualities that we would look for while
  id: totrans-80
  prefs: []
  type: TYPE_NORMAL
+ zh: 对于任何面向公众提供流量服务的云服务,在决定采用某个解决方案时,我们都希望它具备某些属性。在机器学习的背景下,构建推理服务系统时,我们还会考察一些额外的特质。本节将介绍其中的几个。
- en: High Availability
  id: totrans-81
  prefs:
  - PREF_H2
  type: TYPE_NORMAL
+ zh: 高可用性
- en: For our users to trust our service, it must be available almost always. For
    many serious players, they measure their availability metric in terms of “*number
    of nines*.” If a business claims that its service has four 9s availability, they
  id: totrans-82
  prefs: []
  type: TYPE_NORMAL
+ zh: 为了让用户信任我们的服务,它必须几乎始终可用。许多重量级厂商用“*九的个数*”来衡量可用性指标。如果一家企业声称其服务具有四个九的可用性,意思是系统在99.99%的时间里正常运行且可用。尽管99%听起来已经令人印象深刻,[表9-2](part0011.html#downtime_per_year_for_different_availabi)直观地对比了不同可用性百分比对应的每年停机时间。
- en: Table 9-2\. Downtime per year for different availability percentages
  id: totrans-83
  prefs: []
  type: TYPE_NORMAL
+ zh: 表9-2。不同可用性百分比的每年停机时间
- en: '| **Availability %** | **Downtime per year** |'
  id: totrans-84
  prefs: []
  type: TYPE_TB
+ zh: '| **可用性 %** | **每年停机时间** |'
- en: '| --- | --- |'
  id: totrans-85
  prefs: []
  type: TYPE_TB
+ zh: '| --- | --- |'
- en: '| 99% (“two nines”) | 3.65 days |'
  id: totrans-86
  prefs: []
  type: TYPE_TB
+ zh: '| 99%(“两个九”)| 3.65天 |'
- en: '| 99.9% (“three nines”) | 8.77 hours |'
  id: totrans-87
  prefs: []
  type: TYPE_TB
+ zh: '| 99.9%(“三个九”)| 8.77小时 |'
- en: '| 99.99% (“four nines”) | 52.6 minutes |'
  id: totrans-88
  prefs: []
  type: TYPE_TB
+ zh: '| 99.99%(“四个九”)| 52.6分钟 |'
- en: '| 99.999% (“five nines”) | 5.26 minutes |'
  id: totrans-89
  prefs: []
  type: TYPE_TB
+ zh: '| 99.999%(“五个九”)| 5.26分钟 |'
- en: Imagine how ridiculous the situation would be if a major website like Amazon
    were only 99.9% available, losing millions in user revenue during the eight-plus
    hours of downtime. Five 9s is considered the holy grail. Anything less than three
  id: totrans-90
  prefs: []
  type: TYPE_NORMAL
+ zh: 想象一下,如果像亚马逊这样的大型网站只有99.9%的可用性,每年八个多小时的停机将造成数百万美元的收入损失,那将是多么荒谬。五个九被视为圣杯。低于三个九的可用性通常不适合高质量的生产系统。
- en: Scalability
  id: totrans-91
  prefs:
  - PREF_H2
  type: TYPE_NORMAL
+ zh: 可扩展性
- en: Traffic handled by production services is almost never uniform across a larger
    time period. For example, the *New York Times* experiences significantly more
    traffic during morning hours, whereas Netflix typically experiences a surge in
  id: totrans-92
  prefs: []
  type: TYPE_NORMAL
+ zh: 生产服务处理的流量在较长时间段内几乎从不均匀。例如,《纽约时报》在早晨的流量明显更多,而Netflix的流量通常在人们放松的晚上和深夜出现激增。流量还受季节因素影响:亚马逊在黑色星期五和圣诞季的流量要高出几个数量级。
- en: 'A higher demand requires a higher amount of resources being available and
    online to serve them. Otherwise, the availability of the system would be in jeopardy.
    A naive way to accomplish this would be to anticipate the highest volume of traffic
  id: totrans-93
  prefs: []
  type: TYPE_NORMAL
+ zh: 需求越高,就需要有越多的资源在线可用来提供服务,否则系统的可用性就会受到威胁。一种天真的做法是:预估系统需要承载的最高流量,确定服务该流量水平所需的资源数量,然后永久分配这些资源。这种方法有两个问题:1)如果您的规划正确,那么资源在大部分时间都会被低效使用,实际上是在浪费金钱;2)如果您估计不足,则可能影响服务的可用性,最终失去客户的信任和钱包。
- en: A smarter way to manage traffic loads is to monitor them as they are coming
    in and dynamically allocate and deallocate resources that are available for service. 
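As an aside, the downtime figures in Table 9-2 are easy to verify with a few lines of Python (a sketch assuming a 365-day year, which matches the table up to rounding):

```python
# Downtime per year implied by an availability percentage.
MINUTES_PER_YEAR = 365 * 24 * 60  # 525,600 minutes

for availability in (99.0, 99.9, 99.99, 99.999):
    downtime_minutes = (1 - availability / 100) * MINUTES_PER_YEAR
    print(f"{availability}% -> {downtime_minutes:,.2f} minutes/year")

# 99.0%   -> 5,256.00 minutes/year  (~3.65 days)
# 99.9%   -> 525.60 minutes/year    (~8.76 hours)
# 99.99%  -> 52.56 minutes/year
# 99.999% -> 5.26 minutes/year
```

Back, then, to the smarter approach: monitoring incoming load and adjusting resources on the fly.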
This ensures that the increased traffic is handled without loss of service while @@ -570,6 +619,7 @@ id: totrans-94 prefs: [] type: TYPE_NORMAL + zh: 管理流量负载的更智能的方法是在流入时监视它们,并动态分配和释放可用于服务的资源。这确保了增加的流量在不丢失服务的情况下处理,同时在低流量时期将运营成本降至最低。 - en: When scaling down resources, any resource that is about to be deallocated is quite likely to be processing traffic at that moment. It’s essential to ensure that all of those requests be completed before shutting down that resource. Also, @@ -579,11 +629,13 @@ id: totrans-95 prefs: [] type: TYPE_NORMAL + zh: 在缩减资源时,即将被释放的任何资源很可能正在处理流量。在关闭该资源之前,必须确保所有这些请求都已完成。此外,关键是资源不能处理任何新请求。这个过程被称为*排空*。当机器因例行维护和/或升级而关闭时,排空也至关重要。 - en: Low Latency id: totrans-96 prefs: - PREF_H2 type: TYPE_NORMAL + zh: 低延迟 - en: Consider these facts. Amazon published a study in 2008 in which it found that every 100 ms increase in latency in its retail website resulted in a 1% loss of profit. A one-second delay in loading the website caused a whopping $1.6 billion @@ -595,6 +647,7 @@ id: totrans-97 prefs: [] type: TYPE_NORMAL + zh: 考虑这些事实。亚马逊在2008年发表了一项研究,发现其零售网站的每增加100毫秒的延迟会导致1%的利润损失。网站加载延迟一秒会导致高达16亿美元的收入损失!谷歌发现移动网站500毫秒的延迟导致流量下降20%。换句话说,广告服务的机会减少了20%。而这不仅影响行业巨头。如果一个网页在手机上加载时间超过三秒,53%的用户会放弃它(根据谷歌2017年的一项研究)。显然,时间就是金钱。 - en: Reporting average latency can be misleading because it might paint a cheerier picture than a ground reality. It’s like saying if Bill Gates walks into a room, everyone is a billionaire on average. Instead, percentile latency is the typically @@ -606,11 +659,13 @@ id: totrans-98 prefs: [] type: TYPE_NORMAL + zh: 报告平均延迟可能会产生误导,因为它可能比实际情况更乐观。这就好比说如果比尔·盖茨走进一个房间,那么每个人平均都是亿万富翁。相反,百分位延迟通常是报告的指标。例如,一个服务可能报告99th百分位的987毫秒。这意味着99%的请求在987毫秒或更短的时间内得到响应。同一个系统的平均延迟可能是20毫秒。当然,随着对您的服务的流量增加,如果服务没有扩展以提供足够的资源,延迟可能会增加。因此,延迟、高可用性和可伸缩性是相互交织在一起的。 - en: Geographic Availability id: totrans-99 prefs: - PREF_H2 type: TYPE_NORMAL + zh: 地理可用性 - en: The distance between New York and Sydney is nearly 10,000 miles (16,000 km). The speed of light in a vacuum is roughly 186,282 miles per second (300,000 km per second). Silica glass (used in fiber-optic cables) decreases the speed of @@ -624,6 +679,7 @@ id: totrans-100 prefs: [] type: TYPE_NORMAL + zh: 纽约和悉尼之间的距离接近10,000英里(16,000公里)。真空中的光速大约为每秒186,282英里(每秒300,000公里)。二氧化硅玻璃(用于光纤电缆)将光速降低约30%,降至每秒130,487英里(每秒210,000公里)。在连接这两个城市之间的一条直线上运行的光纤上,单个请求的往返传输时间仅为约152毫秒。请记住,这并不考虑请求在服务器上处理所需的时间,或者数据包在途中通过多个路由器进行跳转的时间。对于许多应用程序来说,这种服务水平是不可接受的。 - en: Services that expect to be used throughout the world must be strategically located to minimize latency for the users in those regions. Additionally, resources can be dynamically scaled up or down depending on local traffic, thus giving more @@ -632,52 +688,63 @@ id: totrans-101 prefs: [] type: TYPE_NORMAL + zh: 希望在全球范围内使用的服务必须被战略性地放置,以最小化用户在这些地区的延迟。此外,资源可以根据当地流量动态扩展或缩减,从而提供更精细的控制。主要的云服务提供商至少在五个大洲设有基地(抱歉企鹅!)。 - en: Tip id: totrans-102 prefs: - PREF_H6 type: TYPE_NORMAL + zh: 提示 - en: Want to simulate how long the incoming requests would take from your computer to a particular datacenter around the world? [Table 9-3](part0011.html#latency_measurement_tools_for_different) lists a few handy browser-based tools offered by cloud providers. id: totrans-103 prefs: [] type: TYPE_NORMAL + zh: 想要模拟从您的计算机到世界各地特定数据中心的传入请求需要多长时间吗?[表9-3](part0011.html#latency_measurement_tools_for_different)列出了一些由云服务提供商提供的方便的基于浏览器的工具。 - en: Table 9-3\. Latency measurement tools for different cloud providers id: totrans-104 prefs: [] type: TYPE_NORMAL + zh: 表9-3. 
不同云服务提供商的延迟测量工具
- en: '| **Service** | **Cloud provider** |'
  id: totrans-105
  prefs: []
  type: TYPE_TB
+ zh: '| **服务** | **云服务提供商** |'
- en: '| --- | --- |'
  id: totrans-106
  prefs: []
  type: TYPE_TB
+ zh: '| --- | --- |'
- en: '| [AzureSpeed.com](http://AzureSpeed.com) | Microsoft Azure |'
  id: totrans-107
  prefs: []
  type: TYPE_TB
+ zh: '| [AzureSpeed.com](http://AzureSpeed.com) | 微软Azure |'
- en: '| [CloudPing.info](https://CloudPing.info) | Amazon Web Services |'
  id: totrans-108
  prefs: []
  type: TYPE_TB
+ zh: '| [CloudPing.info](https://CloudPing.info) | 亚马逊网络服务 |'
- en: '| [GCPing.com](http://GCPing.com) | Google Cloud Platform |'
  id: totrans-109
  prefs: []
  type: TYPE_TB
+ zh: '| [GCPing.com](http://GCPing.com) | 谷歌云平台 |'
- en: Additionally, to determine realistic combinations of latency from one location
    to another, *CloudPing.co* measures AWS Inter-Region Latency, between more than
    16 US-based AWS datacenters to one another.
  id: totrans-110
  prefs: []
  type: TYPE_NORMAL
+ zh: 此外,要了解不同地点之间实际的延迟情况,*CloudPing.co*测量了AWS的区域间延迟,涵盖16个以上美国AWS数据中心彼此之间的延迟。
- en: Failure Handling
  id: totrans-111
  prefs:
  - PREF_H2
  type: TYPE_NORMAL
+ zh: 故障处理
- en: There’s an old saying that there are only two things that are assured in life—death
    and taxes. In the twenty-first century, this adage applies not just to humans
    but also computer hardware. Machines fail all the time. The question is never
@@ -1346,6 +1413,7 @@
  id: totrans-201
  prefs: []
  type: TYPE_NORMAL
+ zh: 在本书中,我们探讨了端到端深度学习流水线的各个步骤:从数据摄取、分析、规模化的分布式训练(包括超参数调优)、实验跟踪、部署,一直到规模化地响应预测请求。每一步都有其自身的复杂性,以及配套的工具、生态系统和专业领域,有人会把毕生精力投入其中某一个领域,掌握这一切绝非易事。再算上必要的后端工程、硬件工程、基础设施工程、依赖管理、DevOps、容错等工程挑战,所需专业知识的组合爆炸会让大多数组织的招聘成本变得极其高昂。
- en: As we saw in the previous section, Docker saves us the hassle of dependency
    management by making portable containers available. It helps us make TensorFlow
    Serving available across platforms easily without having to build it from source
  id: totrans-202
  prefs: []
  type: TYPE_NORMAL
+ zh: 正如我们在前一节中看到的,Docker通过提供可移植的容器,省去了我们管理依赖关系的麻烦。它帮助我们轻松地在各个平台上提供TensorFlow Serving,而无需从源代码构建或手动安装依赖项。太好了!但许多其他挑战仍未解决:我们如何扩展容器以匹配需求的增长?我们如何有效地在容器之间分发流量?我们如何确保容器彼此可见并能够通信?
- en: These are questions answered by *Kubernetes*. Kubernetes is an orchestration
    framework for automatically deploying, scaling, and managing containers (like
    Docker). Because it takes advantage of the portability offered by Docker, we can
  id: totrans-203
  prefs: []
  type: TYPE_NORMAL
+ zh: 这些问题正是*Kubernetes*要解决的。Kubernetes是一个自动部署、扩展和管理容器(如Docker)的编排框架。由于它利用了Docker提供的可移植性,我们可以用几乎相同的方式把系统部署到开发人员的笔记本电脑或数千台机器的集群上。这有助于在不同环境中保持一致性,同时让可扩展性触手可及。值得注意的是,Kubernetes并不是机器学习专用的解决方案(Docker也不是);相反,它是解决软件开发中许多问题的通用方案,我们只是在深度学习的场景中使用它。
- en: But let’s not get ahead of ourselves just yet. After all, if Kubernetes were
    the be-all and end-all solution, it would have appeared in the chapter title!
    A machine learning practitioner using Kubernetes still needs to assemble all of
  id: totrans-204
  prefs: []
  type: TYPE_NORMAL
+ zh: 但让我们先别操之过急。毕竟,如果Kubernetes是终极解决方案,它早就出现在本章标题里了!使用Kubernetes的机器学习从业者仍然需要自行组装所有合适的容器(用于训练、部署、监控、API管理等),再把它们编排起来,才能构建一个完整运转的端到端流水线。不幸的是,许多数据科学家正各自为战,在自己的孤岛里重新发明轮子,搭建临时的机器学习专用流水线。我们能不能省去所有人的麻烦,为机器学习场景打造一个基于Kubernetes的现成方案呢?
- en: Enter *KubeFlow*, which promises to automate a large chunk of these engineering
    challenges and hide the complexity of running a distributed, scalable, end-to-end
    deep learning system behind a web GUI-based tool and a powerful command-line tool. 
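(A brief aside before diving into KubeFlow: the TensorFlow Serving container mentioned above exposes a REST endpoint once it is running, and querying it takes only a few lines. A sketch, assuming the container was started with `-p 8501:8501` and serves a model named `dogcat`; both the port mapping and the model name are assumptions, not fixed requirements:)

```python
# Query a running TensorFlow Serving container over its REST API.
import json

import requests  # pip install requests

# TensorFlow Serving expects a JSON body of the form {"instances": [...]};
# the input below is a dummy 224x224x3 image, and the shape depends on your model.
dummy_image = [[[0.0, 0.0, 0.0]] * 224 for _ in range(224)]
payload = {"instances": [dummy_image]}

response = requests.post(
    "http://localhost:8501/v1/models/dogcat:predict",
    data=json.dumps(payload),
)
print(response.json())  # e.g., {"predictions": [[0.02, 0.98]]}
```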
  id: totrans-205
  prefs: []
  type: TYPE_NORMAL
+ zh: '*KubeFlow*登场了。它承诺自动化大部分这类工程挑战,并把运行分布式、可扩展、端到端深度学习系统的复杂性,隐藏在一个基于Web GUI的工具和一个强大的命令行工具之后。它远不只是一个推理服务,不妨把它看作一个大型工具生态系统:各组件可以无缝互操作,更重要的是,可以随需求扩展。KubeFlow是为云构建的,而且不是只为某一家云,它与所有主要云提供商都兼容。这对成本有重大影响:因为我们不被特定云提供商锁定,一旦竞争对手降价,我们随时可以自由地迁移全部工作负载。毕竟,竞争有利于消费者。'
- en: KubeFlow supports a variety of hardware infrastructure, from developer laptops
    and on-premises datacenters, all the way to public cloud services. And because
    it’s built on top of Docker and Kubernetes, we can rest assured that the environments
  id: totrans-206
  prefs: []
  type: TYPE_NORMAL
+ zh: KubeFlow支持各种硬件基础设施,从开发人员的笔记本电脑和本地数据中心,一直到公共云服务。由于它构建在Docker和Kubernetes之上,我们可以放心:无论部署在开发人员的笔记本电脑上,还是数据中心的大型集群上,环境都是相同的。开发环境与生产环境之间的任何差异都可能导致故障,因此这种跨环境的一致性非常有价值。
- en: '[Table 9-4](part0011.html#tools_available_on_kubeflow) shows a brief list
    of readily available tools within the KubeFlow ecosystem.'
  id: totrans-207
  prefs: []
  type: TYPE_NORMAL
+ zh: '[表9-4](part0011.html#tools_available_on_kubeflow)展示了KubeFlow生态系统中现成可用的工具的简要列表。'
- en: Table 9-4\. Tools available on KubeFlow
  id: totrans-208
  prefs: []
  type: TYPE_NORMAL
+ zh: 表9-4\. KubeFlow上可用的工具
- en: '| **Tool** | **Functionality** |'
  id: totrans-209
  prefs: []
  type: TYPE_TB
+ zh: '| **工具** | **功能** |'
- en: '| --- | --- |'
  id: totrans-210
  prefs: []
  type: TYPE_TB
+ zh: '| --- | --- |'
- en: '| Jupyter Hub | Notebook environment |'
  id: totrans-211
  prefs: []
  type: TYPE_TB
+ zh: '| Jupyter Hub | 笔记本环境 |'
- en: '| TFJob | Training TensorFlow models |'
  id: totrans-212
  prefs: []
  type: TYPE_TB
+ zh: '| TFJob | 训练TensorFlow模型 |'
- en: '| TensorFlow Serving | Serving TensorFlow models |'
  id: totrans-213
  prefs: []
  type: TYPE_TB
+ zh: '| TensorFlow Serving | 服务TensorFlow模型 |'
- en: '| Seldon | Serving models |'
  id: totrans-214
  prefs: []
  type: TYPE_TB
+ zh: '| Seldon | 服务模型 |'
- en: '| NVIDIA TensorRT | Serving models |'
  id: totrans-215
  prefs: []
  type: TYPE_TB
+ zh: '| NVIDIA TensorRT | 服务模型 |'
- en: '| Intel OpenVINO | Serving models |'
  id: totrans-216
  prefs: []
  type: TYPE_TB
+ zh: '| Intel OpenVINO | 服务模型 |'
- en: '| KFServing | Abstraction for serving Tensorflow, XGBoost, scikit-learn, PyTorch,
    and ONNX models |'
  id: totrans-217
  prefs: []
  type: TYPE_TB
+ zh: '| KFServing | 用于服务Tensorflow、XGBoost、scikit-learn、PyTorch和ONNX模型的抽象 |'
- en: '| Katib | Hyperparameter tuning and NAS |'
  id: totrans-218
  prefs: []
  type: TYPE_TB
+ zh: '| Katib | 超参数调整和NAS |'
- en: '| Kubebench | Running benchmarking jobs |'
  id: totrans-219
  prefs: []
  type: TYPE_TB
+ zh: '| Kubebench | 运行基准测试作业 |'
- en: '| PyTorch | Training PyTorch models |'
  id: totrans-220
  prefs: []
  type: TYPE_TB
+ zh: '| PyTorch | 训练PyTorch模型 |'
- en: '| Istio | API services, authentication, A/B testing, rollouts, metrics |'
  id: totrans-221
  prefs: []
  type: TYPE_TB
+ zh: '| Istio | API服务、身份验证、A/B测试、发布、指标 |'
- en: '| Locust | Load testing |'
  id: totrans-222
  prefs: []
  type: TYPE_TB
+ zh: '| Locust | 负载测试 |'
- en: '| Pipelines | Managing experiments, jobs, and runs, scheduling machine learning
    workflows |'
  id: totrans-223
  prefs: []
  type: TYPE_TB
+ zh: '| Pipelines | 管理实验、作业和运行,调度机器学习工作流程 |'
- en: As the joke goes in the community, with so many technologies prepackaged, KubeFlow
    finally makes our résumés buzzword- (and recruiter-) compliant. 
  id: totrans-224
  prefs: []
  type: TYPE_NORMAL
+ zh: 社区里流传着一个笑话:预打包了这么多技术,KubeFlow总算让我们的简历塞满了流行词,也让招聘人员满意了。
- en: Note
  id: totrans-225
  prefs:
  - PREF_H6
  type: TYPE_NORMAL
+ zh: 注
- en: Many people assume that KubeFlow is a combination of Kubernetes and TensorFlow,
    which, as you have seen, is not the case. It is that and much more.
  id: totrans-226
  prefs: []
  type: TYPE_NORMAL
+ zh: 许多人以为KubeFlow就是Kubernetes加TensorFlow的组合,但正如您所见,事实并非如此——它既包含这两者,又远不止于此。
- en: 'There are two important parts to KubeFlow that make it unique: pipelines and
    fairing.'
  id: totrans-227
  prefs: []
  type: TYPE_NORMAL
+ zh: KubeFlow有两个使其与众不同的重要组成部分:管道(pipelines)和Fairing。
- en: Pipelines
  id: totrans-228
  prefs:
  - PREF_H2
  type: TYPE_NORMAL
+ zh: 管道
- en: Pipelines give us the ability to compose steps across the machine learning
    to schedule complex workflows. [Figure 9-11](part0011.html#an_end-to-end_pipeline_illustrated_in_ku)
    shows us an example of a pipeline. Having visibility into the pipeline through
  id: totrans-229
  prefs: []
  type: TYPE_NORMAL
+ zh: 管道使我们能够把机器学习各环节的步骤组合起来,编排复杂的工作流程。[图9-11](part0011.html#an_end-to-end_pipeline_illustrated_in_ku)展示了一个管道的示例。通过GUI工具洞察管道,可以帮助各方利益相关者(而不仅仅是构建它的工程师)理解它。
- en: '![An end-to-end pipeline illustrated in KubeFlow](../images/00094.jpeg)'
  id: totrans-230
  prefs: []
  type: TYPE_IMG
+ zh: '![在KubeFlow中展示的端到端管道](../images/00094.jpeg)'
- en: Figure 9-11\. An end-to-end pipeline illustrated in KubeFlow
  id: totrans-231
  prefs:
  - PREF_H6
  type: TYPE_NORMAL
+ zh: 图9-11\. 在KubeFlow中展示的端到端管道
- en: Fairing
  id: totrans-232
  prefs:
  - PREF_H2
  type: TYPE_NORMAL
+ zh: Fairing
- en: 'Fairing allows us to manage the entire build, train, and deploy lifecycle
    directly through Jupyter Notebooks. [Figure 9-12](part0011.html#creating_a_new_jupyter_notebook_server_o)
    shows how to start a new notebook server, where we can host all of our Jupyter
  id: totrans-233
  prefs: []
  type: TYPE_NORMAL
+ zh: Fairing允许我们直接通过Jupyter Notebooks管理整个构建、训练和部署生命周期。[图9-12](part0011.html#creating_a_new_jupyter_notebook_server_o)展示了如何启动一个新的笔记本服务器:我们可以在上面托管所有的Jupyter笔记本并运行训练,还能在非常熟悉的Jupyter环境中,用几行代码将模型部署到谷歌云:
- en: '[PRE13]'
  id: totrans-234
  prefs: []
- en: '![Creating a new Jupyter Notebook server on KubeFlow](../images/00197.jpeg)'
  id: totrans-235
  prefs: []
  type: TYPE_IMG
+ zh: '![在KubeFlow上创建一个新的Jupyter Notebook服务器](../images/00197.jpeg)'
- en: Figure 9-12\. Creating a new Jupyter Notebook server on KubeFlow
  id: totrans-236
  prefs:
  - PREF_H6
  type: TYPE_NORMAL
+ zh: 图9-12\. 在KubeFlow上创建一个新的Jupyter Notebook服务器
- en: Installation
  id: totrans-237
  prefs:
  - PREF_H2
  type: TYPE_NORMAL
+ zh: 安装
- en: Creating a new KubeFlow deployment is a fairly straightforward process that
    is well documented on the KubeFlow website. You can set up KubeFlow using the
    browser for GCP. Alternatively, you can use the KubeFlow command-line tool to
  id: totrans-238
  prefs: []
  type: TYPE_NORMAL
+ zh: 创建一个新的KubeFlow部署是一个相当简单的过程,KubeFlow网站上有详细的文档。对于GCP,您可以直接在浏览器中完成设置;或者,您也可以使用KubeFlow命令行工具在GCP、AWS和Microsoft Azure上进行部署。[图9-13](part0011.html#creating_a_kubeflow_deployment_on_gcp_us)展示了使用Web浏览器在GCP上部署的过程。
- en: '![Creating a KubeFlow deployment on GCP using the browser](../images/00014.jpeg)'
  id: totrans-239
  prefs: []
  type: TYPE_IMG
+ zh: '![使用浏览器在GCP上创建KubeFlow部署](../images/00014.jpeg)'
- en: Figure 9-13\. Creating a KubeFlow deployment on GCP using the browser
  id: totrans-240
  prefs:
  - PREF_H6
  type: TYPE_NORMAL
+ zh: 图9-13\. 使用浏览器在GCP上创建KubeFlow部署
- en: As of this writing, KubeFlow is in active development and shows no signs of
    stopping. 
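(To make the Pipelines section above concrete, here is a minimal sketch using the v1-era `kfp` SDK; the container images are hypothetical, and the SDK surface has kept evolving, so treat this as illustrative rather than current best practice:)

```python
# Compose a toy two-step KubeFlow pipeline and compile it to an archive
# that can be uploaded through the Pipelines GUI.
import kfp
from kfp import dsl

@dsl.pipeline(name="train-and-serve", description="Toy two-step pipeline")
def train_and_serve():
    preprocess = dsl.ContainerOp(
        name="preprocess",
        image="gcr.io/my-project/preprocess:latest",  # hypothetical image
    )
    train = dsl.ContainerOp(
        name="train",
        image="gcr.io/my-project/train:latest",  # hypothetical image
    )
    train.after(preprocess)  # training runs only after preprocessing succeeds

kfp.compiler.Compiler().compile(train_and_serve, "train_and_serve.tar.gz")
```

That churn in the SDK is itself a sign of momentum.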
Companies such as Red Hat, Cisco, Dell, Uber, and Alibaba are some of the active
    contributors on top of cloud giants like Microsoft, Google, and IBM.
  id: totrans-241
  prefs: []
  type: TYPE_NORMAL
+ zh: 截至本文撰写时,KubeFlow仍在积极开发中,丝毫没有停下的迹象。除了微软、谷歌和IBM等云巨头之外,红帽、思科、戴尔、优步和阿里巴巴等公司也是活跃的贡献者。让攻克难题变得简单易上手,会吸引更多人投入一个平台,而KubeFlow正是在做这件事。
- en: Price Versus Performance Considerations
  id: totrans-242
  prefs:
  - PREF_H1
  type: TYPE_NORMAL
+ zh: 价格与性能考虑
- en: 'In [Chapter 6](part0008.html#7K4G3-13fa565533764549a6f0ab7f11eed62b), we looked
    at how to improve our model performance for inference (whether on smartphones
    or on a server). Now let’s look from another side: the hardware performance and
@@ -1667,11 +1776,13 @@
  id: totrans-252
  prefs: []
  type: TYPE_NORMAL
+ zh: 总之,在高QPS场景下,自行编排云端机器环境所带来的成本节约和性能优势,在[图9-15](part0011.html#cost_comparison_of_infrastructure_as_a_s)中得到了充分展示。
- en: '![Cost comparison of Infrastructure as a service (Google Cloud ML Engine)
    versus Building Your Own Stack over Virtual Machines (Azure VM) (Costs as of
    August 2019)](../images/00244.jpeg)'
  id: totrans-253
  prefs: []
  type: TYPE_IMG
+ zh: '![基础设施即服务(Google Cloud ML Engine)与在虚拟机上构建自己的堆栈(Azure VM)的成本比较(截至2019年8月的成本)](../images/00244.jpeg)'
- en: Figure 9-15\. Cost comparison of infrastructure as a service (Google Cloud
    ML Engine) versus building your own stack over virtual machines (Azure VM) (costs
    as of August 2019)
  id: totrans-254
  prefs:
  - PREF_H6
  type: TYPE_NORMAL
+ zh: 图9-15。基础设施即服务(Google Cloud ML Engine)与在虚拟机上构建自己的堆栈(Azure VM)的成本比较(截至2019年8月的成本)
- en: Tip
  id: totrans-255
  prefs:
  - PREF_H6
  type: TYPE_NORMAL
+ zh: 提示
- en: A common question that arises while benchmarking is what is my system’s limit?
    [JMeter](https://jmeter.apache.org) can help answer this. JMeter is a load-testing
    tool that lets you perform stress testing of your system with an easy-to-use graphical
  id: totrans-256
  prefs: []
  type: TYPE_NORMAL
+ zh: 基准测试中经常出现的一个问题是:我的系统的极限在哪里?[JMeter](https://jmeter.apache.org)可以帮助回答这个问题。JMeter是一个负载测试工具,可以让您通过易于使用的图形界面对系统进行压力测试。它允许您创建可重用的配置来模拟各种使用场景。
- en: Summary
  id: totrans-257
  prefs:
  - PREF_H1
  type: TYPE_NORMAL
+ zh: 总结
- en: 'In this chapter, we answered the question most engineers and developers ask:
    how do we serve model prediction requests at scale for applications in the real
    world? We explored four different methods of serving an image recognition model:
  id: totrans-258
  prefs: []
  type: TYPE_NORMAL
+ zh: 在本章中,我们回答了大多数工程师和开发人员都会问的问题:如何为现实世界的应用程序规模化地处理模型预测请求?我们探讨了提供图像识别模型服务的四种不同方法:使用Flask、Google Cloud ML、TensorFlow Serving和KubeFlow。根据规模、延迟要求和自身的技能水平,某些方案会比其他方案更有吸引力。最后,我们对不同技术栈的成本效益有了直观的认识。现在我们可以向世界展示出色的分类器了,剩下的就是让我们的作品病毒式传播开来!
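In the spirit of the JMeter tip above, here is a toy benchmark that reports mean versus p99 latency against an inference endpoint (the URL, form-field name, and image path are hypothetical; a real load test would also sweep concurrency levels, which tools like JMeter and Locust handle for you):

```python
# Fire sequential requests at an inference endpoint; report mean and p99 latency.
import statistics
import time

import requests  # pip install requests

URL = "http://localhost:5000/infer"          # hypothetical endpoint
image_bytes = open("dog.jpg", "rb").read()   # hypothetical sample image

latencies_ms = []
for _ in range(200):
    start = time.perf_counter()
    requests.post(URL, files={"file": ("dog.jpg", image_bytes)})
    latencies_ms.append((time.perf_counter() - start) * 1000)

latencies_ms.sort()
p99 = latencies_ms[int(len(latencies_ms) * 0.99) - 1]
print(f"mean = {statistics.mean(latencies_ms):.1f} ms, p99 = {p99:.1f} ms")
```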