From 40371eb84250c5264775bb49455c73f4528d4ed7 Mon Sep 17 00:00:00 2001
From: zhihuiwan <15779896112@163.com>
Date: Fri, 29 Dec 2023 17:39:07 +0800
Subject: [PATCH 1/2] update quick start

Signed-off-by: zhihuiwan <15779896112@163.com>
---
 doc/quick_start.md    | 408 +++++++++++++++++++++++++++---------------
 doc/quick_start.zh.md |  20 +--
 2 files changed, 276 insertions(+), 152 deletions(-)

diff --git a/doc/quick_start.md b/doc/quick_start.md
index bfd79833c..5a1505480 100644
--- a/doc/quick_start.md
+++ b/doc/quick_start.md
@@ -1,31 +1,31 @@
 # Quick Start

 ## 1. Environment Setup
-You can choose one of the following three deployment modes based on your requirements:
+You can choose from the following three deployment modes based on your requirements:

 ### 1.1 Pypi Package Installation
-Note: This mode operates in a single-machine mode.
+Note: This mode operates in a single-machine environment.

 #### 1.1.1 Installation
 - Prepare and install [conda](https://docs.conda.io/projects/miniconda/en/latest/) environment.
 - Create a virtual environment:
 ```shell
-# FATE requires Python >= 3.8
+# FATE requires python>=3.8
 conda create -n fate_env python=3.8
 conda activate fate_env
 ```
 - Install FATE Flow and related dependencies:
 ```shell
-pip install fate_client[fate,fate_flow]==2.0.0.b0
+pip install fate_client[fate,fate_flow]==2.0.0
 ```

 #### 1.1.2 Service Initialization
 ```shell
 fate_flow init --ip 127.0.0.1 --port 9380 --home $HOME_DIR
 ```
-- `ip`: The IP address where the service runs.
-- `port`: The HTTP port the service runs on.
-- `home`: The data storage directory, including data, models, logs, job configurations, and SQLite databases.
+- `ip`: IP address the service runs on
+- `port`: HTTP port the service listens on
+- `home`: Data storage directory, holding data, models, logs, job configurations, and sqlite.db

 #### 1.1.3 Service Start/Stop
 ```shell
@@ -33,113 +33,39 @@ fate_flow status/start/stop/restart
 ```

 ### 1.2 Standalone Deployment
-Refer to [Standalone Deployment](https://github.com/FederatedAI/FATE/tree/v2.0.0-beta/deploy/standalone-deploy/README.zh.md).
+Refer to [Standalone Deployment](https://github.com/FederatedAI/FATE/tree/v2.0.0/deploy/standalone-deploy/README.zh.md).

 ### 1.3 Cluster Deployment
-Refer to [Allinone Deployment](https://github.com/FederatedAI/FATE/tree/v2.0.0-beta/deploy/cluster-deploy/allinone/fate-allinone_deployment_guide.zh.md).
+Refer to [Allinone Deployment](https://github.com/FederatedAI/FATE/tree/v2.0.0/deploy/cluster-deploy/allinone/fate-allinone_deployment_guide.zh.md).

 ## 2. User Guide
-FATE provides client tools including SDK, CLI, and Pipeline. If you don't have FATE Client deployed in your environment, you can download it using `pip install fate_client`. The following operations are based on CLI.
+FATE provides a client package including SDK, CLI, and Pipeline. If FATE Client is not deployed in your environment, you can install it with `pip install fate_client`. The following operations are CLI-based.

 ### 2.1 Data Upload
-In version 2.0-beta, data uploading is a two-step process:
-
-- **upload**: Uploads data to FATE-supported storage services.
-- **transformer**: Transforms data into a DataFrame.
-
-#### 2.1.1 upload
-##### 2.1.1.1 Configuration and Data
-- Upload configuration can be found at [examples-upload](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0-beta/examples/upload), and the data is located at [upload-data](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0-beta/examples/data).
-- You can also use your own data and modify the "meta" information in the upload configuration.
-
-##### 2.1.1.2 Upload Guest Data
+For detailed data operation guides, refer to the [Data Access Guide](data_access.zh.md).
+#### 2.1.1 Configuration and Data
+ - Upload configuration: [examples-upload](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0/examples/upload)
+ - Upload data: [upload-data](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0/examples/data)
+#### 2.1.2 Upload Guest Data
 ```shell
 flow data upload -c examples/upload/upload_guest.json
 ```
-- Record the returned "name" and "namespace" for use in the transformer phase.
-
-##### 2.1.1.3 Upload Host Data
+#### 2.1.3 Upload Host Data
 ```shell
 flow data upload -c examples/upload/upload_host.json
 ```
-- Record the returned "name" and "namespace" for use in the transformer phase.
-
-##### 2.1.1.4 Upload Result
-```json
-{
-    "code": 0,
-    "data": {
-        "name": "36491bc8-3fef-11ee-be05-16b977118319",
-        "namespace": "upload"
-    },
-    "job_id": "202308211451535620150",
-    "message": "success"
-}
-```
-Where "namespace" and "name" identify the data in FATE for future reference in the transformer phase.
-
-##### 2.1.1.5 Data Query
-Since upload is an asynchronous operation, you need to confirm if it was successful before proceeding to the next step.
-```shell
-flow table query --namespace upload --name 36491bc8-3fef-11ee-be05-16b977118319
-```
-If the returned code is 0, the upload was successful.
-
-#### 2.1.2 Transformer
-##### 2.1.2.1 Configuration
-- Transformer configuration can be found at [examples-transformer](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0-beta/examples/transformer).
-
-##### 2.1.2.2 Transform Guest Data
-- Configuration path: examples/transformer/transformer_guest.json
-- Modify the "namespace" and "name" in the "data_warehouse" section to match the output from the guest data upload.
-```shell
-flow data transformer -c examples/transformer/transformer_guest.json
-```
-
-##### 2.1.2.3 Transform Host Data
-- Configuration path: examples/transformer/transformer_host.json
-- Modify the "namespace" and "name" in the "data_warehouse" section to match the output from the host data upload.
-```shell
-flow data transformer -c examples/transformer/transformer_host.json
-```
-
-##### 2.1.2.4 Transformer Result
-```json
-{
-    "code": 0,
-    "data": {
-        "name": "breast_hetero_guest",
-        "namespace": "experiment"
-    },
-    "job_id": "202308211557455662860",
-    "message": "success"
-}
-```
-Where "namespace" and "name" identify the data in FATE for future modeling jobs.
-
-##### 2.1.2.5 Check if Data Upload Was Successful
-Since the transformer is also an asynchronous operation, you need to confirm if it was successful before proceeding.
-```shell
-flow table query --namespace experiment --name breast_hetero_guest
-```
-```shell
-flow table query --namespace experiment --name breast_hetero_host
-```
-If the returned code is 0, the upload was successful.
-### 2.2 Starting FATE Jobs
+### 2.2 Starting a FATE Job
 #### 2.2.1 Submitting a Job
-Once your data is prepared, you can start submitting jobs to FATE Flow:
-
-- The configuration for training jobs can be found in [lr-train](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0-beta/examples/lr/train_lr.yaml).
-- The configuration for prediction jobs can be found in [lr-predict](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0-beta/examples/lr/predict_lr.yaml). To use it, modify the "dag.conf.model_warehouse" to point to the output model of your training job.
-- In the training and prediction job configurations, the site IDs are set to "9998" and "9999." If your deployment environment is the cluster version, you need to replace them with the actual site IDs. For the standalone version, you can use the default configuration. -- If you want to use your own data, you can change the "namespace" and "name" of "data_warehouse" for both the guest and host in the configuration. -- To submit a job, use the following command: +Once your data is prepared, you can submit a job to FATE Flow: +- Job configuration examples are in [lr-train](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0/examples/lr/train_lr.yaml). +- Site IDs in the job configuration are "9998" and "9999". Replace them with real site IDs for cluster deployments; default configuration can be used for standalone deployments. +- If you want to use your data, modify the parameters in the reader within the configuration. +- Command to submit a job: ```shell flow job submit -c examples/lr/train_lr.yaml ``` -- A successful submission will return the following result: +- Successful submission returns: ```json { "code": 0, @@ -150,17 +76,18 @@ flow job submit -c examples/lr/train_lr.yaml "job_id": "202308211911505128750", "message": "success" } + ``` -The "data" section here contains the output model of the job. +The "data" here contains the output of the job, i.e., the model. #### 2.2.2 Querying a Job -While a job is running, you can check its status using the query command: +During job execution, you can query the job status using the query command: ```shell flow job query -j $job_id ``` #### 2.2.3 Stopping a Job -During job execution, you can stop the current job using the stop command: +While the job is running, you can stop it using the stop job command: ```shell flow job stop -j $job_id ``` @@ -171,16 +98,14 @@ If a job fails during execution, you can rerun it using the rerun command: flow job rerun -j $job_id ``` -### 2.3 Obtaining Job Outputs -Job outputs include data, models, and metrics. - +### 2.3 Fetching Job Output +Job output includes data, models, and metrics. #### 2.3.1 Output Metrics -To query output metrics, use the following command: +Querying output metrics command: ```shell flow output query-metric -j $job_id -r $role -p $party_id -tn $task_name ``` -For example, if you used the training DAG from above, you can use `flow output query-metric -j 202308211911505128750 -r arbiter -p 9998 -tn lr_0` to query metrics. -The query result will look like this: +For example, with the previously submitted training DAG task, you can use `flow output query-metric -j 202308211911505128750 -r arbiter -p 9998 -tn lr_0` to query. The result looks like this: ```json { "code": 0, @@ -236,62 +161,263 @@ The query result will look like this: ], "message": "success" } + ``` + #### 2.3.2 Output Models ##### 2.3.2.1 Querying Models -To query output models, use the following command: ```shell flow output query-model -j $job_id -r $role -p $party_id -tn $task_name ``` -For example, if you used the training DAG from above, you can use `flow output query-model -j 202308211911505128750 -r host -p 9998 -tn lr_0` to query models. -The query result will be similar to this: - +For instance, with the previously submitted training DAG task, you can use `flow output query-model -j 202308211911505128750 -r host -p 9998 -tn lr_0` to query. 
+The query result looks like this: ```json { "code": 0, - "data": [ - { - "model": { - "file": "202308211911505128750_host_9998_lr_0", - "namespace": "202308211911505128750_host_9998_lr_0" + "data": { + "output_model": { + "data": { + "estimator": { + "end_epoch": 10, + "is_converged": false, + "lr_scheduler": { + "lr_params": { + "start_factor": 0.7, + "total_iters": 100 + }, + "lr_scheduler": { + "_get_lr_called_within_step": false, + "_last_lr": [ + 0.07269999999999996 + ], + "_step_count": 10, + "base_lrs": [ + 0.1 + ], + "end_factor": 1.0, + "last_epoch": 9, + "start_factor": 0.7, + "total_iters": 100, + "verbose": false + }, + "method": "linear" + }, + "optimizer": { + "alpha": 0.001, + "l1_penalty": false, + "l2_penalty": true, + "method": "sgd", + "model_parameter": [ + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ], + [ + 0.0 + ] + ], + "model_parameter_dtype": "float32", + "optim_param": { + "lr": 0.1 + }, + "optimizer": { + "param_groups": [ + { + "dampening": 0, + "differentiable": false, + "foreach": null, + "initial_lr": 0.1, + "lr": 0.07269999999999996, + "maximize": false, + "momentum": 0, + "nesterov": false, + "params": [ + 0 + ], + "weight_decay": 0 + } + ], + "state": {} + } + }, + "param": { + "coef_": [ + [ + -0.10828543454408646 + ], + [ + -0.07341302931308746 + ], + [ + -0.10850320011377335 + ], + [ + -0.10066638141870499 + ], + [ + -0.04595951363444328 + ], + [ + -0.07001449167728424 + ], + [ + -0.08949052542448044 + ], + [ + -0.10958756506443024 + ], + [ + -0.04012322425842285 + ], + [ + 0.02270071767270565 + ], + [ + -0.07198350876569748 + ], + [ + 0.00548586156219244 + ], + [ + -0.06599288433790207 + ], + [ + -0.06410090625286102 + ], + [ + 0.016374297440052032 + ], + [ + -0.01607361063361168 + ], + [ + -0.011447405442595482 + ], + [ + -0.04352564364671707 + ], + [ + 0.013161249458789825 + ], + [ + 0.013506329618394375 + ] + ], + "dtype": "float32", + "intercept_": null + } + } }, - "name": "HeteroLRHost_9998_0", - "namespace": "202308211911505128750_host_9998_lr_0", - "role": "host", - "party_id": "9998", - "work_mode": 1 + "meta": { + "batch_size": null, + "epochs": 10, + "init_param": { + "fill_val": 0.0, + "fit_intercept": false, + "method": "zeros", + "random_state": null + }, + "label_count": false, + "learning_rate_param": { + "method": "linear", + "scheduler_params": { + "start_factor": 0.7, + "total_iters": 100 + } + }, + "optimizer_param": { + "alpha": 0.001, + "method": "sgd", + "optimizer_params": { + "lr": 0.1 + }, + "penalty": "l2" + }, + "ovr": false + } } - ], + }, "message": "success" } + ``` ##### 2.3.2.2 Downloading Models -To download models, use the following command: ```shell flow output download-model -j $job_id -r $role -p $party_id -tn $task_name -o $download_dir ``` -For example, if you used the training DAG from above, you can use `flow output download-model -j 202308211911505128750 -r host -p 9998 -tn lr_0 -o ./` to download the model. -The download result will be similar to this: - +For example, with the previously submitted training DAG task, you can use `flow output download-model -j 202308211911505128750 -r host -p 9998 -tn lr_0 -o ./` to download. 
+The download result is shown below:
 ```json
 {
     "code": 0,
     "directory": "./output_model_202308211911505128750_host_9998_lr_0",
     "message": "download success, please check the path: ./output_model_202308211911505128750_host_9998_lr_0"
 }
 ```
 #### 2.3.3 Output Data
-##### 2.3.3.1 Querying Data Tables
-To query output data tables, use the following command:
+##### 2.3.3.1 Query Data Table
 ```shell
 flow output query-data-table -j $job_id -r $role -p $party_id -tn $task_name
 ```
-For example, if you used the training DAG from above, you can use `flow output query-data-table -j 202308211911505128750 -r host -p 9998 -tn binning_0` to query data tables.
-The query result will be similar to this:
-
+For instance, with the previously submitted training DAG task, you can use `flow output query-data-table -j 202308211911505128750 -r host -p 9998 -tn binning_0` to query. The result looks like this:
 ```json
 {
     "train_output_data": [
         {
             "name": "9e28049c401311ee85c716b977118319",
             "namespace": "output_data_202308211911505128750_binning_0_0_host_9998"
         }
     ]
 }
 ```
 ##### 2.3.3.2 Preview Data
 ```shell
 flow output display-data -j $job_id -r $role -p $party_id -tn $task_name
 ```
-To preview output data using the above training DAG submission, you can use the following command: `flow output display-data -j 202308211911505128750 -r host -p 9998 -tn binning_0`.
+For example, with the previously submitted training DAG task, you can use `flow output display-data -j 202308211911505128750 -r host -p 9998 -tn binning_0` to preview output data.
 ##### 2.3.3.3 Download Data
 ```shell
 flow output download-data -j $job_id -r $role -p $party_id -tn $task_name -o $download_dir
 ```
-To download output data using the above training DAG submission, you can use the following command: `flow output download-data -j 202308211911505128750 -r guest -p 9999 -tn lr_0 -o ./`.
-
-The download result will be as follows:
+For example, with the previously submitted training DAG task, you can use `flow output download-data -j 202308211911505128750 -r guest -p 9999 -tn lr_0 -o ./` to download output data. The download result is as follows:
 ```json
 {
     "code": 0,
     "directory": "./output_data_202308211911505128750_guest_9999_lr_0",
     "message": "download success, please check the path: ./output_data_202308211911505128750_guest_9999_lr_0"
 }
 ```
 ## 3. More Documentation
-- [Restful-api](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0-beta/doc/swagger/swagger.yaml)
-- [CLI](https://github.com/FederatedAI/FATE-Client/tree/v2.0.0-beta/python/fate_client/flow_cli/build/doc)
-- [Pipeline](https://github.com/FederatedAI/FATE/tree/v2.0.0-beta/doc/tutorial)
-- [FATE Quick Start](https://github.com/FederatedAI/FATE/tree/v2.0.0-beta/doc/2.0/quick_start.md)
-- [FATE Algorithms](https://github.com/FederatedAI/FATE/tree/v2.0.0-beta/doc/2.0/fate)
\ No newline at end of file
+- [Restful-api](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0/doc/swagger/swagger.yaml)
+- [CLI](https://github.com/FederatedAI/FATE-Client/tree/v2.0.0/python/fate_client/flow_cli/build/doc)
+- [Pipeline](https://github.com/FederatedAI/FATE/tree/v2.0.0/doc/tutorial)
+- [FATE Quick Start](https://github.com/FederatedAI/FATE/tree/v2.0.0/doc/2.0/fate/quick_start.md)
+- [FATE Algorithms](https://github.com/FederatedAI/FATE/tree/v2.0.0/doc/2.0/fate)
\ No newline at end of file
diff --git a/doc/quick_start.zh.md b/doc/quick_start.zh.md
index 26bf5a8d9..f97d9fee9 100644
--- a/doc/quick_start.zh.md
+++ b/doc/quick_start.zh.md
@@ -31,10 +31,10 @@ fate_flow status/start/stop/restart
 ```

 ### 1.2 单机版部署
-参考[单机版部署](https://github.com/FederatedAI/FATE/tree/dev-2.0.0-rc/deploy/standalone-deploy/README.zh.md)
+参考[单机版部署](https://github.com/FederatedAI/FATE/tree/v2.0.0/deploy/standalone-deploy/README.zh.md)

 ### 1.3 集群部署
-参考[allinone部署](https://github.com/FederatedAI/FATE/tree/dev-2.0.0-rc/deploy/cluster-deploy/allinone/fate-allinone_deployment_guide.zh.md)
+参考[allinone部署](https://github.com/FederatedAI/FATE/tree/v2.0.0/deploy/cluster-deploy/allinone/fate-allinone_deployment_guide.zh.md)

 ## 2. 使用指南
 fate提供的客户端包括SDK、CLI和Pipeline，若你的环境中没有部署FATE Client，可以使用`pip install fate_client`下载，以下的使用操作均基于cli编写。
@@ -42,8 +42,8 @@ fate提供的客户端包括SDK、CLI和Pipeline，若你的环境中没有部
 ### 2.1 数据上传
 更详细的数据操作指南可参考：[数据接入指南](data_access.zh.md)
 ### 2.1.1 配置及数据
- - 上传配置: [examples-upload](https://github.com/FederatedAI/FATE-Flow/tree/dev-2.0.0-rc/examples/upload)
- - 上传数据: [upload-data](https://github.com/FederatedAI/FATE-Flow/tree/dev-2.0.0-rc/examples/data)
+ - 上传配置: [examples-upload](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0/examples/upload)
+ - 上传数据: [upload-data](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0/examples/data)
 ### 2.1.2 上传guest方数据
 ```shell
 flow data upload -c examples/upload/upload_guest.json
@@ -56,7 +56,7 @@ flow data upload -c examples/upload/upload_host.json
 ### 2.2 开始FATE作业
 #### 2.2.1 提交作业
 当你的数据准备好后，可以开始提交作业给FATE Flow:
-- job配置example位于[lr-train](https://github.com/FederatedAI/FATE-Flow/tree/dev-2.0.0-rc/examples/lr/train_lr.yaml);
+- job配置example位于[lr-train](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0/examples/lr/train_lr.yaml);
 - job配置中站点id为"9998"和"9999"。如果你的部署环境为集群版，需要替换成真实的站点id；单机版可使用默认配置。
 - 如果想要使用自己的数据，可以更改配置中reader的参数。
 - 提交作业的命令为:
@@ -452,8 +452,8 @@ flow output download-data -j $job_id -r $role -p $party_id -tn $task_name -o $do
 ```

 ## 3.更多文档
-- [Restful-api](https://github.com/FederatedAI/FATE-Flow/tree/dev-2.0.0-rc/doc/swagger/swagger.yaml)
-- [CLI](https://github.com/FederatedAI/FATE-Client/tree/dev-2.0.0-rc/python/fate_client/flow_cli/build/doc)
-- [Pipeline](https://github.com/FederatedAI/FATE/tree/dev-2.0.0-rc/doc/tutorial)
-- [FATE快速开始](https://github.com/FederatedAI/FATE/tree/dev-2.0.0-rc/doc/2.0/quick_start.md)
-- [FATE算法](https://github.com/FederatedAI/FATE/tree/dev-2.0.0-rc/doc/2.0/fate)
+- [Restful-api](https://github.com/FederatedAI/FATE-Flow/tree/v2.0.0/doc/swagger/swagger.yaml)
+- [CLI](https://github.com/FederatedAI/FATE-Client/tree/v2.0.0/python/fate_client/flow_cli/build/doc)
+- [Pipeline](https://github.com/FederatedAI/FATE/tree/v2.0.0/doc/tutorial)
+- [FATE快速开始](https://github.com/FederatedAI/FATE/tree/v2.0.0/doc/2.0/fate/quick_start.md)
+- [FATE算法](https://github.com/FederatedAI/FATE/tree/v2.0.0/doc/2.0/fate)
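A note on scripted use of the commands documented in the patch above: `flow job submit` returns as soon as the job is accepted and the job then runs asynchronously, so scripts typically pair it with `flow job query` polling. Below is a minimal sketch of that pattern. The `"job_id"` field matches the submission result shown in the docs, but the shape of the query response parsed here (a `"data"` list of per-party entries with a `"status"` field) and the terminal status names are assumptions, not taken from the document.

```python
import json
import subprocess
import time


def run_flow(*args: str) -> dict:
    """Run a `flow` CLI command and parse its JSON output."""
    proc = subprocess.run(["flow", *args], capture_output=True, text=True, check=True)
    return json.loads(proc.stdout)


def submit_and_wait(conf_path: str, interval: float = 5.0) -> dict:
    # Documented above: `flow job submit -c <conf>` returns {"code": 0, "job_id": ...}.
    submitted = run_flow("job", "submit", "-c", conf_path)
    job_id = submitted["job_id"]
    while True:
        # Documented above: `flow job query -j <job_id>` reports job status.
        # The per-entry "status" field and the terminal status names below are
        # assumptions about the response shape, kept here only for illustration.
        response = run_flow("job", "query", "-j", job_id)
        statuses = {entry.get("status") for entry in response.get("data", [])}
        if statuses and statuses.issubset({"success", "failed", "canceled"}):
            return response
        time.sleep(interval)


# Example: submit_and_wait("examples/lr/train_lr.yaml")
```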
From aa3dbb0e20733be58f7130862c3370cdeb652175 Mon Sep 17 00:00:00 2001
From: zhihuiwan <15779896112@163.com>
Date: Fri, 29 Dec 2023 18:09:45 +0800
Subject: [PATCH 2/2] fix bug

Signed-off-by: zhihuiwan <15779896112@163.com>
---
 python/fate_flow/apps/__init__.py | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/python/fate_flow/apps/__init__.py b/python/fate_flow/apps/__init__.py
index d059f1044..7b5528c55 100644
--- a/python/fate_flow/apps/__init__.py
+++ b/python/fate_flow/apps/__init__.py
@@ -60,7 +60,8 @@ def get_app_module(page_path):

 def register_page(page_path, func=None, prefix=API_VERSION):
     page_name = page_path.stem.rstrip('app').rstrip("_")
-    module_name = '.'.join(page_path.parts[page_path.parts.index('fate_flow')+2:-1] + (page_name, ))
+    fate_flow_index = len(page_path.parts) - 1 - page_path.parts[::-1].index("fate_flow")
+    module_name = '.'.join(page_path.parts[fate_flow_index:-1] + (page_name, ))
     spec = spec_from_file_location(module_name, page_path)
     page = module_from_spec(spec)
     page.app = app
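For context on the `register_page` change in PATCH 2/2: the module name for a page file is derived from the components of its path, anchored on a `fate_flow` path component. The old expression `page_path.parts.index('fate_flow') + 2` anchors on the first occurrence, which only lands on the package directory when the checkout itself lives under a directory named `fate_flow` (e.g. `/data/projects/fate_flow/python/fate_flow/...`); for other layouts, such as a pip-installed package, it produces a wrong module name. The fix anchors on the last occurrence. A standalone sketch of both expressions follows; the paths are hypothetical, chosen only to show the difference.

```python
from pathlib import PurePosixPath

# Hypothetical pip-installed layout: "fate_flow" appears once, as the package dir.
page_path = PurePosixPath("/usr/lib/python3.8/site-packages/fate_flow/apps/client/client_app.py")
parts = page_path.parts

# Same page-name derivation as register_page: "client_app" -> "client".
page_name = page_path.stem.rstrip("app").rstrip("_")

# Old expression: first occurrence of "fate_flow", plus two. Correct only for
# .../fate_flow/python/fate_flow/... style checkouts.
old_name = ".".join(parts[parts.index("fate_flow") + 2:-1] + (page_name,))

# Fixed expression: index of the last "fate_flow" component, i.e. the package dir.
fate_flow_index = len(parts) - 1 - parts[::-1].index("fate_flow")
new_name = ".".join(parts[fate_flow_index:-1] + (page_name,))

print(old_name)  # client.client                 (not rooted at the package)
print(new_name)  # fate_flow.apps.client.client  (importable module path)
```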