[Feature](mluOpExecFFT): add fft op #934

Merged
merged 14 commits into from
Feb 2, 2024
648 changes: 648 additions & 0 deletions docs/design_docs/fft/fft.md

Large diffs are not rendered by default.

19 changes: 19 additions & 0 deletions docs/user_guide/9_operators/index.rst
@@ -729,3 +729,22 @@ mluOpSyncBatchnormBackwardReduce
mluOpSyncBatchNormBackwardElemt
---------------------------------
This operator computes the gradient of the input and, together with :ref:`sync_batchnorm_backward_reduce`, implements sync_batchnorm_backward.

.. _execFFT:

mluOpExecFFT
------------
Performs a Fourier transform on a real-valued sequence of length N.

The computation is:

.. math::

   y = DFT_{N} x

where:

- ``x`` is the input signal.
- ``y`` is the output signal.
- :math:`DFT_{N}` is the transform matrix of the length-N Fourier transform.
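
As a sketch of what the transform matrix contains, assuming the common forward-DFT convention (this section does not state the sign or normalization that the operator uses), the entries of :math:`DFT_{N}` and the resulting output elements can be written as:

.. math::

   (DFT_{N})_{k,n} = e^{-2\pi i \, kn / N}, \qquad
   y_k = \sum_{n=0}^{N-1} x_n \, e^{-2\pi i \, kn / N}, \quad k = 0, 1, \dots, N-1

For example, with N = 2 the matrix is [[1, 1], [1, -1]], so the input x = (1, 2) gives y = (3, -1).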

52 changes: 52 additions & 0 deletions kernels/fft/README.md
@@ -0,0 +1,52 @@
# Introduction

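This directory gathers all code related to the FFT operator. The code supports four FFT modes: rfft (r2c), irfft (c2r), fft (c2c), and ifft (c2c). For performance reasons, the implementation selects a different algorithm depending on the problem size. There are three: DFT, FFT_cooley-tukey, and FFT_stockham. The code is organized as follows: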

## Code Directory and Description

1. The directory tree is as follows: <br>
├── c2c_fft <br>
│   ├── c2c_fft.h <br>
│   └── c2c_fft_host.cpp <br>
├── common <br>
│   ├── fft_basic_ops.cpp <br>
│   ├── fft_basic_ops.h <br>
│   ├── fft_common_kernels.h <br>
│   └── fft_common_kernels.mlu <br>
├── fft.h <br>
├── fft.cpp <br>
├── fft_optm_device <br>
│   ├── fft_cooley-tukey_ux_device.mlu <br>
│   └── fft_stockham_u1_device.mlu <br>
├── irfft <br>
│   ├── irfft.h <br>
│   └── irfft_host.cpp <br>
└── rfft <br>
    ├── rfft.h <br>
    └── rfft_host.cpp <br>

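2. fft.h and fft.cpp:
* fft.h: defines basic structures, such as the modes, strategies, and addresses, and declares the global functions;
* fft.cpp: defines the public interfaces called by users, such as strategy initialization, workspace initialization, host function selection, and basic parameter sanity checks. Every mode enters this file first and then, depending on the checks, calls the host code for that mode;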

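3. common folder:
* fft_basic_ops.h: an FFT call also uses other interfaces, such as transpose, quantization, and matrix multiplication. The declarations of the wrappers around those calls live in this file, along with some shared basic helpers such as the findLimit function;
* fft_basic_ops.cpp: implements the interfaces declared in fft_basic_ops.h;
* fft_common_kernels.h: generating the W matrix is usually time-consuming, and during network training it is generated only once, in the first iteration. This file declares the interface functions that pre-generate the W matrix;
* fft_common_kernels.mlu: implements the interfaces declared in fft_common_kernels.h;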

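4. rfft folder:
* rfft.h: declares the rfft strategy function, the workspace allocation function, the execution function, and so on;
* rfft_host.cpp: implements the rfft host functions. Based on the result of the strategy function, it selects the DFT, FFT_cooley-tukey, or FFT_stockham algorithm;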

5. irfft folder:
* Same structure as the rfft folder, but with the declarations and implementations for irfft;

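6. c2c_fft folder:
* Same structure as the rfft folder, but with the declarations and implementations for fft and ifft. The two share one folder because they differ only in the sign of the exponent and a constant scale factor.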

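7. fft_optm_device folder:
* fft_cooley-tukey_ux_device.mlu: optimized kernel device code, based on the Cooley-Tukey algorithm;
* fft_stockham_u1_device.mlu: optimized kernel device code, based on the Stockham algorithm;
* Note: DFT is implemented by calling kernels such as cnnlTranspose and cnnlMatmul through the wrappers in fft_basic_ops.cpp, so it has no dedicated kernel device code.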
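
The note above says the DFT path is built from transpose and matrix-multiply calls against a pre-generated W matrix. The following is a minimal CPU-only sketch of that idea in plain C++, not the MLU implementation: `makeDftMatrix` and `dftReal` are hypothetical names introduced here, and the forward sign convention exp(-2πi·kj/N) is an assumption. It builds W once and then applies it as a matrix-vector product, so repeated transforms of the same length can reuse the matrix.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<float>;

// Hypothetical helper (not part of mlu-ops): builds the n x n DFT matrix W
// in row-major order, with W[k * n + j] = exp(-2*pi*i * k * j / n).
// The sign convention is an assumption of this sketch.
std::vector<Complex> makeDftMatrix(std::size_t n) {
  const double pi = std::acos(-1.0);
  std::vector<Complex> w(n * n);
  for (std::size_t k = 0; k < n; ++k) {
    for (std::size_t j = 0; j < n; ++j) {
      // Reduce k * j modulo n first to keep the angle small and accurate.
      const double angle =
          -2.0 * pi * static_cast<double>((k * j) % n) / static_cast<double>(n);
      w[k * n + j] = Complex(static_cast<float>(std::cos(angle)),
                             static_cast<float>(std::sin(angle)));
    }
  }
  return w;
}

// Hypothetical helper: computes y = W * x for a real input x as a plain
// matrix-vector product. w must come from makeDftMatrix(x.size()).
std::vector<Complex> dftReal(const std::vector<Complex> &w,
                             const std::vector<float> &x) {
  const std::size_t n = x.size();
  std::vector<Complex> y(n);
  for (std::size_t k = 0; k < n; ++k) {
    Complex acc(0.0f, 0.0f);
    for (std::size_t j = 0; j < n; ++j) {
      acc += w[k * n + j] * x[j];
    }
    y[k] = acc;
  }
  return y;
}
```

For the length-4 input (1, 2, 3, 4), `dftReal(makeDftMatrix(4), x)` yields (10, -2+2i, -2, -2-2i). For real input the second half of the output is the complex conjugate of the first half, which is why an r2c transform typically stores only the first N/2+1 values.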

38 changes: 38 additions & 0 deletions kernels/fft/c2c_fft/c2c_fft.h
@@ -0,0 +1,38 @@
/*************************************************************************
* Copyright (C) [2024] by Cambricon, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a
* copy of this software and associated documentation files (the
* "Software"), to deal in the Software without restriction, including
* without limitation the rights to use, copy, modify, merge, publish,
* distribute, sublicense, and/or sell copies of the Software, and to
* permit persons to whom the Software is furnished to do so, subject to
* the following conditions:
*
* The above copyright notice and this permission notice shall be included
* in all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS
* OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
* MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT.
* IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY
* CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT,
* TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE
* SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.
*************************************************************************/
#ifndef KERNELS_FFT_C2C_FFT_C2C_FFT_H_
#define KERNELS_FFT_C2C_FFT_C2C_FFT_H_

#include <string>
#include "kernels/fft/fft.h"

mluOpStatus_t makeFFT1dPolicy(mluOpHandle_t handle, mluOpFFTPlan_t fft_plan);

mluOpStatus_t setFFT1dReserveArea(mluOpHandle_t handle, mluOpFFTPlan_t fft_plan,
                                  const std::string api);

mluOpStatus_t execFFT1d(mluOpHandle_t handle, const mluOpFFTPlan_t fft_plan,
                        const void *input, const float scale_factor,
                        void *workspace, void *output, int direction);

#endif // KERNELS_FFT_C2C_FFT_C2C_FFT_H_
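
The `direction` and `scale_factor` parameters of `execFFT1d` suggest that the forward and inverse c2c transforms share one code path. A minimal sketch of that idea in plain C++ follows, assuming the usual convention that the two directions differ only in the sign of the exponent and in the scale applied to the output. `naiveC2c`, `kForward`, and `kBackward` are hypothetical names for illustration; the actual direction values and kernels are not shown in this header.

```cpp
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using Complex = std::complex<float>;

// Hypothetical direction constants for this sketch only; the values that
// execFFT1d actually accepts are not visible in this header.
constexpr int kForward = 0;
constexpr int kBackward = 1;

// Naive O(N^2) complex-to-complex transform. The direction picks the sign of
// the exponent and scale_factor multiplies every output element, so a single
// routine covers both fft and ifft.
std::vector<Complex> naiveC2c(const std::vector<Complex> &input,
                              float scale_factor, int direction) {
  const std::size_t n = input.size();
  const double pi = std::acos(-1.0);
  const double sign = (direction == kForward) ? -1.0 : 1.0;
  std::vector<Complex> output(n);
  for (std::size_t k = 0; k < n; ++k) {
    // Accumulate in double precision to limit rounding error.
    std::complex<double> acc(0.0, 0.0);
    for (std::size_t j = 0; j < n; ++j) {
      const double angle = sign * 2.0 * pi *
                           static_cast<double>((k * j) % n) /
                           static_cast<double>(n);
      acc += std::complex<double>(input[j]) * std::polar(1.0, angle);
    }
    output[k] = Complex(acc * static_cast<double>(scale_factor));
  }
  return output;
}
```

Calling `naiveC2c(x, 1.0f, kForward)` and then `naiveC2c(y, 1.0f / N, kBackward)` on the result recovers `x` up to rounding error.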