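Qualcomm® AI Engine Direct User Guide (26)
- 8.2 Advanced
- 8.2.1 QNN HTP Shared Buffer Tutorial
- 8.2.2 Executing with a DLC
8.2 Advanced
8.2.1 QNN HTP Shared Buffer Tutorial
Introduction
This tutorial describes how to use data buffers for shared access between processing domains in the QNN HTP backend. Using shared buffers eliminates data copies between client code on the host CPU and the HTP accelerator.
The HTP backend supports two types of shared memory.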
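Qnn_MemDescriptor_t type | QnnMemHtp_Descriptor_t type | Description
QNN_MEM_TYPE_ION | Not applicable | 1. Each tensor is mapped to its own shared buffer. 2. One-to-one relationship between file descriptor and memory handle.
QNN_MEM_TYPE_CUSTOM | QNN_HTP_MEM_SHARED_BUFFER | 1. Multiple tensors are mapped to one shared buffer. 2. One-to-many relationship between file descriptor and memory handles.
Note
This tutorial focuses only on the use of shared buffers. The SDK sample code has some prerequisites that are not discussed in detail here. Refer to the corresponding sections of the QNN documentation, or to the SampleApp.
SampleApp documentation: Sample App Tutorial
Sample app code: ${QNN_SDK_ROOT}/examples/QNN/SampleApp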
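Loading the prerequisite shared library
Hardware devices with Qualcomm chipsets include a shared library that provides the functions for shared buffer operations.
Loading the shared library
The libcdsprpc.so shared library is available on most mainstream devices with Qualcomm chipsets (SD888 and later).
It can be loaded dynamically as follows: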
#include <dlfcn.h>

void* libCdspHandle = dlopen("libcdsprpc.so", RTLD_NOW | RTLD_LOCAL);

if (nullptr == libCdspHandle) {
  // handle errors
}
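Resolving symbols
Once the shared library has loaded successfully, all required symbols can be resolved.
The following code snippet shows a template for resolving symbols in the shared library: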
/**
 * Definition: void* rpcmem_alloc(int heapid, uint32 flags, int size);
 * Allocate a buffer via ION and register it with the FastRPC framework.
 * @param[in] heapid Heap ID to use for memory allocation.
 * @param[in] flags ION flags to use for memory allocation.
 * @param[in] size Buffer size to allocate.
 * @return Pointer to the buffer on success; NULL on failure.
 */
typedef void *(*RpcMemAllocFn_t)(int, uint32_t, int);

/**
 * Definition: void rpcmem_free(void* po);
 * Free a buffer and ignore invalid buffers.
 */
typedef void (*RpcMemFreeFn_t)(void *);

/**
 * Definition: int rpcmem_to_fd(void* po);
 * Return an associated file descriptor.
 * @param[in] po Data pointer for an RPCMEM-allocated buffer.
 * @return Buffer file descriptor.
 */
typedef int (*RpcMemToFdFn_t)(void *);

RpcMemAllocFn_t rpcmem_alloc = (RpcMemAllocFn_t)dlsym(libCdspHandle, "rpcmem_alloc");
RpcMemFreeFn_t rpcmem_free = (RpcMemFreeFn_t)dlsym(libCdspHandle, "rpcmem_free");
RpcMemToFdFn_t rpcmem_to_fd = (RpcMemToFdFn_t)dlsym(libCdspHandle, "rpcmem_to_fd");
if (nullptr == rpcmem_alloc || nullptr == rpcmem_free || nullptr == rpcmem_to_fd) {
  dlclose(libCdspHandle);
  // handle errors
}
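The lookup-and-cast pattern used for each symbol can be factored into a small helper. The following is a minimal sketch, not part of the SDK: resolveSymbol is a hypothetical name, and it only wraps the standard dlsym and dlerror calls so that every lookup is cast and error-checked in one place.

```cpp
#include <dlfcn.h>
#include <cstddef>
#include <cstdio>

// Hypothetical helper (not an SDK function): resolve `name` from an already
// opened library handle and cast it to the requested function-pointer type.
// Returns nullptr and logs the dlerror() message when the lookup fails.
template <typename Fn>
Fn resolveSymbol(void* handle, const char* name) {
  dlerror();  // clear any stale error state before the lookup
  void* sym = dlsym(handle, name);
  if (sym == nullptr) {
    const char* err = dlerror();
    std::fprintf(stderr, "dlsym(%s) failed: %s\n", name, err ? err : "symbol is null");
    return nullptr;
  }
  return reinterpret_cast<Fn>(sym);
}
```

With such a helper, each lookup becomes a single call, for example resolveSymbol<RpcMemAllocFn_t>(libCdspHandle, "rpcmem_alloc"), and a null result can be handled exactly as in the template.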
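Using QNN_MEM_TYPE_ION with the QNN API
With ION shared buffers, each tensor has its own shared buffer, with its own unique memory pointer, file descriptor, and memory handle.
An example is shown below:
HTP shared buffer example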
// QnnInterface_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h
QnnInterface_t qnnInterface;
// Init qnn interface ......
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

// Qnn_Tensor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnTypes.h
Qnn_Tensor_t inputTensor;
// Set up common setting for inputTensor ......
/* There are 2 specific settings for shared buffer
 * (both are applied just before memRegister() below):
 * 1. memType should be QNN_TENSORMEMTYPE_MEMHANDLE;
 * 2. union member memHandle should be used instead of clientBuf, and it
 *    should be set to nullptr.
 */

size_t bufSize;
// Calculate the bufSize based on tensor dimensions and data type ......

#define RPCMEM_HEAP_ID_SYSTEM 25
#define RPCMEM_DEFAULT_FLAGS 1

// Allocate the shared buffer
uint8_t* memPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, bufSize);
if (nullptr == memPointer) {
  // handle errors
}

// Get a file descriptor for the buffer
int memFd = rpcmem_to_fd(memPointer);
if (-1 == memFd) {
  // handle errors
}

// Fill the info of Qnn_MemDescriptor_t and register the buffer to QNN
// Qnn_MemDescriptor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnMem.h
Qnn_MemDescriptor_t memDescriptor = QNN_MEM_DESCRIPTOR_INIT;
memDescriptor.memShape = {inputTensor.rank, inputTensor.dimensions, nullptr};
memDescriptor.dataType = inputTensor.dataType;
memDescriptor.memType = QNN_MEM_TYPE_ION;
memDescriptor.ionInfo.fd = memFd;
inputTensor.memType = QNN_TENSORMEMTYPE_MEMHANDLE;
inputTensor.memHandle = nullptr;
Qnn_ContextHandle_t context; // Must obtain a QNN context handle before memRegister()
// To obtain QNN context handle:
// For online prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#create-context
// For offline prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#load-context-from-a-cached-binary
Qnn_ErrorHandle_t registRet = qnnInterface->memRegister(context, &memDescriptor, 1u, &(inputTensor.memHandle));
if (QNN_SUCCESS != registRet) {
  rpcmem_free(memPointer);
  // handle errors
}

/**
 * At this point, the allocation and registration of the shared buffer is complete.
 * On the QNN side, the buffer is bound by memFd.
 * On the user side, the buffer can be manipulated through memPointer.
 */

/**
 * Optionally, the user can also allocate and register a shared buffer for the
 * output tensor, following the same steps as above. If so, the output buffer
 * must also be deregistered and freed as shown at the end of this example.
 */

// Load the input data to memPointer ......

// Execute QNN graph with input tensor and output tensor ......

// Get output data ......

// Deregister and free all buffers once they are no longer in use
Qnn_ErrorHandle_t deregisterRet = qnnInterface->memDeRegister(&(inputTensor.memHandle), 1);
if (QNN_SUCCESS != deregisterRet) {
  // handle errors
}
rpcmem_free(memPointer);
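The example leaves the bufSize calculation open. One way to sketch it, assuming a dense tensor layout and a caller-supplied element size, is to multiply the dimensions together with an overflow check. The helper name tensorByteSize is hypothetical, and mapping a tensor data type to its byte width is left to the caller.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not an SDK function): number of bytes needed for a
// dense tensor with the given dimensions and element size.
// Returns 0 for empty input, a zero-sized dimension, or arithmetic overflow.
size_t tensorByteSize(const uint32_t* dimensions, uint32_t rank, size_t bytesPerElement) {
  if (dimensions == nullptr || rank == 0 || bytesPerElement == 0) {
    return 0;
  }
  size_t total = bytesPerElement;
  for (uint32_t i = 0; i < rank; ++i) {
    if (dimensions[i] == 0) {
      return 0;
    }
    if (total > SIZE_MAX / dimensions[i]) {
      return 0;  // the product would overflow size_t
    }
    total *= dimensions[i];
  }
  return total;
}
```

For a 1x299x299x3 input, this gives 1,072,812 bytes with 4-byte elements and 268,203 bytes with 1-byte elements.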
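Using QNN_HTP_MEM_SHARED_BUFFER with the QNN API
With a multi-tensor shared buffer, a group of tensors is mapped to a single shared buffer. This single shared buffer has one memory pointer and one file descriptor, but each tensor has its own offset from the memory pointer and its own memory handle.
An example is shown below:
HTP multi-tensor shared buffer example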
// QnnInterface_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnInterface.h
QnnInterface_t qnnInterface;
// Init qnn interface ......
// See ${QNN_SDK_ROOT}/examples/QNN/SampleApp code

// Total number of input tensors
size_t numTensors;

// Qnn_Tensor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnTypes.h
Qnn_Tensor_t inputTensors[numTensors];
// Set up common setting for inputTensors ......
/* There are 2 specific settings for shared buffer
 * (both are applied inside the registration loop below):
 * 1. memType should be QNN_TENSORMEMTYPE_MEMHANDLE;
 * 2. union member memHandle should be used instead of clientBuf, and it
 *    should be set to nullptr.
 */

// Calculate the shared buffer size
uint64_t totalBufferSize = 0;
for (size_t tensorIdx = 0; tensorIdx < numTensors; tensorIdx++) {
  // Calculate the tensorSize based on tensor dimensions and data type
  totalBufferSize += tensorSize;
}

#define RPCMEM_HEAP_ID_SYSTEM 25
#define RPCMEM_DEFAULT_FLAGS 1

// Allocate the shared buffer
uint8_t* memPointer = (uint8_t*)rpcmem_alloc(RPCMEM_HEAP_ID_SYSTEM, RPCMEM_DEFAULT_FLAGS, totalBufferSize);
if (nullptr == memPointer) {
  // handle errors
}

// Get a file descriptor for the buffer
int memFd = rpcmem_to_fd(memPointer);
if (-1 == memFd) {
  // handle errors
}

Qnn_ContextHandle_t context; // Must obtain a QNN context handle before memRegister()
// To obtain QNN context handle:
// For online prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#create-context
// For offline prepare, refer to ${QNN_SDK_ROOT}/docs/general/sample_app.html#load-context-from-a-cached-binary

// Register the memory handles using memory descriptors
// This is the offset of the tensor location in the shared buffer
uint64_t offset = 0;
for (size_t tensorIdx = 0; tensorIdx < numTensors; tensorIdx++) {
  // Fill the info of Qnn_MemDescriptor_t and register the descriptor to QNN
  // Qnn_MemDescriptor_t is defined in ${QNN_SDK_ROOT}/include/QNN/QnnMem.h
  Qnn_MemDescriptor_t memDescriptor = QNN_MEM_DESCRIPTOR_INIT;
  memDescriptor.memShape = {inputTensors[tensorIdx].rank, inputTensors[tensorIdx].dimensions, nullptr};
  memDescriptor.dataType = inputTensors[tensorIdx].dataType;
  memDescriptor.memType = QNN_MEM_TYPE_CUSTOM;
  inputTensors[tensorIdx].memType = QNN_TENSORMEMTYPE_MEMHANDLE;
  inputTensors[tensorIdx].memHandle = nullptr;

  // Fill the info of QnnMemHtp_Descriptor_t and set as custom info
  // QnnMemHtp_Descriptor_t is defined in ${QNN_SDK_ROOT}/include/QNN/HTP/QnnHtpMem.h
  QnnMemHtp_Descriptor_t htpMemDescriptor;
  htpMemDescriptor.type = QNN_HTP_MEM_SHARED_BUFFER;
  htpMemDescriptor.size = totalBufferSize; // Note: this is the total buffer size

  QnnHtpMem_SharedBufferConfig_t htpSharedBuffConfig = {memFd, offset};
  htpMemDescriptor.sharedBufferConfig = htpSharedBuffConfig;

  memDescriptor.customInfo = &htpMemDescriptor;

  Qnn_ErrorHandle_t registRet = qnnInterface->memRegister(context, &memDescriptor, 1u, &(inputTensors[tensorIdx].memHandle));
  if (QNN_SUCCESS != registRet) {
    // Deregister already created memory handles
    rpcmem_free(memPointer);
    // handle errors
  }

  // Move offset by the size of this tensor (tensorSize calculated as in the first loop)
  offset = offset + tensorSize;
}

/**
 * At this point, the allocation and registration of the shared buffer is complete.
 * On the QNN side, the buffer is bound by memFd.
 * On the user side, the buffer can be manipulated through memPointer and offset.
 */

/**
 * Optionally, the user can also allocate and register a shared buffer for the
 * output tensors, following the same steps as above. If so, the output buffer
 * must also be deregistered and freed as shown at the end of this example.
 */

// Load the input data to memPointer with respective offsets ......

// Execute QNN graph with input tensors and output tensors ......

// Get output data from the memPointer and offset combination ......

// Deregister all memory handles and free the buffer once it is no longer in use
for (size_t tensorIdx = 0; tensorIdx < numTensors; tensorIdx++) {
  Qnn_ErrorHandle_t deregisterRet = qnnInterface->memDeRegister(&(inputTensors[tensorIdx].memHandle), 1);
  if (QNN_SUCCESS != deregisterRet) {
    // handle errors
  }
}
rpcmem_free(memPointer);
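The two loops in the example (one summing sizes, one advancing the offset) can be computed in a single pass ahead of allocation. The sketch below is an illustration rather than SDK code: SharedBufferLayout and planSharedBuffer are hypothetical names, and the optional alignment parameter is an assumption added here. With the default alignment of 1, the result matches the tight packing used in the example.

```cpp
#include <cstdint>
#include <vector>

// Hypothetical result type: where each tensor starts, and how large the
// single shared buffer must be.
struct SharedBufferLayout {
  std::vector<uint64_t> offsets;  // byte offset of each tensor in the buffer
  uint64_t totalSize = 0;         // size to request from the allocator
};

// Packs tensors back to back in the order given. Each offset is rounded up
// to a multiple of `alignment` (assumption: 1 means no padding at all).
SharedBufferLayout planSharedBuffer(const std::vector<uint64_t>& tensorSizes,
                                    uint64_t alignment = 1) {
  SharedBufferLayout layout;
  if (alignment == 0) {
    alignment = 1;  // treat 0 as "no alignment requirement"
  }
  uint64_t cursor = 0;
  for (uint64_t size : tensorSizes) {
    cursor = (cursor + alignment - 1) / alignment * alignment;
    layout.offsets.push_back(cursor);
    cursor += size;
  }
  layout.totalSize = cursor;
  return layout;
}
```

The totalSize value would then be passed to rpcmem_alloc, and offsets[i] used as the offset in the shared buffer configuration for tensor i.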
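8.2.2 Executing with a DLC
Tutorial setup
This tutorial assumes that the general setup instructions for QNN and SNPE have been followed. In particular, converting to a DLC with the tools requires PYTHONPATH and SNPE_ROOT to be set appropriately.
In addition, this tutorial requires the Inception V3 TensorFlow model file and sample images. These are obtained by the provided setup script, setup_inceptionv3.py. The script is located at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py
Usage is as follows: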
usage: setup_inceptionv3.py [-h] -a ASSETS_DIR [-d] [-c] [-q]

Prepares the inception_v3 assets for tutorial examples.

required arguments:
  -a ASSETS_DIR, --assets_dir ASSETS_DIR
                        directory containing the inception_v3 assets

optional arguments:
  -d, --download        Download inception_v3 assets to inception_v3 example directory
  -c, --convert_model   Convert and compile model once acquired.
  -q, --quantize_model  Quantize the model during conversion. Only available if --c or --convert_model option is chosen
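Before using the script, set the TENSORFLOW_HOME environment variable to point to the location where the TensorFlow package is installed. The script uses TensorFlow utilities such as optimize_for_inference.py, which are found in the TensorFlow installation directory.
- Find the location of the TensorFlow package:
$ python3 -m pip show tensorflow
- Set the TENSORFLOW_HOME environment variable using the installation location of the TensorFlow package (the Location field in the output of step 1):
$ export TENSORFLOW_HOME=<tensorflow-location>/tensorflow_core
- Install the Inception V3 TensorFlow model and sample images using the setup_inceptionv3.py script:
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/setup_inceptionv3.py -a ~/tmpdir -d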
The model file should now be populated at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb
The raw images should now be populated at:
${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped
Model conversion
Once the model assets have been acquired, the model can be converted to a DLC using the conversion tools in the Qualcomm® Neural Processing SDK.
Note
The HTP and DSP backends require a quantized model. See Model quantization to generate a quantized DLC.
Convert the Inception V3 model using the snpe-tensorflow-to-dlc tool.
$ ${SNPE_ROOT}/bin/x86_64-linux-clang/snpe-tensorflow-to-dlc \
    --input_network ${QNN_SDK_ROOT}/examples/Models/InceptionV3/tensorflow/inception_v3_2016_08_28_frozen.pb \
    --input_dim input 1,299,299,3 \
    --out_node InceptionV3/Predictions/Reshape_1 \
    --output_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc
This produces the DLC file ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc.
The DLC contains the serialized model, the network topology, and the associated model data.
Model quantization
A DLC can be quantized using the snpe-dlc-quantize tool. An example of its usage is shown below:
$ ${SNPE_ROOT}/bin/x86_64-linux-clang/snpe-dlc-quantize \
    --input_dlc ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc \
    --input_list ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped/raw_list.txt \
    --output_dlc ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc
This produces the following artifact:
- ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc
Note
When quantizing a model, the input list must contain absolute paths to the input data.
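If an existing input list uses relative paths, its entries can be rewritten before quantization. The following is a minimal sketch of that step using the C++17 std::filesystem library. The helper name toAbsoluteInputList is hypothetical, and reading and writing the list file itself (assumed here to hold one path per entry) is left to the caller.

```cpp
#include <filesystem>
#include <string>
#include <vector>

// Hypothetical helper (not part of the SDK tools): resolve every relative
// entry against baseDir and normalize the result. Entries that are already
// absolute are only normalized.
std::vector<std::string> toAbsoluteInputList(const std::vector<std::string>& entries,
                                             const std::filesystem::path& baseDir) {
  std::vector<std::string> result;
  result.reserve(entries.size());
  for (const std::string& entry : entries) {
    std::filesystem::path p(entry);
    if (p.is_relative()) {
      p = baseDir / p;
    }
    result.push_back(p.lexically_normal().string());
  }
  return result;
}
```

The base directory would typically be the directory that the relative entries were written against, for example the InceptionV3 example directory.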
Execution requires the generated DLC and the provided utility library libQnnModelDlc.so. This library extends the QNN model API to compose QNN graphs from a supplied DLC path and return their handles.
ModelError_t QnnModel_composeGraphsFromDlc(Qnn_BackendHandle_t backendHandle,
                                           QNN_INTERFACE_VER_TYPE interface,
                                           Qnn_ContextHandle_t contextHandle,
                                           const GraphConfigInfo_t **graphsConfigInfo,
                                           const char *dlcPath,
                                           const uint32_t numGraphsConfigInfo,
                                           GraphInfoPtr_t **graphsInfo,
                                           uint32_t *numGraphsInfo,
                                           bool debug,
                                           QnnLog_Callback_t logCallback,
                                           QnnLog_Level_t maxLogLevel)
This is the same as the QnnGraph_ComposeGraphs API, with the dlcPath input parameter added. The returned QNN graph handles can then be finalized and executed.
The following sections demonstrate execution with a DLC.
CPU backend execution
Executing on a Linux host
- Execute the model by running qnn-net-run with the libQnnModelDlc.so utility library as the --model argument and Inception_v3.dlc as the --dlc_path argument.
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-net-run \
    --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnCpu.so \
    --model ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnModelDlc.so \
    --dlc_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc \
    --input_list data/cropped/raw_list.txt
The results will be located in ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output.
View the results.
$ python ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output/ \
    -l data/imagenet_slim_labels.txt
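The script reports the classifications for each input. As an illustration of the ranking step that such a report involves, the sketch below picks the indices of the highest scores from one output tensor. It assumes the output has already been read into a flat array of float scores; topK is a hypothetical helper and is not part of the SDK or of the script.

```cpp
#include <algorithm>
#include <cstddef>
#include <numeric>
#include <vector>

// Hypothetical helper: return the indices of the k largest scores,
// ordered from highest score to lowest. If k exceeds the number of
// scores, all indices are returned.
std::vector<size_t> topK(const std::vector<float>& scores, size_t k) {
  std::vector<size_t> idx(scores.size());
  std::iota(idx.begin(), idx.end(), 0);  // 0, 1, 2, ...
  k = std::min(k, idx.size());
  std::partial_sort(idx.begin(), idx.begin() + k, idx.end(),
                    [&scores](size_t a, size_t b) { return scores[a] > scores[b]; });
  idx.resize(k);
  return idx;
}
```

Each returned index can then be looked up in a label file to produce a readable class name.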
Executing on Android
Running the CPU backend on an Android target is similar to running it on a Linux x86 target.
Create a directory for the example on the Android device.
$ adb shell "mkdir /data/local/tmp/inception_v3"
Push the necessary libraries and the DLC to the device.
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnCpu.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnModelDlc.so /data/local/tmp/inception_v3
Push the input data and input list to the device.
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool to the device.
$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Set up the environment on the device.
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
Run qnn-net-run with the following arguments.
$ ./qnn-net-run --backend libQnnCpu.so --model libQnnModelDlc.so --dlc_path Inception_v3.dlc --input_list target_raw_list.txt
The output of the run will be located in the default ./output directory.
Exit the device shell and view the results.
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output_android/ \
    -l data/imagenet_slim_labels.txt
GPU backend execution
Note
Running the GPU backend on Windows devices is not supported.
Executing on Android
Running the GPU backend on an Android target is similar to running the CPU backend on an Android target.
Create a directory for the example on the Android device.
$ adb shell "mkdir /data/local/tmp/inception_v3"
Push the necessary libraries and the DLC to the device.
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnGpu.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3.dlc /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnModelDlc.so /data/local/tmp/inception_v3
Push the input data and input list to the device.
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool to the device.
$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Set up the environment on the device.
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
Run qnn-net-run with the following arguments.
$ ./qnn-net-run --backend libQnnGpu.so --model libQnnModelDlc.so --dlc_path Inception_v3.dlc --input_list target_raw_list.txt
The output of the run will be located in the default ./output directory.
Exit the device shell and view the results.
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output_android/ \
    -l data/imagenet_slim_labels.txt
HTP backend execution
Executing on a Linux host
Note
The HTP backend can be run on a Linux host by using the HTP emulation backend.
Execute the model by running qnn-net-run with the libQnnModelDlc.so utility library as the --model argument and Inception_v3_quantized.dlc as the --dlc_path argument.
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-net-run \
    --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
    --model ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnModelDlc.so \
    --dlc_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc \
    --input_list data/cropped/raw_list.txt
Note
The HTP emulation backend requires a quantized model. For more information on quantization, see Model quantization.
The results will be located in ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output.
View the results.
$ python ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output/ \
    -l data/imagenet_slim_labels.txt
Executing on Android
Running the HTP backend on an Android target is similar to running the CPU and GPU backends on an Android target, except that the HTP backend requires a quantized model and a serialized context generated by the user. For more information on quantization, see Model quantization.
- Generate a serialized context from the DLC by running qnn-context-binary-generator with libQnnModelDlc.so as the --model argument and the quantized DLC as the --dlc_path argument.
$ ${QNN_SDK_ROOT}/bin/x86_64-linux-clang/qnn-context-binary-generator \
    --backend ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnHtp.so \
    --model ${QNN_SDK_ROOT}/lib/x86_64-linux-clang/libQnnModelDlc.so \
    --dlc_path ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc \
    --binary_file Inception_v3_quantized.serialized
The context will be created at ./output/Inception_v3_quantized.serialized.bin.
- Create a directory for the example on the Android device.
$ adb shell "mkdir /data/local/tmp/inception_v3"
- Push the necessary libraries and the serialized context to the device.
$ adb push ${QNN_SDK_ROOT}/lib/hexagon-v68/unsigned/libQnnHtpV68Skel.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpV68Stub.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtp.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/output/Inception_v3_quantized.serialized.bin /data/local/tmp/inception_v3
Note
This section demonstrates HTP execution on Android with an offline-prepared graph. To execute a graph that is prepared on the device (online), also push the on-device prepare library and the quantized DLC.
$ adb push ${QNN_SDK_ROOT}/lib/aarch64-android/libQnnHtpPrepare.so /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/model/Inception_v3_quantized.dlc /data/local/tmp/inception_v3
Push the input data and input list to the device.
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/cropped /data/local/tmp/inception_v3
$ adb push ${QNN_SDK_ROOT}/examples/Models/InceptionV3/data/target_raw_list.txt /data/local/tmp/inception_v3
Push the qnn-net-run tool to the device.
$ adb push ${QNN_SDK_ROOT}/bin/aarch64-android/qnn-net-run /data/local/tmp/inception_v3
Set up the environment on the device.
$ adb shell
$ cd /data/local/tmp/inception_v3
$ export LD_LIBRARY_PATH=/data/local/tmp/inception_v3
$ export ADSP_LIBRARY_PATH="/data/local/tmp/inception_v3"
Run qnn-net-run with the following arguments.
$ ./qnn-net-run --backend libQnnHtp.so --input_list target_raw_list.txt --retrieve_context Inception_v3_quantized.serialized.bin
The output of the run will be located in the default ./output directory.
Exit the device shell and view the results.
$ exit
$ cd ${QNN_SDK_ROOT}/examples/Models/InceptionV3
$ adb pull /data/local/tmp/inception_v3/output output_android
$ python3 ${QNN_SDK_ROOT}/examples/Models/InceptionV3/scripts/show_inceptionv3_classifications.py \
    -i data/cropped/raw_list.txt \
    -o output_android/ \
    -l data/imagenet_slim_labels.txt