ffaa3550 by 乔峰昇

update

1 parent 298fa6f9
Showing 59 changed files with 11705 additions and 1 deletions
model_repository/
__pycache__/
......
# turnsole
A series of convenience functions to make your machine learning project easier
## Installation
### Latest release
`pip install turnsole`
> The project is not open source yet, so this installation method is not guaranteed to work for now
### Developer mode
`pip install -e .`
## Quick Start
### PDF Operations
#### Smart PDF-to-Image Conversion
Intelligently extracts the images embedded in a PDF file: pages without embedded images are rendered as full-page screenshots, and fragmented images are stitched back together automatically.
##### Example:
<pre># pdf_path is the path to the PDF file; the output images are grouped by page number
images = turnsole.pdf_to_images(pdf_path)</pre>
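The returned `images` is a list of per-page image lists; if you just want one flat list of images (as the bundled PDF demo script does), flatten it:
<pre>images = turnsole.pdf_to_images(pdf_path)
images = sum(images, [])  # flatten the per-page lists into a single list</pre>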
### Image Toolbox
#### base64_to_bgr / bgr_to_base64
Convert images to and from base64
##### Example:
<pre>image = turnsole.base64_to_bgr(img64)
img64 = turnsole.bgr_to_base64(image)</pre>
#### image_crop
Crops a slice of `image` according to `bbox`; if `perspective` is set to True, the crop uses a perspective transform, so rotated targets can be cropped.
##### Example:
<pre>im_slice_no_perspective = turnsole.image_crop(image, bbox)
im_slice = turnsole.image_crop(image, bbox, perspective=True)</pre>
##### Output:
<img src="docs/images/image_crop.png?raw=true" alt="image crop example" style="max-width: 200px;">
### OCR Engine Module
The OCR engine is a collection of low-level OCR-related models; we provide both functional call interfaces and a standard API for these models.
- [x] ADC :tada:
- [x] DBNet :tada:
- [x] CRNN :tada:
- [x] Object Detector :tada:
- [x] Signature Detector :tada:
#### Free Trial
```python
import requests
results = requests.post(url=r'http://139.196.149.46:9001/gen_ocr', files={'file': open(file_path, 'rb')}).json()
ocr_results = results['ocr_results']
```
#### Prerequisites
The OCR engine module depends on the underlying neural-network models, so those models have to be mounted and served with Docker first.
First put the ./model_repository folder and the models inside it at the project root before starting; if you do not have the models, ask [lvkui](lvkui@situdata.com) for them.
Usage is very simple: just start the corresponding Docker container.
```bash
docker run --gpus="device=0" --rm -p 8000:8000 -p 8001:8001 -p 8002:8002 -v $PWD/model_repository:/models nvcr.io/nvidia/tritonserver:21.10-py3 tritonserver --model-repository=/models
```
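A quick way to check that the Triton container is up, assuming it runs on the same machine with its default HTTP port 8000 mapped as above:
```python
import requests

# Triton's readiness endpoint returns HTTP 200 once the model repository is loaded
resp = requests.get('http://localhost:8000/v2/health/ready')
print('Triton ready:', resp.status_code == 200)
```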
#### ADC
General document orientation-correction algorithm
```python
from turnsole.ocr_engine import angle_detector
image_rotated, direction = angle_detector.ADC(image, fine_degree=False)
```
#### DBNet
General text detection algorithm
```python
from turnsole.ocr_engine import text_detector
boxes = text_detector.predict(image)
```
#### CRNN
General text recognition algorithm
```python
from turnsole.ocr_engine import text_recognizer
ocr_result, ocr_time = text_recognizer.predict_batch(image, boxes)
```
#### Object Detector
General document detection algorithm
```python
from turnsole.ocr_engine import object_detector
object_list = object_detector.process(image)
```
#### Signature Detector
Signature, stamp, and QR-code detection algorithm
```python
from turnsole.ocr_engine import signature_detector
signature_list = signature_detector.process(image)
```
#### Standard API
```bash
python api/ocr_engine_server.py
```
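Once the server is up, it can be called the same way as the free-trial endpoint above; a minimal sketch, assuming the server listens on `localhost:9001` as in the demo scripts and `file_path` points at your document:
```python
import requests

with open(file_path, 'rb') as f:
    results = requests.post(url='http://localhost:9001/gen_ocr_with_rotation',
                            files={'file': f}).json()
print(results['direction'])
print(results['ocr_results'])
```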
[2022-10-21 14:12:17 +0800] [8546] [INFO] Goin' Fast @ http://192.168.10.11:9001
[2022-10-21 14:12:17 +0800] [8567] [INFO] Starting worker [8567]
[2022-10-21 14:12:17 +0800] [8568] [INFO] Starting worker [8568]
[2022-10-21 14:12:17 +0800] [8569] [INFO] Starting worker [8569]
[2022-10-21 14:12:17 +0800] [8570] [INFO] Starting worker [8570]
[2022-10-21 14:12:17 +0800] [8571] [INFO] Starting worker [8571]
[2022-10-21 14:12:17 +0800] [8572] [INFO] Starting worker [8572]
[2022-10-21 14:12:17 +0800] [8573] [INFO] Starting worker [8573]
[2022-10-21 14:12:17 +0800] [8576] [INFO] Starting worker [8576]
[2022-10-21 14:12:17 +0800] [8574] [INFO] Starting worker [8574]
[2022-10-21 14:12:17 +0800] [8575] [INFO] Starting worker [8575]
[2022-10-21 14:13:51 +0800] [8575] [ERROR] Exception occurred while handling uri: 'http://192.168.10.11:9001/gen_ocr'
Traceback (most recent call last):
File "/home/situ/miniconda3/envs/workenv/lib/python3.6/site-packages/sanic/app.py", line 944, in handle_request
response = await response
File "ocr_engine_server.py", line 37, in ocr_engine
boxes = text_detector.predict(image)
File "/home/situ/qfs/invoice_tamper/09_project/project/bank_bill_ocr/OCR_Engine/turnsole/ocr_engine/DBNet/text_detector.py", line 113, in predict
outputs=outputs
File "/home/situ/miniconda3/envs/workenv/lib/python3.6/site-packages/tritonclient/grpc/__init__.py", line 1431, in infer
raise_error_grpc(rpc_error)
File "/home/situ/miniconda3/envs/workenv/lib/python3.6/site-packages/tritonclient/grpc/__init__.py", line 62, in raise_error_grpc
raise get_error_grpc(rpc_error) from None
tritonclient.utils.InferenceServerException: [StatusCode.UNAVAILABLE] Request for unknown model: 'dbnet_model' is not found
[2022-10-21 14:13:51 +0800] - (sanic.access)[INFO][192.168.10.11:57260]: POST http://192.168.10.11:9001/gen_ocr 500 735
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Create Date : 2022-06-05 20:49:51
# @Last Modified : 2022-08-19 17:24:55
# @Description :
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
from sanic import Sanic
from sanic.response import json
from turnsole.ocr_engine import angle_detector
from turnsole.ocr_engine import text_detector
from turnsole.ocr_engine import text_recognizer
from turnsole.ocr_engine import object_detector
from turnsole.ocr_engine import signature_detector
from turnsole import bytes_to_bgr
app = Sanic("OCR_ENGINE")
app.config.REQUEST_MAX_SIZE = 1000000000      # maximum request size in bytes / 1 GB
app.config.REQUEST_BUFFER_QUEUE_SIZE = 1000   # request stream buffer queue size
app.config.REQUEST_TIMEOUT = 600              # how long a request may take to arrive (seconds)
app.config.RESPONSE_TIMEOUT = 600             # how long response processing may take (seconds)
@app.post('/gen_ocr')
async def ocr_engine(request):
    # request.files.get() has three attributes: type/body/name
    file = request.files.get('file').body
    # convert the raw bytes into a BGR image
    image = bytes_to_bgr(file)
    # text detection
    boxes = text_detector.predict(image)
    # text recognition
    res, _ = text_recognizer.predict_batch(image[..., ::-1], boxes)

    resp = {}
    resp["ocr_results"] = res
    return json(resp)

@app.post('/gen_ocr_with_rotation')
async def ocr_engine_with_rotation(request):
    # request.files.get() has three attributes: type/body/name
    file = request.files.get('file').body
    # convert the raw bytes into a BGR image
    image = bytes_to_bgr(file)
    # orientation detection
    image, direction = angle_detector.ADC(image.copy(), fine_degree=False)
    # text detection
    boxes = text_detector.predict(image)
    # text recognition
    res, _ = text_recognizer.predict_batch(image[..., ::-1], boxes)

    resp = {}
    resp["ocr_results"] = res
    resp["direction"] = direction
    return json(resp)

@app.post("/object_detect")
async def object_detect(request):
    # request.files.get() has three attributes: type/body/name
    file = request.files.get('file').body
    # convert the raw bytes into a BGR image
    image = bytes_to_bgr(file)
    # general document detection
    object_list = object_detector.process(image)
    return json(object_list)

@app.post("/signature_detect")
async def signature_detect(request):
    # request.files.get() has three attributes: type/body/name
    file = request.files.get('file').body
    # convert the raw bytes into a BGR image
    image = bytes_to_bgr(file)
    # signature / stamp / QR-code / barcode detection
    signature_list = signature_detector.process(image)
    return json(signature_list)

if __name__ == "__main__":
    # app.run(host="0.0.0.0", port=9001)
    app.run(host="192.168.10.11", port=9002, workers=10)
    # uvicorn server:app --port 9001 --workers 10
# Modified from:
# https://www.pyimagesearch.com/2017/02/06/faster-video-file-fps-with-cv2-videocapture-and-opencv/
# Performance:
# Python 2.7: 105.78 --> 131.75
# Python 3.7: 15.36 --> 50.13
# USAGE
# python read_frames_fast.py --video videos/jurassic_park_intro.mp4
# import the necessary packages
from turnsole.video import FileVideoStream
from turnsole.video import FPS
import numpy as np
import argparse
import imutils
import time
import cv2
def filterFrame(frame):
frame = imutils.resize(frame, width=450)
frame = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
frame = np.dstack([frame, frame, frame])
return frame
# construct the argument parse and parse the arguments
ap = argparse.ArgumentParser()
ap.add_argument("-v", "--video", required=True,
help="path to input video file")
args = vars(ap.parse_args())
# start the file video stream thread and allow the buffer to
# start to fill
print("[INFO] starting video file thread...")
fvs = FileVideoStream(args["video"], transform=filterFrame).start()
time.sleep(1.0)
# start the FPS timer
fps = FPS().start()
# loop over frames from the video file stream
while fvs.running():
# grab the frame from the threaded video file stream, resize
# it, and convert it to grayscale (while still retaining 3
# channels)
frame = fvs.read()
# Relocated filtering into producer thread with transform=filterFrame
# Python 2.7: FPS 92.11 -> 131.36
# Python 3.7: FPS 41.44 -> 50.11
#frame = filterFrame(frame)
# display the size of the queue on the frame
cv2.putText(frame, "Queue Size: {}".format(fvs.Q.qsize()),
(10, 30), cv2.FONT_HERSHEY_SIMPLEX, 0.6, (0, 255, 0), 2)
# show the frame and update the FPS counter
cv2.imshow("Frame", frame)
cv2.waitKey(1)
if fvs.Q.qsize() < 2: # If we are low on frames, give time to producer
time.sleep(0.001) # Ensures producer runs now, so 2 is sufficient
fps.update()
# stop the timer and display FPS information
fps.stop()
print("[INFO] elasped time: {:.2f}".format(fps.elapsed()))
print("[INFO] approx. FPS: {:.2f}".format(fps.fps()))
# do a bit of cleanup
cv2.destroyAllWindows()
fvs.stop()
import cv2
import turnsole
if __name__ == '__main__':
img = cv2.imread('./images/sunflower.jpg')
img = turnsole.resize(img, width=512)
cv2.imshow('image', img)
cv2.waitKey()
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Created Date : 2021-03-05 16:51:22
# @Last Modified : 2021-03-05 18:15:53
# @Description :
from turnsole.model import EasyDet
if __name__ == '__main__':
model = EasyDet(phi=0)
model.summary()
import time
import numpy as np
x = np.random.random_sample((1, 640, 640, 3))
# warm up
output = model.predict(x)
print('\n[INFO] Test start')
time_start = time.time()
for i in range(1000):
output = model.predict(x)
time_end = time.time()
print('[INFO] Time used: {:.2f} ms'.format((time_end - time_start)*1000/(i+1)))
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Create Date : 2022-07-22 13:10:47
# @Last Modified : 2022-09-08 19:03:24
# @Description :
import os
os.environ['CUDA_VISIBLE_DEVICES'] = '-1'
import cv2
# from turnsole.ocr_engine import angle_detector
from turnsole.ocr_engine import object_detector
import matplotlib.pyplot as plt
if __name__ == "__main__":
base_dir = '/home/lk/MyProject/BMW/数据集/文件分类/身份证'
for (rootDir, dirNames, filenames) in os.walk(base_dir):
for filename in filenames:
if not filename.endswith('.jpg'):
continue
img_path = os.path.join(rootDir, filename)
print(img_path)
image = cv2.imread(img_path)
results = object_detector.process(image)
print(results)
for item in results:
xmin = item['location']['xmin']
ymin = item['location']['ymin']
xmax = item['location']['xmax']
ymax = item['location']['ymax']
cv2.rectangle(image, (xmin, ymin), (xmax, ymax), (0, 255, 0), 2)
plt.imshow(image[...,::-1])
plt.show()
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Create Date : 2022-07-22 13:10:47
# @Last Modified : 2022-08-24 15:39:55
# @Description :
import os
import cv2
import fitz
from turnsole import pdf_to_images # pip install turnsole PyMuPDF opencv-python==4.4.0.44
if __name__ == "__main__":
base_dir = '/PATH/TO/YOUR/WORKDIR'
for (rootDir, dirNames, filenames) in os.walk(base_dir):
for filename in filenames:
if not filename.endswith('.pdf'):
continue
pdf_path = os.path.join(rootDir, filename)
print(pdf_path)
images = pdf_to_images(pdf_path)
images = sum(images, [])
image_dir = os.path.join(rootDir, filename.replace('.pdf', ''))
if not os.path.exists(image_dir):
os.makedirs(image_dir)
for index, image in enumerate(images):
save_path = os.path.join(image_dir, filename.replace('.pdf', '')+'-'+str(index)+'.jpg')
cv2.imwrite(save_path, image)
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Create Date : 2022-05-06 22:02:01
# @Last Modified : 2022-08-03 14:59:51
# @Description :
import os
import time
import random
import requests
import numpy as np
from threading import Thread
class API_test:
    def __init__(self, file_dir, test_time, num_request):
        self.file_paths = []
        for fn in os.listdir(file_dir):
            file_path = os.path.join(file_dir, fn)
            self.file_paths.append(file_path)

        self.time_start = time.time()
        self.test_time = test_time * 60  # in seconds

        self.results = list()
        self.index = 0

        threads = []
        for i in range(num_request):
            t = Thread(target=self.update, args=())
            threads.append(t)
        for t in threads:
            print(f'[INFO] {t} is running')
            t.start()

    def update(self):
        while True:
            file_path = random.choice(self.file_paths)
            # open the image file in binary mode
            with open(file_path, 'rb') as data:
                t0 = time.time()
                response = requests.post(url=r'http://localhost:9001/gen_ocr_with_rotation', files={'file': data})
                # count failed requests
                if response.status_code != 200:
                    print(response)
                t1 = time.time()
            self.results.append((t1 - t0))

            time_cost = (time.time() - self.time_start)
            time_remaining = self.test_time - time_cost
            self.index += 1

            if time_remaining > 0:
                print(f'\r[INFO] time remaining {time_remaining} s, average response time {np.mean(self.results)} s, TPS {len(self.results)/time_cost}, throughput {self.index}', end=' ', flush=True)
            else:
                break

if __name__ == '__main__':
    imageDir = './demos/img_ocr'  # test data directory
    testTime = 10                 # load duration, in minutes
    numRequest = 10               # number of concurrent requests

    API_test(imageDir, testTime, numRequest)
[metadata]
name = turnsole
version = 0.0.27
author = Kui Lyu
author_email = 9428.al@gmail.com
description = A series of convenience functions make your machine learning project easier
long_description = file: README.md
long_description_content_type = text/markdown
url = https://github.com/Antonio-hi/turnsole
project_urls =
Bug Tracker = https://github.com/Antonio-hi/turnsole/issues
classifiers =
Programming Language :: Python :: 3
License :: OSI Approved :: MIT License
Operating System :: OS Independent
[options]
packages = find:
python_requires = >=3.6
# -*- coding: utf-8 -*-
# @Author : lk
# @Email : 9428.al@gmail.com
# @Created Date : 2021-03-04 16:56:27
# @Last Modified : 2021-03-04 17:16:57
# @Description :
import setuptools
setuptools.setup()
Metadata-Version: 2.1
Name: turnsole
Version: 0.0.27
Summary: A series of convenience functions make your machine learning project easier
Home-page: https://github.com/Antonio-hi/turnsole
Author: Kui Lyu
Author-email: 9428.al@gmail.com
License: UNKNOWN
Project-URL: Bug Tracker, https://github.com/Antonio-hi/turnsole/issues
Platform: UNKNOWN
Classifier: Programming Language :: Python :: 3
Classifier: License :: OSI Approved :: MIT License
Classifier: Operating System :: OS Independent
Requires-Python: >=3.6
Description-Content-Type: text/markdown
License-File: LICENSE
LICENSE
README.md
setup.cfg
setup.py
turnsole/__init__.py
turnsole/convenience.py
turnsole/encodings.py
turnsole/model.py
turnsole/paths.py
turnsole/pdf_tools.py
turnsole.egg-info/PKG-INFO
turnsole.egg-info/SOURCES.txt
turnsole.egg-info/dependency_links.txt
turnsole.egg-info/top_level.txt
turnsole/face_utils/__init__.py
turnsole/face_utils/agedetector.py
turnsole/face_utils/facedetector.py
turnsole/nets/__init__.py
turnsole/nets/efficientnet.py
turnsole/ocr_engine/__init__.py
turnsole/ocr_engine/ADC/__init__.py
turnsole/ocr_engine/ADC/angle_detector.py
turnsole/ocr_engine/CRNN/__init__.py
turnsole/ocr_engine/CRNN/alphabets.py
turnsole/ocr_engine/CRNN/text_rec.py
turnsole/ocr_engine/DBNet/__init__.py
turnsole/ocr_engine/DBNet/text_detector.py
turnsole/ocr_engine/object_det/__init__.py
turnsole/ocr_engine/object_det/utils.py
turnsole/ocr_engine/signature_det/__init__.py
turnsole/ocr_engine/signature_det/utils.py
turnsole/ocr_engine/utils/__init__.py
turnsole/ocr_engine/utils/read_data.py
turnsole/video/__init__.py
turnsole/video/count_frames.py
turnsole/video/filevideostream.py
turnsole/video/fps.py
turnsole/video/pivideostream.py
turnsole/video/videostream.py
turnsole/video/webcamvideostream.py
try:
    from . import ocr_engine
except Exception:
    # print('[INFO] OCR engine could not be imported')
    pass
from .convenience import resize
from .convenience import resize_with_pad
from .convenience import image_crop
from .encodings import bytes_to_bgr
from .encodings import base64_to_image
from .encodings import base64_encode_file
from .encodings import base64_encode_image
from .encodings import base64_decode_image
from .encodings import base64_to_bgr
from .encodings import bgr_to_base64
from .pdf_tools import pdf_to_images
import cv2
import numpy as np
def resize(image, width=None, height=None, inter=cv2.INTER_AREA):
    # initialize the dimensions of the image to be resized and grab the image size
    dim = None
    (h, w) = image.shape[:2]

    # if both the width and height are None, then return the original image
    if width is None and height is None:
        return image

    # check to see if the width is None
    if width is None:
        # calculate the ratio of the height and construct the dimensions
        r = height / float(h)
        dim = (int(w * r), height)

    # otherwise, the height is None
    else:
        # calculate the ratio of the width and construct the dimensions
        r = width / float(w)
        dim = (width, int(h * r))

    # resize the image
    resized = cv2.resize(image, dim, interpolation=inter)

    # return the resized image
    return resized
def resize_with_pad(image, target_width, target_height):
    """Resizes and pads an image to a target width and height.

    Resizes an image to the target width and height while keeping the aspect
    ratio the same, without distortion. The image is only ever shrunk, so the
    resize ratio is at most 1.0, and the remaining width and height are padded
    with zeroes.

    Args:
        image (Array): RGB/BGR
        target_width (Int): Target width.
        target_height (Int): Target height.

    Returns:
        Array: Resized and padded image. The image is padded with zeroes.
        Float: Image resize ratio. The ratio is at most 1.0.
    """
    height, width, _ = image.shape
    min_ratio = min(target_height / height, target_width / width)
    ratio = min_ratio if min_ratio < 1.0 else 1.0

    # To shrink an image, it will generally look best with INTER_AREA interpolation.
    resized = cv2.resize(image, None, fx=ratio, fy=ratio, interpolation=cv2.INTER_AREA)
    h, w, _ = resized.shape

    canvas = np.zeros((target_height, target_width, 3), image.dtype)
    canvas[:h, :w, :] = resized
    return canvas, ratio
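# Usage sketch (illustrative only, not part of the original module):
#   canvas, ratio = resize_with_pad(image, target_width=640, target_height=640)
#   # `canvas` is a 640x640 zero-padded image; divide predicted coordinates by
#   # `ratio` to map them back onto the original image.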
def image_crop(image, bbox, perspective=False):
    """Crop a slice of `image` according to `bbox`; if `perspective` is True, the
    crop uses a perspective transform, so rotated targets can be cropped.

    Args:
        image (array): three-channel image; the crop keeps the original colour channels
        bbox (array/list): supports two-point rectangles and four-point rotated rectangles,
            i.e. either of the following formats:
            1. bbox = [xmin, ymin, xmax, ymax]
            2. bbox = [x0, y0, x1, y1, x2, y2, x3, y3]
        perspective (bool, optional): whether to crop a rotated target. Defaults to False.

    Returns:
        array: the cropped patch, with the same colour channels as the original image
    """
    # crop the axis-aligned bounding rectangle of the bbox
    bbox = np.array(bbox, dtype=np.int32).reshape((-1, 2))
    xmin, ymin, xmax, ymax = [min(bbox[:, 0]),
                              min(bbox[:, 1]),
                              max(bbox[:, 0]),
                              max(bbox[:, 1])]
    xmin, ymin = max(0, xmin), max(0, ymin)
    im_slice = image[ymin:ymax, xmin:xmax, :]

    if perspective and bbox.shape[0] == 4:
        # width and height of the rotated rectangle
        w, h = [int(np.linalg.norm(bbox[0] - bbox[1])),
                int(np.linalg.norm(bbox[3] - bbox[0]))]
        # translate the bbox into the coordinate frame of the axis-aligned crop
        bbox[:, 0] -= xmin
        bbox[:, 1] -= ymin
        # perform the perspective crop
        pts1 = np.float32(bbox)
        pts2 = np.float32([[0, 0], [w, 0], [w, h], [0, h]])

        M = cv2.getPerspectiveTransform(pts1, pts2)
        im_slice = cv2.warpPerspective(im_slice, M, (w, h))

    return im_slice
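# Usage sketch (illustrative only, not part of the original module):
#   patch = image_crop(image, [xmin, ymin, xmax, ymax])            # axis-aligned crop
#   patch = image_crop(image, [x0, y0, x1, y1, x2, y2, x3, y3],
#                      perspective=True)                           # deskewed crop of a rotated box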
# -*- coding: utf-8 -*-
# @Author : Antonio-hi
# @Email : 9428.al@gmail.com
# @Create Date : 2021-08-09 19:08:49
# @Last Modified : 2021-08-10 10:11:06
# @Description :
# import the necessary packages
import numpy as np
import base64
import json
import sys
import cv2
import os
def base64_encode_image(a):
# return a JSON-encoded list of the base64 encoded image, image data type, and image shape
# return json.dumps([base64_encode_array(a), str(a.dtype), a.shape])
return json.dumps([base64_encode_array(a).decode("utf-8"), str(a.dtype),
a.shape])
def base64_decode_image(a):
# grab the array, data type, and shape from the JSON-decoded object
(a, dtype, shape) = json.loads(a)
# set the correct data type and reshape the matrix into an image
a = base64_decode_array(a, dtype).reshape(shape)
# return the loaded image
return a
def base64_encode_array(a):
# return the base64 encoded array
return base64.b64encode(a)
def base64_decode_array(a, dtype):
# decode and return the array
return np.frombuffer(base64.b64decode(a), dtype=dtype)
def base64_encode_file(image_path):
filename = os.path.basename(image_path)
# encode image file to base64 string
with open(image_path, 'rb') as f:
buffer = f.read()
# convert bytes buffer string then encode to base64 string
img64_bytes = base64.b64encode(buffer)
img64_str = img64_bytes.decode('utf-8') # bytes to str
return json.dumps({"filename" : filename, "img64": img64_str})
def base64_to_image(img64):
image_buffer = base64_decode_array(img64, dtype=np.uint8)
# In the case of color images, the decoded images will have the channels stored in B G R order.
image = cv2.imdecode(image_buffer, cv2.IMREAD_COLOR)
return image
def bytes_to_bgr(buffer: bytes):
    """Read a byte stream as an OpenCV image.

    Args:
        buffer (TYPE): bytes of an encoded image
    """
    img_array = np.frombuffer(buffer, np.uint8)
    image = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
    return image
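# Usage sketch (illustrative only, not part of the original module): decode raw
# file bytes, e.g. an uploaded request body, into a BGR array.
#   with open('demos/images/sunflower.jpg', 'rb') as f:
#       image = bytes_to_bgr(f.read())   # 3-D uint8 numpy array in BGR order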
def base64_to_bgr(img64):
    """Convert a base64 string into an image.

    Single-channel grayscale images and four-channel transparent images are
    automatically converted into three-channel BGR images.

    Args:
        img64 (TYPE): Description
    Returns:
        TYPE: image is a 3-D uint8 Tensor of shape [height, width, channels] where channels is BGR
    """
    encoded_image = base64.b64decode(img64)
    img_array = np.frombuffer(encoded_image, np.uint8)
    image = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
    return image
def bgr_to_base64(image):
    """Convert an image into a base64 string. The image is compressed as JPEG
    along the way, which usually degrades image quality.

    Args:
        image (TYPE): image is a 3-D uint8 or uint16 Tensor of shape [height, width, channels] where channels is BGR
    Returns:
        TYPE: the image as a base64 string
    """
    retval, encoded_image = cv2.imencode('.jpg', image)  # Encodes an image (BGR) into a memory buffer.
    img64 = base64.b64encode(encoded_image)
    return img64.decode('utf-8')
if __name__ == '__main__':
    image_path = '/home/lk/Repository/Project/turnsole/demos/images/sunflower.jpg'

    # 1) Encode an image file as a base64 string (in principle any file type works)
    json_str = base64_encode_file(image_path)

    img64_dict = json.loads(json_str)
    suffix = os.path.splitext(img64_dict['filename'])[-1].lower()
    if suffix not in ['.jpg', '.jpeg', '.png', '.bmp']:
        print(f'[INFO] Files with suffix {suffix} are not supported yet!')

    # 2) Decode the base64 string back into an image
    image = base64_to_image(img64_dict['img64'])
    inputs = image / 255.

    # 3) Home-grown helpers: encode an array to base64 and back without going
    #    through an image codec, so the array keeps its data type
    base64_encode_json_string = base64_encode_image(inputs)
    inputs = base64_decode_image(base64_encode_json_string)
    print(inputs)
# Note on the b'' string prefix
#   Example: response = b'<h1>Hello World!</h1>'  # b'' marks a bytes object
#   Meaning: the b"" prefix indicates that the literal that follows is of type bytes.
#   Why it matters: in network programming, servers and browsers only handle
#   bytes data; for example, the argument of send() and the return value of
#   recv() are both bytes.
#   In Python 3 the conversions between bytes and str are:
#       str.encode('utf-8')
#       bytes.decode('utf-8')
# -*- coding: utf-8 -*-
# @Author : lk
# @Email : 9428.al@gmail.com
# @Create Date : 2021-08-11 17:10:16
# @Last Modified : 2021-08-12 16:14:53
# @Description :
import os
import tensorflow as tf
class AgeDetector:
    def __init__(self, model_path):
        self.age_map = {
            0: '0-2',
            1: '4-6',
            2: '8-13',
            3: '15-20',
            4: '25-32',
            5: '38-43',
            6: '48-53',
            7: '60+'
        }

        self.model = tf.keras.models.load_model(filepath=model_path,
                                                 compile=False)
        self.inference_model = self.build_inference_model()

    def build_inference_model(self):
        image = self.model.input
        x = tf.keras.applications.mobilenet_v2.preprocess_input(image)
        predictions = self.model(x, training=False)
        inference_model = tf.keras.Model(inputs=image, outputs=predictions)
        return inference_model

    def predict_batch(self, images):
        # takes a list of face images; the list must not be empty
        images = tf.stack([tf.image.resize(image, [96, 96]) for image in images], axis=0)
        preds = self.inference_model.predict(images)

        indexes = tf.argmax(preds, axis=-1)
        classes = [self.age_map[index.numpy()] for index in indexes]
        return classes

if __name__ == '__main__':
    import cv2
    from turnsole import paths

    age_det = AgeDetector(model_path='./ckpt/age_detector.h5')

    data_dir = '/home/lk/Project/Face_Age_Gender/data/Emotion/emotion/010003_female_yellow_22'
    for image_path in paths.list_images(data_dir):
        image = cv2.imread(image_path)
        classes = age_det.predict_batch([image])
        print(classes)
# -*- coding: utf-8 -*-
# @Author : Antonio-hi
# @Email : 9428.al@gmail.com
# @Create Date : 2021-08-11 18:28:36
# @Last Modified : 2021-08-12 19:27:59
# @Description :
import os
import time
import numpy as np
import tensorflow as tf
def convert_to_corners(boxes):
"""Changes the box format to corner coordinates
Arguments:
boxes: A tensor of rank 2 or higher with a shape of `(..., num_boxes, 4)`
representing bounding boxes where each box is of the format
`[x, y, width, height]`.
Returns:
converted boxes with shape same as that of boxes.
"""
return tf.concat(
[boxes[..., :2] - boxes[..., 2:] / 2.0, boxes[..., :2] + boxes[..., 2:] / 2.0],
axis=-1,
)
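# Worked example (illustrative only): a box centred at (50, 50) with width 20
# and height 10 becomes its corner representation.
#   convert_to_corners(tf.constant([[50., 50., 20., 10.]]))
#   # -> [[40., 45., 60., 55.]]  i.e. [xmin, ymin, xmax, ymax]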
class AnchorBox:
"""Generates anchor boxes.
This class has operations to generate anchor boxes for feature maps at
strides `[8, 16, 32, 64, 128]`. Where each anchor each box is of the
format `[x, y, width, height]`.
Attributes:
aspect_ratios: A list of float values representing the aspect ratios of
the anchor boxes at each location on the feature map
scales: A list of float values representing the scale of the anchor boxes
at each location on the feature map.
num_anchors: The number of anchor boxes at each location on feature map
areas: A list of float values representing the areas of the anchor
boxes for each feature map in the feature pyramid.
strides: A list of float value representing the strides for each feature
map in the feature pyramid.
"""
def __init__(self):
self.aspect_ratios = [0.5, 1.0, 2.0]
self.scales = [2 ** x for x in [0, 1 / 3, 2 / 3]]
self._num_anchors = len(self.aspect_ratios) * len(self.scales)
self._strides = [2 ** i for i in range(3, 8)]
self._areas = [x ** 2 for x in [32.0, 64.0, 128.0, 256.0, 512.0]]
self._anchor_dims = self._compute_dims()
def _compute_dims(self):
"""Computes anchor box dimensions for all ratios and scales at all levels
of the feature pyramid.
"""
anchor_dims_all = []
for area in self._areas:
anchor_dims = []
for ratio in self.aspect_ratios:
anchor_height = tf.math.sqrt(area / ratio)
anchor_width = area / anchor_height
dims = tf.reshape(
tf.stack([anchor_width, anchor_height], axis=-1), [1, 1, 2]
)
for scale in self.scales:
anchor_dims.append(scale * dims)
anchor_dims_all.append(tf.stack(anchor_dims, axis=-2))
return anchor_dims_all
def _get_anchors(self, feature_height, feature_width, level):
"""Generates anchor boxes for a given feature map size and level
Arguments:
feature_height: An integer representing the height of the feature map.
feature_width: An integer representing the width of the feature map.
level: An integer representing the level of the feature map in the
feature pyramid.
Returns:
anchor boxes with the shape
`(feature_height * feature_width * num_anchors, 4)`
"""
rx = tf.range(feature_width, dtype=tf.float32) + 0.5
ry = tf.range(feature_height, dtype=tf.float32) + 0.5
centers = tf.stack(tf.meshgrid(rx, ry), axis=-1) * self._strides[level - 3]
centers = tf.expand_dims(centers, axis=-2)
centers = tf.tile(centers, [1, 1, self._num_anchors, 1])
dims = tf.tile(
self._anchor_dims[level - 3], [feature_height, feature_width, 1, 1]
)
anchors = tf.concat([centers, dims], axis=-1)
return tf.reshape(
anchors, [feature_height * feature_width * self._num_anchors, 4]
)
def get_anchors(self, image_height, image_width):
"""Generates anchor boxes for all the feature maps of the feature pyramid.
Arguments:
image_height: Height of the input image.
image_width: Width of the input image.
Returns:
anchor boxes for all the feature maps, stacked as a single tensor
with shape `(total_anchors, 4)`
"""
anchors = [
self._get_anchors(
tf.math.ceil(image_height / 2 ** i),
tf.math.ceil(image_width / 2 ** i),
i,
)
for i in range(3, 8)
]
return tf.concat(anchors, axis=0)
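# Usage sketch (illustrative only, not part of the original module): generate
# every anchor box for a 640x640 input across all pyramid levels.
#   anchor_box = AnchorBox()
#   anchors = anchor_box.get_anchors(image_height=640, image_width=640)
#   # -> tensor of shape (total_anchors, 4) in [x, y, width, height] format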
class DecodePredictions(tf.keras.layers.Layer):
"""A Keras layer that decodes predictions of the RetinaNet model.
Attributes:
num_classes: Number of classes in the dataset
confidence_threshold: Minimum class probability, below which detections
are pruned.
nms_iou_threshold: IOU threshold for the NMS operation
max_detections_per_class: Maximum number of detections to retain per
class.
max_detections: Maximum number of detections to retain across all
classes.
box_variance: The scaling factors used to scale the bounding box
predictions.
"""
def __init__(
self,
num_classes=80,
confidence_threshold=0.05,
nms_iou_threshold=0.5,
max_detections_per_class=100,
max_detections=100,
box_variance=[0.1, 0.1, 0.2, 0.2],
**kwargs
):
super(DecodePredictions, self).__init__(**kwargs)
self.num_classes = num_classes
self.confidence_threshold = confidence_threshold
self.nms_iou_threshold = nms_iou_threshold
self.max_detections_per_class = max_detections_per_class
self.max_detections = max_detections
self._anchor_box = AnchorBox()
self._box_variance = tf.convert_to_tensor(
[0.1, 0.1, 0.2, 0.2], dtype=tf.float32
)
def _decode_box_predictions(self, anchor_boxes, box_predictions):
boxes = box_predictions * self._box_variance
boxes = tf.concat(
[
boxes[:, :, :2] * anchor_boxes[:, :, 2:] + anchor_boxes[:, :, :2],
tf.math.exp(boxes[:, :, 2:]) * anchor_boxes[:, :, 2:],
],
axis=-1,
)
boxes_transformed = convert_to_corners(boxes)
return boxes_transformed
def _decode_landm_predictions(self, anchor_boxes, landm_predictions): # anchor_boxes shape=(1, 138105, 4)
landmarks = tf.reshape(landm_predictions,
[tf.shape(landm_predictions)[0], tf.shape(anchor_boxes)[1], 5, 2])
anchor_boxes = tf.broadcast_to(
input=tf.expand_dims(anchor_boxes, 2),
shape=[tf.shape(landm_predictions)[0], tf.shape(anchor_boxes)[1], 5, 4])
landmarks *= (self._box_variance[:2] * anchor_boxes[:, :, :, 2:])
landmarks += anchor_boxes[:, :, :, :2]
return landmarks
def call(self, images, predictions):
image_shape = tf.cast(tf.shape(images), dtype=tf.float32)
anchor_boxes = self._anchor_box.get_anchors(image_shape[1], image_shape[2])
box_predictions = predictions[:, :, :4]
cls_predictions = tf.nn.sigmoid(predictions[:, :, 4])
landm_predictions = predictions[:, :, 5:15]
boxes = self._decode_box_predictions(anchor_boxes[None, ...], box_predictions)
landmarks = self._decode_landm_predictions(anchor_boxes[None, ...], landm_predictions)
selected_indices = tf.image.non_max_suppression(
boxes=boxes[0],
scores=cls_predictions[0],
max_output_size=self.max_detections,
iou_threshold=0.5,
score_threshold=self.confidence_threshold
)
selected_boxes = tf.gather(boxes[0], selected_indices)
selected_landmarks = tf.gather(landmarks[0], selected_indices)
return selected_boxes, selected_landmarks
class FaceDetector:
def __init__(self, model_path, confidence_threshold=0.5):
self.confidence_threshold = confidence_threshold
self.model = tf.keras.models.load_model(filepath=model_path,
compile=False)
self.inference_model = self.build_inference_model()
def build_inference_model(self):
image = self.model.input
x = tf.keras.applications.mobilenet_v2.preprocess_input(image)
predictions = self.model(x, training=False)
detections = DecodePredictions(confidence_threshold=self.confidence_threshold)(image, predictions)
inference_model = tf.keras.Model(inputs=image, outputs=detections)
return inference_model
def resize_and_pad_image(
self, image, min_side=128.0, max_side=1333.0, jitter=[256, 960], stride=128.0
):
"""Resizes and pads image while preserving aspect ratio.
Returns:
image: Resized and padded image.
image_shape: Shape of the image before padding.
ratio: The scaling factor used to resize the image
"""
image_shape = tf.cast(tf.shape(image)[:2], dtype=tf.float32)
if jitter is not None:
min_side = tf.random.uniform((), jitter[0], jitter[1], dtype=tf.float32)
ratio = min_side / tf.reduce_min(image_shape)
if ratio * tf.reduce_max(image_shape) > max_side:
ratio = max_side / tf.reduce_max(image_shape)
image_shape = ratio * image_shape # tf.float32
image = tf.image.resize(image, tf.cast(image_shape, dtype=tf.int32))
padded_image_shape = tf.cast(
tf.math.ceil(image_shape / stride) * stride, dtype=tf.int32
)
image = tf.image.pad_to_bounding_box(
image, 0, 0, padded_image_shape[0], padded_image_shape[1]
)
return image, image_shape, ratio
def predict(self, image, min_side=128):
# input a image return boxes and landmarks
image, _, ratio = self.resize_and_pad_image(image, min_side=min_side, jitter=None)
detections = self.inference_model.predict(tf.expand_dims(image, axis=0))
boxes, landmarks = detections
boxes = np.array(boxes/ratio, dtype=np.int32)
landmarks = np.array(landmarks/ratio, dtype=np.int32)
return boxes, landmarks

# unused alternative output format:
# results = {
#     'boxes': boxes.tolist(),
#     'landmarks': landmarks.tolist(),
# }
# return results
if __name__ == '__main__':
import cv2
facedetector = FaceDetector(model_path='./model/facedetector.h5')
image_path = '/home/lk/Project/Face_Age_Gender/data/WIDER/WIDER_train/images/28--Sports_Fan/28_Sports_Fan_Sports_Fan_28_615.jpg'
# image_path = '/home/lk/Project/Face_Age_Gender/data/Emotion/emotion/010021_female_yellow_22/angry.jpg'
image = cv2.imread(image_path)
x = facedetector.predict(image, min_side=256)
print(x)
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Created Date : 2021-02-24 13:58:46
# @Last Modified : 2021-03-05 18:14:17
# @Description :
import tensorflow as tf
from .nets.efficientnet import EfficientNetB0, EfficientNetB1, EfficientNetB2, EfficientNetB3
from .nets.efficientnet import EfficientNetB4, EfficientNetB5, EfficientNetB6, EfficientNetB7
def load_backbone(phi, input_tensor, weights='imagenet'):
if phi == 0:
model = EfficientNetB0(include_top=False,
weights=weights,
input_tensor=input_tensor)
# extract features from these layers
layer_names = [
'block2b_add', # 1/4
'block3b_add', # 1/8
'block5c_add', # 1/16
'block7a_project_bn', # 1/32
]
elif phi == 1:
model = EfficientNetB1(include_top=False,
weights=weights,
input_tensor=input_tensor)
layer_names = [
'block2c_add', # 1/4
'block3c_add', # 1/8
'block5d_add', # 1/16
'block7b_add', # 1/32
]
elif phi == 2:
model = EfficientNetB2(include_top=False,
weights=weights,
input_tensor=input_tensor)
layer_names = [
'block2c_add', # 1/4
'block3c_add', # 1/8
'block5d_add', # 1/16
'block7b_add', # 1/32
]
elif phi == 3:
model = EfficientNetB3(include_top=False,
weights=weights,
input_tensor=input_tensor)
layer_names = [
'block2c_add', # 1/4
'block3c_add', # 1/8
'block5e_add', # 1/16
'block7b_add', # 1/32
]
elif phi == 4:
model = EfficientNetB4(include_top=False,
weights=weights,
input_tensor=input_tensor)
layer_names = [
'block2c_add', # 1/4
'block3d_add', # 1/8
'block5f_add', # 1/16
'block7b_add', # 1/32
]
elif phi == 5:
model = EfficientNetB5(include_top=False,
weights=weights,
input_tensor=input_tensor)
layer_names = [
'block2e_add', # 1/4
'block3e_add', # 1/8
'block5g_add', # 1/16
'block7c_add', # 1/32
]
elif phi == 6:
model = EfficientNetB6(include_top=False,
weights=weights,
input_tensor=input_tensor)
layer_names = [
'block2f_add', # 1/4
'block3f_add', # 1/8
'block5h_add', # 1/16
'block7c_add', # 1/32
]
elif phi == 7:
model = EfficientNetB7(include_top=False,
weights=weights,
input_tensor=input_tensor)
layer_names = [
'block2g_add', # 1/4
'block3g_add', # 1/8
'block5j_add', # 1/16
'block7d_add', # 1/32
]
skips = [model.get_layer(name).output for name in layer_names]
return model, skips
def EasyDet(phi=0, input_size=(None, None, 3), weights='imagenet'):
image_input = tf.keras.layers.Input(shape=input_size)
backbone, skips = load_backbone(phi=phi, input_tensor=image_input, weights=weights)
C2, C3, C4, C5 = skips
in2 = tf.keras.layers.Conv2D(256, (1, 1), padding='same', kernel_initializer='he_normal', name='in2')(C2)
in3 = tf.keras.layers.Conv2D(256, (1, 1), padding='same', kernel_initializer='he_normal', name='in3')(C3)
in4 = tf.keras.layers.Conv2D(256, (1, 1), padding='same', kernel_initializer='he_normal', name='in4')(C4)
in5 = tf.keras.layers.Conv2D(256, (1, 1), padding='same', kernel_initializer='he_normal', name='in5')(C5)
# 1 / 32 * 8 = 1 / 4
P5 = tf.keras.layers.UpSampling2D(size=(8, 8))(
tf.keras.layers.Conv2D(64, (3, 3), padding='same', kernel_initializer='he_normal')(in5))
# 1 / 16 * 4 = 1 / 4
out4 = tf.keras.layers.Add()([in4, tf.keras.layers.UpSampling2D(size=(2, 2))(in5)])
P4 = tf.keras.layers.UpSampling2D(size=(4, 4))(
tf.keras.layers.Conv2D(64, (3, 3), padding='same', kernel_initializer='he_normal')(out4))
# 1 / 8 * 2 = 1 / 4
out3 = tf.keras.layers.Add()([in3, tf.keras.layers.UpSampling2D(size=(2, 2))(out4)])
P3 = tf.keras.layers.UpSampling2D(size=(2, 2))(
tf.keras.layers.Conv2D(64, (3, 3), padding='same', kernel_initializer='he_normal')(out3))
# 1 / 4
P2 = tf.keras.layers.Conv2D(64, (3, 3), padding='same', kernel_initializer='he_normal')(
tf.keras.layers.Add()([in2, tf.keras.layers.UpSampling2D(size=(2, 2))(out3)]))
# (b, 1/4, 1/4, 256)
fuse = tf.keras.layers.Concatenate()([P2, P3, P4, P5])
model = tf.keras.models.Model(inputs=image_input, outputs=fuse)
return model
if __name__ == '__main__':
model = EasyDet(phi=0)
model.summary()
import time
import numpy as np
x = np.random.random_sample((1, 640, 640, 3))
# warm up
output = model.predict(x)
print('\n[INFO] Test start')
time_start = time.time()
for i in range(1000):
output = model.predict(x)
time_end = time.time()
print('[INFO] Time used: {:.2f} ms'.format((time_end - time_start)*1000/(i+1)))
# Copyright 2019 The TensorFlow Authors. All Rights Reserved.
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
# http://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.
# ==============================================================================
# pylint: disable=invalid-name
# pylint: disable=missing-docstring
"""EfficientNet models for Keras.
Reference:
- [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](
https://arxiv.org/abs/1905.11946) (ICML 2019)
"""
from __future__ import absolute_import
from __future__ import division
from __future__ import print_function
import copy
import math
from tensorflow.keras import layers
from tensorflow.python.keras import backend
from tensorflow.python.keras.applications import imagenet_utils
from tensorflow.python.keras.engine import training
# from tensorflow.python.keras.layers import VersionAwareLayers
from tensorflow.python.keras.utils import data_utils
from tensorflow.python.keras.utils import layer_utils
from tensorflow.python.lib.io import file_io
from tensorflow.python.util.tf_export import keras_export
BASE_WEIGHTS_PATH = 'https://storage.googleapis.com/keras-applications/'
WEIGHTS_HASHES = {
'b0': ('902e53a9f72be733fc0bcb005b3ebbac',
'50bc09e76180e00e4465e1a485ddc09d'),
'b1': ('1d254153d4ab51201f1646940f018540',
'74c4e6b3e1f6a1eea24c589628592432'),
'b2': ('b15cce36ff4dcbd00b6dd88e7857a6ad',
'111f8e2ac8aa800a7a99e3239f7bfb39'),
'b3': ('ffd1fdc53d0ce67064dc6a9c7960ede0',
'af6d107764bb5b1abb91932881670226'),
'b4': ('18c95ad55216b8f92d7e70b3a046e2fc',
'ebc24e6d6c33eaebbd558eafbeedf1ba'),
'b5': ('ace28f2a6363774853a83a0b21b9421a',
'38879255a25d3c92d5e44e04ae6cec6f'),
'b6': ('165f6e37dce68623721b423839de8be5',
'9ecce42647a20130c1f39a5d4cb75743'),
'b7': ('8c03f828fec3ef71311cd463b6759d99',
'cbcfe4450ddf6f3ad90b1b398090fe4a'),
}
DEFAULT_BLOCKS_ARGS = [{
'kernel_size': 3,
'repeats': 1,
'filters_in': 32,
'filters_out': 16,
'expand_ratio': 1,
'id_skip': True,
'strides': 1,
'se_ratio': 0.25
}, {
'kernel_size': 3,
'repeats': 2,
'filters_in': 16,
'filters_out': 24,
'expand_ratio': 6,
'id_skip': True,
'strides': 2,
'se_ratio': 0.25
}, {
'kernel_size': 5,
'repeats': 2,
'filters_in': 24,
'filters_out': 40,
'expand_ratio': 6,
'id_skip': True,
'strides': 2,
'se_ratio': 0.25
}, {
'kernel_size': 3,
'repeats': 3,
'filters_in': 40,
'filters_out': 80,
'expand_ratio': 6,
'id_skip': True,
'strides': 2,
'se_ratio': 0.25
}, {
'kernel_size': 5,
'repeats': 3,
'filters_in': 80,
'filters_out': 112,
'expand_ratio': 6,
'id_skip': True,
'strides': 1,
'se_ratio': 0.25
}, {
'kernel_size': 5,
'repeats': 4,
'filters_in': 112,
'filters_out': 192,
'expand_ratio': 6,
'id_skip': True,
'strides': 2,
'se_ratio': 0.25
}, {
'kernel_size': 3,
'repeats': 1,
'filters_in': 192,
'filters_out': 320,
'expand_ratio': 6,
'id_skip': True,
'strides': 1,
'se_ratio': 0.25
}]
CONV_KERNEL_INITIALIZER = {
'class_name': 'VarianceScaling',
'config': {
'scale': 2.0,
'mode': 'fan_out',
'distribution': 'truncated_normal'
}
}
DENSE_KERNEL_INITIALIZER = {
'class_name': 'VarianceScaling',
'config': {
'scale': 1. / 3.,
'mode': 'fan_out',
'distribution': 'uniform'
}
}
# layers = VersionAwareLayers()
BASE_DOCSTRING = """Instantiates the {name} architecture.
Reference:
- [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](
https://arxiv.org/abs/1905.11946) (ICML 2019)
Optionally loads weights pre-trained on ImageNet.
Note that the data format convention used by the model is
the one specified in your Keras config at `~/.keras/keras.json`.
If you have never configured it, it defaults to `"channels_last"`.
Arguments:
include_top: Whether to include the fully-connected
layer at the top of the network. Defaults to True.
weights: One of `None` (random initialization),
'imagenet' (pre-training on ImageNet),
or the path to the weights file to be loaded. Defaults to 'imagenet'.
input_tensor: Optional Keras tensor
(i.e. output of `layers.Input()`)
to use as image input for the model.
input_shape: Optional shape tuple, only to be specified
if `include_top` is False.
It should have exactly 3 inputs channels.
pooling: Optional pooling mode for feature extraction
when `include_top` is `False`. Defaults to None.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
classes: Optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified. Defaults to 1000 (number of
ImageNet classes).
classifier_activation: A `str` or callable. The activation function to use
on the "top" layer. Ignored unless `include_top=True`. Set
`classifier_activation=None` to return the logits of the "top" layer.
Defaults to 'softmax'.
Returns:
A `keras.Model` instance.
"""
def EfficientNet(
width_coefficient,
depth_coefficient,
default_size,
dropout_rate=0.2,
drop_connect_rate=0.2,
depth_divisor=8,
activation='swish',
blocks_args='default',
model_name='efficientnet',
include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000,
classifier_activation='softmax'):
"""Instantiates the EfficientNet architecture using given scaling coefficients.
Reference:
- [EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](
https://arxiv.org/abs/1905.11946) (ICML 2019)
Optionally loads weights pre-trained on ImageNet.
Note that the data format convention used by the model is
the one specified in your Keras config at `~/.keras/keras.json`.
Arguments:
width_coefficient: float, scaling coefficient for network width.
depth_coefficient: float, scaling coefficient for network depth.
default_size: integer, default input image size.
dropout_rate: float, dropout rate before final classifier layer.
drop_connect_rate: float, dropout rate at skip connections.
depth_divisor: integer, a unit of network width.
activation: activation function.
blocks_args: list of dicts, parameters to construct block modules.
model_name: string, model name.
include_top: whether to include the fully-connected
layer at the top of the network.
weights: one of `None` (random initialization),
'imagenet' (pre-training on ImageNet),
or the path to the weights file to be loaded.
input_tensor: optional Keras tensor
(i.e. output of `layers.Input()`)
to use as image input for the model.
input_shape: optional shape tuple, only to be specified
if `include_top` is False.
It should have exactly 3 inputs channels.
pooling: optional pooling mode for feature extraction
when `include_top` is `False`.
- `None` means that the output of the model will be
the 4D tensor output of the
last convolutional layer.
- `avg` means that global average pooling
will be applied to the output of the
last convolutional layer, and thus
the output of the model will be a 2D tensor.
- `max` means that global max pooling will
be applied.
classes: optional number of classes to classify images
into, only to be specified if `include_top` is True, and
if no `weights` argument is specified.
classifier_activation: A `str` or callable. The activation function to use
on the "top" layer. Ignored unless `include_top=True`. Set
`classifier_activation=None` to return the logits of the "top" layer.
Returns:
A `keras.Model` instance.
Raises:
ValueError: in case of invalid argument for `weights`,
or invalid input shape.
ValueError: if `classifier_activation` is not `softmax` or `None` when
using a pretrained top layer.
"""
if blocks_args == 'default':
blocks_args = DEFAULT_BLOCKS_ARGS
if not (weights in {'imagenet', None} or file_io.file_exists_v2(weights)):
raise ValueError('The `weights` argument should be either '
'`None` (random initialization), `imagenet` '
'(pre-training on ImageNet), '
'or the path to the weights file to be loaded.')
if weights == 'imagenet' and include_top and classes != 1000:
raise ValueError('If using `weights` as `"imagenet"` with `include_top`'
' as true, `classes` should be 1000')
# Determine proper input shape
input_shape = imagenet_utils.obtain_input_shape(
input_shape,
default_size=default_size,
min_size=32,
data_format=backend.image_data_format(),
require_flatten=include_top,
weights=weights)
if input_tensor is None:
img_input = layers.Input(shape=input_shape)
else:
if not backend.is_keras_tensor(input_tensor):
img_input = layers.Input(tensor=input_tensor, shape=input_shape)
else:
img_input = input_tensor
bn_axis = 3 if backend.image_data_format() == 'channels_last' else 1
def round_filters(filters, divisor=depth_divisor):
"""Round number of filters based on depth multiplier."""
filters *= width_coefficient
new_filters = max(divisor, int(filters + divisor / 2) // divisor * divisor)
# Make sure that round down does not go down by more than 10%.
if new_filters < 0.9 * filters:
new_filters += divisor
return int(new_filters)
def round_repeats(repeats):
"""Round number of repeats based on depth multiplier."""
return int(math.ceil(depth_coefficient * repeats))
# Build stem
x = img_input
x = layers.experimental.preprocessing.Rescaling(1. / 255.)(x)
x = layers.experimental.preprocessing.Normalization(axis=bn_axis)(x)
x = layers.ZeroPadding2D(
padding=imagenet_utils.correct_pad(x, 3),
name='stem_conv_pad')(x)
x = layers.Conv2D(
round_filters(32),
3,
strides=2,
padding='valid',
use_bias=False,
kernel_initializer=CONV_KERNEL_INITIALIZER,
name='stem_conv')(x)
x = layers.BatchNormalization(axis=bn_axis, name='stem_bn')(x)
x = layers.Activation(activation, name='stem_activation')(x)
# Build blocks
blocks_args = copy.deepcopy(blocks_args)
b = 0
blocks = float(sum(round_repeats(args['repeats']) for args in blocks_args))
for (i, args) in enumerate(blocks_args):
assert args['repeats'] > 0
# Update block input and output filters based on depth multiplier.
args['filters_in'] = round_filters(args['filters_in'])
args['filters_out'] = round_filters(args['filters_out'])
for j in range(round_repeats(args.pop('repeats'))):
# The first block needs to take care of stride and filter size increase.
if j > 0:
args['strides'] = 1
args['filters_in'] = args['filters_out']
x = block(
x,
activation,
drop_connect_rate * b / blocks,
name='block{}{}_'.format(i + 1, chr(j + 97)),
**args)
b += 1
# Build top
x = layers.Conv2D(
round_filters(1280),
1,
padding='same',
use_bias=False,
kernel_initializer=CONV_KERNEL_INITIALIZER,
name='top_conv')(x)
x = layers.BatchNormalization(axis=bn_axis, name='top_bn')(x)
x = layers.Activation(activation, name='top_activation')(x)
if include_top:
x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
if dropout_rate > 0:
x = layers.Dropout(dropout_rate, name='top_dropout')(x)
imagenet_utils.validate_activation(classifier_activation, weights)
x = layers.Dense(
classes,
activation=classifier_activation,
kernel_initializer=DENSE_KERNEL_INITIALIZER,
name='predictions')(x)
else:
if pooling == 'avg':
x = layers.GlobalAveragePooling2D(name='avg_pool')(x)
elif pooling == 'max':
x = layers.GlobalMaxPooling2D(name='max_pool')(x)
# Ensure that the model takes into account
# any potential predecessors of `input_tensor`.
if input_tensor is not None:
inputs = layer_utils.get_source_inputs(input_tensor)
else:
inputs = img_input
# Create model.
model = training.Model(inputs, x, name=model_name)
# Load weights.
if weights == 'imagenet':
if include_top:
file_suffix = '.h5'
file_hash = WEIGHTS_HASHES[model_name[-2:]][0]
else:
file_suffix = '_notop.h5'
file_hash = WEIGHTS_HASHES[model_name[-2:]][1]
file_name = model_name + file_suffix
weights_path = data_utils.get_file(
file_name,
BASE_WEIGHTS_PATH + file_name,
cache_subdir='models',
file_hash=file_hash)
model.load_weights(weights_path)
elif weights is not None:
model.load_weights(weights)
return model
def block(inputs,
activation='swish',
drop_rate=0.,
name='',
filters_in=32,
filters_out=16,
kernel_size=3,
strides=1,
expand_ratio=1,
se_ratio=0.,
id_skip=True):
"""An inverted residual block.
Arguments:
inputs: input tensor.
activation: activation function.
drop_rate: float between 0 and 1, fraction of the input units to drop.
name: string, block label.
filters_in: integer, the number of input filters.
filters_out: integer, the number of output filters.
kernel_size: integer, the dimension of the convolution window.
strides: integer, the stride of the convolution.
expand_ratio: integer, scaling coefficient for the input filters.
se_ratio: float between 0 and 1, fraction to squeeze the input filters.
id_skip: boolean.
Returns:
output tensor for the block.
"""
bn_axis = 3 if backend.image_data_format() == 'channels_last' else 1
# Expansion phase
filters = filters_in * expand_ratio
if expand_ratio != 1:
x = layers.Conv2D(
filters,
1,
padding='same',
use_bias=False,
kernel_initializer=CONV_KERNEL_INITIALIZER,
name=name + 'expand_conv')(
inputs)
x = layers.BatchNormalization(axis=bn_axis, name=name + 'expand_bn')(x)
x = layers.Activation(activation, name=name + 'expand_activation')(x)
else:
x = inputs
# Depthwise Convolution
if strides == 2:
x = layers.ZeroPadding2D(
padding=imagenet_utils.correct_pad(x, kernel_size),
name=name + 'dwconv_pad')(x)
conv_pad = 'valid'
else:
conv_pad = 'same'
x = layers.DepthwiseConv2D(
kernel_size,
strides=strides,
padding=conv_pad,
use_bias=False,
depthwise_initializer=CONV_KERNEL_INITIALIZER,
name=name + 'dwconv')(x)
x = layers.BatchNormalization(axis=bn_axis, name=name + 'bn')(x)
x = layers.Activation(activation, name=name + 'activation')(x)
# Squeeze and Excitation phase
if 0 < se_ratio <= 1:
filters_se = max(1, int(filters_in * se_ratio))
se = layers.GlobalAveragePooling2D(name=name + 'se_squeeze')(x)
se = layers.Reshape((1, 1, filters), name=name + 'se_reshape')(se)
se = layers.Conv2D(
filters_se,
1,
padding='same',
activation=activation,
kernel_initializer=CONV_KERNEL_INITIALIZER,
name=name + 'se_reduce')(
se)
se = layers.Conv2D(
filters,
1,
padding='same',
activation='sigmoid',
kernel_initializer=CONV_KERNEL_INITIALIZER,
name=name + 'se_expand')(se)
x = layers.multiply([x, se], name=name + 'se_excite')
# Output phase
x = layers.Conv2D(
filters_out,
1,
padding='same',
use_bias=False,
kernel_initializer=CONV_KERNEL_INITIALIZER,
name=name + 'project_conv')(x)
x = layers.BatchNormalization(axis=bn_axis, name=name + 'project_bn')(x)
if id_skip and strides == 1 and filters_in == filters_out:
if drop_rate > 0:
x = layers.Dropout(
drop_rate, noise_shape=(None, 1, 1, 1), name=name + 'drop')(x)
x = layers.add([x, inputs], name=name + 'add')
return x
@keras_export('keras.applications.efficientnet.EfficientNetB0',
'keras.applications.EfficientNetB0')
def EfficientNetB0(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000,
classifier_activation='softmax',
**kwargs):
return EfficientNet(
1.0,
1.0,
224,
0.2,
model_name='efficientnetb0',
include_top=include_top,
weights=weights,
input_tensor=input_tensor,
input_shape=input_shape,
pooling=pooling,
classes=classes,
classifier_activation=classifier_activation,
**kwargs)
@keras_export('keras.applications.efficientnet.EfficientNetB1',
'keras.applications.EfficientNetB1')
def EfficientNetB1(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000,
classifier_activation='softmax',
**kwargs):
return EfficientNet(
1.0,
1.1,
240,
0.2,
model_name='efficientnetb1',
include_top=include_top,
weights=weights,
input_tensor=input_tensor,
input_shape=input_shape,
pooling=pooling,
classes=classes,
classifier_activation=classifier_activation,
**kwargs)
@keras_export('keras.applications.efficientnet.EfficientNetB2',
'keras.applications.EfficientNetB2')
def EfficientNetB2(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000,
classifier_activation='softmax',
**kwargs):
return EfficientNet(
1.1,
1.2,
260,
0.3,
model_name='efficientnetb2',
include_top=include_top,
weights=weights,
input_tensor=input_tensor,
input_shape=input_shape,
pooling=pooling,
classes=classes,
classifier_activation=classifier_activation,
**kwargs)
@keras_export('keras.applications.efficientnet.EfficientNetB3',
'keras.applications.EfficientNetB3')
def EfficientNetB3(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000,
classifier_activation='softmax',
**kwargs):
return EfficientNet(
1.2,
1.4,
300,
0.3,
model_name='efficientnetb3',
include_top=include_top,
weights=weights,
input_tensor=input_tensor,
input_shape=input_shape,
pooling=pooling,
classes=classes,
classifier_activation=classifier_activation,
**kwargs)
@keras_export('keras.applications.efficientnet.EfficientNetB4',
'keras.applications.EfficientNetB4')
def EfficientNetB4(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000,
classifier_activation='softmax',
**kwargs):
return EfficientNet(
1.4,
1.8,
380,
0.4,
model_name='efficientnetb4',
include_top=include_top,
weights=weights,
input_tensor=input_tensor,
input_shape=input_shape,
pooling=pooling,
classes=classes,
classifier_activation=classifier_activation,
**kwargs)
@keras_export('keras.applications.efficientnet.EfficientNetB5',
'keras.applications.EfficientNetB5')
def EfficientNetB5(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000,
classifier_activation='softmax',
**kwargs):
return EfficientNet(
1.6,
2.2,
456,
0.4,
model_name='efficientnetb5',
include_top=include_top,
weights=weights,
input_tensor=input_tensor,
input_shape=input_shape,
pooling=pooling,
classes=classes,
classifier_activation=classifier_activation,
**kwargs)
@keras_export('keras.applications.efficientnet.EfficientNetB6',
'keras.applications.EfficientNetB6')
def EfficientNetB6(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000,
classifier_activation='softmax',
**kwargs):
return EfficientNet(
1.8,
2.6,
528,
0.5,
model_name='efficientnetb6',
include_top=include_top,
weights=weights,
input_tensor=input_tensor,
input_shape=input_shape,
pooling=pooling,
classes=classes,
classifier_activation=classifier_activation,
**kwargs)
@keras_export('keras.applications.efficientnet.EfficientNetB7',
'keras.applications.EfficientNetB7')
def EfficientNetB7(include_top=True,
weights='imagenet',
input_tensor=None,
input_shape=None,
pooling=None,
classes=1000,
classifier_activation='softmax',
**kwargs):
return EfficientNet(
2.0,
3.1,
600,
0.5,
model_name='efficientnetb7',
include_top=include_top,
weights=weights,
input_tensor=input_tensor,
input_shape=input_shape,
pooling=pooling,
classes=classes,
classifier_activation=classifier_activation,
**kwargs)
EfficientNetB0.__doc__ = BASE_DOCSTRING.format(name='EfficientNetB0')
EfficientNetB1.__doc__ = BASE_DOCSTRING.format(name='EfficientNetB1')
EfficientNetB2.__doc__ = BASE_DOCSTRING.format(name='EfficientNetB2')
EfficientNetB3.__doc__ = BASE_DOCSTRING.format(name='EfficientNetB3')
EfficientNetB4.__doc__ = BASE_DOCSTRING.format(name='EfficientNetB4')
EfficientNetB5.__doc__ = BASE_DOCSTRING.format(name='EfficientNetB5')
EfficientNetB6.__doc__ = BASE_DOCSTRING.format(name='EfficientNetB6')
EfficientNetB7.__doc__ = BASE_DOCSTRING.format(name='EfficientNetB7')
@keras_export('keras.applications.efficientnet.preprocess_input')
def preprocess_input(x, data_format=None): # pylint: disable=unused-argument
return x
@keras_export('keras.applications.efficientnet.decode_predictions')
def decode_predictions(preds, top=5):
return imagenet_utils.decode_predictions(preds, top=top)
decode_predictions.__doc__ = imagenet_utils.decode_predictions.__doc__
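# ---------------------------------------------------------------------------
# Usage sketch (not part of the vendored Keras source): how one of the factory
# functions above might be used to build a feature-extraction backbone. Assumes
# TensorFlow/Keras is installed; the ImageNet weights are downloaded on first use.
if __name__ == '__main__':
    backbone = EfficientNetB0(include_top=False, weights='imagenet',
                              input_shape=(224, 224, 3), pooling='avg')
    backbone.summary()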
\ No newline at end of file
from . import angle_detector
\ No newline at end of file
# -*- coding: utf-8 -*-
# @Author : lk
# @Email : 9428.al@gmail.com
# @Created Date : 2019-09-03 15:40:54
# @Last Modified : 2022-07-18 16:10:36
# @Description :
import os
import cv2
import time
import numpy as np
# import tensorflow as tf
# import grpc
# from tensorflow_serving.apis import predict_pb2
# from tensorflow_serving.apis import prediction_service_pb2_grpc
import tritonclient.grpc as grpcclient
def resize(image, width=None, height=None, inter=cv2.INTER_AREA):
'''
Resize the input image according to the dimensions and keep aspect ratio of this image
'''
dim = None
(h, w) = image.shape[:2]
# if both the width and height are None, then return the original image
if width is None and height is None:
return image
# check to see if the width is None
if width is None:
# calculate the ratio of the height and construct the dimensions
r = height / float(h)
dim = (int(w * r), height)
# otherwise, the height is None
else:
# calculate the ratio of the width and construct the dimensions
r = width / float(w)
dim = (width, int(h * r))
# resize the image
resized = cv2.resize(image, dim, interpolation=inter)
return resized
def predict(image):
ROTATE = [0, 90, 180, 270]
# pre-process the image for classification
    # Strategy 1: resize directly to the target size
    # image = cv2.resize(image, (512, 512))
    # Strategy 2: resize the short side to the target size and scale the long side proportionally
    short_side = 768
    if min(image.shape[:2]) > short_side:
        image = resize(image, width=short_side) if image.shape[0] > image.shape[1] else resize(image, height=short_side)
    # Strategy 3: resize with padding
    # image = resize_image_with_pad(image, 1024, 1024)
    # Strategy 4: use the original image as-is
    # image = image
image = np.array(image, dtype="float32")
image = 2 * (image / 255.0) - 1 # Let data input to be normalized to the [-1,1] range
input_data = np.expand_dims(image, 0)
# options = [('grpc.max_send_message_length', 1000 * 1024 * 1024),
# ('grpc.max_receive_message_length', 1000 * 1024 * 1024)]
# channel = grpc.insecure_channel('localhost:8500', options=options)
# stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
# request = predict_pb2.PredictRequest()
# request.model_spec.name = 'adc_model'
# request.model_spec.signature_name = 'serving_default'
# request.inputs['input_1'].CopyFrom(tf.make_tensor_proto(inputs))
# result = stub.Predict(request, 100.0) # 100 secs timeout
# preds = tf.make_ndarray(result.outputs['dense'])
triton_client = grpcclient.InferenceServerClient("localhost:8001")
# Initialize the data
    inputs = [grpcclient.InferInput('input_1', input_data.shape, "FP32")]  # InferInput describes one input tensor of the inference request
    inputs[0].set_data_from_numpy(input_data)  # bind the tensor data of the numpy array to this input
outputs = [grpcclient.InferRequestedOutput("dense")]
# Inference
results = triton_client.infer(
model_name="adc_model",
inputs=inputs,
outputs=outputs
)
# Get the output arrays from the results
preds = results.as_numpy("dense")
index = np.argmax(preds, axis=-1)[0]
return index
# return ROTATE[index]
def DegreeTrans(theta):
'''
Convert radians to angles
'''
res = theta / np.pi * 180
return res
def rotateImage(src, degree):
'''
Calculate the rotation matrix and rotate the image
param src:image after rot90
param degree:the Hough degree
'''
h, w = src.shape[:2]
RotateMatrix = cv2.getRotationMatrix2D((w/2.0, h/2.0), degree, 1)
# affine transformation, background color fills white
rotate = cv2.warpAffine(src, RotateMatrix, (w, h), borderValue=(255, 255, 255))
return rotate
def CalcDegree(srcImage):
'''
Calculating angles by Hough transform
param srcImage:image after rot90
'''
midImage = cv2.cvtColor(srcImage, cv2.COLOR_BGR2GRAY)
dstImage = cv2.Canny(midImage, 100, 300, 3)
lineimage = srcImage.copy()
    # Detect straight lines with the Hough transform
    # The 4th parameter (th) is the accumulator threshold; the larger it is, the stricter the detection
    th = 500
while True:
if th > 0:
lines = cv2.HoughLines(dstImage, 1, np.pi/180, th)
else:
lines = None
break
if lines is not None:
if len(lines) > 10:
break
else:
th -= 50
# print ('阈值是:', th)
else:
th -= 100
# print ('阈值是:', th)
continue
sum_theta = 0
num_theta = 0
if lines is not None:
for i in range(len(lines)):
for rho, theta in lines[i]:
                # keep only near-vertical lines: theta within roughly +/-30 degrees of pi/2 (1.0 < theta < 2.1 rad)
                if theta > 1 and theta < 2.1:
sum_theta += theta
num_theta += 1
# Average all angles
if num_theta == 0:
average = np.pi/2
else:
average = sum_theta / num_theta
return DegreeTrans(average) - 90
def ADC(image, fine_degree=False):
    '''
    Angle Detection and Correction.
    Returns:
        rotate: the corrected image
        angle_degree: the rotation offset of the input image, in degrees
    '''
    # Get the coarse rotation index (a multiple of 90 degrees)
img = np.copy(image)
angle_index = predict(img)
img_rot = np.rot90(img, -angle_index)
# if fine_degree then the image will be corrected more accurately based on character line features.
if fine_degree:
degree = CalcDegree(img_rot)
angle_degree = (angle_index * 90 - degree) % 360
rotate = rotateImage(img_rot, degree)
return rotate, angle_degree
return img_rot, int(angle_index*90)
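# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): assumes a Triton Inference
# Server is reachable at localhost:8001 and serves 'adc_model'; the image path
# below is a placeholder.
if __name__ == '__main__':
    test_image = cv2.imread('/path/to/document.jpg')
    corrected, angle_degree = ADC(test_image, fine_degree=False)
    print('detected rotation offset: %s degrees' % angle_degree)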
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Create Date : 2022-07-28 19:40:10
# @Last Modified : 2022-09-08 18:00:40
# @Description :
from .text_rec import textRecServer
text_recognizer = textRecServer()
alphabet = """ \
[... character-set body omitted: the original alphabets.py lists, one character per line with backslash continuations, several thousand common CJK characters followed by fullwidth and ASCII punctuation marks; most of these characters were not preserved in this view ...]\
0\
1\
2\
3\
4\
5\
6\
7\
8\
9\
a\
b\
c\
d\
e\
f\
g\
h\
i\
j\
k\
l\
m\
n\
o\
p\
q\
r\
s\
t\
u\
v\
w\
x\
y\
z\
A\
B\
C\
D\
E\
F\
G\
H\
I\
J\
K\
L\
M\
N\
O\
P\
Q\
R\
S\
T\
U\
V\
W\
X\
Y\
Z"""
import cv2
import time
import numpy as np
from .alphabets import alphabet
import tritonclient.grpc as grpcclient
def sort_poly(p):
# Find the minimum coordinate using (Xi+Yi)
min_axis = np.argmin(np.sum(p, axis=1))
# Sort the box coordinates
p = p[[min_axis, (min_axis + 1) % 4, (min_axis + 2) % 4, (min_axis + 3) % 4]]
if abs(p[0, 0] - p[1, 0]) > abs(p[0, 1] - p[1, 1]):
return p
else:
return p[[0, 3, 2, 1]]
def client_init(url="localhost:8001",
ssl=False, private_key=None, root_certificates=None, certificate_chain=None,
verbose=False):
triton_client = grpcclient.InferenceServerClient(
url=url,
verbose=verbose,
ssl=ssl,
root_certificates=root_certificates,
private_key=private_key,
certificate_chain=certificate_chain)
return triton_client
class textRecServer:
    """CRNN text recognition client backed by a Triton Inference Server model.
    """
def __init__(self):
super().__init__()
self.charactersS = ' ' + alphabet
self.batchsize = 8
self.input_name = 'INPUT__0'
self.output_name = 'OUTPUT__0'
self.model_name = 'text_rec_torch'
self.np_type = np.float32
self.quant_type = "FP32"
self.compression_algorithm = None
self.outputs = []
self.outputs.append(grpcclient.InferRequestedOutput(self.output_name))
def preprocess_one_image(self, image):
_, w, _ = image.shape
image = self._transform(image, w)
return image
    def predict_batch(self, im, boxes):
        """Recognise the text inside each detected box.
        Args:
            im (ndarray): RGB image
            boxes (list): quadrilateral text boxes, each an array of shape (4, 2)
        Returns:
            tuple: (dict mapping result index to [box, label], recognition time in seconds)
        """
triton_client = client_init("localhost:8001")
count_boxes = len(boxes)
boxes = sorted(boxes,
key=lambda box: int(32.0 * (np.linalg.norm(box[0] - box[1])) / (np.linalg.norm(box[3] - box[0]))),
reverse=True)
results = {}
labels = []
rectime = 0.0
if len(boxes) != 0:
for i in range(len(boxes) // self.batchsize + int(len(boxes) % self.batchsize != 0)):
box = boxes[min(len(boxes)-1, i * self.batchsize)]
w, h = [int(np.linalg.norm(box[0] - box[1])), int(np.linalg.norm(box[3] - box[0]))]
width = max(32, min(int(32.0 * w / h), 960))
if width < 32:
continue
slices = []
for index, box in enumerate(boxes[i * self.batchsize:(i + 1) * self.batchsize]):
_box = [n for a in box for n in a]
if i * self.batchsize + index < count_boxes:
results[i * self.batchsize + index] = [list(map(int, _box))]
w, h = [int(np.linalg.norm(box[0] - box[1])), int(np.linalg.norm(box[3] - box[0]))]
pts1 = np.float32(box)
pts2 = np.float32([[0, 0], [w, 0], [w, h], [0, h]])
                    # Preprocessing optimisation: crop the bounding rectangle first, then warp only that slice
xmin, ymin, _w, _h = cv2.boundingRect(pts1)
xmax, ymax = xmin+_w, ymin+_h
xmin, ymin = max(0, xmin), max(0, ymin)
im_sclice = im[int(ymin):int(ymax), int(xmin):int(xmax), :]
pts1[:, 0] -= xmin
pts1[:, 1] -= ymin
M = cv2.getPerspectiveTransform(pts1, pts2)
im_crop = cv2.warpPerspective(im_sclice, M, (w, h))
im_crop = self._transform(im_crop, width)
slices.append(im_crop)
start_rec = time.time()
slices = self.np_type(slices)
slices = slices.transpose(0, 3, 1, 2)
slices = slices/127.5-1.
inputs = []
inputs.append(grpcclient.InferInput(self.input_name, list(slices.shape), self.quant_type))
inputs[0].set_data_from_numpy(slices)
# inference
preds = triton_client.infer(
model_name=self.model_name,
inputs=inputs,
outputs=self.outputs,
compression_algorithm=self.compression_algorithm
)
preds = preds.as_numpy(self.output_name).copy()
preds = preds.transpose(1, 0)
tmp_labels = self.decode(preds)
rectime += (time.time() - start_rec)
labels.extend(tmp_labels)
for index, label in enumerate(labels[:count_boxes]):
label = label.replace(' ', '').replace('¥', '¥')
if label == '':
del results[index]
continue
results[index].append(label)
            # Re-order the results by reading order
            results = list(results.values())
            results = sorted(results, key=lambda x: x[0][1], reverse=False)  # sort by y0, top to bottom
keys = [str(i) for i in range(len(results))]
results = dict(zip(keys, results))
else:
results = dict()
rectime = -1
return results, rectime
def decode(self, preds):
res = []
for t in preds:
length = len(t)
char_list = []
for i in range(length):
if t[i] != 0 and (not (i > 0 and t[i-1] == t[i])):
char_list.append(self.charactersS[t[i]])
res.append(u''.join(char_list))
return res
def _transform(self, im, width):
height=32
ori_h, ori_w = im.shape[:2]
ratio1 = width * 1.0 / ori_w
ratio2 = height * 1.0 / ori_h
if ratio1 < ratio2:
ratio = ratio1
else:
ratio = ratio2
new_w, new_h = int(ori_w * ratio), int(ori_h * ratio)
if new_w<4:
new_w = 4
im = cv2.resize(im, (new_w, new_h))
img = np.ones((height, width, 3), dtype=np.uint8)*230
img[:im.shape[0], :im.shape[1], :] = im
return img
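# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): assumes Triton serves
# 'text_rec_torch' at localhost:8001. The box below is a hypothetical
# quadrilateral; in practice the boxes come from the DBNet text detector.
# Because of the relative import above, run this as a module inside its
# package (python -m ...), not as a loose script.
if __name__ == '__main__':
    server = textRecServer()
    bgr = cv2.imread('/path/to/document.jpg')   # placeholder path
    rgb = cv2.cvtColor(bgr, cv2.COLOR_BGR2RGB)  # predict_batch expects RGB
    boxes = [np.float32([[10, 10], [210, 10], [210, 42], [10, 42]])]
    ocr_results, rec_time = server.predict_batch(rgb, boxes)
    print(ocr_results, 'recognition time: %.3fs' % rec_time)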
from . import text_detector
\ No newline at end of file
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Create Date : 2022-06-01 19:00:18
# @Last Modified : 2022-07-15 11:41:25
# @Description :
import os
import cv2
import time
import pyclipper
import numpy as np
# import tensorflow as tf
from shapely.geometry import Polygon
# import grpc
# from tensorflow_serving.apis import predict_pb2
# from tensorflow_serving.apis import prediction_service_pb2_grpc
import tritonclient.grpc as grpcclient
def resize_with_padding(src, limit_max=1024):
    '''Resize so that the long side is no larger than limit_max, scale the short side proportionally, and pad the canvas with zeros.'''
img = src.copy()
h, w, _ = img.shape
max_side = max(h, w)
ratio = limit_max / max_side if max_side > limit_max else 1
h, w = int(h * ratio), int(w * ratio)
proc = cv2.resize(img, (w, h))
canvas = np.zeros((limit_max, limit_max, 3), dtype=np.float32)
canvas[0:h, 0:w, :] = proc
return canvas, ratio
def rectangle_boxes_zoom(boxes, offset=1):
'''Scale the rectangle boxes via offset
Input:
boxes: with shape (-1, 4, 2)
offset: how many pix do you wanna zoom, we recommend less than 5
Output:
boxes: zoomed
'''
boxes = np.array(boxes)
boxes += [[[-offset,-offset], [offset,-offset], [offset,offset], [-offset,offset]]]
return boxes
def polygons_from_probmap(preds, ratio):
    # Binarise the probability map
    prob_map_pred = np.array(preds, dtype=np.uint8)[0,:,:,0]
    # findContours input: binary image, contour retrieval (hierarchy) mode, contour approximation method
    # output: contours, hierarchy
contours, hierarchy = cv2.findContours(prob_map_pred, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
boxes = []
for contour in contours:
if len(contour) < 4:
continue
# Vatti clipping
polygon = Polygon(np.array(contour).reshape((-1, 2))).buffer(0)
        polygon = polygon.convex_hull if polygon.type == 'MultiPolygon' else polygon  # Note: this is intentional, not a bug
if polygon.area < 10:
continue
distance = polygon.area * 1.5 / polygon.length
offset = pyclipper.PyclipperOffset()
offset.AddPath(list(polygon.exterior.coords), pyclipper.JT_ROUND, pyclipper.ET_CLOSEDPOLYGON)
        expanded = np.array(offset.Execute(distance)[0])  # Note: this is intentional, not a bug
# Convert polygon to rectangle
rect = cv2.minAreaRect(expanded)
box = cv2.boxPoints(rect)
# make clock-wise order
box = np.roll(box, 4-box.sum(axis=1).argmin(), 0)
box = np.array(box/ratio, dtype=np.int32)
boxes.append(box)
return boxes
def predict(image):
image_resized, ratio = resize_with_padding(image, limit_max=1280)
input_data = np.expand_dims(image_resized/255., axis=0)
# options = [('grpc.max_send_message_length', 1000 * 1024 * 1024),
# ('grpc.max_receive_message_length', 1000 * 1024 * 1024)]
# channel = grpc.insecure_channel('localhost:8500', options=options)
# stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
# request = predict_pb2.PredictRequest()
# request.model_spec.name = 'dbnet_model'
# request.model_spec.signature_name = 'serving_default'
# request.inputs['input_1'].CopyFrom(tf.make_tensor_proto(inputs))
# result = stub.Predict(request, 100.0) # 100 secs timeout
# preds = tf.make_ndarray(result.outputs['tf.math.greater'])
triton_client = grpcclient.InferenceServerClient("localhost:8001")
# Initialize the data
inputs = [grpcclient.InferInput('input_1', input_data.shape, "FP32")]
inputs[0].set_data_from_numpy(input_data)
outputs = [grpcclient.InferRequestedOutput("tf.math.greater")]
# Inference
results = triton_client.infer(
model_name="dbnet_model",
inputs=inputs,
outputs=outputs
)
# Get the output arrays from the results
preds = results.as_numpy("tf.math.greater")
boxes = polygons_from_probmap(preds, ratio)
#boxes = rectangle_boxes_zoom(boxes, offset=0)
return boxes
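# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): assumes Triton serves
# 'dbnet_model' at localhost:8001; the image path is a placeholder.
if __name__ == '__main__':
    image = cv2.imread('/path/to/document.jpg')
    boxes = predict(image)
    print('found %d text boxes' % len(boxes))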
# import the necessary packages
from .ADC import angle_detector
from .DBNet import text_detector
from .CRNN import text_recognizer
from .object_det import object_detector
from .signature_det import signature_detector
\ No newline at end of file
from .utils import ObjectDetection
object_detector = ObjectDetection()
\ No newline at end of file
# import grpc
import turnsole
import numpy as np
# import tensorflow as tf
# from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc
import tritonclient.grpc as grpcclient
class ObjectDetection():
"""通用文件检测算法
输入图片输出检测结果
API 文档请参阅:
"""
def __init__(self, confidence_threshold=0.5):
"""初始化检测对象
Args:
confidence_threshold (float, optional): 目标检测模型的分类置信度
"""
self.lable2index = {
'id_card_info': 0,
'id_card_guohui': 1,
'lssfz_front': 2,
'lssfz_back': 3,
'jzz_front': 4,
'jzz_back': 5,
'txz_front': 6,
'txz_back': 7,
'bank_card': 8,
'vehicle_license_front': 9,
'vehicle_license_back': 10,
'driving_license_front': 11,
'driving_license_back': 12,
'vrc_page_12': 13,
'vrc_page_34': 14,
}
self.index2lable = list(self.lable2index.keys())
# def resize_and_pad_to_384(self, image, jitter=True):
# """长边在 256-384 之间随机取一个数,四边 pad 到 384
# Args:
# image (TYPE): An image represented as a numpy ndarray.
# """
# image_shape = tf.cast(tf.shape(image)[:2], dtype=tf.float32)
# max_side = tf.random.uniform(
# (), 256, 384, dtype=tf.float32) if jitter else 384.
# ratio = max_side / tf.reduce_max(image_shape)
# image_shape = tf.cast(ratio * image_shape, dtype=tf.int32)
# image = tf.image.resize(image, image_shape)
# image = tf.image.pad_to_bounding_box(image, 0, 0, 384, 384)
# return image, ratio
def process(self, image):
"""Processes an image and returns a list of the detected object location and classes data.
Args:
image (TYPE): An image represented as a numpy ndarray.
"""
h, w, _ = image.shape
# image, ratio = self.resize_and_pad_to_384(image, jitter=False)
image, ratio = turnsole.resize_with_pad(image, target_height=384, target_width=384)
input_data = np.expand_dims(image/255., axis=0)
# options = [('grpc.max_send_message_length', 1000 * 1024 * 1024),
# ('grpc.max_receive_message_length', 1000 * 1024 * 1024)]
# channel = grpc.insecure_channel('localhost:8500', options=options)
# stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
# request = predict_pb2.PredictRequest()
# request.model_spec.name = 'object_detection'
# request.model_spec.signature_name = 'serving_default'
# request.inputs['image'].CopyFrom(tf.make_tensor_proto(inputs, dtype='float32'))
# # 100 secs timeout
# result = stub.Predict(request, 100.0)
# # saved_model_cli show --dir saved_model/ --all # 查看 saved model 的输入输出
# boxes = tf.make_ndarray(result.outputs['decode_predictions'])
# scores = tf.make_ndarray(result.outputs['decode_predictions_1'])
# classes = tf.make_ndarray(result.outputs['decode_predictions_2'])
# valid_detections = tf.make_ndarray(
# result.outputs['decode_predictions_3'])
triton_client = grpcclient.InferenceServerClient("localhost:8001")
# Initialize the data
inputs = [grpcclient.InferInput('image', input_data.shape, "FP32")]
inputs[0].set_data_from_numpy(input_data.astype('float32'))
outputs = [
grpcclient.InferRequestedOutput("decode_predictions"),
grpcclient.InferRequestedOutput("decode_predictions_1"),
grpcclient.InferRequestedOutput("decode_predictions_2"),
grpcclient.InferRequestedOutput("decode_predictions_3")
]
# Inference
results = triton_client.infer(
model_name="object_detection",
inputs=inputs,
outputs=outputs
)
# Get the output arrays from the results
boxes = results.as_numpy("decode_predictions")
scores = results.as_numpy("decode_predictions_1")
classes = results.as_numpy("decode_predictions_2")
valid_detections = results.as_numpy("decode_predictions_3")
boxes = boxes[0][:valid_detections[0]]
scores = scores[0][:valid_detections[0]]
classes = classes[0][:valid_detections[0]]
object_list = []
for box, score, class_index in zip(boxes, scores, classes):
xmin, ymin, xmax, ymax = box / ratio
xmin = max(0, int(xmin))
ymin = max(0, int(ymin))
xmax = min(w, int(xmax))
ymax = min(h, int(ymax))
class_label = self.index2lable[int(class_index)]
item = {
"label": class_label,
"confidence": float(score),
"location": {
"xmin": xmin,
"ymin": ymin,
"xmax": xmax,
"ymax": ymax
}
}
object_list.append(item)
return object_list
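# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): assumes Triton serves
# 'object_detection' at localhost:8001; the image path is a placeholder.
if __name__ == '__main__':
    import cv2  # only needed for this demo
    detector = ObjectDetection()
    image = cv2.imread('/path/to/card_image.jpg')
    for obj in detector.process(image):
        print(obj['label'], obj['confidence'], obj['location'])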
\ No newline at end of file
# -*- coding: utf-8 -*-
# @Author : lk
# @Email : 9428.al@gmail.com
# @Create Date : 2022-06-28 14:38:57
# @Last Modified : 2022-09-06 14:37:47
# @Description :
from .utils import SignatureDetection
signature_detector = SignatureDetection()
\ No newline at end of file
# -*- coding: utf-8 -*-
# @Author : lk
# @Email : 9428.al@gmail.com
# @Create Date : 2022-02-08 14:10:00
# @Last Modified : 2022-09-06 14:45:10
# @Description :
import turnsole
import numpy as np
# import tensorflow as tf
# import grpc
# from tensorflow_serving.apis import predict_pb2
# from tensorflow_serving.apis import prediction_service_pb2_grpc
import tritonclient.grpc as grpcclient
# def resize_and_pad_to_1024(image, jitter=True):
# # 长边在 512-1024 之间随机取一个数,四边 pad 到 1024
# image_shape = tf.cast(tf.shape(image)[:2], dtype=tf.float32)
# max_side = tf.random.uniform((), 512, 1024, dtype=tf.float32) if jitter else 1024.
# ratio = max_side / tf.reduce_max(image_shape)
# image_shape = tf.cast(ratio * image_shape, dtype=tf.int32)
# image = tf.image.resize(image, image_shape)
# image = tf.image.pad_to_bounding_box(image, 0, 0, 1024, 1024)
# return image, ratio
class SignatureDetection():
"""签字盖章检测算法
输入图片输出检测结果
API 文档请参阅:
"""
def __init__(self, confidence_threshold=0.5):
"""初始化检测对象
Args:
confidence_threshold (float, optional): 目标检测模型的分类置信度
"""
self.lable2index = {
'circle': 0,
'ellipse': 1,
'rectangle': 2,
'signature': 3,
'qr_code': 4,
'bar_code': 5
}
self.index2lable = {
0: 'circle',
1: 'ellipse',
2: 'rectangle',
3: 'signature',
4: 'qr_code',
5: 'bar_code'
}
def process(self, image):
"""Processes an image and returns a list of the detected signature location and classes data.
Args:
image (TYPE): An image represented as a numpy ndarray.
"""
h, w, _ = image.shape
# image, ratio = resize_and_pad_to_1024(image, jitter=False)
image, ratio = turnsole.resize_with_pad(image, target_height=1024, target_width=1024)
input_data = np.expand_dims(np.float32(image/255.), axis=0)
# options = [('grpc.max_send_message_length', 1000 * 1024 * 1024),
# ('grpc.max_receive_message_length', 1000 * 1024 * 1024)]
# channel = grpc.insecure_channel('localhost:8500', options=options)
# stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)
# request = predict_pb2.PredictRequest()
# request.model_spec.name = 'signature_model'
# request.model_spec.signature_name = 'serving_default'
# request.inputs['image'].CopyFrom(tf.make_tensor_proto(inputs, dtype='float32'))
# result = stub.Predict(request, 100.0) # 100 secs timeout
# # saved_model_cli show --dir saved_model/ --all # 查看 saved model 的输入输出
# boxes = tf.make_ndarray(result.outputs['decode_predictions'])
# scores = tf.make_ndarray(result.outputs['decode_predictions_1'])
# classes = tf.make_ndarray(result.outputs['decode_predictions_2'])
# valid_detections = tf.make_ndarray(result.outputs['decode_predictions_3'])
triton_client = grpcclient.InferenceServerClient("localhost:8001")
# Initialize the data
inputs = [grpcclient.InferInput('image', input_data.shape, "FP32")]
inputs[0].set_data_from_numpy(input_data)
outputs = [
grpcclient.InferRequestedOutput("decode_predictions"),
grpcclient.InferRequestedOutput("decode_predictions_1"),
grpcclient.InferRequestedOutput("decode_predictions_2"),
grpcclient.InferRequestedOutput("decode_predictions_3")
]
# Inference
results = triton_client.infer(
model_name="signature_model",
inputs=inputs,
outputs=outputs
)
# Get the output arrays from the results
boxes = results.as_numpy("decode_predictions")
scores = results.as_numpy("decode_predictions_1")
classes = results.as_numpy("decode_predictions_2")
valid_detections = results.as_numpy("decode_predictions_3")
boxes = boxes[0][:valid_detections[0]]
scores = scores[0][:valid_detections[0]]
classes = classes[0][:valid_detections[0]]
signature_list = []
for box, score, class_index in zip(boxes, scores, classes):
xmin, ymin, xmax, ymax = box / ratio
            class_label = self.index2lable[int(class_index)]  # cast to int so the dict lookup works even if classes come back as floats
item = {
"label": class_label,
"confidence": float(score),
"location": {
"xmin": max(0, int(xmin)),
"ymin": max(0, int(ymin)),
"xmax": min(w, int(xmax)),
"ymax": min(h, int(ymax))
}
}
signature_list.append(item)
return signature_list
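# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): assumes Triton serves
# 'signature_model' at localhost:8001; the image path is a placeholder.
if __name__ == '__main__':
    import cv2  # only needed for this demo
    detector = SignatureDetection()
    image = cv2.imread('/path/to/contract_page.jpg')
    for item in detector.process(image):
        print(item['label'], item['confidence'], item['location'])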
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Create Date : 2022-06-16 11:01:36
# @Last Modified : 2022-07-15 10:57:06
# @Description :
from .read_data import base64_to_bgr
from .read_data import bytes_to_bgr
\ No newline at end of file
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Create Date : 2022-06-16 10:59:50
# @Last Modified : 2022-08-03 14:59:15
# @Description :
import cv2
import base64
import numpy as np
import tensorflow as tf
def base64_to_bgr(img64):
"""把 base64 转换成图片
单通道的灰度图或四通道的透明图都将自动转换成三通道的 BGR 图
Args:
img64 (TYPE): Description
Returns:
TYPE: image is a 3-D uint8 Tensor of shape [height, width, channels] where channels is BGR
"""
encoded_image = base64.b64decode(img64)
img_array = np.frombuffer(encoded_image, np.uint8)
image = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
return image
def bytes_to_bgr(buffer: bytes):
"""Read a byte stream as a OpenCV image
Args:
buffer (TYPE): bytes of a decoded image
"""
img_array = np.frombuffer(buffer, np.uint8)
image = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
# image = tf.io.decode_image(buffer, channels=3)
# image = np.array(image)[...,::-1]
return image
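# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): decode an image from raw
# bytes on disk; the path is a placeholder.
if __name__ == '__main__':
    with open('/path/to/image.jpg', 'rb') as f:
        image = bytes_to_bgr(f.read())
    print(image.shape)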
\ No newline at end of file
# -*- coding: utf-8 -*-
# @Author : Lyu Kui
# @Email : 9428.al@gmail.com
# @Created Date : 2021-03-04 17:50:09
# @Last Modified : 2021-03-10 14:03:02
# @Description :
import os
image_types = (".jpg", ".jpeg", ".png", ".bmp", ".tif", ".tiff")
def list_images(basePath, contains=None):
# return the set of files that are valid
return list_files(basePath, validExts=image_types, contains=contains)
def list_files(basePath, validExts=None, contains=None):
# loop over the directory structure
for (rootDir, dirNames, filenames) in os.walk(basePath):
# loop over the filenames in the current directory
for filename in filenames:
# if the contains string is not none and the filename does not contain
# the supplied string, then ignore the file
if contains is not None and filename.find(contains) == -1:
continue
# determine the file extension of the current file
ext = filename[filename.rfind("."):].lower()
# check to see if the file is an image and should be processed
if validExts is None or ext.endswith(validExts):
# construct the path to the image and yield it
imagePath = os.path.join(rootDir, filename)
yield imagePath
def get_filename(filePath):
basename = os.path.basename(filePath)
fname, fextension = os.path.splitext(basename)
return fname
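# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): walk a directory tree and
# print every image file it contains; the dataset path is a placeholder.
if __name__ == '__main__':
    for image_path in list_images('/path/to/dataset'):
        print(get_filename(image_path), '->', image_path)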
\ No newline at end of file
import cv2
import fitz
import numpy as np
def pdf_to_images(pdf_path: str):
"""PDF 转 OpenCV Image
Args:
pdf_path (str): Description
Returns:
TYPE: Description
"""
images = []
doc = fitz.open(pdf_path)
# producer = doc.metadata.get('producer')
for pno in range(doc.page_count):
page = doc.load_page(pno)
all_texts = page.get_text().replace('\n', '').strip()
        # Heuristic: remove a known PDF-XChange watermark string so the "no text" check below still works
        all_texts = all_texts.replace('Click to buy NOW!PDF-XChangewww.docu-track.com', '')
blocks = page.get_text("dict")["blocks"]
imgblocks = [b for b in blocks if b["type"] == 1]
page_images = []
        # If the page has no text at all but does contain image blocks
        if len(all_texts) == 0 and len(imgblocks) != 0:
            # # These producers emit fragmented images; if they really are fragments, stitch them back together
# if producer in ['Microsoft: Print To PDF',
# 'GPL Ghostscript 8.71',
# 'doPDF Ver 7.3 Build 398 (Windows 7 Business Edition (SP 1) - Version: 6.1.7601 (x64))',
# '福昕阅读器PDF打印机 版本 11.0.114.4386']:
patches = []
for imgblock in imgblocks:
contents = imgblock["image"]
img_array = np.frombuffer(contents, dtype=np.uint8)
image = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
patches.append(image)
try:
try:
image = np.concatenate(patches, axis=0)
page_images.append(image)
except:
image = np.concatenate(patches, axis=1)
page_images.append(image)
except:
                # If the two patches cannot be concatenated, treat them as two separate images; with more than two this assumption may not hold
if len(patches) == 2:
page_images = patches
else:
pix = page.get_pixmap(dpi=350)
contents = pix.tobytes(output="png")
img_array = np.frombuffer(contents, dtype=np.uint8)
image = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
page_images.append(image)
# else:
# for imgblock in imgblocks:
# contents = imgblock["image"]
# img_array = np.frombuffer(contents, dtype=np.uint8)
# image = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
# page_images.append(image)
else:
pix = page.get_pixmap(dpi=350)
contents = pix.tobytes(output="png")
img_array = np.frombuffer(contents, dtype=np.uint8)
image = cv2.imdecode(img_array, cv2.IMREAD_COLOR)
page_images.append(image)
images.append(page_images)
return images
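# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): the PDF path is a placeholder.
if __name__ == '__main__':
    pages = pdf_to_images('/path/to/file.pdf')
    for pno, page_images in enumerate(pages):
        print('page %d -> %d image(s)' % (pno, len(page_images)))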
# import the necessary packages
from .count_frames import count_frames
from .fps import FPS
from .videostream import VideoStream
from .webcamvideostream import WebcamVideoStream
from .filevideostream import FileVideoStream
\ No newline at end of file
# import the necessary packages
# from ..convenience import is_cv3
import cv2
def count_frames(path, override=False):
# grab a pointer to the video file and initialize the total
# number of frames read
video = cv2.VideoCapture(path)
total = 0
# if the override flag is passed in, revert to the manual
# method of counting frames
if override:
total = count_frames_manual(video)
# otherwise, let's try the fast way first
else:
        # let's try to determine the number of frames in a video
        # via video properties; this method can be very buggy
        # and might throw an error based on your OpenCV version,
        # or may fail entirely based on which video codecs
        # you have installed
try:
# # check if we are using OpenCV 3
# if is_cv3():
# total = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
# # otherwise, we are using OpenCV 2.4
# else:
# total = int(video.get(cv2.cv.CV_CAP_PROP_FRAME_COUNT))
            total = int(video.get(cv2.CAP_PROP_FRAME_COUNT))
# uh-oh, we got an error -- revert to counting manually
except:
total = count_frames_manual(video)
# release the video file pointer
video.release()
# return the total number of frames in the video
return total
def count_frames_manual(video):
# initialize the total number of frames read
total = 0
# loop over the frames of the video
while True:
# grab the current frame
(grabbed, frame) = video.read()
# check to see if we have reached the end of the
# video
if not grabbed:
break
# increment the total number of frames read
total += 1
# return the total number of frames in the video file
return total
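# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): the video path is a placeholder.
if __name__ == '__main__':
    print(count_frames('/path/to/video.mp4'))                 # fast, property-based count
    print(count_frames('/path/to/video.mp4', override=True))  # slower but more reliable manual count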
\ No newline at end of file
# import the necessary packages
from threading import Thread
import sys
import cv2
import time
# import the Queue class from Python 3
if sys.version_info >= (3, 0):
from queue import Queue
# otherwise, import the Queue class for Python 2.7
else:
from Queue import Queue
class FileVideoStream:
def __init__(self, path, transform=None, queue_size=128):
# initialize the file video stream along with the boolean
# used to indicate if the thread should be stopped or not
self.stream = cv2.VideoCapture(path)
self.stopped = False
self.transform = transform
# initialize the queue used to store frames read from
# the video file
self.Q = Queue(maxsize=queue_size)
        # initialize the thread
self.thread = Thread(target=self.update, args=())
self.thread.daemon = True
def start(self):
# start a thread to read frames from the file video stream
self.thread.start()
return self
def update(self):
# keep looping infinitely
while True:
# if the thread indicator variable is set, stop the
# thread
if self.stopped:
break
# otherwise, ensure the queue has room in it
if not self.Q.full():
# read the next frame from the file
(grabbed, frame) = self.stream.read()
# if the `grabbed` boolean is `False`, then we have
# reached the end of the video file
if not grabbed:
self.stopped = True
break
# if there are transforms to be done, might as well
# do them on producer thread before handing back to
# consumer thread. ie. Usually the producer is so far
# ahead of consumer that we have time to spare.
#
# Python is not parallel but the transform operations
# are usually OpenCV native so release the GIL.
#
# Really just trying to avoid spinning up additional
# native threads and overheads of additional
# producer/consumer queues since this one was generally
# idle grabbing frames.
if self.transform:
frame = self.transform(frame)
# add the frame to the queue
self.Q.put(frame)
else:
                time.sleep(0.1)  # rest for 100 ms, the queue is full
self.stream.release()
def read(self):
# return next frame in the queue
return self.Q.get()
# Insufficient to have consumer use while(more()) which does
# not take into account if the producer has reached end of
# file stream.
def running(self):
return self.more() or not self.stopped
def more(self):
# return True if there are still frames in the queue. If stream is not stopped, try to wait a moment
tries = 0
while self.Q.qsize() == 0 and not self.stopped and tries < 5:
time.sleep(0.1)
tries += 1
return self.Q.qsize() > 0
def stop(self):
# indicate that the thread should be stopped
self.stopped = True
# wait until stream resources are released (producer thread might be still grabbing frame)
self.thread.join()
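# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): read a video file through the
# threaded queue; the path is a placeholder.
if __name__ == '__main__':
    fvs = FileVideoStream('/path/to/video.mp4').start()
    while fvs.more():
        frame = fvs.read()
        # ... process `frame` here ...
    fvs.stop()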
# import the necessary packages
import datetime
class FPS:
def __init__(self):
# store the start time, end time, and total number of frames
# that were examined between the start and end intervals
self._start = None
self._end = None
self._numFrames = 0
def start(self):
# start the timer
self._start = datetime.datetime.now()
return self
def stop(self):
# stop the timer
self._end = datetime.datetime.now()
def update(self):
# increment the total number of frames examined during the
# start and end intervals
self._numFrames += 1
def elapsed(self):
# return the total number of seconds between the start and
# end interval
return (self._end - self._start).total_seconds()
def fps(self):
# compute the (approximate) frames per second
return self._numFrames / self.elapsed()
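# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): time a dummy processing loop.
if __name__ == '__main__':
    import time
    fps = FPS().start()
    for _ in range(100):
        time.sleep(0.01)  # stand-in for per-frame work
        fps.update()
    fps.stop()
    print('elapsed: %.2fs, approx. FPS: %.2f' % (fps.elapsed(), fps.fps()))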
\ No newline at end of file
# import the necessary packages
from picamera.array import PiRGBArray
from picamera import PiCamera
from threading import Thread
import cv2
class PiVideoStream:
def __init__(self, resolution=(320, 240), framerate=32, **kwargs):
# initialize the camera
self.camera = PiCamera()
# set camera parameters
self.camera.resolution = resolution
self.camera.framerate = framerate
# set optional camera parameters (refer to PiCamera docs)
for (arg, value) in kwargs.items():
setattr(self.camera, arg, value)
# initialize the stream
self.rawCapture = PiRGBArray(self.camera, size=resolution)
self.stream = self.camera.capture_continuous(self.rawCapture,
format="bgr", use_video_port=True)
# initialize the frame and the variable used to indicate
# if the thread should be stopped
self.frame = None
self.stopped = False
def start(self):
# start the thread to read frames from the video stream
t = Thread(target=self.update, args=())
t.daemon = True
t.start()
return self
def update(self):
# keep looping infinitely until the thread is stopped
for f in self.stream:
# grab the frame from the stream and clear the stream in
# preparation for the next frame
self.frame = f.array
self.rawCapture.truncate(0)
# if the thread indicator variable is set, stop the thread
            # and release camera resources
if self.stopped:
self.stream.close()
self.rawCapture.close()
self.camera.close()
return
def read(self):
# return the frame most recently read
return self.frame
def stop(self):
# indicate that the thread should be stopped
self.stopped = True
# import the necessary packages
from .webcamvideostream import WebcamVideoStream
class VideoStream:
def __init__(self, src=0, usePiCamera=False, resolution=(320, 240),
framerate=32, **kwargs):
# check to see if the picamera module should be used
if usePiCamera:
            # only import the picamera packages if we are
            # explicitly told to do so -- this helps remove the
# requirement of `picamera[array]` from desktops or
# laptops that still want to use the `imutils` package
from .pivideostream import PiVideoStream
# initialize the picamera stream and allow the camera
# sensor to warmup
self.stream = PiVideoStream(resolution=resolution,
framerate=framerate, **kwargs)
# otherwise, we are using OpenCV so initialize the webcam
# stream
else:
self.stream = WebcamVideoStream(src=src)
def start(self):
# start the threaded video stream
return self.stream.start()
def update(self):
# grab the next frame from the stream
self.stream.update()
def read(self):
# return the current frame
return self.stream.read()
def stop(self):
# stop the thread and release any resources
self.stream.stop()
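# ---------------------------------------------------------------------------
# Usage sketch (not part of the original module): grab a few frames from the
# default webcam (pass usePiCamera=True to use the Raspberry Pi camera instead).
# Because of the relative import above, run this as a module inside its package.
if __name__ == '__main__':
    vs = VideoStream(src=0).start()
    for _ in range(30):
        frame = vs.read()
    vs.stop()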
# import the necessary packages
from threading import Thread
import cv2
class WebcamVideoStream:
def __init__(self, src=0, name="WebcamVideoStream"):
# initialize the video camera stream and read the first frame
# from the stream
self.stream = cv2.VideoCapture(src)
(self.grabbed, self.frame) = self.stream.read()
# initialize the thread name
self.name = name
# initialize the variable used to indicate if the thread should
# be stopped
self.stopped = False
def start(self):
# start the thread to read frames from the video stream
t = Thread(target=self.update, name=self.name, args=())
t.daemon = True
t.start()
return self
def update(self):
# keep looping infinitely until the thread is stopped
while True:
# if the thread indicator variable is set, stop the thread
if self.stopped:
return
# otherwise, read the next frame from the stream
(self.grabbed, self.frame) = self.stream.read()
def read(self):
# return the frame most recently read
return self.frame
def stop(self):
# indicate that the thread should be stopped
self.stopped = True