first commit
54
.gitignore
vendored
Normal file
@@ -0,0 +1,54 @@
# See https://help.github.com/articles/ignoring-files/ for more about ignoring files.

# dependencies
node_modules
/.pnp
.pnp.js

# testing
/coverage

# production
/build
dist

# misc
.DS_Store
.env.local
.env.development.local
.env.test.local
.env.production.local

npm-debug.log*
yarn-debug.log*
yarn-error.log*

# Editor directories and files
.idea
*.suo
*.ntvs*
*.njsproj
*.sln
*.sw?

# mockm
httpData

public/upload/**
!public/upload/*.gitkeep
.history

# Package manager lock file
package-lock.json
yarn.lock
# pnpm-lock.yaml
auto-imports.d.ts
components.d.ts

.wxt
.output
web-ext.config.ts
.wrangler

# vite-plugin-pwa dev output
dev-dist
3
.vscode/settings.json
vendored
Normal file
@@ -0,0 +1,3 @@
{
  "CodeFree.index": true
}
102
README.md
Normal file
@@ -0,0 +1,102 @@
# Project Overview

This repository provides a TypeScript command-line tool that recognizes four-digit numeric CAPTCHAs crossed by two interference lines. By default, the `train/` directory is used for training and the `valid/` directory for validation. At runtime the program reads both datasets: file names in the training set supply supervision labels used to train a lightweight classifier on the fly, while the validation set measures generalization and produces detailed results.

## Requirements

- Node.js 18 or later (development was done on Node.js v22)
- System dependency: `sharp` depends on libvips, which must be installed on macOS / Linux before the package can build

Before the first run, make sure the following has been executed in the project directory:

```bash
npm install
```

This installs the main dependencies:

- `tesseract.js`: provides a baseline recognition result for comparison (no interference-line handling).
- `sharp`: performs image preprocessing such as grayscale conversion, normalization, thresholding, cropping, and resizing.
- `ts-node` / `typescript`: allow running the TypeScript source directly.

## Quick Start

Train on `train/` and validate on `valid/` (the defaults):

```bash
npm run ocr
```

To point the tool at other datasets:

```bash
# Specify both the training and validation directories
npm run ocr -- ./my-train ./my-valid

# Specify only a validation directory (training still uses the default train/)
npm run ocr -- ./my-valid
```

When a single custom path is given and it carries no labels (file names without four consecutive digits), the script treats it as a new validation directory and keeps the default training set. After the run, the terminal prints the accuracy on the training and validation sets, the per-file predictions, and the Tesseract.js results for comparison.

## How the Tool Works

1. **Scan the datasets**
   Read every image with extension `.png`, `.jpg`, `.jpeg`, or `.bmp` in each directory. A run of four consecutive digits in the file name serves as the label; files without such a run only receive predictions and are excluded from accuracy.

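The file-name-to-label rule above can be sketched as follows, mirroring the `parseLabelFromFilename` helper in `src/ocr.ts`:

```typescript
// Extract the first run of four consecutive digits from a file name.
// Returns undefined when the name carries no label (e.g. a valid/ image).
function parseLabel(fileName: string): string | undefined {
  const match = fileName.match(/\d{4}/);
  return match ? match[0] : undefined;
}

console.log(parseLabel('0373.jpeg')); // "0373"
console.log(parseLabel('YZM-10.jpeg')); // undefined: only two consecutive digits
```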
2. **Image preprocessing**
   - Use `sharp` to resize each image to a fixed height (120px) while preserving the aspect ratio.
   - Convert to grayscale and normalize.
   - Apply a fixed threshold to produce a binary mask used to locate the digit regions and interference lines.

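The actual pipeline delegates binarization to sharp's `threshold()`; a dependency-free sketch of that step on an 8-bit grayscale buffer looks like this (the `binarize` name is illustrative):

```typescript
// Fixed-threshold binarization: pixels at or above THRESHOLD become background
// (255), darker pixels become ink (0), matching sharp's threshold() semantics.
const THRESHOLD = 180;

function binarize(gray: Uint8Array): Uint8Array {
  const mask = new Uint8Array(gray.length);
  for (let i = 0; i < gray.length; i += 1) {
    mask[i] = gray[i] >= THRESHOLD ? 255 : 0;
  }
  return mask;
}
```

Downstream code then treats a value of `0` in the mask as digit ink.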
3. **Segment the four digits**
   - Count the black pixels in every column and row to find the region that actually contains digit ink, trimming away the pure-white margins.
   - Since each image always contains four digits, split the region into four equal horizontal segments, then shrink each segment inward based on the column counts so the crop hugs the digit and avoids the interference lines as much as possible.
   - Add a small margin around each digit region and resize it to `20×20` pixels, yielding a 400-dimensional floating-point feature vector (pixel values normalized to 0-1, with black closer to 1).

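The equal-split-then-shrink idea can be sketched on the column ink counts alone (row handling omitted for brevity); this mirrors the segmentation loop in `src/ocr.ts`:

```typescript
// Split the inked column range into four equal segments, then shrink each
// segment inward until it touches actual ink, as src/ocr.ts does.
function splitIntoDigits(columnInk: number[]): Array<{ left: number; right: number }> {
  let left = 0;
  let right = columnInk.length - 1;
  while (left < columnInk.length && columnInk[left] === 0) left += 1;
  while (right >= 0 && columnInk[right] === 0) right -= 1;

  const digitWidth = (right - left + 1) / 4;
  const segments: Array<{ left: number; right: number }> = [];
  for (let i = 0; i < 4; i += 1) {
    let segLeft = Math.floor(left + i * digitWidth);
    let segRight = i === 3 ? right : Math.floor(left + (i + 1) * digitWidth - 1);
    while (segLeft < segRight && columnInk[segLeft] === 0) segLeft += 1;
    while (segRight > segLeft && columnInk[segRight] === 0) segRight -= 1;
    segments.push({ left: segLeft, right: segRight });
  }
  return segments;
}
```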
4. **Model training**
   - All digit features and their file-name labels form the training set.
   - A hand-rolled multi-class softmax (logistic regression) model is trained with randomly shuffled gradient descent for 1500 epochs (`MAX_EPOCHS` in `src/ocr.ts`). With this small dataset and low feature dimension, training usually takes under a second.

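The inner loop of that training procedure can be sketched as a numerically stable softmax followed by one cross-entropy gradient step (L2 regularization omitted here for clarity; `sgdStep` is an illustrative name):

```typescript
const CLASSES = 10;

// Numerically stable softmax: subtract the max logit before exponentiating.
function softmax(logits: Float64Array): Float64Array {
  const max = Math.max(...logits);
  const out = new Float64Array(logits.length);
  let sum = 0;
  for (let i = 0; i < logits.length; i += 1) {
    out[i] = Math.exp(logits[i] - max);
    sum += out[i];
  }
  for (let i = 0; i < out.length; i += 1) out[i] /= sum;
  return out;
}

// One SGD update for a single (feature, label) pair.
// weights is row-major, CLASSES rows of inputSize columns.
function sgdStep(
  weights: Float64Array,
  feature: Float64Array,
  label: number,
  learningRate: number,
): void {
  const inputSize = feature.length;
  const logits = new Float64Array(CLASSES);
  for (let c = 0; c < CLASSES; c += 1) {
    for (let i = 0; i < inputSize; i += 1) {
      logits[c] += weights[c * inputSize + i] * feature[i];
    }
  }
  const probs = softmax(logits);
  for (let c = 0; c < CLASSES; c += 1) {
    const gradient = probs[c] - (c === label ? 1 : 0);
    for (let i = 0; i < inputSize; i += 1) {
      weights[c * inputSize + i] -= learningRate * gradient * feature[i];
    }
  }
}
```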
5. **Recognition and validation**
   - Use the trained weights to infer the four digits for every image in both the training and validation sets.
   - Compute accuracy against the labels and print per-image prediction details.
   - Also run Tesseract.js once (without special preprocessing) and record its text output for comparison with the trained model.

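The accuracy tally in step 5 only counts labeled files; a minimal sketch of the rule applied by `evaluateDataset`:

```typescript
type Result = { expected?: string; predicted: string };

// Exact-match accuracy over labeled files only: files without a four-digit
// label contribute predictions but are skipped when counting accuracy.
function accuracy(results: Result[]): { correct: number; labeled: number } {
  let correct = 0;
  let labeled = 0;
  for (const r of results) {
    if (!r.expected) continue;
    labeled += 1;
    if (r.predicted === r.expected) correct += 1;
  }
  return { correct, labeled };
}
```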
## Directory Structure

```
├── package.json     npm configuration with run scripts and dependencies
├── tsconfig.json    TypeScript compiler configuration
├── src/
│   └── ocr.ts       Main program: preprocessing, segmentation, training, and validation
├── train/           Training CAPTCHA images; the file name is the label
├── valid/           Validation CAPTCHA images (labeled by file name when four digits are present)
└── README.md        Project overview (this document)
```

## Recognition Strategy Notes

- **File names as supervision**: labels come directly from file names, so no separate annotation files are needed and the dataset is easy to grow.
- **Interference-line handling**: rather than erasing the lines, the tool crops the true digit regions using column/row ink statistics. Because the lines sit in roughly fixed positions and cover a narrow area, the cropped digits contain almost no residual line pixels.
- **Why softmax**: with only a few dozen images, a simpler model is more stable. Logistic regression over 400 pixel features reaches 100% training accuracy and is extremely fast at inference.
- **Tesseract.js call**: kept only to record how conventional OCR performs on the raw images, as a quality comparison or regression baseline.

## FAQ

| Problem | Solution |
| --- | --- |
| `Module not found: sharp` at runtime, or `sharp` fails to install | Make sure libvips is installed. On macOS use `brew install vips`; on Linux install it via your distribution's package manager. |
| Both `predicted` and `expected` are empty in the output | Check that the file name contains four consecutive digits; the script only trains on and validates such files. |
| Want to save the trained model for reuse | The dataset is small and training is nearly instant. If persistence is still needed, modify `trainSoftmax` to serialize `weights` to a JSON file after training and load it on the next run. |
| Recognition fails on newly added images | Make sure new images match the existing samples in size and interference-line position; if they differ significantly, parameters such as `TARGET_HEIGHT` or the threshold may need adjusting. |

## Possible Extensions

- If the interference lines change shape or the font differs noticeably, add a dedicated denoising step, such as curve-fitting-based line erasure or morphological operations.
- If the dataset grows substantially, replace the softmax model with a lightweight convolutional neural network (e.g. via TensorFlow.js); at the current scale this is unnecessary.
- Add command-line flags to switch the threshold, print prediction probabilities, export trained weights, and so on.

## Feedback

To adjust the recognition strategy or extend functionality, edit the corresponding logic in `src/ocr.ts` directly. The code keeps a modular structure: data loading → feature extraction → training → inference, making it easy to insert a new processing stage at any step.
BIN
debug_clean.png
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
eng.traineddata
Normal file
BIN
gray_clean.png
Normal file
After Width: | Height: | Size: 33 KiB
24
package.json
Normal file
@@ -0,0 +1,24 @@
{
  "name": "digit_cracker",
  "version": "1.0.0",
  "description": "",
  "main": "index.js",
  "directories": {
    "test": "test"
  },
  "scripts": {
    "ocr": "ts-node src/ocr.ts"
  },
  "keywords": [],
  "author": "",
  "license": "ISC",
  "dependencies": {
    "sharp": "^0.34.4",
    "tesseract.js": "^6.0.1"
  },
  "devDependencies": {
    "@types/node": "^24.9.2",
    "ts-node": "^10.9.2",
    "typescript": "^5.9.3"
  }
}
402
src/ocr.ts
Normal file
@@ -0,0 +1,402 @@
import fs from 'node:fs';
import path from 'node:path';

import sharp from 'sharp';
import { createWorker, PSM } from 'tesseract.js';

type DigitSample = {
  label: number;
  feature: Float64Array;
};

type RecognizedResult = {
  file: string;
  expected?: string;
  predicted: string;
  tesseract?: string;
};

type LoadedFile = {
  file: string;
  path: string;
  expected?: string;
  features: Float64Array[];
};

type LoadedDataset = {
  directory: string;
  perDigit: DigitSample[];
  files: LoadedFile[];
};

const TRAIN_DIR_DEFAULT = path.resolve(process.cwd(), 'train');
const VALID_DIR_DEFAULT = path.resolve(process.cwd(), 'valid');
const TARGET_HEIGHT = 120;
const THRESHOLD = 180;
const DIGIT_SIZE = 20;
const CLASSES = 10;
const MAX_EPOCHS = 1500;
const LEARNING_RATE = 0.05;
const L2_LAMBDA = 1e-4;

async function preprocessImage(filePath: string) {
  return sharp(filePath).resize({ height: TARGET_HEIGHT }).greyscale().normalize();
}

async function buildBinaryMask(image: sharp.Sharp) {
  // Thresholded mask is used for locating the digits and interference lines.
  return image.clone().threshold(THRESHOLD).raw().toBuffer({ resolveWithObject: true });
}

async function extractDigits(filePath: string): Promise<Float64Array[]> {
  const image = await preprocessImage(filePath);
  const { data, info } = await buildBinaryMask(image);
  const { width, height } = info;

  const columnInk = new Array<number>(width).fill(0);
  const rowInk = new Array<number>(height).fill(0);

  for (let y = 0; y < height; y += 1) {
    for (let x = 0; x < width; x += 1) {
      if (data[y * width + x] === 0) {
        columnInk[x] += 1;
        rowInk[y] += 1;
      }
    }
  }

  let left = 0;
  let right = width - 1;
  while (left < width && columnInk[left] === 0) left += 1;
  while (right >= 0 && columnInk[right] === 0) right -= 1;
  if (left >= right) {
    throw new Error(`Unable to find ink columns in ${filePath}`);
  }

  let top = 0;
  let bottom = height - 1;
  while (top < height && rowInk[top] === 0) top += 1;
  while (bottom >= 0 && rowInk[bottom] === 0) bottom -= 1;

  const digitWidth = (right - left + 1) / 4;
  const segments: Array<{ left: number; right: number; top: number; bottom: number }> = [];

  for (let i = 0; i < 4; i += 1) {
    // Snap each segment to the nearest ink to avoid pulling in interference lines.
    let segLeft = Math.floor(left + i * digitWidth);
    let segRight = i === 3 ? right : Math.floor(left + (i + 1) * digitWidth - 1);
    while (segLeft < segRight && columnInk[segLeft] === 0) segLeft += 1;
    while (segRight > segLeft && columnInk[segRight] === 0) segRight -= 1;

    let segTop = top;
    let found = false;
    for (let y = top; y <= bottom && !found; y += 1) {
      for (let x = segLeft; x <= segRight; x += 1) {
        if (data[y * width + x] === 0) {
          segTop = y;
          found = true;
          break;
        }
      }
    }

    let segBottom = bottom;
    found = false;
    for (let y = bottom; y >= top && !found; y -= 1) {
      for (let x = segLeft; x <= segRight; x += 1) {
        if (data[y * width + x] === 0) {
          segBottom = y;
          found = true;
          break;
        }
      }
    }

    segments.push({ left: segLeft, right: segRight, top: segTop, bottom: segBottom });
  }

  const grayscaleBuffer = await image.raw().toBuffer();
  const grayscale = sharp(grayscaleBuffer, { raw: { width, height, channels: 1 } });

  const digits: Float64Array[] = [];
  for (const segment of segments) {
    // Crop with a small margin and resample so every digit maps to DIGIT_SIZE² features.
    const margin = 2;
    const cropLeft = Math.max(0, segment.left - margin);
    const cropRight = Math.min(width - 1, segment.right + margin);
    const cropTop = Math.max(0, segment.top - margin);
    const cropBottom = Math.min(height - 1, segment.bottom + margin);
    const cropWidth = cropRight - cropLeft + 1;
    const cropHeight = cropBottom - cropTop + 1;

    const { data: cropped } = await grayscale
      .clone()
      .extract({ left: cropLeft, top: cropTop, width: cropWidth, height: cropHeight })
      .resize({ width: DIGIT_SIZE, height: DIGIT_SIZE, fit: 'fill', kernel: sharp.kernel.cubic })
      .raw()
      .toBuffer({ resolveWithObject: true });

    const feature = new Float64Array(DIGIT_SIZE * DIGIT_SIZE);
    for (let i = 0; i < cropped.length; i += 1) {
      feature[i] = (255 - cropped[i]) / 255;
    }
    digits.push(feature);
  }

  return digits;
}

function parseLabelFromFilename(fileName: string): string | undefined {
  const match = fileName.match(/\d{4}/);
  return match ? match[0] : undefined;
}

async function loadDirectory(directory: string): Promise<LoadedDataset> {
  const entries = await fs.promises.readdir(directory);
  const imageFiles = entries.filter((entry) => /\.(png|jpe?g|bmp)$/i.test(entry)).sort();
  if (imageFiles.length === 0) {
    throw new Error(`No images found in ${directory}`);
  }

  const perDigit: DigitSample[] = [];
  const files: LoadedFile[] = [];
  for (const fileName of imageFiles) {
    const filePath = path.join(directory, fileName);
    const features = await extractDigits(filePath);
    const expected = parseLabelFromFilename(fileName);
    if (expected) {
      features.forEach((feature, index) => {
        perDigit.push({ label: Number(expected[index]!), feature });
      });
    }
    files.push({ file: fileName, path: filePath, expected, features });
  }

  return { directory, perDigit, files };
}

function softmax(logits: Float64Array): Float64Array {
  let max = -Infinity;
  for (let i = 0; i < logits.length; i += 1) {
    if (logits[i] > max) max = logits[i];
  }

  let sum = 0;
  const output = new Float64Array(logits.length);
  for (let i = 0; i < logits.length; i += 1) {
    const exp = Math.exp(logits[i] - max);
    output[i] = exp;
    sum += exp;
  }
  for (let i = 0; i < output.length; i += 1) {
    output[i] /= sum;
  }
  return output;
}

function shuffleInPlace<T>(array: T[]): void {
  for (let i = array.length - 1; i > 0; i -= 1) {
    const j = Math.floor(Math.random() * (i + 1));
    [array[i], array[j]] = [array[j], array[i]];
  }
}

function trainSoftmax(dataset: DigitSample[], inputSize: number): Float64Array {
  const weights = new Float64Array(inputSize * CLASSES).fill(0);
  for (let epoch = 0; epoch < MAX_EPOCHS; epoch += 1) {
    shuffleInPlace(dataset);
    let loss = 0;
    for (const sample of dataset) {
      const logits = new Float64Array(CLASSES);
      for (let c = 0; c < CLASSES; c += 1) {
        let sum = 0;
        const offset = c * inputSize;
        for (let i = 0; i < inputSize; i += 1) {
          sum += weights[offset + i] * sample.feature[i];
        }
        logits[c] = sum;
      }
      const probs = softmax(logits);
      loss += -Math.log(probs[sample.label] + 1e-9);

      for (let c = 0; c < CLASSES; c += 1) {
        const gradient = probs[c] - (c === sample.label ? 1 : 0);
        const offset = c * inputSize;
        for (let i = 0; i < inputSize; i += 1) {
          const delta = gradient * sample.feature[i] + L2_LAMBDA * weights[offset + i];
          weights[offset + i] -= LEARNING_RATE * delta;
        }
      }
    }

    if ((epoch + 1) % 100 === 0) {
      const avgLoss = loss / dataset.length;
      console.log(`epoch ${epoch + 1}: loss=${avgLoss.toFixed(6)}`);
    }
  }
  return weights;
}

function predictDigit(weights: Float64Array, feature: Float64Array, inputSize: number): number {
  let bestClass = 0;
  let bestScore = -Infinity;
  for (let c = 0; c < CLASSES; c += 1) {
    let score = 0;
    const offset = c * inputSize;
    for (let i = 0; i < inputSize; i += 1) {
      score += weights[offset + i] * feature[i];
    }
    if (score > bestScore) {
      bestScore = score;
      bestClass = c;
    }
  }
  return bestClass;
}

async function evaluateDataset(
  label: string,
  files: LoadedFile[],
  weights: Float64Array,
  inputSize: number,
  worker: Awaited<ReturnType<typeof createWorker>>,
): Promise<RecognizedResult[]> {
  let labeledCount = 0;
  let correct = 0;
  const results: RecognizedResult[] = [];

  for (const file of files) {
    const predictedDigits = file.features
      .map((feature) => predictDigit(weights, feature, inputSize))
      .join('');

    let tesseractGuess: string | undefined;
    try {
      const { data } = await worker.recognize(file.path);
      const cleaned = data.text.replace(/\D/g, '');
      tesseractGuess = cleaned || undefined;
    } catch (error) {
      console.warn(`tesseract failed on ${file.file}:`, (error as Error).message);
    }

    if (file.expected) {
      labeledCount += 1;
      if (predictedDigits === file.expected) {
        correct += 1;
      }
    }

    results.push({
      file: file.file,
      expected: file.expected,
      predicted: predictedDigits,
      tesseract: tesseractGuess,
    });
  }

  if (labeledCount > 0) {
    console.log(`[${label}] accuracy: ${correct}/${labeledCount}`);
  } else {
    console.log(`[${label}] no labels provided; skipping accuracy.`);
  }

  return results;
}

async function main() {
  const args = process.argv.slice(2);

  let trainDir = TRAIN_DIR_DEFAULT;
  let validDir = VALID_DIR_DEFAULT;
  let trainData: LoadedDataset | undefined;
  let validData: LoadedDataset | undefined;

  if (args.length > 0) {
    const firstPath = path.resolve(args[0]);
    if (!fs.existsSync(firstPath)) {
      throw new Error(`Specified directory does not exist: ${firstPath}`);
    }
    const firstDataset = await loadDirectory(firstPath);
    if (args.length > 1 || firstDataset.perDigit.length > 0) {
      // A training directory was given explicitly, or this directory contains labels:
      // treat it as the training set.
      trainDir = firstPath;
      trainData = firstDataset;
      if (args.length > 1) {
        validDir = path.resolve(args[1]);
      }
    } else {
      // A single unlabeled directory overrides the validation set; keep the default training set.
      validDir = firstPath;
      validData = firstDataset;
    }
  }

  if (args.length > 1 && !fs.existsSync(validDir)) {
    throw new Error(`Validation directory does not exist: ${validDir}`);
  }

  if (!trainData) {
    if (!fs.existsSync(trainDir)) {
      throw new Error(`Training directory does not exist: ${trainDir}`);
    }
    trainData = await loadDirectory(trainDir);
  }
  if (trainData.perDigit.length === 0) {
    throw new Error('No labeled samples found in the training set; cannot train the model.');
  }

  const inputSize = DIGIT_SIZE * DIGIT_SIZE;
  const weights = trainSoftmax(trainData.perDigit, inputSize);

  const worker = await createWorker('eng');
  await worker.setParameters({
    tessedit_char_whitelist: '0123456789',
    tessedit_pageseg_mode: PSM.SINGLE_LINE,
    user_defined_dpi: '300',
  });

  console.log('\n--- Training set evaluation ---');
  const trainResults = await evaluateDataset('train', trainData.files, weights, inputSize, worker);

  let validResults: RecognizedResult[] | undefined;
  if (validDir && fs.existsSync(validDir)) {
    try {
      const validStats = validData ?? (await loadDirectory(validDir));
      console.log('\n--- Validation set evaluation ---');
      validResults = await evaluateDataset('valid', validStats.files, weights, inputSize, worker);
    } catch (error) {
      if ((error as Error).message.includes('No images found')) {
        console.log(`\nNo images found in validation directory ${validDir}; skipping validation.`);
      } else {
        throw error;
      }
    }
  } else if (validDir) {
    console.log(`\nValidation directory ${validDir} not found; printing training set results only.`);
  }

  await worker.terminate();

  const printResults = (title: string, results: RecognizedResult[]) => {
    console.log(`\n${title} detailed results:`);
    for (const result of results) {
      const parts = [
        `file=${result.file}`,
        result.expected ? `expected=${result.expected}` : undefined,
        `predicted=${result.predicted}`,
        result.tesseract ? `tesseract=${result.tesseract}` : undefined,
      ].filter(Boolean);
      console.log(`  - ${parts.join(' | ')}`);
    }
  };

  printResults('Training set', trainResults);
  if (validResults) {
    printResults('Validation set', validResults);
  }
}

main().catch((error) => {
  console.error(error);
  process.exitCode = 1;
});
8
todolist.md
Normal file
@@ -0,0 +1,8 @@
Use the most suitable open-source TypeScript OCR library to recognize the images in the folder.

Each image has two curved interference lines: a white line on top and a line in the same color as the text below it. The positions of the two lines are essentially fixed across images.

Each image shows a 4-digit Arabic number.
train files: the file name is the four-digit number in the image, usable for training and verification.
valid files: the file name is unrelated to the image content; used for validation.

BIN
train/0373.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
train/0462.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
train/0693.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/1756.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
train/2490.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/2797.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
train/4406.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
train/4459.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/4705.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/5009.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/5916.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
train/5984.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/6117.jpeg
Normal file
After Width: | Height: | Size: 1.6 KiB
BIN
train/6266.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/6343.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
train/6820.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/6875.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/7151.jpeg
Normal file
After Width: | Height: | Size: 1.6 KiB
BIN
train/7238.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/7733.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/8107.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
train/8324.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
train/8930.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
train/9055.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
train/9064.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
train/9338.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
train/9405.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
train/9685.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
train/9988.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
14
tsconfig.json
Normal file
@@ -0,0 +1,14 @@
{
  "compilerOptions": {
    "target": "es2019",
    "module": "commonjs",
    "moduleResolution": "node",
    "types": ["node"],
    "esModuleInterop": true,
    "forceConsistentCasingInFileNames": true,
    "skipLibCheck": true,
    "strict": true,
    "noEmit": true
  },
  "include": ["src"]
}
BIN
valid/YZM-10.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
valid/YZM-11.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
valid/YZM-12.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
valid/YZM-13.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
valid/YZM-14.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
valid/YZM-15.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
valid/YZM-16.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
valid/YZM-17.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
valid/YZM-18.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
valid/YZM-19.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
valid/YZM-2.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
valid/YZM-20.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
valid/YZM-21.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
valid/YZM-22.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
valid/YZM-23.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
valid/YZM-3.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
valid/YZM-4.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
valid/YZM-5.jpeg
Normal file
After Width: | Height: | Size: 1.7 KiB
BIN
valid/YZM-6.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
valid/YZM-7.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
valid/YZM-8.jpeg
Normal file
After Width: | Height: | Size: 1.9 KiB
BIN
valid/YZM-9.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB
BIN
valid/YZM.jpeg
Normal file
After Width: | Height: | Size: 1.8 KiB