Faster, more accurate PII detection in the Krita 6 plugin with Amazon Rekognition

I tried Amazon Rekognition DetectText in the PII detection plugin for Krita 6. It cut processing time from about 30 seconds with the EasyOCR version to about 9 seconds, while also reducing detection misses.
2026.02.11

Introduction

In my previous article, I prototyped a Krita 6 plugin that automatically selects areas containing PII (Personally Identifiable Information) using EasyOCR. However, in my environment processing was slow because OCR ran on the CPU only, and small text was often misread. In this article, I add an implementation that uses Amazon Rekognition's DetectText to improve both speed and accuracy.

As with the previous article, the plugin will focus on these specific roles:

  • Export the current Krita display to a temporary file
  • Execute PII detection in external Python and output rectangles to JSON (see the invocation sketch after this list)
  • Apply the JSON rectangles to the selection area
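
Concretely, the middle step is just a subprocess call. Based on the argument list the plugin builds (see the appendix), the invocation is equivalent to the following; the two paths are placeholders for your environment:

C:/venvs/pii/Scripts/python.exe -E -s C:/tools/pii/pii_detect_aws.py --in input.png --out hits.json --min-conf 0.4 --pad 2 --scale 0 --aws-profile krita-ocr --aws-region ap-northeast-1

The -E and -s flags make the external interpreter ignore PYTHON* environment variables and skip the user site-packages, which keeps Krita's bundled Python from leaking into it.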

Test Environment

  • OS: Windows 11
  • Krita: Krita 6.0.0 beta1
  • GPU: None (OCR is executed on CPU only)
  • External Python: Python 3.11.9
  • boto3: 1.42.46
  • pillow: 12.1.1

Target Audience

  • People who want to mask personal information in screenshots or images using Krita 6
  • Those interested in Krita Python plugin development
  • Those considering OCR implementation using Amazon Rekognition's DetectText API
  • Those interested in speed and accuracy improvements from the previous EasyOCR version
  • Those who can perform basic operations with Python and AWS CLI

Prerequisites

Prepare an IAM policy with minimal permissions to call DetectText.

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Sid": "RekognitionText",
      "Effect": "Allow",
      "Action": [
        "rekognition:DetectText"
      ],
      "Resource": "*"
    }
  ]
}

Next, set up a profile in your local AWS CLI configuration. In this case, we'll use the profile name krita-ocr.

[profile krita-ocr]
region = ap-northeast-1
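
Before wiring the profile into the plugin, it is worth confirming that it resolves to usable credentials. A minimal check from the external Python, assuming boto3 is installed there (sts:GetCallerIdentity needs no extra IAM permissions):

# Sanity check: does the "krita-ocr" profile resolve to usable credentials?
import boto3

session = boto3.session.Session(profile_name="krita-ocr")
print(session.client("sts").get_caller_identity()["Arn"])

# The Rekognition client picks up the profile's default region.
print(session.client("rekognition").meta.region_name)  # ap-northeast-1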

Implementation

As with the previous version, the Krita-side plugin is only responsible for exporting to a temporary file and applying the detection results to the selection area. The PII detection itself is separated into an external Python script that calls Amazon Rekognition DetectText via boto3. It converts the bounding boxes of lines returned by DetectText to pixel coordinates, adds margins, and outputs them to JSON.
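
DetectText returns each LINE with a normalized BoundingBox (Left/Top/Width/Height, all in the 0..1 range), so the conversion is a multiply, a round, and a clamp. A minimal sketch of that step, mirroring the appendix code:

def bbox_to_padded_rect(bbox, img_w, img_h, pad=2):
    """Convert a normalized Rekognition BoundingBox to a padded pixel rect."""
    x = round(bbox["Left"] * img_w)
    y = round(bbox["Top"] * img_h)
    w = round(bbox["Width"] * img_w)
    h = round(bbox["Height"] * img_h)
    # Grow by `pad` pixels on every side, clamped to the image bounds.
    x0, y0 = max(0, x - pad), max(0, y - pad)
    x1, y1 = min(img_w, x + w + pad), min(img_h, y + h + pad)
    return {"x": x0, "y": y0, "w": x1 - x0, "h": y1 - y0}

# A line near the top-left of a 1920x1080 screenshot:
# bbox_to_padded_rect({"Left": 0.1, "Top": 0.1, "Width": 0.2, "Height": 0.02}, 1920, 1080)
# -> {"x": 190, "y": 106, "w": 388, "h": 26}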

The sample code is provided at the end of this article.
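
Note that the plugin reads its settings from a config.json placed next to pii_mask_aws.py. The keys below are the ones the appendix code actually reads; the two paths are placeholders for your environment:

{
  "python": "C:/venvs/pii/Scripts/python.exe",
  "detector": "C:/tools/pii/pii_detect_aws.py",
  "aws_profile": "krita-ocr",
  "aws_region": "ap-northeast-1",
  "min_conf": 0.4,
  "pad": 2,
  "scale": 0
}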

How to Use

  1. Place pii_mask_aws under the pykrita directory and enable it in Krita's Python Plugin Manager (same as the previous article)
  2. Open an image in Krita and run Tools → Scripts → PII (AWS): Detect (email/phone)
  3. After detection completes, the rectangles of lines identified as PII are set as the selection area (see the sample JSON after this list)
    [Image: detection result]
  4. Adjust the selection as needed, then apply Krita's standard pixelation or other effects
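
For reference, the hits.json that the plugin reads back in step 3 has the shape below (the structure matches what the detector in the appendix writes; the numbers are illustrative). The detector stores the literal string "masked" instead of the recognized text, so the JSON on disk never contains the PII itself:

{
  "engine_used": "rekognition.detect_text",
  "image_w": 1920,
  "image_h": 1080,
  "timings": {"total_sec": 9.1, "aws_sec": 8.7, "post_sec": 0.01},
  "hits": [
    {
      "kind": "email",
      "text": "masked",
      "confidence": 99.2,
      "rect": {"x": 190, "y": 106, "w": 388, "h": 26}
    }
  ]
}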

Results

In my environment, the same image took about 30 seconds to process with the EasyOCR version and about 9 seconds with the Amazon Rekognition version (each measurement was taken once per image). The Rekognition version also reduced the detection misses that occurred with the EasyOCR version.

Email Addresses in Zendesk Console

  • EasyOCR
    • Detected 0 out of 5 cases
    • Processing time: 31.25s
      [Image: EasyOCR result, Zendesk console]
  • Amazon Rekognition
    • Detected 5 out of 5 cases
    • Processing time: 8.80s
      [Image: Amazon Rekognition result, Zendesk console]

Phone Numbers in Twilio Console

  • EasyOCR
    • Detected 3 out of 6 cases
    • Processing time: 28.70s
      [Image: EasyOCR result, Twilio console]
  • Amazon Rekognition
    • Detected 5 out of 6 cases
    • Processing time: 9.54s
      [Image: Amazon Rekognition result, Twilio console]

Email Addresses in Momento Console (truncated with "..." at the end)

  • EasyOCR
    • Detected 0 out of 1 case
    • Processing time: 24.42s
      [Image: EasyOCR result, Momento console]
  • Amazon Rekognition
    • Detected 1 out of 1 case
    • Processing time: 9.12s
      [Image: Amazon Rekognition result, Momento console]

Conclusion

I added an implementation using Amazon Rekognition DetectText to the Krita 6 PII detection plugin. In my CPU environment, it processed faster than the EasyOCR version and reduced the number of missed detections of small text. Since the masking process after detection is still handled by Krita's standard features, you can switch detection engines with minimal changes to your workflow.

Appendix: Sample Code

pii_mask_aws/pii_mask_aws.py

# pykrita/pii_mask_aws/pii_mask_aws.py
from __future__ import annotations

import json
import os
import subprocess
import tempfile
import time
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple

from krita import Extension, InfoObject, Krita, Selection  # type: ignore

try:
    from PyQt6.QtWidgets import QMessageBox
except Exception:  # pragma: no cover
    from PyQt5.QtWidgets import QMessageBox  # type: ignore

PLUGIN_TITLE = "PII Detector (Amazon Rekognition)"
PLUGIN_MENU_TEXT = "PII (AWS): Detect (email/phone)"

def _load_config() -> Dict[str, Any]:
    cfg_path = Path(__file__).with_name("config.json")
    if not cfg_path.exists():
        raise RuntimeError(f"config.json not found: {cfg_path}")
    return json.loads(cfg_path.read_text(encoding="utf-8"))

def _msg(title: str, text: str) -> None:
    app = Krita.instance()
    w = app.activeWindow()
    parent = w.qwindow() if w is not None else None
    QMessageBox.information(parent, title, text)

def _err(title: str, text: str) -> None:
    app = Krita.instance()
    w = app.activeWindow()
    parent = w.qwindow() if w is not None else None
    QMessageBox.critical(parent, title, text)

def _export_projection_png(doc, out_path: Path) -> None:
    # Export the projection (the image exactly as displayed)
    doc.exportImage(str(out_path), InfoObject())

def _rects_to_selection(doc_w: int, doc_h: int, rects: List[Dict[str, int]]) -> Selection:
    sel = Selection()
    sel.resize(doc_w, doc_h)
    sel.clear()

    for r in rects:
        x, y, w, h = int(r["x"]), int(r["y"]), int(r["w"]), int(r["h"])
        if w <= 0 or h <= 0:
            continue

        tmp = Selection()
        tmp.resize(doc_w, doc_h)
        tmp.clear()
        tmp.select(x, y, w, h, 255)
        sel.add(tmp)

    return sel

def _clean_env_for_external_python() -> Dict[str, str]:
    """
    Krita 同梱 Python / Qt 周りの環境変数が外部 Python を汚染しがちなので、
    Python 系と Qt 系だけ除去して、AWS_* などは温存する。
    """
    env = os.environ.copy()

    for k in list(env.keys()):
        if k.upper().startswith("PYTHON"):
            env.pop(k, None)

    for k in ["QT_PLUGIN_PATH", "QML2_IMPORT_PATH"]:
        env.pop(k, None)

    return env

def _detector_sanity_check(detector_path: Path) -> Optional[str]:
    try:
        head = detector_path.read_text(encoding="utf-8", errors="ignore")[:2000]
    except Exception:
        return None

    if "from krita import" in head or "import krita" in head:
        return (
            "Detector script looks like a Krita plugin (it imports 'krita').\n"
            "config.json 'detector' may be pointing to the wrong file.\n\n"
            f"detector: {detector_path}"
        )

    return None

def _format_timings(data: Dict[str, Any]) -> Tuple[str, str]:
    """
    Returns:
      - short line for message
      - verbose block for message
    """
    timings = data.get("timings")
    if not isinstance(timings, dict):
        return "", ""

    def _f(key: str) -> Optional[float]:
        v = timings.get(key)
        try:
            return float(v)
        except Exception:
            return None

    total = _f("total_sec")
    aws = _f("aws_sec")
    post = _f("post_sec")

    parts: List[str] = []
    if total is not None:
        parts.append(f"detector(total)={total:.2f}s")
    if aws is not None:
        parts.append(f"aws={aws:.2f}s")
    if post is not None:
        parts.append(f"post={post:.2f}s")

    if not parts:
        return "", ""

    short = " / ".join(parts)
    verbose = "\n".join([f"- {p}" for p in parts])
    return short, verbose

class PIIRekognitionExtension(Extension):
    def __init__(self, parent):
        super().__init__(parent)

    def setup(self) -> None:
        pass

    def createActions(self, window) -> None:
        a1 = window.createAction(
            "pii_detect_aws_email_phone",
            PLUGIN_MENU_TEXT,
            "tools/scripts",
        )
        a1.triggered.connect(self.detect)

    def detect(self) -> None:
        app = Krita.instance()
        doc = app.activeDocument()
        if doc is None:
            _err(PLUGIN_TITLE, "No active document.")
            return

        try:
            cfg = _load_config()
        except Exception as e:
            _err(PLUGIN_TITLE, f"Failed to load config.json:\n{e}")
            return

        py = Path(str(cfg.get("python", "")))
        detector = Path(str(cfg.get("detector", "")))

        aws_profile = str(cfg.get("aws_profile", "")).strip()
        aws_region = str(cfg.get("aws_region", "")).strip()

        min_conf = float(cfg.get("min_conf", 0.40))
        pad = int(cfg.get("pad", 2))
        scale = int(cfg.get("scale", 0))

        if not py.exists():
            _err(PLUGIN_TITLE, f"python not found:\n{py}")
            return
        if not detector.exists():
            _err(PLUGIN_TITLE, f"detector not found:\n{detector}")
            return

        bad = _detector_sanity_check(detector)
        if bad:
            _err(PLUGIN_TITLE, bad)
            return

        tmp_dir = Path(tempfile.gettempdir()) / "krita_pii_detector_aws"
        tmp_dir.mkdir(parents=True, exist_ok=True)
        in_png = tmp_dir / "input.png"
        out_json = tmp_dir / "hits.json"

        try:
            _export_projection_png(doc, in_png)
        except Exception as e:
            _err(PLUGIN_TITLE, f"Failed to export image:\n{e}")
            return

        cmd: List[str] = [
            str(py),
            "-E",
            "-s",
            str(detector),
            "--in",
            str(in_png),
            "--out",
            str(out_json),
            "--min-conf",
            str(min_conf),
            "--pad",
            str(pad),
            "--scale",
            str(scale),
        ]
        if aws_profile:
            cmd += ["--aws-profile", aws_profile]
        if aws_region:
            cmd += ["--aws-region", aws_region]

        t0 = time.perf_counter()
        try:
            p = subprocess.run(
                cmd,
                capture_output=True,
                text=True,
                check=False,
                env=_clean_env_for_external_python(),
            )
        except Exception as e:
            _err(PLUGIN_TITLE, f"Failed to run detector:\n{e}")
            return
        elapsed = time.perf_counter() - t0

        print(f"[{PLUGIN_TITLE}] detector elapsed(subprocess): {elapsed:.2f}s")
        if p.stdout.strip():
            print(f"[{PLUGIN_TITLE}] detector stdout:\n{p.stdout}")
        if p.stderr.strip():
            print(f"[{PLUGIN_TITLE}] detector stderr:\n{p.stderr}")

        if p.returncode != 0:
            _err(
                PLUGIN_TITLE,
                f"Detector failed (code={p.returncode}).\n\n"
                f"Elapsed: {elapsed:.2f}s\n\nSTDERR:\n{p.stderr}\n\nSTDOUT:\n{p.stdout}",
            )
            return

        try:
            data = json.loads(out_json.read_text(encoding="utf-8"))
            hits = data.get("hits", [])
        except Exception as e:
            _err(PLUGIN_TITLE, f"Failed to read JSON:\n{e}")
            return

        rects = [h["rect"] for h in hits if isinstance(h, dict) and "rect" in h]
        sel = _rects_to_selection(doc.width(), doc.height(), rects)
        doc.setSelection(sel)

        timing_short, timing_verbose = _format_timings(data)
        timing_line = f"\n{timing_short}" if timing_short else ""

        _msg(
            PLUGIN_TITLE,
            f"Detection done.\nHits: {len(rects)}\nElapsed: {elapsed:.2f}s{timing_line}\n\n"
            "This plugin sends the rendered image to Amazon Rekognition DetectText.\n"
            "Adjust selection manually, then apply pixelize (or other) filter using Krita built-ins.\n"
            + (f"\nTimings:\n{timing_verbose}\n" if timing_verbose else ""),
        )

Krita.instance().addExtension(PIIRekognitionExtension(Krita.instance()))

detector/pii_detect_aws.py

# detector/pii_detect_aws.py
from __future__ import annotations

import argparse
import json
import re
import sys
import time
import unicodedata
from dataclasses import dataclass
from pathlib import Path
from typing import Any, Dict, List, Optional, Tuple

# assume boto3 is in an external venv
try:
    import boto3  # type: ignore
except Exception as e:  # pragma: no cover
    boto3 = None  # type: ignore
    _BOTO3_IMPORT_ERROR = e
else:
    _BOTO3_IMPORT_ERROR = None

# Regular email
EMAIL_RE = re.compile(r"\b[A-Z0-9._%+\-]+@[A-Z0-9.\-]+\.[A-Z]{2,}\b", re.IGNORECASE)

# Truncated email examples: koshii.takumi@classmet...
# - Captures domains ending with "..."
# - Assumes whitespace/end of line/typical punctuation follows
TRUNC_EMAIL_RE = re.compile(
    r"[A-Z0-9._%+\-]+@[A-Z0-9.\-]{2,}\.{3,}(?=\s|$|[)\]】〉》>、。,.])",
    re.IGNORECASE,
)

def _nfkc(s: str) -> str:
    s = unicodedata.normalize("NFKC", s)

    # Standardize ellipsis to "..." (OCR/font differences)
    s = s.replace("…", "...")   # U+2026
    s = s.replace("⋯", "...")   # U+22EF

    # Also handle Japanese triple dots "・・・" (middle dot sequence)
    s = s.replace("・・・", "...")
    return s

def _looks_like_email(s: str) -> bool:
    t = _nfkc(s)
    if EMAIL_RE.search(t):
        return True
    if TRUNC_EMAIL_RE.search(t):
        return True
    return False

def _read_png_size(path: Path) -> Tuple[int, int]:
    """
    Read PNG width/height from IHDR without Pillow.
    """
    with path.open("rb") as f:
        sig = f.read(8)
        if sig != b"\x89PNG\r\n\x1a\n":
            raise RuntimeError("Input is not a PNG (unexpected signature).")
        # IHDR: length(4) type(4) data(13) crc(4)
        length = int.from_bytes(f.read(4), "big")
        ctype = f.read(4)
        if ctype != b"IHDR":
            raise RuntimeError("PNG IHDR chunk not found where expected.")
        ihdr = f.read(13)
        if len(ihdr) != 13:
            raise RuntimeError("PNG IHDR too short.")
        width = int.from_bytes(ihdr[0:4], "big")
        height = int.from_bytes(ihdr[4:8], "big")
        if width <= 0 or height <= 0:
            raise RuntimeError(f"Invalid PNG size: {width}x{height}")
        return width, height

def _maybe_scale_image(in_path: Path, scale_percent: int) -> Tuple[Path, int, int]:
    """
    scale_percent:
      - 0: no scaling
      - 1..100: scale to that percentage
    Returns: (path_to_send, width, height)
    """
    w, h = _read_png_size(in_path)
    if scale_percent <= 0 or scale_percent == 100:
        return in_path, w, h

    # Optional: use Pillow if available
    try:
        from PIL import Image  # type: ignore
    except Exception:
        raise RuntimeError(
            "scale is requested but Pillow is not available in the external Python.\n"
            "Install pillow, or set scale=0 in config.json."
        )

    new_w = max(1, int(round(w * (scale_percent / 100.0))))
    new_h = max(1, int(round(h * (scale_percent / 100.0))))

    out_path = in_path.with_name(in_path.stem + f"_scaled{scale_percent}.png")
    with Image.open(in_path) as im:
        im = im.convert("RGB")
        im = im.resize((new_w, new_h))
        im.save(out_path, format="PNG")

    return out_path, new_w, new_h

def _clamp(v: int, lo: int, hi: int) -> int:
    return max(lo, min(hi, v))

def _pad_rect(x: int, y: int, w: int, h: int, pad: int, img_w: int, img_h: int) -> Dict[str, int]:
    x2 = x + w
    y2 = y + h
    x = _clamp(x - pad, 0, img_w)
    y = _clamp(y - pad, 0, img_h)
    x2 = _clamp(x2 + pad, 0, img_w)
    y2 = _clamp(y2 + pad, 0, img_h)
    return {"x": x, "y": y, "w": max(0, x2 - x), "h": max(0, y2 - y)}

def _looks_like_phone(s: str) -> bool:
    s0 = _nfkc(s)
    # quick reject if too short
    if len(s0) < 6:
        return False

    digits = re.sub(r"\D", "", s0)
    if not digits:
        return False

    has_plus = "+" in s0
    # NFKC normalization already folds full-width separators to ASCII
    has_sep = any(ch in s0 for ch in ["-", " ", "(", ")", "/"])

    # E.164-ish
    if has_plus and 8 <= len(digits) <= 15:
        return True

    # Domestic-ish: avoid too-short sequences (years, prices, etc.)
    # Typical: 10-11 digits (JP mobile/landline), with separators is more likely a phone.
    if 10 <= len(digits) <= 11 and has_sep:
        return True

    # Some OCR outputs remove separators but keep "tel:" or similar markers
    if 10 <= len(digits) <= 15 and re.search(r"\b(tel|phone|fax)\b", s0, re.IGNORECASE):
        return True

    return False

@dataclass
class DetLine:
    text: str
    confidence: float
    # Rekognition bbox is normalized: left/top/width/height
    left: float
    top: float
    width: float
    height: float

def _extract_lines(rek_resp: Dict[str, Any]) -> List[DetLine]:
    out: List[DetLine] = []
    dets = rek_resp.get("TextDetections", [])
    if not isinstance(dets, list):
        return out

    for d in dets:
        if not isinstance(d, dict):
            continue
        if d.get("Type") != "LINE":
            continue
        text = d.get("DetectedText")
        conf = d.get("Confidence")
        geom = d.get("Geometry", {})
        bbox = geom.get("BoundingBox", {}) if isinstance(geom, dict) else {}

        if not isinstance(text, str):
            continue
        try:
            conf_f = float(conf)
        except Exception:
            conf_f = 0.0

        try:
            left = float(bbox.get("Left", 0.0))
            top = float(bbox.get("Top", 0.0))
            width = float(bbox.get("Width", 0.0))
            height = float(bbox.get("Height", 0.0))
        except Exception:
            continue

        # Sanity clamp normalized coordinates
        left = max(0.0, min(1.0, left))
        top = max(0.0, min(1.0, top))
        width = max(0.0, min(1.0, width))
        height = max(0.0, min(1.0, height))

        out.append(DetLine(text=text, confidence=conf_f, left=left, top=top, width=width, height=height))

    return out

def _bbox_to_rect_px(line: DetLine, img_w: int, img_h: int) -> Tuple[int, int, int, int]:
    x = int(round(line.left * img_w))
    y = int(round(line.top * img_h))
    w = int(round(line.width * img_w))
    h = int(round(line.height * img_h))

    # clamp within image
    x = _clamp(x, 0, img_w)
    y = _clamp(y, 0, img_h)
    w = _clamp(w, 0, img_w - x)
    h = _clamp(h, 0, img_h - y)
    return x, y, w, h

def _build_hits(lines: List[DetLine], img_w: int, img_h: int, min_conf: float, pad: int) -> List[Dict[str, Any]]:
    hits: List[Dict[str, Any]] = []

    for ln in lines:
        if ln.confidence < (min_conf * 100.0):  # Rekognition confidence is 0..100
            continue

        t = _nfkc(ln.text)

        kinds: List[str] = []
        if _looks_like_email(t):
            kinds.append("email")
        if _looks_like_phone(t):
            kinds.append("phone")

        if not kinds:
            continue

        x, y, w, h = _bbox_to_rect_px(ln, img_w, img_h)
        rect = _pad_rect(x, y, w, h, pad, img_w, img_h)

        # There might be cases where email and phone coexist in one line, so separate by kind in hits
        for k in kinds:
            hits.append(
                {
                    "kind": k,
                    "text": "masked",
                    "confidence": float(ln.confidence),
                    "rect": rect,
                }
            )

    return hits

def _make_session(profile: Optional[str], region: Optional[str]):
    if boto3 is None:
        raise RuntimeError(f"boto3 import failed: {_BOTO3_IMPORT_ERROR}")

    kwargs: Dict[str, Any] = {}
    if profile:
        kwargs["profile_name"] = profile

    session = boto3.session.Session(**kwargs)
    client_kwargs: Dict[str, Any] = {}
    if region:
        client_kwargs["region_name"] = region

    return session, session.client("rekognition", **client_kwargs)

def _main(argv: Optional[List[str]] = None) -> int:
    ap = argparse.ArgumentParser(description="PII detector using Amazon Rekognition DetectText (LINE-based).")
    ap.add_argument("--in", dest="in_path", required=True, help="Input image path (PNG recommended).")
    ap.add_argument("--out", dest="out_path", required=True, help="Output JSON path.")
    ap.add_argument("--min-conf", dest="min_conf", type=float, default=0.40, help="Min confidence (0..1).")
    ap.add_argument("--pad", dest="pad", type=int, default=2, help="Padding in pixels.")
    ap.add_argument(
        "--scale",
        dest="scale",
        type=int,
        default=0,
        help="Scale percent for OCR (0=no scaling, 1..100). Recommended 0 or 100.",
    )
    ap.add_argument("--aws-profile", dest="aws_profile", default="", help="AWS profile name (optional).")
    ap.add_argument("--aws-region", dest="aws_region", default="", help="AWS region (optional).")

    args = ap.parse_args(argv)

    in_path = Path(args.in_path)
    out_path = Path(args.out_path)

    if not in_path.exists():
        raise RuntimeError(f"Input not found: {in_path}")

    min_conf = float(args.min_conf)
    if not (0.0 <= min_conf <= 1.0):
        raise RuntimeError("--min-conf must be in range 0..1")

    scale = int(args.scale)
    if scale < 0 or scale > 100:
        raise RuntimeError("--scale must be 0..100")

    aws_profile = args.aws_profile.strip() or None
    aws_region = args.aws_region.strip() or None

    t0 = time.perf_counter()

    # Optional scaling
    scaled_path, img_w, img_h = _maybe_scale_image(in_path, scale)
    img_bytes = scaled_path.read_bytes()

    # AWS call
    t_aws0 = time.perf_counter()
    _, client = _make_session(aws_profile, aws_region)
    rek = client.detect_text(Image={"Bytes": img_bytes})
    t_aws1 = time.perf_counter()

    # Postprocess
    t_post0 = time.perf_counter()
    lines = _extract_lines(rek)
    hits = _build_hits(lines, img_w, img_h, min_conf=min_conf, pad=int(args.pad))
    t_post1 = time.perf_counter()

    t1 = time.perf_counter()

    data: Dict[str, Any] = {
        "engine_used": "rekognition.detect_text",
        "image_w": img_w,
        "image_h": img_h,
        "timings": {
            "total_sec": float(t1 - t0),
            "aws_sec": float(t_aws1 - t_aws0),
            "post_sec": float(t_post1 - t_post0),
        },
        "hits": hits,
    }

    out_path.parent.mkdir(parents=True, exist_ok=True)
    out_path.write_text(json.dumps(data, ensure_ascii=False, indent=2), encoding="utf-8")
    return 0

if __name__ == "__main__":
    try:
        raise SystemExit(_main())
    except Exception as e:
        # Output errors clearly so they can be displayed in Krita
        print(f"[pii_detect_aws] ERROR: {e}", file=sys.stderr)
        raise
