Trying xdoctest: writing test code in docstrings the easy way

2025.11.17

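Introduction

This is kobayashi from the Data Business Division.

When writing Python code, it is common to include sample code in documentation strings (docstrings). However, checking that the sample code actually runs takes effort.
Python's standard library provides the doctest module for this, but it has several limitations that make it awkward to use.

So this time I tried xdoctest, a library that enhances doctest. xdoctest aims to remove those limitations and make documentation tests more flexible and easier to write.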

https://github.com/Erotemic/xdoctest

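What is xdoctest?

xdoctest is a rewrite of Python's standard doctest module. It is a tool for running the test code embedded in docstrings, and it adds many improvements and new features while staying largely compatible with doctest.
Its main features are as follows.

An AST (abstract syntax tree) based parser for more accurate parsing
Every line can start with >>> (no need to mix in ...)
Multi-line strings need no prefix
Tests run, and directives apply, at block level
Support for Google-style and NumPy-style docstrings
pytest integration, so it works with an existing test framework
Support for top-level asynchronous code
Colorful, easy-to-read output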

Trying xdoctest

Environment

The environment used for this article is as follows.

Python 3.12.8
xdoctest 1.2.0
pytest 8.3.4

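Installation

It can be installed with pip. To enable colored output, add the [all] extra. Quoting the package name keeps shells such as zsh from treating the brackets as a glob pattern.

$ pip install xdoctest
# To enable colored output
$ pip install "xdoctest[all]"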

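Basic usage

First, let's look at the differences between standard doctest and xdoctest.

Writing tests for standard doctest

Standard doctest requires you to follow interactive-shell notation strictly.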

standard_doctest.py

def fibonacci(n):
    """
    Compute the n-th Fibonacci number.

    Standard doctest example:
        >>> fibonacci(0)
        0
        >>> fibonacci(1)
        1
        >>> fibonacci(10)
        55
        >>> for i in range(5):
        ...     print(fibonacci(i))
        0
        1
        1
        2
        3
    """
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

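In standard doctest, the continuation lines of a multi-line statement (such as the body of the for loop above) must use the ... prefix, which is a little cumbersome.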
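To see why the ... prefix matters to standard doctest, the following minimal sketch feeds both spellings of the same loop to the standard library's doctest.DocTestParser. The variable names are made up for this illustration, and only the stdlib parser is exercised here, not xdoctest.

```python
import doctest

# A for loop written with the "..." continuation prefix (standard doctest style).
with_continuation = (
    ">>> for i in range(3):\n"
    "...     print(i)\n"
    "0\n"
    "1\n"
    "2\n"
)

# The same loop with ">>>" on every line (the style xdoctest accepts).
all_prompts = (
    ">>> for i in range(3):\n"
    ">>>     print(i)\n"
    "0\n"
    "1\n"
    "2\n"
)

parser = doctest.DocTestParser()
standard_examples = parser.get_examples(with_continuation)
split_examples = parser.get_examples(all_prompts)

# The stdlib parser keeps the loop together in the first spelling ...
print(len(standard_examples), repr(standard_examples[0].source))
# ... but treats every ">>>" line as its own example in the second one,
# so the "for" line on its own would be incomplete code.
print(len(split_examples), [example.source for example in split_examples])
```

In other words, standard doctest relies on the prefix itself to decide where a statement ends, which is why the second spelling cannot work there.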

Writing tests for xdoctest

With xdoctest, every line can start with >>>.

xdoctest_example.py

def fibonacci(n):
    """
    Compute the n-th Fibonacci number.

    Example:
        >>> fibonacci(0)
        0
        >>> fibonacci(1)
        1
        >>> fibonacci(10)
        55
        >>> for i in range(5):
        >>>     print(fibonacci(i))
        0
        1
        1
        2
        3
    """
    if n <= 1:
        return n
    return fibonacci(n - 1) + fibonacci(n - 2)

def calculate_sum(numbers):
    """
    Compute the sum of a list of numbers.

    Example:
        >>> numbers = [1, 2, 3, 4, 5]
        >>> result = calculate_sum(numbers)
        >>> print(f"合計: {result}")
        合計: 15
        >>> calculate_sum([])
        0
        >>> calculate_sum([10, 20, 30])
        60
    """
    return sum(numbers)

def format_data(data):
    """
    Format the data for output.

    Multi-line strings can also be written simply.

    Example:
        >>> data = {"name": "Alice", "age": 30, "city": "Tokyo"}
        >>> result = format_data(data)
        >>> print(result)
        Name: Alice
        Age: 30
        City: Tokyo
    """
    return f"Name: {data['name']}\nAge: {data['age']}\nCity: {data['city']}"

Because xdoctest does not need the ... prefix, the code is easier to read.
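As a rough intuition for how an AST-based parser can do without the ... prefix, here is a simplified sketch that uses the standard library's ast module to find where each statement starts and ends. The helper statement_spans is hypothetical and written only for this article; xdoctest's real parser is more involved, so please read this as an illustration of the idea rather than its implementation.

```python
import ast


def statement_spans(prefixed_source: str) -> list[tuple[int, int]]:
    """Return (first_line, last_line) for each top-level statement.

    Hypothetical helper for illustration; not part of xdoctest.
    """
    # Drop the 4-character ">>> " prompt from every line.
    code_lines = [line[4:] for line in prefixed_source.splitlines()]
    tree = ast.parse("\n".join(code_lines))
    # Each top-level AST node records the lines it covers.
    return [(node.lineno, node.end_lineno) for node in tree.body]


doctest_lines = (
    ">>> x = 1\n"
    ">>> for i in range(3):\n"
    ">>>     print(i + x)\n"
    ">>> y = 2\n"
)
spans = statement_spans(doctest_lines)
print(spans)
```

The for loop is reported as a single statement covering lines 2 and 3, so the grouping comes from Python's own grammar and not from the prompt characters.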

Running the tests

xdoctest can be run in several ways.

Native execution

# List all tests in the module
$ xdoctest xdoctest_example.py list

=====================================
_  _ ___  ____ ____ ___ ____ ____ ___
 \/  |  \ |  | |     |  |___ [__   |
_/\_ |__/ |__| |___  |  |___ ___]  |

=====================================

Start doctest_module('xdoctest_example.py')
Listing tests
    python -m xdoctest xdoctest_example.py fibonacci:0
    python -m xdoctest xdoctest_example.py calculate_sum:0
    python -m xdoctest xdoctest_example.py format_data:0

Now let's run the tests.

$ xdoctest xdoctest_example.py all

=====================================
_  _ ___  ____ ____ ___ ____ ____ ___
 \/  |  \ |  | |     |  |___ [__   |
_/\_ |__/ |__| |___  |  |___ ___]  |

=====================================

Start doctest_module('xdoctest_example.py')
Listing tests
gathering tests
running 3 test(s)
====== <exec> ======
* DOCTEST : xdoctest_example.py::fibonacci:0, line 6 <- wrt source file
DOCTEST SOURCE
 1 >>> fibonacci(0)
   0
 3 >>> fibonacci(1)
   1
 5 >>> fibonacci(10)
   55
 7 >>> for i in range(5):
 8 >>>     print(fibonacci(i))
   0
   1
   1
   2
   3
DOCTEST STDOUT/STDERR
0
1
1
2
3
DOCTEST RESULT
* SUCCESS: xdoctest_example.py::fibonacci:0
====== </exec> ======
====== <exec> ======
* DOCTEST : xdoctest_example.py::calculate_sum:0, line 30 <- wrt source file
DOCTEST SOURCE
1 >>> numbers = [1, 2, 3, 4, 5]
2 >>> result = calculate_sum(numbers)
3 >>> print(f"合計: {result}")
  合計: 15
5 >>> calculate_sum([])
  0
7 >>> calculate_sum([10, 20, 30])
  60
DOCTEST STDOUT/STDERR
合計: 15
DOCTEST RESULT
* SUCCESS: xdoctest_example.py::calculate_sum:0
====== </exec> ======
====== <exec> ======
* DOCTEST : xdoctest_example.py::format_data:0, line 49 <- wrt source file
DOCTEST SOURCE
1 >>> data = {"name": "Alice", "age": 30, "city": "Tokyo"}
2 >>> result = format_data(data)
3 >>> print(result)
  Name: Alice
  Age: 30
  City: Tokyo
DOCTEST STDOUT/STDERR
Name: Alice
Age: 30
City: Tokyo
DOCTEST RESULT
* SUCCESS: xdoctest_example.py::format_data:0
====== </exec> ======
============
Finished doctests
3 / 3 passed
=== 3 passed in 0.03 seconds ===

Instead of running everything, you can also run individual tests.

# Test a specific function only
$ xdoctest xdoctest_example.py fibonacci

Integration with pytest

It integrates easily into an existing pytest test suite.

$ pytest --xdoctest xdoctest_example.py
============================= test session starts ==============================
platform darwin -- Python 3.12.8, pytest-8.3.4, pluggy-1.5.0
rootdir: /tmp/pytest-xdoctest
plugins: xdoctest-1.2.0
collected 3 items

xdoctest_example.py::xdoctest_example.fibonacci PASSED                   [ 33%]
xdoctest_example.py::xdoctest_example.calculate_sum PASSED               [ 66%]
xdoctest_example.py::xdoctest_example.format_data PASSED                 [100%]

============================== 3 passed in 0.03s ===============================

Adding the setting to pytest.ini or pyproject.toml enables xdoctest on every pytest run.

pytest.ini

[pytest]
addopts = --xdoctest

pyproject.toml

[tool.pytest.ini_options]
addopts = ["--xdoctest"]

Support for Google-style docstrings

xdoctest supports Google-style and NumPy-style docstrings.

google_style_example.py

def divide(a: float, b: float) -> float:
    """Divide one number by another.

    Args:
        a: Dividend
        b: Divisor

    Returns:
        The quotient

    Raises:
        ZeroDivisionError: If b is 0

    Example:
        >>> divide(10, 2)
        5.0
        >>> divide(7, 2)
        3.5
        >>> try:
        >>>     divide(10, 0)
        >>> except ZeroDivisionError as e:
        >>>     print(f"Error: {e}")
        Error: division by zero
    """
    if b == 0:
        raise ZeroDivisionError("division by zero")
    return a / b

class DataProcessor:
    """A class that processes data.

    Attributes:
        data: The data to process

    Example:
        >>> processor = DataProcessor([1, 2, 3, 4, 5])
        >>> processor.get_mean()
        3.0
        >>> processor.get_max()
        5
        >>> processor.filter_above(3)
        [4, 5]
    """

    def __init__(self, data):
        self.data = data

    def get_mean(self):
        """Compute the mean."""
        return sum(self.data) / len(self.data)

    def get_max(self):
        """Return the maximum value."""
        return max(self.data)

    def filter_above(self, threshold):
        """Keep only the values greater than the threshold."""
        return [x for x in self.data if x > threshold]

$ xdoctest google_style_example.py all

=====================================
_  _ ___  ____ ____ ___ ____ ____ ___
 \/  |  \ |  | |     |  |___ [__   |
_/\_ |__/ |__| |___  |  |___ ___]  |

=====================================

Start doctest_module('google_style_example.py')
Listing tests
gathering tests
running 2 test(s)
====== <exec> ======
* DOCTEST : google_style_example.py::divide:0, line 15 <- wrt source file
DOCTEST SOURCE
1 >>> divide(10, 2)
  5.0
3 >>> divide(7, 2)
  3.5
5 >>> try:
6 >>>     divide(10, 0)
7 >>> except ZeroDivisionError as e:
8 >>>     print(f"Error: {e}")
  Error: division by zero
DOCTEST STDOUT/STDERR
Error: division by zero
DOCTEST RESULT
* SUCCESS: google_style_example.py::divide:0
====== </exec> ======
====== <exec> ======
* DOCTEST : google_style_example.py::DataProcessor:0, line 37 <- wrt source file
DOCTEST SOURCE
1 >>> processor = DataProcessor([1, 2, 3, 4, 5])
2 >>> processor.get_mean()
  3.0
4 >>> processor.get_max()
  5
6 >>> processor.filter_above(3)
  [4, 5]
DOCTEST STDOUT/STDERR
DOCTEST RESULT
* SUCCESS: google_style_example.py::DataProcessor:0
====== </exec> ======
============
Finished doctests
2 / 2 passed
=== 2 passed in 0.03 seconds ===
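The divide example above checks the error with try/except and print. Another common way to write this in doctest is to put the expected traceback in the docstring. The sketch below runs that form with the standard library's doctest machinery only; since xdoctest is described as compatible with doctest, I would expect the same docstring to work there too, but that part is an assumption and is not verified here.

```python
import doctest


def divide(a: float, b: float) -> float:
    """Divide a by b.

    Example:
        >>> divide(10, 2)
        5.0
        >>> divide(10, 0)
        Traceback (most recent call last):
            ...
        ZeroDivisionError: division by zero
    """
    if b == 0:
        raise ZeroDivisionError("division by zero")
    return a / b


# Build a DocTest from the docstring and run it with the stdlib runner.
parser = doctest.DocTestParser()
test = parser.get_doctest(divide.__doc__, {"divide": divide}, "divide", None, 0)
runner = doctest.DocTestRunner(verbose=False)
results = runner.run(test)
print(results.failed, results.attempted)
```

The traceback form keeps the example close to what a reader would see in an interactive session, while the try/except form used above makes the checked message explicit.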

Using xdoctest directives

xdoctest supports the standard doctest directives and can additionally apply them at block level.

directive_example.py

import random

def random_number():
    """
    Generate a random number.

    Example:
        >>> # doctest: +SKIP
        >>> # This test is skipped
        >>> random_number()
        42
    """
    return random.randint(1, 100)

def show_platform_info():
    """
    Return platform information.

    Example:
        >>> import platform
        >>> info = show_platform_info()
        >>> # The output depends on the environment, so only validate the value
        >>> assert isinstance(info, str)
        >>> assert len(info) > 0
    """
    import platform
    return platform.platform()

def floating_point_calculation():
    """
    Perform a floating-point calculation.

    Example:
        >>> # Allow for floating-point error
        >>> result = 0.1 + 0.2
        >>> abs(result - 0.3) < 0.0001
        True
    """
    return 0.1 + 0.2

$ xdoctest directive_example.py all

=====================================
_  _ ___  ____ ____ ___ ____ ____ ___
 \/  |  \ |  | |     |  |___ [__   |
_/\_ |__/ |__| |___  |  |___ ___]  |

=====================================

Start doctest_module('directive_example.py')
Listing tests
gathering tests
running 3 test(s)
====== <exec> ======
* DOCTEST : directive_example.py::random_number:0, line 9 <- wrt source file
DOCTEST SOURCE
1 >>> # doctest: +SKIP
2 >>> # This test is skipped
3 >>> random_number()
  42
DOCTEST STDOUT/STDERR
DOCTEST RESULT
* SKIPPED: directive_example.py::random_number:0
====== </exec> ======
====== <exec> ======
* DOCTEST : directive_example.py::show_platform_info:0, line 22 <- wrt source file
DOCTEST SOURCE
1 >>> import platform
2 >>> info = show_platform_info()
3 >>> # The output depends on the environment, so only validate the value
4 >>> assert isinstance(info, str)
5 >>> assert len(info) > 0
DOCTEST STDOUT/STDERR
DOCTEST RESULT
* SUCCESS: directive_example.py::show_platform_info:0
====== </exec> ======
====== <exec> ======
* DOCTEST : directive_example.py::floating_point_calculation:0, line 37 <- wrt source file
DOCTEST SOURCE
1 >>> # Allow for floating-point error
2 >>> result = 0.1 + 0.2
3 >>> abs(result - 0.3) < 0.0001
  True
DOCTEST STDOUT/STDERR
DOCTEST RESULT
* SUCCESS: directive_example.py::floating_point_calculation:0
====== </exec> ======
============
Finished doctests
2 / 3 passed
=== 2 passed, 1 skipped in 0.07 seconds ===

The random_number test was skipped.
The main directives are as follows.

+SKIP: skip the test
+ELLIPSIS: allow partial output matching with ...
+NORMALIZE_WHITESPACE: ignore differences in whitespace
+IGNORE_EXCEPTION_DETAIL: ignore exception details
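To make the matching rules of +ELLIPSIS and +NORMALIZE_WHITESPACE concrete, here is a small sketch that uses the standard library's doctest.OutputChecker, where directives with the same names originate. It shows the stdlib semantics only; I am assuming xdoctest's directives of the same name behave in a comparable way, and its defaults may differ.

```python
import doctest

checker = doctest.OutputChecker()

got = "[0, 1, 2, 3, 4, 5, 6, 7, 8, 9]\n"
want_with_ellipsis = "[0, 1, ..., 9]\n"

# Without the flag, "..." is compared literally and the match fails.
plain_match = checker.check_output(want_with_ellipsis, got, 0)
# With ELLIPSIS, "..." matches any substring of the actual output.
ellipsis_match = checker.check_output(want_with_ellipsis, got, doctest.ELLIPSIS)

# NORMALIZE_WHITESPACE treats any run of whitespace as a single space.
spaced_want = "a   b\nc\n"
spaced_got = "a b c\n"
whitespace_match = checker.check_output(
    spaced_want, spaced_got, doctest.NORMALIZE_WHITESPACE
)

print(plain_match, ellipsis_match, whitespace_match)
```

These flags are handy when part of the output, such as a memory address or a long list, is not the point of the example.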

Command-line options

xdoctest provides a rich set of command-line options.

# Verbose output
$ xdoctest xdoctest_example.py all --verbose=2

# Show test durations
$ xdoctest xdoctest_example.py all --durations=10

# Parse Google style only
$ xdoctest xdoctest_example.py all --style=google

# Enable colored output
$ xdoctest xdoctest_example.py all --colored=True

# Show differences in an easy-to-read format
$ xdoctest xdoctest_example.py all --report=udiff

Main options:

Option	Description
`--style {auto,google,freeform}`	Parsing mode
`--verbose {0,1,2,3}`	Verbosity level (0 = silent, 3 = show source)
`--colored`	Enable ANSI coloring
`--durations N`	Show the N slowest tests
`--report {udiff,ndiff,cdiff}`	Diff output format

Practical example

Testing data-processing functions

data_processing.py

from typing import List, Dict, Any

def aggregate_sales(sales_data: List[Dict[str, Any]]) -> Dict[str, int]:
    """
    Aggregate sales data.

    Args:
        sales_data: List of sales records

    Returns:
        Total sales per region

    Example:
        >>> sales = [
        >>>     {"region": "Tokyo", "amount": 10000},
        >>>     {"region": "Osaka", "amount": 8000},
        >>>     {"region": "Tokyo", "amount": 12000},
        >>>     {"region": "Fukuoka", "amount": 5000},
        >>>     {"region": "Osaka", "amount": 7000},
        >>> ]
        >>> result = aggregate_sales(sales)
        >>> result["Tokyo"]
        22000
        >>> result["Osaka"]
        15000
        >>> result["Fukuoka"]
        5000
        >>>
        >>> # Empty data
        >>> aggregate_sales([])
        {}
    """
    from collections import defaultdict

    aggregated = defaultdict(int)
    for sale in sales_data:
        aggregated[sale["region"]] += sale["amount"]

    return dict(aggregated)

def calculate_statistics(values: List[float]) -> Dict[str, float]:
    """
    Compute statistics for a list of numbers.

    Args:
        values: List of numbers

    Returns:
        A dict containing the mean, median, maximum, and minimum

    Example:
        >>> data = [11, 22, 33, 44, 55, 66]
        >>> stats = calculate_statistics(data)
        >>> stats["mean"]
        38.5
        >>> stats["median"]
        38.5
        >>> stats["max"]
        66
        >>> stats["min"]
        11
        >>>
        >>> # Single element
        >>> stats = calculate_statistics([42])
        >>> stats["mean"]
        42
        >>> stats["median"]
        42
    """
    import statistics

    return {
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "max": max(values),
        "min": min(values),
    }

$ pytest --xdoctest data_processing.py -v
============================= test session starts ==============================
platform darwin -- Python 3.12.8, pytest-8.3.4, pluggy-1.5.0
rootdir: /tmp/pytest-xdoctest
plugins: xdoctest-1.2.0
collected 2 items

data_processing.py::data_processing.aggregate_sales PASSED               [ 50%]
data_processing.py::data_processing.calculate_statistics PASSED          [100%]

============================== 2 passed in 0.04s ===============================

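Summary

xdoctest is a library that removes limitations of standard doctest and makes documentation tests more flexible and easier to use. I found the following points especially appealing, including when it is used as a pytest plugin.

The ... prefix is unnecessary, so code reads more naturally
Built-in support for Google-style and NumPy-style docstrings
Easy integration with pytest, so it fits into existing test suites
The AST-based parser allows more accurate test parsing
Support for asynchronous code
Colorful, easy-to-read output

When you put sample code in docstrings, it is important to guarantee that it actually works. With xdoctest you can maintain documentation quality while also improving test coverage.
Integration with pytest also means documentation tests can be added without changing the existing test framework, which is a big advantage.
I think it is a great fit for anyone who wants to manage documentation and tests together, or who wants to guarantee that sample code works.

Thank you for reading to the end.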