I tried translating an entire Excel file with the Google Cloud Translation API

I tried translating an Excel file as-is into English using Google Cloud's Document Translation feature. I will share the detailed results of my verification of the translation output, including formula preservation and proper noun consistency through glossary use.
2026.06.03

Hi, I'm Kema.

There are times when you want to translate a summary table or list created in Excel into another language while preserving the layout and formulas.
A typical requirement is to convert a product comparison table or performance data into English without breaking the appearance or calculations.
However, if you extract only the cell text for translation, formulas disappear or the sheet structure falls apart.
On top of that, unique proper nouns such as product names or character names often need to be consistently translated the same way every time.

Previously, I wrote an article about translating PDFs using the Document Translation feature of Google Cloud's Cloud Translation API.
This feature supports not only PDFs but also Word, Excel, and PowerPoint.
So in this article, I used the same feature to translate an Excel (XLSX) file as-is. By actually running it, I verified how formulas, charts, and sheet tab names are handled, and whether a glossary can lock in custom terminology.

This article is the Excel edition of a series verifying document translation by format.
Specifications and behavior were confirmed against Google Cloud's official documentation, with relevant sections quoted.
I then verified whether the behavior matched the official descriptions by actually running the process.

Series Articles

| Format | Article |
|---|---|
| PDF Edition | Translate PDFs with Layout Preserved Using Cloud Translation API |
| Word Edition | Translate Word Documents with Layout Preserved Using Cloud Translation API |
| Excel Edition | (this article) |

Target audience: Those considering automating the translation of Excel documents

1. Conclusion: Official Documentation vs. Verified Results

For those who want just the conclusion quickly, here is a summary of what Google Cloud's official documentation says about translating Excel (XLSX), along with what I confirmed by actually translating.
Detailed steps and before/after images are in §3 and beyond.

| Aspect | Official Documentation | Verified Result (confirmed in this article) |
|---|---|---|
| Body text, data, sheet tab names | Translates while preserving format and layout | Translated; sheet tab names were also converted to English |
| Quota unit | XLSX and XLS use character quotas, not page quotas | As stated in the official docs, handled by character count, not page count |
| Formulas | (Not explicitly stated) | Fully preserved and recalculated correctly after translation. String literals inside formulas (e.g., "高評価" in an IF statement) are not translated; immediately after translation the cell displays English, but reverts to Japanese upon recalculation |
| Charts | (Not explicitly stated. PDF docs mention "charts may break") | Titles, axes, and legends remain in Japanese. Charts created with Excel's chart feature may lose their data series |
| Images | (Not explicitly stated) | Text inside images is not text data, so it remains untranslated |
| Text boxes (shapes) | (Word docs explicitly state "not translated") | Same as Word: remained in Japanese |
| Glossary (locking custom terms) | Translations can be locked using a glossary | Unified to registered translations across all sheets |

Two results are specific to Excel: formulas are preserved and recalculated, and charts are not translated (and may even lose data).
Also, anything pasted as an image is not text data and therefore outside the scope of translation.
If you're in a hurry, the table above and the images in §4 should give you a good overview.

2. What Is Document Translation in Cloud Translation API

Cloud Translation API's Document Translation is a feature that accepts a file as-is and returns it translated while preserving formatting and layout.
It supports DOCX, XLSX, PPTX, and PDF.
The feature overview, authentication (API keys cannot be used; ADC or a service account is required), and how to create a glossary were covered in previous articles.

Quoted from: DevelopersIO: Translate PDFs with Layout Preserved Using Cloud Translation API

Excel (XLSX) has one difference from other formats.
PDF and DOCX have the concept of pages, and the document translation quota is based on pages, but XLSX and XLS have no concept of pages, so only the character quota applies.

Note: For Document Translation, Cloud Translation also checks that the number characters don't exceed your character quotas. For XLSX and XLS files, only the character quotas apply (not the page quotas).

Quoted from: Official documentation: Quotas and limits | Google Cloud

3. Preparing for Verification

3.1 Prerequisites

The prerequisites are the same as the Word edition (macOS, Python 3.12 venv, google-cloud-translate 3.26.0, a personal project with billing enabled).
The steps for enabling the API, ADC authentication, and installing libraries into the venv are also shared with the Word and PDF editions.

3.2 Sample Document (Original Fictional Anime)

For verification, I had Claude create an Excel workbook as setting materials for a fictional anime, "Hoshirei Monogatari: Lumina Chronicle."
It uses the same fictional world as the PDF and Word editions, with matching proper nouns (coined terms).
The content is completely original and has no relation to any real works, persons, or organizations.

The workbook consists of four sheets, packed with elements specific to Excel verification.

  • Sheet 1 "Hoshirei Codex": Data table, colored text, conditional formatting (color scale), formulas for sum, average, max, and rank (RANK)
  • Sheet 2 "Popularity Ranking": Ratio formulas, bar chart (two types: a native Excel chart and one pasted as an image)
  • Sheet 3 "Episode List": Formulas using IF functions to differentiate ratings, average viewership formula
  • Sheet 4 "Glossary": List of coined terms, text in a text box (shape)

The Excel file translated this time has a full four-sheet structure.
Here are all sheets before translation.

Excel Sheet 1 before translation (Hoshirei Codex, Japanese)
Excel Sheet 1 before translation, "Hoshirei Codex": data table, colored text, conditional formatting (color scale), and formulas for sum, average, max, and rank (RANK)

Excel Sheet 2 before translation (Popularity Ranking, Japanese)
Excel Sheet 2 before translation, "Popularity Ranking": ratio formulas, a native Excel bar chart, and a bar chart pasted as an image

Excel Sheet 3 before translation (Episode List, Japanese)
Excel Sheet 3 before translation, "Episode List": formulas using IF functions to differentiate ratings, and an average viewership formula

Excel Sheet 4 before translation (Glossary, Japanese)
Excel Sheet 4 before translation, "Glossary": list of coined terms, and a text box (shape) with a yellow border

4. Translating an Excel (XLSX) File

The script used for translation is the same one from the Word edition. It determines the MIME type from the file extension, so simply using an XLSX file as input is all that's needed.

Full content of translate_document_handson.py:
from __future__ import annotations

import argparse
import time
from pathlib import Path

from google.cloud import translate_v3 as translate

# Glossaries and custom models must be placed in us-central1.
DEFAULT_LOCATION = "us-central1"

# Extension → MIME type (Document Translation supported formats)
MIME_BY_SUFFIX = {
    ".pdf": "application/pdf",
    ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
    ".pptx": "application/vnd.openxmlformats-officedocument.presentationml.presentation",
    ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
}

def mime_for(path: str) -> str:
    """Returns the MIME type based on the input file extension. Raises ValueError for unsupported extensions."""
    suffix = Path(path).suffix.lower()
    if suffix not in MIME_BY_SUFFIX:
        raise ValueError(f"Unsupported extension: {suffix} (supported: {', '.join(MIME_BY_SUFFIX)})")
    return MIME_BY_SUFFIX[suffix]

def parse_args() -> argparse.Namespace:
    p = argparse.ArgumentParser(description="Cloud Translation API document translation (synchronous)")
    p.add_argument("--project", required=True, help="GCP project ID")
    p.add_argument("--input", required=True, help="Input file path")
    p.add_argument("--output", required=True, help="Output file path (result without glossary)")
    p.add_argument("--source", default="ja", help="Source language code (default: ja)")
    p.add_argument("--target", default="en", help="Target language code (default: en)")
    p.add_argument("--location", default=DEFAULT_LOCATION, help=f"Location (default: {DEFAULT_LOCATION})")
    p.add_argument("--glossary-id", default=None, help="Glossary ID (when specified, outputs both with and without glossary)")
    return p.parse_args()

def build_request(args: argparse.Namespace, content: bytes) -> dict:
    """Builds the request dictionary for translate_document.

    Specifying the source language is required when using a glossary (per official spec).
    """
    parent = f"projects/{args.project}/locations/{args.location}"
    mime_type = mime_for(args.input)
    request: dict = {
        "parent": parent,
        "source_language_code": args.source,
        "target_language_code": args.target,
        "document_input_config": {"content": content, "mime_type": mime_type},
    }
    if args.glossary_id:
        glossary_path = (
            f"projects/{args.project}/locations/{args.location}"
            f"/glossaries/{args.glossary_id}"
        )
        request["glossary_config"] = translate.TranslateTextGlossaryConfig(glossary=glossary_path)
    return request

def write_bytes(path: str, data: bytes) -> None:
    Path(path).parent.mkdir(parents=True, exist_ok=True)
    Path(path).write_bytes(data)

def main() -> None:
    args = parse_args()
    content = Path(args.input).read_bytes()
    print(f"Input: {args.input} ({len(content):,} bytes, {args.source}→{args.target})")

    client = translate.TranslationServiceClient()
    request = build_request(args, content)

    started = time.perf_counter()
    response = client.translate_document(request=request)
    elapsed = time.perf_counter() - started

    base = response.document_translation
    write_bytes(args.output, base.byte_stream_outputs[0])
    print(f"Processing time: {elapsed:.2f} seconds")
    print(f"Output (without glossary): {args.output} ({len(base.byte_stream_outputs[0]):,} bytes)")

    # With glossary: a single call returns a separate output in glossary_document_translation
    if args.glossary_id and response.glossary_document_translation.byte_stream_outputs:
        out = Path(args.output)
        glossary_out = str(out.with_name(f"{out.stem}_glossary{out.suffix}"))
        write_bytes(glossary_out, response.glossary_document_translation.byte_stream_outputs[0])
        print(f"Output (with glossary): {glossary_out}")

if __name__ == "__main__":
    main()

First, translate without a glossary.

python translate_document_handson.py \
    --project <YOUR_PROJECT_ID> \
    --input hoshirei_ja.xlsx \
    --output hoshirei_en.xlsx
# Example output
Input: hoshirei_ja.xlsx (105,812 bytes, ja→en)
Processing time: 1.09 seconds
Output (without glossary): hoshirei_en.xlsx (100,967 bytes)

In the translated workbook, the headings, data, and sheet titles were all in English, and even the sheet tab names were translated.

4.1 Formulas Are Preserved, but String Literals Inside Formulas Are Not Translated

The behavior of formulas was the aspect I most wanted to verify in Excel.
Opening the translated file and checking the formulas, I found that all of them — SUM, AVERAGE, MAX, RANK, ratio, IF, and others — were preserved as formulas and recalculated correctly after translation.
Rather than replacing only the cell values, the formula structure itself is preserved.

Comparing Sheet 1 "Hoshirei Codex" before and after translation:
The results of the sum (6935), average (1387), max (2050), and rank (RANK) formulas are displayed correctly, and the conditional formatting color scale is also maintained.

Before/after comparison (Excel Sheet 1 - Hoshirei Codex)
Sheet 1: left is before translation (Japanese), right is after. Headings and data are translated into English, and the sum, average, max, and rank formulas are preserved and recalculated. Conditional formatting is also maintained.

Sheet 3 "Episode List" similarly preserved the IF function-based rating differentiation and the average viewership formula.

Before/after comparison (Excel Sheet 3 - Episode List)
Sheet 3: left is before translation, right is after. Episode subtitles are translated into English. The rating column (IF function) also displays in English as Good / Highly rated after translation, but this is the cached display value — the formula itself remains in Japanese (described below).

On the other hand, the handling of string literals written directly inside formulas showed a somewhat unusual behavior.
Sheet 3's rating column contains the formula =IF(D3>=7.5,"高評価","好調").
Opening the translated file, the rating column displays "Good" and "Highly rated" in English, which at first glance looks like it was translated.
However, selecting the cell and checking the formula bar reveals that the formula itself is still =IF(D3>=7.5,"高評価","好調"), and the strings "高評価" and "好調" have not been translated.

Translated Excel Sheet 3: cells show English but formula is in Japanese
After translation: the rating column cells display "Good" in English, but looking at the formula bar, the formula is still =IF(D3>=7.5,"高評価","好調") with the string literals remaining in Japanese.

The English display is because the cached display value stored in the cell was translated.
When the cell is actually recalculated, the formula returns the Japanese "高評価" and "好調", causing the display to revert to Japanese.

Excel Sheet 3 after recalculation: rating column reverts to Japanese
After recalculation: the formula returns "好調" and "高評価", causing the rating column to revert to Japanese.

In other words, translation only applies to the cell's display value (cache), not to the formula content itself.
Since recalculation reverts the display to the Japanese defined in the formula, if you need rating strings or similar conditional text to be reliably translated, you should avoid hardcoding string literals directly in formulas — instead, consider referencing a lookup table or similar approach.
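
As a rough aid for finding such cells before translation, the following is a minimal sketch that pulls double-quoted string literals out of a formula and flags the ones containing Japanese characters. The function name and the character ranges are my own assumptions for illustration; it only inspects a formula string and does not read a workbook by itself.

```python
import re

# Excel string literals are double-quoted; a doubled quote ("") escapes a quote.
_STRING_LITERAL = re.compile(r'"((?:[^"]|"")*)"')
# Hiragana, Katakana, and common CJK ideographs (a rough check, not exhaustive).
_JAPANESE = re.compile(r"[\u3040-\u30ff\u4e00-\u9fff]")


def japanese_literals(formula: str) -> list[str]:
    """Return the string literals in an Excel formula that contain Japanese text."""
    literals = [
        m.group(1).replace('""', '"') for m in _STRING_LITERAL.finditer(formula)
    ]
    return [s for s in literals if _JAPANESE.search(s)]


# The rating formula from Sheet 3 contains two Japanese literals.
print(japanese_literals('=IF(D3>=7.5,"高評価","好調")'))
```

Any formula that returns a non-empty list here is a candidate for moving its text into regular cells (for example, a lookup table), where Document Translation can reach it.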

4.2 Charts Are Not Translated, and Data May Be Lost

Sheet 2 contains two charts: one created with Excel's chart creation feature (a native chart referencing cell data) and one pasted as an image (a PNG containing Japanese text).
Here I'll look at the behavior of the native Excel chart (the image-based one is covered in §4.3).

Before/after comparison (Excel Sheet 2 - Popularity Ranking)
Sheet 2: left is before translation (Japanese), right is after. The table data is translated into English. The chart pasted as an image in the upper right remains in Japanese since it is an image, while the native Excel chart at the bottom has lost its data series after translation, causing the bars to disappear.

The native Excel chart (the large chart at the bottom) had its data series values lost after translation, causing the bars to no longer display.
Opening it in actual Excel showed the same result — the chart frame and title remained, but the actual data was empty.
Although the chart references cell data, it appears that the references were not properly carried over as a result of changes to the sheet and cell structure during translation.
Additionally, the chart's title, axis labels, legend, and category names all remained in Japanese.
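
To look at what happened to the chart without opening Excel, one option is to read the chart parts inside the XLSX package directly. The sketch below is an assumption-laden helper of my own, not something from the official documentation: it assumes chart parts are stored as xl/charts/chartN.xml and that cell references sit in `<c:f>` elements, which is how Excel normally writes them. Whether the lost data shows up as missing references or as emptied cached values is something this sketch does not decide; it only lists the references so that the files can be compared.

```python
import zipfile
import xml.etree.ElementTree as ET

# Namespace used by DrawingML chart parts in OOXML packages.
CHART_NS = {"c": "http://schemas.openxmlformats.org/drawingml/2006/chart"}


def chart_refs(xlsx) -> dict[str, list[str]]:
    """Map each chart part in a workbook to the cell references it contains.

    `xlsx` may be a file path or a binary file-like object.
    """
    refs: dict[str, list[str]] = {}
    with zipfile.ZipFile(xlsx) as zf:
        for name in sorted(zf.namelist()):
            if name.startswith("xl/charts/chart") and name.endswith(".xml"):
                root = ET.fromstring(zf.read(name))
                # <c:f> elements hold references such as Sheet2!$B$2:$B$6
                refs[name] = [f.text or "" for f in root.iterfind(".//c:f", CHART_NS)]
    return refs
```

Calling `chart_refs("hoshirei_ja.xlsx")` and `chart_refs("hoshirei_en.xlsx")` and comparing the two results would show whether the references changed between the source and the translated file.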

4.3 Images Are Not Subject to Translation

The bar chart pasted as an image (a PNG containing Japanese text) placed in the upper right of Sheet 2 remained untranslated.
Text inside images is not text data and cannot be handled by Document Translation.
This applies not only to charts but to anything pasted as an image — it falls outside the scope of translation.
If your workbook contains images with Japanese text, you will need to either replace the images after translation or add text-based captions outside the images.

4.4 Text Boxes Are Not Translated (Same as Word)

Sheet 4 "Glossary" contains one text box (shape) with a yellow border.
While the body text (cells) was translated into English, the content inside the text box remained in Japanese.

Before/after comparison (Excel Sheet 4 - Glossary)
Sheet 4: left is before translation (Japanese), right is after. Cells are translated into English, but the text box (yellow border) remains in Japanese.

This behavior of text boxes not being translated was the same as in the Word edition.
A common pattern is emerging: in both Word and Excel, the content inside shapes and text boxes is not translated.

5. Locking Custom Terms with a Glossary

Let's verify whether proper nouns and custom terminology can be locked to a specific translation.
To use a glossary, you need to: (1) prepare a TSV with source-to-translation mappings, (2) upload it to Cloud Storage, and (3) create a glossary resource.

5.1 Preparing the Glossary TSV

The glossary is prepared as a TSV file with the source text (Japanese) and target translation (English) listed one pair per line, tab-separated.
No header row is needed — the left column is the coined source term, and the right column is the fixed translation.
For this test, I prepared 20 coined terms from the sample as glossary_ja_en.tsv.

星霊	Hoshirei
共鳴進化	Reso-Evolution
輝光石	Lumina Shard
雷狼ボルテ	Voltefang
焔狐ココ	Pyrofox Coco
水亀アクオ	Aquortle
草鹿リーフィ	Leafawn
輝竜ルミナ	Lumidragon
月読の祠	Moonread Shrine
守護者	Warden
星導士	Starwright
星霊守護協会	Hoshirei Warden Guild
共鳴値	Reso-Value
共鳴の灯	Resonance Flame
絆ゲージ	Bond Gauge
星霊酔い	Hoshirei-sickness
星霊図鑑	Hoshirei Codex
ルミナ群島	Lumina Archipelago
七つの祠	Seven Shrines
共鳴結界	Reso-Barrier
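
Because a stray space in place of a tab is easy to miss, it can help to check the TSV before uploading it. The following is a minimal sketch of such a check; the function name and the rules it applies (exactly two non-empty tab-separated fields, no duplicate source terms) are my own assumptions about what is worth catching, not an official validation.

```python
def check_glossary_tsv(text: str) -> list[str]:
    """Return a list of problems found in a two-column glossary TSV."""
    problems: list[str] = []
    seen: set[str] = set()
    for lineno, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        fields = line.split("\t")
        if len(fields) != 2 or not all(f.strip() for f in fields):
            problems.append(f"line {lineno}: expected 'source<TAB>target'")
            continue
        source = fields[0].strip()
        if source in seen:
            problems.append(f"line {lineno}: duplicate source term {source!r}")
        seen.add(source)
    return problems


# The third line uses a space instead of a tab, so it is reported.
sample = "星霊\tHoshirei\n共鳴値\tReso-Value\n星霊 Hoshirei\n"
print(check_glossary_tsv(sample))
```

An empty list means every non-blank line has the expected shape.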

5.2 Uploading the TSV to Cloud Storage and Creating the Glossary Resource

Rather than uploading the TSV directly to create a glossary resource, you first place it in Cloud Storage and then specify its GCS URI to create the resource.
Create the bucket in us-central1, the same region as the glossary.

# Create a bucket (skip if it already exists)
gcloud storage buckets create gs://<YOUR_BUCKET> \
    --project <YOUR_PROJECT_ID> --location us-central1

# Upload the TSV
gcloud storage cp glossary_ja_en.tsv \
    gs://<YOUR_BUCKET>/glossaries/glossary_ja_en.tsv

Next, create the glossary resource from the uploaded TSV.
Creation is a long-running operation (LRO), so run it from the client library and wait for completion.
Save the following code as setup_glossary.py and run it within the venv.

Full content of setup_glossary.py:
from google.cloud import translate_v3 as translate

PROJECT_ID = "<YOUR_PROJECT_ID>"
LOCATION = "us-central1"   # Glossaries are only supported in us-central1
GLOSSARY_ID = "hoshirei-ja-en"
INPUT_URI = "gs://<YOUR_BUCKET>/glossaries/glossary_ja_en.tsv"

client = translate.TranslationServiceClient()
name = client.glossary_path(PROJECT_ID, LOCATION, GLOSSARY_ID)
glossary = translate.Glossary(
    name=name,
    # Unidirectional (ja→en) glossary
    language_pair=translate.Glossary.LanguageCodePair(
        source_language_code="ja", target_language_code="en"
    ),
    input_config=translate.GlossaryInputConfig(
        gcs_source=translate.GcsSource(input_uri=INPUT_URI)
    ),
)
parent = f"projects/{PROJECT_ID}/locations/{LOCATION}"
operation = client.create_glossary(parent=parent, glossary=glossary)
result = operation.result(180)   # Wait up to 180 seconds for completion
print(f"Creation complete: {result.name} (entry count: {result.entry_count})")

python setup_glossary.py
# Example output
Creation complete: projects/.../locations/us-central1/glossaries/hoshirei-ja-en (entry count: 20)

The official documentation also explicitly states that custom resources must use us-central1.

Note: All of your resources in a single request to Cloud Translation - Advanced must have the same location. Currently, only global and us-central1 locations are supported. For all custom resources—AutoML models, glossaries, long-running-operations—you must use us-central1.

Quoted from: Official documentation: Migrate to Cloud Translation - Advanced (v3) | Google Cloud

5.3 Comparing With and Without Glossary

Translate specifying the glossary that was created.
Passing the glossary ID created in 5.2 to --glossary-id causes a single response to return both a "without glossary" and a "with glossary" result.
The term-locking results shown in subsequent translation output — such as 星霊 becoming Hoshirei or 共鳴値 becoming Reso-Value — correspond to these registered entries.

python translate_document_handson.py \
    --project <YOUR_PROJECT_ID> \
    --input hoshirei_ja.xlsx \
    --output hoshirei_en.xlsx \
    --glossary-id <YOUR_GLOSSARY_ID>

Looking at the without-glossary result first, the translations of proper nouns were inconsistent (per Claude's analysis).
Counting the translations of 星霊, which appears 12 times in the source, the breakdown of major translations in the without-glossary English output was as follows:

| Translation of 星霊 (without glossary) | Occurrences |
|---|---|
| Star Spirit | 4 |
| star spirit | 3 |
| Celestial Spirit | 2 |
| star spirits | 1 |
| Other variations | Several |

The same word was split across multiple variations, including differences in capitalization and singular/plural forms.

Switching to the with-glossary version, these were neatly unified.
The terms that actually changed were the Sheet 1 title and the sheet tab name.

Comparison without/with glossary (Excel Sheet 1)
Left is without glossary (sheet title and tab name show "Star Spirit Encyclopedia," with 星霊 varying between Star Spirit and Celestial); right is with glossary ("Hoshirei Codex" is fixed and 星霊 is unified as Hoshirei).

With the glossary, 星霊 was fixed to Hoshirei, 共鳴値 to Reso-Value, 守護者 to Warden, and 共鳴進化 to Reso-Evolution — all as registered.
The glossary was applied consistently across the entire workbook, from sheet titles and cell data to sheet tab names (text boxes are outside the scope of translation to begin with).

Reviewing the full with-glossary output, all registered coined terms in the Excel file were unified to their fixed translations (per Claude's analysis).
In the Word edition, one instance of 星霊 was missed, but in this Excel sample, no remaining literal translations were found.
However, since misses can occasionally occur depending on surrounding context, it's worth noting that even with a glossary, 100% coverage cannot be guaranteed.
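
A check like this can also be scripted. The sketch below assumes you have already collected pairs of (source cell text, translated cell text) from the two workbooks, for example with a spreadsheet library; that extraction step, the function name, and the simple substring matching are all my own assumptions. Substring matching is crude and can misfire when one registered term is contained in another, so treat the result as a list of cells to look at, not as a verdict.

```python
def missed_terms(
    pairs: list[tuple[str, str]], glossary: dict[str, str]
) -> list[tuple[str, str]]:
    """Return (term, source_text) for cells whose translation lacks the registered term."""
    misses: list[tuple[str, str]] = []
    for source_text, translated_text in pairs:
        for term, fixed in glossary.items():
            # Case-insensitive so that sentence-initial capitalization is not flagged.
            if term in source_text and fixed.lower() not in translated_text.lower():
                misses.append((term, source_text))
    return misses


glossary = {"星霊": "Hoshirei", "共鳴値": "Reso-Value"}
pairs = [
    ("星霊の共鳴値", "Hoshirei Reso-Value"),
    ("星霊の一覧", "List of Star Spirits"),  # hypothetical miss for illustration
]
print(missed_terms(pairs, glossary))
```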

6. Pricing

Pricing for Excel (XLSX) differs from other Office formats.
Document translation for DOCX, PPTX, and PDF is priced per page (0.08 USD per page for standard NMT), but XLSX has no concept of pages.
As quoted in §2, XLSX is handled by character count, and pricing falls under NMT "text translation," billed by character count.

The per-character pricing is as follows:

| Usage (NMT text translation, including XLSX document translation) | Unit price |
|---|---|
| First 500,000 characters / month | Free (applied as a $10 monthly credit) |
| Beyond 500,000 characters | $20 per 1 million characters |

The official pricing page also explicitly states that XLSX document translation is included in text translation.

Text translations, which includes: Language detection, Text translation, Batch text translation, XLSX document translation, Romanize text

Quoted from: Pricing page: Pricing | Google Cloud

The sample used this time (a four-sheet workbook) was on the order of a few thousand characters, well within the monthly free tier (500,000 characters).
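
For a quick estimate at larger volumes, the per-character pricing above can be written as a small function. This is a simplified sketch using only the two figures quoted in this section; the function and constant names are hypothetical, and it treats the free tier as a flat 500,000-character allowance even though it is actually applied as a credit. How the API counts characters inside an XLSX file is not covered here.

```python
FREE_CHARS_PER_MONTH = 500_000   # monthly free tier quoted above
USD_PER_MILLION_CHARS = 20.0     # NMT text translation rate quoted above


def estimate_monthly_cost_usd(chars: int) -> float:
    """Estimate the NMT text-translation cost for one month's character count."""
    billable = max(0, chars - FREE_CHARS_PER_MONTH)
    return billable * USD_PER_MILLION_CHARS / 1_000_000


print(estimate_monthly_cost_usd(5_000))      # 0.0 (within the free tier)
print(estimate_monthly_cost_usd(2_000_000))  # 30.0
```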

7. On Attribution ("Machine Translated by Google")

In the PDF edition, the text "Machine Translated by Google" appeared as attribution in the upper left of the translated PDF.
In this Excel (XLSX) case, despite not specifying attribution, this text was not found in the translated file (the same was true for Word and PowerPoint).
Since the behavior differed between PDF and Office formats under the same conditions, I checked the specification.

The attribution text can be specified as customizedAttribution in the API request, and the default when not specified is "Machine Translated by Google."
This field is not PDF-specific — it is common to the entire translateDocument (document translation) operation.

customizedAttribution string

Optional. This flag is to support user customized attribution. If not provided, the default is Machine Translated by Google. Customized attribution should follow rules in https://cloud.google.com/translate/attribution#attribution_and_logos

Quoted from: API reference: Method: projects.locations.translateDocument | Google Cloud

The important thing to note here is that customizedAttribution only describes the attribution text. I could not find anywhere in the official documentation an explicit statement of how that text is reflected in each format's output (i.e., whether it is embedded in the file).
The reason the presence or absence of attribution varies by format could not be confirmed from the official documentation.
Based on what I observed, the attribution was embedded in PDF output (confirmed with a native PDF in the PDF edition), but was not included in Office format (DOCX/XLSX/PPTX) output.
This is presumably because PDF reconstructs the translated text by overlaying it on the page to preserve layout — a fundamentally different output method from editable Office formats — but this is speculation and not backed by official documentation.

One more point to keep in mind from a separate angle: the explicit disclosure requirement under the brand guidelines.
This is separate from whether attribution is embedded in the output file — it requires that whenever translation results are shown to users, regardless of format, it must be made clear that the content is a machine translation.

Whenever you display translation results from Google Translate directly to users, you must make it clear to users that they are viewing automatic translations from Google Translate using the appropriate text or brand elements.

Quoted from: Brand guidelines: Attribution requirements | Google Cloud

In other words, while it is not a problem that Office format outputs do not have attribution embedded, when publishing or distributing translation results, the responsibility to disclose that the content is machine-translated — regardless of format — lies with the user.

8. Summary

Cloud Translation API's Document Translation translated the sheet titles, data, and sheet tab names of an Excel (XLSX) file simply by passing it as-is.
Using a glossary allows proper noun translations to be locked consistently across the entire workbook.

On the other hand, string literals written directly inside formulas are not translated, and chart titles and legends are also not translated.
Native Excel charts may even lose their data after translation.
Text box content is also not translated, just as in Word.
For workbooks that use formula string literals, charts, or text boxes, it is best to plan for post-translation verification and rework.

The fact that text boxes are not translated was a behavior shared with the Word edition.
I will also write a blog post on the PowerPoint edition in the future, so please check that out as well.

I hope this article is helpful for anyone considering automating the translation of Excel documents.
