Simplifying Next.js Visual Novel UI Implementation by Delegating Image Processing to Cloudinary

This article introduces a demo that offloads image processing for a visual novel-style UI to Cloudinary, so that a screen can be composed in Next.js from nothing more than Public IDs and layout specifications. It covers the key points of the structure and implementation from two angles: reducing the asset management burden in web games, and separating game logic from image processing.
Game Development Accelerated by SaaS - Advent Calendar 2025 -
越井琢巳 (Koshii Takumi)
2025.12.14
Introduction
This article is day 14 of the "Game Development Accelerated by SaaS - Advent Calendar 2025".
In this article, I'll verify how much we can separate image processing and asset management from application code in a Next.js-based visual novel UI using Cloudinary. This is an attempt to reduce the complexity of implementation and operation on the web game side by delegating resizing and format conversion of images such as backgrounds, character sprites, text windows, and text advancement icons to Cloudinary.
What is Cloudinary?
Cloudinary is a service that stores images and videos in the cloud and delivers them after URL-based transformation and optimization. Its key feature is that resizing, cropping, automatic quality adjustment, and automatic format selection are all specified through transformation parameters embedded in the delivery URL.
Target Audience
Those who want to build browser-based visual novels or adventure games using Next.js
Those who want to separate image processing and asset management responsibilities from their application
Those who want to try Cloudinary for game-oriented use cases
Reference Information
Image & Video APIs overview | Cloudinary
Image transformations | Cloudinary
Division of Roles Between Visual Novel UI and Cloudinary
In this demo, we'll render a visual novel-style screen on the web with the following elements:
A canvas with a base resolution of 1280 x 720
A background image covering the entire area
A heroine character sprite positioned at the bottom center of the screen
A text window at the bottom of the screen with speaker name and dialogue
A text advancement icon positioned at the bottom right
We'll assemble these elements using only the Public IDs of images on Cloudinary.
The Next.js side only determines which Public ID to place where, while leaving resizing, cropping, and quality adjustment to Cloudinary.
Benefits in the Context of Web Games
Browser-based games commonly have to deal with differences in screen resolution and aspect ratio between PCs and smartphones, and with image sizes and cropping positions that need to change whenever the UI is adjusted. These issues are typically addressed by:
Creating and maintaining multiple versions of images at different resolutions
Finely adjusting cropping and scaling with CSS
In this structure, we delegate the image transformation and optimization part almost entirely to Cloudinary:
Background images and window frames are resized on the server side by specifying transformations like c_fill,w_1280,h_720 in the URL
Quality and format are entrusted to the service with specifications like q_auto,f_auto
On the game side, we only need to specify the Public ID and transform combination as strings
As a result, game logic and asset transformation logic are cleanly separated. Designers only need to care about assets on Cloudinary, while engineers only need to care about Public IDs and UI layout, making it easier to divide responsibilities.
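As a rough illustration of "specifying the transform combination as a string", the sketch below assembles the strings used in this article from typed options. The helper name buildTransform and the option names are hypothetical and not part of the demo; only the c_, w_, h_, q_, and f_ parameters come from the text above.

```typescript
// Hypothetical helper for illustration; the demo itself passes plain strings.
type TransformOptions = {
  crop?: "fill" | "fit" | "scale";
  width?: number;
  height?: number;
  quality?: "auto";
  format?: "auto";
};

/** Turn typed options into a transformation string such as "c_fill,w_1280,h_720". */
function buildTransform(options: TransformOptions): string {
  const parts: string[] = [];
  if (options.crop) parts.push(`c_${options.crop}`);
  if (options.width !== undefined) parts.push(`w_${options.width}`);
  if (options.height !== undefined) parts.push(`h_${options.height}`);
  if (options.quality) parts.push(`q_${options.quality}`);
  if (options.format) parts.push(`f_${options.format}`);
  return parts.join(",");
}

// Background: fill the 1280 x 720 canvas, with quality and format left to the service.
const backgroundTransform = buildTransform({
  crop: "fill",
  width: 1280,
  height: 720,
  quality: "auto",
  format: "auto",
});

// Sprites and icons: only quality and format; size is handled by CSS.
const spriteTransform = buildTransform({ quality: "auto", format: "auto" });
```

Whether a typed builder like this is worth having over plain strings depends on how many transform combinations the game ends up using.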
Implementation Overview
Preparation on Cloudinary
Upload images as Cloudinary assets and note the Cloud name and Public ID.
The uploaded images are as follows:
At this stage, transparency has not been added to text_next_arrow and window_frame. We will set the alpha value in the following steps.
Cloudinary Proxy API /api/image
To avoid accessing the external image API directly from the browser, we've prepared /api/image as a Next.js Route Handler.
app/api/image/route.ts
import { NextRequest, NextResponse } from "next/server";

export async function GET(request: NextRequest) {
  const searchParams = request.nextUrl.searchParams;
  const publicId = searchParams.get("publicId");
  const transform = searchParams.get("transform") ?? "";

  if (!publicId) {
    return NextResponse.json(
      { error: "publicId query parameter is required" },
      { status: 400 }
    );
  }

  const cloudName = process.env.CLOUDINARY_CLOUD_NAME;
  if (!cloudName) {
    return NextResponse.json(
      { error: "CLOUDINARY_CLOUD_NAME is not configured" },
      { status: 500 }
    );
  }

  const transformPath = transform ? `${transform}/` : "";
  const cloudinaryUrl =
    `https://res.cloudinary.com/${cloudName}/image/upload/` +
    `${transformPath}${publicId}`;

  try {
    const response = await fetch(cloudinaryUrl);

    if (!response.ok) {
      return NextResponse.json(
        { error: "Failed to fetch image from Cloudinary" },
        { status: 502 }
      );
    }

    const buffer = await response.arrayBuffer();
    const contentType =
      response.headers.get("content-type") ?? "image/png";

    return new NextResponse(buffer, {
      status: 200,
      headers: {
        "Content-Type": contentType,
        "Cache-Control": "public, max-age=86400, s-maxage=86400",
      },
    });
  } catch (error) {
    console.error("Error fetching image from Cloudinary:", error);
    return NextResponse.json(
      { error: "Unexpected error while fetching image" },
      { status: 500 }
    );
  }
}
Here's what it does:
Receive publicId and transform from query parameters
Construct a Cloudinary URL and fetch it
Return the obtained binary as the response
Because all Cloudinary-specific processing is contained in this file, the frontend only ever needs to call /api/image.
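The handler above places publicId and transform into the upstream URL as received. One possible hardening step, which is not part of the original demo, is to check both values before building the URL. The sketch below is a minimal version under stated assumptions: the function name buildUpstreamUrl, the two allow-list patterns, and the cloud name "demo-cloud" are all hypothetical, and the patterns would need adjusting to the Public IDs and transformations actually in use.

```typescript
// Assumed allow-lists (not from the article): letters, digits, "_" and "-" in
// Public IDs, with optional folder segments; a single comma-separated component
// such as "c_fill,w_1280,h_720" for transforms.
const PUBLIC_ID_PATTERN = /^[A-Za-z0-9_-]+(\/[A-Za-z0-9_-]+)*$/;
const TRANSFORM_PATTERN = /^[a-z]+_[A-Za-z0-9.:]+(,[a-z]+_[A-Za-z0-9.:]+)*$/;

type UpstreamResult = { url: string | null; error: string | null };

/** Validate the query values, then build the Cloudinary URL the same way the handler does. */
function buildUpstreamUrl(
  cloudName: string,
  publicId: string | null,
  transform: string
): UpstreamResult {
  if (!publicId) {
    return { url: null, error: "publicId query parameter is required" };
  }
  if (!PUBLIC_ID_PATTERN.test(publicId)) {
    return { url: null, error: "publicId contains unsupported characters" };
  }
  if (transform !== "" && !TRANSFORM_PATTERN.test(transform)) {
    return { url: null, error: "transform contains unsupported characters" };
  }

  const transformPath = transform ? `${transform}/` : "";
  const url =
    `https://res.cloudinary.com/${cloudName}/image/upload/` +
    `${transformPath}${publicId}`;
  return { url, error: null };
}

// Example: a window frame request passes, a path-like Public ID is rejected.
const accepted = buildUpstreamUrl("demo-cloud", "window_frame", "c_fill,w_1280,h_240,q_auto,f_auto");
const rejected = buildUpstreamUrl("demo-cloud", "../secret", "");
```

In the Route Handler, a non-null error would map naturally to the existing 400 response.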
Visual Novel UI Component
The NovelScene component renders a visual novel-style screen. The component receives the speaker name, dialogue, and Public IDs for each image, and internally constructs URLs for /api/image.
components/NovelScene.tsx
"use client";

type NovelSceneProps = {
  speakerName: string;
  dialogue: string;
  backgroundId: string;
  characterId: string;
  windowFrameId: string;
  nextArrowId?: string;
};

/** Generate URL for image proxy API */
function getImageUrl(publicId: string, transform?: string): string {
  const params = new URLSearchParams({ publicId });
  if (transform) {
    params.set("transform", transform);
  }
  return `/api/image?${params.toString()}`;
}

export default function NovelScene(props: NovelSceneProps) {
  const {
    speakerName,
    dialogue,
    backgroundId,
    characterId,
    windowFrameId,
    nextArrowId,
  } = props;

  return (
    <div className="novel-scene">
      {/* Background image - fit to 1280x720 */}
      <img
        src={getImageUrl(
          backgroundId,
          "c_fill,w_1280,h_720,q_auto,f_auto"
        )}
        alt="Background"
        className="novel-background"
        draggable={false}
      />

      {/* Character sprite */}
      <img
        src={getImageUrl(characterId, "q_auto,f_auto")}
        alt="Character"
        className="novel-character"
        draggable={false}
      />

      {/* Text window area */}
      <div className="novel-text-window">
        <img
          src={getImageUrl(
            windowFrameId,
            "c_fill,w_1280,h_240,q_auto,f_auto"
          )}
          alt=""
          className="novel-window-frame"
          draggable={false}
          aria-hidden="true"
        />

        <div className="novel-text-content">
          <p className="novel-speaker-name">{speakerName}</p>
          <p className="novel-dialogue">{dialogue}</p>
        </div>

        {nextArrowId && (
          <img
            src={getImageUrl(nextArrowId, "q_auto,f_auto")}
            alt=""
            className="novel-next-arrow"
            draggable={false}
            aria-hidden="true"
          />
        )}
      </div>
    </div>
  );
}
In this way, Cloudinary's transformation specifications are treated only as arguments to getImageUrl. The division is as follows:
Backgrounds and window frames rely on the service side for size and cropping with c_fill,w_1280,h_...
Character sprites and icons only specify q_auto,f_auto and control size with CSS
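To make this division concrete, the sketch below reproduces getImageUrl and collects the transform strings from the component into one table. The TRANSFORMS constant and the Public IDs "bg_classroom" and "heroine" are illustrative names, not part of the demo.

```typescript
/** Same helper as in NovelScene.tsx: builds a URL for the image proxy API. */
function getImageUrl(publicId: string, transform?: string): string {
  const params = new URLSearchParams({ publicId });
  if (transform) {
    params.set("transform", transform);
  }
  return `/api/image?${params.toString()}`;
}

// Hypothetical preset table gathering the strings used in the component.
const TRANSFORMS = {
  background: "c_fill,w_1280,h_720,q_auto,f_auto",
  windowFrame: "c_fill,w_1280,h_240,q_auto,f_auto",
  sprite: "q_auto,f_auto",
} as const;

// URLSearchParams percent-encodes the commas, so the result is:
// /api/image?publicId=bg_classroom&transform=c_fill%2Cw_1280%2Ch_720%2Cq_auto%2Cf_auto
const backgroundSrc = getImageUrl("bg_classroom", TRANSFORMS.background);

// Without a transform, only publicId is sent:
// /api/image?publicId=heroine
const rawSpriteSrc = getImageUrl("heroine");
```

On the server side, searchParams.get("transform") decodes the value back to the comma-separated form before it is placed in the Cloudinary URL.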
Layout and Style
The layout is defined with CSS based on a 1280 x 720 canvas.
globals.css
:root {
  --novel-width: 1280px;
  --novel-aspect: 16 / 9;
  --text-window-height: 240px;
  --text-padding-x: 48px;
  --text-padding-y: 24px;
}

body {
  margin: 0;
  padding: 0;
  background: #0a0a0a;
}

/* Display novel scene in the center of screen */
.novel-container {
  display: flex;
  align-items: center;
  justify-content: center;
  width: 100%;
  height: 100dvh;
  background: #000;
}

/* Main body equivalent to 1280x720 */
.novel-scene {
  position: relative;
  width: 100%;
  max-width: var(--novel-width);
  aspect-ratio: var(--novel-aspect);
  overflow: hidden;
  background: #101010;
}

/* Background image */
.novel-background {
  position: absolute;
  inset: 0;
  width: 100%;
  height: 100%;
  object-fit: cover;
}

/* Character sprite */
.novel-character {
  position: absolute;
  bottom: 0;
  left: 50%;
  transform: translateX(-50%);
  width: 25%;
  height: auto;
}

/* Text window */
.novel-text-window {
  position: absolute;
  bottom: 0;
  left: 0;
  right: 0;
  height: var(--text-window-height);
}

/* Frame image - semi-transparent */
.novel-window-frame {
  position: absolute;
  inset: 0;
  width: 100%;
  height: 100%;
  object-fit: cover;
  opacity: 0.5;
}

/* Text content */
.novel-text-content {
  position: relative;
  z-index: 10;
  display: flex;
  flex-direction: column;
  gap: 4px;
  height: 100%;
  padding: var(--text-padding-y) var(--text-padding-x);
}

/* Speaker name and dialogue */
.novel-speaker-name {
  margin: 0;
  font-weight: 700;
  color: #ffd700;
}

.novel-dialogue {
  margin: 0;
  color: #fff;
}

/* Text advancement icon */
.novel-next-arrow {
  position: absolute;
  right: 32px;
  bottom: 16px;
  width: 48px;
  height: auto;
}
The layout is solely the responsibility of CSS, while image resolution and format are processed by Cloudinary. This eliminates the need to re-export images every time UI adjustments are made. The benefit is that you can think about screen composition and image processing separately.
Note: Although it's possible to reduce the alpha value of the image itself using Cloudinary's transformation parameters, we didn't adopt this approach in this demo for the following reasons:
The transparency of the window is likely to change depending on the scene and presentation
If transparency is changed in Cloudinary, the image will always be delivered with reduced opacity, requiring separate variant images for each presentation
With CSS opacity, the expression can be changed simply by switching classes or animations
By following the policy of leaving static transformations like size and format to Cloudinary, while keeping presentation-heavy parameters like transparency in CSS, we maintain UI presentation freedom without increasing image asset variations excessively.
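The difference between the two approaches can be sketched as follows. The presentation states and opacity values are made up for illustration, and the function names are hypothetical; o_ is Cloudinary's opacity parameter (0 to 100), which the demo deliberately does not use.

```typescript
// Hypothetical presentation states and opacity values, for illustration only.
type WindowMood = "normal" | "emphasis" | "faded";

const WINDOW_OPACITY: Record<WindowMood, number> = {
  normal: 0.5,
  emphasis: 0.8,
  faded: 0.2,
};

const FRAME_TRANSFORM = "c_fill,w_1280,h_240,q_auto,f_auto";

/** CSS approach: one transform for every mood; only the style value changes. */
function frameWithCssOpacity(mood: WindowMood): {
  transform: string;
  style: { opacity: number };
} {
  return { transform: FRAME_TRANSFORM, style: { opacity: WINDOW_OPACITY[mood] } };
}

/** Cloudinary approach: opacity is baked into the transform, so each mood yields a different derived image. */
function frameWithCloudinaryOpacity(mood: WindowMood): { transform: string } {
  const percent = Math.round(WINDOW_OPACITY[mood] * 100);
  return { transform: `${FRAME_TRANSFORM},o_${percent}` };
}
```

With the CSS approach, the same image URL is reused across all moods, while the Cloudinary approach produces one URL per opacity level.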
Verification Results and Considerations
This is a GIF animation of the implemented "visual novel-style screen." The background image, character sprite, text window, and text advancement icon in the bottom right are all retrieved from Cloudinary.
Changes in Implementation
By adopting this structure, the following processes were eliminated from the visual novel code:
Scripts to convert image size and format at build time
Processes to prepare multiple versions of images at different resolutions and serve them using media queries or JavaScript
Efforts to adjust cropping positions of backgrounds and UI frames using only CSS
Instead, the following were added:
The need to be aware of Cloudinary Public IDs
The need to design transform strings passed to getImageUrl
While the change in code lines may not be significant, the key point is that the responsibility for image processing has been removed from the visual novel logic. You can now think about "how to place which image on the screen" and "how to optimize the images themselves" in separate layers.
Performance Considerations
While using Cloudinary eliminates image transformation processing on the client side, it is not always faster overall:
Image retrieval follows the route of client → your own Next.js → Cloudinary, adding one more network round trip compared to direct access to Cloudinary.
During first access, there may be generation of transform results or CDN cache misses, resulting in some waiting time for the first request.
On the other hand, transform results are cached in the CDN, so subsequent requests can be designed assuming cache hits.
The current structure is not designed to maximize processing speed alone. Instead, it prioritizes separating out image processing and staying resilient as assets are added or replaced. It's best to consider adoption with an understanding of this trade-off.
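One related detail of the proxy route is worth a hedged sketch. As far as I understand, f_auto chooses the output format from the Accept header of the request that Cloudinary receives, and with the proxy in between that request comes from the Next.js server rather than the browser. If that assumption holds, forwarding the browser's Accept header and marking the proxy response as varying by it is one way to keep format selection and caching consistent. The helper names below are hypothetical and this is not part of the original demo.

```typescript
/** Headers for the upstream request: pass the browser's Accept header through when present. */
function buildUpstreamHeaders(browserAccept: string | null): Record<string, string> {
  const headers: Record<string, string> = {};
  if (browserAccept) {
    headers["Accept"] = browserAccept;
  }
  return headers;
}

/** Headers for the proxy response: same caching as the demo, plus Vary so caches keep formats apart. */
function buildResponseHeaders(contentType: string): Record<string, string> {
  return {
    "Content-Type": contentType,
    "Cache-Control": "public, max-age=86400, s-maxage=86400",
    "Vary": "Accept",
  };
}

// Example with an Accept value a browser might send for an <img> request (assumed value).
const upstreamHeaders = buildUpstreamHeaders("image/avif,image/webp,*/*");
const responseHeaders = buildResponseHeaders("image/webp");
```

Inside the Route Handler, this would correspond to calling fetch(cloudinaryUrl, { headers: buildUpstreamHeaders(request.headers.get("accept")) }) and passing buildResponseHeaders(contentType) to the response.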
Conclusion
In this article, I introduced a structure that externalizes image processing and asset management for Next.js visual novel UIs using Cloudinary. By delegating the resizing of backgrounds and UI frames, as well as quality and format optimization, to Cloudinary, the game side can focus only on Public IDs and layouts.
Since it goes through the network, performance will not improve in all cases. However, for genres like browser-based visual novels where image assets tend to increase, I felt it is worth considering as an option to reduce development and operational burden.