Simplifying Next.js Visual Novel UI Implementation by Delegating Image Processing to Cloudinary
Introduction
This article is day 14 of the "Game Development Accelerated by SaaS - Advent Calendar 2025".
In this article, I'll verify how much we can separate image processing and asset management from application code in a Next.js-based visual novel UI using Cloudinary. This is an attempt to reduce the complexity of implementation and operation on the web game side by delegating resizing and format conversion of images such as backgrounds, character sprites, text windows, and text advancement icons to Cloudinary.

What is Cloudinary?
Cloudinary is a service that allows you to store images and videos in the cloud and deliver them after transformation and optimization based on URLs. Its key feature is the ability to specify resizing, cropping, automatic quality adjustment, and automatic format selection via URL paths or queries.
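As a rough illustration of the URL-based approach, the sketch below assembles a delivery URL from a cloud name, a Public ID, and a list of transformation parameters. The function name buildDeliveryUrl and the values demo-cloud and novel/bg_classroom are placeholders invented for this example; they do not come from the demo project.

```typescript
// Sketch: compose a Cloudinary image delivery URL from its parts.
// buildDeliveryUrl, "demo-cloud", and "novel/bg_classroom" are hypothetical names.
function buildDeliveryUrl(
  cloudName: string,
  publicId: string,
  transforms: string[] = []
): string {
  const base = `https://res.cloudinary.com/${cloudName}/image/upload`;
  // Transformation parameters are comma-separated and form one path segment.
  const transformPath = transforms.length > 0 ? `${transforms.join(",")}/` : "";
  return `${base}/${transformPath}${publicId}`;
}

const exampleUrl = buildDeliveryUrl("demo-cloud", "novel/bg_classroom", [
  "c_fill",
  "w_1280",
  "h_720",
  "q_auto",
  "f_auto",
]);
// exampleUrl is
// "https://res.cloudinary.com/demo-cloud/image/upload/c_fill,w_1280,h_720,q_auto,f_auto/novel/bg_classroom"
```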
Target Audience
- Those who want to build browser-based visual novels or adventure games using Next.js
- Those who want to separate image processing and asset management responsibilities from their application
- Those who want to try Cloudinary for game-oriented use cases
Division of Roles Between Visual Novel UI and Cloudinary
In this demo, we'll render a visual novel-style screen on the web with the following elements:
- A canvas with a base resolution of 1280 x 720
- A background image covering the entire area
- A heroine character sprite positioned at the bottom center of the screen
- A text window at the bottom of the screen with speaker name and dialogue
- A text advancement icon positioned at the bottom right
We'll assemble these elements using only the Public IDs of images on Cloudinary.
The Next.js side only determines which Public ID to place where, while leaving resizing, cropping, and quality adjustment to Cloudinary.
Benefits in the Context of Web Games
When creating browser-based games, challenges often include differences in screen resolution and aspect ratio between PCs and smartphones, and the desire to change image sizes and cropping positions when adjusting the UI. Typically, these issues are addressed through methods like:
- Creating and maintaining multiple versions of images at different resolutions
- Finely adjusting cropping and scaling with CSS
In this structure, we delegate the image transformation and optimization part almost entirely to Cloudinary:
- Background images and window frames are resized on the server side by specifying transformations like c_fill,w_1280,h_720 in the URL
- Quality and format are entrusted to the service with specifications like q_auto,f_auto
- On the game side, we only need to specify the Public ID and transform combination as strings
As a result, game logic and asset transformation logic are cleanly separated. Designers only need to care about assets on Cloudinary, while engineers only need to care about Public IDs and UI layout, making it easier to divide responsibilities.
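To make the "transform as a string" idea concrete, here is a minimal sketch of a helper that produces the fill-style transform for a given canvas size. The helper name fillTransform and the 640 x 360 size are assumptions made for illustration; the demo itself writes these strings by hand.

```typescript
// Sketch: build a "fill to this size" transform string.
// fillTransform is a hypothetical helper, not part of the demo project.
function fillTransform(width: number, height: number): string {
  if (
    !Number.isInteger(width) ||
    !Number.isInteger(height) ||
    width <= 0 ||
    height <= 0
  ) {
    throw new RangeError("width and height must be positive integers");
  }
  // c_fill crops to the requested box; q_auto and f_auto leave quality
  // and format selection to the service.
  return `c_fill,w_${width},h_${height},q_auto,f_auto`;
}

const backgroundTransform = fillTransform(1280, 720);
const windowFrameTransform = fillTransform(1280, 240);
// A smaller canvas would only need different numbers, for example:
const smallBackgroundTransform = fillTransform(640, 360);
```

With a helper like this, changing a UI size becomes a change to two numbers rather than a re-export of the image.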
Implementation Overview
Preparation on Cloudinary
Upload images as Cloudinary assets and note the Cloud name and Public ID.

The uploaded images are as follows:

At this stage, transparency has not been added to text_next_arrow and window_frame. We will set the alpha value in the following steps.
Cloudinary Proxy API /api/image
To avoid accessing the external image API directly from the browser, we've prepared /api/image as a Next.js Route Handler.
app/api/image/route.ts
import { NextRequest, NextResponse } from "next/server";

export async function GET(request: NextRequest) {
  const searchParams = request.nextUrl.searchParams;
  const publicId = searchParams.get("publicId");
  const transform = searchParams.get("transform") ?? "";

  if (!publicId) {
    return NextResponse.json(
      { error: "publicId query parameter is required" },
      { status: 400 }
    );
  }

  const cloudName = process.env.CLOUDINARY_CLOUD_NAME;
  if (!cloudName) {
    return NextResponse.json(
      { error: "CLOUDINARY_CLOUD_NAME is not configured" },
      { status: 500 }
    );
  }

  const transformPath = transform ? `${transform}/` : "";
  const cloudinaryUrl =
    `https://res.cloudinary.com/${cloudName}/image/upload/` +
    `${transformPath}${publicId}`;

  try {
    const response = await fetch(cloudinaryUrl);
    if (!response.ok) {
      return NextResponse.json(
        { error: "Failed to fetch image from Cloudinary" },
        { status: 502 }
      );
    }

    const buffer = await response.arrayBuffer();
    const contentType =
      response.headers.get("content-type") ?? "image/png";

    return new NextResponse(buffer, {
      status: 200,
      headers: {
        "Content-Type": contentType,
        "Cache-Control": "public, max-age=86400, s-maxage=86400",
      },
    });
  } catch (error) {
    console.error("Error fetching image from Cloudinary:", error);
    return NextResponse.json(
      { error: "Unexpected error while fetching image" },
      { status: 500 }
    );
  }
}
Here's what it does:
- Receive publicId and transform from query parameters
- Construct a Cloudinary URL and fetch it
- Return the obtained binary as the response
Because all Cloudinary-specific processing is contained in this file, the frontend only ever needs to call /api/image.
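The handler above forwards whatever publicId and transform it receives. If you want the proxy to accept only the combinations the UI actually uses, one possible approach is a small check run before the URL is built. The sketch below is an assumption-laden example: ALLOWED_TRANSFORMS, PUBLIC_ID_PATTERN, and isAllowedRequest are hypothetical names, and the Public ID character rule is a guess that you would adjust to your own asset naming.

```typescript
// Sketch: restrict the proxy to known transforms and simple Public IDs.
// All names here are hypothetical; the original handler has no such check.
const ALLOWED_TRANSFORMS = new Set<string>([
  "c_fill,w_1280,h_720,q_auto,f_auto",
  "c_fill,w_1280,h_240,q_auto,f_auto",
  "q_auto,f_auto",
]);

// Assumption: Public IDs consist of letters, digits, "_" and "-",
// optionally grouped into folders separated by "/".
const PUBLIC_ID_PATTERN = /^[A-Za-z0-9_-]+(\/[A-Za-z0-9_-]+)*$/;

function isAllowedRequest(publicId: string, transform: string): boolean {
  if (!PUBLIC_ID_PATTERN.test(publicId)) {
    return false;
  }
  // An empty transform means "deliver the original asset".
  return transform === "" || ALLOWED_TRANSFORMS.has(transform);
}
```

In the Route Handler, a failed check could return a 400 response in the same way the missing publicId case does.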
Visual Novel UI Component
The NovelScene component renders a visual novel-style screen. The component receives the speaker name, dialogue, and Public IDs for each image, and internally constructs URLs for /api/image.
components/NovelScene.tsx
"use client";

type NovelSceneProps = {
  speakerName: string;
  dialogue: string;
  backgroundId: string;
  characterId: string;
  windowFrameId: string;
  nextArrowId?: string;
};

/** Generate URL for image proxy API */
function getImageUrl(publicId: string, transform?: string): string {
  const params = new URLSearchParams({ publicId });
  if (transform) {
    params.set("transform", transform);
  }
  return `/api/image?${params.toString()}`;
}

export default function NovelScene(props: NovelSceneProps) {
  const {
    speakerName,
    dialogue,
    backgroundId,
    characterId,
    windowFrameId,
    nextArrowId,
  } = props;

  return (
    <div className="novel-scene">
      {/* Background image - fit to 1280x720 */}
      <img
        src={getImageUrl(
          backgroundId,
          "c_fill,w_1280,h_720,q_auto,f_auto"
        )}
        alt="Background"
        className="novel-background"
        draggable={false}
      />

      {/* Character sprite */}
      <img
        src={getImageUrl(characterId, "q_auto,f_auto")}
        alt="Character"
        className="novel-character"
        draggable={false}
      />

      {/* Text window area */}
      <div className="novel-text-window">
        <img
          src={getImageUrl(
            windowFrameId,
            "c_fill,w_1280,h_240,q_auto,f_auto"
          )}
          alt=""
          className="novel-window-frame"
          draggable={false}
          aria-hidden="true"
        />
        <div className="novel-text-content">
          <p className="novel-speaker-name">{speakerName}</p>
          <p className="novel-dialogue">{dialogue}</p>
        </div>
        {nextArrowId && (
          <img
            src={getImageUrl(nextArrowId, "q_auto,f_auto")}
            alt=""
            className="novel-next-arrow"
            draggable={false}
            aria-hidden="true"
          />
        )}
      </div>
    </div>
  );
}
In this way, Cloudinary's transformation specifications are treated only as arguments to getImageUrl. The division is as follows:
- Backgrounds and window frames rely on the service side for size and cropping with c_fill,w_1280,h_...
- Character sprites and icons only specify q_auto,f_auto and control size with CSS
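If the transform strings start to spread across components, one option is to key them by the role an image plays on screen. The following is only a sketch of that idea: AssetRole, TRANSFORM_BY_ROLE, and getImageUrlForRole are hypothetical names, while the transform strings themselves are the ones used in NovelScene.

```typescript
// Sketch: keep one transform string per UI role in a single table.
// AssetRole, TRANSFORM_BY_ROLE, and getImageUrlForRole are hypothetical names.
type AssetRole = "background" | "character" | "windowFrame" | "nextArrow";

const TRANSFORM_BY_ROLE: Record<AssetRole, string> = {
  // Sized and cropped by the service.
  background: "c_fill,w_1280,h_720,q_auto,f_auto",
  windowFrame: "c_fill,w_1280,h_240,q_auto,f_auto",
  // Only optimized by the service; on-screen size is left to CSS.
  character: "q_auto,f_auto",
  nextArrow: "q_auto,f_auto",
};

function getImageUrlForRole(publicId: string, role: AssetRole): string {
  const params = new URLSearchParams({
    publicId,
    transform: TRANSFORM_BY_ROLE[role],
  });
  return `/api/image?${params.toString()}`;
}

const frameSrc = getImageUrlForRole("window_frame", "windowFrame");
```

The component would then pass a role instead of a raw string, so a change to the window height touches one table entry.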
Layout and Style
The layout is defined with CSS based on a 1280 x 720 canvas.
globals.css
:root {
  --novel-width: 1280px;
  --novel-aspect: 16 / 9;
  --text-window-height: 240px;
  --text-padding-x: 48px;
  --text-padding-y: 24px;
}

body {
  margin: 0;
  padding: 0;
  background: #0a0a0a;
}

/* Display novel scene in the center of screen */
.novel-container {
  display: flex;
  align-items: center;
  justify-content: center;
  width: 100%;
  height: 100dvh;
  background: #000;
}

/* Main body equivalent to 1280x720 */
.novel-scene {
  position: relative;
  width: 100%;
  max-width: var(--novel-width);
  aspect-ratio: var(--novel-aspect);
  overflow: hidden;
  background: #101010;
}

/* Background image */
.novel-background {
  position: absolute;
  inset: 0;
  width: 100%;
  height: 100%;
  object-fit: cover;
}

/* Character sprite */
.novel-character {
  position: absolute;
  bottom: 0;
  left: 50%;
  transform: translateX(-50%);
  width: 25%;
  height: auto;
}

/* Text window */
.novel-text-window {
  position: absolute;
  bottom: 0;
  left: 0;
  right: 0;
  height: var(--text-window-height);
}

/* Frame image - semi-transparent */
.novel-window-frame {
  position: absolute;
  inset: 0;
  width: 100%;
  height: 100%;
  object-fit: cover;
  opacity: 0.5;
}

/* Text content */
.novel-text-content {
  position: relative;
  z-index: 10;
  display: flex;
  flex-direction: column;
  gap: 4px;
  height: 100%;
  padding: var(--text-padding-y) var(--text-padding-x);
}

/* Speaker name and dialogue */
.novel-speaker-name {
  margin: 0;
  font-weight: 700;
  color: #ffd700;
}

.novel-dialogue {
  margin: 0;
  color: #fff;
}

/* Text advancement icon */
.novel-next-arrow {
  position: absolute;
  right: 32px;
  bottom: 16px;
  width: 48px;
  height: auto;
}
The layout is solely the responsibility of CSS, while image resolution and format are processed by Cloudinary. This eliminates the need to re-export images every time UI adjustments are made. The benefit is that you can think about screen composition and image processing separately.
Verification Results and Considerations

This is a GIF animation of the implemented "visual novel-style screen." The background image, character sprite, text window, and text advancement icon in the bottom right are all retrieved from Cloudinary.
Changes in Implementation
By adopting this structure, the following processes were eliminated from the visual novel code:
- Scripts to convert image size and format at build time
- Processes to prepare multiple versions of images at different resolutions and serve them using media queries or JavaScript
- Efforts to adjust cropping positions of backgrounds and UI frames using only CSS
Instead, the following were added:
- The need to be aware of Cloudinary Public IDs
- The need to design transform strings passed to getImageUrl
While the change in code lines may not be significant, the key point is that the responsibility for image processing has been removed from the visual novel logic. You can now think about "how to place which image on the screen" and "how to optimize the images themselves" in separate layers.
Performance Considerations
While using Cloudinary eliminates image transformation processing on the client side, it is not always faster overall:
- Image retrieval follows the route of client → your own Next.js → Cloudinary, adding one more network hop compared to direct access to Cloudinary.
- During first access, there may be generation of transform results or CDN cache misses, resulting in some waiting time for the first request.
- On the other hand, transform results are cached in the CDN, so subsequent requests can be designed assuming cache hits.
The current structure is not designed to maximize processing speed alone. Instead, it prioritizes separating image processing from the application and creating a structure that is resilient to asset increases and replacements. It's best to consider adoption with an understanding of this trade-off.
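Since the first request for each transformed image can be slower than later ones, it may help to know in advance which URLs a scene will request, for example to warm them up before the scene is shown. The sketch below only computes that list; how to preload the URLs is left open. SceneAssets, proxyUrl, and sceneImageUrls are hypothetical names, and the Public IDs bg_classroom and heroine_default in the usage lines are placeholders.

```typescript
// Sketch: list every /api/image URL a scene will request.
// SceneAssets, proxyUrl, and sceneImageUrls are hypothetical names.
type SceneAssets = {
  backgroundId: string;
  characterId: string;
  windowFrameId: string;
  nextArrowId?: string;
};

function proxyUrl(publicId: string, transform: string): string {
  const params = new URLSearchParams({ publicId, transform });
  return `/api/image?${params.toString()}`;
}

function sceneImageUrls(scene: SceneAssets): string[] {
  // Same transform strings as the NovelScene component uses.
  const urls = [
    proxyUrl(scene.backgroundId, "c_fill,w_1280,h_720,q_auto,f_auto"),
    proxyUrl(scene.characterId, "q_auto,f_auto"),
    proxyUrl(scene.windowFrameId, "c_fill,w_1280,h_240,q_auto,f_auto"),
  ];
  if (scene.nextArrowId) {
    urls.push(proxyUrl(scene.nextArrowId, "q_auto,f_auto"));
  }
  return urls;
}

// Placeholder Public IDs for illustration only.
const upcomingSceneUrls = sceneImageUrls({
  backgroundId: "bg_classroom",
  characterId: "heroine_default",
  windowFrameId: "window_frame",
  nextArrowId: "text_next_arrow",
});
```

Requesting these URLs once ahead of time is one way to move the first-access cost away from the moment the scene appears; whether that is worthwhile depends on how many scenes and assets the game has.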
Conclusion
In this article, I introduced a structure that externalizes image processing and asset management for Next.js visual novel UIs using Cloudinary. By delegating resizing of backgrounds and UI frames, as well as quality and format optimization to Cloudinary, the game side can focus only on Public IDs and layouts.
Since it goes through the network, performance will not improve in all cases. However, for genres like browser-based visual novels where image assets tend to increase, I felt it is worth considering as an option to reduce development and operational burden.
