This is the first in a series of posts explaining some of the internals of the NVList engine. If you’ve ever wondered what a visual novel engine looks like on the inside (or at least the way NVList is designed), here’s your chance. I have a few articles already planned, but suggestions for future subjects are also welcome.
Today’s topic is image loading, covering the entire process from a PNG file on the hard drive until the moment it’s ready for rendering to the screen with OpenGL.
The first step is opening a stream to the desired image file. It could be stored as a normal file, or packed together with other resources in an archive (.nvl). A plugin system takes care of the differences, decompressing zipped files as needed. In most cases, files are read from a local hard drive, but streaming them over the internet is also a possibility. Regardless of the source, reading and decoding the file can easily take longer than the time available for a single frame (about 16 ms at 60 fps). Therefore, as much of the image loading as possible is performed asynchronously on a background thread.
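To make the background-thread idea concrete, here is a minimal sketch using only the standard library. The class and method names (AsyncImageLoader, loadAsync) are made up for illustration and are not NVList’s actual API; the stream source is passed in as a Callable so it could come from a plain file, an archive, or the network.

```java
import java.awt.image.BufferedImage;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.concurrent.Callable;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import javax.imageio.ImageIO;

/** Hypothetical helper (not NVList code): decodes images off the render thread. */
final class AsyncImageLoader implements AutoCloseable {

    // A single daemon worker thread keeps decoding away from the render loop.
    private final ExecutorService worker = Executors.newSingleThreadExecutor(r -> {
        Thread t = new Thread(r, "image-loader");
        t.setDaemon(true);
        return t;
    });

    /** Opens the stream and decodes the image on the worker thread. */
    CompletableFuture<BufferedImage> loadAsync(Callable<InputStream> source) {
        return CompletableFuture.supplyAsync(() -> {
            try (InputStream in = source.call()) {
                BufferedImage image = ImageIO.read(in);
                if (image == null) {
                    // ImageIO.read returns null when no registered reader accepts the data.
                    throw new IOException("Unsupported image format");
                }
                return image;
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            } catch (Exception e) {
                throw new IllegalStateException(e);
            }
        }, worker);
    }

    @Override
    public void close() {
        worker.shutdown();
    }
}
```

The render loop can poll the returned future with isDone() each frame instead of blocking on it, which is the whole point of moving the work off the main thread.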
The input stream for the image file is passed to the box labeled Image Decoder. Most image types are decoded with Java’s built-in ImageIO API. The ImageIO code path decodes the file to a BufferedImage, which is then converted to an OpenGL-compatible pixel format. Colors are converted to premultiplied alpha so that the colors of transparent pixels don’t leak into neighboring pixels during scaling and other operations requiring linear interpolation. Premultiplication also simplifies certain blending operations.
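The premultiplication step is simple enough to show in full. The sketch below assumes pixels packed as 0xAARRGGBB ints (the layout BufferedImage.TYPE_INT_ARGB uses); the class name is hypothetical and the rounding choice is mine, not necessarily what NVList does.

```java
/** Hypothetical helper (not NVList code): premultiplies packed ARGB pixels. */
final class PremultiplyAlpha {

    /** Converts one non-premultiplied 0xAARRGGBB pixel to premultiplied form. */
    static int premultiply(int argb) {
        int a = argb >>> 24;
        int r = (argb >> 16) & 0xFF;
        int g = (argb >> 8) & 0xFF;
        int b = argb & 0xFF;

        // Scale each color channel by alpha/255, rounding to the nearest integer.
        r = (r * a + 127) / 255;
        g = (g * a + 127) / 255;
        b = (b * a + 127) / 255;

        return (a << 24) | (r << 16) | (g << 8) | b;
    }

    /** Premultiplies every pixel of the array in place. */
    static void premultiplyInPlace(int[] pixels) {
        for (int i = 0; i < pixels.length; i++) {
            pixels[i] = premultiply(pixels[i]);
        }
    }
}
```

A fully transparent pixel always becomes 0x00000000 after this step, whatever color it carried before, so it can no longer tint its neighbors when the texture is filtered.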
KTX files use a specialized path. These files are basically raw OpenGL textures with a tiny header, which makes loading them extremely efficient. After parsing the header, the image data can be copied directly into a large in-memory buffer suitable for passing to OpenGL. The KTX files are assumed to already be in premultiplied form, so that step can be skipped as well.
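To give an idea of how small that header is, here is a sketch of a parser for it. It assumes the KTX 1.1 container layout (a 12-byte identifier followed by thirteen 32-bit fields); the post doesn’t say which KTX version NVList reads, and the KtxHeader class is purely illustrative.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

/** Hypothetical reader for the fixed 64-byte KTX 1.1 header (not NVList code). */
final class KtxHeader {
    static final int HEADER_BYTES = 64;

    // File identifier as defined by the KTX 1.1 specification.
    private static final byte[] MAGIC = {
        (byte) 0xAB, 'K', 'T', 'X', ' ', '1', '1', (byte) 0xBB, '\r', '\n', 0x1A, '\n'
    };

    final int glType;
    final int glFormat;
    final int glInternalFormat;
    final int width;
    final int height;
    final int mipmapLevels;
    final int keyValueBytes;

    private KtxHeader(ByteBuffer buf) {
        glType = buf.getInt();
        buf.getInt(); // glTypeSize (unused here)
        glFormat = buf.getInt();
        glInternalFormat = buf.getInt();
        buf.getInt(); // glBaseInternalFormat (unused here)
        width = buf.getInt();
        height = buf.getInt();
        buf.getInt(); // pixelDepth, 0 for 2D textures
        buf.getInt(); // numberOfArrayElements
        buf.getInt(); // numberOfFaces
        mipmapLevels = buf.getInt();
        keyValueBytes = buf.getInt();
    }

    /** Parses the header at the buffer's position, leaving the position just past it. */
    static KtxHeader parse(ByteBuffer buf) {
        if (buf.remaining() < HEADER_BYTES) {
            throw new IllegalArgumentException("Truncated KTX header");
        }
        for (byte expected : MAGIC) {
            if (buf.get() != expected) {
                throw new IllegalArgumentException("Not a KTX 1.1 file");
            }
        }
        // The endianness field holds 0x04030201 written in the file's own byte order.
        buf.order(ByteOrder.LITTLE_ENDIAN);
        int endianness = buf.getInt();
        if (endianness == 0x01020304) {
            buf.order(ByteOrder.BIG_ENDIAN);
        } else if (endianness != 0x04030201) {
            throw new IllegalArgumentException("Invalid endianness marker");
        }
        return new KtxHeader(buf);
    }
}
```

After the header come keyValueBytes of metadata and then, per mipmap level, a 32-bit size followed by the raw pixel data, which is why the data can be copied almost directly into a buffer for OpenGL.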
The decoded, premultiplied pixels are wrapped in TextureData objects: small wrappers containing the decoded pixels with their corresponding OpenGL format/type. A texture cache maintains a fixed-size (configurable with graphics.imageCacheSize) in-memory collection of decoded textures in order to avoid reloading images as much as possible. Adding TextureData to the cache is where the asynchronous image loader finishes, because further operations require an OpenGL context. When the texture cache becomes full, it starts evicting textures that haven’t been used recently, ranking their usefulness with a simple heuristic.
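As an illustration of the eviction idea, here is a plain least-recently-used cache with a byte budget. This is a stand-in: NVList’s real heuristic may weigh other factors, the unit of graphics.imageCacheSize isn’t specified here, and the LruTextureCache class is hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical byte-budgeted LRU cache (a stand-in, not NVList's heuristic). */
final class LruTextureCache<V> {

    private static final class Entry<V> {
        final V value;
        final long bytes;

        Entry(V value, long bytes) {
            this.value = value;
            this.bytes = bytes;
        }
    }

    private final long maxBytes;
    private long usedBytes;

    // accessOrder=true makes iteration run from least to most recently used.
    private final LinkedHashMap<String, Entry<V>> entries = new LinkedHashMap<>(16, 0.75f, true);

    LruTextureCache(long maxBytes) {
        this.maxBytes = maxBytes;
    }

    /** Returns the cached value (marking it as recently used), or null if absent. */
    synchronized V get(String path) {
        Entry<V> e = entries.get(path);
        return e == null ? null : e.value;
    }

    synchronized boolean contains(String path) {
        return entries.containsKey(path);
    }

    /** Adds an entry, then evicts least recently used entries until the budget fits. */
    synchronized void put(String path, V value, long bytes) {
        Entry<V> old = entries.put(path, new Entry<>(value, bytes));
        if (old != null) {
            usedBytes -= old.bytes;
        }
        usedBytes += bytes;

        // The entry just added is last in iteration order, so it is never evicted here.
        Iterator<Map.Entry<String, Entry<V>>> it = entries.entrySet().iterator();
        while (usedBytes > maxBytes && entries.size() > 1 && it.hasNext()) {
            usedBytes -= it.next().getValue().bytes;
            it.remove();
        }
    }
}
```

The methods are synchronized because the cache is written by the loader thread and read by the render thread.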
"Last-Second Format Conversion" comprises a series of texture format conversions, most of them optional and only required for outdated graphics cards. The currently implemented format conversions:
Pad to power-of-two
Support for textures (image data) with dimensions that aren’t a power of two requires the GL_ARB_texture_non_power_of_two extension (standard since OpenGL 2.0). For OpenGL implementations that don’t support this extension, NVList has to pad the image data with empty pixels until each dimension is a power of two.
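The padding itself can be sketched in a few lines. This assumes pixels stored row by row in an int array, with the image kept in the top-left corner of the padded texture; the class and method names are illustrative only.

```java
/** Hypothetical helper (not NVList code): pads pixel data to power-of-two dimensions. */
final class PowerOfTwoPadding {

    /** Returns the smallest power of two that is greater than or equal to n. */
    static int nextPowerOfTwo(int n) {
        if (n <= 0 || n > (1 << 30)) {
            throw new IllegalArgumentException("Dimension out of range: " + n);
        }
        int highest = Integer.highestOneBit(n);
        return highest == n ? n : highest << 1;
    }

    /** Copies the image into a larger, zero-filled (fully transparent) buffer. */
    static int[] pad(int[] pixels, int width, int height) {
        int paddedWidth = nextPowerOfTwo(width);
        int paddedHeight = nextPowerOfTwo(height);
        int[] padded = new int[paddedWidth * paddedHeight];
        for (int y = 0; y < height; y++) {
            // Each source row lands at the start of the corresponding wider row.
            System.arraycopy(pixels, y * width, padded, y * paddedWidth, width);
        }
        return padded;
    }
}
```

Texture coordinates then have to be scaled by width/paddedWidth and height/paddedHeight so that only the original image area is drawn.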
Textures are decoded as
UNSIGNED_INT_8_8_8_8_REV to match the format used by most graphics cards (improves performance). Very old OpenGL implementations don’t support either
UNSIGNED_INT_8_8_8_8_REV (both standard since OpenGL 1.2). If the default format isn’t supported, texture data is converted to the universally supported (but often a bit slower)
Limit to max texture dimensions
OpenGL implementations have a limit on the maximum texture dimensions they support. Almost everything these days supports at least 4096x4096, but it’s possible to encounter antiquated hardware with lower limits. When faced with a texture too large for OpenGL, NVList resizes the image in software before uploading it to the graphics card. If debug mode is turned on, a warning is printed whenever such a "panic resize" is performed. The graphics.maxTextureSize preference can be used to simulate low texture size limits for testing.
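Working out the target size for such a resize might look like the sketch below. It assumes the aspect ratio should be preserved, which the post doesn’t state explicitly, and it covers only the size computation, not the resampling itself. TextureSizeLimiter is a made-up name.

```java
/** Hypothetical helper (not NVList code): computes a size that fits the GL limit. */
final class TextureSizeLimiter {

    /**
     * Returns {width, height} scaled down to fit within maxSize x maxSize while
     * keeping the aspect ratio. Sizes already within the limit are returned as-is.
     */
    static int[] fitWithin(int width, int height, int maxSize) {
        if (width <= 0 || height <= 0 || maxSize <= 0) {
            throw new IllegalArgumentException("Sizes must be positive");
        }
        if (width <= maxSize && height <= maxSize) {
            return new int[] {width, height};
        }
        // Integer math avoids floating-point rounding surprises; long avoids overflow.
        if (width >= height) {
            int scaledHeight = (int) Math.max(1L, (long) height * maxSize / width);
            return new int[] {maxSize, scaledHeight};
        }
        int scaledWidth = (int) Math.max(1L, (long) width * maxSize / height);
        return new int[] {scaledWidth, maxSize};
    }
}
```

The limit passed as maxSize would normally come from querying GL_MAX_TEXTURE_SIZE, or from a lower value when simulating old hardware.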
The way mipmaps are generated depends on the OpenGL version. OpenGL 1.4 introduced the GL_GENERATE_MIPMAP texture parameter, which automatically regenerates the mipmaps whenever the texture data is changed. The modern approach would be to use glGenerateMipmap, but that function requires the GL_ARB_framebuffer_object extension (standard since OpenGL 3.0). For OpenGL implementations supporting only 1.3 or lower, mipmap generation is done entirely in software.
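For the software fallback, one common approach is a 2x2 box filter that halves the image for each mipmap level. The sketch below shows that approach; it is not necessarily the filter NVList uses, the class name is hypothetical, and it assumes premultiplied pixels packed as four 8-bit channels per int.

```java
/** Hypothetical helper (not NVList code): builds mipmap levels with a 2x2 box filter. */
final class SoftwareMipmaps {

    /** Number of levels in a full mipmap chain, counting the base level. */
    static int levelCount(int width, int height) {
        return 32 - Integer.numberOfLeadingZeros(Math.max(width, height));
    }

    /** Produces the next smaller level: max(1, w/2) x max(1, h/2) pixels. */
    static int[] downsample(int[] src, int w, int h) {
        int dw = Math.max(1, w / 2);
        int dh = Math.max(1, h / 2);
        int[] dst = new int[dw * dh];
        for (int y = 0; y < dh; y++) {
            // Clamp so that 1-pixel-wide or 1-pixel-high images stay in bounds.
            int y0 = Math.min(2 * y, h - 1);
            int y1 = Math.min(2 * y + 1, h - 1);
            for (int x = 0; x < dw; x++) {
                int x0 = Math.min(2 * x, w - 1);
                int x1 = Math.min(2 * x + 1, w - 1);
                dst[y * dw + x] = average4(
                        src[y0 * w + x0], src[y0 * w + x1],
                        src[y1 * w + x0], src[y1 * w + x1]);
            }
        }
        return dst;
    }

    /** Averages four packed pixels channel by channel, rounding to nearest. */
    static int average4(int p0, int p1, int p2, int p3) {
        int result = 0;
        for (int shift = 0; shift < 32; shift += 8) {
            int sum = ((p0 >>> shift) & 0xFF) + ((p1 >>> shift) & 0xFF)
                    + ((p2 >>> shift) & 0xFF) + ((p3 >>> shift) & 0xFF);
            result |= ((sum + 2) / 4) << shift;
        }
        return result;
    }
}
```

Averaging channels directly like this is only correct because the pixels are premultiplied; with straight alpha, transparent pixels would drag their hidden colors into the smaller levels.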
After the texture data is in the correct format, it can be uploaded to the graphics card (with glTexImage2D). A call to glTexImage2D can take quite a while for the kinds of large textures used by visual novels, possibly resulting in some dropped frames on older machines. Asynchronous texture uploads are possible by using pixel buffer objects, or PBOs (requires GL_ARB_pixel_buffer_object, standard since OpenGL 2.1). PBO texture uploads are used by NVList’s preloader if graphics.preloadGLTextures is enabled. PBO texture uploads can be asynchronous, but implementations aren’t required to make them so. In practice, for PBO uploads to be asynchronous, there may be additional requirements for data alignment, pixel formats or texture dimensions.
Asynchronous loading only helps when you can start decoding the data before you really need it. The preloader in NVList works in two steps, first decoding the image to the texture cache, then uploading the texture data to the graphics card. The preloader is fed statistical data collected over multiple playthroughs of the VN (among other things, keeping track of every time an image is loaded from disk). Based on the statistics collected, the preloader looks ahead in the script for future image loads and starts decoding the images asynchronously. Once images are decoded to the texture cache, pixel data is asynchronously uploaded to the graphics card using PBOs.
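The look-ahead part can be illustrated with a small sketch. It makes some big simplifying assumptions that the post doesn’t confirm: a single script file, statistics keyed by line number, and a fixed look-ahead window. PreloadStats and its methods are hypothetical names.

```java
import java.util.LinkedHashSet;
import java.util.Set;
import java.util.TreeMap;

/** Hypothetical statistics store (not NVList code): remembers where images were loaded. */
final class PreloadStats {

    // Script line number -> images that were loaded when that line executed.
    private final TreeMap<Integer, Set<String>> loadsByLine = new TreeMap<>();

    /** Called whenever an image is loaded from disk during a playthrough. */
    void recordLoad(int scriptLine, String imagePath) {
        loadsByLine.computeIfAbsent(scriptLine, k -> new LinkedHashSet<>()).add(imagePath);
    }

    /** Images expected within the next lookAhead lines after currentLine, in script order. */
    Set<String> upcoming(int currentLine, int lookAhead) {
        Set<String> result = new LinkedHashSet<>();
        for (Set<String> images
                : loadsByLine.subMap(currentLine, false, currentLine + lookAhead, true).values()) {
            result.addAll(images);
        }
        return result;
    }
}
```

Each path returned by upcoming() would be handed to the asynchronous decoder first, and only once it sits in the texture cache would the upload to the graphics card be scheduled.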