Introduction. The 2020s were marked by the emergence of a new generation of computer simulators using augmented reality. One of the promising advantages of augmented reality technology is the ability to safely simulate hazardous situations in the real world. A prerequisite for realizing this advantage is providing the visual coherence of augmented reality scenes: virtual objects must be indistinguishable from real ones. All major IT companies regard augmented reality as the next "big wave"; thus, visual coherence is becoming a key issue for IT in general. However, it is in aerospace applications that visual coherence has already acquired practical significance. An example is Boeing's development of an augmented reality flight simulator, which began in 2022. Visual coherence is a complex problem, one aspect of which is providing the correct overall coloration of virtual objects in an augmented reality scene. The objective of this research was to develop a new method of such tinting.

Materials and Methods. The developed method (called spectral transplantation) uses two-dimensional spectral image transformations.

Results. A spectral transplantation technology is proposed that provides direct transfer of color, brightness, and contrast characteristics from the real background to virtual objects. An algorithm for automatic selection of the optimal type of spectral transformation has been developed.

Discussion and Conclusion. Being a fully automatic process without recording lighting conditions, spectral transplantation solves a number of complex problems of visual coherence. Spectral transplantation can be a valuable addition to other methods of providing visual coherence.


Introduction. Modern simulators imply, practically by default, the use of virtual reality (VR). The advantages of this approach are well known; therefore, we will not dwell on them, but we will note a number of significant and, more importantly, insurmountable disadvantages stemming from the very nature of virtual reality technology. VR is a digital, discrete technology, while the real world is continuous. Therefore, modeling the real world in VR is inevitably associated with errors, which reduces the efficiency of training. However, for training systems, an even more serious drawback is that human decisions are largely based on subconscious consideration of numerous details of the real picture of the world. This process is fundamentally impossible to reproduce using purely computer technologies (e.g., VR) for two reasons. First, we still do not know (and are unlikely to ever know) what the mechanism of the human brain is; the latest speculations on the topic of artificial intelligence only confirm this. Second, the details of the real world taken into account when making decisions are almost infinite in number; they arise randomly and are of quite different natures (visual, acoustic, tactile, etc.).

The emergence of augmented reality (AR) training systems in the 2020s reduced the severity of this problematic situation. Examples are Boeing's development of an augmented reality pilot simulator based on the well-known R6 ATARS project, which began in the fall of 2022, a similar project launched by British BAE Systems, and the air traffic control training simulator considered in this article. In AR, all the informational wealth of the world around us is presented explicitly and does not require modeling. However, to realize the advantages of AR associated with the parallel presence of real and virtual objects in scenes, the problem of visual coherence (VC) must be solved: virtual objects must be indistinguishable from real ones. This article proposes a method for solving the problem of visual coherence within a project on the development of a training system for air traffic controllers.

AR is a derivative form of VR. AR retains all the features of VR, but, in addition, as a hybrid technology, it has significant advantages arising from the parallel coexistence of virtual and real objects, which attracts the attention of developers to VC. Moreover, studies [

An exhaustive overview of the known VC methods can be found in [

This work was aimed at developing a universal and automatic method that provides direct transfer of color, brightness, and contrast characteristics from a real background to virtual objects without the digital 3D modeling required by existing VC approaches. The method is based on the mathematical apparatus of two-dimensional spectral transformations; we called it "spectral transplantation".

The key results of this study are:

It is important to note that VC depends on many factors: lighting, shadows, color tone, mutual reflections, surface texture, optical aberrations, convergence, accommodation, etc. Accordingly, various AR visualization techniques were used. In our case, VC is provided only for the factors of general illumination and coloring of virtual objects in AR. This is one of the VC challenges, especially for outdoor scenes. Therefore, spectral transplantation should be used in combination with other VC methods to achieve full VC.

The list of sources in [

Measurement of lighting conditions

Using a light probe with diffuse bands between mirror spherical quadrants, P. Debevec and others [

A. Alhakamy and M. Tuceryan [

Assessment of lighting conditions

S.B. Knorr and D. Kurtz [

We should mention work [

The closest analogues of the proposed method are approaches that, like spectral transplantation, do not involve preliminary measurements of lighting and simulation of lighting conditions, scene geometry, surface reflection, and also provide for automatic processing.

Among such analogues, there are methods of color transfer from image to image. Paper [

Xuezhong Xiao and Lizhuang Ma [

The advantages of the developed method, in comparison to [

The proposed method uses two-dimensional spectral transformations. Various types of images are optimally described by different types of spectral transformations ("optimally" in the sense of matching visual perception for real and virtual objects). The transforms actively used in digital image processing since the advent of digital television are the Discrete Fourier Transform, the Discrete Cosine Transform, the Hadamard Transform, the S-Transform, and the Karhunen-Loeve Transform.

Materials and Methods. The scheme of the spectral transplantation method (the version using the Fourier transform [

Fig. 1. Scheme of the spectral transplantation method. A version using the Fourier transform

Fig. 2. Real world (WF) and virtual world (VF) video frames: a — WF, airport, cloudy weather; b — WF, airport, sunny weather; c — VF, virtual airplane. WF are small fragments (<25%) of images published on the websites sydneyairport.com.au and 6sqft.com

The goal of this method is to transfer the main characteristics of the image from WF to VF. The scheme of the method is very simple, although the operations have a large computational volume. The method is implemented in five stages (Fig. 1):

1) Selection of color (RGB) channels for WF (WFr, WFg, WFb) and for VF (VFr, VFg, VFb). The RGB model is used because of its generality and the correlation between channels specific to this model.

2) Calculation of the two-dimensional direct Fourier transform (DFT): DFT(WFr), DFT(WFg), DFT(WFb), DFT(VFr), DFT(VFg), DFT(VFb). The DFT formula is given below:

$$X_c(k,l)=\sum_{m=0}^{M-1}\sum_{n=0}^{N-1}x_c(m,n)\,e^{-i2\pi\left(\frac{km}{M}+\frac{ln}{N}\right)} \quad (1)$$

where c = R, G, B is the index of the red, green, and blue color image channels; M, N are the numbers of rows and columns of the pixel matrix of the transformed image; k, l are spatial frequency arguments; xc(m,n) is the pixel value with spatial coordinates (m,n) in channel c; Xc(k,l) are complex spectral coefficients.
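The per-channel DFT of stage 2 can be sketched with NumPy's FFT routines; `channel_spectra` is an illustrative helper name, not part of the paper's implementation:

```python
import numpy as np

def channel_spectra(frame: np.ndarray) -> np.ndarray:
    """Compute the 2D DFT of each RGB channel of an M x N x 3 image.

    Returns a complex array of the same shape, one spectrum per channel,
    corresponding to X_c(k, l) in formula (1) for c in {R, G, B}.
    """
    # np.fft.fft2 transforms one 2D channel at a time; stack the three
    # channel spectra back along the last axis.
    return np.stack([np.fft.fft2(frame[..., c]) for c in range(3)], axis=-1)
```

Note that the coefficient at (k, l) = (0, 0) is the constant component, i.e. the sum of all pixel values of the channel.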

3) This is a key stage. Transplantation of low-frequency part (LFP) is carried out between pairs of WF and VF spectra for each of the red, green and blue channels. This means that VF LFP is replaced by the corresponding WF LFP. The idea of spectral transplantation is based on the following property of DFT: the general character of the image (i.e., color hue, brightness, contrast) depends on the spatial frequencies contained in LFP (including the constant component) of its two-dimensional spectrum.

Thus, by transplanting the WF LFP into the VF spectrum, we transfer the main characteristics of the image from WF to VF. For this, it is more convenient to use the centered form of a two-dimensional spectrum, where the constant component is located in the center of the matrix of spectral coefficients, and the low-frequency components are arranged symmetrically around it. In a centered spectrum, the LFP is the central part of the DFT matrix, of size Ml × Nl (Ml < M, Nl < N). If Ml = Nl, the square LFP matrix can be denoted LFP(012..F), where 0 is the constant component and F is the number of the largest spatial frequency in the LFP matrix.

The size of the LFP for transplantation depends on the size of the transformed image (this size determines the spectral resolution) and on the volume of image characteristics that should be borrowed from WF. At the current stage of research, the size of the LFP is determined empirically. For example, the best visual results for 512×512-pixel images were obtained using LFP(012345).
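Stages 2-4 for a single color channel can be sketched as follows, using NumPy's `fftshift` to obtain the centered spectrum; the function name `transplant_lfp` and its signature are illustrative, not the paper's implementation:

```python
import numpy as np

def transplant_lfp(wf: np.ndarray, vf: np.ndarray, f: int) -> np.ndarray:
    """Replace the low-frequency part (LFP) of VF's spectrum with WF's.

    wf, vf: single-channel float images of equal size M x N.
    f: highest transplanted spatial frequency, i.e. LFP(01..f); the
       centred LFP block spans (2f + 1) coefficients per axis.
    Returns the corrected VF channel after the inverse transform (stage 4).
    """
    wf_spec = np.fft.fftshift(np.fft.fft2(wf))  # centred: DC in the middle
    vf_spec = np.fft.fftshift(np.fft.fft2(vf))
    cy, cx = wf.shape[0] // 2, wf.shape[1] // 2  # constant-component position
    rows = slice(cy - f, cy + f + 1)
    cols = slice(cx - f, cx + f + 1)
    # Stage 3: transplant the central LFP block from WF into VF's spectrum.
    vf_spec[rows, cols] = wf_spec[rows, cols]
    # Stage 4: inverse transform; the imaginary residue is numerical noise
    # because the transplanted block keeps conjugate symmetry.
    return np.real(np.fft.ifft2(np.fft.ifftshift(vf_spec)))
```

Applying this to each RGB channel and merging the results implements stages 1-5 of Figure 1 for one frame pair.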

4) Restoring the RGB channels of VF using the two-dimensional reverse Fourier transform (RFT). At this stage, the characteristics of WF and VF are mixed. As a result, the RGB channels of the VF image are obtained with the main color, brightness, and contrast characteristics of WF, as well as with characteristics inherited from the original VF.

5) Restoring the corrected VF color by merging the RGB channels obtained at the previous stage, cutting out the virtual objects, and building the AR scene by superimposing the cut-out virtual objects on WF.

Obviously, if this method is used to process the WF video stream, there is no need to calculate the DFT (and, accordingly, the LFP) for every frame of the real world, since the main characteristics of the image change only with a radical change in the recorded scene. Such changes can easily be detected by jumps in the average pixel value; at these moments, the spectral transformation for the LFP needs to be recalculated.
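A minimal sketch of such scene-change detection; the threshold value is an illustrative assumption and would need tuning for a particular camera and scene:

```python
import numpy as np

def scene_changed(prev_frame: np.ndarray, frame: np.ndarray,
                  threshold: float = 10.0) -> bool:
    """Detect a radical WF scene change by a jump in the average pixel value.

    threshold is expressed in 8-bit intensity units; 10.0 is only an
    illustrative default, not a value from the paper.
    """
    return abs(float(frame.mean()) - float(prev_frame.mean())) > threshold
```

In the video pipeline, the WF DFT and its LFP would be recomputed only for frames where `scene_changed` returns True.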

Since various types of images are optimally described by different types of spectral transformations mentioned above, it is reasonable to develop an automatic algorithm for selecting the optimal type of transformation for use in spectral transplantation.

We propose to estimate the difference between the visual perception of VF and WF by the RMS distance Δ between the LFP power spectra of their images (for all color channels):

$$\Delta_c=\sqrt{\frac{1}{M_l N_l}\sum_{(k,l)\in\mathrm{LFP}}\bigl(P_{V,c}(k,l)-P_{W,c}(k,l)\bigr)^2},\qquad c=R,G,B \quad (2)$$

where PV and PW are the two-dimensional power spectra of VF and WF, respectively. For example, in the case of the Fourier transform, the formula for P has the form:

$$P_c(k,l)=\left|X_c(k,l)\right|^2 \quad (3)$$

We propose to determine the optimal type of spectral transformation by the proximity of the vectors Δ to the mean vector, calculated by the criterion of the minimum sum of squared distances between the mean vector and the vectors Δ for all transformations under consideration.

Let Δj(ΔjR, ΔjG, ΔjB) be the normalized vector of the distance between the VF and WF LFP spectra for transformation j, let Δa(ΔaR, ΔaG, ΔaB) be the mean vector, and let Dj be the distance between Δj and Δa. Then, the sum S of squared distances from the vectors Δj of all transformations under consideration to the mean vector is:

$$S=\sum_{j}D_j^2=\sum_{j}\left[(\Delta_{jR}-\Delta_{aR})^2+(\Delta_{jG}-\Delta_{aG})^2+(\Delta_{jB}-\Delta_{aB})^2\right] \quad (4)$$

The coordinates ΔaR, ΔaG, ΔaB of the mean vector are calculated as the solution of the system of equations obtained by setting the partial derivatives of S to zero:

$$\frac{\partial S}{\partial\Delta_{aR}}=0,\qquad \frac{\partial S}{\partial\Delta_{aG}}=0,\qquad \frac{\partial S}{\partial\Delta_{aB}}=0 \quad (5)$$

The selection of the optimal type of spectral transformation is determined by the proximity condition:

$$j_{\text{opt}}=\arg\min_{j}D_j \quad (6)$$

Another obvious criterion for selecting the optimal type of transformation is the length of the vectors Δ. However, the extremes of such a criterion may be related to the ability or inability of certain transformations to correctly detect the difference between certain types of WF and VF. Therefore, we consider the use of the mean vector as a more reliable method of selection.
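The selection procedure of formulas (4)-(6) reduces to taking the arithmetic mean of the Δ vectors, which solves the minimum-sum-of-squares criterion, and picking the transformation whose vector lies closest to it. A sketch, assuming the normalized Δ vectors have already been computed for each candidate transformation (the transform names in the usage example are placeholders):

```python
import numpy as np

def select_transform(deltas: dict) -> str:
    """Pick the spectral transformation whose distance vector lies
    closest to the mean vector, per criteria (4)-(6).

    deltas maps a transformation name to its normalized vector
    (Delta_R, Delta_G, Delta_B) between the VF and WF LFP power spectra.
    """
    names = list(deltas)
    d = np.array([deltas[n] for n in names], dtype=float)  # row per transform
    mean_vec = d.mean(axis=0)  # minimizes the sum of squared distances (5)
    dist = np.linalg.norm(d - mean_vec, axis=1)  # D_j from (4)
    return names[int(np.argmin(dist))]  # proximity condition (6)
```

For example, given Δ vectors for three candidate transforms, the function returns the name of the one nearest the mean vector, so a single outlier transform cannot be selected.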

Similar to the DFT calculation for WF, the optimal type of transformation is selected only once at the beginning of spectral transplantation, unless WF is changed dramatically.

Research Results. The proposed method was tested using the Fourier transform without selecting the optimal type of transformation. WF (a real airport scene) and VF (a virtual airplane model) had a size of 512×512 pixels and 24-bit color. Two different conditions were investigated:

1) WF — photo of the airport in cloudy weather (Fig. 2 a);

2) WF — photo of the airport in sunny weather (Fig. 2 b).

In both cases, VF contained the 3D model of the aircraft shown in Figure 2 c. LFP(0), LFP(01), LFP(012), LFP(0123), LFP(01234), and LFP(012345) transplants were tested. Some of the test results are shown in Figure 3. The best visual results were obtained using LFP(012345). In Figure 3, the images after spectral transplantation are intentionally shown without other VC effects (shadows, lighting, etc.) to demonstrate the pure results of this method.

Fig. 3. AR-scene: a — consisting of WF and VF without LFP transplantation; b — AR-scene after LFP(0123) transplantation; c — AR-scene after LFP(012345) transplantation

The upper and lower rows in Figure 3 correspond to opposite conditions for WF: light and dark WF with different shades. Experiments with any other WF would not add significant new information, since their conditions lie between those already presented in Figure 3.

Numerical simulation was carried out to demonstrate the mechanism of spectral transplantation. Figure 4 shows Fourier transplantation using a small (8×8-pixel) matrix representing one of the color channels of WF and VF. Such a small matrix makes it possible to illustrate the transplantation procedure clearly. In this example, the WF matrix can be associated with an image with a vertical gradient fill, and the VF matrix with an image with a horizontal gradient fill. Another difference between WF and VF is the range of pixel values: 8-15 for WF ("lighter image", 8 being the constant component) and 0-7 for VF ("darker image"). LFP(01') transplantation is shown, where 1' means part of the first spatial frequency component (used because of the very low resolution of the 8×8 matrix). The 3D form of the VF matrix after transplantation indicates the transfer of properties from WF to VF: the edge of the surface has risen, and the first pixel has received the value of the WF constant component. This example demonstrates how, as a result of spectral transplantation, VF starts to acquire a vertical gradient and a constant component.

Fig. 4. Numerical simulation of spectral transplantation for 8×8-pixel matrices
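The 8×8 simulation can be sketched with NumPy. The paper does not list the exact coefficients of the partial LFP(01') transplant, so the choice below (constant component plus the adjacent vertical-frequency coefficients) is an illustrative assumption:

```python
import numpy as np

# WF: vertical gradient, values 8..15 ("lighter"); VF: horizontal gradient, 0..7.
wf = np.tile(np.arange(8.0, 16.0).reshape(8, 1), (1, 8))
vf = np.tile(np.arange(0.0, 8.0).reshape(1, 8), (8, 1))

wf_spec = np.fft.fftshift(np.fft.fft2(wf))
vf_spec = np.fft.fftshift(np.fft.fft2(vf))

# Transplant the constant component plus the two adjacent vertical-frequency
# coefficients: an illustrative stand-in for the partial LFP(01') transplant.
c = 4  # centre index (constant component) of the shifted 8x8 spectrum
vf_spec[c - 1:c + 2, c] = wf_spec[c - 1:c + 2, c]

vf_new = np.real(np.fft.ifft2(np.fft.ifftshift(vf_spec)))
# VF inherits WF's mean brightness (11.5 instead of 3.5) and begins to
# acquire a vertical gradient while keeping its own horizontal one.
```

The conjugate-symmetric pair of first-frequency coefficients is transplanted together, so the inverse transform remains real-valued.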

Spectral transplantation provides several options for changing the parameters of this procedure: changing LFP size; selecting individual components of the spectrum for transplantation; using different transplant coefficients for various components to be transplanted.

Figure 5 shows the effect of transplantation with different parameters for various types of virtual objects: virtual aircraft models that differ in surface texture, markings, and gloss. Figures 5 a and b depict a virtual airplane with complex textures, text symbols, and reflections of virtual light sources. Figures 5 d, e and f show a virtual plane with simple contrasting colors. Parts a and d contain virtual objects without transplantation; b and e contain virtual objects after LFP(0123) transplantation; c and f contain virtual objects after LFP(012345) transplantation. Virtual objects are intentionally shown without other VC effects (shadows, lighting, etc.) to demonstrate the pure results of the method.

Fig. 5. Scenes with cloudy WF: a, d — AR-scenes consisting of WF and VF without LFP transplantation; b, e — AR-scenes composed after LFP(0123) transplantation; c, f — AR-scenes composed after LFP(012345) transplantation

It is important to emphasize that the presented figures illustrate the possibilities of tuning the proposed method, not its final result, since the method requires tuning to a specific WF. Demonstrating polished but incomplete results, as is often practiced in VC works, does not seem correct to us.

Discussion and Conclusions. The key complicating factor for the described method, presented in Figure 1, is the high computational cost. The most promising way to solve this problem is to convert WF LFP parameters directly into VF rendering parameters. This eliminates the cumbersome procedures of three DFT and three RFT calculations at the second and fourth processing stages and requires only three WF DFT calculations, performed once for each segment of the WF stream within which the basic characteristics of WF do not change significantly. This approach allows processing the VF stream in real time.

Another problem is selecting the optimal LFP size. As the volume of spatial frequencies used increases, they begin to hold information about the WF contents. Therefore, limiting the size of LFP is needed to eliminate the effect of a hybrid image [

In further research related to the topic of this paper, the following issues will be considered:

As a fully automatic process without measuring illumination, the proposed spectral transplantation method solves a number of complex VC problems, for example, how best to align the color, brightness, and contrast characteristics between the real and virtual components of AR scenes. All these tasks are solved through one simple procedure without modeling lighting conditions, AR-scene geometry, or BRDF, which eliminates the inevitable modeling errors. The proposed method can be a valuable addition to other VC tools.

The authors declare that there are no conflicts of interest present.