Synthetic data generator for AI chroma upsampling

Written by:

Sergio Sánchez

Oct. 10, 2024

Innovation days spotlight: A deep dive into creative solutions

During Innovation Days, we step away from routine tasks to explore creative, forward-looking projects. All standard meetings are paused during this period, allowing our teams to focus entirely on turning novel ideas into functional prototypes.

Chroma subsampling effect in desktop remoting

Poor chrominance clarity can lead to blurry text and visual artifacts in desktop-as-a-service (DaaS) and virtual desktop infrastructure (VDI) environments. By enhancing chroma subsampling methods to improve image sharpness and colour accuracy, we deliver more accurate visual outputs, resolving a key challenge in remote work technology.

Figure: Comparing text clarity in different chroma subsampling settings. It illustrates text clarity and colour accuracy differences between 4:2:0 and 4:4:4 (no subsampling) settings, emphasizing the enhanced detail in 4:4:4. Image source here.

The objective: pushing the boundaries with AI upscaling

Our ultimate goal is to create an AI model, deployed at the endpoint (e.g., thin client, laptop, or tablet), that meets or surpasses current standards in YUV 4:2:0 to YUV 4:4:4 upscaling. The model uses the full-resolution Y channel to enhance the U and V channels, improving the output quality in a more consistent and realistic manner.
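For context, the kind of conventional upscaling such a model would need to beat can be sketched in a few lines. The example below is only an illustrative baseline, assuming nearest-neighbour replication of each chroma sample and a simple mean absolute error against a 4:4:4 reference; the function names are hypothetical and this is not the AI model described in this article.

```python
def upsample_nearest(plane):
    """Replicate each chroma sample into a 2x2 block (nearest-neighbour).

    `plane` is a list of rows holding a quarter-resolution U or V plane.
    This is a naive baseline, not the learned upscaler.
    """
    out = []
    for row in plane:
        wide = [value for value in row for _ in range(2)]  # double the width
        out.append(wide)
        out.append(list(wide))                             # double the height
    return out


def mean_abs_error(plane_a, plane_b):
    """Average absolute difference between two planes of equal size."""
    total = sum(abs(a - b)
                for row_a, row_b in zip(plane_a, plane_b)
                for a, b in zip(row_a, row_b))
    return total / (len(plane_a) * len(plane_a[0]))


# A 1x2 chroma plane becomes a 2x4 plane after upsampling.
restored = upsample_nearest([[100, 200]])
reference = [[100, 120, 200, 200],
             [100, 100, 200, 180]]
error = mean_abs_error(restored, reference)
```

Comparing the restored plane with a true 4:4:4 reference in this way gives a simple number against which any learned upscaler could be judged.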

Our immediate focus is to generate a robust and varied synthetic dataset that provides a solid foundation for training a deep learning model. This dataset will help avoid biases and ensure the model is well-rounded and efficient.

Introducing the synthetic desktop data generator YUV444 & YUV420

Purpose and design

The synthetic desktop data generator is an application that creates datasets in YUV444 and YUV420 formats, mimicking screen content encoding scenarios in DaaS/VDI. It produces images with uniform background colours and varied attributes such as fonts, sizes, and text colours, providing the variance needed for AI model training and validation. These datasets specifically support the development of AI models focused on chroma upscaling, which enhance decoded YUV 4:2:0 content towards YUV 4:4:4 quality.
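To make the idea of varied attributes concrete, here is a minimal sketch of how one image specification per sample might be drawn at random. The attribute pools, the `sample_spec` function, and the dictionary keys are all hypothetical; the article does not list the generator's actual options.

```python
import random

# Hypothetical attribute pools, chosen only for illustration.
FONTS = ["DejaVu Sans", "Liberation Serif", "Noto Mono"]
FONT_SIZES = [8, 10, 12, 14, 18, 24]


def random_colour(rng):
    """Return a random 8-bit RGB triple."""
    return tuple(rng.randrange(256) for _ in range(3))


def sample_spec(rng):
    """Draw one image specification: uniform background plus text attributes."""
    background = random_colour(rng)
    text_colour = random_colour(rng)
    # Re-draw until the text colour differs from the background,
    # otherwise the text would be invisible.
    while text_colour == background:
        text_colour = random_colour(rng)
    return {
        "font": rng.choice(FONTS),
        "size": rng.choice(FONT_SIZES),
        "background": background,
        "text_colour": text_colour,
    }


# A fixed seed keeps the dataset reproducible between runs.
rng = random.Random(0)
specs = [sample_spec(rng) for _ in range(5)]
```

Each specification would then be rendered once and stored in both YUV444 and YUV420 form, giving a matched training pair.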

Understanding the YUV colour model

Think of an image as a 3D matrix: height, width, and a third dimension that holds the colour channels. While RGB represents each pixel with three primary colours, the YUV model separates luminance (brightness, the Y channel) from chrominance (colour details, the U and V channels), optimizing how we store and transmit colour information in the encoding processes.
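The separation of brightness from colour can be illustrated with a small conversion routine. This is a sketch that assumes the full-range BT.601 weights (the ones used by JPEG/JFIF); the article does not state which conversion matrix the generator uses, and the function names are hypothetical.

```python
def rgb_to_yuv(r, g, b):
    """Convert one 8-bit RGB pixel to full-range YCbCr.

    Assumption: BT.601 full-range weights, chosen for illustration only.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luma: weighted brightness
    u = 128 + (b - y) / 1.772               # Cb: blue-difference chroma
    v = 128 + (r - y) / 1.402               # Cr: red-difference chroma

    def clamp(x):
        return max(0, min(255, round(x)))

    return clamp(y), clamp(u), clamp(v)


def image_to_planes(rgb_image):
    """Split a height x width image of RGB tuples into Y, U and V planes."""
    converted = [[rgb_to_yuv(*pixel) for pixel in row] for row in rgb_image]
    y_plane = [[p[0] for p in row] for row in converted]
    u_plane = [[p[1] for p in row] for row in converted]
    v_plane = [[p[2] for p in row] for row in converted]
    return y_plane, u_plane, v_plane


# Grey pixels carry no colour: both chroma values sit at the neutral 128.
white = rgb_to_yuv(255, 255, 255)
red = rgb_to_yuv(255, 0, 0)
```

Neutral pixels such as white, grey, and black all map to the same chroma value, so only the Y plane distinguishes them.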

The mechanics of chroma subsampling

The YUV format (a.k.a. YCbCr) lets us apply chroma subsampling, which stores chroma at a lower resolution than luma and relies on the human eye being more sensitive to brightness than to colour. This method is essential for efficient image compression and is used by most video encoders.

Typical chroma subsampling approaches are the following:

  • 4:4:4: There is no chroma subsampling

  • 4:2:2: U and V are sampled at half the sample rate of luma; the horizontal chroma resolution is halved

  • 4:2:0: U and V are each subsampled by a factor of two, both horizontally and vertically

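With three full-resolution planes, 4:4:4 stores three samples per pixel. YUV 4:2:0 stores only 1.5 samples per pixel (one luma sample plus two quarter-resolution chroma planes), so using it instead of 4:4:4 cuts the uncompressed frame size by half. For more information, refer to Wikipedia.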
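The size saving and the subsampling step itself can be checked with a short sketch. It assumes even plane dimensions and uses 2x2 block averaging as the downsampling filter, which is one common choice and not necessarily what a given encoder does; the function names are hypothetical.

```python
def subsample_420(plane):
    """Halve a chroma plane in both directions by averaging each 2x2 block.

    Assumes even width and height. The `+ 2` rounds to nearest
    before the integer division by 4.
    """
    height, width = len(plane), len(plane[0])
    return [
        [
            (plane[y][x] + plane[y][x + 1]
             + plane[y + 1][x] + plane[y + 1][x + 1] + 2) // 4
            for x in range(0, width, 2)
        ]
        for y in range(0, height, 2)
    ]


def samples_per_frame(width, height, scheme):
    """Count the stored samples (Y + U + V) for one uncompressed frame."""
    chroma_divisor = {"4:4:4": 1, "4:2:2": 2, "4:2:0": 4}[scheme]
    luma = width * height
    return luma + 2 * (luma // chroma_divisor)


small_u = subsample_420([[10, 20, 30, 40],
                         [10, 20, 30, 40]])   # a 2x4 plane becomes 1x2

full = samples_per_frame(1920, 1080, "4:4:4")
half = samples_per_frame(1920, 1080, "4:2:0")
ratio = half / full
```

For a 1920x1080 frame this gives 6,220,800 samples in 4:4:4 against 3,110,400 in 4:2:0, which is the halving described above; 4:2:2 lands in between at two thirds of the 4:4:4 size.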

Results: capabilities of the synthetic data generator

The tool generates text in multiple resolutions, colour formats, and fonts, creating thousands of unique images. As showcased during Innovation Days, these images vary in resolution, font attributes, and background colours, and are designed to train models robustly and mitigate bias.

Figure: Synthetic Data Generator Output for AI Training. Showcases a diverse dataset created for deep learning model training, with variations designed to ensure model robustness and mitigate bias.

Proof of concept: integrating into real-world scenarios

The adaptability of our dataset is demonstrated through a proof-of-concept that incorporates the generated images into desktop templates designed to represent a wide range of applications, such as office tasks, gaming, graphic design, and everyday computing. These templates mimic authentic digital environments by realistically modifying text within the generated images to achieve perspective and occlusion. Each template, which includes a variety of window configurations, operating system themes, and application interfaces, ensures that text is meticulously integrated within previously annotated bounding boxes, improving the realism of text placement and visual perception.
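The placement step can be pictured with a small geometric helper. This is a sketch covering only the simplest part of the integration, scaling a generated image to fit an annotated bounding box while keeping its aspect ratio; it does not attempt the perspective or occlusion handling mentioned above, and the function name and box layout (left, top, right, bottom) are assumptions.

```python
def fit_in_box(img_w, img_h, box):
    """Scale an image to fit inside a bounding box and centre it.

    `box` is (left, top, right, bottom) in template pixel coordinates.
    Returns (x, y, new_w, new_h): the paste position and scaled size.
    """
    left, top, right, bottom = box
    box_w, box_h = right - left, bottom - top
    # Use the smaller ratio so the image fits in both directions.
    scale = min(box_w / img_w, box_h / img_h)
    new_w, new_h = int(img_w * scale), int(img_h * scale)
    x = left + (box_w - new_w) // 2
    y = top + (box_h - new_h) // 2
    return x, y, new_w, new_h


# A 200x100 text image placed into a 100x100 annotated region.
placement = fit_in_box(200, 100, (10, 20, 110, 120))
```

Here the image is scaled by one half to 100x50 and centred vertically inside the region.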

Figure: Sample images of the proof of concept performed by integrating the synthetic images generated into annotated templates of typical desktop scenarios, demonstrating the potential of creating diverse digital environments.

Looking ahead: Innovations in AI development

In the near future, synthetically generated datasets will be essential for advancing our AI models that enhance YUV 4:2:0 to YUV 4:4:4 scaling. This AI-driven enhancement approach leverages the full-resolution Y channel to improve the scaling of U and V channels, aligning them more accurately and enhancing overall image quality. The model is designed for deployment at the endpoint (e.g., thin client, laptop, tablet) after decoding. By leveraging Fluendo’s AI engine for efficient edge inference, we ensure optimal performance across various environments, both within and outside the DaaS ecosystem, providing a seamless integration experience. Our method sets new benchmarks in desktop remoting, emphasizing sharper visuals, enhanced colour accuracy, and greater detail, reflecting the growing demand for higher image quality across various technologies.

We invite you to experience these improvements firsthand. Whether you're looking to reduce operational costs, improve user experience, or simply curious about the potential impact on your specific use case, we encourage you to contact us. It's the best way to see how Fluendo could benefit your organization. Our team is ready to assist you in exploring how this technology can be applied to your unique remote desktop environment.
