No more trade-offs on image resolution! Nyun Zero lets you build AI models at gigapixel scale.
Background
In the ever-evolving field of computer vision, deep learning models have established themselves as the cornerstone of advanced feature extraction, surpassing traditional algorithms. However, as technology pushes the boundaries of data acquisition, AI practitioners face a growing challenge: how to train deep learning models effectively on very large images. Large images are everywhere now, from medical imaging to remote sensing surveys. A solution is therefore needed that lets deep learning models process these large images seamlessly.
In this case study, we cover the following:
- Highlight the challenges of large images
- Introduce PatchGD
- Present a mathematical understanding of the topic
- Discuss PatchGD in Action
- Explain the integration in Nyun Adapt
The Challenge of Large Images
In various scientific domains such as microscopy, medical imaging, and earth sciences, recent technological advancements have made it possible to obtain very large images. These images can be as massive as 10,000 × 10,000 pixels or more. Handling such large images with traditional CNNs presents significant computational and memory constraints.
The prevailing CNN models are typically trained on low-resolution images, often less than 500 pixels per dimension. This limitation arises from popular benchmark datasets like ImageNet, which contain primarily low-resolution images. While these models excel on such datasets, applying them to high-resolution images results in a quadratic increase in activation size, demanding extensive computational resources and memory. Limited GPU memory makes it impractical to process such large images with CNNs, restricting the utility of these models for high-resolution tasks.
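As a rough back-of-the-envelope illustration of that quadratic growth, consider the activation map of a single convolutional layer with 64 output channels stored in float32. The layer width and data type here are illustrative assumptions, not a measurement of any particular architecture:

```python
# Back-of-the-envelope activation memory for ONE conv layer's output
# (64 channels, float32), for a single image. Real networks stack many
# such layers, so actual memory use is far higher than this estimate.

def activation_mb(height, width, channels=64, bytes_per_value=4):
    return height * width * channels * bytes_per_value / (1024 ** 2)

low_res = activation_mb(512, 512)          # ImageNet-scale input
high_res = activation_mb(10_000, 10_000)   # microscopy / remote-sensing scale

print(f"512 x 512:       {low_res:8,.0f} MB")
print(f"10,000 x 10,000: {high_res:8,.0f} MB  (~{high_res / low_res:.0f}x larger)")
```

For this toy setting, a single layer's activation grows from roughly 64 MB to roughly 24 GB, about a 380× increase, before accounting for the dozens of layers and the batch dimension of a real network.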
Introducing Patch Gradient Descent (PatchGD)
At Nyun AI, we have developed Patch Gradient Descent (PatchGD), a novel learning strategy designed to tackle the challenge of training deep learning models on large-scale images in an end-to-end manner. PatchGD is built on the hypothesis that, instead of computing model updates over an entire image at once, similar performance can be achieved by iteratively updating the model using only small parts of the image at a time. This approach optimizes memory and compute efficiency when dealing with large images, making it an ideal solution for resource-constrained environments.
PatchGD now in Nyun Adapt: We are excited to announce that PatchGD is now integrated into Nyun Adapt on the Nyun Zero platform and can be used directly to train deep learning models on large images.
PatchGD in Action
To demonstrate the effectiveness of PatchGD, we conducted experiments on The Cancer Genome Atlas (TCGA) dataset, which consists of high-resolution histopathology images used to benchmark deep learning methods for large images. We compared PatchGD against the conventional gradient descent method, and the results were compelling. PatchGD outperformed gradient descent, particularly under memory constraints. Even when limited to a GPU with only 4 GB of memory, PatchGD remained stable and showcased its adaptability to varying image dimensions and memory constraints.
Use PatchGD in Nyun Adapt
We have now integrated PatchGD into our Nyun Zero platform. Nyun Zero is an efficiency-focused deep learning platform designed to streamline the entire development-to-deployment cycle of deep learning models. It allows users to train AI models efficiently and optimize them to be lean and fast while keeping accuracy intact. Users have the option to connect their own VPC (Virtual Private Cloud), keeping their data private and secure, and they can export their models with ease for different hardware targets. Nyun AI hosts two applications: Nyun Adapt, for training models on your data, and Nyun Kompress, for optimizing those models.
Nyun Adapt is a robust adaptation module that effortlessly fine-tunes and performs transfer learning on a diverse range of deep learning tasks and models. Adapt empowers users to adapt large and even medium-sized vision and language models seamlessly, enabling the creation of custom models with just a few clicks. This module incorporates state-of-the-art adaptation methods such as (Q,G)-LoRA, SSF, and PatchGD, allowing users to strike the optimal balance across various model metrics, including parameter count and training throughput.
To perform PatchGD training in Nyun Adapt, follow these simple steps:
- Log in to our platform HERE
- Connect your own virtual private cloud (VPC) or use our Nyun cloud.
- Choose Nyun Adapt and create a new task.
- Upload and select your dataset of choice.
- Upload and select the model backbone to be used.
- Select PatchGD as the algorithm and start the process.
Example results
With PatchGD implemented in Nyun Zero, we have been able to achieve remarkable results across a variety of datasets and memory constraints. Figure 1 shows a comparison of PatchGD with the conventional approach to deep learning training. While the performance gap is smaller at a 16 GB memory constraint, PatchGD delivers significantly superior results under a tighter 4 GB memory budget.
Figure 1. Our PatchGD algorithm in Nyun Zero outperforms standard training by significant margins when trained under low memory constraints.
With PatchGD implemented, Nyun Adapt gains an advantage in terms of required memory and can deliver better accuracy and quadratic weighted kappa (QWK) scores, as highlighted by the results shown in Table 1. We observed that as the image size grows, the performance gain obtained by PatchGD increases significantly, clearly highlighting the strong capability of Nyun Zero for large images with the addition of PatchGD support. It does not end here: there is no hard limit on what can be achieved, and the Nyun Zero platform can even work with images at the gigapixel level.
Table 1. Performance comparison between PatchGD and conventional gradient descent methods in Nyun Zero for ResNet50 architecture.
A few other approaches exist for training deep learning models on large images; however, we have observed that with PatchGD incorporated, Nyun Adapt surpasses all of them by remarkable margins when trained on large images, as shown in the results presented in Table 2.
Table 2. Comparison of results from Nyun Zero (PatchGD) against other SOTA methods for training on 4096×4096 resolution images of the PANDA dataset under a 48 GB GPU memory constraint with a ResNet50 backbone.
Mathematical Formulation of PatchGD
Let's delve into the technical details of PatchGD. In the context of CNN-based classification, we start with a CNN model parameterized by \( \boldsymbol{\theta} \) that takes an input image \( \mathbf{X} \) and computes the probability of it belonging to predefined classes. The objective is to minimize the loss function \( \mathcal{L} \).
Traditional mini-batch gradient descent updates the model using gradients computed over a batch of samples. However, this approach faces limitations when dealing with large images due to memory constraints.
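For reference, the standard mini-batch update that runs into this memory wall can be written as follows. This is the generic textbook formulation using the symbols above, rather than anything specific to our implementation:

\[
\boldsymbol{\theta} \leftarrow \boldsymbol{\theta} \;-\; \eta \cdot \frac{1}{|\mathcal{B}|} \sum_{i \in \mathcal{B}} \nabla_{\boldsymbol{\theta}} \, \mathcal{L}\big(f_{\boldsymbol{\theta}}(\mathbf{X}_i),\, y_i\big),
\]

where \( \mathcal{B} \) is a mini-batch of images \( \mathbf{X}_i \) with labels \( y_i \), \( f_{\boldsymbol{\theta}} \) is the model, and \( \eta \) is the learning rate. Computing \( \nabla_{\boldsymbol{\theta}} \, \mathcal{L} \) requires holding the full activation graph of every \( \mathbf{X}_i \) in memory, which is exactly what becomes infeasible for very large images.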
PatchGD, on the other hand, avoids computing gradients over the entire image at once and instead updates the model using only parts of the image at a time. It introduces inner iterations and patch sampling to achieve this, and it incorporates an additional sub-network to further enhance performance.
How PatchGD Works
Here's a step-by-step breakdown of how PatchGD works (a simplified code sketch follows the list):
- Model Initialization: PatchGD starts by initializing a latent encoding \( \mathbf{Z} \) for each input image. \( \mathbf{Z} \) is filled using the outputs of the base CNN model \( f_{\boldsymbol{\theta}_1} \) computed on image patches.
- Model Update Iterations: PatchGD performs a series of inner iterations, each involving the update of \( k \) patches per image in the input batch \( \mathbf{X} \). These patches are used to update \( \mathbf{Z} \), and the model parameters are updated accordingly.
- Additional Sub-network: PatchGD incorporates an additional sub-network \( g_{\boldsymbol{\theta}_2} \), extending the parameter set \( \boldsymbol{\theta} \). This sub-network processes the partly updated \( \mathbf{Z} \) to compute class probabilities and loss.
- Gradient Accumulation: To control the frequency of model updates during inner iterations and mitigate convergence issues, PatchGD accumulates gradients over \( \epsilon \) inner steps before updating the model.
- Inference Phase: During inference, PatchGD fills \( \mathbf{Z} \) using the optimized base model \( f_{\boldsymbol{\theta}_1^*} \) and computes class probabilities using \( g_{\boldsymbol{\theta}_2^*} \).
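To make these steps concrete, here is a simplified, self-contained PyTorch-style sketch of a single PatchGD training step. It is a minimal illustration under several assumptions: the tiny base_model and head, the patch size, the number of sampled patches per inner step, the accumulation interval, and the random patch sampling are all placeholders, and the inference phase is omitted. It is not the implementation that ships in Nyun Adapt.

```python
import torch

# --- illustrative placeholders (not the Nyun Adapt internals) ---
patch_size, k_patches, accum_steps = 256, 4, 2
base_model = torch.nn.Sequential(              # f_theta1: one patch -> 8-dim feature
    torch.nn.Conv2d(3, 8, kernel_size=3, stride=4, padding=1),
    torch.nn.AdaptiveAvgPool2d(1),
    torch.nn.Flatten(),
)
head = torch.nn.Linear(8 * 16, 10)             # g_theta2: flattened Z (4x4 grid) -> 10 classes
optimizer = torch.optim.SGD(
    list(base_model.parameters()) + list(head.parameters()), lr=1e-3
)
criterion = torch.nn.CrossEntropyLoss()

def patches(image):
    """Yield (row, col, patch) tiles of a (C, H, W) image."""
    _, h, w = image.shape
    for i in range(h // patch_size):
        for j in range(w // patch_size):
            yield i, j, image[:, i * patch_size:(i + 1) * patch_size,
                                 j * patch_size:(j + 1) * patch_size]

def patchgd_step(images, labels):
    """One PatchGD training step on a batch of large images of shape (B, C, H, W)."""
    grid_h, grid_w = images.shape[2] // patch_size, images.shape[3] // patch_size
    # Step 1: fill Z for every patch without tracking gradients.
    with torch.no_grad():
        Z = torch.zeros(images.shape[0], 8, grid_h, grid_w)
        for b, image in enumerate(images):
            for i, j, patch in patches(image):
                Z[b, :, i, j] = base_model(patch.unsqueeze(0))[0]
    # Step 2: inner iterations, each refreshing k randomly sampled patches per image.
    optimizer.zero_grad()
    for step in range(grid_h * grid_w // k_patches):
        Z = Z.detach()                         # only this step's patches carry gradients
        for b, image in enumerate(images):
            for idx in torch.randperm(grid_h * grid_w)[:k_patches]:
                i, j = divmod(int(idx), grid_w)
                patch = image[:, i * patch_size:(i + 1) * patch_size,
                                 j * patch_size:(j + 1) * patch_size]
                Z[b, :, i, j] = base_model(patch.unsqueeze(0))[0]
        # Step 3: the sub-network scores the partly refreshed Z.
        loss = criterion(head(Z.flatten(1)), labels) / accum_steps
        loss.backward()                        # accumulate gradients across inner steps
        # Step 4: update the parameters only every `accum_steps` inner iterations.
        if (step + 1) % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()

# Toy usage: two 1024 x 1024 "large" images, 10 classes (sizes are illustrative).
patchgd_step(torch.randn(2, 3, 1024, 1024), torch.tensor([1, 7]))
```

The key property of the sketch is that only the k patches refreshed in a given inner iteration are kept in the autograd graph, so peak memory scales with the patch size rather than the full image size, while \( \mathbf{Z} \) retains an (approximate) encoding of the whole image for the classification head.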
Conclusion
Patch Gradient Descent (PatchGD) offers a promising solution to the challenge of training deep learning models on large images, even in resource-constrained environments. By updating models iteratively on small parts of the image, PatchGD optimizes memory and compute efficiency, making it a valuable tool for researchers and practitioners in fields where large images are a common occurrence.
With PatchGD included in Nyun Adapt, we believe Nyun Zero can be a tool of significant value for domains such as histopathology, satellite imagery, and geophysics, among others. As technology continues to push the boundaries of image resolution, our platform empowers the deep learning community to leverage the potential of high-resolution images without the limitations of computational and memory constraints. It's a significant step forward in the quest for more accurate and efficient deep learning models in the realm of computer vision.