What Is RBI (Remote Browser Isolation) - Overview and Implementation

1. RBI Overview

  • RBI (Remote Browser Isolation) is a security technology that runs the browser on a server or in another isolated remote environment instead of in the user's local environment, and delivers only the rendered screen to the user.
    • Web content is loaded and executed in the server-side isolated environment, and users receive only the visual output.
    • User input events such as keystrokes and mouse actions are sent to the server to control the remote browser.
RBI Overview image

1.1. Limits of Web Security

  • Today, most business activities are centered around web browsers.
    • Web browsers have become both the most commonly used work environment and the primary entry point attackers target first.
  • Traditional web security frameworks mostly rely on website blocking, URL categorization, and detection-stage filtering.
    • These methods are effective against known threats but have limitations against new phishing attacks or uncategorized malware.
  • Also, it is practically impossible to block all web access in advance.
    • As a result, conventional web security has a structural limitation: detection and blocking alone cannot stop every web threat.

1.2. Why RBI Is Needed

  • Remote Browser Isolation (RBI) is a security concept that shifts the approach in response to these limitations.
    • It runs web pages in an isolated server environment, not on the user's endpoint.
    • Users receive only the execution result, while malicious code or exploit attempts embedded in web content are blocked inside the isolated environment.
  • RBI minimizes security risk by allowing web access through isolation rather than blanket blocking.
  • Users can browse the web in a familiar way, while security is structurally strengthened by separating the execution environment.
    • For this reason, RBI is regarded as a practical alternative that satisfies both security and usability in modern web environments.

2. RBI Implementation Technologies 

2.1. Chrome / Chromium

Chrome / Chromium image
  • Chrome or Chromium is the core engine that renders actual web pages in RBI environments.
  • Headless mode runs the browser without a physical or virtual display. Rendering still takes place, but the result is not shown on any screen that users can see.
  • Headless mode also does not natively provide real-time screen streaming or input relay for interactive user control.
  • Therefore, in RBI environments, it is typically used together with additional display and streaming technologies rather than by itself.
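
To make the headless limitation concrete, here is a minimal sketch that builds a command line asking headless Chromium for one static screenshot. The binary name `chromium`, the URL, and the output path are assumptions for illustration. `--headless`, `--window-size`, and `--screenshot` are standard Chromium switches, though their behavior can differ between versions.

```python
import shlex


def headless_screenshot_cmd(url: str, out_png: str,
                            width: int = 1280, height: int = 720,
                            binary: str = "chromium") -> list[str]:
    """Build an argv list that renders `url` once and saves a PNG.

    `binary` is an assumption: the executable may be called
    chromium-browser, google-chrome, etc. depending on the system.
    """
    return [
        binary,
        "--headless",                       # run without any display
        f"--window-size={width},{height}",  # size of the off-screen viewport
        f"--screenshot={out_png}",          # write a single frame, then exit
        url,
    ]


cmd = headless_screenshot_cmd("https://example.com", "/tmp/page.png")
print(shlex.join(cmd))
```

The output is a single image file rather than an interactive stream, which is the gap the display and streaming components fill.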

2.2. Xvfb

Xvfb image
  • Xvfb (X Virtual Framebuffer) is a virtual X11 server that provides a graphical environment without an actual physical display.
  • A browser running in standard GUI mode needs a display to draw on, but server environments usually have no such display.
  • Xvfb solves this problem by creating a virtual screen in memory.
  • The browser recognizes this virtual screen like a real monitor and performs rendering normally.
  • This makes it possible to run browsers in standard GUI mode, not just headless mode, even in server environments.
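
As a rough sketch of how this might look in practice, the helpers below build an Xvfb command line and an environment in which a GUI browser would draw onto the virtual screen. The display number `:99` and the 1280x720x24 geometry are arbitrary assumptions. Nothing is launched; the functions only return the values.

```python
import os


def xvfb_cmd(display: int = 99, width: int = 1280, height: int = 720,
             depth: int = 24) -> list[str]:
    """argv for an in-memory X server on display :<display>."""
    return [
        "Xvfb", f":{display}",
        "-screen", "0", f"{width}x{height}x{depth}",  # geometry of screen 0
        "-nolisten", "tcp",                           # local X clients only
    ]


def browser_env(display: int = 99) -> dict[str, str]:
    """Copy of the current environment with DISPLAY pointing at Xvfb."""
    env = dict(os.environ)
    env["DISPLAY"] = f":{display}"
    return env


print(" ".join(xvfb_cmd()))
print(browser_env()["DISPLAY"])

# To start the processes for real (not executed in this sketch):
#   subprocess.Popen(xvfb_cmd())
#   subprocess.Popen(["chromium", "https://example.com"], env=browser_env())
```

Because the browser only sees the `DISPLAY` variable, it needs no special mode: from its point of view the in-memory framebuffer is an ordinary monitor.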

2.3. x11vnc

x11vnc image
  • x11vnc captures the screen of an X server (including Xvfb) and delivers it externally through the VNC (Virtual Network Computing) protocol.
  • In other words, it reads the browser screen running on Xvfb in real time and converts it into a format that can be transmitted over the network.
  • It can also relay keyboard and mouse input through the VNC protocol, enabling user interaction.
  • This approach is relatively simple to implement, but it has limitations in transport efficiency and latency.
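
A hedged sketch of the capture step: the function below assembles an x11vnc command line that attaches to the virtual display and serves it on a VNC port. The display number and port are assumptions. The options are standard x11vnc flags, and the command is only built here, not executed.

```python
def x11vnc_cmd(display: int = 99, rfb_port: int = 5900) -> list[str]:
    """argv that exports X display :<display> over VNC on `rfb_port`."""
    return [
        "x11vnc",
        "-display", f":{display}",  # X server to capture (the Xvfb screen)
        "-rfbport", str(rfb_port),  # TCP port for VNC clients
        "-localhost",               # accept connections from this host only
        "-forever",                 # keep serving after a viewer disconnects
        "-shared",                  # allow more than one viewer at a time
    ]


print(" ".join(x11vnc_cmd()))
```

Binding to localhost assumes that a separate front end (for example a WebSocket proxy on the same host) is the only thing allowed to reach the VNC port.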

2.4. noVNC

noVNC image
  • noVNC is a client technology that enables use of the VNC protocol directly in a web browser.
  • Traditional VNC requires a dedicated client program, but noVNC speaks the VNC protocol over WebSocket, usually through a WebSocket-to-TCP proxy such as websockify, so the session can be opened from standard browsers.
  • Users can view and control the remote browser screen simply by opening a web page, without installing additional software.
  • This greatly improves RBI accessibility, but the VNC-based architecture still has constraints in performance and scalability.
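
The following sketch shows one way the browser-facing side could be wired up, assuming websockify is used as the WebSocket-to-TCP bridge. The install path `/usr/share/novnc`, the ports, and the host name are assumptions. `vnc.html` is the page shipped with noVNC, and `autoconnect` and `resize` are query parameters it accepts.

```python
from urllib.parse import urlencode


def websockify_cmd(listen_port: int = 6080,
                   vnc_host: str = "localhost", vnc_port: int = 5900,
                   novnc_dir: str = "/usr/share/novnc") -> list[str]:
    """argv that bridges WebSocket clients to a VNC server."""
    return [
        "websockify",
        "--web", novnc_dir,        # also serve noVNC's static files
        str(listen_port),          # port the user's browser connects to
        f"{vnc_host}:{vnc_port}",  # VNC server that traffic is forwarded to
    ]


def novnc_url(host: str, port: int = 6080) -> str:
    """URL a user would open to view and control the remote browser."""
    query = urlencode({"autoconnect": "true", "resize": "scale"})
    return f"http://{host}:{port}/vnc.html?{query}"


print(" ".join(websockify_cmd()))
print(novnc_url("rbi.example.internal"))
```

The sketch uses plain HTTP for brevity. A real deployment would normally put TLS and authentication in front of this endpoint.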

2.5. WebRTC 

WebRTC image
  • WebRTC is a standard technology for real-time communication, characterized by low latency and high transmission efficiency.
  • In RBI environments, it is used to encode the browser screen into a video stream and deliver it to users, replacing traditional VNC approaches.
  • It can also deliver user input events through a separate data channel, enabling more natural interaction.
  • WebRTC provides high performance, but requires additional components such as signaling servers and NAT traversal (STUN/TURN), making implementation relatively complex.
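
As an illustration of the input path, here is a minimal sketch of how input events might be serialized before being sent over a data channel and validated on the server before being replayed into the remote browser. The message shape (`kind`, `x`, `y`, `button`, `key`) and the list of allowed kinds are hypothetical and are not part of WebRTC itself.

```python
import json
from dataclasses import asdict, dataclass

ALLOWED_KINDS = {"mousemove", "mousedown", "mouseup", "keydown", "keyup"}


@dataclass(frozen=True)
class InputEvent:
    kind: str        # one of ALLOWED_KINDS
    x: int = 0       # pointer position in remote-screen pixels
    y: int = 0
    button: int = 0  # mouse button index, if any
    key: str = ""    # key name, if any


def encode_event(event: InputEvent) -> str:
    """Client side: turn an event into a compact JSON string."""
    return json.dumps(asdict(event), separators=(",", ":"))


def decode_event(payload: str) -> InputEvent:
    """Server side: parse and validate before acting on the event."""
    data = json.loads(payload)
    if data.get("kind") not in ALLOWED_KINDS:
        raise ValueError(f"unsupported event kind: {data.get('kind')!r}")
    return InputEvent(**data)


wire = encode_event(InputEvent("keydown", key="a"))
print(wire)
print(decode_event(wire))
```

Validating on the server matters because the message arrives from the user's device, which the isolated environment should not trust blindly.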

 

3. RBI Implementation Methods

  • Remote Browser Isolation is a security architecture that does not execute web content directly in the local environment. Instead, it runs the browser in an isolated remote environment and delivers only the result to the user.
  • An RBI system is divided into the user domain and an isolated execution environment, and the actual execution of web pages occurs on the server side.
  • This RBI structure is implemented not as a single technology, but as a combination of multiple components.

3.1. Combining Implementation Technologies

  • Browser
    • In RBI environments, a browser engine is required to process web pages in practice.
    • Chrome or Chromium is typically used, and this browser interprets HTML, CSS, and JavaScript and renders them to the screen.

  • Virtual Display
    • Because there is no physical display in server environments, a separate virtual display environment must be configured so the browser can output a screen.
    • Virtual display technology such as Xvfb is used for this purpose. Xvfb creates a virtual screen in memory, allowing the browser to recognize it like a real display and render accordingly.

  • Streaming
    • Because the screen rendered on the virtual display cannot be delivered to users as-is, it must be captured and converted into a transferable form.
    • This role is handled by streaming technologies such as x11vnc or WebRTC.
    • x11vnc reads the X server screen and sends it through the VNC protocol, which makes implementation straightforward. However, it has certain limitations in latency and performance.
    • By contrast, WebRTC encodes the screen as a video stream and transmits it, providing lower latency and higher transfer efficiency.

  • Client
    • On the user side, a client is required to display the screen delivered by the server and send keyboard and mouse input events back to the server.
    • noVNC displays VNC-based screen streams directly in a web browser, and when WebRTC is used, the web browser itself acts as the client.


※ Each of these components can work independently, but in an RBI system they are connected into a single pipeline that handles the execution and delivery of web content.

Combined implementation technologies image

3.2. Implementation Approaches

  • An RBI system delivers the screen of a browser running in a remote environment to users, and implementation approaches are classified by the screen delivery method.
  • Representative approaches are the VNC-based approach and the WebRTC-based approach.
  • Both approaches have the same objective, but differ in screen transport method and performance characteristics.

3.2.1. VNC Approach

VNC approach image
  • The VNC-based approach is the most intuitive and is a traditional structure often used when implementing RBI for the first time.
  • In this approach, the browser-rendered screen is captured as-is and delivered to users through the VNC protocol.
Chrome (GUI)
   ↕
Xvfb (Virtual Display)
   ↕
x11vnc (Capture)
   ↕
noVNC (Web Client)
   ↕
User Browser
  • The VNC-based approach has the advantage of simple architecture and easy initial setup.
  • On the other hand, since it transmits pixel-level screen data, it has limitations in network efficiency and latency.
  • Therefore, this approach is mainly used in test environments or PoC stages rather than full-scale production services.

3.2.2. WebRTC Approach

WebRTC approach image
  • The WebRTC-based approach emerged to overcome VNC limitations and is mainly used in modern RBI systems.
  • In this approach, the screen is transmitted not as a simple image but encoded as a video stream.
Chrome
   ↕
Virtual Display (Xvfb)
   ↕
Screen Capture
   ↕
Video Encoder (H.264 / VP8)
   ↕
WebRTC Peer Connection
   ↕
User Browser
  • The WebRTC-based approach encodes the browser screen running in the remote environment into a video stream and delivers it to users.
  • Because it provides low latency and high transfer efficiency, it enables stable service operations even in large-scale user environments.
  • However, to provide this performance, it requires more components and more complex processing than VNC-based approaches.
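
The capture and encoding stages in the diagram could be prototyped with a general-purpose tool. The sketch below assumes ffmpeg, which this document does not otherwise name, and builds a command that grabs the virtual display and encodes it as low-latency H.264. The display number, frame rate, and RTP target are assumptions, and a separate WebRTC media component (not shown) would still be needed to carry the stream over a peer connection.

```python
def capture_encode_cmd(display: int = 99, width: int = 1280, height: int = 720,
                       fps: int = 30,
                       rtp_target: str = "rtp://127.0.0.1:5004") -> list[str]:
    """argv that captures an X11 display and encodes it as H.264 over RTP."""
    return [
        "ffmpeg",
        "-f", "x11grab",                     # input: capture an X11 display
        "-video_size", f"{width}x{height}",  # capture area
        "-framerate", str(fps),              # capture rate
        "-i", f":{display}",                 # which display to capture
        "-c:v", "libx264",                   # H.264 software encoder
        "-preset", "ultrafast",              # favor speed over compression
        "-tune", "zerolatency",              # avoid encoder-side buffering
        "-pix_fmt", "yuv420p",               # widely decodable pixel format
        "-f", "rtp", rtp_target,             # output: RTP packets
    ]


print(" ".join(capture_encode_cmd()))
```

The encoder settings trade compression efficiency for latency, which suits an interactive session where each input should be reflected on screen quickly.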
WebRTC approach image

Signaling Server
  • The signaling server is an essential component at the initial stage of establishing a WebRTC connection.
  • Because WebRTC itself does not define how peers find each other or exchange session setup information, a separate signaling server is required to pass session descriptions (SDP offers and answers) and ICE candidates between the two sides.
  • The main roles of the signaling server are as follows.
    • Exchanging connection information between client and server
    • Managing session creation and termination
    • Supporting initial WebRTC connection setup
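
The three roles above can be sketched as a small in-memory mailbox. This is a toy model under stated assumptions: the class name `SignalingHub`, the `client` and `server` role labels, and the message dictionaries are all hypothetical, and a real signaling server would expose this over WebSocket or HTTP with authentication.

```python
import uuid
from collections import defaultdict, deque


class SignalingHub:
    """Relays setup messages between the two peers of a session."""

    ROLES = ("client", "server")

    def __init__(self) -> None:
        self._sessions: set[str] = set()
        self._inbox = defaultdict(deque)  # (session_id, role) -> messages

    def create_session(self) -> str:
        session_id = uuid.uuid4().hex
        self._sessions.add(session_id)
        return session_id

    def send(self, session_id: str, sender: str, message: dict) -> None:
        """Queue `message` for the peer opposite to `sender`."""
        if session_id not in self._sessions:
            raise KeyError("unknown session")
        if sender not in self.ROLES:
            raise ValueError("sender must be 'client' or 'server'")
        receiver = "server" if sender == "client" else "client"
        self._inbox[(session_id, receiver)].append(message)

    def receive(self, session_id: str, role: str) -> list[dict]:
        """Return and clear all messages waiting for `role`."""
        queue = self._inbox[(session_id, role)]
        messages = list(queue)
        queue.clear()
        return messages

    def close_session(self, session_id: str) -> None:
        self._sessions.discard(session_id)
        for role in self.ROLES:
            self._inbox.pop((session_id, role), None)


hub = SignalingHub()
sid = hub.create_session()
hub.send(sid, "client", {"type": "offer", "sdp": "<client SDP>"})
print(hub.receive(sid, "server"))
```

Once the offer, answer, and candidates have been exchanged, media flows over the peer connection itself and the signaling server is no longer on the data path.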

STUN / TURN Server

  • The most complex element in WebRTC implementation is handling network environments.
  • Most users and servers are behind NAT (Network Address Translation), which often makes it difficult to secure direct communication paths.
  • To address this, WebRTC uses STUN and TURN servers.
    • STUN Server: Session Traversal Utilities for NAT
      • A protocol that lets a client behind NAT discover the public IP address and port that the NAT has assigned to it.
    • TURN Server: Traversal Using Relays around NAT
      • A protocol that relays media and data through an intermediate server when a direct peer-to-peer path cannot be established.

  • These servers are core infrastructure for ensuring WebRTC connection stability.
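
To show what "discovering the public IP address and port" means on the wire, the sketch below builds a STUN Binding Request and decodes an IPv4 XOR-MAPPED-ADDRESS attribute value, following the STUN message layout (20-byte header, magic cookie `0x2112A442`). It handles only this one attribute, does no network I/O, and uses a documentation-range address as sample data.

```python
import os
import socket
import struct
from typing import Optional

MAGIC_COOKIE = 0x2112A442
BINDING_REQUEST = 0x0001


def build_binding_request(transaction_id: Optional[bytes] = None) -> bytes:
    """20-byte STUN header: type, length, magic cookie, transaction ID."""
    tid = transaction_id if transaction_id is not None else os.urandom(12)
    if len(tid) != 12:
        raise ValueError("transaction ID must be 12 bytes")
    # Message length is 0 because this request carries no attributes.
    return struct.pack("!HHI", BINDING_REQUEST, 0, MAGIC_COOKIE) + tid


def parse_xor_mapped_address(attr_value: bytes) -> tuple[str, int]:
    """Decode the value of an IPv4 XOR-MAPPED-ADDRESS attribute."""
    _reserved, family, xport = struct.unpack("!BBH", attr_value[:4])
    if family != 0x01:
        raise ValueError("only IPv4 is handled in this sketch")
    # The port is XORed with the top 16 bits of the magic cookie,
    # the address with the whole cookie.
    port = xport ^ (MAGIC_COOKIE >> 16)
    (xaddr,) = struct.unpack("!I", attr_value[4:8])
    addr = socket.inet_ntoa(struct.pack("!I", xaddr ^ MAGIC_COOKIE))
    return addr, port


print(build_binding_request(b"\x00" * 12).hex())
print(parse_xor_mapped_address(bytes.fromhex("0001a147e112a643")))
# prints ('192.0.2.1', 32853)
```

In a browser, none of this is written by hand: the application only lists STUN and TURN server URLs in the peer connection configuration, and the WebRTC stack performs the exchange.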

Stream Encoding

  • To deliver screens through WebRTC, simple screen capture alone is not sufficient.
  • Because a browser-rendered screen is essentially a sequence of uncompressed images, sending it as-is consumes very large bandwidth.
  • Therefore, WebRTC-based approaches transmit screens by encoding them as video streams.
  • This process requires the following steps.
    • Screen capture
    • Encoding using video codecs
    • Real-time streaming transmission
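
A quick back-of-the-envelope calculation shows why encoding is required. The resolution, frame rate, and 3 bytes per pixel below are assumptions, and the 2.5 Mbit/s encoded bitrate is an illustrative figure, not a measurement.

```python
def raw_bitrate_bps(width: int, height: int, fps: int,
                    bytes_per_pixel: int = 3) -> int:
    """Bits per second needed to send every frame uncompressed."""
    return width * height * bytes_per_pixel * fps * 8


raw = raw_bitrate_bps(1280, 720, 30)
encoded = 2_500_000  # assumed video bitrate in bit/s, for comparison only

print(f"raw:   {raw / 1e6:.1f} Mbit/s")  # prints "raw:   663.6 Mbit/s"
print(f"ratio: {raw / encoded:.0f}x")    # prints "ratio: 265x"
```

Under these assumptions, uncompressed 720p at 30 frames per second needs several hundred megabits per second, which is why a video codec sits between capture and transmission.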

※ As a result, WebRTC-based approaches require higher implementation complexity and system resources than VNC-based approaches.

3.3. Comparison of Implementation Approaches

| Item | VNC Approach | WebRTC Approach |
|---|---|---|
| Implementation Difficulty | Low | High |
| Performance | Low | High |
| Latency | High | Very Low |
| Scalability | Limited | Excellent |
| Usage Purpose | Testing / MVP | Production Service |

※ The VNC-based approach has the advantage of rapid implementation, but it is limited in performance and user experience. The WebRTC-based approach provides high performance, but implementation is more complex.

4. Comparison by RBI Service Provider

4.1. Products by Provider

Menlo Security - Isolation Platform

  • A specialized RBI company with high technical maturity and strong market share in high-security environments.
  • It applies DOM-based isolation technology, balancing security and user experience.

Cloudflare - Browser Isolation

  • Provides browser isolation as part of a Zero Trust-based security platform.
  • Its integration with cloud infrastructure makes adoption easy, and it is showing rapid growth.

Zscaler - Cloud Browser Isolation

  • As one of the leaders in SSE (Security Service Edge) and SASE platforms, it holds high market share among enterprise customers.
  • Within its integrated SASE/SSE platform, RBI is provided as a core security capability.

Ericom Software - Shield

  • A technology-focused RBI provider, strong in cost efficiency and OEM/MSP scalability.
  • Offers a flexible architecture that supports both on-premises and cloud environments.

Palo Alto Networks - Prisma Access Browser Isolation

  • Expands RBI capabilities by leveraging its existing security appliance and platform customer base.
  • Secures market share through an approach centered on its integrated security platform.

Cisco - Secure Access (Browser Isolation)

  • Extends RBI capabilities to its existing base of network equipment and security-solution customers.
  • Provides browser isolation integrated with network access control.

Authentic8 - Silo

  • Its share in the general enterprise market is limited, but it is strong in specialized security sectors such as government, military, and intelligence.
  • Provides a high security level based on a fully isolated browser model.

4.2. Feature Comparison by Provider

| Item | Menlo | Cloudflare | Zscaler | Ericom | Palo Alto | Cisco | Authentic8 |
|---|---|---|---|---|---|---|---|
| Pixel Isolation | O | O | O | O | O | O | O |
| DOM Reconstruction | O | (Limited) | (Limited) | O | (Limited) | X | X |
| Hybrid Isolation | O (Automatic) | (Policy) | (Policy) | O (Automatic) | (Policy) | (Policy) | X |
| Streaming Method | WebRTC | WebRTC | WebRTC | WebRTC | WebRTC | WebRTC | VNC (Custom) |
| File Download Control | O | O | O | O | O | O | O |
| Clipboard Control | O | O | O | O | O | O | O |
| Keystroke Protection | O (Isolated) | (Basic) | (Basic) | O (Isolated) | (Basic) | (Basic) | O (Isolated) |
| Session Isolation | O | O | O | O | O | O | O |
| SaaS Integration | O (Granular) | O (Platform) | O (Platform) | O (Granular) | O (Platform) | O (Platform) | (Basic) |
| Zero Trust Integration | O | O | O | O | O | O | O |

4.3. Feature Comparison Items

Pixel Isolation

  • This approach delivers only the resulting screen of web pages executed in a remote environment.
  • Web content is processed on the server, and the result is converted into pixel-level images for transmission.
  • As a result, web code is not delivered to endpoints, which provides high security, but it can be affected by network latency and image quality.

DOM Reconstruction

  • This approach sends structural information of the web page to the client and composes the screen in the local browser.
  • It offers better performance and user experience than pixel-based transmission, but security control is important because data is delivered to the client.

Hybrid Isolation

  • This approach selectively applies Pixel Isolation and DOM Reconstruction depending on context.
  • By applying DOM to trusted sites and Pixel to risky sites, it maintains a balance between security and performance.
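
The selection rule can be sketched as a small policy function. The category names and the trusted set are hypothetical, and commercial products decide this with their own risk signals. The one design choice worth noting is that anything not explicitly trusted falls back to pixel isolation.

```python
from enum import Enum


class RenderMode(Enum):
    DOM = "dom_reconstruction"
    PIXEL = "pixel_isolation"


# Hypothetical site categories that policy treats as trusted.
TRUSTED_CATEGORIES = frozenset({"internal", "corporate_saas"})


def select_mode(category: str,
                trusted: frozenset = TRUSTED_CATEGORIES) -> RenderMode:
    """Use DOM reconstruction for trusted sites, pixel isolation otherwise."""
    return RenderMode.DOM if category in trusted else RenderMode.PIXEL


for cat in ("internal", "uncategorized", "newly_registered_domain"):
    print(cat, "->", select_mode(cat).value)
```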

Streaming Method

  • WebRTC-based streaming delivers remote screens in real time, offering low latency and high efficiency.
  • Through this, users can get a natural experience in a web browser without additional plugins.

File Download Control

  • This feature controls file downloads over the web and can apply file sanitization (CDR, Content Disarm and Reconstruction) when needed.
  • This effectively blocks the inflow of malicious files.

Clipboard Control

  • This feature controls copy and paste behavior to prevent data leakage.
  • By restricting copy or paste direction based on policy, it can reduce external exposure of sensitive information.
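
A direction-based clipboard policy might look like the sketch below. The field names, the `copy_out` and `paste_in` direction labels, and the size limit are assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ClipboardPolicy:
    allow_copy_out: bool     # remote browser -> user's device
    allow_paste_in: bool     # user's device -> remote browser
    max_chars: int = 10_000  # upper bound on a single transfer


def is_allowed(policy: ClipboardPolicy, direction: str, text: str) -> bool:
    """Decide whether one clipboard transfer may proceed."""
    if len(text) > policy.max_chars:
        return False
    if direction == "copy_out":
        return policy.allow_copy_out
    if direction == "paste_in":
        return policy.allow_paste_in
    raise ValueError(f"unknown direction: {direction!r}")


# Example: users may paste into the remote page but not copy data out of it.
paste_only = ClipboardPolicy(allow_copy_out=False, allow_paste_in=True)
print(is_allowed(paste_only, "paste_in", "hello"))       # prints True
print(is_allowed(paste_only, "copy_out", "secret data"))  # prints False
```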

Keystroke Protection

  • This feature protects user input from exposure and theft.
  • By applying encryption or server-side processing to sensitive input data, it helps defend against keylogging attacks.

Session Isolation 

  • This structure provides an isolated execution environment per user, blocking cross-session influence.
  • This helps maintain stable security levels even in multi-user environments.
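
One simple way to picture per-session isolation is to give every session its own virtual display and its own browser profile directory. The allocator below is a sketch with hypothetical names and paths. `--user-data-dir` is a standard Chromium switch, and production systems typically add stronger boundaries such as containers or virtual machines.

```python
import itertools
from dataclasses import dataclass


@dataclass(frozen=True)
class BrowserSession:
    session_id: int
    display: str      # X display used by this session only, e.g. ":100"
    profile_dir: str  # browser profile used by this session only


class SessionAllocator:
    """Hands out a distinct display and profile directory per session."""

    def __init__(self, first_display: int = 100,
                 profile_root: str = "/var/lib/rbi/profiles") -> None:
        self._ids = itertools.count(first_display)
        self._root = profile_root.rstrip("/")
        self._active: dict[int, BrowserSession] = {}

    def open(self) -> BrowserSession:
        n = next(self._ids)
        session = BrowserSession(n, f":{n}", f"{self._root}/{n}")
        self._active[n] = session
        return session

    def close(self, session: BrowserSession) -> None:
        # A real system would also stop the processes and wipe the profile.
        self._active.pop(session.session_id, None)

    def active_count(self) -> int:
        return len(self._active)


def chromium_args(session: BrowserSession) -> list[str]:
    """Browser argv that keeps cookies and cache inside the session."""
    return ["chromium", f"--user-data-dir={session.profile_dir}"]


alloc = SessionAllocator()
first, second = alloc.open(), alloc.open()
print(first.display, chromium_args(first))
print(second.display, chromium_args(second))
```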

SaaS Integration

  • This feature integrates with cloud-based SaaS services to control access and behavior by policy.
  • This enables granular security control not only for file handling but also for internal service actions.

Zero Trust Integration

  • This feature links RBI with a security model that verifies every access request.
  • It can control access based on user and device conditions and pair those access decisions with an isolated browsing environment.
