With the evolving landscape of digital browsing, artificial intelligence (AI) is at the forefront of redefining how users engage with online content. Sam Altman, CEO of OpenAI, encapsulated this ambition by expressing a desire for a service that unobtrusively observes a user’s online activities to provide proactive assistance when needed. This vision is embodied by a new class of AI browsers including OpenAI's ChatGPT Atlas and Perplexity's Comet.
Unlike conventional browsers, these AI-integrated tools introduce notable departures in functionality. Prominently featured is the addition of a chatbot activation button consistently accessible in the browser's upper-right corner. This interface element enables users to query the chatbot about webpage contents, such as seeking clarification on an article’s subject matter or requesting explanations for images displayed. Beyond inquiry, the AI browsers support an agent mode that empowers the AI to undertake entire tasks autonomous to the user — ranging from editing documents in Google Docs to handling transactions on Amazon.
However, this level of interactivity necessitates extensive access to user data and website information, raising significant privacy concerns. Lena Cohen, a technologist with the Electronic Frontier Foundation, explains that ChatGPT Atlas gains access to a substantially broader set of data than traditional browsers. This data, including personal details from visited sites, may also be utilized to further train OpenAI's models. In contrast to standard browsers, which typically log only the URLs visited, AI browsers transmit more granular content details to their servers, such as order histories when shopping online or communications from platforms like WhatsApp.
Or Eshed, CEO of LayerX, characterizes this as a "gold rush into user data in the browser," emphasizing the proliferation of user information being funneled to AI platforms. This shift necessitates that users remain vigilant about their data exposure and control mechanisms.
Data Transparency and User Controls
When accessing chatbot functionalities through a browser's sidebar, users inadvertently relinquish more control over shared data compared to direct interaction with standalone chatbot websites. These sidebars automatically convey the context of the active webpage to the chatbot. For instance, ChatGPT Atlas shows a visual indicator with the website’s name inside the sidebar bubble, denoting the source of contextual data being sent.
OpenAI states that the nature of retrieved data depends on the page's content. If the page is image-rich, relevant images may be fetched for analysis; for text-heavy pages like articles, primarily textual information is extracted. The browser also offers an optional "memories" feature wherein Atlas records descriptive summaries of all visited sites continually, expanding data retention beyond the immediate context of the current interrogation.
Nevertheless, the exact parameters guiding which elements of a website are transmitted, and the AI's decision criteria for data extraction, remain opaque to users. To mitigate potential overexposure, Atlas provides options to manually remove specific pages from chatbot context in ongoing conversations via an "x" icon appearing when hovering over the webpage name. Additionally, users may permanently prohibit certain websites from sharing data with ChatGPT within the browser's settings accessible through the URL bar.
Perplexity’s Comet browser, by contrast, lacks granular controls for data transmission tied to browsing context. To avoid leaking sensitive page contents, users must open a non-sensitive new tab to access the chatbot sidebar; only the active tab's information is attached in chat queries.
Opting Out of AI Model Training and Its Implications
Atlas users are presented with two key settings regulating how their data contributes to AI model development. The first, "Improve the model for everyone," allows OpenAI to utilize all interactions and inputs to ChatGPT for training purposes and is enabled by default. Given that Atlas appends website data automatically, this includes potentially sensitive content from personal accounts visited, such as social media profiles. Although OpenAI claims to remove personally identifiable information prior to training, the specific criteria used to define such data remain undisclosed.
The second setting, "Include web browsing," permits OpenAI to train models using comprehensive browser activity — including opened tabs and clicked links. This option is disabled by default. Disabling the primary "Improve the model for everyone" setting halts OpenAI from employing chats and browsing behavior for training entirely.
Comet’s data handling contrasts with Atlas by storing activity locally on the user’s device. Users can manage retention settings through their Perplexity account preferences.
Crucially, opting out of training does not prevent data transmission itself; it solely restricts subsequent usage by the AI training systems. Cohen warns that once data resides on corporate servers, user control diminishes significantly, exposing information to potential misuse by hackers, governmental agencies, or other actors. OpenAI reportedly complied with over a hundred government requests for user data within the first half of the current year alone.
Security Threats from Prompt Injection Attacks
The rollout of AI browsers introduced complex security vulnerabilities amid concerns about the manipulation of AI agents. Prompt injection attacks exploit the AI's difficulty in differentiating between legitimate webpage content and malicious embedded instructions clandestinely inserted by bad actors. Such crafted prompts may instruct the AI to divulge sensitive information or execute harmful operations.
Eshed advises caution when enabling agent mode on untrusted or unfamiliar websites, as malicious content can be concealed from human detection yet remain accessible to the AI's parsing logic. Mitigating measures include operating Atlas in a logged-out mode, which restricts the agent’s access to personal data and accounts, decreasing the risk of accidental information leakage. Perplexity does not currently offer a comparable mode, thus presenting greater exposure.
Pranav Vishnu, product lead for Atlas, notes that AI agent browsing is an emerging domain, recommending users begin with minimal access in logged-out mode and grant only necessary permissions for specific tasks.
Considering Whether to Engage With AI Browsers
Integrating AI more deeply into everyday web interactions aligns with the strategic objectives of AI companies seeking competitive advantage through expansive data acquisition. Early entry into the AI browser market, even with products perceived as partially developed, aims to cultivate data accumulation feeding iterative improvements via a growth-driven feedback loop.
Dan Hendrycks, Executive Director at the Center for AI Safety, remarks on the rapid evolution and niche potential of AI browsing. Nonetheless, he expresses personal reservations about adopting these tools, observing the continuous self-promotion embedded within ChatGPT Atlas.
Ultimately, while AI browsers offer unprecedented convenience and enhanced interactive capabilities, users face a tradeoff that balances the benefits against privacy considerations and security risks. Careful evaluation and judicious use are advisable as these technologies continue to mature.