{"id":29378,"date":"2025-01-31T23:15:33","date_gmt":"2025-01-31T23:15:33","guid":{"rendered":"https:\/\/gridinsoft.com\/blogs\/?p=29378"},"modified":"2025-01-31T23:52:15","modified_gmt":"2025-01-31T23:52:15","slug":"deepseek-ai-data-leak","status":"publish","type":"post","link":"https:\/\/gridinsoft.com\/blogs\/deepseek-ai-data-leak\/","title":{"rendered":"DeepSeek AI Data Leaked, Exposing User Data"},"content":{"rendered":"<p><strong>Wiz Research discovered an exposed DeepSeek database containing sensitive information<\/strong>, including user chat history, API keys, and logs. It also exposed backend data with internal details about infrastructure performance. The data sat unprotected on the public internet, which makes this more serious than a typical high-profile leak.<\/p>\n<h2>DeepSeek AI Data Breach: Over a Million Log Entries and Sensitive Keys Exposed<\/h2>\n<p>DeepSeek, a fast-rising Chinese AI startup that became known worldwide in just a few days for its open-source models, has found itself in hot water after a major security lapse. <a href=\"https:\/\/www.wiz.io\/blog\/wiz-research-uncovers-exposed-deepseek-database-leak\" rel=\"noopener noreferrer nofollow\" target=\"_blank\">Researchers at Wiz discovered<\/a> <strong>that DeepSeek left one of its ClickHouse databases publicly accessible on the internet<\/strong>, potentially allowing unauthorized access to sensitive internal data. 
Wouldn\u2019t it be ironic if an AI company that claims to be smarter than humans couldn\u2019t even secure its own database?<\/p>\n<figure id=\"attachment_29384\" aria-describedby=\"caption-attachment-29384\" style=\"width: 2062px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/deepseek-message.webp\" alt=\"Plain-Text chat messages DeepSeek\" width=\"2062\" height=\"1150\" class=\"size-full wp-image-29384\" title=\"\" srcset=\"https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/deepseek-message.webp 2062w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/deepseek-message-300x167.webp 300w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/deepseek-message-1024x571.webp 1024w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/deepseek-message-768x428.webp 768w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/deepseek-message-1536x857.webp 1536w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/deepseek-message-2048x1142.webp 2048w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/deepseek-message-860x480.webp 860w\" sizes=\"auto, (max-width: 2062px) 100vw, 2062px\" \/><figcaption id=\"caption-attachment-29384\" class=\"wp-caption-text\">Plain-Text chat messages from DeepSeek (source: Wiz Research)<\/figcaption><\/figure>\n<p>The exposed database contained over a million log entries, including chat history, backend details, API keys, and operational metadata\u2014essentially the backbone of DeepSeek\u2019s infrastructure. API secrets, in particular, are highly sensitive because they act as authentication tokens for accessing services. 
If compromised, attackers could exploit these keys to manipulate AI models, extract user data, or even take control of internal systems.<\/p>\n<h2>How Was the Data Accessed?<\/h2>\n<p>DeepSeek&#8217;s system ran on ClickHouse, an open-source columnar database optimized for handling large-scale data analytics. The database was hosted at oauth2callback.deepseek[.]com:9000 and dev.deepseek[.]com:9000, <strong>and required no authentication to access<\/strong>. This means that anyone who discovered the exposed endpoints could connect and potentially extract or alter the data at will.<\/p>\n<p>ClickHouse supports an HTTP interface, which allows users <a href=\"https:\/\/gridinsoft.com\/sql-injection\">to run SQL queries<\/a> directly from a web browser or command line without needing dedicated database management software. Because of this, any attacker who knew the right queries could potentially extract data, delete records, or escalate their privileges within DeepSeek\u2019s infrastructure.<\/p>\n<figure id=\"attachment_29387\" aria-describedby=\"caption-attachment-29387\" style=\"width: 2056px\" class=\"wp-caption aligncenter\"><img loading=\"lazy\" loading=\"lazy\" decoding=\"async\" src=\"https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/leaked-data.webp\" alt=\"Leaked data screenshot\" width=\"2056\" height=\"1152\" class=\"size-full wp-image-29387\" title=\"\" srcset=\"https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/leaked-data.webp 2056w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/leaked-data-300x168.webp 300w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/leaked-data-1024x574.webp 1024w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/leaked-data-768x430.webp 768w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/leaked-data-1536x861.webp 1536w, https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/leaked-data-2048x1148.webp 2048w, 
https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/leaked-data-860x482.webp 860w\" sizes=\"auto, (max-width: 2056px) 100vw, 2056px\" \/><figcaption id=\"caption-attachment-29387\" class=\"wp-caption-text\">Some leaked data<\/figcaption><\/figure>\n<p>Wiz researcher Gal Nagli pointed out that while much of AI security discourse focuses on future risks (like AI model manipulation and adversarial attacks), real-world threats often stem from elementary mistakes, like exposed databases.<\/p>\n<p>As Nagli notes, AI firms must prioritize data protection by working closely with security teams to prevent such leaks. If attackers had gained access to DeepSeek\u2019s logs, they could have harvested API keys to exploit AI services. They could also have analyzed chat logs to extract <a href=\"https:\/\/gridinsoft.com\/blogs\/personal-data-sensitive-data\/\">user data<\/a> and private interactions. Additionally, they might have been able to <strong>manipulate internal settings<\/strong> to alter how models operate.<\/p>\n<h2>So What Now?<\/h2>\n<p>Despite this high-profile failure, the service continues to operate and remains popular, judging by download statistics from official app stores. Beyond this incident, however, people concerned about data security have other questions for the service. Its privacy policies are under investigation, particularly in Europe, due to questions about its handling of user data. As a Chinese AI company, DeepSeek is also being examined by U.S. authorities for potential national security risks.<\/p>\n<p>Additionally, OpenAI and Microsoft suspect that <strong>DeepSeek may have used OpenAI\u2019s API<\/strong> without permission to train its models via distillation, a process where AI models are trained on the output of more advanced models rather than raw data. The Italian data protection authority, Garante, recently demanded information on DeepSeek\u2019s data collection practices, leading to its apps becoming unavailable in Italy. 
Meanwhile, Ireland\u2019s Data Protection Commission (DPC) has made a similar request.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Wiz Research discovered a detailed DeepSeek database containing sensitive information, including user chat history, API keys, and logs. Additionally, it exposed backend data with internal details about infrastructure performance. Yes, the unprotected data was openly lying in the public domain, so it is far beyond the high-profile leak. DeepSeek AI Data Breach: Over a Million [&hellip;]<\/p>\n","protected":false},"author":7,"featured_media":29389,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"content-type":"","_sitemap_exclude":false,"_sitemap_priority":"","_sitemap_frequency":"","footnotes":""},"categories":[15],"tags":[444,619,697],"class_list":{"0":"post-29378","1":"post","2":"type-post","3":"status-publish","4":"format-standard","5":"has-post-thumbnail","7":"category-security-news","8":"tag-ai","9":"tag-cybersecurity","10":"tag-data-breach"},"featured_image_src":"https:\/\/gridinsoft.com\/blogs\/wp-content\/uploads\/2025\/01\/GS_Blog_Telegram-Captcha-Exploits-PowerShell-to-Spread-Malware_1280x674-Copy-29.webp","author_info":{"display_name":"Stephanie 
Adlam","author_link":"https:\/\/gridinsoft.com\/blogs\/author\/adlam\/"},"_links":{"self":[{"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/posts\/29378","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/users\/7"}],"replies":[{"embeddable":true,"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/comments?post=29378"}],"version-history":[{"count":12,"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/posts\/29378\/revisions"}],"predecessor-version":[{"id":29393,"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/posts\/29378\/revisions\/29393"}],"wp:featuredmedia":[{"embeddable":true,"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/media\/29389"}],"wp:attachment":[{"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/media?parent=29378"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/categories?post=29378"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/gridinsoft.com\/blogs\/wp-json\/wp\/v2\/tags?post=29378"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}