Tag Archives: dna databases

We The Subjects — Plundering Health Data

When geneticist Jingyuan Fu heard that an artificial intelligence (AI) group in China had downloaded a large biomedical dataset her team built in Europe, she felt pride — and a jolt of unease. “We spent millions on that dataset,” says Fu, a professor of systems medicine at the University of Groningen in the Netherlands. “And the Chinese bought the whole thing for around €2,000.” In recent years, Fu’s group, like many others, has also begun using such data as feedstock for artificial intelligence. The AI group in China that downloaded her dataset had the same goal. “The Chinese wanted all our data,” Fu says. “And they also wanted our insights into how to mine it for AI development.”
From her perspective, today’s global scramble for biomedical data looks increasingly lopsided. “China has collected a huge amount of data,” she says. “But their own data sharing and openness is very limited.”… China already holds the largest data repositories, with 1.4 billion people using the WeChat app, many of whom are already connected to hospital databases for data integration, analysis and even healthcare delivery. “China also runs the largest number of clinical trials in the world generating massive drug response and real-world-evidence datasets.”

[A]fter decades of policies pushing ‘open science’, governments are now promoting ‘data sovereignty’, the idea that sensitive datasets should remain under national control and foreign access should be conditional. [In Europe] the stance is defensive. [Europe] is embarrassed about having allowed Chinese AI developers to plunder European biomedical databases, even while China blocks foreign access to Chinese datasets. They are now belatedly closing international access to biomedical databases, after years of championing cross-border sharing… According to the European Commission, “there are currently no partnerships involving the sharing of such data with China or the United States for AI development”….

As of April 2025, the 2.5 petabytes of omics data in the US Cancer Genome Atlas Program database are now closed to Chinese researchers, and UK Biobank data, containing whole-genome and exome sequences for 500,000 people, is no longer internationally downloadable. UK Biobank data must now be analyzed on the Biobank’s own platform, which provides a cloud-based ‘reading room’ without allowing individual data downloads…In September 2025, the US National Institutes of Health issued new regulations for genomic data repositories and users aimed at “protecting Americans’ sensitive personal health-related data from misuse by foreign adversaries” while enhancing “the privacy and autonomy of research participants”….In December 2025, the US State Department launched its Pax Silica initiative, aimed at forming an international AI alliance that hedges against China. [Furthermore], data that are generated and held by hospitals, insurers, device makers, drug makers and data platform companies are abundant. For example, US-based electronic health records vendor Epic Systems Corporation manages records for over 300 million US patients and says that it has more than 150 AI features in development…

[But] AI models developed using sequestered datasets often ‘overfit’ to the specific demographics or clinical practices of their training environment…. “Without external, international validation, these biases are frequently only discovered after they have caused clinical harm,” … [For example] many high-performing AI tools for melanoma detection show a precipitous drop in accuracy when applied to darker skin tones. “Because major datasets are often skewed toward light-skinned northern European or North American populations, these tools can misclassify malignant lesions as benign in under-represented groups.”

Excerpt from Paul Webster, Who Owns by Health Data?, Nature Medicine, April 24, 2026

Genetic Surveillance based on Stray DNA

Everywhere they go, humans leave stray DNA. Police have used genetic sequences retrieved from cigarette butts and coffee cups to identify suspects; archaeologists have sifted DNA from cave dirt to identify ancient humans. But for scientists aiming to capture genetic information not about people, but about animals, plants, and microbes, the ubiquity of human DNA and the ability of even partial sequences to reveal information most people would want to keep private is a growing problem, researchers from two disparate fields warn this week. Both groups are calling for safeguards to prevent misuse of such human genomic “bycatch.”

Genetic sequences recovered from water, soil, and even air can reveal plant and animal diversity, identify pathogens, and trace past environments, sparking a boom in studies of this environmental DNA (eDNA). But the samples can also contain significant amounts of human genes, researchers report today in Nature Ecology & Evolution. In some cases, the DNA traces were enough to determine the sex and likely ancestry of the people who shed them, raising ethical alarms…Similarly, scientists have for decades analyzed the genetic information in fecal matter to reveal the microbes in people’s intestines—the gut microbiome, which plays dramatic roles in human health and development.

The power to extract personal data from eDNA and microbiome samples will continue to increase, both groups of authors warn. That raises concerns about misuse by police or other government agencies, collection by commercial companies, or even mass genetic surveillance, says Natalie Ram, a law and bioethics scholar at the University of Maryland Francis King Carey School of Law. In the United States, she says, researchers and funding agencies should make greater use of federal Certificates of Confidentiality. They prohibit the disclosure of “identifiable, sensitive research information” to anyone not connected with a study, such as law enforcement, without the subject’s consent….

“Which companies and governments are going to pay and license to have poop-based surveillance technology?” he asks. “Imputing people’s identity based on their poop is compelling and interesting, for a number of reasons, and most of them are all the wrong reasons.”

Excerpts from Gretchen Vogel, Privacy concerns sparked by human DNA accidentally collected in studies of other species, Science, May 15, 2023

Tesla as Catfish: When China Carps-Tech CEOs Fall in Line

Many countries are wrestling with how to regulate digital records. Some economies, including in Europe, emphasize the need for data privacy, while others, such as China and Russia, put greater focus on government control. The U.S. currently doesn’t have a single federal-level law on data protection or security; instead, the Federal Trade Commission is broadly empowered to protect consumers from unfair or deceptive data practices.

Behind China’s moves is a growing sense among leaders that data accumulated by the private sector should in essence be considered a national asset, which can be tapped or restricted according to the state’s needs, according to the people involved in policy-making. Those needs include managing financial risks, tracking virus outbreaks, supporting state economic priorities or conducting surveillance of criminals and political opponents. Officials also worry companies could share data with foreign business partners, undermining national security.


Beijing’s latest economic blueprint for the next five years, released in March 2021, emphasized the need to strengthen government sway over private firms’ data, the first time a five-year plan has done so. A key element of Beijing’s push is a pair of laws: one, the Data Security Law, passed in June 2021, and the other a proposal updated by China’s legislature in April 2021. Together, they will subject almost all data-related activities to government oversight, including their collection, storage, use and transmission. The legislation builds on the 2017 Cybersecurity Law that started tightening control of data flows.

The law will “clearly implement a more stringent management system for data related to national security, the lifeline of the national economy, people’s livelihood and major public interests,” said a spokesman for the National People’s Congress, the legislature. The proposed Personal Information Protection Law, modeled on the European Union’s data-protection regulation, seeks to limit the types of data that private-sector firms can collect. Unlike the EU rules, the Chinese version lacks restrictions on government entities when it comes to gathering information on people’s call logs, contact lists, location and other data.

In late May 2021, citing concerns over user privacy, the Cyberspace Administration of China singled out 105 apps—including ByteDance’s video-sharing service Douyin and Microsoft Corp.’s Bing search engine and LinkedIn service—for excessively collecting and illegally accessing users’ personal information. The government gave the companies named 15 days to fix the problems or face legal consequences….

Beijing’s pressure on foreign firms to fall in line picked up with the 2017 Cybersecurity Law, which included a provision calling for companies to store their data on Chinese soil. That requirement, at least initially, was largely limited to companies deemed “critical infrastructure providers,” a loosely defined category that has included foreign banks and tech firms….Since 2021, Chinese regulators have formally made the data-localization requirement a prerequisite for foreign financial institutions trying to get a foothold in China. Citigroup Inc. and BlackRock Inc. are among the U.S. firms that have so far agreed to the rule and won licenses to start wholly-owned businesses in China…

Senior officials have publicly likened Tesla to a “catfish” rather than a “shark,” saying the company could uplift the auto sector the way working with Apple and Motorola Mobility LLC helped elevate China’s smartphone and telecommunications industries. To ensure Tesla doesn’t become a security risk, China’s Cyberspace Administration recently issued a draft rule that would forbid electric-car makers from transferring outside China any information collected from users on China’s roads and highways. It also restricted the use of Tesla cars by military personnel and staff of some state-owned companies amid concerns that the vehicles’ cameras could send information about government facilities to the U.S. In late May 2021, Tesla confirmed it had set up a data center in China and would domestically store data from cars it sold in the country. It said it joined other Chinese companies, including Alibaba and Baidu Inc., in the discussion of the draft rules arranged by the CyberSecurity Association of China, which reports to the Cyberspace Administration…

Increasingly, China’s president, Mr. Xi, leaned toward voices advocating greater digital control. He now labels big data as another essential element of China’s economy, on par with land, labor and capital. “From the point of view of the state, anti-data monopoly must be strengthened,” said Li Lihui, a former president of state-owned Bank of China Ltd. and now a member of China’s legislature. He said he expects China to establish a “centralized and unified public database” to underpin its digital economy.

Excerpts from China’s New Power Play: More Control of Tech Companies’ Troves of Data, WSJ, June 12, 2021

Genomic Surveillance

The use of DNA profiling for individual cases of law enforcement has helped to identify suspects and to exonerate the innocent. But retaining genetic materials in the form of national DNA databases, which have proliferated globally in the past two decades, raises important human rights questions.

Privacy rights are fundamental human rights. Around the world, the unregulated collection, use, and retention of DNA has become a form of genomic surveillance. Kuwait passed a now-repealed law mandating the DNA profiling of the entire population. In China, the police systematically collected blood samples from the Xinjiang population under the guise of a health program, and the authorities are working to establish a Y-chromosome DNA database covering the country’s male population. Thai authorities are establishing a targeted genetic database of Muslim minorities. Under policies set by the previous administration, the U.S. government has been indiscriminately collecting the genetic materials of migrants, including refugees, at the Mexican border.

Governments should reform surveillance laws and draft comprehensive privacy protections that tightly regulate the collection, use, and retention of DNA and other biometric identifiers. They should ban such activities when they do not meet international human rights standards of lawfulness, proportionality, and necessity.

Excerpts from Yves Moreau and Maya Wang, Risks of Genomic Surveillance and How to Stop it, Science, Feb. 2021