Biotechnology - Genomic and Proteomics/Commons based cases in BGP
Research questions
- Commons based cases (the cases that we know will appear in the right part of the quadrants)
- Identify cases
- Correlate them with their main outputs (Data. Narratives. Tools)
- How and to what extent are they “experimenting” with or “adopting” a commons-based approach? Are they adopting OA policies, for instance? Are they adopting Socially Responsible Licensing approaches?
- Identify these cases and treat them as entities that will also be placed in our mapping device (the quadrants)
- Identify which actors are participating in this and which actors are just observers (Use the questionnaire to guide your research when appropriate - Carol will select specific relevant and helpful questions)
Commons-based cases
“‘Commons’ refers to a particular institutional form of structuring the rights to access, use, and control resources.” (p. 60) “The salient characteristic of commons, as opposed to property, is that no single person has exclusive control over the use and disposition of any particular resource in the commons. Instead, resources governed by commons may be used or disposed of by anyone among some (more or less well‐defined) number of persons, under rules that may range from ‘anything goes’ to quite crisply articulated formal rules that are effectively enforced.” (Benkler, Yochai, The Wealth of Networks, p 61)
“If some information producers do not need to capture the economic benefits of their particular information outputs, or if some businesses can capture the economic value of their information production by means other than exclusive control over their products, then the justification for regulating access by granting copyrights or patents is weakened” (p 37)
“The efficiency of regulating information, knowledge, and cultural production through strong copyright and patent is not only theoretically ambiguous, it also lacks empirical basis.” (p 38)
“Indeed, industry surveys concerned with patents have shown that the vast majority of industrial R&D is pursued with strategies that do not rely primarily on patents.” (Wealth of Networks, p 44) “Whether, overall, any given regulatory change that increases the scope of exclusive rights improves or undermines new innovation therefore depends on whether, given the level of appropriability that preceded it, it increased input costs more or less than it increased the prospect of being paid for one’s outputs.” (p 49)
“Given diverse strategies, the primary unambiguous effect of increasing the scope and force of exclusive rights is to shape the population of business strategies. Strong exclusive rights increase the attractiveness of exclusive‐rights‐based strategies at the expense of nonproprietary strategies, whether market‐based or nonmarket based. They also increase the value and attraction of consolidation of large inventories of existing information with new production.” (p 50)
Foundational Genomic Data Commons
Human Genome Project
- External Link: Human Genome Project
- Products: Data and Tools. Genome sequence available publicly
- Governance: funded through the NIH
- Comment: Another interesting instance of the commons - the government used the power of funding to impose open access requirements on the organizations which participated.
- Summary/Notes:
The Human Genome Project mapped the entire human genome using an open approach. Funded primarily by the US government, initially through the Department of Energy (reflecting the roots of genomic research in the push to understand mutation caused by radiation exposure), the HGP was a classic “big science” project. An enormous amount of money was committed, a small number of centers were chosen to receive that money, and the resulting data was expected to be an open product. The primary regulation, in our nomenclature, was normative and not legal – the data was in the public domain, but there were expectations of scientific behavior, and the right to punish violators was reserved; the punishment would come within the discipline, via peer review and grantmaking review, and not via the courts.
The norms that emerged from the Human Genome Project served as the basis for the development of commons‐based practices in the genomics field and also for the understanding of the legal rules related to database protection. For instance, on the in‐take side, it was understood with the HGP that only a limited group of people could contribute, since capacity and infrastructure were scarce. Not many had the scientists, the labs or the machines to carry out the study – a marked difference from Open Source projects, where the means of production are democratized via ubiquitous cheap desktop computing and ICTs.
However, some time into the project, its sponsors – the government – realized that members of the project’s team were not posting the data they were producing, and that the competitor Celera was rapidly creating a private version of the genome via new technology (itself developed at least partially with HGP funding). This was the origin of the codified and formalized norms known as the Bermuda Rules, further developed during the Fort Lauderdale meeting. The Rules were simple and clear: all data was in the public domain, and it would be posted online within 24 hours of coming off the machines. However, scientists using the data were expected to check whether the data had been “published” yet (the fuzzy part) and, if it was unpublished, to honor some norms about the data.
The norms that emerged from the HGP were the inspiration for a norm‐setting process in the HapMap project. However, when the HapMap came to life, the Open Source Movement was already a well‐developed and well‐studied movement. The FLOSS movement inspired the HapMap to adopt, in its beginning, a more regulated approach, by instituting a “click wrap” contract among the HapMap participants for its in‐take and out‐take processes. The sharing norms instituted by the HapMap contract highly regulated the publication process and also tried to limit the exploitation (more precisely – the abuse) of patents that might emerge from the HapMap outputs.
HapMap
- External Link: HapMap
- Products: Data. Coordination between researchers in Canada, China, Japan, Nigeria, United Kingdom and the United States to identify disease-causing genes. Data released into the public domain
- Governance: Combination of both public and private organizations (http://www.hapmap.org/groups.html)
- Comment: Another good instance of commons-based production
- Summary/notes:
Before the HapMap out‐take process became an Open Unregulated Commons, it was an Open Regulated Commons, governed by a contract that required users “not to reduce access to the data” and imposed a kind of “share‐alike” on patents generated through the use of HapMap data. This approach, as can be seen on the project’s site, was abandoned in favor of an unregulated environment: the HapMap out‐take is now an Open Unregulated Commons. The reasons were: (1) the project was finished, so there was no reason to protect the data anymore and (2) the contract was preventing data integration. It turned out that the clickwrap license was less effective at preventing patents than the simple creation of prior art, and its effect on data integration was felt to be toxic enough that the contract was removed. (See: http://www.hapmap.org/cgi‐perl/registration + Science Vol. 312. no. 5777, p. 1131 ‐ The HapMap Gold Rush: Researchers Mine a Rich Deposit)
It is interesting to note that the desired community regulation moved from norms to contract and back to norms, and that the desire for the community output to serve as an input into new systems (like integrated global genomic databases) was an important factor in moving back to norms.
ENCODE
- External Link: ENCODE
- Products: Data. Open consortium to identify all functional elements of the human genome. Data is made publicly available
- Governance: Part of the NIH
- Comment: Perfect instance of commons-based production.
Sage Bionetworks
- External Link: Sage Bionetworks
- Primary products: Data and Narratives.
- Summary-Notes on Sage
Observational Genomic Data Commons
Gene Expression Omnibus
The birth of the GEO project in the environment we briefly analyzed above allowed the emergence of a new kind of Commons, the “Partially Open Commons”, where everybody who has the money and the tools to run the experiment is allowed to contribute. This pattern may be similar to the Open Commons, but it is different from the “Limited Commons” that we generally observed as a pattern in the Foundational Data projects, where just the “chosen” ones could contribute.
Proteomic Data Commons
In terms of data and narrative outputs, proteomics is very similar to genomics. There is fundamental and observational data, though there is no “human proteome project” like the HGP to serve as an aggregating actor for commons-based efforts. There are many smaller efforts that we can study, including databases in structural genomics and protein data.
For protein tools, antibodies are the biggest category. We can classify all sorts of antibodies for specific study, such as cytokines, neurotrophins, etc. These can be studied from the perspective of both the companies that provide them and the labs. There is also a growing system for protein expression, like gene expression, that depends on antibodies but can now also use all sorts of genomic tools. So the genomic tools are now becoming proteomic tools as well. Also, access to the same stem cells and mice is essential if the research is going to translate to cures. It will be interesting to see whether the same desire to treat the outputs of research as inputs to new research that we saw in the fundamental genomic data space applies here.
Some other kinds of protein tech would include high throughput screening array technology (the robots that test drugs against proteins) and software tools: structure prediction, identification, properties, alignment. Proteomics research is very intensive in terms of computation and software (much more complex than genomics – more similar in some ways to climate change and weather modeling in terms of complexity).
We should probably expect to discuss the impact of patents on biomarkers / diagnostic markers. Gene patents haven't had the expected anticommons impact, but protein patents are extremely valuable and frequently enforced.
Tools Commons
BIOS/CAMBIA
- External Link: CAMBIA / BIOS
- Output: Tools (e.g. new databases) and Narratives (studies and papers)
- Governance: Non-profit NGO. Funding through the Norwegian Government, Horticulture Australia, and the Lemelson Foundation
- Should definitely take a look at the BioForge project, which aims to encourage collaboration between research groups in the life sciences
Methodologies for the Commons
Health Commons
- External Link: Health Commons
- Products: Data, Narratives and Tools. Coalition of organizations that aims to share data under a common set of terms and conditions
- Governance: led by 501(c)3 Science Commons
Infrastructure for the Commons
Ensembl Genome Browser
- External Link: Ensembl Genome Browser
- Output: Data. Aims to automatically annotate the genome, integrate that annotation with other databases and share the product freely on the web
- Governance: Collaboration between the European Bioinformatics Institute and the Wellcome Trust Sanger Institute
- Comment: Interesting case - seems to be using data that's in the commons, managed by private organizations, to produce a new product that is also in the commons
- Summary/Notes: Ensembl is a joint project between EMBL-EBI and the Wellcome Trust Sanger Institute to develop a software system which produces and maintains automatic annotation on selected eukaryotic genomes.
BIODAS: Distributed Annotation System
- Products: Data, Narratives and Tools. Aims to create a standard protocol for exchanging genomic annotations
- Governance: Distributed, though with self-appointed leaders
- Comment: This falls under the gray area of the definition of 'commons'. It is much closer to Lessig's definition, where something like TCP/IP could be considered a commons.
National Center for Biotechnology Information
- External Link: National Center for Biotechnology Information
- Output: Data. Creates publicly accessible data and analysis systems for biochemistry and genetics
- Governance: Division of National Library of Medicine and National Institutes of Health
- Comment: Probably does not count as a commons-based system. The tools, while publicly available, do not appear to be publicly editable. Might be more useful to see what, if any, collaborative enterprises develop from this work
The Open Biological and Biomedical Ontologies
- External Link: Open Biological Ontologies
- Primary products: Data, Narratives and Tools. Aims to support the community of people developing biomedical ontologies
- Governance: Coordinating editors from the Berkeley Bioinformatics Open-Source Projects - there does not seem to be a system of elections
- Summary:
The OBO Foundry is a collaborative experiment involving developers of science-based ontologies who are establishing a set of principles for ontology development with the goal of creating a suite of orthogonal interoperable reference ontologies in the biomedical domain. The groups developing ontologies who have expressed an interest in this goal are listed below, followed by other relevant efforts in this domain.
In addition to a listing of OBO ontologies, this site also provides a statement of the OBO Foundry principles, discussion fora, technical infrastructure, and other services to facilitate ontology development. We welcome feedback and encourage participation.
Open Wet Ware
- External Link: Open Wet Ware
- Primary products: Data, Narratives and Tools. Sharing best practices in biological engineering
- Governance: Elected officers, funded through the NSF
- Summary/Notes: OpenWetWare is an effort to promote the sharing of information, know-how, and wisdom among researchers and groups who are working in biology & biological engineering.
A kernel for the Tropical Disease Initiative
- TDI “A kernel for the Tropical Disease Initiative”, Leticia Ortí et al., Nature Publishing Group,
- Problem it aims to address: open source drug development has not been successful because it lacks a critical mass of publicly available data. Focus on tropical diseases
- Methodology: develop a computational pipeline (e.g. open source data sets) for developing the following: structure modeling of target proteins
- predictions of ligand binding locations: (e.g., http://www.thesynapticleap.org/), public-private partnerships (e.g., http://www.mmv.org/) and private foundations (e.g., http://www.gatesfoundation.org/);
- predict structures of protein sequences
- products created from using the software do not seem to be required to be put in the public domain
- Uses Science Commons protocol for implementing Open Access Data (http://sciencecommons.org/projects/publishing/open-access-dataprotocol/)
- Open Source Biotechnology Project, Open source biotechnology?, http://rsss.anu.edu.au/~janeth/OSBiotech.html
Others
BMC Biotechnology
- External Link: BMC Biotechnology
- Output: Narratives. Open Access Biotech Journal. Anyone can submit, though it maintains a peer-review process
- Governance: The site itself is part of Springer Science+Business Media
Chiron
- External Link: Chiron
- Output: Data. Chiron maintains connections to lots of individual scientists. Reported 1,400 informal agreements and collaborations with other companies and 64 formal collaborations with other companies (Powell pp. 72-73)
- Comment: Seems Chiron has built a collaborative network qualitatively different from that of other firms. Likely worth investigating more
Michigan Biotech
- External Link: Two Michigan Biotech companies decide to share their lab and equipment
- Output: Tools. Two companies didn't have the money to maintain separate labs, so they merged their efforts.
- Comment: It might be interesting to talk to these people personally and ask what, if any, collaboration this sort of proximity has brought
Bibliography for Item 10 in BGP