Background: We tested the hypothesis that distance matters but that geographical proximity has become less important for research collaboration in recent years. We chose a sample, authors at German immunological institutes, that is relatively homogeneous with regard to research field, language and culture, which besides distance are other possible factors influencing the willingness to co-operate. We analyse yearly distributions of co-authorship links between institutes and compare them with the yearly distributions of distances of all institutes producing papers in journals indexed in the Science Citation Index, editions 1992 to 2002. We weight both types of distributions with paper numbers.
Results: One interesting result is that place matters, but once a researcher has to leave the home town to find a collaborator, distance no longer matters. This result holds for all years considered, but is statistically most significant in 2002. The tendency to leave one's own town for collaborators has slightly increased in the sample. In addition, yearly productivity distributions of institutes have been found to be lognormal.
Conclusion: The Internet did not much change the collaboration patterns between German immunological institutes.
{"title":"Collaboration and distances between German immunological institutes--a trend analysis.","authors":"Frank Havemann, Michael Heinz, Hildrun Kretschmer","doi":"10.1186/1747-5333-1-6","DOIUrl":"https://doi.org/10.1186/1747-5333-1-6","journal":{"name":"Journal of biomedical discovery and collaboration"},"publicationDate":"2006-06-14","publicationTypes":"Journal Article"}
Occasionally, multiple names are given to the same gene/protein. When this happens, different names can be used in subsequent publications, for example in different research areas, sometimes with little or no awareness that the same entity known under a different name may have a major role in another field of science. Recent reports about the protein p11 presented findings that this protein, commonly known as S100A10, may play a crucial role in depression and antidepressant treatment mechanisms. One set of data showed an increased expression of this protein in the brain of mice treated with antidepressants. P11/S100A10 is only one of several S100 proteins expressed in the brain. Interestingly, it has been previously noted that antidepressant treatment increases the brain content of another S100 protein, S100B. It appears that up-regulating the brain content of various S100 proteins might be a common feature of antidepressants. In cells coexpressing S100A10 and S100B, these proteins may interact and exert opposite regulatory roles. Nevertheless, S100A10 is predominantly expressed in certain types of neurons whereas S100B is more abundant in glia. Thus, an interplay among multiple members of the S100 proteins might be important in determining the region and cell specificity of antidepressant mechanisms. Calling the p11 protein by its other name, S100A10, may prompt more investigators from different fields to participate in this new direction of neurobiological research.
{"title":"Nomen est Omen: do antidepressants increase p11 or S100A10?","authors":"Hari Manev, Radmila Manev","doi":"10.1186/1747-5333-1-5","DOIUrl":"https://doi.org/10.1186/1747-5333-1-5","journal":{"name":"Journal of biomedical discovery and collaboration"},"publicationDate":"2006-04-13","publicationTypes":"Journal Article"}
Background: The TREC 2004 Genomics Track focused on applying information retrieval and text mining techniques to improve the use of genomic information in biomedicine. The Genomics Track consisted of two main tasks, ad hoc retrieval and document categorization. In this paper, we describe the categorization task, which focused on the classification of full-text documents, simulating the task of curators of the Mouse Genome Informatics (MGI) system and consisting of three subtasks. One subtask of the categorization task required the triage of articles likely to have experimental evidence warranting the assignment of GO terms, while the other two subtasks were concerned with the assignment of the three top-level GO categories to each paper containing evidence for these categories.
Results: The track had 33 participating groups. The mean utility measure for the triage subtask was 0.3303, with a top score of 0.6512. No system was able to substantially improve results over simply using the MeSH term Mice. The overlap of significant features between the training and test sets was found to be less than expected. Sample coverage of GO terms assigned to papers in the collection was very sparse. Determining papers containing GO term evidence will likely need to be treated as separate tasks for each concept represented in GO, and will therefore require much denser sampling than was available in the data sets. The annotation subtask had a mean F-measure of 0.3824, with a top score of 0.5611. The mean F-measure for the annotation plus evidence codes subtask was 0.3676, with a top score of 0.4224. Gene name recognition was found to be of benefit for this task.
Conclusion: Automated classification of documents for GO annotation is a challenging task, as is the automated extraction of GO code hierarchies and evidence codes. However, automating these tasks would provide substantial benefit to biomedical curation, and therefore work in this area must continue. Additional experience will allow comparison and further analysis of which algorithmic features are most useful in biomedical document classification, and a better understanding of the task characteristics that make automated classification feasible and useful for biomedical document curation. The TREC Genomics Track will continue in 2005, focusing on a wider range of triage tasks and on improving results from 2004.
{"title":"The TREC 2004 genomics track categorization task: classifying full text biomedical documents.","authors":"Aaron M Cohen, William R Hersh","doi":"10.1186/1747-5333-1-4","DOIUrl":"https://doi.org/10.1186/1747-5333-1-4","journal":{"name":"Journal of biomedical discovery and collaboration"},"publicationDate":"2006-03-14","publicationTypes":"Journal Article"}
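The F-measure reported for the annotation subtasks above combines precision and recall into one score. The following is a minimal Python sketch of that calculation, assuming the balanced form (beta = 1); the abstract does not state which weighting the track used, and the function name and the example counts are illustrative only, not taken from the paper.

```python
def precision_recall_f(true_pos, false_pos, false_neg, beta=1.0):
    """Return (precision, recall, F_beta) computed from raw counts.

    beta = 1.0 gives the balanced F-measure (an assumption here;
    the abstract does not say which weighting was used).
    """
    predicted = true_pos + false_pos   # everything the system labelled positive
    actual = true_pos + false_neg      # everything that is truly positive
    precision = true_pos / predicted if predicted else 0.0
    recall = true_pos / actual if actual else 0.0
    if precision == 0.0 and recall == 0.0:
        return precision, recall, 0.0
    b2 = beta * beta
    f_score = (1 + b2) * precision * recall / (b2 * precision + recall)
    return precision, recall, f_score


# Hypothetical counts for one system on one subtask.
p, r, f = precision_recall_f(true_pos=30, false_pos=20, false_neg=30)
print(f"precision={p:.3f} recall={r:.3f} F1={f:.3f}")
```

Because the F-measure is a harmonic mean, it stays low unless both precision and recall are reasonably high, which is one reason scores such as those reported above can look modest even for competitive systems.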
{"title":"Launching the \"Journal of Biomedical Discovery and Collaboration\"","authors":"Neil R Smalheiser","doi":"10.1186/1747-5333-1-1","DOIUrl":"https://doi.org/10.1186/1747-5333-1-1","journal":{"name":"Journal of biomedical discovery and collaboration"},"publicationDate":"2006-03-13","publicationTypes":"Journal Article"}
William R Hersh, Ravi Teja Bhupatiraju, Laura Ross, Phoebe Roberts, Aaron M Cohen, Dale F Kraemer
Background: The goal of the TREC Genomics Track is to improve information retrieval in the area of genomics by creating test collections that will allow researchers to improve and better understand failures of their systems. The 2004 track included an ad hoc retrieval task, simulating use of a search engine to obtain documents about biomedical topics. This paper describes the Genomics Track of the Text Retrieval Conference (TREC) 2004, a forum for evaluation of IR research systems, where retrieval in the genomics domain has recently begun to be assessed.
Results: A total of 27 research groups submitted 47 different runs. The most effective runs, as measured by the primary evaluation measure of mean average precision (MAP), used a combination of domain-specific and general techniques. The best MAP obtained by any run was 0.4075. Techniques that expanded queries with gene name lists as well as words from related articles had the best efficacy. However, many runs performed more poorly than a simple baseline run, indicating that careful selection of system features is essential.
Conclusion: Approaches to ad hoc retrieval vary widely in efficacy. The TREC Genomics Track and its test collection resources provide tools that allow improvement in information retrieval systems.
{"title":"Enhancing access to the Bibliome: the TREC 2004 Genomics Track.","authors":"William R Hersh, Ravi Teja Bhupatiraju, Laura Ross, Phoebe Roberts, Aaron M Cohen, Dale F Kraemer","doi":"10.1186/1747-5333-1-3","DOIUrl":"https://doi.org/10.1186/1747-5333-1-3","journal":{"name":"Journal of biomedical discovery and collaboration"},"publicationDate":"2006-03-13","publicationTypes":"Journal Article"}
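Mean average precision (MAP), the primary evaluation measure named in the abstract above, averages a per-topic average precision over all topics. The sketch below shows the standard textbook definition in Python; it is not the track's own evaluation software, and the function names, document identifiers and relevance judgments are made up for illustration. It assumes each ranked list contains no duplicate document identifiers.

```python
def average_precision(ranked_ids, relevant_ids):
    """Average precision for one topic.

    ranked_ids: document ids in the order the system returned them
                (assumed to contain no duplicates).
    relevant_ids: ids judged relevant for the topic.
    Relevant documents that were never retrieved contribute zero.
    """
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank   # precision at this rank
    return precision_sum / len(relevant)


def mean_average_precision(topics):
    """Mean of average_precision over (ranked_ids, relevant_ids) pairs."""
    topics = list(topics)
    if not topics:
        return 0.0
    return sum(average_precision(r, rel) for r, rel in topics) / len(topics)


# Two hypothetical topics with made-up document ids.
topics = [
    (["d1", "d2", "d3", "d4"], {"d1", "d3"}),  # AP = (1/1 + 2/3) / 2
    (["d5", "d6", "d7"], {"d6", "d9"}),        # AP = (1/2) / 2, d9 never retrieved
]
print(round(mean_average_precision(topics), 4))
```

Dividing by the number of relevant documents, rather than by the number retrieved, is what penalises a run for relevant documents it never returns.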
This informal tutorial is intended for investigators and students who would like to understand the workings of information retrieval systems, including the most frequently used search engines: PubMed and Google. Having a basic knowledge of the terms and concepts of information retrieval should improve the efficiency and productivity of searches. This knowledge is also needed in order to follow current research efforts in biomedical information retrieval and text mining that are developing new systems not only for finding documents on a given topic, but also for extracting and integrating knowledge across documents.
{"title":"A tutorial on information retrieval: basic terms and concepts.","authors":"Wei Zhou, Neil R Smalheiser, Clement Yu","doi":"10.1186/1747-5333-1-2","DOIUrl":"https://doi.org/10.1186/1747-5333-1-2","journal":{"name":"Journal of biomedical discovery and collaboration"},"publicationDate":"2006-03-13","publicationTypes":"Journal Article"}