1
Fang L, Salami MO, Weber GM, Torvik VI. uCite: The union of nine large-scale public PubMed citation datasets with reliability filtering. Data Brief 2025; 60:111535. [PMID: 40322502 PMCID: PMC12049819 DOI: 10.1016/j.dib.2025.111535] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 01/15/2025] [Revised: 02/28/2025] [Accepted: 03/28/2025] [Indexed: 05/08/2025]
Abstract
There has been a recent push to make public, aggregate, and increase coverage of bibliographic citation data. Here we describe uCite, a citation dataset containing 564 million PubMed citation pairs aggregated from the following nine sources: PubMed Central, iCite, OpenCitations, Dimensions, Microsoft Academic Graph, Aminer, Semantic Scholar, Lens, and OpCitance. Of these, 51 million (9%) were labeled unreliable, as determined by patterns of source discrepancies explained by ambiguous metadata, crosswalk, and typographical errors, citing future publications, and multi-paper documents. Each source contributes to improved coverage and reliability, but varies dramatically in precision and recall, estimates of which are contrasted with the Web of Science and Scopus herein.
Affiliation(s)
- Liri Fang
  - School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, United States
- Malik Oyewale Salami
  - School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, United States
- Griffin M. Weber
  - Department of Biomedical Informatics, Harvard Medical School, Boston, MA, United States
- Vetle I. Torvik
  - School of Information Sciences, University of Illinois Urbana-Champaign, Champaign, IL, United States
2
Zhang L, Song N, Gui S, Wu K, Lu W. Bridging the gap in author names: building an enhanced author name dataset for biomedical literature system. J Am Med Inform Assoc 2024; 31:1648-1656. [PMID: 38916911 PMCID: PMC11258411 DOI: 10.1093/jamia/ocae127] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/26/2023] [Revised: 05/07/2024] [Accepted: 05/16/2024] [Indexed: 06/26/2024]
Abstract
OBJECTIVE Author name incompleteness, referring to only first initial available instead of full first name, is a long-standing problem in MEDLINE and has a negative impact on biomedical literature systems. The purpose of this study is to create an Enhanced Author Names (EAN) dataset for MEDLINE that maximizes the number of complete author names.
MATERIALS AND METHODS The EAN dataset is built based on a large-scale name comparison and restoration with author names collected from multiple literature databases such as MEDLINE, Microsoft Academic Graph, and Semantic Scholar. We assess the impact of EAN on biomedical literature systems by conducting comparative and statistical analyses between EAN and MEDLINE's author names dataset (MAN) on 2 important tasks, author name search and author name disambiguation.
RESULTS Evaluation results show that EAN improves the number of full author names in MEDLINE from 69.73 million to 110.9 million. EAN not only restores a substantial number of abbreviated names prior to the year 2002 when the NLM changed its author name indexing policy but also improves the availability of full author names in articles published afterward. The evaluation of the author name search and author name disambiguation tasks reveal that EAN is able to significantly enhance both tasks compared to MAN.
CONCLUSION The extensive coverage of full names in EAN suggests that the name incompleteness issue can be largely mitigated. This has significant implications for the development of an improved biomedical literature system. EAN is available at https://zenodo.org/record/10251358, and an updated version is available at https://zenodo.org/records/10663234.
Affiliation(s)
- Li Zhang
  - Laboratory of Data Intelligence and Interdisciplinary Innovation of Nanjing University, Nanjing, Jiangsu, 210023, China
  - School of Information Management, Nanjing University, Nanjing, Jiangsu, 210023, China
- Ningyuan Song
  - Laboratory of Data Intelligence and Interdisciplinary Innovation of Nanjing University, Nanjing, Jiangsu, 210023, China
  - School of Information Management, Nanjing University, Nanjing, Jiangsu, 210023, China
- Sisi Gui
  - School of Information Management, Nanjing Agricultural University, Nanjing, Jiangsu, 210023, China
- Keye Wu
  - Laboratory of Data Intelligence and Interdisciplinary Innovation of Nanjing University, Nanjing, Jiangsu, 210023, China
  - School of Information Management, Nanjing University, Nanjing, Jiangsu, 210023, China
- Wei Lu
  - School of Information Management, Wuhan University, Wuhan, Hubei, 430072, China
3
Jiang Y, Liu X. A construction and empirical research of the journal disruption index based on open citation data. Scientometrics 2023; 128:3935-3958. [PMID: 37287879 PMCID: PMC10195667 DOI: 10.1007/s11192-023-04737-y] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Received: 12/07/2022] [Accepted: 05/08/2023] [Indexed: 06/09/2023]
Abstract
For many years, the journal evaluation system has been centered on impact indicators, resulting in evaluation results that do not reflect the academic innovation of journals. To solve this issue, this study attempts to construct the Journal Disruption Index (JDI) from the perspective of measuring the disruption of each journal article. In the actual study, we measured the disruption of articles of 22 selected virology journals based on the OpenCitations Index of Crossref open DOI-to-DOI citations (COCI) first. Then we calculated the JDI of 22 virology journals based on the absolute disruption index (D_Z) of the articles. Finally, we conducted an empirical study on the differences and correlations between the impact indicators and disruption indicators as well as the evaluation effect of the disruption index. The results of the study show: (1) There are large differences in the ranking of journals based on disruption indicators and impact indicators. Among the 22 journals, 12 are ranked higher by JDI than Cumulative Impact Factor for 5 years (CIF5), the Journal Index for PR6 (JIPR6) and average Percentile in Subject Area (aPSA). The ranking difference of 17 journals between the two kinds of indicators is greater than or equal to 5. (2) There is a medium correlation between disruption indicators and impact indicators at the level of journals and papers. JDI is moderately correlated with CIF5, JIPR6 and aPSA, with correlation coefficients of 0.486, 0.471 and -0.448, respectively. D_Z was also moderately correlated with Cumulative Citation (CC), Percentile Ranking with 6 Classifications (PR6) and Percentile in Subject Area (PSA) with correlation coefficients of 0.593, 0.575 and -0.593, respectively. (3) Compared with traditional impact indicators, the results of journal disruption evaluation are more consistent with the evaluation results of experts' peer review.
JDI reflects the innovation level of journals to a certain extent, which is helpful to promote the evaluation of innovation in sci-tech journals.
Affiliation(s)
- Yuyan Jiang
  - Henan Research Center for Science Journals, Xinxiang Medical University, 601 Jinsui Road, Xinxiang, 453003 China
- Xueli Liu
  - Henan Research Center for Science Journals, Xinxiang Medical University, 601 Jinsui Road, Xinxiang, 453003 China
  - Faculty of Humanities & Social Sciences, Xinxiang Medical University, 601 Jinsui Road, Xinxiang, 453003 China
4
Mahuli AV, Sagar V, Vpk V, Mahuli SA, Kujur A. Bibliometric Analysis of Poor Oral Health as a Risk Factor for Oral Cancer. Cureus 2023; 15:e36015. [PMID: 37041926 PMCID: PMC10084796 DOI: 10.7759/cureus.36015] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Accepted: 03/10/2023] [Indexed: 03/13/2023]
Abstract
Poor oral health is a risk factor for oral cancer, and bibliometrics can tell us important things about publication trends and research. Oral cancer risk factors include smoking, betel nut chewing, alcohol consumption, trauma from sharp teeth, chronic infections, and other factors related to oral health. There is a need to understand the role of poor oral health as a risk factor. Thus, this study aimed to conduct a bibliometric analysis of the literature on poor oral health as a risk factor for oral cancer. A bibliometric analysis was conducted for poor oral health as a risk factor for oral cancer using RStudio 2021.09.0+351 "Ghost Orchid" Release (2021-09-20) for Windows, package "bibliometrix." The literary data for this study were derived from Elsevier's Scopus database, and the data were exported in BibTex format. The results considered the time frame of 1983 to 2022, with journals, books, newspaper articles, and others as sources, accounting for a total of 543 documents. The search yielded a total of 2,882 authors, with a total of 3,306 appearances. The results show that the research on poor oral health and oral cancer is mainly led by the United States (106), India (49), and China (46). The top author is Warnakulasuriya S, followed by Worthington HV. The research shows the countries that are currently working on the topics and helps set up future collaborations to improve the evidence produced and help the scientific community by finding research gaps and experts in this area of research.
Affiliation(s)
- Amit V Mahuli
  - Public Health Dentistry and Preventive Dentistry, Dental College, Rajendra Institute of Medical Sciences, Ranchi, IND
- Vidya Sagar
  - Preventive and Social Medicine, Rajendra Institute of Medical Sciences, Ranchi, IND
- Vedha Vpk
  - Public Health Dentistry and Preventive Dentistry, Dental College, Rajendra Institute of Medical Sciences, Ranchi, IND
- Simpy A Mahuli
  - Public Health Dentistry, Rajendra Institute of Medical Sciences, Ranchi, IND
- Anit Kujur
  - Preventive and Social Medicine, Rajendra Institute of Medical Sciences, Ranchi, IND
5
Malhotra K, Goyal K, Malhotra S. Is global surgery really global? Evaluating global and gender diversity in global surgery research. Br J Surg 2022; 109:1331-1332. [PMID: 36149783 DOI: 10.1093/bjs/znac328] [Citation(s) in RCA: 3] [Impact Index Per Article: 1.0] [Received: 08/28/2022] [Accepted: 09/02/2022] [Indexed: 12/31/2022]
Affiliation(s)
- Kashish Malhotra
  - Department of Surgery, Dayanand Medical College and Hospital, Punjab, India
- Kashish Goyal
  - Department of Surgery, Dayanand Medical College and Hospital, Punjab, India
- Sakshi Malhotra
  - Department of Obstetrics and Gynaecology, Pandit Bhagwat Dayal Sharma Post Graduate Institute of Medical Sciences Rohtak, Haryana, India
6
Liang Z, Mao J, Li G. Bias against scientific novelty: A prepublication perspective. J Assoc Inf Sci Technol 2022. [DOI: 10.1002/asi.24725] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/21/2022]
Affiliation(s)
- Zhentao Liang
  - School of Information Management, Wuhan University, Wuhan, China
  - Center for Studies of Information Resources, Wuhan University, Wuhan, China
- Jin Mao
  - School of Information Management, Wuhan University, Wuhan, China
  - Center for Studies of Information Resources, Wuhan University, Wuhan, China
- Gang Li
  - School of Information Management, Wuhan University, Wuhan, China
  - Center for Studies of Information Resources, Wuhan University, Wuhan, China
7
Li X, Tang X, Cheng Q. Predicting the clinical citation count of biomedical papers using multilayer perceptron neural network. J Informetr 2022. [DOI: 10.1016/j.joi.2022.101333] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/29/2022]
8
Wang S, Ma Y, Mao J, Bai Y, Liang Z, Li G. Quantifying scientific breakthroughs by a novel disruption indicator based on knowledge entities. J Assoc Inf Sci Technol 2022. [DOI: 10.1002/asi.24719] [Citation(s) in RCA: 0] [Impact Index Per Article: 0] [Indexed: 11/12/2022]
Affiliation(s)
- Shiyun Wang
  - Center for Studies of Information Resources, Wuhan University, Wuhan, Hubei, China
  - School of Information Management, Wuhan University, Wuhan, Hubei, China
- Yaxue Ma
  - School of Information Management, Nanjing University, Nanjing, Jiangsu, China
- Jin Mao
  - Center for Studies of Information Resources, Wuhan University, Wuhan, Hubei, China
  - School of Information Management, Wuhan University, Wuhan, Hubei, China
- Yun Bai
  - Center for Studies of Information Resources, Wuhan University, Wuhan, Hubei, China
  - School of Information Management, Wuhan University, Wuhan, Hubei, China
- Zhentao Liang
  - Center for Studies of Information Resources, Wuhan University, Wuhan, Hubei, China
  - School of Information Management, Wuhan University, Wuhan, Hubei, China
- Gang Li
  - Center for Studies of Information Resources, Wuhan University, Wuhan, Hubei, China