How to merge Scopus and Web of Science (WoS) databases to use on Bibliometrix or Mendeley
- Published 20 Jul 2024
- This video shows two approaches to merging Scopus and Web of Science (WoS) databases. The first one generates an xlsx file without duplicates to be loaded on Bibliometrix. The second is simpler and generates a bib file without duplicates to be uploaded on Mendeley (or any other repository that manages bib files).
0:00 Introduction
0:23 Downloading R and RStudio
0:51 Exporting bib file from Scopus
2:15 Exporting bib file from Web of Science (WoS)
3:53 Merging bib files to generate an xlsx file using RStudio
7:02 Uploading xlsx file in Biblioshiny (Bibliometrix's interface)
9:56 Merging bib files to generate a bib file using BibTex Tidy
12:27 Acknowledgment
Link to download R: cran.r-project.org/
Link to download RStudio: www.rstudio.com/products/rstu...
Link to BibTex Tidy: flamingtempura.github.io/bibt...
Code to be used in RStudio to generate an xlsx file without duplicates:
install.packages("bibliometrix") # if you don't have it installed
library(bibliometrix)
setwd("/home/rafael/merge-scopus-wos/bib") # folder containing the exported bib files
S <- convert2df("scopus.bib", dbsource = "scopus", format = "bibtex")
W <- convert2df("wos.bib", dbsource = "isi", format = "bibtex")
Database <- mergeDbSources(S, W, remove.duplicated = TRUE)
dim(Database) # number of remaining documents and of metadata fields
install.packages("openxlsx") # if you don't have it installed
library(openxlsx)
write.xlsx(Database, file = "database.xlsx")
Tip 1: When using Bibliometrix, some graphics may not be generated due to a "duplicate row.names are not allowed" error. It occurs when the last column (SR) of the xlsx database file contains non-unique values. This can happen when mergeDbSources fails to eliminate some duplicates (titles of the same work sometimes present subtle differences, so those entries are not matched).
To prevent this issue, you should edit the xlsx file after it's created by following these steps:
1. Remove any duplicate values in the DOI column (DI).
2. Remove any duplicate values in the SR column.
3. Ensure that the SR column has no missing values.
An easy way to accomplish these steps is to use conditional formatting in Excel. Go to Home > Conditional Formatting > Highlight Cells Rules > Duplicate Values and apply it to the DI column, and then to the SR column. Then manually delete any duplicate rows that are highlighted. By following these steps, you can avoid the "duplicate row.names are not allowed" error and ensure that your Bibliometrix analysis runs smoothly.
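If editing the spreadsheet by hand is tedious, the same three steps can be sketched in R before writing the xlsx file. This is a minimal sketch, not part of bibliometrix (the helper name drop_dup_rows is mine): it keeps the first occurrence of each non-missing DOI and of each SR value, which is what the conditional-formatting workflow above achieves manually.

```r
# Sketch of the three dedup steps above: keep the first occurrence of
# each non-missing DOI (DI) and of each SR value. Helper name is illustrative.
drop_dup_rows <- function(M) {
  dup_doi <- !is.na(M$DI) & duplicated(M$DI)  # later rows repeating a DOI
  dup_sr  <- duplicated(M$SR)                 # later rows repeating an SR
  M <- M[!(dup_doi | dup_sr), ]
  if (any(is.na(M$SR)))                       # step 3: SR must have no missing values
    warning("Some rows still have a missing SR value")
  M
}

# Example: row 2 (repeated DOI) and row 4 (repeated SR) are dropped
M <- data.frame(DI = c("10.1/a", "10.1/a", NA, "10.1/b"),
                SR = c("SMITH J, 2020", "SMITH J, 2020-a",
                       "LEE K, 2021", "LEE K, 2021"))
drop_dup_rows(M)
```

You can apply this to the merged data frame before write.xlsx, or read the generated file back in with openxlsx::read.xlsx, clean it, and write it out again.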
Tip 2: When merging the bib files with Bibtex Tidy, choose to merge only based on "Matching DOIs". Using "Matching Keys" may delete different works of the same authors just because they share the same key.
Thanks!
Because WoS exports a maximum of 500 entries per file when you export full records and references, you sometimes end up with more than one exported file from WoS. The convert2df code you kindly shared here seems to work on one file at a time, and I had two, so I first used an editor (BibDesk) to merge the two .bib files into one, and then used your convert2df code. Everything worked just fine, thanks!
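For readers with several WoS exports: the convert2df documentation says the file argument accepts a vector of file names, so the 500-record chunks can be read and combined in one call, without an external editor. A short sketch (the file names are illustrative):

```r
library(bibliometrix)

# convert2df accepts a character vector of file paths, so multiple WoS
# exports can be loaded and combined in a single call
W <- convert2df(c("wos_records_1-500.bib", "wos_records_501-823.bib"),
                dbsource = "isi", format = "bibtex")
```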
Crystal clear! You saved my life, my friend.
Very clear and straight to the point. Thank you very much for the great help, Rafael!
Thanks Rafael! I tried many other tutorials, but this was the only one that worked.
Hi Rafael
Thanks for a wonderful tutorial. It was really awesome. The best part of this post is the two tips. Two thumbs up!!!
I would suggest an alternative way to remove duplicates from the Excel sheet.
1. Select the column in which you want to find duplicates.
2. Go to the "Data" tab in the Excel ribbon.
3. Click on the "Remove Duplicates" button in the "Data Tools" group.
4. A dialog box will appear with the column(s) selected for duplicate removal. Make sure the correct column is selected, and expand the selection to include all columns in the Excel sheet.
5. Click the "OK" button.
Excel will remove any duplicate values from the selected column, and you will be left with only unique values.
In my case, I was able to remove 20 duplicates from an initial 3439 rows!!
After that, Biblioshiny worked without the "duplicate row.names are not allowed" error.
So again a big big thanks for your guidance. Keep up the good work!!!
Thank you so much for providing such a clear solution
Thank you Rafael!!! super helpful
Thanks! Your video was very helpful!
Thank you very much. I'll try
I am trying to merge the Scopus and Dimensions databases, is this the same procedure? Thank you very much.
Thank you!
Thank you, Rafael. However, the combined Excel file had an "NA" error, so I was unable to use it in RStudio or VOSviewer.
thank you so much
Hi Rafael, Many thanks for the wonderful video. I want to add some more records (not identified by Scopus or WoS) to the final merged Excel file. Is there an easy way of adding them?
same question
Good morning. First of all, thank you very much: thanks to your video I was able to merge WoS and Scopus into a single database. The only thing is that I am getting an error in Biblioshiny with the "Countries' Collaboration World Map" graphic. Is this normal? I followed your advice to manually remove the duplicates left in the generated Excel file, but that graph still does not work for me. Thanks in advance.
Perfect 🤩🤩
Hello
Thank you very much for the video.
I am getting the following error.
I wonder what I should do?
I would really appreciate if you can help
Error in mergeDbSources(wos, scopus_1, scopus_2, remove.duplicated = T):
could not find function "mergeDbSources"
How to merge PubMed and WoS data?
Is it possible to download high-resolution or vectorial images from the Biblioshiny interface?
Yes, you can set the dpi up to 600 in some exports.
What if I want to merge into a bib file instead of an xlsx file?
Thanks! Is it possible to merge WoS, Scopus and ScienceDirect databases to upload to Bibliometrix?
Hi Claudia,
I believe there is no widespread method for merging these three specific databases. You may need to develop specific code to combine them. However, I think it doesn't make much sense, since Scopus already covers (with title, abstract and other metadata) most of the articles present in ScienceDirect (where the full texts are available).
The coding part is confusing. Can you simplify the process more?
Sir, after merging I get the data in capital letters only, and the spacing between words is also disturbed. Please help.
Hi
Can I use it in vosviewer?
Can convert2df read an IEEE bib file? I used convert2df with dbsource = "scopus", and I got an error:
Missing fields: AU DE ID C1 CR
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 0, 55
Thanks for this video! After successfully merging a WOS and SCOPUS database and removing a large number of duplicates, I still receive the error message "double row names". Any ideas?
Hi, I have the same problem, did you manage to solve it?
I am also having the same problem.
Check the tips in the description.
Thank you so much, Rafael Queiroz. It's very clear and helpful for me in carrying out my MSc dissertation.
Hi @Rafael,
The method you describe works well, but I need some information. I downloaded files from Scopus and WoS with all information included, but when I merged the files and opened the result in Bibliometrix I got a "Number of cited references" error, and the cited references are completely missing. Can you help me with this?
Same problem
Is there a way to get a list of the removed duplicates? thank you for the video!
Hi, I have the same problem, did you manage to solve it?
@@victorcarlosarruda Any solutions please!!!
Is it possible to share this code?
Hello, can help me to get publications from WoS?
Thanks man
I am receiving this error despite selecting all fields in Scopus:
Missing fields: AU DE ID C1 CR
Error in data.frame(..., check.names = FALSE) :
arguments imply differing number of rows: 0, 2583
How to resolve it?
same problem
Hi
I am getting an error while running the code.
Hey John Paul, I am also getting the error. Did you find the solution? It would be of much help if you share it with me.
@@user-mg8cd2mc8x Currently I am doing it with one database.