How to merge Scopus and Web of Science (WoS) databases to use on Bibliometrix or Mendeley

  • Published 20 Jul 2024
  • This video shows two approaches to merging Scopus and Web of Science (WoS) databases. The first one generates an xlsx file without duplicates to be loaded on Bibliometrix. The second is simpler and generates a bib file without duplicates to be uploaded on Mendeley (or any other repository that manages bib files).
    0:00 Introduction
    0:23 Downloading R and RStudio
    0:51 Exporting bib file from Scopus
    2:15 Exporting bib file from Web of Science (WoS)
    3:53 Merging bib files to generate an xlsx file using RStudio
    7:02 Uploading xlsx file in Biblioshiny (Bibliometrix's interface)
    9:56 Merging bib files to generate a bib file using BibTeX Tidy
    12:27 Acknowledgment
    Link to download R: cran.r-project.org/
    Link to download RStudio: www.rstudio.com/products/rstu...
    Link to BibTeX Tidy: flamingtempura.github.io/bibt...
    Code to be used in RStudio to generate an xlsx file without duplicates:
    install.packages("bibliometrix") # if you don't have it installed
    setwd("/home/rafael/merge-scopus-wos/bib") # folder with the exported bib files; change to your own path
    library(bibliometrix)
    # Convert each export into a bibliometrix data frame
    S = convert2df("scopus.bib", dbsource = "scopus", format = "bibtex")
    W = convert2df("wos.bib", dbsource = "isi", format = "bibtex")
    # Merge both sources and drop the duplicated records
    Database = mergeDbSources(S, W, remove.duplicated = TRUE)
    dim(Database) # rows (documents) and columns (fields) after merging
    install.packages("openxlsx") # if you don't have it installed
    library(openxlsx)
    write.xlsx(Database, file = "database.xlsx") # saved in the working directory
    Tip 1: When using Bibliometrix, some graphics may fail to generate with a "duplicate row.names are not allowed" error. This happens when the last column (SR) of the xlsx database file contains non-unique values, which can occur if the mergeDbSources function in the R code misses some duplicates (titles of the same work sometimes differ subtly between databases, so the records are not recognized as duplicates).
    To prevent this issue, you should edit the xlsx file after it's created by following these steps:
    1. Remove any duplicate values in the DOI column (DI).
    2. Remove any duplicate values in the SR column.
    3. Ensure that the SR column has no missing values.
    An easy way to accomplish these steps is to use a color filter in Excel. Go to Home - Conditional Formatting - Highlight Cells Rules - Duplicate Values and apply the filter to the DI column, and later to the SR column. Then, manually delete any duplicate rows that are highlighted in the filtered view. By following these steps, you can avoid the "duplicate row.names are not allowed" error and ensure that your Bibliometrix analysis runs smoothly.
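    The three clean-up steps above can also be done in R before the xlsx file is written. The following is a minimal sketch, not part of the original code: `dedupe_records` is a hypothetical helper, the sample records are made up, and dropping rows with a missing SR is only one assumed way to satisfy step 3 (filling the value in by hand is the alternative).

```r
# Sketch of Tip 1's clean-up steps for a data frame that has the DI (DOI)
# and SR columns. `dedupe_records` is a hypothetical helper name.
dedupe_records <- function(db) {
  # Step 1: drop rows whose non-missing DOI repeats an earlier row
  # (compared case-insensitively, ignoring surrounding whitespace).
  doi <- toupper(trimws(db$DI))
  has_doi <- !is.na(doi) & nzchar(doi)
  db <- db[!(has_doi & duplicated(doi)), , drop = FALSE]

  # Step 3 (assumption): drop rows with a missing or empty SR
  # instead of filling the value in manually.
  db <- db[!is.na(db$SR) & nzchar(db$SR), , drop = FALSE]

  # Step 2: drop rows whose SR repeats an earlier row.
  db[!duplicated(db$SR), , drop = FALSE]
}

# Made-up records for illustration only.
example_db <- data.frame(
  DI = c("10.1000/abc", "10.1000/ABC", NA, NA, "10.1000/xyz"),
  SR = c("SILVA A, 2020, J X", "SILVA A, 2020, J X-a", "LEE B, 2021, J Y",
         "LEE B, 2021, J Y", NA),
  stringsAsFactors = FALSE
)
clean_db <- dedupe_records(example_db)
print(clean_db)
```

    On the merged data this would be called as Database = dedupe_records(Database) just before write.xlsx. Running dim(Database) before and after shows how many records were dropped.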
    Tip 2: When merging the bib files with BibTeX Tidy, choose to merge only based on "Matching DOIs". Using "Matching Keys" may delete different works by the same authors just because they share the same key.
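    The difference between the two matching modes can be illustrated in R with a toy example. This is only a sketch of the idea, using made-up keys and DOIs; it does not reproduce how BibTeX Tidy works internally.

```r
# Made-up entries: two different papers share the key "Silva2020",
# while one paper was exported twice under the keys "Lee2021" and "Lee2021a".
entries <- data.frame(
  key = c("Silva2020", "Silva2020", "Lee2021", "Lee2021a"),
  doi = c("10.1000/aaa", "10.1000/bbb", "10.1000/ccc", "10.1000/ccc"),
  stringsAsFactors = FALSE
)

# Rows that a key-based match would treat as duplicates.
dup_by_key <- duplicated(entries$key)
# Rows that a DOI-based match would treat as duplicates.
dup_by_doi <- duplicated(entries$doi)

print(entries[dup_by_key, ])  # a distinct paper that key matching would discard
print(entries[dup_by_doi, ])  # the same paper listed twice under different keys
```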

COMMENTS • 46

  • @julioareck 1 year ago +1

    Thanks!
    Because WoS only generates a maximum of 500 entries per datafile if you export full record and references, sometimes you end with more than one exported file from WoS. The convert2df code you kindly shared here seems to work for one file at a time and I had two, so I first used an editor (BibDesk) to merge the two .bib files into one, and then used your convert2df code. Everything worked just fine, thanks!
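    For several WoS exports, the manual merge described above can be approximated in R by appending the files, since a .bib file is plain text. This is a sketch under assumptions: `concat_bib_files` is a hypothetical helper, the file names are made up, and the files are assumed to be UTF-8. Depending on the installed bibliometrix version, convert2df may also accept a vector of file names directly, so ?convert2df is worth checking first.

```r
# Hypothetical helper: append several .bib exports into one file so that a
# single convert2df() call can read it.
concat_bib_files <- function(paths, out) {
  # Read each file as plain lines and add a blank line between files.
  chunks <- lapply(paths, function(p) {
    c(readLines(p, warn = FALSE, encoding = "UTF-8"), "")
  })
  # useBytes = TRUE writes the text back without re-encoding it.
  writeLines(unlist(chunks), out, useBytes = TRUE)
  invisible(out)
}

# Self-contained demonstration with two tiny made-up exports in a temp folder.
tmp <- tempdir()
part1 <- file.path(tmp, "wos_part1.bib")
part2 <- file.path(tmp, "wos_part2.bib")
writeLines(c("@article{a1,", "  title = {First},", "}"), part1)
writeLines(c("@article{b2,", "  title = {Second},", "}"), part2)

merged <- file.path(tmp, "wos.bib")
concat_bib_files(c(part1, part2), merged)

# Count the entries in the combined file (lines starting with "@").
n_entries <- sum(grepl("^@", readLines(merged)))
print(n_entries)
```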

  • @hughliu7427 1 year ago

    Crystal clear! You saved my life, my friend.

  • @dfgoulart 1 year ago

    Very clear and straight to the point. Thank you very much for the great help, Rafael!

  • @SantiagoAldunateSalas 8 months ago

    thanks rafael! i tried many other tutorials, but this was the only one that worked

  • @umermuhammad826 1 year ago +1

    Hi Rafael
    Thanks for a wonderful tutorial. It was really awesome. The best part of this post is the two tips. Two thumbs up!!!
    I would suggest an alternative way to remove duplicates from the Excel sheet.
    1. Select the column in which you want to find duplicates.
    2. Go to the "Data" tab in the Excel ribbon.
    3. Click on the "Remove Duplicates" button in the "Data Tools" group.
    4. A dialog box will appear with the column(s) selected for duplicate removal. Make sure the correct column is selected. Also make sure you expand the selection to include all columns in the Excel sheet.
    5. Click the "OK" button.
    Excel will remove every row whose value in the selected column repeats an earlier row, and you will be left with only unique values in that column.
    In my case, I was able to remove 20 duplicates from an initial 3439 rows!!
    After that Biblioshiny worked without the "duplicate row.names are not allowed" error.
    So again a big big thanks for your guidance. Keep up the good work!!!

  • @Lily-cl6zk 1 year ago

    Thank you so much for providing such a clear solution

  • @dutchi9030 1 month ago +1

    Thank you Rafael!!! super helpful

  • @camilakersten9061 8 months ago

    Thanks! Your video was very helpful!

  • @MAgphinRamadhan 1 year ago

    Thank you very much. I'll try

  • @bpcordero 1 year ago +2

    I am trying to merge the Scopus and Dimensions databases, is this the same procedure? Thank you very much.

  • @windersconsultants 25 days ago

    Thank you!

  • @alissaht 6 days ago

    Thank you, Rafael. However, the combined Excel file had an "na" error, so I was unable to use it in RStudio or VOSviewer.

  • @Lizzzzetazetazeta 1 year ago

    thank you so much

  • @champikakariyawasam2378 8 months ago +1

    Hi Rafael, Many thanks for the wonderful video. I want to add some more records (not identified by Scopus or WoS) to the final merged Excel file. Is there an easy way of adding them?

    • @Xiaoyu17 7 months ago

      same question

  • @elenafernandezdiaz9105 11 months ago

    Good morning. First of all, thank you very much: thanks to your video I was able to merge WoS and Scopus into a single database. The only thing is that I am getting an error in Biblioshiny on the "Countries' Collaboration World Map" graphic. Is this normal? I followed your advice and manually removed the duplicates that were left in the generated Excel file, but that graph still does not work for me. Thanks in advance.

  • @shubhamjain1334 1 year ago

    perfect🤩🤩

  • @user-ph2ry3wn3t 5 months ago

    Hello
    Thank you very much for the video.
    I am getting the following error.
    I wonder what I should do?
    I would really appreciate if you can help
    Error in mergeDbSources(wos, scopus_1, scopus_2, remove.duplicated = T):
    could not find function "mergeDbSources"

  • @noorpk 1 year ago

    How to merge PubMed and WoS data?

  • @mizhimo 1 year ago

    Is it possible to download high-resolution or vectorial images from the Biblioshiny interface?

    • @rafaelsqueiroz 1 year ago

      Yes, you can set the dpi up to 600 in some exports.

  • @marianareyes2623 8 months ago

    What if I want to merge into a bib file instead of an xlsx file?

  • @claudiaperaza7472 1 year ago

    Thanks! Is it possible to merge WoS, Scopus and Science Direct databases to upload them to Bibliometrix?

    • @rafaelsqueiroz 1 year ago

      Hi Claudia,
      I believe there is no widespread method for merging these three specific databases. You may need to write custom code to combine them. However, I think it doesn't make much sense, since Scopus already covers (with title, abstract and other metadata) most of the articles present in Science Direct (where the full texts are available).

  • @alissaht 19 days ago

    the coding part is confusing. can you simplify the process more?

  • @satyaroshni8686 1 month ago

    Sir, after merging I get the data in capital letters only, and the spacing between words is also disturbed. Please help.

  • @aseeladil337 1 year ago

    Hi
    Can I use it in vosviewer?

  • @chianlee 5 months ago

    Could we read an IEEE bib file? I used convert2df with dbsource = "scopus", and I got an error:
    Missing fields: AU DE ID C1 CR
    Error in data.frame(..., check.names = FALSE) :
    arguments imply differing number of rows: 0, 55

  • @julianapattermann7661 1 year ago +2

    Thanks for this video! After successfully merging a WOS and SCOPUS database and removing a large number of duplicates, I still receive the error message "double row names". Any ideas?

  • @chianlee 1 year ago

    Thank you so much, Rafael Queiroz. It's very obvious and helpful for me in carrying on MSc dissertation.

  • @user-vn9kv7zg6v 10 months ago

    Hi @Rafael,
    The method you describe works well, but I need some help. I downloaded the files from Scopus and WoS with the option to include all information selected. When I merged the files and opened the result with Bibliometrix, I got an error on "Number of cited references", and the cited references are completely missing. Can you help me with this?

  • @icefunkdark8555 1 year ago

    Is there a way to get a list of the removed duplicates? thank you for the video!

    • @victorcarlosarruda 1 year ago

      Hi, I have the same problem, did you manage to solve it?

    • @salmaboulait6057 1 month ago

      @victorcarlosarruda any solutions please !!!

  • @amirardeshir815 1 year ago

    Is it possible to share this code?

  • @jean-baptistearchange3321 1 month ago

    Hello, can you help me to get publications from WoS?

  • @husamaljawhar4800 1 year ago

    Thanks man

  • @user-bf8vo1sc7z 1 year ago +1

    I am receiving this error despite selecting all fields in Scopus:
    Missing fields: AU DE ID C1 CR
    Error in data.frame(..., check.names = FALSE) :
    arguments imply differing number of rows: 0, 2583
    How to resolve it?

  • @john110503 1 year ago

    Hi
    I am getting an error while running the code

    • @user-mg8cd2mc8x 1 year ago

      Hey John Paul, I am also getting the error. Did you find the solution? It would be of much help if you shared it with me.

    • @john110503 1 year ago

      @user-mg8cd2mc8x currently I am working with just 1 database