Frequently Asked Questions
Last updated: August 1, 2024
Why Search Smart?
- Adaptation of the 'Query Hit Count' method using different types of queries, allowing the assessment of different academic database types (see the sketch after this list). A big shout-out goes to the members of the EC3 Research Group at the Universidad de Granada for the groundwork and inspiration! Research ideas that helped build Search Smart include groundwork on query hit data by Orduña-Malea et al. (2014) and database comparisons by Ortega (2014).
- Application of 'metamorphic testing' to academic databases allowing their assessment despite limited data availability. Over the years, we developed and tested various metamorphic relations that help assess the qualities and limitations of specific search functionalities. At a basic level, these relations test how variations in the search queries translate into changes in the search results. To learn about some basic principles of 'metamorphic testing' you can read, e.g., Segura et al. (2020). For more information on the tests we perform, read the FAQ section 'Our testing'.
- Introduction of the 'Search Triangle' notion to academic search, specifying that databases and search heuristics need to be matched to the goals of different search types.
- Introduction of the 'Basket of Keywords' method allowing the assessment of subject coverage of a large number of databases.
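To illustrate the 'Query Hit Count' idea in code, here is a minimal sketch: identical probe queries are submitted to several systems, and the hit counts they report are compared. The helper `get_hit_count` and its signature are illustrative, not part of Search Smart's actual implementation:

```python
# Minimal sketch of the 'Query Hit Count' (QHC) idea: submit identical
# probe queries to several systems and compare the hit counts they report.

def get_hit_count(system: str, query: str) -> int:
    """Hypothetical helper: submit `query` to `system`, return the reported hit count."""
    raise NotImplementedError  # real measurements call each system's search interface

def total_hits(systems: list[str], probe_queries: list[str]) -> dict[str, int]:
    """Sum the reported hits per system over a fixed set of probe queries."""
    return {s: sum(get_hit_count(s, q) for q in probe_queries) for s in systems}
```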
- Gusenbauer, M. (2019). Google Scholar to Overshadow Them All? Comparing the Sizes of 12 Academic Search Engines and Bibliographic Databases. Scientometrics, 118(1), 177–214. https://doi.org/10.1007/s11192-018-2958-5
- Gusenbauer, M., & Haddaway, N. R. (2020). Which Academic Search Systems are Suitable for Systematic Reviews or Meta-Analyses? Evaluating Retrieval Qualities of Google Scholar, PubMed and 26 other Resources. Research Synthesis Methods, 11(2), 181–217. https://doi.org/10.1002/jrsm.1378
- Gusenbauer, M., & Haddaway, N. R. (2021). What every Researcher should know about Searching – Clarified Concepts, Search Advice, and an Agenda to improve Finding in Academia. Research Synthesis Methods, 12(2), 136–147. https://doi.org/10.1002/jrsm.1457
- Gusenbauer, M. (2022). Search where you will find most: Comparing the disciplinary coverage of 56 bibliographic databases. Scientometrics, 127, 2683–2745. https://doi.org/10.1007/s11192-022-04289-7
- Orduña-Malea, E., Ayllón, J. M., Martín-Martín, A., & Delgado López-Cózar, E. (2014). About the size of Google Scholar: playing the numbers. EC3 Working Papers, 18(23).
- Ortega, J. L. (2014). Academic Search Engines: A quantitative outlook. Chandos information professional series. Chandos Publishing/Elsevier.
- Segura, S., Towey, D., Zhou, Z. Q., & Chen, T. Y. (2020). Metamorphic Testing: Testing the Untestable. IEEE Software, 37(3), 46–53. https://doi.org/10.1109/MS.2018.2875968
As of August 1, 2024, we cover 106 databases and their search systems. Here is the list, sorted alphabetically:
- ABI/Inform Global (via ProQuest)
- Academia.edu
- Academic Search Elite (via EBSCOhost)
- Academic Search Premier (via EBSCOhost)
- ACM Guide to Computing Literature
- ACS Publications
- AIS eLibrary
- APA PsycInfo (via EBSCOhost)
- APA PsycInfo (via Ovid)
- APA PsycNet
- Arts & Humanities Citation Index (via Web of Science)
- arXiv
- Bielefeld Academic Search Engine
- Biological Science Database (via ProQuest)
- BIOSIS Citation Index (via Web of Science)
- Business Source Premier (via EBSCOhost)
- CAB Abstracts (via Ovid)
- CAS SciFinder-n
- CINAHL (via EBSCOhost)
- CINAHL Plus (via EBSCOhost)
- ClinicalTrials.gov
- CNKI Overseas
- Cochrane Database of Systematic Reviews
- Cochrane Library - CENTRAL
- Conference Proceedings Citation Index - Science (via Web of Science)
- Conference Proceedings Citation Index - Social Science & Humanities (via Web of Science)
- CORE
- Crossref
- dblp
- Dimensions - Clinical Trials
- Dimensions - Publications
- Dimensions - Publications (free)
- Dissertations & Theses Global (via ProQuest)
- DOAJ
- Earth, Atmospheric & Aquatic Science Database (via ProQuest)
- EconBiz
- EconLit (via EBSCOhost)
- EconStor
- Embase (via Ovid)
- Emerald Insight
- Emerging Sources Citation Index (via Web of Science)
- Environmental Science Database (via ProQuest)
- Epistemonikos
- ERIC
- ERIC (via EBSCOhost)
- EU Clinical Trials Register
- Europe PMC
- Food Science and Technology Abstracts (via EBSCOhost)
- GeoRef (via ProQuest)
- Google Scholar
- GreenFILE (via EBSCOhost)
- HAL
- IEEE Xplore
- Ingenta Connect
- International Bibliography of the Social Sciences (via ProQuest)
- Internet Archive Scholar
- JSTOR
- Lens
- Medline (via EBSCOhost)
- Medline (via Ovid)
- Medline (via Web of Science)
- Mendeley
- Naver Academic
- Nursing & Allied Health Database (via ProQuest)
- Open Access Theses and Dissertations
- OpenAIRE
- OpenAlex
- Overton - Scholarly Articles
- Overton - Policy Documents
- Paperity
- Pascal and Francis Bibliographic Databases
- Policy Commons
- Psychology & Behavioral Sciences Collection (via EBSCOhost)
- Public Health Database (via ProQuest)
- Public Library of Science (PLOS)
- PubMed
- RePEc (via EconPapers)
- RePEc (via IDEAS)
- ResearchGate
- SAGE Journals Online
- ScanMedicine
- Science Citation Index Expanded (via Web of Science)
- ScienceDirect
- ScienceOpen
- Scilit
- Scinapse
- Scite
- SciTech Premium Collection (via ProQuest)
- Scopus
- Semantic Scholar
- Social Sciences Citation Index (via Web of Science)
- Social Sciences Premium Collection (via ProQuest)
- SocINDEX (via EBSCOhost)
- Sociological Abstracts (via ProQuest)
- SPORTDiscus (via EBSCOhost)
- SpringerLink
- SSRN
- Taylor and Francis Online
- Virtual Health Library
- Web of Science Core Collection
- WHO International Clinical Trials Registry Platform
- Wiley Online Library
- World Transit Research
- WorldCat - Article/chapter only
- WorldCat - Thesis/dissertation
- zbMATH Open
Databases we explicitly decided NOT to cover (see selection criteria under 'Our testing'):
- ArnetMiner (QHC data issues)
- OSF Preprints (QHC data issues)
- SciELO (non-English focus)
- Transport Research International Documentation (QHC data issues)
- WorldWideScience (QHC data issues)
Specific advice
- MUST: Select the subject(s) that are relevant to you.
- MUST: Filter for the system functionality that is important to you, either via one of our three presets or via the many options provided in the filter list.
- MUST: Sort the best options according to your needs: 'most coverage' ('total coverage', 'abs. subj. cov.', 'record type coverage') vs. 'most specialized' in your subject ('rel. subj. cov.'), etc.
- OPTIONAL: If you do not have institutional access to paywalled systems (e.g., ProQuest, EBSCOhost, Ovid, Web of Science), select 'Non-paywalled databases' and only freely accessible options will show.
- What: the database(s) you accessed and the search system(s) that provided access to them. Sometimes the database is the same as the search system (e.g., 'Scopus' is the name of both the database and the system), and sometimes the database (e.g., Medline or Embase) can be accessed through one or more separate search systems (e.g., 'EBSCOhost', 'Ovid', 'Web of Science'). ATTENTION: 'Web of Science' in particular delivers different versions to institutions, which mostly differ in their retrospective coverage, so always report the underlying indices you searched. The 'Web of Science Core Collection' consists of a varying set of constituent databases that differs from university to university.
- When: date of search
- How: the exact search query that you used (incl. keywords, operators, field codes) and the filters you employed
- How many: how many records did you identify (keep a copy of the export of records for future reference)
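A hypothetical example covering all four elements: 'On 1 August 2024, we searched Medline via Ovid with the query (hypertension AND (adherence OR compliance)).ab., applying no filters, and identified 1,482 records; the full export is archived alongside the review protocol.'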
- After you have selected all the criteria that are important to you...
- Click 'Save selection to PDF'
- Done. Review the sorted results with all search filters you selected.
Presets
- 'Systematic keyword searching' (an example query follows this list):
  - Minimum search string length (narrow field code): 25 or more
  - Verbatim queries
  - Reproducible queries over time/place
  - Boolean OR
  - Boolean AND
  - Boolean operators work exactly
  - Field code 'abstract'
  - Nested search (parentheses)
  - Accessible records: 1,000 or more (systematic searches will, in most cases, go well beyond the first results page)
  - Bulk select records
  - Bulk export records: 50 or more at a time
- 'Backward citation searching':
  - Backward citation information
  - Accessible records: 1,000 or more
  - Bulk select records' backward citations
  - Bulk export backward citations: 500 or more at a time
- 'Forward citation searching':
  - Forward citation information
  - Accessible records: 1,000 or more
  - Bulk select records' forward citations
  - Bulk export forward citations: 500 or more at a time
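For illustration, an EBSCOhost-style query such as AB ( ("machine learning" OR "deep learning") AND (healthcare OR medicine) ) exercises several of the 'Systematic keyword searching' requirements at once: Boolean AND and OR, nested parentheses, the 'abstract' field code, and a search string comfortably above the minimum length.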
Functions
- Subject coverage: share of records from a specific subject
- Keyword coverage: prevalence of records including specific keywords in the title
- Record type coverage: share of records from a specific type
- Retrospective coverage: share of records from a specific time range
- Open access coverage: share of open access records
- Interface: some basic search options the search interface offers
- Sorting options: how you may sort search results on the database
- Query: how well you can perform queries on a database
- Field codes: what field codes you may use to narrow your search to specific areas/meta-data of a record
- Operators: what operators you may use to construct a keyword query and whether they work as expected
- Pre-/Post-query filters: what filter types (facets) the search interface offers to narrow down your search results
- Record type filters: what record types you may filter the search results with
- Citation search: what types of citation searches the database offers (incl. suggestions of related records)
- Retrieval: what retrieval options the database offers
- Export formats: what formats search results may be exported with
- Alphabetically (A-Z)
- Alphabetically (Z-A)
- Total coverage (descending) - called 'clinical trials coverage' in 'clinical trials' view
- Subject coverage (absolute / descending)
- Subject coverage (relative / descending)
- Keyword coverage (absolute / descending)
- Keyword coverage (relative / descending)
- Record type coverage (absolute / descending)
- Record type coverage (relative / descending)
- Subject x Record type coverage (absolute / descending)
- Subject x Record type coverage (relative / descending)
- Open access coverage (descending) - not available in 'clinical trials' view
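To illustrate the difference between the absolute and relative coverage sortings above with hypothetical numbers: a database holding 2,000,000 records, 300,000 of which fall into a given subject, has an absolute subject coverage of 300,000 records and a relative subject coverage of 15%. Sorting by the absolute metric favors large generalist databases, while the relative metric surfaces specialized ones.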
- General information about the database, such as its owner, headquarters, year of launch, or a textual description from the system provider.
- Retrospective coverage illustrated by a line chart.
- Subject coverage illustrated by a donut chart.
- Record type coverage illustrated by a donut chart.
- Presets assessment: systematic keyword searching, forward citation searching, grey literature searching.
- Detailed information on search functionalities: interface, query, operators, citation searching/filtering, retrieval.
To compare up to six databases, you can pin them and press the yellow 'compare' button. Change your selection on the go and find out which databases best fit your purpose.
You can save your filter settings by copying the URL in the browser tab or using the 'share' button. The URL automatically captures the filters and sorting options you selected. You can re-use these filters later to account for changes that have occurred in systems and databases in the meantime. As databases update and functionalities change, your list of best databases will change too. So keep revisiting Search Smart to find out about the latest changes and new databases that might do your job even better.
You can copy the URL directly, share it with peers, or extract it from a PDF you created to save your selection of databases.
Saving a PDF with your selection of databases stores the state of the search system and database landscape captured by Search Smart at that time. You can append the document to reviews and research papers to justify database selection. The export reflects the filter criteria that were important to you at that specific point in time and the state the search systems and databases were in.
Moreover, the specific filter settings of your selection are also stored in the PDF. This way you can re-evaluate your selection of databases after some time and incorporate changes that occurred in the meantime.
- Gusenbauer, M. (2023). A free online guide to researchers’ best search options. Nature, 615, 586. https://doi.org/10.1038/d41586-023-00845-0
Our testing
We want to include as many search systems and databases as possible. We started with 70 databases and frequently add new ones.
To be consistent in our testing and to warrant comparability, we need to focus on certain types of databases. Therefore, we have specific selection criteria for the inclusion of databases:
- Databases with mostly scholarly content, e.g., journal articles, conference papers, academic books. However, it can be difficult to determine whether a database is 'scholarly' and how much scholarly content it has.
- Databases with a disciplinary focus on at least one of the 26 ASJC subjects. Very narrowly focused databases are excluded, e.g., databases on theatre studies, a sub-discipline of Arts and Humanities.
- Mostly English content (as much as we would love to include non-English-focused databases, this is not possible due to our testing procedures)
- We focus first on large databases with more than 1,000,000 records, yet we also include smaller ones if they are popular. Clinical trials databases need to cover at least 100,000 trials. Often, smaller databases are included in the larger databases we cover anyway.
- Databases that inform about query hit counts, i.e., the number of hits a keyword query retrieves.
We want to include a diverse set of databases. We deliberately include open and paywalled, newer, and established databases in the same comparison. This helps readers assess their options compared to the databases they might already know.
There are many new, innovative search systems out there. We would like to include them all in our comparison. Yet, we need to stick to the requirements defined above. If you know of a system we are still missing, please contact us, and we will happily review it.
We used the number of search results for specific keyword queries to determine the subject coverage of databases. For each of the 26 subjects, we chose 14 keywords identified as most representative of the subject. We established representativeness by selecting keywords that were most prevalent in the titles of articles in the subject in question while being least prevalent in the article titles of other subjects. This way, keywords were determined based on discriminant validity rather than chance, popularity, or frequency. Following this approach, we did not include terms such as 'evolution' that may have different meanings in different disciplines. Rather, we chose keywords such as 'boson' that are used almost exclusively in one discipline, in this case Physics.
The only keyword we included that does not follow this logic is 'covid-19', which we included as a multi-disciplinary keyword of great relevance. Thus, users can determine the prevalence of 'covid-19' in scholarly databases we analyze.
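Here is a minimal sketch of the discriminant keyword selection described above, assuming per-subject frequencies of keywords in article titles have already been computed (the data structure and function name are illustrative, not Search Smart's actual code):

```python
# Illustrative discriminant-validity scoring for the 'Basket of Keywords':
# a keyword marks a subject well if it is frequent in that subject's article
# titles and rare in the titles of all other subjects.

def discriminance(keyword: str, subject: str,
                  title_freq: dict[str, dict[str, float]]) -> float:
    """title_freq[subject][keyword] = share of titles in `subject` containing `keyword`."""
    in_subject = title_freq[subject].get(keyword, 0.0)
    elsewhere = max((freqs.get(keyword, 0.0)
                     for other, freqs in title_freq.items() if other != subject),
                    default=0.0)
    return in_subject / (elsewhere + 1e-9)  # high score = subject-specific keyword

# A polysemous term like 'evolution' scores low (common in many subjects),
# while 'boson' scores high (almost exclusive to Physics).
```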
Detailed information on our methodology can be found here: Gusenbauer, M. (2022). Search where you will find most: Comparing the disciplinary coverage of 56 bibliographic databases. Scientometrics, 127, 2683–2745. https://doi.org/10.1007/s11192-022-04289-7
We run automatic tests at periodic intervals. The intervals differ for each test and can be weeks or months. As many search systems do not change frequently, update intervals of several months are not an issue. In general, databases update only gradually, with steadily increasing coverage counts, making coverage assessments valid over long periods. If you find a system has updated and this is not yet reflected in our data, don't hesitate to get in touch with us.
As most (proprietary) system providers do not allow direct access to databases, metamorphic testing is the next best thing to independently verify the workings of their claimed functionalities.
Metamorphic testing cannot prove functionality works; it can only show whether it is plausible to assume that a system works with respect to the functionalities a test covers.
This means it is plausible to assume that a system provides some functionality if it has passed our tests. Yet, we can never rule out that the system fails in circumstances our tests do not cover. We design our metamorphic tests to be sensitive to system flaws, yet we cannot guarantee that a system always works. In this regard, Search Smart is nevertheless a huge step forward from the situation before, where there was no systematic testing and monitoring of the systems millions of researchers build their research on.
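To make this concrete, here is a minimal sketch of typical metamorphic relations over query hit counts, assuming the caller supplies a function that queries a system and returns its reported hit count. The relations shown are generic set-logic properties, not Search Smart's exact test suite:

```python
from typing import Callable

def check_boolean_relations(get_hit_count: Callable[[str, str], int],
                            system: str, a: str, b: str) -> None:
    """Metamorphic relations that any exact Boolean search must satisfy."""
    h_a = get_hit_count(system, a)
    h_b = get_hit_count(system, b)
    h_and = get_hit_count(system, f"{a} AND {b}")
    h_or = get_hit_count(system, f"{a} OR {b}")

    assert h_and <= min(h_a, h_b), "AND returned more hits than a single term"
    assert h_or >= max(h_a, h_b), "OR returned fewer hits than a single term"
    # Inclusion-exclusion must hold exactly for set-based retrieval:
    assert h_a + h_b == h_or + h_and, "counts inconsistent with exact Boolean logic"
```

Passing such checks makes correct Boolean behavior plausible; it does not prove it, which is exactly the limitation described above.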
About
During my Ph.D. I felt the pain of not knowing whether I was using the best search options or not.
Search Smart is the tool I wish I had had ten years ago when I started figuring out how to search (systematically). Identifying all literature on an ambiguously defined and labeled topic was no easy task. Armed with a search string of well over a thousand words, I quickly learned about different search systems' limitations. While some systems could process my queries, others could not. I learned the hard way how each one was different.
Search Smart is the product of a curiosity to understand and systematize academic search options. Since then, the goal has been to improve searching with increased transparency and better guidance.
"When content is abundant, content curation is an art." (Zarrin)
Just as explorers must select the most appropriate telescope, explorers of the web must select the most appropriate search system. These lenses direct their views and determine what they will encounter.
Search Smart's mission is to guide academics to their best search options, without complication: meaningful and comprehensive testing of academic search systems allows users to identify their best options with a few clicks.
Often, users are unaware of the great discrepancies in performance and functionality between systems. They tend to use what is popular among peers without being aware of alternatives. We want to make database selection more explicit and dynamic, so users can be confident they are searching with the best option at hand.
Search Smart's vision is to foster educated researchers who use the best search options available. Heightened user expectations and a transparent search system landscape boost the continuous improvement of search systems competing to be most fit-for-purpose.
We want to employ increasingly thorough testing to give an increasingly comprehensive overview of the landscape of scholarly databases. Search Smart should stir up healthy competition among search providers, encourage researchers to demand more from search systems, and help providers better understand which functionalities researchers need in the pursuit of 'good science'.
Other
Here is a taxonomy of the most important terms we use at Search Smart:
- Basket of keywords (BOK): Novel analytic method based on specific keyword queries allowing the assessment of subject coverage of a large number of bibliographic databases in academia.
- Coverage: The prevalence of certain record types on a database.
- Database: The underlying (bibliographic) database accessible via a search system.
- Functionalities: The capabilities a search system offers users to search, filter, retrieve and manipulate search results.
- Preset: Pre-defined filter setting that selects multiple filters at once.
- Query hit count (QHC): Number of hits for a specific keyword query.
- Records: Any type of file hosted on a database.
- Search provider: The organization that operates the search system providing access to one or multiple databases.
- Search system: The system through which a database is accessed. Can be a search engine, an aggregator, or some other type.
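A minimal sketch of how these terms relate, modeled in Python with examples taken from the database list above (the class is illustrative; the providers named are the organizations operating each system):

```python
from dataclasses import dataclass

@dataclass
class SearchSystem:
    name: str              # the search system users interact with
    provider: str          # the organization operating the system
    databases: list[str]   # databases the system provides access to

# One database (Medline) reachable through several search systems:
systems = [
    SearchSystem("EBSCOhost", "EBSCO", ["Medline", "CINAHL", "EconLit"]),
    SearchSystem("Ovid", "Wolters Kluwer", ["Medline", "Embase", "APA PsycInfo"]),
    SearchSystem("Web of Science", "Clarivate", ["Medline", "BIOSIS Citation Index"]),
]
```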
The tests Search Smart performs and the results it illustrates are collected independently of search system providers. Thus, providers cannot directly access or manipulate test results.
Nevertheless, providers can supply detailed information on the workings of the underlying search mechanisms and the databases they access. This helps Search Smart portray the characteristics of the search systems accurately. Please do not hesitate to contact us, particularly if you feel certain information may be inaccurate. We will look into any inconsistencies to make sure we capture database and search interface characteristics as accurately as possible.
If you would like to see a different logo representing your database, please get in touch with us, and we will be happy to change it.