Solution for a better literature review

FINANCIAL TECHNOLOGY AND OTHER RELATING ISSUES

3. Solution for a better literature review

To overcome the concerns mentioned above, researchers are advised to take the following steps. First, to address selection bias, they should carefully design and trial a search strategy and publish the search methods in an a priori protocol for peer review (Haddaway et al., 2020). Next, they should look for grey literature, using multiple bibliographic sources that cover both unpublished academic studies and organisational reports, to obtain evidence with which to test for publication bias. Finally, they need to carefully test and adopt a critical appraisal tool modelled on existing robust ones before starting the formal review process (Haddaway et al., 2020).

Many researchers have found R a great help in carrying out these activities. R is free, open-source, and easily accessible software, described as a “free software environment for statistical computing and graphics”. The tutorial below shows how to use R packages to achieve a better literature review.

Firstly, the R packages need to be installed and loaded.

Figure 1: Loading R packages
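A minimal sketch of the installation and loading step is given here. It assumes that the pacman package named in this section is used as the loader, that the other packages can be installed from CRAN, and that litsearchr is installed from its GitHub repository (elizagrames/litsearchr); the helper name load_review_packages is hypothetical.

```r
# Helper that installs (if needed) and loads the packages used in the tutorial.
# It is wrapped in a function so that nothing is installed until it is called.
load_review_packages <- function() {
  # pacman installs any missing package and then attaches it.
  if (!requireNamespace("pacman", quietly = TRUE)) {
    install.packages("pacman")
  }
  # Assumption: these packages are available from CRAN in your setup.
  pacman::p_load(dplyr, ggplot2, ggraph, igraph, readr, revtools)

  # Assumption: litsearchr is obtained from GitHub rather than CRAN.
  pacman::p_load_gh("elizagrames/litsearchr")
}

# load_review_packages()  # uncomment to install and load everything
```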

R [Version 4.1.1; R Core Team (2020)] and the R packages dplyr [Version 1.0.7; Wickham et al. (2021)], formatR [Version 1.11; Xie (2021)], ggplot2 [Version 3.3.5; Wickham (2016)], ggraph [Version 2.0.5; Pedersen (2020)], igraph [Version 1.2.6; Csardi & Nepusz (2006)], litsearchr [Version 1.0.0; Grames et al. (2019)], pacman [Version 0.5.1; Rinker & Kurkiewicz (2018)], papaja [Version 0.1.0.9997; Aust & Barth (2020)], readr [Version 2.0.1; Wickham & Hester (2020)], revtools [Version 0.4.1; Westgate (2019)], rmarkdown [Version 2.11; Xie et al. (2018); Xie et al. (2020)], and rticles [Version 0.21; Allaire et al. (2021)] were used for all analyses.

In this example, assume that the topic of interest is “green logistics.” The starting point is to use the two keywords “green” and “logistics” to carry out a basic and naive search.

The R package “litsearchr” then helps to suggest improvements that might capture more articles relevant to the topic and so build up the search strategy.

The keywords are applied to the search via Google Scholar, Web of Science, and Scopus. The search results are then downloaded to a folder the package “litsearchr” can read.

Figure 2: Commands to search
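The import step might look like the following sketch. The folder name naive_search is hypothetical, and the code only calls litsearchr's import_results() when both the folder and the package are present, so it can be run safely before any files have been downloaded.

```r
# Hypothetical folder holding the exported search results (e.g. .ris or .bib files).
search_dir <- "naive_search"

if (dir.exists(search_dir) && requireNamespace("litsearchr", quietly = TRUE)) {
  # import_results() reads the files in the folder into a single data frame.
  naive_results <- litsearchr::import_results(directory = search_dir)

  # Quick checks: number of records and the names of the available columns.
  print(nrow(naive_results))
  print(colnames(naive_results))
} else {
  message("Folder '", search_dir, "' or package litsearchr not found; nothing imported.")
}
```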

The package “litsearchr” stores the search results in a table with columns such as the title, authors, date, and abstract.

Figure 3: Sorted columns

One purpose of the “naive search” is to obtain potential search terms related to the topic. These additional terms are then added to a new search to retrieve more relevant results, which helps to overcome selection bias. There are two methods for obtaining the new terms. The first and simplest is to examine the keywords already attached to the downloaded articles. Common and uninformative words (so-called stopwords), such as “the” and “review”, need to be excluded from the keywords.

Figure 4: Extracting the keywords
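In litsearchr this step is handled by the extract_terms() function. The base R lines below only illustrate the underlying idea on made-up data and are not the package's implementation; the keyword records, the stopword list, and the frequency threshold of two are all assumptions.

```r
# Made-up stand-in for the "keywords" column of the imported results.
keywords <- c(
  "green logistics; circular economy; review",
  "circular economy; closed-loop supply chain",
  NA,
  "Green Logistics; closed-loop supply chain; sustainability"
)

# Small illustrative stopword list (an assumption, not the list used in the study).
stopwords <- c("the", "review")

# Split each record on ";", tidy the pieces, and ignore records without keywords.
terms <- unlist(strsplit(keywords[!is.na(keywords)], ";", fixed = TRUE))
terms <- tolower(trimws(terms))

# Remove stopwords, then keep the terms that occur at least twice.
terms <- terms[!terms %in% stopwords]
term_counts <- table(terms)
keyword_terms <- names(term_counts)[term_counts >= 2]

print(keyword_terms)
```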

The extraction yields a number of new terms related to the naive search terms “green logistics,” such as “circular economy” or “closed-loop supply chain.” The package “litsearchr” can therefore supply researchers with additional search terms and, with them, further ideas about the topic.

The second method extracts new terms from the titles of the downloaded articles, which is useful when some articles do not provide any keywords.

Figure 5: Extracting the titles

The new terms derived from the keywords and from the titles are then combined, and duplicates are removed.

Figure 6: Combining keywords and titles
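The title-based method and the combination step can be sketched in base R as follows. This is an illustration on made-up titles, not litsearchr's own algorithm: it simply counts two-word phrases that do not contain a stopword, and the helper name title_bigrams is hypothetical.

```r
# Made-up article titles standing in for the "title" column.
titles <- c(
  "Green logistics and the circular economy",
  "Reverse logistics in the circular economy",
  "A review of reverse logistics practices"
)
stopwords <- c("a", "and", "in", "of", "the", "review")

# Return all two-word phrases made of adjacent non-stopwords in one title.
title_bigrams <- function(title) {
  words <- strsplit(tolower(title), "[^a-z-]+")[[1]]
  words <- words[nzchar(words)]
  if (length(words) < 2) return(character(0))
  first <- words[-length(words)]
  second <- words[-1]
  keep <- !(first %in% stopwords) & !(second %in% stopwords)
  paste(first[keep], second[keep])
}

# Keep the phrases that appear in the titles at least twice.
bigram_counts <- table(unlist(lapply(titles, title_bigrams)))
title_terms <- names(bigram_counts)[bigram_counts >= 2]

# Terms found earlier from the author keywords (made up here).
keyword_terms <- c("circular economy", "green logistics")

# Combine both sources and remove duplicates.
all_terms <- unique(c(keyword_terms, title_terms))
print(all_terms)
```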

A picture of the network of terms gives a better idea of the structure of the search. Drawing one is not easy because many terms need to be linked. However, the “ggraph” package provides some useful tools for drawing networks. In this study, the network is kept simple and only the basic network visualisation is shown.

Figure 7: Commands to draw networks

Figure 8: Terms network
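A basic network drawing with ggraph might look like the sketch below. The co-occurrence counts are made up, the graph is built with igraph's graph_from_adjacency_matrix(), and the plot is only drawn when both packages are installed; the layout and styling choices are assumptions rather than the settings used for the figures.

```r
# Made-up co-occurrence counts between four terms (symmetric, zero diagonal).
terms <- c("green logistics", "circular economy", "reverse logistics", "packaging")
cooc <- matrix(
  c(0, 5, 4, 1,
    5, 0, 3, 0,
    4, 3, 0, 1,
    1, 0, 1, 0),
  nrow = 4, dimnames = list(terms, terms)
)

if (requireNamespace("igraph", quietly = TRUE) &&
    requireNamespace("ggraph", quietly = TRUE)) {
  library(ggraph)  # also attaches ggplot2

  # Undirected weighted graph; edge weights are the co-occurrence counts.
  g <- igraph::graph_from_adjacency_matrix(cooc, mode = "undirected", weighted = TRUE)

  # Darker edges mark stronger links; node labels show the terms.
  network_plot <- ggraph(g, layout = "stress") +
    geom_edge_link(aes(alpha = weight)) +
    geom_node_point() +
    geom_node_text(aes(label = name), vjust = -1)

  print(network_plot)
}
```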

Terms located near the centre of the graph and linked to one another by darker lines are probably more important for the overall topic. Their names are not labelled in the graph. In contrast, terms located at the margins of the graph are linked to other terms only by faint lines. These terms are not closely related to any key terms; they are tangential at most and do not belong to the main topic.

The search terms in the network are usually ranked by importance so that the least important ones can be cut off. The ‘strength’ of each term in the network is calculated from the number of other terms it is linked to.

Figure 9: Ranking the terms
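The ranking can be illustrated with a small base R sketch on made-up co-occurrence counts. In a weighted network, node strength is commonly defined as the sum of the weights of a node's links (this is, for example, what igraph's strength() function computes), so both the number of links and the strength are shown; the data are hypothetical.

```r
# Made-up co-occurrence counts between four terms (symmetric, zero diagonal).
terms <- c("green logistics", "circular economy", "reverse logistics", "packaging")
cooc <- matrix(
  c(0, 5, 4, 1,
    5, 0, 3, 0,
    4, 3, 0, 1,
    1, 0, 1, 0),
  nrow = 4, dimnames = list(terms, terms)
)

# Number of other terms each term is linked to.
n_links <- rowSums(cooc > 0)

# Strength: the sum of the weights of those links.
strength <- rowSums(cooc)

# Rank the terms from weakest to strongest.
term_strengths <- data.frame(term = terms, links = n_links,
                             strength = strength, row.names = NULL)
term_strengths <- term_strengths[order(term_strengths$strength), ]
print(term_strengths)
```

With these made-up counts, the weakest term is listed first and the strongest last, which matches the ordering described in the text.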

At the top of the list are the terms that are most weakly related to the others. On the graph above, they are expected to lie near the periphery. Most of them are likely to be irrelevant and to have appeared in a few articles of the naive search for arbitrary reasons; the others may still be relevant but are seldom used. As a result, these terms should be removed. A plot of the term strengths can serve as a criterion for deciding which terms to discard.

Figure 10: Commands to discard unnecessary terms

Figure 11: The graph of remaining terms

Figure 12: Selected terms
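One possible cutoff rule is sketched below in base R. The rule itself is an assumption made for illustration (keep the strongest terms that together account for 80% of the total strength), and the strengths are made up; litsearchr offers its own find_cutoff() and reduce_graph() functions for this purpose.

```r
# Made-up term strengths, as produced by a ranking step.
strengths <- c("green logistics" = 10, "circular economy" = 8,
               "reverse logistics" = 8, "supply chain" = 6,
               "packaging" = 2, "case study" = 1)

# Assumed rule: keep the strongest terms that together hold 80% of total strength.
keep_share <- 0.8
sorted <- sort(strengths, decreasing = TRUE)
cumulative_share <- cumsum(sorted) / sum(sorted)

# Position of the first term at which the cumulative share reaches the target.
last_kept <- which(cumulative_share >= keep_share)[1]
cutoff <- sorted[[last_kept]]

# Keep every term at least as strong as the cutoff (ties are kept together).
selected_terms <- names(strengths)[strengths >= cutoff]
print(selected_terms)
```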

After the unnecessary search terms have been removed on the basis of term strength, the terms of the original naive search need to be added back if they are no longer included, along with any extra terms the researchers want in the final set of search terms.

Figure 13: Adding extra terms

Researchers can use the revised list of search terms to generate a new search strategy that retrieves more articles relevant to the same topic. The new search is made more rigorous by adding the AND and OR operators, an approach usually known as Boolean search. More specifically, the AND operator narrows a search by combining terms, so that it returns only articles that mention at least one term from each subtopic of interest. The OR operator, on the other hand, broadens a search by including articles that contain any of the selected terms. Currently, setting up a Boolean search with the “litsearchr” package means manually putting the chosen terms into a list of separate vectors, one for each subtopic.

Figure 14: Grouping terms

The write_search() function takes the list of grouped search terms and writes the text of a new search. The resulting Boolean search string is a useful search strategy, and running it across many database sources (covering both published and unpublished work) helps to address selection and publication bias.

Figure 15: Commands for Boolean search

Figure 16: Boolean search
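The logic of combining grouped terms with OR and AND can be illustrated in base R as follows. This sketch does not reproduce the exact text that write_search() generates; the grouping of terms is made up for the running example, and the helper name quote_phrase is hypothetical.

```r
# Chosen terms grouped by subtopic (made-up grouping for the running example).
grouped_terms <- list(
  green = c("green", "sustainable", "circular economy"),
  logistics = c("logistics", "supply chain", "reverse logistics")
)

# Quote multi-word terms so that databases treat them as exact phrases.
quote_phrase <- function(term) {
  ifelse(grepl(" ", term, fixed = TRUE), paste0('"', term, '"'), term)
}

# OR joins the terms inside a group; AND joins the groups.
or_blocks <- vapply(
  grouped_terms,
  function(group) paste0("(", paste(quote_phrase(group), collapse = " OR "), ")"),
  character(1)
)
boolean_search <- paste(or_blocks, collapse = " AND ")

cat(boolean_search, "\n")
```

An article is matched only if it contains at least one term from the first group and at least one term from the second, which is the narrowing and broadening behaviour described above.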

The results of the new Boolean search are stored and prepared for screening. Some software and applications can assist the process, but the screening stages are still mainly manual. The R package “revtools” supports researchers in title and abstract screening.

3.1. Title screening

Firstly, for title screening, the package “revtools” works as a stand-alone application into which researchers can load the data either directly from the new search results stored above or from a file in CSV format.

Once the researchers have launched the application and loaded the search results, they can manually select or deselect articles individually or in groups. The application also allows navigation between pages with the arrow buttons. The researchers have to decide which titles to include, and can choose ‘unknown’ for unclear articles.

Figure 17: Title screening

3.2. Abstract screening

Secondly, abstract screening is run through the function screen_abstracts. In general, the two ways of screening are similar. However, unlike title screening, abstract screening handles only one article at a time, and it allows researchers to take notes if they wish.

Figure 18: Abstract screening

3.3. Screening with topic models

Finally, researchers can carry out the screening with topic models. The main way to investigate the results of a topic model in the package “revtools” is the command screen_topics. This function behaves similarly to the two ways of screening above.

After the topic models have been identified, the application creates a plot called an ‘ordination.’ The axes on this plot are unlabelled as they do not have any inherent meaning. Instead, the researchers need to focus on the points that sit close together, as these represent articles with similar topics. They can hover over a point to see the title of the article and the strongest-weighted topic for that paper. This is an excellent help for synthesising the literature.

Figure 19: Topic screening
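The three screening steps could be started as in the following sketch. The file name boolean_search_results.csv is hypothetical, and because the revtools screening tools are interactive applications, the calls are only made in an interactive session in which the file and the package are available.

```r
# Hypothetical CSV file holding the results of the new Boolean search.
results_file <- "boolean_search_results.csv"

# The screening tools are interactive apps, so only start them in a live session.
if (interactive() && file.exists(results_file) &&
    requireNamespace("revtools", quietly = TRUE)) {
  search_results <- revtools::read_bibliography(results_file)

  # Each call opens an interactive app for one screening stage.
  revtools::screen_titles(search_results)     # title screening
  revtools::screen_abstracts(search_results)  # abstract screening, one article at a time
  revtools::screen_topics(search_results)     # screening with topic models
} else {
  message("Screening skipped: it needs an interactive session, the results file, and revtools.")
}
```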
