Cascading Citation Expansion
Chaomei Chen
5/2/2026
Cascading citation expansion (CCE) is one of the most important methodological extensions in CiteSpace for systematically building a more complete and representative dataset of a research field. CCE extends Eugene Garfield's original citation indexing into a flexible, versatile, and actionable method that bridges local and global views of a research field.
1. Core idea
Cascading citation expansion = iteratively expanding a set of papers by following citation links (forward or backward) in multiple steps.
Instead of relying on a single keyword search, you:
Start with a seed set of publications
Expand it by:
Forward citation expansion → papers that cite your seed
Backward citation expansion → references cited by your seed
Repeat this process in multiple rounds (cascades)
This creates a progressively larger and more connected dataset. The goal is to reduce the risk of missing important literature. For technical details, please see our PLoS One article (Chen and Song, 2019).
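The core idea can be sketched on a toy citation graph. This is only an illustration under stated assumptions, not how CiteSpace implements CCE: the paper IDs, the REFERENCES dictionary, and the function names are all hypothetical, and each round here expands only from the papers found in the previous round.

```python
from collections import defaultdict

# Toy citation data (hypothetical IDs): each key cites the papers in its list.
REFERENCES = {
    "p3": ["p1", "p2"],
    "p4": ["p3"],
    "p5": ["p3", "p2"],
    "p6": ["p5"],
}

def build_citers(references):
    """Invert the reference lists: cited paper -> set of papers citing it."""
    citers = defaultdict(set)
    for citing, cited_list in references.items():
        for cited in cited_list:
            citers[cited].add(citing)
    return citers

def expand_once(papers, references, citers, direction):
    """Return the papers one citation link away from `papers`."""
    found = set()
    for paper in papers:
        if direction == "forward":      # papers that cite this one
            found |= citers.get(paper, set())
        elif direction == "backward":   # references cited by this one
            found |= set(references.get(paper, []))
        else:
            raise ValueError(f"unknown direction: {direction}")
    return found

def cascade(seed, references, direction, rounds):
    """Apply `rounds` expansion steps, each starting from the newest papers."""
    citers = build_citers(references)
    collected = set(seed)
    frontier = set(seed)
    generations = []
    for _ in range(rounds):
        new = expand_once(frontier, references, citers, direction) - collected
        generations.append(new)
        collected |= new
        frontier = new
    return collected, generations

collected, generations = cascade({"p3"}, REFERENCES, "forward", rounds=2)
print(sorted(collected))                   # ['p3', 'p4', 'p5', 'p6']
print([sorted(g) for g in generations])    # [['p4', 'p5'], ['p6']]
```

Starting from the single seed p3, the first forward round adds the two papers that cite it, and the second round adds the one paper that cites those.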
2. Why it is needed
Traditional search (e.g., Web of Science query):
Depends heavily on keywords
May miss:
Seminal papers using different terminology
Interdisciplinary connections
Earlier foundational work
CCE addresses this by using citation relationships instead of only text matching.
3. How cascading works
Step 1 — Define a seed set
Examples:
A key article
A review paper
A small query-based dataset
A journal
In this blog post, I will take the open-access journal Frontiers in Research Metrics and Analytics (FRMA) as the seed set and walk through the Cascading Citation Expansion function in CiteSpace.
To use this function, you need a Dimensions API key, stored in the environment variable DimensionsAPI. You also need to set the DimensionsUser and DimensionsPass environment variables accordingly.
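Before launching CiteSpace, it can help to confirm that the three variables are visible to your session. The following is a minimal sketch: the variable names come from the paragraph above, while the helper function is hypothetical and is not part of CiteSpace.

```python
import os

# Variable names as given in this post.
REQUIRED_VARS = ("DimensionsAPI", "DimensionsUser", "DimensionsPass")

def missing_dimensions_vars(environ=None):
    """Return the required variable names that are unset or empty."""
    env = os.environ if environ is None else environ
    return [name for name in REQUIRED_VARS if not env.get(name)]

missing = missing_dimensions_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All Dimensions environment variables are set.")
```

Note that a variable set in one terminal session is not necessarily visible to an application started elsewhere, so run the check in the same environment that launches CiteSpace.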
In CiteSpace, the function is under Data > Import/Export > Dimensions > Cascading Citation Expansion.
The CCE with DSL interface provides a variety of examples that you may modify and use. In this case, the DSL query retrieves publications from Frontiers in Research Metrics and Analytics (FRMA) that have at least 1 citation and have cited references. At the time of writing, 400 publications on Dimensions qualify, and they form the seed set. The first step is to retrieve the seed set using the Download the Query Result button.
Once the seed set is completely downloaded, you will see a summary similar to the one below. Note that for citation analysis purposes, we exclude publications that have no references.
In addition, two more files will be generated for expansion steps: citerIDs.txt and refIDs.txt.
Step 2 — Choose expansion direction
You may choose a variety of combinations of forward and/or backward expansion steps to fulfill your research needs. If you want to include research front articles that cite the seed papers, you should use forward expansion as you are looking forward from the standpoint of the seed set. In contrast, if you want to add articles that are cited by your seed set, i.e., the intellectual base of your seed set, then you should use backward expansion.
A. Forward expansion
Add papers that cite your seed papers
Moves toward newer research (research front)
B. Backward expansion
Add papers cited by your seed papers
Moves toward intellectual base (foundations)
This directly reflects CiteSpace’s core model:
Research front ↔ Intellectual base
Suppose we want to visualize the impact of the journal in terms of the kinds of articles that cite its publications. In that case, we can use a forward citation expansion. We can then analyze the forward expansion set alone or together with the seed set, whichever is more appropriate given our research goals.
You will be asked whether you want to include the refIDs.txt in the forward expansion. The refIDs.txt is the set of references cited by the seed set. The default answer is No.
Step 3 — Apply one expansion cycle
You get a larger dataset:
Seed → Expanded set (1st generation)
As it turns out, a total of 3,186 publications cite our 400-publication seed set. This is the first generation of a forward expansion from the seed set. The forward expansion will retrieve the bibliographic records of the 3,186 articles.
It took a while for the expansion process to complete. We actually retrieved 3,191 articles, a few more than the 3,186 articles found the day before.
The process was intentionally slowed down to comply with the rate limit of the API. However, had I not forgotten to keep the computer awake, it would have finished about 10 hours sooner.
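To illustrate what slowing down for a rate limit can look like, here is a minimal sketch of batched retrieval with a minimum interval between requests. The batch size, the interval, and the fake_fetch stand-in are all hypothetical; they are not the values or the calls that CiteSpace or the Dimensions API actually use.

```python
import time

def fetch_in_batches(ids, fetch, batch_size=100, min_interval=2.0,
                     clock=time.monotonic, sleep=time.sleep):
    """Call `fetch` on successive batches of `ids`, waiting so that
    consecutive calls start at least `min_interval` seconds apart."""
    results = []
    last_start = None
    for i in range(0, len(ids), batch_size):
        if last_start is not None:
            wait = min_interval - (clock() - last_start)
            if wait > 0:
                sleep(wait)
        last_start = clock()
        results.extend(fetch(ids[i:i + batch_size]))
    return results

# Stand-in for a real API call (hypothetical): one record per requested ID.
def fake_fetch(batch):
    return [{"id": pub_id} for pub_id in batch]

ids = [f"pub.{n}" for n in range(250)]
records = fetch_in_batches(ids, fake_fetch, batch_size=100, min_interval=0.01)
print(len(records))  # 250
```

With a few thousand records and a conservative interval, the waiting time adds up quickly, which is why a long-running expansion is best left on a machine that will not go to sleep.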
Step 4 — Cascade (repeat)
Use the expanded set as a new seed:
Seed → Gen1 → Gen2 → Gen3 ...
Each iteration:
Increases coverage
Brings in more distant but relevant literature
A wide variety of expansion paths are potentially valuable. For example, you may consider expansion paths such as AAA, BAA, and ABA, where A denotes a forward step and B a backward step, each with its own semantics. For instance, the path ABA means that we first retrieve the research front of the seed set, then retrieve the intellectual base of the expanded set, and finally expand to all the research front articles of the enriched intellectual base.
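The semantics of a path such as ABA can be made concrete with a small interpreter over a toy citation graph. This is a sketch under assumptions: the paper IDs and function names are hypothetical, and each step is applied to everything collected so far, which may differ from what CiteSpace does internally.

```python
# Toy citation data (hypothetical IDs): each key cites the papers in its list.
REFERENCES = {
    "s1": ["b1"],
    "f1": ["s1", "b2"],
    "f2": ["b2"],
    "f3": ["b1"],
}

def forward(papers, references):
    """A step: papers that cite any paper in `papers`."""
    return {citing for citing, cited in references.items()
            if papers.intersection(cited)}

def backward(papers, references):
    """B step: references cited by any paper in `papers`."""
    return {cited for p in papers for cited in references.get(p, [])}

STEPS = {"A": forward, "B": backward}

def follow_path(seed, path, references):
    """Apply the steps in `path` left to right, accumulating every paper found."""
    collected = set(seed)
    for letter in path:
        collected |= STEPS[letter](collected, references)
    return collected

for path in ("A", "AB", "ABA"):
    print(path, sorted(follow_path({"s1"}, path, REFERENCES)))
# A ['f1', 's1']
# AB ['b1', 'b2', 'f1', 's1']
# ABA ['b1', 'b2', 'f1', 'f2', 'f3', 's1']
```

In this toy case, f2 and f3 never cite the seed directly. They only enter the dataset on the final A step because they cite the intellectual base gathered by the B step.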
In our demo case, we will just apply the forward expansion once.
Step 5 — Stop criteria
You stop when:
Dataset stabilizes (no major new areas)
Size reaches the limit of what you can manage
Research scope is sufficiently covered
We stopped at the completion of the first-generation forward expansion.
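Two of these criteria can be approximated numerically. The sketch below uses relative growth in dataset size as a crude proxy for stabilization, plus a size budget. The thresholds, the function name, and the example sizes are hypothetical, and judging whether the research scope is covered still requires reading the results.

```python
def should_stop(sizes, min_growth=0.05, max_size=50_000):
    """Decide whether to stop after the latest expansion step.

    `sizes` lists the cumulative dataset size after each step, seed first.
    Returns (stop, reason).
    """
    if len(sizes) < 2:
        return False, "need at least one expansion"
    previous, current = sizes[-2], sizes[-1]
    if previous <= 0:
        raise ValueError("dataset sizes must be positive")
    if current > max_size:
        return True, "dataset exceeds the size budget"
    growth = (current - previous) / previous
    if growth < min_growth:
        return True, "dataset has stabilized"
    return False, "still growing"

print(should_stop([1000, 6000]))        # (False, 'still growing')
print(should_stop([1000, 6000, 6100]))  # (True, 'dataset has stabilized')
```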
4. Visualizing the Merged Set (Seed + Expanded)
One way to proceed is to merge the seed set and the forward expansion set and apply Structural Variation Analysis (SVA) to the merged set, so that we can trace major evolutionary paths directly. See my blog post on Structural Variation Analysis (SVA) for more details.
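Conceptually, merging the two sets means taking their union and dropping duplicates, since a seed publication can also appear in the expansion set when it cites another seed publication. Here is a minimal sketch, assuming each record is a dictionary with a unique "id" field; the record layout and function name are hypothetical and do not reflect CiteSpace's file format.

```python
def merge_records(*record_sets, key="id"):
    """Merge record lists, keeping the first record seen for each ID."""
    merged = {}
    for records in record_sets:
        for record in records:
            merged.setdefault(record[key], record)
    return list(merged.values())

seed = [{"id": "pub.1", "title": "Seed paper"},
        {"id": "pub.2", "title": "Another seed paper"}]
expansion = [{"id": "pub.2", "title": "Another seed paper"},  # also cites the seed
             {"id": "pub.3", "title": "Citing paper"}]

merged = merge_records(seed, expansion)
print([r["id"] for r in merged])  # ['pub.1', 'pub.2', 'pub.3']
```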
4.1. The Intellectual Landscape of FRMA Publications
4.2. A Structural Variation Analysis (SVA) of FRMA
5. Key variants in CiteSpace
CCE is flexible and typically used in three ways:
a. Query → Expansion
Start with keyword search
Expand to improve completeness
b. Landmark-based expansion
Start from a seminal paper
Trace its influence (forward)
c. Review-based expansion
Start from a review article
Expand backward to reconstruct the field
These strategies were explicitly compared in methodological studies.
6. What makes it “cascading”
The term “cascading” emphasizes:
Multi-step expansion, not one-step
Each step builds on the previous one
Like a snowball effect in citation space
7. Practical interpretation in CiteSpace
CCE helps you achieve:
a. Better field coverage
Captures hidden or non-obvious papers
b. Stronger network structure
Improves clustering quality (modularity, silhouette)
c. More reliable insights
Reduces bias from initial query design
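For readers unfamiliar with modularity, one of the clustering quality measures mentioned above, the sketch below computes the standard Newman modularity Q for an undirected, unweighted toy network. The network, the cluster labels, and the function name are hypothetical, and CiteSpace's own computation may differ in its details.

```python
def modularity(edges, membership):
    """Newman modularity Q of an undirected, unweighted graph.

    Q = sum over clusters of (edges inside / m) - (cluster degree / 2m)^2,
    where m is the total number of edges.
    """
    m = len(edges)
    if m == 0:
        raise ValueError("graph has no edges")
    inside = {}   # cluster -> number of edges with both ends in the cluster
    degree = {}   # cluster -> sum of node degrees in the cluster
    for u, v in edges:
        cu, cv = membership[u], membership[v]
        degree[cu] = degree.get(cu, 0) + 1
        degree[cv] = degree.get(cv, 0) + 1
        if cu == cv:
            inside[cu] = inside.get(cu, 0) + 1
    return sum(inside.get(c, 0) / m - (d / (2 * m)) ** 2
               for c, d in degree.items())

# Two triangles joined by a single edge.
edges = [("a", "b"), ("b", "c"), ("a", "c"),
         ("d", "e"), ("e", "f"), ("d", "f"),
         ("c", "d")]
two_clusters = {"a": 1, "b": 1, "c": 1, "d": 2, "e": 2, "f": 2}
one_cluster = {node: 1 for node in two_clusters}

print(round(modularity(edges, two_clusters), 4))  # 0.3571
print(round(modularity(edges, one_cluster), 4))   # 0.0
```

A partition that matches the two densely connected groups scores higher than lumping everything into one cluster, which is the sense in which a better-structured network yields a higher Q.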
8. Next steps
If you are using CiteSpace, the CiteSpace Official Support Assistant GPT can guide you through:
Exact menu steps for running CCE
How to set expansion limits
How to evaluate whether your expansion is “good”
References
Chen, C., Song, M. (2019). Visualizing a field of research: A methodology of systematic scientometric reviews. PLoS One, 14(10), e0223994. doi:10.1371/journal.pone.0223994