• Aug 21, 2025

Cluster Summarization: gpt-4o vs. gpt-4o-mini vs. deepseek-chat

  • Chaomei Chen

Large Language Models (LLMs) open new ways of synthesizing and distinguishing themes across the intellectual landscape of scientific publications. A longstanding challenge, however, is explaining the structural and temporal dynamics of thematic concentrations. In this blog post, I illustrate how models such as gpt-4o, gpt-4o-mini, and deepseek-chat can augment our sensemaking of clusters of co-cited intellectual contributions. A practical question is: how can we determine, systematically and objectively, which cluster summaries are the most reliable? Furthermore, from a sustainability perspective: what factors should guide our decision when weighing affordable models against the most powerful ones?

Settings

Clusters of co-cited references are characterized by their internal homogeneity and their contextual linkages with other clusters. The valleys between peaks are good candidates for future growth. New discoveries and advances often originate across the boundaries of thematic concentrations.

A representative comparison should consider a wide variety of underlying domains of research. As a starting point, we explored the literature of art, a rich and dynamic mix of forms, meanings, and values. The input is a collection of scholarly publications in English. In this blog, the cluster summarization is in Chinese.

The image below is a region of a visualized network. Each node is a publication, or rather, a reference as it is cited by others in the literature. Each node is depicted as a citation tree ring, with the years of citations running from the inside out. The label of a cluster is placed at the weighted center of the cluster. In the preliminary exploration, we focused on the 20 largest clusters out of the many clusters in the network. This is mainly a budget-saving choice. LLM-generated cluster labels are in Chinese. The original cluster labels generated by CiteSpace remain in English.

Here is the overview of the largest connected component of the network. The entire network has over 40,000 nodes. Its largest connected component has 18,000 nodes.
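For readers who want to reproduce this kind of count outside CiteSpace, here is a minimal sketch of how a largest connected component can be extracted from an edge list with a breadth-first search. It is not CiteSpace's implementation; the function name and the toy edge list are made up for illustration, and nodes without any link are ignored.

```python
from collections import defaultdict, deque

def largest_connected_component(edges):
    """Return the set of nodes in the largest connected component.

    edges: iterable of (node, node) pairs, treated as undirected links.
    """
    adjacency = defaultdict(set)
    for u, v in edges:
        adjacency[u].add(v)
        adjacency[v].add(u)

    seen = set()
    best = set()
    for start in list(adjacency):
        if start in seen:
            continue
        # Breadth-first search collects every node reachable from `start`.
        component = {start}
        seen.add(start)
        queue = deque([start])
        while queue:
            node = queue.popleft()
            for neighbor in adjacency[node]:
                if neighbor not in seen:
                    seen.add(neighbor)
                    component.add(neighbor)
                    queue.append(neighbor)
        if len(component) > len(best):
            best = component
    return best

# Toy co-citation links (hypothetical): two separate groups of references.
toy_edges = [("a", "b"), ("b", "c"), ("d", "e")]
print(sorted(largest_connected_component(toy_edges)))
```

On a real network, the ratio between the size of this component and the total number of nodes (here roughly 18,000 out of more than 40,000) gives a quick sense of how fragmented the literature is.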

Comparison

I compared 3 LLMs, namely A: gpt-4o-mini, B: deepseek-chat, and C: gpt-4o. I originally planned to include gpt-5. However, gpt-5 restricted access to some of the content for safety reasons. For example, gpt-5 would abort the summarization of the demo project on terrorism because of this safety control.

The costs of these models are listed below in US dollars per 1M tokens, as input (without cached input) / output prices.

A: gpt-4o-mini: $0.15/$0.60

B: deepseek-chat: $0.56/$1.68

C: gpt-4o: $2.50/$10.00
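To make the price gap concrete, here is a small sketch that turns the listed prices into an estimated bill. It reads the prices as US dollars per 1M tokens, and the workload (20 clusters, 8,000 input tokens and 600 output tokens per cluster) is a made-up assumption, not a measurement from this project.

```python
# Prices copied from the list above, read as USD per 1M tokens: (input, output).
PRICES = {
    "gpt-4o-mini": (0.15, 0.60),
    "deepseek-chat": (0.56, 1.68),
    "gpt-4o": (2.50, 10.00),
}

def estimate_cost(model, input_tokens, output_tokens, n_clusters=1):
    """Estimated USD cost of summarizing n_clusters clusters.

    input_tokens and output_tokens are per-cluster token counts.
    """
    price_in, price_out = PRICES[model]
    per_cluster = (input_tokens * price_in + output_tokens * price_out) / 1_000_000
    return per_cluster * n_clusters

# Hypothetical workload: 20 clusters, 8,000 input and 600 output tokens each.
for model in PRICES:
    cost = estimate_cost(model, 8_000, 600, n_clusters=20)
    print(f"{model}: ${cost:.4f}")
```

Under any workload with the same token counts, the ratio between models depends only on the listed prices, so the ranking from cheapest to most expensive stays A, B, C.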

The preliminary comparison was limited to the top 3 largest clusters of the research domain. You will see what it is all about shortly. Each cluster summarization consists of characterization at three levels: cluster label, summary, and keywords.

Intuitively, our criteria of quality should weigh accuracy, representativeness, and cohesiveness. There are many ways to assess cluster summaries against these criteria. One readily repeatable way is to use an LLM as the assessor. I used ChatGPT 5. The summaries of the 3 largest clusters generated by models A, B, and C were fed to ChatGPT 5 anonymously.
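The anonymization step can be sketched as follows. This is only an illustration of the idea, not the exact procedure I used: the function names, the neutral "Summary 1/2/3" labels, the 1-to-5 rating scale, and the placeholder texts are all assumptions. The point is that the assessor sees shuffled, relabeled summaries, while a private key maps each label back to its model.

```python
import random

def anonymize(summaries, seed=None):
    """Shuffle model outputs and give them neutral labels.

    summaries: dict mapping model name -> summary text.
    Returns (blinded, key): blinded maps a neutral label to the text,
    key maps the same label back to the model name.
    """
    rng = random.Random(seed)
    models = list(summaries)
    rng.shuffle(models)
    blinded, key = {}, {}
    for i, model in enumerate(models, start=1):
        label = f"Summary {i}"
        blinded[label] = summaries[model]
        key[label] = model
    return blinded, key

def build_judge_prompt(cluster_id, blinded):
    """Assemble one prompt asking the assessor to rate each blinded summary."""
    lines = [
        f"Below are {len(blinded)} summaries of cluster #{cluster_id}.",
        "Rate each one from 1 to 5 on accuracy, representativeness, and cohesiveness,",
        "and briefly justify each rating.",
        "",
    ]
    for label, text in blinded.items():
        lines.extend([f"{label}:", text, ""])
    return "\n".join(lines)

# Placeholder texts stand in for the real cluster summaries.
summaries = {
    "gpt-4o-mini": "(summary text from the first model)",
    "deepseek-chat": "(summary text from the second model)",
    "gpt-4o": "(summary text from the third model)",
}
blinded, key = anonymize(summaries, seed=42)
print(build_judge_prompt(0, blinded))
```

The key stays on the analyst's side and is only consulted after the ratings come back.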

Question: Is an "out-of-place" reference necessarily an error?

GPT-4o-mini vs DeepSeek-chat

All models agree that Cluster #0 is about Raphael (拉斐尔). However, gpt-4o-mini's mention of a relationship with Bernini (贝尔尼尼) was considered a serious anachronism, as Bernini was born in 1598, long after Raphael's death in 1520. In contrast, deepseek-chat identified commonly known early-career anchors such as Perugino (佩鲁吉诺) and Pinturicchio (平托里乔).
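A check of this kind is simple enough to automate once the relevant dates are at hand. The sketch below uses only the two years mentioned above; the function name is made up, and a real pipeline would need a trusted source of biographical dates.

```python
def is_anachronistic(birth_year_of_later_artist, death_year_of_earlier_artist):
    """True when the later artist was born after the earlier artist died,
    so a personal relationship between the two is impossible."""
    return birth_year_of_later_artist > death_year_of_earlier_artist

# Years taken from the paragraph above.
BERNINI_BORN = 1598
RAPHAEL_DIED = 1520

print(is_anachronistic(BERNINI_BORN, RAPHAEL_DIED))  # True
print(BERNINI_BORN - RAPHAEL_DIED)                   # 78 (years between the two dates)
```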

The second largest cluster is on Picasso (毕加索). gpt-4o-mini's rendering of Les Demoiselles d'Avignon as 《亚维农少女》 is somewhat peculiar. deepseek-chat's summary is comprehensive and focused. gpt-4o's unique identification of the style transformation (超现实/立体主义, Surrealism/Cubism) is remarkable. A combination of gpt-4o and deepseek-chat would be desirable.

GPT-4o vs DeepSeek-chat

Regarding the largest cluster, #0, on Raphael's early career, gpt-4o's emphasis on patronage was considered a plus, as it introduces a dimension that the other two models did not mention.

The third cluster is about Caravaggio (卡拉瓦乔). gpt-4o-mini's translation into Chinese is again debatable. Its summary is strong on ethical and philosophical interpretations, but relatively weak on mechanisms and details. deepseek-chat's summary is consistent with the conventional focus and coverage of Caravaggio. gpt-4o contributed new dimensions that were not covered by the other two models.

Recommendations

This is an exploratory comparison and assessment of the quality of cluster summarizations based on some commonly accessible LLMs. More comprehensive and in-depth comparisons are necessary before one can reliably reach practically meaningful conclusions. Factors such as accuracy, representativeness, and cohesiveness, as well as operational costs and the quality of the underlying data, should all be taken into account. Based on the preliminary comparison, a prudent strategy is to take all the variations of the summaries of the same cluster into account. More specifically, one could start with gpt-4o-mini for initial exploration and use it to refine the exploratory process and its various settings. gpt-4o and deepseek-chat have their own unique strengths and weaknesses. An additional layer of integration by an LLM would be an efficient and practical step to further strengthen the overall quality of our understanding of the underlying structural and dynamic values.

Integration

The summaries generated by the three LLMs all have their own unique strengths and weaknesses. A natural next step is to integrate them into a single cohesive summary. Here is an integrated summary of Cluster #0 on Raphael's early career in English and Chinese.

Integrated Summary (Cluster #0: Raphael’s Early Career and Artistic Influence)

This cluster focuses on the early artistic career of Raphael during the Italian Renaissance, with particular attention to his creative development in Urbino and Umbria and the multiple influences that shaped his work. The literature widely discusses Raphael’s interactions with his contemporaries, including his relationships with mentors and collaborators such as Perugino and Pinturicchio, as well as how he gradually formed his personal artistic style through both cooperation and competition (Oberhuber 1986; Russell 1986).

Research themes highlight Raphael’s innovations in religious painting, especially the symbolic and artistic significance of works such as The Transfiguration and the Madonna series (Jungic 1988; Caron 1988; Bendersky 1995). The cluster also explores how Raphael positioned himself within the Florentine narrative painting tradition while absorbing influences from Northern European art (Quednau 1983; Rosenberg 1986). In addition, particular emphasis is placed on the role of early patrons in shaping his career, and on how Raphael pursued technical innovations that expanded his influence through interactions with the Venetian school, as well as artists such as Leonardo da Vinci and Dürer (Ettlinger 1986; Pagden 1986; Passavant 1983; Nesselrath 1993).

Overall, this cluster reveals how Raphael, in his early career, established a unique position within the Renaissance artistic landscape by combining tradition and innovation, deepening religious and symbolic expression, and benefiting from patronage systems that supported his growth.

Keywords: Raphael; Renaissance; Urbino; religious painting; artistic influence; artistic patronage; narrative painting tradition

综合总结(Cluster #0:拉斐尔的早期生涯与艺术影响)

这一聚类聚焦于意大利文艺复兴时期画家拉斐尔的早期艺术生涯,尤其是他在乌尔比诺和翁布里亚时期的创作经历及其所受的多重影响。文献广泛讨论了拉斐尔与同时代艺术家的互动,包括他与佩鲁吉诺、平托里乔等导师与合作者之间的关系,以及在竞争与交流中逐渐形成的个人艺术风格(Oberhuber 1986; Russell 1986)。

研究主题涵盖了拉斐尔在宗教绘画中的创新表现,如《变容图》《圣母像》等作品所蕴含的宗教与艺术象征意义(Jungic 1988; Caron 1988; Bendersky 1995),以及他如何在佛罗伦萨叙事画传统中找到定位并吸收北方艺术风格(Quednau 1983; Rosenberg 1986)。此外,集群还特别强调了早期赞助者对其艺术生涯的重要作用,以及拉斐尔如何在技法上不断革新,从而在与威尼斯画派、达·芬奇、杜勒等艺术家互动的过程中拓展了自身的影响力(Ettlinger 1986; Pagden 1986; Passavant 1983; Nesselrath 1993)。

总之,这一聚类揭示了拉斐尔在早期如何通过传统与创新的融合、宗教与象征的深化以及赞助体系的支持,逐步奠定其在文艺复兴艺术格局中的独特地位。

关键词:拉斐尔;文艺复兴;乌尔比诺;宗教绘画;艺术影响;艺术赞助;叙事画传统

Please feel free to leave your comments.
