Matthias Plaue
June 7, 2023

LLMs for innovation and technology intelligence: news categorization and trend signal detection

Introduction: NLP and news categorization

In applications of business intelligence, news articles are an important source of relevant and timely information. Methods from natural language processing (NLP) and text mining can be used to analyze these data and extract relevant insights. For example, news articles can be used to gauge public sentiment (see my previous blog post).

NLP methods can also help the analyst explore a large collection of news articles more efficiently by detecting events and trends [Panagiotou et al. 2022], or summarizing key points [Ma et al. 2022].

Removing irrelevant results such as fake news [Capuano et al. 2023] can also reduce the data deluge. Conversely, the analyst greatly benefits from a system that helps them focus on the news most relevant to their domain.

One approach to identifying the articles that are most likely to contain relevant information is automated news categorization. Technology and trend scouts, who wish to collect data that informs innovation strategy, are particularly interested in news that belongs to one or more of the following categories, which we can also refer to as genres:

A simple technique for automated news categorization is the search for keywords. For example, we can expect a news article that contains one of the keywords “startup”, “venture capital”, or “angel investor” to be an article that belongs to the genre of startup news.
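The keyword technique can be sketched in a few lines of Python. This is a minimal illustration, not the implementation evaluated in this report: the function name `categorize`, the dictionary `GENRE_KEYWORDS`, and the choice to tolerate a plural "s" are assumptions made for the example. Only the three startup keywords come from the text above.

```python
import re

# Keyword lists per genre. Only the "startup news" keywords are taken from
# the article; the dictionary layout itself is an assumption for this sketch.
GENRE_KEYWORDS = {
    "startup news": ["startup", "venture capital", "angel investor"],
}

def categorize(text, genre_keywords=GENRE_KEYWORDS):
    """Return the sorted list of genres with at least one keyword in the text."""
    lowered = text.lower()
    matches = []
    for genre, keywords in genre_keywords.items():
        for keyword in keywords:
            # Word boundaries avoid matches inside longer words;
            # the optional "s" also accepts simple plurals ("startups").
            pattern = r"\b" + re.escape(keyword) + r"s?\b"
            if re.search(pattern, lowered):
                matches.append(genre)
                break  # one hit is enough for this genre
    return sorted(matches)

print(categorize("Angel investors back a new battery startup."))
print(categorize("The city council met on Tuesday."))
```

A search like this is fast and transparent, but it only finds articles that happen to use the listed words.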

The goal of this report is to compare the performance, in terms of accuracy and runtime, of traditional keyword search for news categorization with state-of-the-art methods from machine learning.
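A comparison of this kind needs a small harness that measures both quantities for any classifier. The following is a hedged sketch of such a harness, not the evaluation protocol used in the full article: the function `evaluate`, the toy `keyword_classifier`, and the four labeled headlines are all hypothetical and serve only to show the mechanics.

```python
import time

def evaluate(classifier, examples):
    """Return (accuracy, elapsed_seconds) over a list of (text, label) pairs."""
    start = time.perf_counter()
    predictions = [classifier(text) for text, _ in examples]
    elapsed = time.perf_counter() - start
    correct = sum(
        prediction == label
        for prediction, (_, label) in zip(predictions, examples)
    )
    accuracy = correct / len(examples) if examples else 0.0
    return accuracy, elapsed

def keyword_classifier(text):
    """Toy baseline: True if any startup keyword occurs in the text."""
    keywords = ("startup", "venture capital", "angel investor")
    return any(keyword in text.lower() for keyword in keywords)

# Hypothetical labeled data (True = startup news), invented for illustration.
examples = [
    ("A startup closed its seed round.", True),
    ("Angel investor backs battery company.", True),
    ("Local team wins the championship.", False),
    ("New fund targets young companies.", True),  # no keyword, so it is missed
]

accuracy, seconds = evaluate(keyword_classifier, examples)
print(f"accuracy={accuracy:.2f}, runtime={seconds:.6f}s")
```

Any other classifier, including a machine learning model wrapped in a function with the same signature, could be passed to `evaluate` in place of the keyword baseline.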

Trend signal detection

In addition to the news categories listed in the previous section, we want to detect trend signals, which we understand as news articles that describe events, claim facts, or reflect on opinions that point to the potential development of significant change in the landscape of innovation and technology. In other words, trend signals can be understood as precursors to emerging trends.

The news genre of trend signals is very broad, and may refer to any of the following sub-categories. Some of those sub-categories may have a large overlap with one or more of the news genres defined in the previous section.

1. Science and Technology

1a. Novel materials or methods.
News articles discussing the development and launch of new, innovative manufacturing techniques, as well as newly-created materials that can improve products, services, or technologies. Example: ‘Smart plastic’ material is step forward toward soft, flexible robotics and electronics

1b. Advancements in efficiency or effectiveness.
Articles covering successful improvements to existing products or technologies regarding functionality, adaptability, performance, or usability. Example: New battery tech boosts EV range by 20%

1c. Innovative applications of existing technologies.
Articles reporting creative new uses of current technologies or products, giving them alternative purposes. Can relate to recycling or repurposing byproducts, materials, applications or systems. Example: 3 Surprising Uses for Depleted EV Batteries

1d. Scientific discoveries and breakthroughs.
Articles covering major discoveries in scientific research, new inventions, advancements or discoveries that solve problems, reduce costs, or enable new applications. Example: Scientists break world record for solar power window material

2. Economics and Politics

2a. Startups.
Articles profiling innovative new startups with proprietary technologies, techniques, designs or materials that address industry challenges, reshape manufacturing, or promote sustainability. May detail a startup’s work, partnerships, or funding.

2b. Mergers, acquisitions, and partnerships.
Articles covering companies investing in startups, collaborating strategically, merging, being acquired, going public, or raising capital to enable product development or launches.

2c. Policy changes, new legislation, and funding opportunities.
Articles announcing government decisions, policies, public contracts, funding programs, laws, regulations, or economic policies affecting specific industries or markets. Policies may relate to changes in political leadership.

2d. New market entrants.
Articles covering the emergence of new competitors in existing markets, including large firms expanding into new sectors or new firms entering established markets. Example: Tesla opens its EV charging network to the masses.

3. Society and Markets

3a. “Hype” or “buzz” surrounding technologies or high-tech products.
Articles discussing temporary surges of attention for contemporary emerging technologies or products among researchers, industry players, policymakers, or users. Example: Five technology trends that will define the future of EVs

3b. Events or claims influencing public opinion of technologies or players.
Articles reporting unforeseen news or events that negatively or positively impact public perception of specific technologies, companies, or industry leaders. Could cover accidents, lawsuits, misconduct allegations, product defects, or reputation-building announcements. Example: 10 Dirty Truths Of Electric Cars Nobody Is Talking About

3c. Launch of new high-tech products.
Articles announcing the release of new technology products, systems, materials, techniques, features or designs that provide additional functionality, applications or benefits. Example: The New Abarth 500e: The Scorpion Stings Again, Now In Full Electric Mode

Trend signals can be strong signals, i.e., about events that are widely reported on. As a result, many trend signals may point to the same event, and cluster analysis can help identify those events.
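To illustrate how cluster analysis could group signals that point to the same event, here is a minimal sketch that clusters headlines by word overlap. It is an assumption-laden toy, not the method used in this work: the greedy single-link strategy, the Jaccard similarity, the threshold of 0.4, and the second headline (a made-up variant of the Tesla example above) are all chosen only for illustration.

```python
import re

def tokens(headline):
    """Lowercased set of alphanumeric word tokens."""
    return set(re.findall(r"[a-z0-9]+", headline.lower()))

def jaccard(a, b):
    """Jaccard similarity of two token sets (0.0 if both are empty)."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0

def cluster_headlines(headlines, threshold=0.4):
    """Greedy single-link clustering; returns lists of headline indices."""
    clusters = []
    for index, headline in enumerate(headlines):
        current = tokens(headline)
        for cluster in clusters:
            # Join the first cluster containing a sufficiently similar headline.
            if any(jaccard(current, tokens(headlines[j])) >= threshold
                   for j in cluster):
                cluster.append(index)
                break
        else:
            clusters.append([index])  # no similar cluster found: start a new one
    return clusters

headlines = [
    "Tesla opens its EV charging network to the masses",
    "Tesla opens EV charging network to other brands",  # hypothetical variant
    "New battery tech boosts EV range by 20%",
]
print(cluster_headlines(headlines))
```

In practice, word overlap is a crude similarity measure; it stands in here for whatever document representation the clustering step would actually use.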

What makes trend signal detection a powerful concept, however, is that trend signals are not defined by signal strength and can therefore also be early signals or weak signals. More traditional methods for trend detection based on time series analysis have a difficult time detecting early signals because there is not yet an emerging trend that could be identified in a robust manner. Similarly, weak signals are often drowned out by noise, which makes them difficult to detect by unsupervised analysis of the data stream alone.

Read the full article on Medium.

