Infusing any LLM with news — one line of code

Emergent Methods
8 min read · May 2, 2024


50k news sources, indexed and ready for your natural language query

If your LLM needs access to up-to-the-minute news context, AskNews is the low-latency, high-coverage, natural language link for your LLM chain.

You could build and maintain your own Retrieval Augmented Generation (RAG) architecture, which would involve:

  • 🪣 scraping 50k news websites every 5 minutes,
  • 🧽 cleaning, summarizing, translating, and enriching 500k articles per day,
  • 🧮 embedding the articles with dense and sparse vectors,
  • 💾 storing the documents in an ever-growing vector database,
  • 🔬 monitoring for quality and ensuring up-time reliability,
  • 🏎 ensuring low-latency interactions to avoid slowing down your LLM application,
  • 🧑‍🔬️ researching and developing methods for quality/accuracy control,
  • 🧭 researching and improving methods for retrieval on dense and sparse vector indices,
  • 🌍 tracking news narratives through time with state of the art clustering/tracking methods.

Managing all of this would require a full team of developers. Instead, we introduce a simpler approach: outsource these tasks to AskNews and replace them with a single line of code. Let’s go!

What is AskNews?

AskNews is an AI-first News API with low latency and natural language queries at its core. It enriches over 500k news articles per day across 13 languages, hundreds of countries, and 50k unique sources 🌎. We even track Reddit. You can see up-to-the-minute source origin metrics on our transparency dashboard.

Our global coverage, indexed into an easy natural language query for you
Global news across all languages, translated and ready for your LLM context

We monitor the biggest events on the global news landscape across all these languages and countries, so that you have unbiased, diversified, up-to-the-minute news coverage. Here is an example of the available metrics, visualized on the AskNews webpage.

Quick-start

You can benefit from the entire AskNews infrastructure via simple natural language queries in a single line of code. Let’s see how easy it is:

pip install asknews

Next, head over to my.asknews.app and sign up for an account to generate your free news API credentials (choose the types of data you’d like access to):

If you need help with this step or any other, join our community Discord at https://discord.gg/JQGkBz6HNa

Once you have your client_id and client_secret you can create your AskNews client:

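from asknews_sdk import AskNewsSDK
from openai import OpenAI

# AskNews client, authenticated with the credentials generated at my.asknews.app
sdk = AskNewsSDK(
    client_id="your_client_id",
    client_secret="your_client_secret",
    scopes=["chat", "news", "stories"]
)
# Standard OpenAI client (replace the placeholder with your own key)
oai = OpenAI(api_key="your_openai_api_key")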

Note: Additionally, we have an async client if you prefer to operate in an async environment.

Now you have both your AskNews SDK and your standard OpenAI client instantiated. Note that you can use any LLM you want for this exercise; we simply use OpenAI for convenience here.

Next, we build an example interaction. Let’s say you are building a chatbot, and the user asks, “What is the current political situation in Germany?” You can treat AskNews as a super-powered vector database that returns the latest news context to your application.


# Your user asks a question about the current political situation in Germany
user = {
    "role": "user",
    "content": "What is the current political situation in Germany?"
}

# Grab a prompt-optimized string ready to go for your LLM:
response = sdk.news.search_news(
    # any natural language query. Any phrase, keyword...anything you want.
    query=user["content"],
    # control the number of articles to include in the context
    n_articles=10,
    # you can also ask for "dicts" if you want more information in a structured way
    return_type="string",
    # use "nl" for natural language for your search, or "kw" for keyword search
    method="nl"
)
# now you have a prompt-optimized string
news_articles = response.as_string

Note: you have control over many other parameters, including filtering on the time of publication. Full API reference available here.
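As a rough illustration of time filtering, the sketch below computes a Unix-timestamp window covering the last 24 hours. The helper `publication_window` is hypothetical, and the parameter names in the commented-out call (`start_timestamp`, `end_timestamp`) are assumptions rather than confirmed SDK arguments, so check the API reference for the actual names and formats.

```python
from datetime import datetime, timedelta, timezone


def publication_window(hours_back, now=None):
    """Return (start, end) Unix timestamps covering the last `hours_back` hours."""
    now = now or datetime.now(timezone.utc)
    start = now - timedelta(hours=hours_back)
    return int(start.timestamp()), int(now.timestamp())


start_ts, end_ts = publication_window(24)

# Hypothetical usage. The two timestamp parameter names are assumptions;
# consult the API reference before relying on them:
# response = sdk.news.search_news(
#     query="What is the current political situation in Germany?",
#     n_articles=10,
#     return_type="string",
#     method="nl",
#     start_timestamp=start_ts,
#     end_timestamp=end_ts,
# )
```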

The prompt-optimized string is a densely enriched string that is ready to be infused into your prompt without any further interactions. When we say “prompt-optimized”, we mean that we have already narrowed down and formatted all the necessary details for you. response.as_string is a string that looks like this:

<doc>
[1]:
title: Germany's economy teeters on the brink of disaster
summary: The German economy is on the brink of disaster, according to an article by 'El País'. The cause of this is attributed to a decline in the industrial sector, with the final blow dealt by the conflict in Ukraine. It is believed that Germany's reckless sanction policy has played a role in this. The crisis in Germany often leads to a crisis in the entire EU, as neighboring countries are closely linked to Germany through a supply chain. According to the German Institute for Economic Research, losses from the start of the Ukrainian conflict are estimated at 200 billion euros. This is primarily due to the rise in energy prices.
source: Беларусь 1
published: April 01 2024 12:00
organizations: EU, the German Institute for Economic Research
places: Ukraine, Germany
keywords: German economy, economic crisis, Ukraine conflict, sanction policy
classification: Economics
sentiment: 0.0
</doc>
<doc>
[2]:
title: Gelingt der AfD im Osten, was Trump 2016 schaffte? – Kampf um die Nichtwähler
summary: The article discusses the potential of the Alternative for Germany (AfD) party to become the strongest party in eastern Germany, particularly in the states of Saxony, Thuringia, and Brandenburg. The party has been successful in mobilizing non-voters in the past, and they are focusing on issues that resonate with this group. The article also draws parallels with the 2016 US presidential election, where non-voters played a crucial role in determining the outcome. The AfD is attempting to replicate this success by focusing on themes that polarize voters and mobilize those who might otherwise stay home. The article also mentions the potential for the AfD to become the strongest party in eastern Germany, citing a survey by the Forsa Institute that suggests that about half of those who had not planned to vote in January might reconsider if the AfD were to make significant gains. The article concludes by discussing the importance of understanding the non-voter demographic and the potential for the AfD to tap into this group.
source: GMX
published: April 01 2024 09:30
organizations: the Forsa Institute
places: Saxony, Thuringia, Germany, US, Brandenburg
keywords: Alternative for Germany, AfD, non-voters, Saxony, Thuringia, Brandenburg, 2016 US presidential election
classification: Politics
sentiment: 0.0
</doc>
...
*truncated for brevity*

Note: you can control the document delimiters by setting the doc_start_delimiter and doc_end_delimiter.
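If you ever want to inspect the context string programmatically (for example, to log which articles were sent to the LLM), a small parser is enough. The sketch below assumes the `<doc>`/`</doc>` layout shown above with one `key: value` field per line; `split_docs` is a hypothetical helper, not part of the SDK, and the sample is an abbreviated copy of the example output.

```python
import re


def split_docs(context, start="<doc>", end="</doc>"):
    """Split a delimited context string into one dict of fields per document."""
    pattern = re.escape(start) + r"(.*?)" + re.escape(end)
    docs = []
    for body in re.findall(pattern, context, flags=re.DOTALL):
        fields = {}
        for line in body.strip().splitlines():
            line = line.strip()
            index_match = re.fullmatch(r"\[(\d+)\]:", line)
            if index_match:
                # The "[1]:" line carries the citation number
                fields["index"] = int(index_match.group(1))
            elif ": " in line:
                # Split on the first ": " only, so values may contain colons
                key, value = line.split(": ", 1)
                fields[key] = value
        docs.append(fields)
    return docs


sample = """<doc>
[1]:
title: Germany's economy teeters on the brink of disaster
source: Беларусь 1
published: April 01 2024 12:00
classification: Economics
</doc>
<doc>
[2]:
source: GMX
published: April 01 2024 09:30
classification: Politics
</doc>"""

docs = split_docs(sample)
for doc in docs:
    print(doc["index"], doc["source"], doc["classification"])
```

If you set custom delimiters, pass the same values as `start` and `end`.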

Now we take that string and insert it directly into our system prompt, as in the example here:

system = {
    "role": "system",
    "content": f"""
A chat between a curious user and an artificial intelligence Assistant.
The Assistant has access to the following news articles that may be useful
for answering the User's questions: {news_articles}
"""
}
response = oai.chat.completions.create(
    model="gpt-3.5-turbo",
    messages=[system, user]
)

print(response.choices[0].message.content)

And that’s it, you just infused GPT-3.5 with up-to-the-minute news. The response for our example looks like this:

The current political situation in Germany involves several key developments:

1. **Internal Security Concerns**: There are doubts about Germany's internal security, with an increasing focus on strengthening the Federal Constitutional Court. This indicates a growing sense of insecurity about both external threats and internal vulnerabilities within the state [1].

2. **Budget and Defense**: The CDU is facing financial challenges, with Party leader Friedrich Merz questioning the feasibility of spending 40 billion euros on citizen's income while maintaining defense capabilities. This could influence the Union's campaign strategy for the upcoming federal elections [2].

3. **Shifts in Political Positions**: The SPD is facing criticism for moving away from its core voter base and reviving old policies, particularly concerning Russia. This has led to concerns about the party leadership's approach and its impact on foreign policy [3].

4. **Youth Political Leanings**: A survey found that a significant percentage of first-time voters trust far-right parties like the AfD to solve European problems. However, there is also a trend towards left-green and far-left views among young people, especially on university campuses, where protests and discussions on various social and political issues are prevalent [4].

5. **Political Protests**: There have been protests against right-wing extremism and the AfD in Germany, with a notable decrease in participation in recent demonstrations. The reasons for this decline are being investigated [5].

You might notice that the LLM even cited its sources, reusing the bracketed citation numbers from the prompt-optimized context string. If you want to take your game to the next level, you can obtain all the metadata associated with each entry:

# Your user asks a question about the current political situation in Germany
user = {"role": "user", "content": "What is the current political situation in Germany?"}

# Grab a prompt-optimized string ready to go for your LLM:
response = sdk.news.search_news(
    # Any natural language query. Any phrase, keyword...anything you want.
    query=user["content"],
    # Control the number of articles to include in the context
    n_articles=10,
    # Asking for both gives the string with the citations and the dicts filled with metadata
    return_type="both",
    # Use "nl" for natural language for your search, or "kw" for keyword search
    method="nl"
)
# Now you have a prompt-optimized string
news_articles = response.as_string
# And the corresponding set of dictionaries
dicts = response.as_dicts

Now your response.as_dicts contains a list of metadata dicts loaded with additional information like:

  • entities, keywords, classification
  • sentiment
  • links to original content
  • links to associated images
  • original language, country, source
  • and quite a bit more, details available in the API reference
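One thing you can do with this metadata is turn the LLM's bracketed citations into a source list for your user. The sketch below is a minimal illustration with made-up stand-in data. It assumes that citation `[n]` refers to the n-th entry of the metadata list and that each entry behaves like a plain dict with `title` and `source` keys; both are assumptions, so check the API reference for the real field names and types.

```python
import re


def cited_articles(answer, articles):
    """Map each [n] citation found in `answer` to the n-th article (1-based)."""
    numbers = sorted({int(n) for n in re.findall(r"\[(\d+)\]", answer)})
    # Ignore citation numbers that fall outside the list of articles
    return {n: articles[n - 1] for n in numbers if 1 <= n <= len(articles)}


# Stand-in metadata; the keys used here are hypothetical
articles = [
    {"title": "Germany's economy teeters on the brink of disaster", "source": "Беларусь 1"},
    {"title": "Second article", "source": "GMX"},
]
answer = "Budget debates continue [2]. Security concerns are growing [1]. See also [7]."

refs = cited_articles(answer, articles)
for n, article in refs.items():
    print(f"[{n}] {article['title']} ({article['source']})")
```

Out-of-range numbers such as `[7]` are dropped silently here; in a real application you may prefer to log them, since they indicate a hallucinated citation.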

Low Latency

AskNews prioritizes low latency because of where the API call sits in a typical LLM stack: the news context has to arrive before the LLM can start generating. AskNews is engineered to yield an average response time in the United States of approximately 100 ms for a search. This means that AskNews will not cause any noticeable delay in your LLM application.
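If you want to check the latency from your own environment, you can wrap the search call in a simple timer. The sketch below uses a stand-in function, `fake_search`, in place of the real API call so that it runs on its own; `timed` is a hypothetical helper, and the numbers it prints say nothing about AskNews itself.

```python
import time


def timed(fn, *args, **kwargs):
    """Call fn and return (result, elapsed time in milliseconds)."""
    start = time.perf_counter()
    result = fn(*args, **kwargs)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    return result, elapsed_ms


def fake_search(query):
    """Stand-in for a network call such as sdk.news.search_news(...)."""
    time.sleep(0.01)
    return f"results for {query}"


result, elapsed_ms = timed(fake_search, "Germany politics")
print(f"search took {elapsed_ms:.1f} ms")
```

To time the real call, pass `sdk.news.search_news` and its keyword arguments to `timed` instead of `fake_search`.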

One important tip that applies to cloud applications in general: if you want the lowest possible latency from a client, instantiate the client once and reuse it for as long as possible. Each time you instantiate a new client, you incur a new DNS resolution and a new TLS handshake. So keep your AskNews client long-lived!
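One common way to keep a client long-lived in Python is to build it lazily behind a cached factory. The sketch below uses a stand-in `FakeClient` class so it runs without credentials; in your application you would construct `AskNewsSDK` inside `get_client` instead. The names `FakeClient`, `get_client`, and `handle_request` are hypothetical.

```python
from functools import lru_cache


class FakeClient:
    """Stand-in for an API client; counts how often it is constructed."""

    instances = 0

    def __init__(self, client_id, client_secret):
        FakeClient.instances += 1
        self.client_id = client_id
        self.client_secret = client_secret

    def search(self, query):
        return f"results for {query}"


@lru_cache(maxsize=1)
def get_client():
    """Create the client on first use, then return the same instance."""
    return FakeClient(client_id="your_client_id", client_secret="your_client_secret")


def handle_request(query):
    # Every request reuses the cached client instead of building a new one
    return get_client().search(query)


for question in ["Germany politics", "EU economy", "AfD polls"]:
    handle_request(question)

print(FakeClient.instances)
```

A module-level client object achieves the same thing; the cached factory simply delays construction until the first request.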

Supporting materials

Full SDK documentation is available at https://docs.asknews.app, and you can interact with the data itself via the main website at https://asknews.app.

If you need support or have questions about how the system works, feel free to join us in the AskNews Discord at https://discord.gg/JQGkBz6HNa.

Other offerings

The AskNews API gives access to much more than up-to-the-minute news search. We also provide:

  • Original stories written based on clustered articles
  • Media monitoring/tracking of top global narratives
  • A news-infused assistant
  • Financial sentiment data


Emergent Methods

A computational science company focused on applied machine learning for real-time adaptive modeling of dynamic systems.