Using open source to analyse breaking news events

When news breaks, misinformation can spread like wildfire. Stefanie Le, a senior CIR investigator based in Washington DC, discusses the attempted assassination of former US President Donald Trump, the online misinformation that surrounded it, and how to use open source methods to try to make sense of it all.

Photo by Pinho via Unsplash

Last month, former US President Donald Trump was shot and injured during a campaign rally in Pennsylvania by a gunman positioned on a nearby roof. CIR Senior Investigator Stefanie Le was watching the incident unfold on the news and her social media feeds from Washington DC.

“In this day and age, if something’s happening, people have their phones out and they’re posting it,” Le says. “I saw the news of the Trump rally shooting and immediately knew that the internet was going to be ablaze with rumours.”

It wasn’t long before conspiracy theories started to fly. Photographs of the former US president’s raised fist and defiant stance as he was ushered away fed false narratives that the event was staged, Le explains. Fake images also emerged online, edited to make it appear that Trump and US Secret Service agents were smiling as the incident unfolded.

The FBI eventually named the attacker as 20-year-old Thomas Matthew Crooks, but somewhat unusually, Crooks left few digital traces online – there was no social media profile, manifesto, or evidence of extremist views, Le explains.

With so little information about the shooter and his motivations, fake profiles and manifestos started appearing on social media and gaming platforms. According to Le, when there’s an information vacuum, actors will fill that space with conspiracy theories.

 

Widespread misinformation

Social media allows us to access news in real time – both from journalists on the ground and from eyewitnesses.

But the prevalence of misinformation on platforms such as X means we must approach our news feeds with increasing scrutiny: is this image new or old? Has it been edited? Who is sharing it?

The assassination attempt on Trump saw several newsrooms, from BBC Verify to Reuters, use open source methods to debunk the various false narratives circulating online.

But Le points out that individuals too can strengthen their media literacy skills and become more attuned to the tell-tale signs of mis- and disinformation. Below, she breaks down the key steps to assessing an incident using open source methods.

 

Data collection

“In a breaking news situation, the first step is to collect as much data as you can,” says Le. “That means all social media posts you could find across all platforms posting from the location.”

Various tools and techniques exist to speed up the process and narrow down your search:

Boolean search operators

  • Boolean search operators are logical connectors that widen or narrow a search to surface the most relevant posts, for example: “AND”, “OR”, and “NOT”.
  • These can be used on social media platforms to find posts relevant to the specific event, as in the example below.
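A query built from these operators might look like the following. This is a minimal sketch in Python purely to show the operator logic; the keywords are illustrative, and each string would be pasted into a platform’s search bar.

```python
# Illustrative Boolean queries for a breaking news event (keywords are hypothetical).
# AND narrows a search, OR widens it, and NOT (or a leading minus) excludes terms.
queries = [
    '"Trump rally" AND (shooting OR gunman)',         # both topics must appear
    '(Butler OR Pennsylvania) AND rally NOT parody',  # exclude obvious joke posts
]

for query in queries:
    print(query)  # paste each query into the platform's search bar
```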

X advanced search

  • Similar to how Boolean search operators work, X advanced search (formerly Twitter advanced search) is a free tool for refining a search on the social media platform.
  • Using the advanced search function allows you to search the entire platform and filter posts using specific keywords and operators, as in the example below.
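For example, the operators below (supported by X search at the time of writing, though syntax can change) bound a search by date and restrict it to posts with media. The keywords and dates are illustrative only.

```python
# Illustrative X (formerly Twitter) search queries; keywords and dates are hypothetical.
# The same filters can be set in the Advanced Search form or typed into the search bar.
x_queries = [
    '"Trump rally" (shooting OR gunman) since:2024-07-13 until:2024-07-15',  # date-bounded keywords
    '"Butler Pennsylvania" filter:media -filter:retweets lang:en',           # original posts with images or video
]
```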

‘Google dorking’

  • “Google dorking” refers to the use of advanced search operators to access information that is not easily findable using standard search queries.
  • One form of “Google dorking” is a site search, where you type “site:” followed by a domain and then a keyword, as shown below.
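For instance, the queries below use the “site:”, “intitle:” and “filetype:” operators; the domains and keywords are illustrative.

```python
# Illustrative Google "dork" queries; domains and keywords are hypothetical.
dorks = [
    'site:reddit.com "Trump rally" shooting',  # site: restricts results to one domain
    'intitle:"rally shooting" filetype:pdf',   # intitle: and filetype: narrow by page title and file type
]
```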

Live data gathering

Eyewitnesses may upload one or two images or videos of an incident to their social media profiles, but will often have other useful visuals they haven’t shared online.

To access more user-generated content (UGC) from a particular event, Le says she often reaches out to individual posters on social media to ask them to share more material. This may include asking if they’d be willing to share their perspective of an event, or the metadata of their posts to pinpoint the time and location of the material.

When investigating the Seoul crowd crush in 2022, Le recalls reaching out to users on Twitch, an interactive live streaming platform that was popular in South Korea at the time. She notes the importance of doing this transparently and with sensitivity around what the poster may have experienced. “We reached out to them, identified ourselves and said ‘I hope you’re okay, and would you mind sharing both what you’ve taken and what was your experience of the event?’ I think around 75 percent of people were willing to talk to us and share their material with us.”

 

Archiving information

On social media, users frequently remove or edit posts, and in some cases social media platforms will remove content if they believe it’s offensive or graphic.

Le points out the importance of archiving material you come across as you collect data. At CIR, we archive all of our open source data using an auto-archiver to ensure data is preserved in its original form, even if it is altered or deleted online.
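The specifics of CIR’s archiving pipeline aren’t covered here, but the basic idea can be sketched in a few lines, for instance by submitting a URL to the Internet Archive’s Wayback Machine. This is a minimal, hypothetical sketch rather than CIR’s actual tooling, and anonymous captures are rate-limited.

```python
import requests


def archive_url(url: str) -> str:
    """Ask the Wayback Machine to capture a page and return the snapshot URL.

    A real archiving pipeline would also save the media files themselves,
    take screenshots, and record hashes so the material can be verified later.
    """
    response = requests.get(f"https://web.archive.org/save/{url}", timeout=120)
    response.raise_for_status()
    # The Wayback Machine redirects to the newly created snapshot.
    return response.url


if __name__ == "__main__":
    print(archive_url("https://example.com/eyewitness-post"))  # hypothetical post URL
```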

Providing long-term access to the material you are analysing is particularly important if investigating potential human rights abuses, where open source evidence could be used in future prosecutions.

 

Source legitimacy

In a 24-hour news cycle, where both individuals and news outlets are rushing to break the news first, it’s important to pay attention to not only what is being posted online, but who is posting it.

“Anyone who’s at an event that you’re researching can post something,” says Le. “What you want to do is assess whether they are a real person who regularly posts reputable material, or whether they post online just to cause a lot of commotion and go viral.”

Only when you’ve checked that your sources are trustworthy and legitimate can you move forward with your investigation.

Le provides four tell-tale signs that something might be amiss:

  • Username: Is it a username that seems “human” (e.g. without a lot of random letters and numbers)?
  • Posting history: Have they posted their political views? Is there a demonstrated bias in their posts that sheds light on their motivations, or indicates that they’re untrustworthy?
  • Rate of posts: If someone is posting 100 times a day, or at a higher post rate than humanly possible, it could be an indication that it’s a bot.
  • Followers: Be wary of accounts that have very few followers or that regularly post inaccurate news.
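No single signal is conclusive, but the checks above can be combined into a rough first-pass screen before doing manual verification. The sketch below scores an account on three of these heuristics; the thresholds and field names are illustrative, not a CIR standard, and posting history still needs to be reviewed by hand.

```python
import re
from dataclasses import dataclass


@dataclass
class Account:
    username: str
    posts_per_day: float
    followers: int


def suspicion_score(account: Account) -> int:
    """Return a rough 0-3 score; higher means the account deserves closer scrutiny."""
    score = 0
    # Usernames ending in long runs of digits often belong to throwaway or bot accounts.
    if re.search(r"\d{4,}$", account.username):
        score += 1
    # Posting far more often than a person plausibly could suggests automation.
    if account.posts_per_day > 100:
        score += 1
    # A tiny audience proves nothing by itself, but it is a reason to look harder.
    if account.followers < 50:
        score += 1
    return score


print(suspicion_score(Account("newsfan19482734", posts_per_day=250, followers=12)))  # -> 3
```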

“I definitely lean towards looking for respected reporters or news organisations against someone who you would need to do the groundwork on for a verification,” Le adds.

 

Building a timeline of events

To start figuring out a timeline of an event, you can look at timestamps of posts online and piece together what occurred when. Photographs from the scene may enable you to understand the physical space where an event took place and how an incident unfolded. Images or videos with recognisable landmarks or features can be cross-checked with satellite imagery and geolocated.
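In practice, a first pass can be as simple as normalising every timestamp you have collected to one time zone and sorting. The sketch below uses invented post data purely to show the idea; real timestamps come from platform metadata and should be treated with caution, since upload time is not always capture time.

```python
from datetime import datetime, timezone

# Invented example data standing in for collected and archived posts.
posts = [
    {"source": "X",        "time": "2024-07-13T22:11:33+00:00", "note": "crowd reacting"},
    {"source": "Facebook", "time": "2024-07-13T22:09:10+00:00", "note": "view of the nearby roof"},
    {"source": "TikTok",   "time": "2024-07-13T22:14:02+00:00", "note": "security response"},
]

# Normalise everything to UTC and sort chronologically.
timeline = sorted(posts, key=lambda p: datetime.fromisoformat(p["time"]).astimezone(timezone.utc))
for post in timeline:
    print(post["time"], post["source"], "-", post["note"])
```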

Le again applies this to the rally shooting example: “Looking at the videos and pictures taken from the Trump rally shooting, you can start to see where the shooter was, what angle the shooter aimed from, where the Secret Service was placed, where they were looking, and the crowd relative to Trump, and really start building that out.”

 

Filling in the gaps

As an open source investigator, it’s crucial to critically analyse the information at hand by asking yourself “what isn’t adding up?”, says Le.

“Once you’ve gotten enough user-generated content and verified enough to know what was actually taken at the scene, your source analysis separating what you can trust versus not, then you can move forward in your investigation.”

She advises starting by comparing the narrative reported by official sources, such as police or government reports, to the visual evidence available online, and looking for discrepancies. These inconsistencies may be key indicators of misinformation, bias, or overlooked details, which you can then investigate further.

Identifying such discrepancies can form the foundation of your investigation, guiding you to dig deeper, gather more evidence, and possibly uncover the truth behind a narrative, Le explains.

The steps compiled above are merely the starting point of the open source process – a simple approach to apply to breaking news events that are shrouded in misinformation and competing narratives in a 24-hour news cycle.

What Le makes clear is that approaching breaking news events with meticulous scrutiny from the get-go will help you to build a robust and credible investigation.

 

