How the news talked about the Israel-Hamas conflict in its first month

Authors
Ryan McGrady
Published
December 12, 2023

In the time since October 7, 2023, a lot has been written not just about the conflict in Israel/Palestine but about the words we use to talk about the conflict. First there was press about the BBC's decision not to use the word terrorist to talk about Hamas; the BBC later modified its policy to use the word only when attributed to, e.g., the UK government. The use of terrorism became a flashpoint, with other organizations like the AFP putting out statements on the subject. Vox put together a glossary of sorts to explain some of the terms appearing in the media which may need more context, ranging from occupation to Zionism. Among the kinds of language Gal Beckerman takes issue with in his Atlantic article is the use of passive voice, using the example "were subsequently killed" in a description of the attacks on October 7. The radio program On the Media dedicated a segment to the use of 9/11 as an analogy. Time ran a story on the use of the word genocide. Last week, Press Play with Madeleine Brand covered the "fraught" usage of the word ceasefire.

Even the way the conflict is named is controversial. As Stephen Harrison pointed out in Slate, debates over how to name the Wikipedia article are active and complex, with potentially far-reaching consequences given the extent to which the internet relies on Wikipedia. Article titles on Wikipedia, in turn, follow the naming practices of the press and other “reliable sources,” so internal debates include copious links to news stories to highlight whether they use Israel-Hamas, Israel-Gaza, Israel-Palestine, or some other framing.

This is the kind of research Media Cloud is great for — analyses of trends in word use in news media. So let's look at some of these words and how they've been used in the first month of coverage of the conflict, October 7 - November 6.

Which sources are being measured?

Every Media Cloud query begins by specifying one or more sources or collections (groups of sources). Among the things I was curious to see was whether the use of words like terrorist and occupation varies based on the political leanings of a source, so I chose a set of five source collections divided by partisanship. They were developed by Faris et al. in 2019 to study partisanship during the 2020 US election, placing sources in one of five quintiles (Right, Center Right, Center, Center Left, and Left) depending on how often those sources were shared on Twitter by self-identified liberals/democrats and self-identified conservatives/republicans. Those which were shared mostly by liberals/democrats are in the "Left" quintile, those shared slightly more by liberals/democrats are considered "Center Left," and so on. Doing it this way gets around the messy subjectivities involved with ascribing political positions to sources based on source content, but also means that in every instance I use language like “sources on the left,” what I really mean is “sources much more likely to be shared by people on the left than people on the right.” Examples of sources in each collection are: Breitbart in the Right, RT in the Center Right, CBS News in the Center, The Atlantic in the Center Left, and Salon in the Left.

Because the method was based on web domains rather than discrete sources, there is some overlap where a source may have straddled buckets but had one subdomain in one bucket and another subdomain in another. Media Cloud typically considers subdomains to be part of the same site, so rather than decide between them we have included them in both quintiles (for example, The New York Post in both the Right and Center Right quintiles).

Total attention to the conflict

First, let's look at coverage of the conflict in general. The collections each include between 1,665 and 4,028 sources. For this base query, I combined them (a total of 12,560 unique sources). I searched for stories that included at least two of these terms: Israel*, Hamas, Gaza*, or Palestin*, with wildcards to capture both e.g. Palestine and Palestinian. The selected stories further had to include at least two mentions of one of these terms (so, for example, Israel twice plus a mention of Hamas). These kinds of requirements are intended to cut down on the amount of noise where, say, the conflict might be mentioned at the end of an unrelated story. Importantly, I only searched for stories in English for methodological simplicity, but the partisanship collections include a relatively small number of non-English sources. The total number of stories matching my query was 183,073.
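
To make that matching rule concrete, here is a minimal Python sketch of one way to read it. It is not Media Cloud's query syntax or implementation: the function name `matches_base_query` and the regular expressions are assumptions made for this example, and the real search runs against a full-text index rather than one story at a time.

```python
import re

# Hypothetical patterns standing in for the wildcard terms
# Israel*, Hamas, Gaza*, and Palestin* described in the text.
TERM_PATTERNS = {
    "israel*": re.compile(r"\bisrael\w*", re.IGNORECASE),
    "hamas": re.compile(r"\bhamas\b", re.IGNORECASE),
    "gaza*": re.compile(r"\bgaza\w*", re.IGNORECASE),
    "palestin*": re.compile(r"\bpalestin\w*", re.IGNORECASE),
}


def matches_base_query(text: str) -> bool:
    """True if at least two different terms appear and one of them appears twice or more."""
    counts = {name: len(p.findall(text)) for name, p in TERM_PATTERNS.items()}
    distinct_terms = sum(1 for c in counts.values() if c > 0)
    return distinct_terms >= 2 and max(counts.values()) >= 2
```

Under this reading, a story that mentions Israel twice and Hamas once matches, while a story that mentions Israel and Gaza once each in passing does not.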

Figure 1 shows the total attention to the conflict in its first month. The percentages shown measure the proportion of all stories published that day which matched my query. The coloring simply highlights where coverage was greatest (green) and smallest (red). In total, 9.57% of all stories published between October 7 and November 6 matched the query. It is worth noting that while the numbers may not sound like a lot, this is based on all stories published that day, including a large number of sources which do not typically publish anything about politics or international relations. Compared with the vast majority of subjects, even the low point, 6.38% on November 6, is still very high. For example, total attention to Taylor Swift, who released The Eras Tour in theaters in the same period, was only 0.48%.

Figure 1: Total attention over time

Next, let's break those attention figures down by partisanship. Figure 2 includes five numbers under each day, representing the percentage of stories published in each collection that were about the conflict. Interestingly, while sources on the left, center-left, and center devoted similar levels of attention to the subject, it was somewhat higher among center-right sources and significantly higher among right-wing sources. This was true throughout the time period, often by a factor of two or more.

Figure 2 - Total attention by partisanship collection (L=Left, CL=Center Left, C=Center, CR=Center Right, R=Right)

Naming the conflict

As mentioned above, even the names of the parties in conflict are up for debate, so I searched for uses of three names: Israel-Hamas, Israel-Gaza, and Israel-Palestine. Each includes all variations (Israel, Israeli, Gaza, Gazan, Palestine, and Palestinian, with or without a hyphen). Though Israel is typically listed first in the names, that in itself is a choice and thus I also looked for the same pairs with the names reversed. I suspected that the Israel-Palestine query would dwarf the others since it could be used to refer not just to recent events but also to the larger, decades-long conflict. To my surprise, Israel-Hamas was used much more frequently than either of the others. Curious whether the naming evolved over time, I plotted the proportion of stories matching my base query that used each name on each day (Figure 3). It shows that while all coverage of the conflict decreased over time, there was not much change in the relative proportions of parties named following the shift from Israel-Palestine on October 7 and 8 to Israel-Hamas for the rest of the timeline.

Figure 3 - Conflict naming over time
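
The name search described above can be approximated with a few regular expressions. The sketch below is only an illustration under stated assumptions: the function `count_conflict_names` and its patterns are hypothetical, not the queries actually run in Media Cloud. It counts each pairing in either order, joined by a hyphen or a space.

```python
import re

# Hypothetical patterns; the second party decides which name a match counts toward.
ISRAEL = r"israeli?"
SECOND_PARTY = {
    "Israel-Hamas": r"hamas",
    "Israel-Gaza": r"gazan?",
    "Israel-Palestine": r"palestin(?:e|ian)",
}


def count_conflict_names(text: str) -> dict:
    """Count each pairing in either order, joined by a hyphen or whitespace."""
    counts = {}
    for name, party in SECOND_PARTY.items():
        pattern = re.compile(
            rf"\b(?:{ISRAEL}[\s\-]{party}|{party}[\s\-]{ISRAEL})\b",
            re.IGNORECASE,
        )
        counts[name] = len(pattern.findall(text))
    return counts
```

A pattern this loose will also catch accidental adjacency (for example, "in Israel Hamas attacked"), so counts from a sketch like this should be treated as approximate.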

One of the things I'd like to do in a follow-up is to find which word is used most frequently following these pairs (war, conflict, etc.). Based on a preliminary look at the most commonly used bigrams and trigrams in this data, war became most common as soon as the Israeli government formally declared war.

Use of specific terms

That brings me to the use of language within the coverage. I extracted the top 1,000 most frequently used words for each day's stories and checked their use over time. The following figures illustrate changes in the use of certain words over time. The numbers you see reflect how many times, on average, the word appears in a story about the conflict on that day. So, for example, among the 5,512 total stories about the conflict on October 24, hostages appeared 2,763 times, leading to a word frequency of 0.501. Often, these peaks in word mentions correspond to specific events of the day, such as hostages being released in this example. 
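
The arithmetic behind that word frequency is a simple ratio. A minimal sketch follows; the function name `word_frequency` is an assumption for illustration, and the inputs are the October 24 figures quoted above.

```python
def word_frequency(total_mentions: int, total_stories: int) -> float:
    """Average number of times a word appears per matching story on a given day."""
    if total_stories <= 0:
        raise ValueError("total_stories must be positive")
    return total_mentions / total_stories


# Figures quoted in the text for "hostages" on October 24.
print(round(word_frequency(2763, 5512), 3))  # 0.501
```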

There are so many words and phrases worth looking at for these figures, but I'm going to limit it to some of the most charged terms and those associated with clear narratives: terror* (terror, terrorism, terrorist), militant, occup* (occupation, occupied, occupier), settle* (settlement, settler), genocide, hostage, cease* (cease, ceasefire, cease-fire), and protest* (protest, protestor, protester). These include all of the singular and plural versions that appear among the top 1,000 words on any day. Missing values do not necessarily mean the term wasn’t used – only that it was not in the thousand most used words that day.
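
One way to picture the grouping step is to sum the daily frequencies of every top word that begins with a given stem. The sketch below assumes a hypothetical input: a dictionary mapping each of a day's top words to its per-story frequency. The names `PREFIXES` and `group_frequencies` are illustrative and not part of Media Cloud.

```python
from typing import Dict, Optional

# Stems standing in for the word groups named in the text.
PREFIXES = ["terror", "militant", "occup", "settle", "genocide", "hostage", "cease", "protest"]


def group_frequencies(top_words: Dict[str, float]) -> Dict[str, Optional[float]]:
    """Sum per-story frequencies of every top word that starts with each stem.

    A group is None when no matching word made the day's top-words list,
    which is not the same as the word never being used.
    """
    grouped: Dict[str, Optional[float]] = {}
    for prefix in PREFIXES:
        matches = [freq for word, freq in top_words.items() if word.lower().startswith(prefix)]
        grouped[prefix] = sum(matches) if matches else None
    return grouped
```

Plain prefix matching can pull in unrelated words (for example, "protestant" under the protest stem), so a real analysis would want to review the matched words rather than trust the stems blindly.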

Figure 4 compares the frequency of terror* and militant over time, showing that terror* was used frequently in the first week after October 7, and spiked again around October 25, when 62% of stories about the conflict used the term. A cursory look at some of the sources which use it that day did not reveal a single obvious cause, although the release of hostages may have led to descriptions of the events when the hostages were taken. Use of militant was nearly as common as terror* on October 7 (militant appeared in 51% of stories vs. terror* in 56%), but fell off significantly after the first few days, perhaps driven by criticism of the BBC's initial preference for militant over terrorist when describing Hamas.

Figure 4 - Frequency of the use of terror* (terror, terrorism, terrorist) and militant* over time

Figure 5 tracks the frequency of three terms often used to contextualize the conflict or to criticize the Israeli government’s military response. Occup* and settle* both appeared in more than 10% of stories on the first day of coverage, then dipped in frequency before gradually increasing over the rest of the period. Genocide was not among the 1,000 most frequently used words at first, but appeared in 13% of stories by the last day. These trends may be due to the increased proportion of coverage focused on Israel's response to the October 7 attacks and local conflicts taking place in and around settlements.

Figure 5 - Frequency of the use of occup* (occupation, occupied, occupier), settle* (settlement, settler), and genocide over time

I expected coverage of the hostages to be proportionally greater in the first week of the conflict, but use of the word peaked about two-and-a-half weeks in, as shown in Figure 6. This is likely due to stories about hostage releases, disputes over hostage posters, and the Israeli government's discussion of hostages in its public statements about its military response.

Figure 6 - Frequency of the use of hostage over time

The last pair of words (Figure 7) stand in for two narratives covering the response to the conflict: cease*, encompassing discussion of and calls for a cease-fire, and protest*, following stories that mention public demonstrations objecting to one side or the other. As the chart shows, while there was considerable coverage of related protests in the first two weeks after October 7 (peaking at 25% of stories about the conflict on October 18), it decreased considerably in the last days of the period (just 2% on November 6), while stories mentioning a cease-fire steadily increased from 0-2% in the first few days to as high as 35% on November 4.

Figure 7 - Frequency of the use of cease* (cease, cease fire, cease-fire) and protest* (protest, protester, protestor) over time

Each of the charts so far uses a different scale to highlight differences. Figure 8 combines all of these terms to better put in perspective that, for example, while genocide became more prominent towards the end of the period, it was still used much less frequently than hostage.

Figure 8 - Frequency of all keywords over time

There are, of course, many ways to interpret these trends. Some can be easily connected to particular news stories or developments. Others may be skewed by language used in, for example, a widely republished Associated Press report. 

The table below (Figure 9) also looks at word frequency, but instead of considering their use over time, it compares the frequency with which they are used in our partisanship collections. Each line of the table is colored individually to highlight the differences, but bear in mind that this doesn't always mean there's a very large difference – the colors are there to draw attention to the numbers rather than the other way around.

Figure 9 - Comparison of word frequency between partisanship collections

Some observations:

  • Right-wing publications were much more likely to use the word terror* and much less likely to use the term militant.
  • On the use of the terms occup* and settle*, right-wing sources were more aligned with the center in their relative avoidance. This is a relatively uncommon overlap in research on politically charged subjects.
  • Genocide was used almost equally by the right and left, and much less frequently by the center. This may be linked to the way traditional journalistic standards tend to lead publications to avoid using charged language (indeed, the center is colored red for most of these terms). This may also be why the center, and not the left, was the most likely to use the word militant (leading to debates over whether terrorist is a charged, subjective term or something with a concrete definition that journalists should apply).
  • While the right was more likely to mention hostages, this is an example where the colors are misleading — there's very little difference across the spectrum. Similarly on the subject of a cease-fire, the left uses the term more often, but the difference is not particularly great.
  • The right was more likely to mention protests, which may be rooted in stories criticizing certain elements of pro-Palestinian or anti-Israel demonstrations.

What else?

There is so much to dig into here, but I'll outline a few things I'd like to look at more in the future:

  • There are many more words worth exploring than I can reasonably include here, as well as bigrams and trigrams
  • Which people and organizations are mentioned most frequently
  • Use of active vs. passive voice
  • Use of popular analogies and metaphors: 9/11, ISIS, "open-air prison," etc.
  • Mis- and disinformation narratives
  • How often social media posts are referenced

In this post I have tried to err on the side of caution when it comes to interpretation and analysis, mostly presenting the data as-is. I am curious to hear what others might learn from it, and would love to get your ideas about what would be useful to look into. If you’d like to talk more or have an idea for collaboration, send an email to ryan@mediacloud.org.
