2.3. STYLE FEATURES 17
appeal and persuade a wide scope of consumers that is not seen in true news articles. Style
approaches try to detect fake news by capturing the manipulation in the writing style of the
news content. There are mainly three typical categories of style-based methods: Deception Styles,
Clickbaity Styles, and News Quality Styles.
2.3.1 DECEPTION STYLES
The motivation of deception detection originates from forensic psychology (i.e., the Undeutsch
Hypothesis) [156], and various forensic tools, including Criteria-based Content Analysis [160] and
Scientific-based Content Analysis [77], have been developed. More recently, advanced natural
language processing models have been applied to spot deception phases from the following
perspectives: deep syntax and rhetorical structure. Deep syntax models have been implemented using
probabilistic context-free grammars (PCFG), with which sentences can be transformed into rules
that describe the syntax structure. Based on the PCFG, different rules can be developed for
deception detection, such as unlexicalized/lexicalized production rules and grandparent rules [40].
Rhetorical structure theory can be utilized to capture the differences between deceptive and
truthful sentences [118]. Moreover, other features can be specifically designed to capture the
deceptive cues in writing styles to differentiate fake news, such as lying-detection features [2].
2.3.2 CLICKBAITY STYLES
Since fake news pieces are intentionally created for financial or political gain rather than for
objective claims, they often contain opinionated and inflammatory language, crafted as “clickbait”
(i.e., to entice users to click on the link to read the full article) or to incite confusion [25]. Thus,
it is reasonable to exploit linguistic features that capture the different writing styles and
sensational headlines to detect fake news [137]. Biyani et al. [16] studied the characteristics of page
“clickbaits,” whose news headlines were more interesting or appealing than the actual article.
We introduce the following Clickbaity Style features.
Content: Content features quantify particular content types and formatting, such as
superlatives (adjectives and adverbs), quotes, exclamations, uppercase letters, and
questions. Table 2.1 shows the content features.
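As an illustration, several of the content features can be computed from a headline with plain string operations. The sketch below is a rough approximation under stated assumptions: the function name content_features and the tokenization are hypothetical, single-letter words such as "I" and "A" are not counted as upper case words, and features that need a part-of-speech tagger or a sentiment lexicon (IsSuperlative, Is,NumQuote, Is,NumNeg) are omitted.

```python
import re

FIVE_W_ONE_H = {"what", "why", "when", "who", "which", "how"}


def content_features(title):
    """Approximate a subset of the content features for one headline."""
    words = re.findall(r"[A-Za-z0-9']+", title)
    # Assumption: an upper case word is alphabetic, fully upper case,
    # and at least two characters long (so "I" and "A" are ignored).
    upper = [w for w in words if w.isalpha() and w.isupper() and len(w) >= 2]
    starts_with_number = bool(re.match(r"\s*\d", title))
    num_exclm = title.count("!")
    num_ques = title.count("?")
    return {
        "NumWords": len(words),
        "NumCap": sum(len(w) >= 5 for w in upper),     # excludes acronyms
        "NumAcronym": sum(len(w) < 5 for w in upper),  # short upper case words
        "IsExclm": num_exclm > 0,
        "NumExclm": num_exclm,
        "IsQues": num_ques > 0,
        "NumQues": num_ques,
        "IsStartNum": starts_with_number,
        # Set only if the title does not start with a number.
        "HasNumber": (not starts_with_number) and bool(re.search(r"\d", title)),
        "IsStart5W1H": bool(words) and words[0].lower() in FIVE_W_ONE_H,
    }


print(content_features("10 SHOCKING Facts the FBI Won't Tell You!"))
print(content_features("Why did 3 senators resign?"))
```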
Informality: Fake news, like clickbait, is often sensational, provocative, and gossip-like.
Therefore, its language tends to be less formal than that of professionally written
news articles. Thus, Biyani et al. use the following scores to indicate the readability/informality
level of a text.
Coleman–Liau score (CLScore): computed as $0.0588L - 0.296S - 15.8$, where $L$ is the
average number of letters and $S$ is the average number of sentences per 100 words.
18 2. WHAT NEWS CONTENT TELLS
Table 2.1: Description of content features. Feature types “N” and “B” denote numeric and
binary, respectively.

| Feature | Description | Type |
|---|---|---|
| NumWords | Number of words | N |
| NumCap | Number of upper case words (excluding acronyms: words with less than five characters) | N |
| NumAcronym | Number of acronyms (upper case words with less than five characters) | N |
| Is,NumExclm | Presence/Number of exclamation marks | B/N |
| Is,NumQues | Presence/Number of question marks | B/N |
| IsStartNum | Whether the title starts with a number | B |
| HasNumber | Whether the title contains a number (set only if the title doesn’t start with a number) | B |
| IsSuperlative | Presence of superlative adverbs and adjectives (POS tags RBS, JJS) | B |
| Is,NumQuote | Presence/Number of quoted words (used “, ’, ‘; excluded ’m, ’re, ’ve, ’d, ’s, s’) | B/N |
| IsStart5W1H | Whether the title starts with 5W1H words (what, why, when, who, which, how) | B |
| Is,NumNeg | Presence/Number of negative sentiment words | B/N |
RIX and LIX indices (Anderson 1983 [3]): computed as $\mathrm{RIX} = LW/S$ and
$\mathrm{LIX} = W/S + (100 \cdot LW)/W$, where $W$ is the number of words, $LW$ is the number of
long words (7 or more characters), and $S$ is the number of sentences.
Formality measure (fmeasure) (Heylighen and Dewaele 1999 [51]): the score is used
to calculate the degree of formality of a text by measuring the amount of different
part-of-speech tags in it. It is computed as
$(\mathit{nounfreq} + \mathit{adjectivefreq} + \mathit{prepositionfreq} + \mathit{articlefreq} - \mathit{pronounfreq} - \mathit{verbfreq} - \mathit{adverbfreq} - \mathit{interjectionfreq} + 100)/2$.
The combination of content and informality features is then formalized as the clickbaity
style features.
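The informality scores above can be sketched in a few lines. The following is a minimal illustration, assuming non-empty English text, a naive regular-expression tokenizer, and sentence splitting on ".", "!" and "?"; the function names are hypothetical, and the part-of-speech frequencies passed to fmeasure are assumed to be percentages of all words obtained from an external tagger.

```python
import re


def text_stats(text):
    """Return (words, sentences, letters, long words) using naive splitting."""
    words = re.findall(r"[A-Za-z]+", text)
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    letters = sum(len(w) for w in words)
    long_words = sum(len(w) >= 7 for w in words)  # 7 or more characters
    return len(words), len(sentences), letters, long_words


def coleman_liau(text):
    W, S, letters, _ = text_stats(text)
    L = 100.0 * letters / W   # average letters per 100 words
    S100 = 100.0 * S / W      # average sentences per 100 words
    return 0.0588 * L - 0.296 * S100 - 15.8


def rix(text):
    _, S, _, LW = text_stats(text)
    return LW / S


def lix(text):
    W, S, _, LW = text_stats(text)
    return W / S + 100.0 * LW / W


def fmeasure(freq):
    """freq maps POS names to their frequency (percent of all words)."""
    formal = freq["noun"] + freq["adjective"] + freq["preposition"] + freq["article"]
    informal = freq["pronoun"] + freq["verb"] + freq["adverb"] + freq["interjection"]
    return (formal - informal + 100.0) / 2.0


sample = "The cat sat. The enormous elephant wandered away quietly!"
print(coleman_liau(sample), rix(sample), lix(sample))
print(fmeasure({"noun": 30, "adjective": 10, "preposition": 12, "article": 8,
                "pronoun": 5, "verb": 20, "adverb": 5, "interjection": 0}))
```

On short texts such as headlines these scores are noisy, so they are better read as relative indicators than as absolute readability levels.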
2.3.3 NEWS QUALITY STYLES
News quality is a comprehensive indicator used to measure the readability, the amount of in-
formation, and the writing formality of news. Real news that is published by a professional
journalist or news agency is usually of high quality, whereas fake news that aims
to mislead readers tends to be of poor quality.
Yang et al. [176] proposed eight types of linguistic features based on news writing
guidelines to represent writing style, and studied the correlation between the writing style and news
quality. They conducted research on eight aspects: readability, credibility, interactivity,
sensation, logic, formality, interestingness, and structural integrity (see Table 2.2).
Table 2.2: Description of news quality style features

| Categories | Features | Description |
|---|---|---|
| Readability | Sentence_broken, Characters, Words, Sentences, Clauses, Average_word_length, Professional_words, RIX, LIX, LW | Measuring the clarity and legibility of the news |
| Credibility | #@, Numerals, Official_speech, Time, Place, Object, Uncertainty, Image | Measuring the rigor and reliability of the news |
| Interactivity | Question_mark, First_pron, Second_pron, Interrogative_pron | Measure the interactivity between the news and the reader |
| Sensation | Sentiment_score, Adv_of_degree, Modal_particle, First_pronoun, Second_pronoun, Exclamation_mark, Question_mark | Measure the impression that the news leaves on the reader |
| Logic | Forward reference, Conjunctions | Determining whether the news is logical and contextually coherent or not |
| Formality | Noun, Adj, Prep, Pron, Verb, Adv, Sentence_broken | Measuring the formality of news language |
| Interestingness | Rhetoric, Exclamation mark, Face, Idiom, Adv, Adj | Measuring the interestingness of news language |
| Structural Integrity | HasHead, HasImage, HasVideo, HasTag, HasAt, HasUrl | Measuring the structural integrity of news |
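As a small illustration of how one category in Table 2.2 might be turned into numbers, the sketch below counts the four Interactivity features for an English text. The exact feature definitions of [176] are not given here, so the pronoun lists, the tokenization, and the function name interactivity_features are all assumptions made for this example.

```python
import re

# Hypothetical English word lists; the lexicons used in [176] may differ.
FIRST_PRON = {"i", "me", "my", "mine", "we", "us", "our", "ours"}
SECOND_PRON = {"you", "your", "yours"}
INTERROGATIVE_PRON = {"who", "whom", "whose", "what", "which"}


def interactivity_features(text):
    """Count question marks and pronoun types as rough interactivity cues."""
    tokens = re.findall(r"[a-z']+", text.lower())
    return {
        "Question_mark": text.count("?"),
        "First_pron": sum(t in FIRST_PRON for t in tokens),
        "Second_pron": sum(t in SECOND_PRON for t in tokens),
        "Interrogative_pron": sum(t in INTERROGATIVE_PRON for t in tokens),
    }


print(interactivity_features("What do you think? We want your opinion!"))
```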
2.4 KNOWLEDGE-BASED METHODS
Since fake news attempts to spread false claims in news content, one of the most straightforward
means of detecting it is to check the truthfulness of major claims in a news article to decide the