sentences around sentence $s_i$. We can also introduce an attention mechanism to learn weights that measure the importance of each sentence, so that the news article vector $\mathbf{a}$ is computed as follows:

$$\mathbf{a} = \sum_{i=1}^{N} \alpha_i \mathbf{h}_i, \tag{2.10}$$
where $\alpha_i$ measures the importance of the $i$-th sentence for the news piece $a$, and $\alpha_i$ is calculated as follows:

$$\mathbf{o}_i = \tanh(\mathbf{W}_s \mathbf{h}_i + \mathbf{b}_s),$$
$$\alpha_i = \frac{\exp(\mathbf{o}_i \mathbf{o}_s^{\top})}{\sum_{k=1}^{N} \exp(\mathbf{o}_k \mathbf{o}_s^{\top})}, \tag{2.11}$$
where $\mathbf{o}_i$ is a hidden representation of $\mathbf{h}_i$, obtained by feeding the hidden state $\mathbf{h}_i$ to a fully connected layer, and $\mathbf{o}_s$ is the weight parameter that represents the sentence-level context vector.
2.2 VISUAL FEATURES
Visual cues have been shown to be an important manipulator for fake news propaganda.¹ As we have described, fake news exploits the individual vulnerabilities of people and thus often relies on sensational or even fake images to provoke anger or other emotional responses from consumers. Visual features are extracted from visual elements (e.g., images and videos) to capture the distinct characteristics of fake news. Visual features are generally categorized into three types [21]: Visual Statistical Features, Visual Content Features, and Neural Visual Features.

¹ https://www.wired.com/2016/12/photos-fuel-spread-fake-news/
2.2.1 VISUAL STATISTICAL FEATURES
Visual statistical features represent the statistics attached to fake/real news pieces. Some representative visual statistical features include the following [60]; a short code sketch follows the list.

Count: the occurrence of images in fake news pieces, measured by the total number of images in a news event and the ratio of news posts containing at least one, or more than one, image.

Popularity: the popularity of images, indicated by the number of times they are shared on social media.

Image type: some images have a particular type of resolution or style. For example, long images are images with a very large length-to-width ratio. The ratio of such image types is also counted as a statistical feature.
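As an illustration of how these statistics might be assembled, the sketch below assumes a hypothetical data layout in which each post is a dict carrying an "images" list of (width, height) pairs and a "shares" count; the field names and the long-image threshold are assumptions for illustration, not from the original work.

```python
def visual_statistical_features(posts, long_ratio=3.0):
    """Count-, popularity-, and type-based image statistics for a news event.

    posts : non-empty list of dicts; each post is assumed (hypothetically)
            to carry 'images' -> list of (width, height) tuples and
            'shares' -> int.
    """
    images = [img for p in posts for img in p["images"]]
    n_img = len(images)
    return {
        # count features
        "total_images": n_img,
        "ratio_with_image": sum(len(p["images"]) >= 1 for p in posts) / len(posts),
        "ratio_multi_image": sum(len(p["images"]) > 1 for p in posts) / len(posts),
        # popularity: average shares of posts that contain images
        "avg_shares_with_image": (
            sum(p["shares"] for p in posts if p["images"])
            / max(1, sum(1 for p in posts if p["images"]))
        ),
        # image type: ratio of "long" images (large length-to-width ratio)
        "ratio_long_images": (
            sum(max(w, h) / min(w, h) >= long_ratio for (w, h) in images) / n_img
            if n_img else 0.0
        ),
    }
```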
2.2.2 VISUAL CONTENT FEATURES
Research [60] has shown that image contents in fake news and real news have different characteristics. The representative visual content features are detailed as follows.
Visual Clarity Score (VCS): Measures the distribution difference between two image sets: one is the image set in a certain news event (event set) and the other is the image set containing images from all events (collection set). The visual clarity score is measured as the Kullback–Leibler divergence between two language models representing the event image set and the collection image set, respectively. A bag-of-words image representation, built from features such as SIFT [82] or SURF [11], can be used to define language models for images. Specifically, let $p(w|c)$ and $p(w|k)$ denote the term frequency of visual word $w$ in the collection set and the event set, respectively; the visual clarity score is then defined as

$$\mathrm{VCS} = D_{\mathrm{KL}}\bigl(p(w|c)\,\|\,p(w|k)\bigr). \tag{2.12}$$
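A minimal sketch of Eq. (2.12), assuming the two language models are given as visual-word count vectors over a shared vocabulary (e.g., histograms of quantized SIFT/SURF descriptors); the smoothing constant is an added assumption to keep the divergence finite.

```python
import numpy as np

def visual_clarity_score(event_counts, collection_counts, eps=1e-12):
    """D_KL(p(w|c) || p(w|k)) between two visual-word language models, Eq. (2.12).

    event_counts, collection_counts : visual-word counts over a shared
    vocabulary; eps is an assumed smoothing term to avoid log(0).
    """
    p_c = np.asarray(collection_counts, dtype=float) + eps   # p(w|c)
    p_k = np.asarray(event_counts, dtype=float) + eps        # p(w|k)
    p_c /= p_c.sum()
    p_k /= p_k.sum()
    return float(np.sum(p_c * np.log(p_c / p_k)))
```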
Visual Coherence Score (VCoS): Measures the coherence degree of images in a certain news event. This feature is computed based on the visual similarity among images and quantitatively reflects the relevance relations of images in news events. More specifically, the average similarity score over every pair of images $i_j$ and $i_k$ is computed as the coherence score as follows:

$$\mathrm{VCoS} = \frac{1}{M(M-1)} \sum_{j,k=1,\dots,M;\ j \neq k} \mathrm{sim}(i_j, i_k). \tag{2.13}$$
Here $M$ is the number of images in the event set and $\mathrm{sim}(i_j, i_k)$ is the visual similarity between image $i_j$ and image $i_k$. In implementation, the similarity between image pairs is calculated based on their GIST [101] feature representations.
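Eq. (2.13) reduces to averaging the off-diagonal entries of a precomputed similarity matrix; a sketch, assuming $\mathbf{S}$ holds, e.g., cosine similarities between GIST descriptors:

```python
import numpy as np

def visual_coherence_score(S):
    """Average pairwise similarity over all ordered pairs j != k, Eq. (2.13).

    S : (M, M) similarity matrix; only off-diagonal entries contribute.
    """
    M = S.shape[0]
    off_diag_sum = S.sum() - np.trace(S)   # exclude self-similarities
    return off_diag_sum / (M * (M - 1))
```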
Visual Similarity Distribution Histogram (VSDH): Describes inter-image similarity at a fine level of granularity. It evaluates the image distribution with a set of values by quantizing the similarity matrix of all image pairs in an event. The visual similarity matrix $\mathbf{S}$ is obtained by calculating the pairwise image similarity in a news event, again based on GIST [101] feature representations. The similarity matrix $\mathbf{S}$ is then quantized into an $H$-bin histogram by mapping each element in the matrix into its corresponding bin, which results in a feature vector of $H$ dimensions representing the similarity relations among images:

$$\mathrm{VSDH}(h) = \frac{1}{M^2}\,\bigl|\{(j,k) \mid j,k \le M,\ \mathbf{S}_{j,k} \in h\text{-th bin}\}\bigr|, \quad h = 1,\dots,H. \tag{2.14}$$
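A sketch of Eq. (2.14), assuming similarity values lie in $[0, 1]$ and that the $H$ bins are equal-width (a common but here assumed choice):

```python
import numpy as np

def vsdh(S, H=10):
    """H-bin histogram of the pairwise similarity matrix, Eq. (2.14).

    S : (M, M) similarity matrix with values assumed to lie in [0, 1].
    Returns an H-dimensional feature vector normalized by M^2.
    """
    M = S.shape[0]
    # equal-width bins over [0, 1]; each matrix entry falls into one bin
    counts, _ = np.histogram(S, bins=H, range=(0.0, 1.0))
    return counts / (M * M)
```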
Visual Diversity Score (VDS): Measures the visual difference of the image distribution. First, images are ranked according to their popularity on social media, based on the assumption that popular images better represent the news event. Then, the diversity of an image is defined as its minimal difference from the images ranked before it in the entire image set [161]. Finally, the visual diversity score is calculated as a weighted average of dissimilarity over all images, where top-ranked images have larger weights [35]:

$$\mathrm{VDS} = \sum_{j=1}^{M} \frac{1}{j} \sum_{k=1}^{j} \bigl(1 - \mathrm{sim}(i_j, i_k)\bigr). \tag{2.15}$$
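A sketch of Eq. (2.15), assuming the rows of the similarity matrix are already ordered by image popularity (most popular first) and that the diagonal holds self-similarity 1, so the $k = j$ term vanishes:

```python
import numpy as np

def visual_diversity_score(S):
    """Popularity-weighted average dissimilarity, Eq. (2.15).

    S : (M, M) similarity matrix with rows ordered by popularity
        (most popular first) and S[j, j] assumed equal to 1.
    """
    M = S.shape[0]
    vds = 0.0
    for j in range(M):
        # 1/(j+1) weight: top-ranked images contribute more
        vds += (1.0 / (j + 1)) * np.sum(1.0 - S[j, : j + 1])
    return vds
```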
Visual Clustering Score: Evaluates the image distribution over all images in the news event from a clustering perspective. Representative clustering methods, such as the hierarchical agglomerative clustering (HAC) algorithm [66], can be utilized to obtain the image clusters.
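The text leaves the exact score definition to [66]; the sketch below shows one plausible instantiation that clusters image descriptors with SciPy's HAC and reports the number of clusters as the statistic. The linkage method, distance metric, and threshold are all illustrative assumptions.

```python
import numpy as np
from scipy.cluster.hierarchy import linkage, fcluster

def visual_clustering_score(X, dist_threshold=0.5):
    """Cluster image descriptors with HAC and return the cluster count.

    X : (M, d) matrix of image descriptors (e.g., GIST features).
    dist_threshold : assumed cut-off distance for forming flat clusters.
    """
    Z = linkage(X, method="average", metric="cosine")  # build HAC dendrogram
    labels = fcluster(Z, t=dist_threshold, criterion="distance")
    return len(np.unique(labels))
```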
2.2.3 NEURAL VISUAL FEATURES
Multi-layer neural networks have been widely used for learning image feature representations. In particular, the specially designed architecture of convolutional neural networks (CNNs) is very powerful for extracting visual features from images, which can be used for various tasks [143, 162]. VGG16 is one of the state-of-the-art CNNs for learning neural visual representations (see Figure 2.3) [143]. It is composed of three basic types of layers: convolutional layers for extracting translation-invariant features from images, pooling layers for reducing the number of parameters, and fully connected layers for classification tasks. To prevent CNNs from over-fitting and to ease the training of deep CNNs, dropout layers [145] and residual layers [50] have been introduced into CNN structures. Recent work that uses images for fake news detection has adopted the VGG model [57, 143] to extract neural visual features [165].
Figure 2.3: The illustration of the VGG16 framework for learning neural image features. [Figure: stacked Convolution + ReLU, Maxpooling, Fully Connected + ReLU, and Softmax layers, mapping a 224 × 224 × 3 input through 112 × 112 × 128, 56 × 56 × 256, 28 × 28 × 512, 14 × 14 × 512, and 7 × 7 × 512 feature maps to 1 × 1 × 4096 and finally 1 × 1 × 1000 outputs.]
2.3 STYLE FEATURES
Fake news publishers often have malicious intent to spread distorted and misleading information
and influence large communities of consumers, requiring particular writing styles necessary to