DATA PAPER
KB4Rec: A Data Set for Linking Knowledge Bases with
Recommender Systems
Wayne Xin Zhao1†, Gaole He1, Kunlin Yang1, Hongjian Dou1, Jin Huang1, Siqi Ouyang2 & Ji-Rong Wen1
1School of Information, Renmin University of China, Beijing 100872, China
2Jacobs Technion-Cornell Institute, Cornell Tech, New York 10044, USA
Keywords: Knowledge-aware recommendation; Recommender system; Knowledge base
Citation: W.X. Zhao, G. He, K. Yang, H. Dou, J. Huang, S. Ouyang, & J.-R. Wen. KB4Rec: A data set for linking knowledge bases
with recommender systems. Data Intelligence 1(2019), 121-136. doi: 10.1162/dint_a_00008
Received: November 10, 2018; Revised: November 28, 2018; Accepted: December 2, 2018
ABSTRACT
To develop a knowledge-aware recommender system, a key issue is how to obtain rich and structured
knowledge base (KB) information for recommender system (RS) items. Existing data sets or methods either
use side information from original RSs (containing very few kinds of useful information) or utilize a private
KB. In this paper, we present KB4Rec v1.0, a data set linking KB information for RSs. It has linked three widely
used RS data sets with two popular KBs, namely Freebase and YAGO. Based on our linked data set, we first
perform qualitative analysis experiments, and then we discuss the effect of two important factors (i.e.,
popularity and recency) on whether a RS item can be linked to a KB entity. Finally, we compare several
knowledge-aware recommendation algorithms on our linked data set.
1. INTRODUCTION
Recommender systems (RS), which aim to match users with items they are interested in, play an
important role in various online applications. Traditional recommendation algorithms mainly
focus on learning effective preference models from historical user-item interaction data, e.g., matrix
factorization [1]. With the rapid development of Web technologies, various kinds of side information have
become available in RSs [2]. At an early stage, the context information used was usually unstructured, and
its availability was limited to specific data domains or platforms.
† Corresponding author: Wayne Xin Zhao (Email: batmanfly@gmail.com; ORCID: 0000-0002-8333-6196).
© 2019 Chinese Academy of Sciences Published under a Creative Commons Attribution 4.0 International (CC BY 4.0)
license
More and more efforts have been made recently by both research and industry communities to structure
world knowledge or domain facts in a variety of data domains. One of the most typical organization forms
is the knowledge base (KB) [3]. KBs provide a general and unified way to organize and associate information
entities, and have been shown to be useful in many applications. For example, KBs have been used in
recommender systems, called knowledge-aware recommender systems [4]. To develop a knowledge-aware
recommender system, a key issue is how to obtain rich and structured KB information for RS items. Overall,
there are two main solutions in existing studies. First, side information has been collected from the RS
platform and used as contextual features [5, 6, 7, 8, 9], and some studies further construct tiny and simple
KB-like knowledge structures [10, 11, 12]. The number of attributes or relations is usually small, and much
useful item information is likely to be missing. Second, several works propose to link RSs with private
KBs [13, 14, 15]. The linkage results are not publicly available. We are also aware of some closely related
studies [16, 17], which aim to link RS items with DBpedia entities. By comparison, our focus is on Freebase
[18] and YAGO [19], which are now widely used in many natural language processing (NLP) or related
domains [20, 21, 22].
To address the need for the linked data set of RS and KBs, we present a data set which links two public
KBs with recommender systems, named KB4Rec v1.0, freely available at https://github.com/RUCDM/
KB4Rec. Our basic idea is to heuristically link items from RSs with entities from public large-scale KBs.
On the RS side, we select three widely used data sets (i.e., MovieLens [5], LFM-1b [6] and Amazon book
[7]) covering three different data domains, namely movie, music and book; on the KB side, we select the
two well-known KBs (i.e., Freebase and YAGO). We try to maximize the applicability of our linked data set
by selecting very popular RS data sets and KBs. We do not share the original data sets, since they are
maintained by original researchers or publishers. These original copies are easily accessible online.
In our KB4Rec v1.0 data set, we have organized the linkage results as linked ID pairs, each of which consists of
a RS item ID and a KB entity ID. All the IDs are inner values from the original data sets. Once such a
linkage has been accomplished, we are able to reuse existing large-scale KB data for RSs. For example, the
movie “Avatar” from the MovieLens data set [5] has a corresponding entity entry in Freebase, and we are able
to obtain its attribute information by retrieving all its associated relation triples in Freebase. Based on the
linked data set, we first perform some qualitative analysis experiments, and then we discuss the effect of
two important factors (i.e., popularity and recency) on whether a RS item can be linked to a KB entity.
Finally, we compare several knowledge-aware recommendation algorithms on our linked data set.
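As a rough illustration of how linked ID pairs might be used, the sketch below reads tab-separated (RS item ID, KB entity ID) pairs and collects the KB triples whose head is a linked entity. The tab-separated layout, the function names load_linkage and triples_by_item, and all toy IDs and triples are assumptions made for illustration; they are not the released file format.

```python
import csv
import io
from collections import defaultdict


def load_linkage(lines):
    """Parse tab-separated (RS item ID, KB entity ID) pairs into a dict."""
    reader = csv.reader(lines, delimiter="\t")
    return {row[0]: row[1] for row in reader if len(row) == 2}


def triples_by_item(item_to_entity, triples):
    """Group KB triples (head, relation, tail) by the RS item linked to the head."""
    entity_to_item = {entity: item for item, entity in item_to_entity.items()}
    grouped = defaultdict(list)
    for head, relation, tail in triples:
        item = entity_to_item.get(head)
        if item is not None:          # keep only triples about linked entities
            grouped[item].append((relation, tail))
    return dict(grouped)


# Toy stand-ins for a linkage file and a KB dump (assumed layout and values).
linkage_file = io.StringIO("5349\tm.012s1d\n8636\tm.02wgk1\n")
kb_triples = [
    ("m.012s1d", "film.name", "Spider man"),
    ("m.02wgk1", "film.name", "Spider man 2"),
    ("m.unlinked", "film.name", "Some other film"),
]

item_to_entity = load_linkage(linkage_file)
item_attributes = triples_by_item(item_to_entity, kb_triples)
print(item_attributes["5349"])
```

The inverse map assumes each KB entity is linked to at most one RS item; a real pipeline would need to decide how to handle duplicates.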
With our linkage results and original data copies, it is easy to develop an evaluation set for knowledge-
aware recommendation algorithms. We believe such a data set is beneficial to the development of
knowledge-aware recommender systems.
We use the terms of “items” and “entities,” respectively, for RSs and KBs.
2. EXISTING DATA SETS AND METHODS
In this section, we briefly review the related data sets and methods.
Early knowledge-aware recommendation algorithms are also called context-aware recommendation
algorithms, in which the side information from the original RS platform is considered context data. For
example, social network information of the Epinions data set is utilized in [23, 24], POI property information
of the Yelp data set is utilized in [11], movie attribute information of the MovieLens data set is utilized in [10] and
user profile information of a microblogging data set is utilized in [25, 8]. These data sets usually
contain very few kinds of side information, and the relations between different kinds of side information are
ignored.
To obtain more structured side information, Heterogeneous Information Networks (HIN) have been
proposed as a technique for modeling complex connections between different types of objects [26]. In
HINs, we can effectively learn underlying relation patterns (called meta-paths) and organize side information
via meta-path-based representations. For example, HIN-based recommendation methods include
PER [10], HeteRecom [27] and MCRec [28]. HIN-based algorithms usually rely on graph search
algorithms, which have difficulty dealing with large-scale relation pattern finding.
More recently, KBs have become a popular kind of data resources to store and organize world knowledge
or domain facts. Many studies have been carried out on the construction, inference and applications of
KBs [3]. In particular, several pioneering studies [13, 14, 15] try to leverage existing KB information for
improving the recommendation performance. They apply a heuristic method for linking RS items with KB
entities. In these studies, they use a private KB for linkage, which is not accessible to the public.
We are also aware of some closely related studies [16, 17], which aim to link RS items with KB entities.
Nevertheless our focus is on Freebase and YAGO, which are now widely used in many NLP or related
domains [20, 21, 22]. Besides, our data sets contain more linked entities and involved relations.
3. LINKED DATA SET CONSTRUCTION
In our work, we need to prepare two kinds of data sets, namely RS and KB. We first describe the original
RS and KB data sets and then discuss the linkage method.
3.1 RS Data Sets
We consider three popular RS data sets for linkage, namely MovieLens, LFM-1b and Amazon book, which
are from three different domains of movie, music and book, respectively.
• MovieLens data set [7] describes users’ preferences on movies. A preference record takes the form
have been four MovieLens data sets released, known as 100K, 1M, 10M and 20M, reflecting the
approximate number of ratings in each data set. We select the largest MovieLens 20M for linkage.
• LFM-1b data set [8] describes users’ interaction records on music. It provides information including
artists, albums, tracks and users, as well as individual listening events. It records the listening events
of a user on songs, but does not contain rating information.
• Amazon book data set [9] describes users’ preferences on book products, which has a data form, i.e.,
million users across nearly 23 million items.
The three data sets all provide several kinds of side information such as item titles (all), IMDB ID (movie),
writer (book) and artist (music). We utilize such side information for subsequent KB linkage.
3.2 KB Data Sets
We adopt two large-scale public KBs, namely Freebase and YAGO.
Freebase [18] is a KG announced by Metaweb Technologies, Inc. in 2007 and was acquired by Google
Inc. on July 16, 2010. Freebase stores facts by triples of the form
down its services on August 31, 2016, we use its latest public version.
YAGO [19] is a large semantic KB, which is automatically constructed based on the information of
Wikipedia, WordNet, GeoNames and other data sources. It contains 447 million facts about 9.8 million
entities in 10 different languages, with an accuracy of above 95% based on manual evaluation. In this
paper, we use the version of YAGO in [29].
3.3 RS to KB Linkage
With two KB data sets and three RS data sets, we can form six linkage results. Next, we describe the
heuristic method for data linkage.
All three RS data sets provide the information of item titles. For Freebase, with offline KB search APIs,
we retrieve KB entities with item titles as queries. Our heuristic linkage method follows a similar idea to that in
[30]. If no KB entity with exactly the same title is returned, we say the RS item is rejected in the linkage
procedure. If at least one KB entity with exactly the same title is returned, we further incorporate one kind
of side information as a refined constraint for accurate linkage: IMDB ID, artist name and writer name are
used for the three domains of movie, music and book, respectively. We have found that only a small number
(about 1,000 for each domain) of RS items cannot be accurately linked or rejected via the above procedure,
and we simply discard them.
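The title-then-side-information procedure above can be sketched as follows. This is a minimal illustration under stated assumptions: the candidate record layout (keys 'id', 'title', 'side'), the function name link_item, the three outcome labels and the toy IDs are all hypothetical, and the rule that an item is "ambiguous" unless exactly one candidate satisfies the side-information constraint is one possible reading of the text rather than the authors' exact rule.

```python
def link_item(title, side_value, candidates):
    """Decide the linkage outcome for one RS item.

    candidates: KB search results, each a dict with keys 'id', 'title' and
    'side' (a set of side-information values such as IMDB IDs, artist names
    or writer names). Returns (status, entity_id).
    """
    same_title = [c for c in candidates if c["title"] == title]
    if not same_title:
        return ("rejected", None)        # no entity with exactly the same title
    matched = [c for c in same_title if side_value in c["side"]]
    if len(matched) == 1:
        return ("linked", matched[0]["id"])
    return ("ambiguous", None)           # neither linked nor rejected: discarded


# Toy candidates with made-up identifiers.
candidates = [
    {"id": "kb-1", "title": "Titanic", "side": {"imdb-1"}},
    {"id": "kb-2", "title": "Titanic", "side": {"imdb-2"}},
]

print(link_item("Titanic", "imdb-1", candidates))
print(link_item("Avatar", "imdb-3", candidates))
print(link_item("Titanic", "imdb-9", candidates))
```

Keeping "rejected" and "ambiguous" apart mirrors the distinction the text draws between items with no title match and the roughly 1,000 items per domain that could not be decided.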
For YAGO, a KB entity is named in a similar way to its corresponding Wikipedia URL, in which
the name is composed of the item title and related information such as type. For example, the film “Titanic” is marked
as “
link https://en.wikipedia.org/wiki/Titanic_(1997_film). Therefore, we first compare the title of RS items with
the prefix of KB entities. If at least one KB entity was returned, we leverage the “rdf:type” relation and suffix
(if available) to filter out those entities from other domains. We find that most of the linkage in LFM-1b and
Amazon book data sets can be determined accurately (either linked or non-linked) in this way. By comparison,
there exist some ambiguous cases in MovieLens 20M data set, and they are further evaluated through the
year restriction.
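To make the prefix and suffix matching concrete, the following sketch splits a Wikipedia-style entity name such as Titanic_(1997_film) into a title prefix and a bracketed suffix, then filters candidates by a domain word and an optional year. The regular expression, the function names and the toy entity names are assumptions for illustration; the “rdf:type” check mentioned in the text is not modeled here, so suffix-less names are simply kept.

```python
import re

# Title, optionally followed by "_(suffix)" at the end of the name.
_NAME = re.compile(r"^(?P<title>.+?)(?:_\((?P<suffix>[^()]*)\))?$")


def split_entity_name(name):
    """Split 'Titanic_(1997_film)' into ('Titanic', '1997 film')."""
    match = _NAME.match(name)
    if match is None:                      # e.g. an empty name
        return ("", None)
    title = match.group("title").replace("_", " ")
    suffix = match.group("suffix")
    return (title, suffix.replace("_", " ") if suffix else None)


def filter_candidates(item_title, entity_names, domain_word, year=None):
    """Keep entity names whose prefix equals the item title and whose suffix fits."""
    kept = []
    for name in entity_names:
        title, suffix = split_entity_name(name)
        if title.lower() != item_title.lower():
            continue
        if suffix is not None:
            if domain_word not in suffix:
                continue                   # suffix points to another domain
            has_year = re.search(r"\d{4}", suffix) is not None
            if year is not None and has_year and str(year) not in suffix:
                continue                   # year restriction for ambiguous titles
        kept.append(name)                  # no suffix: would need an rdf:type check
    return kept


names = ["Titanic_(1997_film)", "Titanic_(1953_film)", "Titanic_(musical)", "Titanic"]
print(filter_candidates("Titanic", names, "film", year=1997))
```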
During the linkage process, we have dealt with several problems that affect the results of string matching
algorithms, e.g., lowercase, abbreviation and the order of family/given names. Since the LFM-1b data set
is extremely large, we remove all the music items with fewer than 10 listening events. Even after filtering,
it still contains about 6.5 million music items.
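A small sketch of the kind of preprocessing described above: normalizing names before string matching and dropping items with fewer than 10 listening events. The specific normalization rules (lowercasing, reordering "Family, Given", stripping punctuation) and the function names are assumptions; the paper does not spell out its exact rules.

```python
import re
from collections import Counter


def normalize_name(name):
    """Lowercase, reorder 'Family, Given' to 'given family', strip punctuation."""
    name = name.strip().lower()
    if "," in name:
        family, _, given = name.partition(",")
        name = f"{given.strip()} {family.strip()}"
    name = re.sub(r"[^\w\s]", "", name)        # drop periods in abbreviations
    return re.sub(r"\s+", " ", name).strip()   # collapse repeated whitespace


def frequent_items(events, min_events=10):
    """Return the items that occur in at least min_events (user, item) events."""
    counts = Counter(item for _, item in events)
    return {item for item, n in counts.items() if n >= min_events}


print(normalize_name("Tolkien, J. R. R."))
print(normalize_name("J. R. R. Tolkien"))

toy_events = [("u1", "track-a")] * 10 + [("u1", "track-b")] * 9
print(frequent_items(toy_events))
```

Both spellings of the author name normalize to the same string, which is what an exact string match needs.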
We present an illustrative example of our linkage results in Figure 1. In this example, there are two pairs
of an item from MovieLens 20M and its linked entity from Freebase. The two movie items are “Spider man”
and “Spider man 2.” It is clear that both movies share many common attributes in Freebase. With
such linkage results, it is easy to obtain rich KB information about RS items, which is likely to be useful
for improving recommendation performance.
[Figure 1: two MovieLens 20M items and their linked Freebase entities. “Spider man” (MovieLens ID 5349, IMDB ID 0145487) is linked to Freebase entity m.012s1d, and “Spider man 2” (MovieLens ID 8636, IMDB ID 0316654) is linked to m.02wgk1. The relations shown include film.name, film.star, film.story_by, film.directed_by and film.music_by; the attribute values shown include Tobey Maguire, Stan Lee, Steve Ditko, Sam Raimi and Danny Elfman.]
Figure 1. Linkage example of MovieLens 20M items with Freebase entities. Note: We highlight the MovieLens IDs
and Freebase IDs in color.
3.4 Basic Statistics
We summarize the basic statistics of the three linked data sets in Table 1. It can be observed that for the
MovieLens 20M data set, we have a very high linkage ratio: about 95.2% or 79.5% of the items can be accurately
linked to an entity from Freebase or YAGO, respectively. For the other two domains, the linkage ratios are very low,
especially when using YAGO for linkage. The MovieLens 20M data set has a high linkage ratio, probably
because it contains fewer items than the other two data sets and its items have been refined by the original
releasers. Besides, we speculate that there may be some domain bias in the construction of KBs. Overall,
more RS items can be linked with Freebase entities than with YAGO entities. Although the linkage ratios for the latter
two data sets are not high, the absolute numbers of linked items are large. We also report the number of
overlapping linked entities for the two KBs in the last row of Table 1. We can see that there are also more
linked items in the movie domain. Such a linked data set is feasible for research-purpose studies.
Table 1. Statistics of the linkage results.

Data sets      Numbers          MovieLens 20M    LFM-1b           Amazon book
RS data sets   #Users           138,493          120,317          3,468,412
               #Items           27,279           6,479,700        2,330,066
               #Interactions    20,000,263       1,021,931,544    22,507,155
Freebase       #Linked-Items    25,982           1,254,923        109,671
               Linkage ratio    95.2%            19.4%            4.7%
YAGO           #Linked-Items    21,688           49,608           17,607
               Linkage ratio    79.5%            0.8%             0.8%
Overlap        #Overlap         21,221           26,126           7,398

Note: The three domains correspond to the RS data sets of MovieLens 20M, LFM-1b and Amazon book, respectively.
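The linkage ratios in Table 1 are simply the number of linked items divided by the number of items in the RS data set. As a quick arithmetic check, the snippet below recomputes them from the counts reported in the table; the variable names are illustrative only.

```python
# Item counts and linked-item counts copied from Table 1.
n_items = {"MovieLens 20M": 27279, "LFM-1b": 6479700, "Amazon book": 2330066}
n_linked = {
    "Freebase": {"MovieLens 20M": 25982, "LFM-1b": 1254923, "Amazon book": 109671},
    "YAGO": {"MovieLens 20M": 21688, "LFM-1b": 49608, "Amazon book": 17607},
}

# Linkage ratio in percent, rounded to one decimal place as in the table.
ratios = {
    (kb, data_set): round(100 * linked / n_items[data_set], 1)
    for kb, per_set in n_linked.items()
    for data_set, linked in per_set.items()
}

for (kb, data_set), ratio in ratios.items():
    print(f"{kb:8s} {data_set:14s} {ratio:5.1f}%")
```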
3.5 Shared Data Sets
We name the above linked KB data set for recommender systems as KB4Rec v1.0, freely available at
https://github.com/RUCDM/KB4Rec. In our KB4Rec v1.0 data set, we organized the linkage results by
linked ID pairs, which consist of a RS item ID and a KB entity ID. All the IDs are inner values from the
original data sets. For Freebase, we have 25,982, 1,254,923 and 109,671 linked ID pairs for MovieLens
20M, LFM-1b and Amazon book, respectively; for YAGO, we have 21,688, 49,608 and 17,607 linked ID
pairs for MovieLens 20M, LFM-1b and Amazon book, respectively.
4. LINKAGE ANALYSIS
Previously, we have shown the linkage ratios for different data sets. We find that a considerable number
of RS items cannot be linked to KB entities. It is interesting to study what factors affect the linkage
ratio. We consider two factors for analysis.
4.1 Effect of Popularity on Linkage
Intuitively, a popular RS item should be more likely to be included in a KB than an unpopular item, since
it is reasonable to incorporate more “important” RS items rated by the RS users into KBs. The construction
of a KB itself usually involves manual effort, which makes it difficult to avoid the bias of human attention. To
measure the popularity of a RS item, we adopt a simple frequency-based method by counting the number
of users who have interacted with the item. This measure characterizes the attractiveness of an item to
the users in a RS. First, we sort the items in ascending order of their popularity values. Then, we
divide all the items into five ordered bins with the same number of items in each bin. Hence, an item with
a larger bin number will be more popular than another with a smaller bin number. Finally, we compute the
linkage ratio for each bin and the results are reported in Figure 2. It can be observed that a bin with a larger
number has a higher linkage ratio than the ones with a smaller number. The results indicate that popularity
is likely to have a positive effect on linkage.
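The binning procedure can be sketched as below: count distinct users per item, sort items by that count in ascending order, cut the sorted list into equal-sized bins and compute the fraction of linked items in each bin. The function name, the tie-breaking by item ID and the toy data are assumptions; items without any interaction are not represented in this sketch.

```python
from collections import defaultdict


def linkage_ratio_by_bin(interactions, linked_items, n_bins=5):
    """Linkage ratio per popularity bin, from least to most popular items."""
    users_per_item = defaultdict(set)
    for user, item in interactions:
        users_per_item[item].add(user)
    if len(users_per_item) < n_bins:
        raise ValueError("need at least one item per bin")
    # Ascending popularity; ties broken by item ID to keep the order stable.
    ranked = sorted(users_per_item, key=lambda i: (len(users_per_item[i]), str(i)))
    ratios = []
    for b in range(n_bins):
        start = b * len(ranked) // n_bins
        end = (b + 1) * len(ranked) // n_bins
        bin_items = ranked[start:end]
        ratios.append(sum(i in linked_items for i in bin_items) / len(bin_items))
    return ratios


# Toy data: item i01 has 1 user, i02 has 2 users, ..., i10 has 10 users.
toy_interactions = [(f"u{u}", f"i{k:02d}") for k in range(1, 11) for u in range(k)]
toy_linked = {"i02", "i06", "i07", "i08", "i09", "i10"}
print(linkage_ratio_by_bin(toy_interactions, toy_linked))
```

The same function could be reused for the recency analysis by sorting on release date instead of popularity and setting n_bins to 10.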
[Figure 2: six bar charts of linkage ratio (y-axis) against popularity bins A to E (x-axis): (a) Freebase-Movie, (b) Freebase-Music, (c) Freebase-Book, (d) YAGO-Movie, (e) YAGO-Music, (f) YAGO-Book. The bins are taken from MovieLens 20M, LFM-1b and Amazon book for the movie, music and book panels, respectively.]
Figure 2. Examining the effect of popularity on the linkage results. Note: We use A, B, … to indicate the bin
number in an ordered way. The first three subfigures correspond to the popularity analysis on Freebase, and the last
three subfigures correspond to the popularity analysis on YAGO.
4.2 Effect of Recency on Linkage
The second factor we consider is recency, i.e., the time when a RS item was created. Our assumption
is that if a RS item was created or released at an earlier time, it is more likely to be included
in KBs. Since human attention aggregation is a gradually growing process, a RS item usually requires a
considerable amount of time to become popular. To check this assumption, we need to obtain the release
date of RS items. However, only the MovieLens 20M data set contains such attribute information, so we
only report the analysis result on this data set. We first sort the items in ascending order of their release
dates, and then divide all the items equally into 10 ordered bins following the procedure of the above
popularity analysis. Finally, we compute the linkage ratio for each bin. The results are reported in Figure
3. We can see that the linkage ratios gradually decrease over time. The results indicate that recency
is likely to have a negative effect on linkage, i.e., an older RS item seems more likely to be included
in a KB than a more recent one. In Figure 3(a), the last bin has a dramatic drop, since our version of
MovieLens dates from April 2015.
[Figure 3: two bar charts of linkage ratio (y-axis) against time bins A to J in MovieLens 20M (x-axis): (a) Freebase-Movie, (b) YAGO-Movie.]
Figure 3. Examining the effect of recency on the linkage results. Note: We use A, B, … to indicate the bin number
in an ordered way. The first subfigure corresponds to the recency analysis on Freebase, and the second subfigure
corresponds to the recency analysis on YAGO.
The above analysis has indicated that both popularity and recency have a considerable effect on the final
linkage results. However, the construction process of a KB is very complicated, and many important factors
will affect this process. For future research, it is worth investigating what other important factors exist and
how they affect the construction process of a KB.
5. EXPERIMENT
In this section, we present a comparison of some existing recommendation algorithms using our linked
data sets.
5.1 Experimental Setup
Our purpose is to test whether the incorporated KB information is useful for improving recommendation
performance. In Freebase, there are more linked entities and associated relations. We therefore adopt only the
linked data set of Freebase for evaluation; the results from YAGO are similar and omitted here.
The original linked data set is very large, so we first generate a small evaluation set for the following
experiments. We took the subset from the last year for the LFM-1b data set and the subset from 2005 to
2015 for the MovieLens 20M data set. We also perform 3-core filtering for the Amazon book data set and 10-core
filtering for the other data sets. This part mainly follows the preprocessing step in [31]. After that, we keep only
the items that are linked in our data set. We report the statistics of the data sets in Table 2.
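For readers unfamiliar with k-core filtering, a minimal sketch follows. It uses the common iterative definition, in which users and items with fewer than k interactions are removed repeatedly until none remain; whether the preprocessing in [31] iterates to convergence or applies a single pass is not stated here, so this is an assumption, as are the function name and toy data.

```python
from collections import Counter


def k_core(interactions, k):
    """Repeatedly drop (user, item) records whose user or item has < k records."""
    current = list(interactions)
    while True:
        user_counts = Counter(user for user, _ in current)
        item_counts = Counter(item for _, item in current)
        kept = [
            (user, item)
            for user, item in current
            if user_counts[user] >= k and item_counts[item] >= k
        ]
        if len(kept) == len(current):   # nothing removed: the k-core is reached
            return kept
        current = kept


toy = [("u1", "a"), ("u1", "b"), ("u2", "a"), ("u2", "b"), ("u3", "a")]
print(k_core(toy, 2))
```

In the toy data, user u3 has a single record and is removed, after which every remaining user and item has exactly two records.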
Table 2. Statistics of the evaluation data sets for the Freebase KB.

Data sets        #Users     #Items     #Interactions
MovieLens 20M    61,583     19,533     5,868,015
LFM-1b           7,694      30,658     203,975
Amazon book      65,125     69,975     828,560

Note: In this data set, all the items are linked with Freebase.
Following [32], we consider the last-item recommendation task for evaluation. We set up such a task
since it is a commonly used evaluation setting for RSs, and it makes it easy to compare different methods. Given
a user, we first sort the items in ascending order of interaction timestamp, and then we put the last
item into the test set and the rest into the training set. The final goal is to predict the last item given the previous
interaction sequence of a user. Since enumerating all the items as candidates is time-consuming, we pair
each ground-truth item with 100 negative items to form a randomly ordered list. Each comparison method
then returns a ranked list according to its recommendation confidence. To evaluate different methods, we
adopt a variety of evaluation metrics, including the Mean Reciprocal Rank (MRR), Hit Ratio (HR) and
Normalized Discounted Cumulative Gain (NDCG).
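The evaluation protocol can be sketched as follows: split each user's sequence so that the last item is held out, sample negative items the user has not interacted with, and score the position of the ground-truth item in the ranked candidate list. With a single relevant item, NDCG@k reduces to 1/log2(rank + 1) when the item is ranked within the top k. The function names, the uniform negative sampling and the cut-off k = 10 (taken from the column names Hit@10 and NDCG@10 in Table 4) are assumptions of this sketch.

```python
import math
import random


def last_item_split(interactions):
    """Split (user, item, timestamp) records into training sequences and test items."""
    by_user = {}
    for user, item, timestamp in interactions:
        by_user.setdefault(user, []).append((timestamp, item))
    train, test = {}, {}
    for user, events in by_user.items():
        events.sort(key=lambda event: event[0])      # ascending timestamp
        items = [item for _, item in events]
        train[user], test[user] = items[:-1], items[-1]
    return train, test


def sample_negatives(all_items, seen, n, rng):
    """Sample n items the user has not interacted with (uniformly, an assumption)."""
    pool = sorted(set(all_items) - set(seen))
    return rng.sample(pool, n)


def rank_metrics(rank, k=10):
    """Per-user metrics given the 1-based rank of the ground-truth item."""
    in_top_k = rank <= k
    return {
        "RR": 1.0 / rank,                              # averaged over users: MRR
        f"Hit@{k}": 1.0 if in_top_k else 0.0,
        f"NDCG@{k}": 1.0 / math.log2(rank + 1) if in_top_k else 0.0,
    }


train, test = last_item_split([("u", "a", 3), ("u", "b", 1), ("u", "c", 2)])
negatives = sample_negatives(["a", "b", "c", "d", "e"], ["a", "b", "c"], 2, random.Random(0))
print(train, test, negatives, rank_metrics(3))
```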
5.2 KB Information Representation
Our focus is to provide rich KB information for recommender systems. A simple way is to represent KB
information with a one-hot vector, which is sparse and large. Here we borrow the idea in [15, 33] to embed
KB data into low-dimensional vectors. The learned embeddings are then used by subsequent recommendation
algorithms. To train TransE [33], we start with linked entities as seeds and expand the graph with one-step
search. As not all the relations in KBs are useful, we remove infrequent and general-purpose relations
together with all their associated KB triples. After that, each linked item is associated with a learned KB
embedding vector. We report the statistics for training TransE in Table 3.
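A rough sketch of the subgraph preparation described above, together with the TransE scoring idea. The code keeps triples that touch a seed entity (one-step expansion), then drops relations that are infrequent or on a stop list of general-purpose relations. The frequency threshold, the stop list, the choice to expand on both head and tail positions, the L1 norm and all names are assumptions made for illustration; the paper does not report these details.

```python
from collections import Counter


def one_step_subgraph(triples, seeds, min_rel_count=2, banned_relations=()):
    """Keep triples touching a seed entity, minus rare or banned relations."""
    seeds = set(seeds)
    touched = [t for t in triples if t[0] in seeds or t[2] in seeds]
    rel_counts = Counter(relation for _, relation, _ in touched)
    return [
        (head, relation, tail)
        for head, relation, tail in touched
        if rel_counts[relation] >= min_rel_count and relation not in banned_relations
    ]


def transe_score(head_vec, rel_vec, tail_vec):
    """TransE distance ||h + r - t|| (L1 here); lower means more plausible."""
    return sum(abs(h + r - t) for h, r, t in zip(head_vec, rel_vec, tail_vec))


toy_triples = [
    ("e1", "directed_by", "p1"),
    ("e2", "directed_by", "p2"),
    ("e1", "rare_relation", "x"),
    ("e3", "directed_by", "p3"),       # e3 is not a seed and touches no seed
    ("e1", "common.topic", "y"),
    ("e2", "common.topic", "z"),
]
subgraph = one_step_subgraph(toy_triples, {"e1", "e2"}, banned_relations=("common.topic",))
print(subgraph)
print(transe_score([1.0, 0.0], [0.0, 1.0], [1.0, 1.0]))
```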
Table 3. Statistics of our subgraph for training TransE.

Data sets        #Entities    #Relations
MovieLens 20M    1,125,099    81
LFM-1b           214,524      19
Amazon book      313,956      49

Note: #Entities indicates the number of entities that are extended by seed entities with one-step search in Freebase.
5.3 Methods to Compare
We consider the following methods for performance comparison:
• BPR [34]: It learns a matrix factorization model by minimizing the pairwise ranking loss in a Bayesian
framework.
• SVDFeature [35]: It is a model for feature-based collaborative filtering. In this paper we use the KB
embeddings as context features to feed into SVDFeature.
• mCKE [13]: CKE first proposed incorporating KB and other information to improve recommendation
performance. For fairness, we implement a simplified version of CKE that uses only KB information
and excludes image and text information. Different from the original CKE, we fix KB representations
and adopt the embeddings learned by TransE.
• KSR [31]: It is a Knowledge-enhanced Sequential Recommender (KSR). It incorporates KB information
to enhance the semantic representation of memory networks.
Here, since our purpose is to illustrate the use of this linked data set, we only select four methods for performance
comparison. We will try more knowledge-aware recommendation algorithms in our future work.
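To illustrate the pairwise ranking loss mentioned for BPR, the sketch below computes the loss for one (user, positive item, negative item) triple as -ln σ(x_ui - x_uj), where the scores are dot products of latent vectors. This is the standard BPR criterion without the regularization term; the function names and the toy vectors are illustrative, and the naive exponential is not guarded against overflow for very large score gaps.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def bpr_loss(user_vec, pos_vec, neg_vec):
    """-ln sigmoid(x_ui - x_uj) for one (user, positive, negative) triple."""
    diff = dot(user_vec, pos_vec) - dot(user_vec, neg_vec)
    return math.log1p(math.exp(-diff))   # equals -ln(1 / (1 + exp(-diff)))


# The loss is ln 2 when both items score equally, and shrinks as the
# positive item is scored further above the negative one.
print(bpr_loss([1.0], [0.0], [0.0]))
print(bpr_loss([1.0], [2.0], [0.0]))
```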
5.4 Results and Analysis
The results of different methods for the last-item recommendation are presented in Table 4. We can
see that:
1) Among all the methods, BPR performs worst on the first two data sets, but very well on the Amazon
book data set. A possible reason is that the first two data sets are relatively dense while the Amazon book
data set is sparse. A lightweight method is likely to obtain better performance than more complicated
methods on a sparse data set.
2) SVDFeature is implemented with a pairwise ranking loss function, and it can be roughly understood
as an enhanced BPR model with the incorporation of the learned KB embeddings. Compared with
BPR, SVDFeature is slightly better on the MovieLens 20M data set, substantially better on the LFM-1b
data set, but worse on the Amazon book data set. In SVDFeature, each context feature introduces
a number of parameters (depending on the number of dimensions). Hence, on a sparse data set,
it may not work better than the simple BPR model.
3) Next, we analyze the performance of the knowledge-aware recommendation methods, namely
mCKE and KSR. Overall, mCKE does not work as well as expected: it performs well only
on the LFM-1b data set. A possible reason is that our implementation of mCKE fixes the learned KB
embeddings, while the original CKE model adaptively updates KB embeddings. By comparison,
the recently proposed KSR method works best consistently on the three data sets. KSR combines the
capacity of Recurrent Neural Networks (RNN) for modeling data sequences and the capacity of
Memory Networks (MN) for storing data in the long term. It further enhances MNs with the learned
KB embeddings.
Table 4. Performance comparison of different methods on the task of last-item recommendation.

Data sets        Methods       MRR      Hit@10    NDCG@10
MovieLens 20M    BPR           0.128    0.276     0.144
                 SVDFeature    0.204    0.448     0.243
                 mCKE          0.178    0.382     0.209
                 KSR           0.294    0.571     0.344
LFM-1b           BPR           0.227    0.458     0.265
                 SVDFeature    0.337    0.544     0.373
                 mCKE          0.371    0.541     0.399
                 KSR           0.427    0.607     0.460
Amazon book      BPR           0.222    0.505     0.272
                 SVDFeature    0.264    0.544     0.315
                 mCKE          0.248    0.494     0.291
                 KSR           0.353    0.653     0.413

6. CONCLUSION
In this paper, we present KB4Rec v1.0, a data set linking KB information for recommender systems. It
has linked three widely used RS data sets with the popular KBs Freebase [18] and YAGO [19]. Based on
our linked data set, we first perform some qualitative analysis experiments, and then we discuss the effect
of two important factors (i.e., popularity and recency) on whether a RS item can be linked to a KB entity.
Finally, we compare several knowledge-aware recommendation algorithms on our linked data set.
For future work, we will consider linking more RS data sets with KBs. We will also test the performance
of more knowledge-aware recommendation algorithms on more recommendation tasks using the linked
data set.
AUTHOR CONTRIBUTIONS
W.X. Zhao (batmanfly@gmail.com, corresponding author) and J.-R. Wen (jrwen@ruc.edu.cn) led the
whole work. W.X. Zhao organized the content and wrote the paper. G. He (hegaole@ruc.edu.cn), H. Dou
(hongjiandou@ruc.edu.cn) and J. Huang (jin.huang@ruc.edu.cn) generated the linkage results for Freebase
and ran the experiments; K. Yang (kunliny@ruc.edu.cn) generated the linkage results for YAGO. All
the authors have made meaningful and valuable contributions in revising and proofreading the manuscript.
ACKNOWLEDGMENTS
The work was partially supported by the National Natural Science Foundation of China under grant
numbers 61872369, 61832017 and 61502502.
REFERENCES
[1] Y. Koren, R. Bell, & C. Volinsky. Matrix factorization techniques for recommender systems. Computer 42(8)(2009), 30–37. doi: 10.1109/MC.2009.263.
[2] S. Rendle. Factorization machines with libFM. ACM Transactions on Intelligent Systems and Technology 3(3)(2012), Article No. 57. doi: 10.1145/2168752.2168771.
[3] Q. Wang, Z. Mao, B. Wang, & L. Guo. Knowledge graph embedding: A survey of approaches and applications. IEEE Transactions on Knowledge and Data Engineering 29(12)(2017), 2724–2743. doi: 10.1109/TKDE.2017.2754499.
[4] S. Bouraga, I. Jureta, S. Faulkner, & C. Herssens. Knowledge-based recommendation systems: A survey. International Journal of Intelligent Information Technologies 10(2)(2014), 1–19. doi: 10.4018/ijiit.2014040101.
[5] F. M. Harper, & J. A. Konstan. The MovieLens data sets: History and context. ACM Transactions on Interactive Intelligent Systems 5(4)(2016), Article No. 19. doi: 10.1145/2827872.
[6] M. Schedl. The lfm-1b data set for music retrieval and recommendation. In: Proceedings of the 2016 ACM on International Conference on Multimedia Retrieval, 2016, pp. 103–110. doi: 10.1145/2911996.2912004.
[7] R. He, & J. McAuley. Ups and downs: Modeling the visual evolution of fashion trends with one-class collaborative filtering. In: Proceedings of the 25th International Conference on World Wide Web, 2016, pp. 507–517. doi: 10.1145/2872427.2883037.
[8] W.X. Zhao, S. Li, Y. He, E.Y. Chang, J.-R. Wen, & X. Li. Connecting social media to e-commerce: Cold-start product recommendation using microblogging information. IEEE Transactions on Knowledge and Data Engineering 28(5)(2016), 1147–1159. doi: 10.1109/TKDE.2015.2508816.
[9] J. Huang, Z. Ren, W.X. Zhao, G. He, J. Wen, & D. Dong. Taxonomy-aware multi-hop reasoning networks for sequential recommendation. To appear in WSDM 2019.
[10] X. Yu, X. Ren, Y. Sun, Q. Gu, B. Sturt, U. Khandelwal, B. Norick, & J. Han. Personalized entity recommendation: A heterogeneous information network approach. In: Proceedings of the 7th ACM International Conference on Web Search and Data Mining, 2014, pp. 283–292. doi: 10.1145/2556195.2556259.
[11] H. Gao, J. Tang, X. Hu, & H. Liu. Content-aware point of interest recommendation on location-based social networks. In: AAAI’15 Proceedings of the Twenty-Ninth AAAI Conference on Artificial Intelligence, 2015, pp. 1721–1727. Available at: https://dl.acm.org/citation.cfm?id=2886559.
[12] C. Shi, B. Hu, W.X. Zhao, & P.S. Yu. Heterogeneous information network embedding for recommendation. To appear in IEEE Transactions on Knowledge and Data Engineering. doi: 10.1109/TKDE.2018.2833443.
[13] F. Zhang, N.J. Yuan, D. Lian, X. Xie, & W. Ma. Collaborative knowledge base embedding for recommender systems. In: Proceedings of the 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2016, pp. 353–362. doi: 10.1145/2939672.2939673.
[14] H. Wang, F. Zhang, X. Xie, & M. Guo. DKN: Deep knowledge-aware network for news recommendation. In: The 2018 World Wide Web Conference, 2018, pp. 1835–1844. doi: 10.1145/3178876.3186175.
[15] H. Wang, F. Zhang, J. Wang, M. Zhao, W. Li, X. Xie, & M. Guo. Ripple network: Propagating user preferences on the knowledge graph for recommender systems. arXiv preprint. arXiv: 1803.03467v4.
[16] T.D. Noia, V.C. Ostuni, P. Tomeo, & E.D. Sciascio. Sprank: Semantic path-based ranking for top-n recommendations using linked open data. ACM Transactions on Intelligent Systems and Technology 8(1)(2016), Article No. 9. doi: 10.1145/2899005.
[17] T.D. Noia, & V.C. Ostuni. Recommender systems and linked open data. In: W. Faber, & A. Paschke (eds.) Reasoning Web 2015: Reasoning Web. Web Logic Rules. Cham, Switzerland: Springer International Publishing, 2015, pp. 88–113. doi: 10.1007/978-3-319-21768-0_4.
[18] Google. Freebase data dumps. Available at: https://developers.google.com/freebase/data.
[19] F.M. Suchanek, G. Kasneci, & G. Weikum. Yago: A core of semantic knowledge. In: The 16th International Conference on the World Wide Web, 2007, pp. 697–706. doi: 10.1145/1242572.1242667.
[20] H. Bast, & E. Haussmann. More accurate question answering on Freebase. In: Proceedings of the 24th ACM International Conference on Information and Knowledge Management, 2015, pp. 1431–1440. doi: 10.1145/2806416.2806472.
[21] W. Cui, Y. Xiao, & W. Wang. KBQA: An online template based question answering system over Freebase. In: Proceedings of the Twenty-Fifth International Joint Conference on Artificial Intelligence, 2016, pp. 4240–4241. Available at: https://dl.acm.org/citation.cfm?id=3061256.
[22] P. Adolphs, M. Theobald, U. Schafer, H. Uszkoreit, & G. Weikum. YAGO-QA: Answering questions by structured knowledge queries. In: Proceedings of the 2011 IEEE Fifth International Conference on Semantic Computing, 2011, pp. 158–161. doi: 10.1109/ICSC.2011.30.
[23] M. Jamali, & M. Ester. A matrix factorization technique with trust propagation for recommendation in social networks. In: Proceedings of the fourth ACM conference on Recommender systems, 2010, pp. 135–142. doi: 10.1145/1864708.1864736.
[24] H. Ma, I. King, & M.R. Lyu. Learning to recommend with social trust ensemble. In: Proceedings of the 32nd international ACM SIGIR Conference on Research and Development in Information Retrieval, 2009, pp. 203–210. doi: 10.1145/1571941.1571978.
[25] W.X. Zhao, Y. Guo, Y. He, H. Jiang, Y. Wu, & X. Li. We know what you want to buy: A demographic-based system for product recommendation on microblogs. In: Proceedings of the 20th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2014, pp. 1935–1944. doi: 10.1145/2623330.2623351.
[26] Y. Sun, & J. Han. Mining heterogeneous information networks: A structural analysis approach. ACM SIGKDD Explorations Newsletter 13(2)(2012), 20–28. doi: 10.1145/2481244.2481248.
[27] C. Shi, C. Zhou, X. Kong, P.S. Yu, G. Liu, & B. Wang. Heterecom: A semantic-based recommendation system in heterogeneous networks. In: Proceedings of the 18th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2012, pp. 1552–1555. doi: 10.1145/2339530.2339778.
[28] B. Hu, C. Shi, W.X. Zhao, & P.S. Yu. Leveraging meta-path based context for top-N recommendation with a neural co-attention model. In: Proceedings of the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, 2018, pp. 1531–1540. doi: 10.1145/3219819.3219965.
[29] J. Hoffart, F.M. Suchanek, K. Berberich, & G. Weikum. Yago2: A spatially and temporally enhanced knowledge base from Wikipedia. Artificial Intelligence 194 (2013), 28–61. doi: 10.1016/j.artint.2012.06.001.
[30] T. Scheffler, R. Schirru, & P. Lehmann. Matching points of interest from different social networking sites. In: Proceedings of the 35th Annual German Conference on Advances in Artificial Intelligence, 2012, pp. 245–248. doi: 10.1007/978-3-642-33347-7_24.
[31] J. Huang, W.X. Zhao, H. Dou, J. Wen, & E.Y. Chang. Improving sequential recommendation with knowledge-enhanced memory networks. To appear in SIGIR 2018.
[32] R. He, W. Kang, & J. McAuley. Translation-based recommendation: A scalable method for modeling sequential behavior. In: Proceedings of the Eleventh ACM Conference on Recommender Systems, 2017, pp. 161–169. doi: 10.1145/3109859.3109882.
[33] A. Bordes, N. Usunier, A. García-Durán, J. Weston, & O. Yakhnenko. Translating embeddings for modeling multi-relational data. In: the Neural Information Processing Systems Conference (NIPS 2013), 2013, pp. 2787–2795. Available at: http://papers.nips.cc/paper/5071-translating-embeddings-for-modeling-multi-relational-data.pdf.
[34] S. Rendle, C. Freudenthaler, Z. Gantner, & L. Schmidt-Thieme. BPR: Bayesian personalized ranking from implicit feedback. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, 2009, pp. 452–461. Available at: https://dl.acm.org/citation.cfm?id=1795114.1795167.
[35] T. Chen, W. Zhang, Q. Lu, K. Chen, Z. Zheng, & Y. Yu. Svdfeature: A toolkit for feature-based collaborative filtering. The Journal of Machine Learning Research 13(1)(2012), 3619–3622. Available at: https://dl.acm.org/citation.cfm?
AUTHOR BIOGRAPHY
Wayne Xin Zhao is currently an associate professor at the School of
Information, Renmin University of China. He received his PhD degree from
Peking University in 2014. His research interests are recommender systems
and natural language processing. He has published more than 50 refereed
papers in international conferences and journals.
Gaole He is currently a graduate student at the School of Information,
Renmin University of China. He received his bachelor's degree from the School
of Information, Renmin University of China in 2018. His research mainly
focuses on knowledge graphs, deep learning and network embedding.
Kunlin Yang is currently an undergraduate student in the School of Information,
Renmin University of China. He is expected to receive his Bachelor of
Engineering degree in 2019. His research interests include recommender
systems and knowledge graphs.
Hongjian Dou is currently a graduate student at the School of Information,
Renmin University of China. He is working in the Beijing Key Laboratory of Big
Data Management and Analysis Methods, Beijing. His research mainly
focuses on natural language processing, deep learning and recommender
systems.
Jin Huang is currently a graduate student at the School of Information,
Renmin University of China. She is working in the Beijing Key Laboratory of Big
Data Management and Analysis Methods, Beijing. Her research interests
include recommender systems, deep learning and knowledge bases.
Siqi Ouyang is currently a graduate student at the Jacobs Technion-Cornell
Institute, Cornell Tech. She received her bachelor’s degree from Renmin
University of China in 2018. Her research mainly focuses on deep learning
in natural language processing and recommendation systems.
Ji-Rong Wen is a professor in the School of Information, Renmin
University of China. He is also the director of the Beijing Key Laboratory
of Big Data Management and Analysis Methods. Before that, he
was a senior researcher and group manager of the Web Search and
Mining Group at MSRA. His main research interests include big data
management and analytics, information retrieval, data mining and
machine learning. He is currently an associate editor of the ACM
Transactions on Information Systems (TOIS). He is a senior member
of the IEEE.