Collecting and Analyzing Japanese Splogs based on Characteristics of Keywords

Yuuki Sato; Takehito Utsuro; Tomohiro Fukuhara; Yasuhide Kawada; Yoshiaki Murakami; Hiroshi Nakagawa; Noriko Kando

doi:10.1609/icwsm.v2i1.18659

Authors

Yuuki Sato, University of Tsukuba
Takehito Utsuro, University of Tsukuba
Tomohiro Fukuhara, University of Tokyo
Yasuhide Kawada, Navix Co., Ltd.
Yoshiaki Murakami, Navix Co., Ltd.
Hiroshi Nakagawa, University of Tokyo
Noriko Kando, National Institute of Informatics

DOI:

https://doi.org/10.1609/icwsm.v2i1.18659

Abstract

This paper analyzes Japanese splogs (spam blogs) based on various characteristics of the keywords they contain. By analyzing these characteristics, we estimate how spammers behave when they create splogs from other sources. Since splogs often introduce noise into word occurrence statistics in the blogosphere, we assume that splogs can be collected efficiently by sampling blog homepages that contain keywords of a certain type on the date when each keyword occurs most frequently. We manually examine various features of the collected blog homepages: whether their text content is excerpted from other sources, and whether they display affiliate advertisements or outgoing links to affiliated sites. Among various informative results, it is important to note that more than half of the collected splogs are created by a very small number of spammers.
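
The sampling idea described in the abstract (pick the date on which a keyword occurs most often, then sample blog homepages that used the keyword on that date) can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: the post record layout (`date`, `homepage`, `text`), the function names `peak_date` and `sample_homepages`, the substring match, and the tie-breaking rule are all assumptions made for the example.

```python
import random
from collections import Counter
from datetime import date


def peak_date(posts, keyword):
    """Return the date on which `keyword` appears in the most posts, or None."""
    counts = Counter(p["date"] for p in posts if keyword in p["text"])
    if not counts:
        return None
    # Assumption: ties are broken by the earliest date, for determinism.
    return min(counts, key=lambda d: (-counts[d], d))


def sample_homepages(posts, keyword, k, seed=0):
    """Sample up to k distinct blog homepages that used `keyword` on its peak date."""
    day = peak_date(posts, keyword)
    if day is None:
        return []
    homepages = sorted(
        {p["homepage"] for p in posts if p["date"] == day and keyword in p["text"]}
    )
    rng = random.Random(seed)  # fixed seed so the sample is reproducible
    return rng.sample(homepages, min(k, len(homepages)))


# Hypothetical toy data; real input would be a crawl of blog posts.
posts = [
    {"date": date(2008, 1, 1), "homepage": "blog-a", "text": "cheap loans today"},
    {"date": date(2008, 1, 2), "homepage": "blog-b", "text": "cheap loans here"},
    {"date": date(2008, 1, 2), "homepage": "blog-c", "text": "best cheap loans"},
    {"date": date(2008, 1, 2), "homepage": "blog-c", "text": "more cheap loans"},
    {"date": date(2008, 1, 3), "homepage": "blog-d", "text": "a post about cooking"},
]
candidates = sample_homepages(posts, "loans", k=10)
```

The sampled homepages would then be inspected manually, as the abstract describes; the sketch covers only the collection step.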

Published

2021-09-25

How to Cite

Sato, Y., Utsuro, T., Fukuhara, T., Kawada, Y., Murakami, Y., Nakagawa, H., & Kando, N. (2021). Collecting and Analyzing Japanese Splogs based on Characteristics of Keywords. Proceedings of the International AAAI Conference on Web and Social Media, 2(1), 218–219. https://doi.org/10.1609/icwsm.v2i1.18659

Issue

Vol. 2 No. 1 (2008): Second International AAAI Conference on Weblogs and Social Media

Section

Poster Papers
