TBDClust: time-based density clustering to enable free browsing of sites in pay-per-use mobile Internet providers
Date
2017
Author
Version
Open access
Type
Article
Version
Published version
Impact
10.1016/j.jnca.2017.10.007
Abstract
The World Wide Web has evolved rapidly, incorporating new content types and becoming more dynamic. The contents of a website can be distributed across several servers, and as a consequence, web traffic has become increasingly complex. From a network traffic perspective, it can be difficult to ascertain which websites a user is visiting, let alone which part of the user's traffic each website is responsible for. In this paper we present a method for identifying the TCP connections involved in the same full webpage download without the need for deep packet inspection. This identification is needed, for example, to enable free browsing of specific websites on pay-per-use mobile Internet access. Such free browsing could apply not only to third-party promoted websites but also to portals for governmental or medical emergency websites. The proposal is based on a modification of the DBSCAN clustering algorithm to work online and over one-dimensional sorted data. To validate our results we use both real traffic and packet captures from a controlled environment. The proposal achieves excellent results in consistency (99%) and completeness (92%), meaning that its error margin in identifying webpage downloads is minimal.
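To make the idea of online clustering over one-dimensional sorted data more concrete, the following is a minimal sketch and not the algorithm from the paper. It assumes that the only feature is the start time of each TCP connection, that timestamps arrive in non-decreasing order, and that a new group begins whenever the gap to the previous connection exceeds a threshold. The names `cluster_by_time`, `eps`, and `min_size`, as well as the sample timestamps, are hypothetical.

```python
from typing import Iterable, Iterator, List


def cluster_by_time(
    start_times: Iterable[float], eps: float, min_size: int = 2
) -> Iterator[List[float]]:
    """Yield groups of timestamps whose consecutive gaps are at most eps.

    Assumption: start_times arrive in non-decreasing order, so each value
    only needs to be compared with the previous one (one pass, online).
    Groups with fewer than min_size members are treated as noise and dropped.
    """
    current: List[float] = []
    for t in start_times:
        # A gap larger than eps closes the current group.
        if current and t - current[-1] > eps:
            if len(current) >= min_size:
                yield current
            current = []
        current.append(t)
    # Flush the last open group once the stream ends.
    if len(current) >= min_size:
        yield current


# Hypothetical TCP connection start times, in seconds.
starts = [0.0, 0.1, 0.3, 0.35, 5.0, 9.0, 9.2, 9.25]
clusters = list(cluster_by_time(starts, eps=1.0))
print(clusters)
```

With these sample values the isolated connection at 5.0 s is discarded as noise and the remaining connections fall into two groups. A real deployment would need to choose `eps` and `min_size` from measured traffic, which this sketch does not attempt.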
Subject
Clustering TCP connections,
Time-based density clustering,
DBSCAN,
Mobile web browsing,
Online monitoring,
Real traffic dataset
Publisher
Elsevier
Published in
Journal of Network and Computer Applications, 99 (2017) 17-27
Department
Universidad Pública de Navarra. Departamento de Automática y Computación /
Nafarroako Unibertsitate Publikoa. Automatika eta Konputazioa Saila /
Universidad Pública de Navarra/Nafarroako Unibertsitate Publikoa. Institute of Smart Cities - ISC
Publisher version
Sponsorship
This work is supported by the Spanish MINECO through project PIT (TEC2015-69417-C2-2-R).