Publication: TBDClust: time-based density clustering to enable free browsing of sites in pay-per-use mobile Internet providers
Abstract
The World Wide Web has evolved rapidly, incorporating new content types and becoming more dynamic. The contents of a website can be distributed across several servers and, as a consequence, web traffic has become increasingly complex. From a network traffic perspective, it can be difficult to ascertain which websites a user is visiting, let alone which part of the user's traffic each website is responsible for. In this paper we present a method for identifying the TCP connections involved in the same full webpage download without the need for deep packet inspection. This identification is needed, for example, to enable free browsing of specific websites over pay-per-use mobile Internet access. The free websites could be not only sites promoted by third parties but also portals to governmental or medical emergency websites. The proposal is based on a modification of the DBSCAN clustering algorithm that works online and over one-dimensional sorted data. To validate our results we use both real traffic and packet captures from a controlled environment. The proposal achieves excellent results in consistency (99%) and completeness (92%), meaning that it makes few errors when identifying webpage downloads.
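As a rough illustration of density clustering over one-dimensional sorted data, the sketch below groups TCP connection start times as they arrive. It is not the TBDClust algorithm itself, whose details are not given in this abstract. It assumes the simplest DBSCAN setting, a minimum of one point per cluster, in which clustering sorted one-dimensional data reduces to starting a new cluster whenever the gap to the previous point exceeds `eps`. The function name `cluster_sorted_times`, the parameter `eps`, and the sample timestamps are hypothetical.

```python
from typing import Iterable, Iterator, List


def cluster_sorted_times(times: Iterable[float], eps: float) -> Iterator[List[float]]:
    """Yield clusters of non-decreasing timestamps, one cluster at a time.

    Assumption: DBSCAN with a minimum of one point per cluster, so every
    point is a core point and a cluster is a maximal run of points whose
    consecutive gaps are all <= eps.
    """
    current: List[float] = []
    for t in times:
        if current and t < current[-1]:
            raise ValueError("timestamps must be non-decreasing")
        if current and t - current[-1] > eps:
            # The gap is too large: the previous cluster is complete,
            # so it can be emitted before any later data is seen.
            yield current
            current = []
        current.append(t)
    if current:
        # Emit the final, still-open cluster when the input ends.
        yield current


if __name__ == "__main__":
    # Hypothetical connection start times in seconds.
    starts = [0.0, 0.2, 0.5, 3.0, 3.1, 9.0]
    for group in cluster_sorted_times(starts, eps=1.0):
        print(group)
```

Because the input is sorted, each point is compared only with its predecessor, so the sketch needs a single pass and can emit a cluster as soon as the gap that closes it is observed.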
Rights
© 2017 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license
Documents in Academica-e are protected by copyright, with all rights reserved, unless otherwise indicated.