Torres García, Luis Miguel; Magaña Lizarrondo, Eduardo; Izal Azcárate, Mikel; Morató Osés, Daniel
2016-10-03
2016-10-03
2013
978-1-4799-2969-6 (Electronic)
10.1109/GIIS.2013.6684350
https://academica-e.unavarra.es/handle/2454/22356
Paper presented at the IEEE Global Information Infrastructure and Networking Symposium 2013 (GIIS 2013), October 28-31, Trento (Italy).
The complexity of web traffic has grown in recent years as websites evolve and new services are provided over the HTTP protocol. Accessing a website opens multiple connections to different servers, and it is usually difficult to tell which servers belong to which sites. This information is nevertheless useful for security and accounting, and it can also help to label web traffic for use as ground truth in traffic classification systems. In this paper we present a method to discover the server IP addresses related to specific websites in a traffic trace. Our method uses NetFlow-type records, which makes it scalable and impervious to encryption of packet payloads. It is, moreover, popularity-aware in the sense that it takes into account the differences in the number of accesses to each site in order to identify servers more accurately. The method can be used to gather data from a group of websites of interest or, when applied to a representative set of websites, to label a sizeable number of connections in a packet trace.
application/pdf
eng
©2013 IEEE
Packet trace; Popularity-aware method; Server IP addresses; Web sites; HTTP protocol; Web traffic; Traffic classification systems; Ground truth; Traffic trace; NetFlow-type records; Encryption; Packet payloads
A popularity-aware method for discovering server IP addresses related to websites
info:eu-repo/semantics/conferenceObject
info:eu-repo/semantics/openAccess