Machine learning for the classification of texts in hispanic literature by authors

Peñas Escribano, Lucas

dc.contributor.advisorTFE	Pagola Barrio, Miguel
dc.contributor.affiliation	Escuela Técnica Superior de Ingeniería Industrial, Informática y de Telecomunicación	es_ES
dc.contributor.affiliation	Industria, Informatika eta Telekomunikazio Ingeniaritzako Goi Mailako Eskola Teknikoa	eu
dc.contributor.author	Peñas Escribano, Lucas
dc.date.accessioned	2025-02-18T15:26:40Z
dc.date.available	2025-02-18T15:26:40Z
dc.date.issued	2025
dc.date.updated	2025-02-18T13:33:01Z
dc.description.abstract	When we talk about a person's style, we have carried out a probably unconscious exercise in pattern recognition that lets us assert something like “he wouldn't do that, it isn't his style”. With this in mind, it would be interesting to link a more human and artistic question, such as “what is the style of this author?”, with the use of state-of-the-art language models to try to give an objective answer to that question. In this project, I combine techniques learnt throughout the degree (tokenization via bag of words, multiple classification methods…) with new ones learnt by investigating how some recently popular language models work (pre-trained BERT model, bidirectionality…). I chose this project for two main reasons. The first is to go deeper into a kind of problem that I have faced before, learning new techniques and putting them to the test alongside the earlier ones. The second is curiosity about how machine learning approaches a problem that I have seen solved by humans. The two main objectives are to obtain a method that classifies texts by author with a sufficiently high level of confidence, and to extract the traits of each author that the classifier relied on to make its predictions. In conclusion, this project combines gathering data, applying already known procedures, investigating and updating the processes with more advanced techniques, comparing and analyzing results, and finally trying to reach a conclusion that tells us how long the bridge between humans and machines is on this topic.	en
dc.description.degree	Graduado o Graduada en Ingeniería Informática por la Universidad Pública de Navarra (Programa Internacional)	es_ES
dc.description.degree	Informatika Ingeniaritzan Graduatua Nafarroako Unibertsitate Publikoan (Nazioarteko Programa)	eu
dc.format.mimetype	application/pdf	en
dc.identifier.uri	https://academica-e.unavarra.es/handle/2454/53459
dc.language.iso	eng
dc.rights.accessRights	info:eu-repo/semantics/openAccess
dc.subject	Python	en
dc.subject	Machine learning	en
dc.subject	Language model	en
dc.subject	Text classification	en
dc.subject	Bag of words	en
dc.subject	BERT	en
dc.subject	Bidirectionality	en
dc.subject	Neural network	en
dc.subject	Random forest	en
dc.subject	Support Vector Machine	en
dc.subject	Stochastic Gradient Descent	en
dc.subject	Extreme Gradient Boosting	en
dc.title	Machine learning for the classification of texts in hispanic literature by authors	en
dc.type	info:eu-repo/semantics/bachelorThesis
dspace.entity.type	Publication

Files

Original bundle

Name: ML_for_the_classification_of_texts_by_authors.pdf
Size: 2.25 MB
Format: Adobe Portable Document Format

Collections

Trabajos Fin de Grado ETSIIT - TIIGMET Gradu Amaierako Lanak
Trabajos Fin de Grado - Gradu Amaierako Lanak