Theatres in the UK and classical music
Methodology, tools and data sources
The Polifonia corpus was a crucial starting point for our research. We used it to gain a better understanding of the themes we would explore in our story. By selecting the MusicBo “module”, we could see how words such as “composer” and “theatre”, which we later used to formulate our queries, are used in real contexts. It was particularly useful to learn which words “classical” collocates with, since this informed our data search. Nouns such as “composer”, “composition” and “music” were, in fact, the starting point for our Wikidata research. When a Wikidata search returned no matches, we looked up synonyms in Polifonia through the “Type concept” tool. For instance, when we were looking for classical songs made by “players”, it was useful to learn that a synonym of “player” is “musician”, because this term allowed us to retrieve data.
After a thorough analysis of the queries we were given in class, we moved on to the next step. We explored FRED’s graphs to understand how to structure our queries and how to name our “entities”. Using Wikidata, we identified the “wd/wdt codes” corresponding to the resources and properties that we were going to analyze. After several attempts at running queries on specific topics of interest in the Wikidata Query Service, we obtained some results.
To query RDF graphs, we used SPARQL, the standard query language for RDF. We wrote the triple patterns that SPARQL requires, each consisting of a subject, a predicate, and an object. Our queries were made up of two main parts: in the SELECT clause we listed the variables we wanted to retrieve and the order in which they should appear, and in the WHERE clause we put the constraints. Furthermore, we used additional keywords, such as LIMIT, OFFSET and ORDER BY DESC, to narrow down and sort the results.
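As an illustration of this query structure, the following minimal Python sketch assembles a SPARQL query string from a SELECT clause, a WHERE clause made of triple patterns, and the ORDER BY DESC, LIMIT and OFFSET modifiers. The helper name build_query and the example identifiers (wdt:P106 for occupation, wd:Q36834 for composer, wdt:P569 for date of birth) are assumptions chosen for illustration and are not taken from the queries used in the project; they should be checked on Wikidata before use.

```python
def build_query(select_vars, patterns, order_by_desc=None, limit=None, offset=None):
    """Assemble a SPARQL query string (hypothetical helper, for illustration only)."""
    # SELECT clause: which variables to return, and in what order.
    lines = ["SELECT " + " ".join(select_vars), "WHERE {"]
    # WHERE clause: one triple pattern (subject, predicate, object) per line.
    for subject, predicate, obj in patterns:
        lines.append(f"  {subject} {predicate} {obj} .")
    lines.append("}")
    # Solution modifiers: sort first, then cut the result list.
    if order_by_desc:
        lines.append(f"ORDER BY DESC({order_by_desc})")
    if limit is not None:
        lines.append(f"LIMIT {limit}")
    if offset is not None:
        lines.append(f"OFFSET {offset}")
    return "\n".join(lines)


# Example: composers and their dates of birth, latest first (assumed identifiers).
query = build_query(
    select_vars=["?composer", "?birth"],
    patterns=[
        ("?composer", "wdt:P106", "wd:Q36834"),  # occupation: composer (assumed IDs)
        ("?composer", "wdt:P569", "?birth"),     # date of birth (assumed ID)
    ],
    order_by_desc="?birth",
    limit=10,
    offset=0,
)
print(query)
```

The printed string can be pasted into the Wikidata Query Service, where the wd: and wdt: prefixes are predefined.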
Lastly, we transferred our working queries to the Melody platform, using all the chart types available to create a data story that shows the information we retrieved during the exploration process. Moreover, we conducted extensive web research to add background information to our charts and to present the results in context.
The challenges
We encountered a significant number of challenges while carrying out the project. On the one hand, the lack of data on Wikidata forced us to change the object of our queries several times. For instance, we couldn’t find concrete examples of classical music. On the other hand, when we did obtain results, they sometimes looked rather doubtful: when we tried to add a count query about the number of theatres in London, the result was always 7, which does not correspond to reality. Conversely, when Wikidata gave us reliable results, Melody would often not display any chart; the same thing happened with queries run on DBpedia. This occurred even when the query worked correctly and followed the model for obtaining a chart that was given during the course. Another major challenge was that Melody did not save some of our edits to the story, so we had to redo them. We addressed these problems by changing the object of the queries when no results could be found, or by recreating the DBpedia queries on Wikidata. Overall, facing these challenges was rather useful, as we could apply the skills acquired through practice to try to overcome them and learn from our mistakes.
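A count query of the kind mentioned above could be sketched as follows. This is only a hedged illustration, not the query used in the project and not a verified explanation for the result of 7. It assumes that wd:Q24354 identifies theatre buildings, wd:Q84 identifies London, wdt:P31 means “instance of”, wdt:P279 means “subclass of” and wdt:P131 means “located in the administrative territorial entity”. The function name count_theatres_query is hypothetical. With transitive=True, the query uses SPARQL property paths so that it also counts items typed with a subclass of theatre and items located in a borough or district of the given place rather than directly in it; comparing the two variants is one way to check whether a low count comes from matching only direct statements.

```python
def count_theatres_query(place_qid, transitive=True):
    """Build a SPARQL COUNT query for theatres in a place (illustrative sketch)."""
    if transitive:
        # Property paths: follow subclass and location chains of any length.
        type_path = "wdt:P31/wdt:P279*"
        location_path = "wdt:P131*"
    else:
        # Direct statements only.
        type_path = "wdt:P31"
        location_path = "wdt:P131"
    return (
        "SELECT (COUNT(DISTINCT ?theatre) AS ?total)\n"
        "WHERE {\n"
        f"  ?theatre {type_path} wd:Q24354 .\n"      # theatre building (assumed ID)
        f"  ?theatre {location_path} wd:{place_qid} .\n"
        "}"
    )


london_query = count_theatres_query("Q84")  # Q84: London (assumed ID)
print(london_query)
```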