Analyzing and Visualizing Translation Patterns of Wikidata Properties

John Samuel

CPE Lyon

CLEF 2018, Avignon, 10th September, 2018


Creative Commons License
Wikipedia Infobox
Wikipedia Infoboxes
  • Wikidata
    • Started in 2012
    • A free, open, linked, structured, collaborative, and multilingual knowledge base
    • From multi-(sub)domain multilingual Wikipedia sites to a single-domain multilingual website
    • Collaborative Multilingual Multi-domain Ontology development
Wikipedia to Wikidata
Wikidata to Wikipedia
Wikipedia Infobox Properties
Wikidata
Wikidata Properties
P17: Translation History
  1. WDProp:
    • Collaborative Multilingual Multi-domain Ontology development: is it possible to achieve a truly multilingual experience?
  2. Goals:
    • Understanding Wikidata property proposal, creation and translation
    • Available templates and their usage
    • Providing real-time statistics to (multilingual) contributors
WDProp
  • WDProp
    • Get real-time translation statistics
    • Navigate supported languages, properties, datatypes, classes
    • Compare translation statistics
    • Find available properties for an entity
    • Uses the Wikidata SPARQL endpoint and the MediaWiki API
  • URL
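The statistics listed above can be approximated with a single SPARQL query per language. The following is a minimal sketch, not WDProp's actual implementation: the function names, the User-Agent string, and the choice to count distinct properties with a label in a given language are all assumptions made for illustration. It relies only on the public Wikidata Query Service endpoint and on the `wikibase:` and `rdfs:` prefixes that endpoint predefines.

```python
import json
import urllib.parse
import urllib.request

# Public Wikidata Query Service endpoint.
SPARQL_ENDPOINT = "https://query.wikidata.org/sparql"


def build_label_count_query(language_code: str) -> str:
    """Build a SPARQL query counting properties labelled in one language."""
    # Reject anything that is not a plain language code such as "fr" or "zh-hans".
    if not language_code.replace("-", "").isalnum():
        raise ValueError(f"unexpected language code: {language_code!r}")
    return (
        "SELECT (COUNT(DISTINCT ?property) AS ?count) WHERE {\n"
        "  ?property a wikibase:Property ;\n"
        "            rdfs:label ?label .\n"
        f'  FILTER(LANG(?label) = "{language_code}")\n'
        "}"
    )


def parse_count(response: dict) -> int:
    """Extract the single ?count value from a SPARQL JSON result."""
    return int(response["results"]["bindings"][0]["count"]["value"])


def translation_coverage(translated: int, total: int) -> float:
    """Percentage of properties that have a label in the language."""
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * translated / total


def fetch_label_count(language_code: str) -> int:
    """Run the query against the live endpoint (requires network access)."""
    params = urllib.parse.urlencode(
        {"query": build_label_count_query(language_code), "format": "json"}
    )
    request = urllib.request.Request(
        f"{SPARQL_ENDPOINT}?{params}",
        # Hypothetical User-Agent; replace with one identifying your tool.
        headers={"User-Agent": "wdprop-sketch/0.1 (illustrative example)"},
    )
    with urllib.request.urlopen(request) as reply:
        return parse_count(json.load(reply))
```

Calling `fetch_label_count("fr")` and dividing by the total number of properties with `translation_coverage` gives a per-language percentage that can be compared across languages. Because the query runs against live data, results change as contributors add labels.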
WDProp: Translation path
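One way to read a translation path is as the order in which a property's label first appeared in each language. The sketch below is an illustration under stated assumptions rather than the method used by WDProp: it assumes revisions are available as (ISO 8601 timestamp, edit comment) pairs, for example from the MediaWiki API, and that label edits carry automatic summaries of the form `/* wbsetlabel-add:1|fr */`. Edits made through other means may not match this pattern. The function name and the sample data are hypothetical.

```python
import re
from typing import Iterable, List, Tuple

# Assumed shape of Wikibase automatic edit summaries for label changes,
# e.g. "/* wbsetlabel-add:1|fr */ pays". Other edit types are ignored.
LABEL_EDIT = re.compile(r"/\* wbsetlabel-(?:add|set):\d+\|([a-z0-9-]+) \*/")


def translation_path(revisions: Iterable[Tuple[str, str]]) -> List[str]:
    """Return language codes in the order their labels first appeared.

    `revisions` holds (ISO 8601 timestamp, edit comment) pairs in any order.
    """
    path: List[str] = []
    seen = set()
    # ISO 8601 timestamps in the same format sort chronologically as strings.
    for _timestamp, comment in sorted(revisions):
        match = LABEL_EDIT.search(comment)
        if match is None:
            continue  # not a label edit (description, alias, statement, ...)
        language = match.group(1)
        if language not in seen:
            seen.add(language)
            path.append(language)
    return path


# Made-up revision history, for illustration only.
sample_revisions = [
    ("2013-02-01T10:00:00Z", "/* wbsetlabel-add:1|fr */ pays"),
    ("2013-01-31T09:00:00Z", "/* wbsetlabel-add:1|en */ country"),
    ("2013-02-03T12:00:00Z", "/* wbsetlabel-set:1|fr */ pays"),
    ("2013-02-02T08:00:00Z", "/* wbsetdescription-add:1|de */ example"),
    ("2013-02-04T08:00:00Z", "/* wbsetlabel-add:1|de */ Staat"),
]
print(translation_path(sample_revisions))
```

Only the first label edit per language contributes to the path; later corrections to an existing label and edits to descriptions or aliases are skipped.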
  • Related Works:
    • Multilinguality in Wikidata
    • Property label stability in Wikidata
    • Linguistic influence patterns in Wikipedia
    • Deletion patterns in Wikipedia
Current Work (Work done with Thibaut Chamard)
  • Conclusion:
    • Wikidata properties form the foundation of the Wikidata knowledge base
    • Though few in number (around 5200) compared to items (around 50 million), their creation and evolution need to be fully understood
    • Translation recommendations for contributors

References

  1. Kaffee, L. A., Piscopo, A., Vougiouklis, P., Simperl, E., Carr, L., & Pintscher, L. (2017, August). A glimpse into Babel: an analysis of multilinguality in Wikidata. In Proceedings of the 13th International Symposium on Open Collaboration (p. 14). ACM.
  2. Müller-Birn, C., Karran, B., Lehmann, J., & Luczak-Rösch, M. (2015, August). Peer-production system or collaborative ontology engineering effort: What is Wikidata?. In Proceedings of the 11th International Symposium on Open Collaboration (p. 20). ACM.
  3. Pellissier Tanon, T., & Kaffee, L. A. (2018, April). Property label stability in Wikidata: evolution and convergence of schemas in collaborative knowledge bases. In Companion of The Web Conference 2018 (pp. 1801-1803). International World Wide Web Conferences Steering Committee.
  4. Samoilenko, A., Karimi, F., Kunegis, J., Edler, D., & Strohmaier, M. (2015, June). Linguistic influence patterns within the global network of Wikipedia language editions. In Proceedings of the ACM Web Science Conference (p. 54). ACM.
  5. Samuel, J. (2017). Collaborative approach to developing a multilingual ontology: A case study of Wikidata. In Research Conference on Metadata and Semantics Research (pp. 167-172). Springer, Cham.
  6. Samuel, J. (2018). Towards understanding and improving multilingual collaborative ontology development in Wikidata. In WikiWorkshop 2018.
  7. Stefaner, M., Taraborelli, D., & Ciampaglia, G. L. (2011). Notabilia–Visualizing Deletion Discussions on Wikipedia.

Thank you

Questions?