Knowledge graphs: Difference between revisions

    From Consumerium development wiki R&D Wiki
    (→‎Lexicographical Wikidata: + Since 2018, Wikidata has also stored a new type of data: words, phrases and sentences, in many languages, described in many languages. This information is stored in new types of entities, called Lexemes (L), Forms (F) and Senses (S).)
    Line 96: Line 96:
    A '''lexeme''' is a unit of [[w:lexical semantics|lexical]] meaning that underlies a set of words that are related through [[w:inflection|inflection]]. It is a basic abstract unit of meaning,<ref>''The Cambridge Encyclopedia of The English Language''. Ed. [[w:David Crystal|]]. Cambridge: Cambridge University Press, 1995. p.&nbsp;118. {{ISBN|0521401798}}.</ref> a [[w:emic unit|unit]] of [[w:Morphology (linguistics)|morphological]] [[w:Semantic analysis (linguistics)|analysis]] in [[w:linguistics]] that roughly corresponds to a set of forms taken by a single root [[w:word]]. For example, in [[w:English language|English]], ''run'', ''runs'', ''ran'' and ''running'' are forms of the same lexeme, which can be represented as <span style="font-variant:small-caps; text-transform:lowercase;">RUN</span> (Wikipedia on 2019-12-29)


    Since 2018, Wikidata has also stored a new type of data: words, phrases and sentences, in many languages, described in many languages. This information is stored in new types of entities, called '''Lexemes''' ('''L'''), '''Forms''' ('''F''') and '''Senses''' ('''S''').<ref>[[wikibooks:SPARQL/WIKIDATA Lexicographical data</ref>
    Since 2018, Wikidata has also stored a new type of data: words, phrases and sentences, in many languages, described in many languages. This information is stored in new types of entities, called '''Lexemes''' ('''L'''), '''Forms''' ('''F''') and '''Senses''' ('''S''').<ref>[[wikibooks:SPARQL/WIKIDATA Lexicographical data</ref>. This is enabled by [[mw:Extension:WikibaseLexeme|the WikibaseLexeme extension]].


    * [[wikibooks:SPARQL/WIKIDATA Lexicographical data|Wikibook on Wikidata's lexicographical data]]

    Revision as of 18:54, 29 December 2019

    Acquiring access for our consumers to a semantic network of relevant linked open data, compiled by other efforts and structured by a number of ontologies, is obviously key to Consumerium. Reciprocally, we aim to make the information we gather and compile available to other efforts.

    DBpedia


    DBpedia (.org) is a community effort to move the web "Towards a Public Data Infrastructure for a Large, Multilingual, Semantic Knowledge Graph".

    Today the DBpedia data sets contain a wealth of information structured into ontologies. This structured data can be queried with the SPARQL query language at the public DBpedia SPARQL endpoint.
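
    As a rough illustration of querying the public endpoint, the sketch below builds a GET request URL for a SPARQL query. It assumes the endpoint lives at https://dbpedia.org/sparql and accepts a `format` parameter for JSON results (a Virtuoso convention); the helper name `build_query_url` and the sample query are made up for this example, and no network request is sent.

```python
from urllib.parse import urlencode

# Assumption: the public DBpedia SPARQL endpoint is served at this URL.
DBPEDIA_ENDPOINT = "https://dbpedia.org/sparql"


def build_query_url(sparql: str, endpoint: str = DBPEDIA_ENDPOINT) -> str:
    """Return a GET URL asking the endpoint for SPARQL results as JSON.

    Hypothetical helper for illustration; it only builds the URL and
    does not contact the endpoint.
    """
    params = {
        "query": sparql,
        # Assumption: the endpoint honours this Virtuoso-style parameter.
        "format": "application/sparql-results+json",
    }
    return f"{endpoint}?{urlencode(params)}"


# Illustrative query: ten resources typed as companies, with English labels.
EXAMPLE_QUERY = """
PREFIX dbo: <http://dbpedia.org/ontology/>
PREFIX rdfs: <http://www.w3.org/2000/01/rdf-schema#>
SELECT ?company ?label WHERE {
  ?company a dbo:Company ;
           rdfs:label ?label .
  FILTER (lang(?label) = "en")
}
LIMIT 10
"""

if __name__ == "__main__":
    # Print the URL; fetch it with any HTTP client to run the query.
    print(build_query_url(EXAMPLE_QUERY))
```

    The resulting URL can be opened in a browser or fetched with any HTTP client; the JSON response lists one binding per matching resource.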

    There are many ways the DBpedia ontology and datasets could be used in the Consumerium implementation stage wiki.

    Ontology classes useful for implementing Consumerium

    All DBpedia ontology classes

    DBpedia datasets

    DBpedia Databus

    At DBpedia there is ongoing work on what is called DBpedia Databus to take their game to the next level. Databus Alpha was published in May 2018.

    History of DBpedia

    DBpedia began as an effort to extract structured information from Wikipedia infobox templates and categories and to make this information available on the Web with the initial release on January 10th 2007.

    More info on DBpedia


    Wikidata


    Wikidata (.org) is a knowledge base, an effort to store and serve structured data to Wikimedia wikis and, to a more limited extent, to other parties. The Wikidata effort was launched in 2012.

    The underlying software is Wikibase, which consists of two MediaWiki extensions, the Wikibase Repository and the Wikibase Client.

    Wikibase allows interwiki links to be managed with Wikidata, removing much contributor annoyance, redundancy and error-proneness.

    Wikidata is obviously a very viable source of reference-level data once it is technically possible for non-WMF wikis to access the data items. (See the LinkedWiki extension section for a potential workaround for this limitation.)

    It can be accessed outside of WMF wikis with

    The main entry point of any Wikidata item is a JSON dictionary that has this form:

    {"labels": by-language dictionary,
     "descriptions": by-language dictionary,
     "aliases": by-language dictionary,
     "claims": list of properties and values,
     "sitelinks": by-language dictionary}
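
    To make the shape above more concrete, here is a minimal sketch that reads labels and aliases out of such a dictionary. The sample item is hand-written and hypothetical (it is not real Wikidata content), and the helper names are invented for this example. It assumes that each label entry is itself a small dictionary with `language` and `value` keys and that aliases are lists of such entries, which is how Wikidata's JSON output represents them. The `Special:EntityData` URL pattern is included only to show where a real item's JSON could be fetched.

```python
from typing import Any, Dict, List, Optional

# URL pattern for an item's JSON; Q-identifier is filled in by entity_url().
ENTITY_DATA_URL = "https://www.wikidata.org/wiki/Special:EntityData/{qid}.json"

# Hand-written, heavily abridged sample item (hypothetical, not real data).
SAMPLE_ITEM: Dict[str, Any] = {
    "labels": {
        "en": {"language": "en", "value": "example item"},
        "fi": {"language": "fi", "value": "esimerkkikohde"},
    },
    "descriptions": {
        "en": {"language": "en", "value": "item used for illustration"},
    },
    "aliases": {
        "en": [{"language": "en", "value": "sample item"}],
    },
    "claims": {},
    "sitelinks": {},
}


def entity_url(qid: str) -> str:
    """Return the URL from which the JSON of item `qid` could be fetched."""
    return ENTITY_DATA_URL.format(qid=qid)


def label(item: Dict[str, Any], lang: str, fallback: str = "en") -> Optional[str]:
    """Return the label in `lang`, else in `fallback`, else None."""
    labels = item.get("labels", {})
    entry = labels.get(lang) or labels.get(fallback)
    return entry["value"] if entry else None


def aliases(item: Dict[str, Any], lang: str) -> List[str]:
    """Return all aliases in `lang` (an empty list if there are none)."""
    return [alias["value"] for alias in item.get("aliases", {}).get(lang, [])]


if __name__ == "__main__":
    print(entity_url("Q42"))
    print(label(SAMPLE_ITEM, "fi"), aliases(SAMPLE_ITEM, "en"))
```

    Falling back to English when a label is missing is a design choice of this sketch, not something Wikidata prescribes.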

    Lexicographical Wikidata

    A lexeme is a unit of lexical meaning that underlies a set of words that are related through inflection. It is a basic abstract unit of meaning,[1] a unit of morphological analysis in linguistics that roughly corresponds to a set of forms taken by a single root word. For example, in English, run, runs, ran and running are forms of the same lexeme, which can be represented as RUN. (Wikipedia on 2019-12-29)

    Since 2018, Wikidata has also stored a new type of data: words, phrases and sentences, in many languages, described in many languages. This information is stored in new types of entities, called Lexemes (L), Forms (F) and Senses (S).[2] This is enabled by the WikibaseLexeme extension.
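
    The relationship between a lexeme and its forms can be sketched with a small data model, using the RUN example above. This is not the WikibaseLexeme implementation; the class names, the helper functions and the lexeme number are hypothetical. The only convention borrowed from Wikidata is the identifier pattern, where a form or sense ID is the lexeme ID followed by `-F<n>` or `-S<n>`.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class Form:
    id: str              # e.g. "L999999-F1"
    representation: str  # the written form, e.g. "runs"


@dataclass
class Lexeme:
    id: str        # e.g. "L999999" (made-up number)
    lemma: str
    language: str
    forms: List[Form] = field(default_factory=list)

    def add_form(self, representation: str) -> Form:
        """Append a form, numbering it after the existing ones."""
        form = Form(f"{self.id}-F{len(self.forms) + 1}", representation)
        self.forms.append(form)
        return form


# Matches "L<n>", "L<n>-F<n>" and "L<n>-S<n>".
_ENTITY_ID = re.compile(r"^L(\d+)(?:-([FS])(\d+))?$")


def entity_kind(entity_id: str) -> str:
    """Classify an ID as 'lexeme', 'form' or 'sense'."""
    match = _ENTITY_ID.match(entity_id)
    if match is None:
        raise ValueError(f"not a lexeme, form or sense ID: {entity_id!r}")
    return {None: "lexeme", "F": "form", "S": "sense"}[match.group(2)]


def build_run_lexeme() -> Lexeme:
    """Build the RUN lexeme from the text with its four forms."""
    run = Lexeme(id="L999999", lemma="run", language="en")
    for word in ("run", "runs", "ran", "running"):
        run.add_form(word)
    return run


if __name__ == "__main__":
    for form in build_run_lexeme().forms:
        print(form.id, form.representation, entity_kind(form.id))
```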

    Useful information

    More info

    Wikibase


    Wikibase (wikiba.se) is a system for storing and querying structured data that powers Wikidata and other wikis.

    Wikibase consists of two extensions:

    1. Wikibase Repository that allows a wiki to work as a repository for structured data.
    2. Wikibase Client that allows a wiki to access structured data from a repository. The client can work only with repository databases it can access so they must be on the same machine or the same load balancer.

    Installation of Wikibase

    See the Wikibase installation instructions at MediaWiki.org and the advanced configuration of Wikibase.

    The installation instructions assume you are installing the dependencies with Composer, a PHP package manager that makes the installation of dependencies easy.

    Useful extensions in conjunction with Wikibase

    Alternative to using Wikibase Client

    Useful information


    Semantic MediaWiki

    Semantic MediaWiki (.org) (SMW) is a free, open-source extension to MediaWiki that lets you store and query semantic data within the wiki. It seems well suited to Consumerium's information infrastructure needs.
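
    As a hedged sketch of what querying semantic data could look like from outside the wiki, the snippet below builds a URL for the `ask` action that Semantic MediaWiki adds to the MediaWiki API. The wiki address, the category `Product` and the property `Manufacturer` are hypothetical placeholders, and the helper `build_ask_url` is invented for this example; nothing is sent over the network.

```python
from typing import Sequence
from urllib.parse import urlencode


def build_ask_url(api_url: str, conditions: str,
                  printouts: Sequence[str], limit: int = 10) -> str:
    """Build an api.php URL for an SMW ask query.

    `conditions` is the selection part of the query, e.g.
    "[[Category:Product]]"; each entry in `printouts` is a property
    whose values should be returned. Parts are joined with "|".
    """
    parts = [conditions]
    parts += [f"?{prop}" for prop in printouts]
    parts.append(f"limit={limit}")
    params = {"action": "ask", "query": "|".join(parts), "format": "json"}
    return f"{api_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Hypothetical wiki, category and property names.
    print(build_ask_url("https://wiki.example.org/api.php",
                        "[[Category:Product]]", ["Manufacturer"], limit=5))
```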

    Spinoff extensions

    A variety of open-source MediaWiki extensions exist that use the data structure provided by Semantic MediaWiki.

    Among the most notable of the Semantic MediaWiki extensions are:

    • Semantic Forms - enables user-created forms for adding and editing pages that use semantic data
    • Semantic Result Formats - provides a large number of display formats for semantic data, including charts, graphs, calendars and mathematical functions
    • Semantic Drilldown - provides a faceted browser interface for viewing the semantic data in a wiki
    • Semantic Maps - displays geographic semantic data using various mapping services

    LinkedWiki extension

    A possible way to tap into various knowledge graphs is the LinkedWiki extension. LinkedWiki has been developed since 2010 by Karima Rafes, a Semantic MediaWiki developer and CEO of BorderCloud.com.

    See also

    1. The Cambridge Encyclopedia of The English Language. Ed. David Crystal. Cambridge: Cambridge University Press, 1995. p. 118. ISBN 0521401798.
    2. Wikibooks: SPARQL/WIKIDATA Lexicographical data