Tuesday, October 14, 2014

Drilling Down into W3C WordNet Using Grep (a novice adventure)



I downloaded wn20full (WordNet on the W3C site, http://www.w3.org/2006/03/wn/wn20/) and ran two searches of its contents for instances of "wheel". The first search matches the string "wheel" anywhere, with or without spaces around it; the second matches "wheel" only when it has a space on each side. I then looked through the output for senses of wheel that could apply to automobiles. Thank you, grymoire, for your wonderful tutorial on regular expressions (http://www.grymoire.com/Unix/Regular.html), which helped make this possible.

The commands were as follows:

grep -n "wheel" /home/brent/Downloads/wn20full/* > /home/brent/Documents/wheel_long.txt

grep -n " wheel " /home/brent/Downloads/wn20full/* > /home/brent/Documents/wheel_short.txt
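
To see how the two patterns differ, here is a minimal sketch that runs them against a small made-up file. The sample lines are hypothetical and are not taken from wn20full. The third search uses -w (whole-word matching), which is an assumption on my part: it is available in GNU grep but was not one of the commands above.

```shell
# Hypothetical sample file standing in for one of the wn20full files.
sample=$(mktemp)
printf '%s\n' \
  'a car wheel on an axle' \
  'wheel' \
  'the steering wheel turns' \
  'a wheelbarrow' \
  'synset-car_wheel-noun-1' > "$sample"

# Substring match: counts every line containing "wheel" anywhere.
long_count=$(grep -c "wheel" "$sample")

# Space-delimited match: counts only lines where "wheel" has a space on each side.
short_count=$(grep -c " wheel " "$sample")

# Whole-word match (assumed GNU grep): "wheel" must not touch letters, digits, or "_".
word_count=$(grep -cw "wheel" "$sample")

echo "substring=$long_count spaces=$short_count word=$word_count"
rm -f "$sample"
```

The space-delimited pattern misses a line that consists only of "wheel", because there is no space before or after it. The -w variant catches that line while still skipping "wheelbarrow" and "car_wheel", so it may be a useful middle ground between the long and short outputs.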
-----------------------------------
I plan to use these results for the REA construction presented earlier. The next step is to figure out which of these entries to use in it, and whether they can be referenced online. It would be wonderful if I could also run a SPARQL query against them.

1 comment:

  1. According to http://www.w3.org/2006/03/wn/wn20/ : "For example, it is possible to use the Full version with only the hyponym relation and exclude the other relations."

    Thus car_wheel

    http://www.w3.org/2006/03/wn/wn20/instances/synset-car_wheel-noun-1.rdf

    is a hyponym of:

    http://www.w3.org/2006/03/wn/wn20/instances/synset-wheel-noun-1.rdf

    according to wordnet-hyponym.rdf in the wn20 download

    Hyponym is defined in the wnfull ontology

    http://www.w3.org/2006/03/wn/wn20/schemas/wnfull.rdfs
