Tuesday, October 14, 2014
Drilling Down into W3C WordNet Using Grep (a novice adventure)
I downloaded wn20full, the WordNet conversion on the W3C site (http://www.w3.org/2006/03/wn/wn20/), and searched the contents for instances of wheel. I ran two searches: the first matches the string "wheel" anywhere, with or without spaces around it, and the second matches only "wheel" with a space on each side. I then looked through the output for senses of wheel that could apply to automobiles. Thank you grymoire for your wonderful tutorial on regular expressions (http://www.grymoire.com/Unix/Regular.html) that helped make this possible.
The commands were as follows:
grep -n "wheel" /home/brent/Downloads/wn20full/* > /home/brent/Documents/wheel_long.txt
grep -n " wheel " /home/brent/Downloads/wn20full/* > /home/brent/Documents/wheel_short.txt
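To see how the two patterns differ, here is a minimal sketch on a throwaway sample file. The sample lines and the temporary path are invented for illustration and are not taken from the download. It also tries grep -w, which matches whole words and so catches "wheel" next to punctuation or at the end of a line, cases the spaced pattern misses.

```shell
# Build a small sample file; these lines are invented for illustration.
sample=$(mktemp)
printf '%s\n' \
  'a wheel that turns' \
  'car_wheel' \
  'the steering wheel.' \
  'wheelbarrow' > "$sample"

# Substring match: counts every line containing "wheel" anywhere.
long_count=$(grep -c "wheel" "$sample")

# Spaced match: counts only lines with a space on both sides of "wheel".
short_count=$(grep -c " wheel " "$sample")

# Whole-word match: "wheel" bounded by non-word characters or line edges.
# Underscore counts as a word character, so car_wheel is not matched.
word_count=$(grep -c -w "wheel" "$sample")

echo "substring: $long_count, spaced: $short_count, whole word: $word_count"
rm -f "$sample"
```

On this sample the substring search is the broadest and the spaced search the narrowest, which is why wheel_long.txt comes out larger than wheel_short.txt.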
-----------------------------------
I plan to use these for the REA construction presented earlier. The next step is to figure out which of these entries to use, and whether they can be referenced online. It would be wonderful if I could also do a SPARQL query.
According to http://www.w3.org/2006/03/wn/wn20/ : "For example, it is possible to use the Full version with only the hyponym relation and exclude the other relations."
Thus car_wheel
http://www.w3.org/2006/03/wn/wn20/instances/synset-car_wheel-noun-1.rdf
is a hyponym of:
http://www.w3.org/2006/03/wn/wn20/instances/synset-wheel-noun-1.rdf
according to wordnet-hyponym.rdf in the wn20 download
Hyponym is defined in the wnfull ontology
http://www.w3.org/2006/03/wn/wn20/schemas/wnfull.rdfs
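The same grep approach can check the hyponym link directly. This is a rough sketch, assuming wordnet-hyponym.rdf stores each relation as a hyponymOf element nested under the synset it describes. The sample below is hand-written to mimic that layout and is not copied from the download, so the element and prefix names are assumptions.

```shell
# Hand-written stand-in for wordnet-hyponym.rdf (layout and names assumed).
sample=$(mktemp)
cat > "$sample" <<'EOF'
<rdf:Description rdf:about="http://www.w3.org/2006/03/wn/wn20/instances/synset-car_wheel-noun-1">
  <wn20schema:hyponymOf rdf:resource="http://www.w3.org/2006/03/wn/wn20/instances/synset-wheel-noun-1"/>
</rdf:Description>
EOF

# -A 1 prints the matching line plus the one after it, which under this
# assumed layout is the hyponymOf line naming the parent synset.
result=$(grep -A 1 'rdf:about=.*synset-car_wheel-noun-1"' "$sample")
echo "$result"

# Pull out just the parent synset name from the last matched line.
parent=$(echo "$result" | grep -o 'synset-[^"]*' | tail -n 1)
echo "parent: $parent"
rm -f "$sample"
```

To try it on the real data, replace "$sample" with /home/brent/Downloads/wn20full/wordnet-hyponym.rdf and adjust the -A count if the actual layout differs.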