com.articulate.sigma.wordNet.WordNet

All Implemented Interfaces: Serializable

public class WordNet extends Object implements Serializable

This class finds and displays SUMO terms that are related in meaning to the English expressions entered as input. It uses four WordNet data files ("NOUN.EXC", "VERB.EXC", etc.) as well as four WordNet-to-SUMO mapping files ("WordNetMappings-nouns.txt", "WordNetMappings-verbs.txt", etc.). The main part of the program prompts the user for an English term and then returns the associated SUMO concepts. The two primary public methods are initOnce() and page().

Author:

Ian Niles, Adam Pease

Field Summary

Fields

Modifier and Type

Field

Description

static final int

ADJECTIVE

static final int

ADJECTIVE_SATELLITE

Map<String,String>

adjectiveDocumentationHash

Map<String,String>

adjectiveSUMOHash

Map<String,Set<String>>

adjectiveSynsetHash

static final int

ADVERB

Map<String,String>

adverbDocumentationHash

Map<String,String>

adverbSUMOHash

Map<String,Set<String>>

adverbSynsetHash

static String

baseDir

static File

baseDirFile

Map<String,String>

caseMap

static boolean

debug

static boolean

disable

Map<String,String>

exceptionNounHash

A map of irregular plural forms in which the key is the plural and the value is the singular.

Map<String,String>

exceptionNounPluralHash

Map<String,String>

exceptionVerbHash

Map<String,String>

exceptionVerbPastHash

Map<String,String>

exceptionVerbPastProgHash

Map<String,String>

exceptVerbProgHash

Map<String,Set<String>>

ignoreCaseSynsetHash

static boolean

initNeeded

String

maxNounSynsetID

String

maxVerbSynsetID

MultiWords

multiWords

static final int

NOUN

Map<String,String>

nounDocumentationHash

Map<String,String>

nounSUMOHash

Map<String,Set<String>>

nounSynsetHash

Map<String,Map<String,String>>

OMW

A HashMap with language name keys and HashMap<String,String> values.

String

origMaxNounSynsetID

String

origMaxVerbSynsetID

static Pattern[]

regexPatterns

This array contains all of the compiled Pattern objects that will be used by methods in this file.

static final String[]

regexPatternStrings

This array contains all of the regular expression strings that will be compiled to Pattern objects for use in the methods in this file.

Map<String,List<com.articulate.sigma.utils.AVPair>>

relations

Keys are POS-prefixed synsets; values are ArrayList(s) of AVPair(s) in which the attribute is a pointer type according to http://wordnet.princeton.edu/man/wninput.5WN.html#sect3 and the value is a POS-prefixed synset. See WordNetUtilities.convertWordNetPointer.

Map<String,String>

reverseSenseIndex

A HashMap where the keys are 9 digit POS prefixed WordNet synset byte offsets, and the values are of the form word_POS_sensenum (alpha POS like "VB").

Map<String,Integer>

senseFrequencies

A HashMap where the key is a 9-digit POS-prefixed sense and the value is the number of times that sense occurs in the Brown corpus.

Map<String,String>

senseIndex

A HashMap where the keys are of the form word_POS_sensenum (alpha POS like "VB") and values are 8 digit WordNet synset byte offsets.

Map<String,String>

senseKeys

A HashMap where the keys are of the form word%POS:lex_filenum:lex_id (numeric POS) and values are 8 digit WordNet synset byte offsets.

List<String>

stopwords

English "stop words" such as "a", "at", "them", which have no or little inherent meaning when taken alone.

Map<String,List<String>>

SUMOHash

Keys are SUMO terms; values are ArrayList(s) of POS-prefixed 9-digit synset String(s), meaning that the part of speech code is prepended to the synset number.

Map<String,List<String>>

synsetsToWords

Keys are String POS-prefixed synsets.

static final int

VERB

Map<String,String>

verbDocumentationHash

Map<String,List<String>>

verbFrames

A HashMap where keys are 8 digit WordNet synset byte offsets or synsets appended with a dash and a specific word such as "12345678-foo" or in the case where the frame applies to the entire synset, it's just the synset number.

static List<String>

VerbFrames

Map<String,String>

verbSUMOHash

Map<String,Set<String>>

verbSynsetHash

static WordNet

wn

Map<String,Map<String,Integer>>

wordCoFrequencies

a HashMap of HashMaps where the key is a word sense of the form word_POS_num signifying the word, part of speech and number of the sense in WordNet.

protected Map<String,Set<com.articulate.sigma.utils.AVPair>>

wordFrequencies

A Map where the key is a word and the value is a Set of AVPairs. In each AVPair, the value is a 9-digit POS-prefixed sense and the key (attribute) is the number of times that sense occurs in the Brown corpus.

Map<String,List<String>>

wordsToSenseKeys

A HashMap with words as keys and ArrayList as values.
Constructor Summary

Constructors

Constructor

Description

WordNet()
Method Summary

Modifier and Type

Method

Description

void

addToWordFreq(String word, com.articulate.sigma.utils.AVPair avp)

Add an entry to the wordFrequencies list, checking whether it has a valid count and synset pair.

static void

checkWordsToSenses()

Map<String,Integer>

collectCountedWordSenses(String sentence)

Collect all the synsets that represent the best guess at meanings for all the words in a sentence.

void

compileRegexPatterns()

This method compiles all of the regular expression pattern strings in regexPatternStrings and puts the resulting compiled Pattern objects in the Pattern[] regexPatterns.

boolean

containsWord(String word)

Does WordNet contain the given word.

boolean

containsWord(String word, int pos)

Does WordNet contain the given word.

boolean

containsWordIgnoreCase(String word)

Does WordNet contain the given word, ignoring case.

static <T> T

decoder()

String

displayByKey(String sumokbname, String key, String params)

String

displaySynset(String sumokbname, String synset, String params)

static void

encoder(Object object)

String

generateNounSynsetID()

Generate a new eight digit noun synset ID that doesn't have an existing hash

String

generateSynsetID(String l)

Generate a new 8 digit synset ID that doesn't have an existing hash

String

generateVerbSynsetID()

Generate a new eight digit verb synset ID that doesn't have an existing hash

String

getDocumentation(String synset)

static void

getEntailments()

MultiWords

getMultiWords()

Map<String,List<String>>

getSenseKeysFromWord(String word)

Get all the synsets for a given word.

String

getSUMOMapping(String synset)

Get the SUMO mapping for a POS-prefixed synset

Set<String>

getSynsetsFromWord(String word)

Get all the synsets for a given word.

String

getTransitivity(String synset, String word)

Frame transitivity: intransitive - 1, 2, 3, 4, 7, 23, 35; ditransitive - 15, 16, 17, 18, 19; transitive - everything else.

File

getWnFile(String key, String override)

Returns the WordNet File object corresponding to key.

List<String>

getWordsFromSynset(String synset)

Map<String,String>

getWordsFromTerm(String SUMOterm)

Get the words and synsets corresponding to a SUMO term.

static void

initOnce()

Read the WordNet files only on initialization of the class.

boolean

isFile(String s)

boolean

isHyponym(String synset, String hypo)

boolean

isHyponymRecurse(String synset, String hypo, List<String> visited)

boolean

isStopWord(String word)

Check whether the word is a stop word

static void

loadSerialized()

Loads the most recently saved serialized version.

static void

main(String[] args)

A main method, used only for testing.

void

mergeWordCoFrequencies(Map<String,Map<String,Integer>> senses)

Merge a new set of word co-occurrence statistics into the existing set.

String

nounRootForm(String mixedCase, String input)

Return the root form of the noun, or null if it's not in the lexicon.

String

nounSynsetFromTermFormat(String tf, String SUMOterm, KB kb)

Generate a new noun synset from a termFormat

String

page(String inp, int pos, String kbname, String synset, String params)

This is the regular point of entry for this class.

Set<String>

prependPOS(Set<String> synsets, String POS)

Prepend a POS number to a set of 8 digit synsets

protected boolean

processNounLine(String line)

String

processPrologString(String doc)

Double any single quotes that appear.

void

readSenseCount()

Read word sense frequencies into a HashMap of PriorityQueues containing AVPairs where the value is a word and the attribute (on which PriorityQueue is sorted) is an 8 digit String representation of an integer count.

void

readSenseIndex(String filename)

Note that WordNet forces all these words to lowercase in the index.xxx files

void

readStopWords()

void

readWordCoFrequencies()

Read a HashMap of HashMaps where the key is a word sense of the form word_POS_num signifying the word, part of speech and number of the sense in WordNet.

String

removeStopWords(String sentence)

Remove stop words from a sentence.

List<String>

removeStopWords(List<String> sentence)

Remove stop words from a sentence.

String

senseKeyPOS(String senseKey)

static void

serialize()

Save the serialized version.

static boolean

serializedExists()

static boolean

serializedOld()

Check whether sources are newer than serialized version.

protected void

setMaxNounSynsetID(String synset)

protected void

setMaxVerbSynsetID(String synset)

static void

showHelp()

static List<String>

splitToArrayList(String st)

Return an ArrayList of the string split by spaces.

static List<String>

splitToArrayListSentence(String st)

Return an ArrayList of the string split by periods.

String

sumoFileDisplay(String pathname, String counter, String params)

A routine which takes a full pathname as input and returns a sentence by sentence display of sense and sentiment analysis

String

sumoSentenceDisplay(String input, String context, String params)

A routine which looks up a given list of words in the hashtables to find the relevant word definitions and SUMO mappings.

String

sumoSentimentDisplay(String sentence)

A routine that uses computeSentiment in DB.java to display a sentiment score for a single sentence as well as the individual scores of scored descriptors.

void

synsetFromTermFormat(Formula form, String tf, String SUMOterm, KB kb)

Generate a new synset from a termFormat statement

void

termFormatsToSynsets(KB kb)

Generate a new synset from a termFormat

static void

testProcessPointers()

A method used only for testing.

static void

testWordFreq()

A method used only for testing.

String

verbRootForm(String mixedCase, String input)

Return the present tense singular form of the verb, or null if it's not in the lexicon.

String

verbSynsetFromTermFormat(String tf, String SUMOterm, KB kb)

Generate a new verb synset from a termFormat

void

writeProlog(KB kb)

static void

writeWordCoFrequencies(String fname, Map<String,Map<String,Integer>> senses)

Write a HashMap of HashMaps where the key is a word sense of the form word_POS_num signifying the word, part of speech and number of the sense in WordNet.

void

writeWordNetG()

void

writeWordNetHyp()

void

writeWordNetProlog()

void

writeWordNetS()

Write WordNet data to a prolog file with a single kind of clause in the following format: s(Synset_ID, Word_No_in_the_Synset, Word, SS_Type, Synset_Rank_By_the_Word,Tag_Count)

void

writeXML()

Methods inherited from class java.lang.Object
clone, equals, finalize, getClass, hashCode, notify, notifyAll, toString, wait, wait, wait

Field Details
- disable
  
  public static boolean disable
- debug
  
  public static boolean debug
- wn
  
  public static WordNet wn
- baseDir
  
  public static String baseDir
- baseDirFile
  
  public static File baseDirFile
- initNeeded
  
  public static boolean initNeeded
- regexPatterns
  
  public static Pattern[] regexPatterns
  
  This array contains all of the compiled Pattern objects that will be used by methods in this file.
- nounSynsetHash
  
  public Map<String,Set<String>> nounSynsetHash
- verbSynsetHash
  
  public Map<String,Set<String>> verbSynsetHash
- adjectiveSynsetHash
  
  public Map<String,Set<String>> adjectiveSynsetHash
- adverbSynsetHash
  
  public Map<String,Set<String>> adverbSynsetHash
- ignoreCaseSynsetHash
  
  public Map<String,Set<String>> ignoreCaseSynsetHash
- verbDocumentationHash
  
  public Map<String,String> verbDocumentationHash
- adjectiveDocumentationHash
  
  public Map<String,String> adjectiveDocumentationHash
- adverbDocumentationHash
  
  public Map<String,String> adverbDocumentationHash
- nounDocumentationHash
  
  public Map<String,String> nounDocumentationHash
- nounSUMOHash
  
  public Map<String,String> nounSUMOHash
- verbSUMOHash
  
  public Map<String,String> verbSUMOHash
- adjectiveSUMOHash
  
  public Map<String,String> adjectiveSUMOHash
- adverbSUMOHash
  
  public Map<String,String> adverbSUMOHash
- maxNounSynsetID
  
  public String maxNounSynsetID
- maxVerbSynsetID
  
  public String maxVerbSynsetID
- origMaxNounSynsetID
  
  public String origMaxNounSynsetID
- origMaxVerbSynsetID
  
  public String origMaxVerbSynsetID
- SUMOHash
  
  public Map<String,List<String>> SUMOHash
  
  Keys are SUMO terms; values are ArrayList(s) of POS-prefixed 9-digit synset String(s), meaning that the part of speech code is prepended to the synset number.
- synsetsToWords
  
  public Map<String,List<String>> synsetsToWords
  
  Keys are String POS-prefixed synsets. Values are ArrayList(s) of String(s) which are words. Note that the order of words in the file is preserved.
- exceptionVerbHash
  
  public Map<String,String> exceptionVerbHash
- exceptionVerbPastProgHash
  
  public Map<String,String> exceptionVerbPastProgHash
- exceptionVerbPastHash
  
  public Map<String,String> exceptionVerbPastHash
- exceptVerbProgHash
  
  public Map<String,String> exceptVerbProgHash
- exceptionNounHash
  
  public Map<String,String> exceptionNounHash
  
  A map of irregular plural forms in which the key is the plural and the value is the singular.
- exceptionNounPluralHash
  
  public Map<String,String> exceptionNounPluralHash
- relations
  
  public Map<String,List<com.articulate.sigma.utils.AVPair>> relations
  
  Keys are POS-prefixed synsets; values are ArrayList(s) of AVPair(s) in which the attribute is a pointer type according to http://wordnet.princeton.edu/man/wninput.5WN.html#sect3 and the value is a POS-prefixed synset. See WordNetUtilities.convertWordNetPointer.
- wordCoFrequencies
  
  public Map<String,Map<String,Integer>> wordCoFrequencies
  
  a HashMap of HashMaps where the key is a word sense of the form word_POS_num signifying the word, part of speech and number of the sense in WordNet. The value is a HashMap of words and the number of times that word occurs in sentences with the word sense given in the key.
- wordFrequencies
  
  protected Map<String,Set<com.articulate.sigma.utils.AVPair>> wordFrequencies
  
  A Map where the key is a word and the value is a Set of AVPairs. In each AVPair, the value is a 9-digit POS-prefixed sense and the key (attribute) is the number of times that sense occurs in the Brown corpus.
- caseMap
  
  public Map<String,String> caseMap
- senseFrequencies
  
  public Map<String,Integer> senseFrequencies
  
  A HashMap where the key is a 9-digit POS-prefixed sense and the value is the number of times that sense occurs in the Brown corpus.
- stopwords
  
  public List<String> stopwords
  
  English "stop words" such as "a", "at", "them", which have no or little inherent meaning when taken alone.
- senseIndex
  
  public Map<String,String> senseIndex
  
  A HashMap where the keys are of the form word_POS_sensenum (alpha POS like "VB") and values are 8 digit WordNet synset byte offsets. Note that all words are from index.sense, which reduces all words to lower case
- senseKeys
  
  public Map<String,String> senseKeys
  
  A HashMap where the keys are of the form word%POS:lex_filenum:lex_id (numeric POS) and values are 8 digit WordNet synset byte offsets. Note that all words are from index.sense, which reduces all words to lower case
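
The key shape described for senseKeys, word%POS:lex_filenum:lex_id, can be taken apart with plain string operations. The following is a minimal sketch, not code from this class: the class name `SenseKeySketch`, its two helper methods, and the sample key are all hypothetical and only illustrate the documented key format.

```java
final class SenseKeySketch {

    // Returns the word portion of a key shaped like word%POS:lex_filenum:lex_id.
    static String wordOf(String senseKey) {
        // lastIndexOf guards against a word that itself contains '%'.
        return senseKey.substring(0, senseKey.lastIndexOf('%'));
    }

    // Returns the numeric POS code that follows the '%' separator.
    static int posOf(String senseKey) {
        int percent = senseKey.lastIndexOf('%');
        int colon = senseKey.indexOf(':', percent);
        return Integer.parseInt(senseKey.substring(percent + 1, colon));
    }

    public static void main(String[] args) {
        String key = "dog%1:05:00"; // illustrative key, not read from index.sense
        System.out.println(wordOf(key) + " has POS code " + posOf(key));
    }
}
```

The numeric POS codes are assumed here to line up with the values documented for page() (1=noun, 2=verb, 3=adjective, 4=adverb).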
- reverseSenseIndex
  
  public Map<String,String> reverseSenseIndex
  
  A HashMap where the keys are 9 digit POS prefixed WordNet synset byte offsets, and the values are of the form word_POS_sensenum (alpha POS like "VB"). Note that all words are from index.sense, which reduces all words to lower case
- verbFrames
  
  public Map<String,List<String>> verbFrames
  
  A HashMap where keys are 8 digit WordNet synset byte offsets or synsets appended with a dash and a specific word such as "12345678-foo" or in the case where the frame applies to the entire synset, it's just the synset number. Values are ArrayList(s) of String verb frame numbers.
- wordsToSenseKeys
  
  public Map<String,List<String>> wordsToSenseKeys
  
  A HashMap with words as keys and ArrayList as values. The ArrayList contains word senses which are Strings of the form word_POS_num (alpha POS like "VB") signifying the word, part of speech and number of the sense in WordNet. Note that all words are from index.sense, which reduces all words to lower case
- multiWords
  
  public MultiWords multiWords
- NOUN
  
  public static final int NOUN
  See Also:
  
  Constant Field Values
- VERB
  
  public static final int VERB
  See Also:
  
  Constant Field Values
- ADJECTIVE
  
  public static final int ADJECTIVE
  See Also:
  
  Constant Field Values
- ADVERB
  
  public static final int ADVERB
  See Also:
  
  Constant Field Values
- ADJECTIVE_SATELLITE
  
  public static final int ADJECTIVE_SATELLITE
  See Also:
  
  Constant Field Values
- OMW
  
  public Map<String,Map<String,String>> OMW
  
  A HashMap with language name keys and HashMap<String,String> values. The interior HashMap has String keys which are PWN30 synsets, written as an 8-digit synset number, a dash, and then an alphabetic part-of-speech character. Values are words in the target language.
- regexPatternStrings
  
  public static final String[] regexPatternStrings
  
  This array contains all of the regular expression strings that will be compiled to Pattern objects for use in the methods in this file.
- VerbFrames
  
  public static List<String> VerbFrames
Constructor Details
- WordNet
  
  public WordNet()
Method Details
- getMultiWords
  
  public MultiWords getMultiWords()
- compileRegexPatterns
  
  public void compileRegexPatterns()
  
  This method compiles all of the regular expression pattern strings in regexPatternStrings and puts the resulting compiled Pattern objects in the Pattern[] regexPatterns.
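
The compile step described above amounts to mapping a String[] onto a Pattern[] of the same length. A minimal sketch follows, assuming nothing about the real contents of regexPatternStrings: the class name `RegexCompileSketch`, the method `compileAll`, and both pattern strings are hypothetical placeholders.

```java
import java.util.regex.Pattern;

final class RegexCompileSketch {

    // Hypothetical pattern strings; the real regexPatternStrings values are not shown on this page.
    static final String[] PATTERN_STRINGS = {
        "^\\s*\\d{8}\\s.*",  // a line that starts with an 8-digit number
        "(\\S+)\\s+(\\S+)"   // two whitespace-separated tokens
    };

    // Compiles every pattern string once, keeping the array positions aligned.
    static Pattern[] compileAll(String[] patternStrings) {
        Pattern[] compiled = new Pattern[patternStrings.length];
        for (int i = 0; i < patternStrings.length; i++)
            compiled[i] = Pattern.compile(patternStrings[i]);
        return compiled;
    }

    public static void main(String[] args) {
        Pattern[] patterns = compileAll(PATTERN_STRINGS);
        System.out.println(patterns[0].matcher("00001740 03 n 01 entity").matches());
    }
}
```

Compiling once and reusing the Pattern objects avoids recompiling the same expression for every line of a large data file.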
- getWnFile
  
  public File getWnFile(String key, String override)
  
  Returns the WordNet File object corresponding to key.
  
  Parameters:
  
  key - A descriptive literal String that maps to a regular expression pattern used to obtain a WordNet file.
  
  Returns:
  
  A File object
- splitToArrayList
  
  public static List<String> splitToArrayList(String st)
  
  Return an ArrayList of the string split by spaces.
- splitToArrayListSentence
  
  public static List<String> splitToArrayListSentence(String st)
  
  Return an ArrayList of the string split by periods.
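
The two split helpers can be pictured with the sketch below. It is an assumption-laden illustration, not the class's code: the names `SplitSketch`, `splitBySpaces`, and `splitByPeriods` are hypothetical, and trimming plus the dropping of empty pieces are choices made for the example that the real methods may not share.

```java
import java.util.ArrayList;
import java.util.List;

final class SplitSketch {

    // Splits on runs of whitespace and drops empty tokens.
    static List<String> splitBySpaces(String st) {
        List<String> result = new ArrayList<>();
        for (String token : st.trim().split("\\s+")) {
            if (!token.isEmpty()) result.add(token);
        }
        return result;
    }

    // Splits on periods, trims each piece, and drops empty pieces.
    static List<String> splitByPeriods(String st) {
        List<String> result = new ArrayList<>();
        for (String piece : st.split("\\.")) {
            String trimmed = piece.trim();
            if (!trimmed.isEmpty()) result.add(trimmed);
        }
        return result;
    }

    public static void main(String[] args) {
        System.out.println(splitBySpaces("the quick  fox"));
        System.out.println(splitByPeriods("The dog ran. It was fast."));
    }
}
```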
- getSUMOMapping
  
  public String getSUMOMapping(String synset)
  
  Get the SUMO mapping for a POS-prefixed synset
- setMaxNounSynsetID
  
  protected void setMaxNounSynsetID(String synset)
- setMaxVerbSynsetID
  
  protected void setMaxVerbSynsetID(String synset)
- processNounLine
  
  protected boolean processNounLine(String line)
- mergeWordCoFrequencies
  
  public void mergeWordCoFrequencies(Map<String,Map<String,Integer>> senses)
  
  Merge a new set of word co-occurrence statistics into the existing set.
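
One plausible reading of merging co-occurrence statistics is to add the incoming counts to the existing ones, sense by sense and word by word. The sketch below assumes that summation policy; the class `CoFrequencyMergeSketch`, the static `merge` method, and the sample keys are hypothetical and are not taken from the class itself.

```java
import java.util.HashMap;
import java.util.Map;

final class CoFrequencyMergeSketch {

    // Adds every count in 'incoming' to the matching entry in 'existing',
    // creating inner maps and entries as needed. Summation is an assumption.
    static void merge(Map<String, Map<String, Integer>> existing,
                      Map<String, Map<String, Integer>> incoming) {
        for (Map.Entry<String, Map<String, Integer>> sense : incoming.entrySet()) {
            Map<String, Integer> target =
                existing.computeIfAbsent(sense.getKey(), k -> new HashMap<>());
            for (Map.Entry<String, Integer> count : sense.getValue().entrySet())
                target.merge(count.getKey(), count.getValue(), Integer::sum);
        }
    }

    public static void main(String[] args) {
        Map<String, Map<String, Integer>> existing = new HashMap<>();
        Map<String, Integer> money = new HashMap<>();
        money.put("money", 2);
        existing.put("bank_NN_1", money); // illustrative word_POS_num key

        Map<String, Map<String, Integer>> incoming = new HashMap<>();
        Map<String, Integer> more = new HashMap<>();
        more.put("money", 3);
        more.put("loan", 1);
        incoming.put("bank_NN_1", more);

        merge(existing, incoming);
        System.out.println(existing);
    }
}
```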
- writeWordCoFrequencies
  
  public static void writeWordCoFrequencies(String fname, Map<String,Map<String,Integer>> senses)
  
  Write a HashMap of HashMaps where the key is a word sense of the form word_POS_num signifying the word, part of speech and number of the sense in WordNet. The value is a HashMap of words and the number of times that word occurs in sentences with the word sense given in the key.
- readWordCoFrequencies
  
  public void readWordCoFrequencies()
  
  Read a HashMap of HashMaps where the key is a word sense of the form word_POS_num signifying the word, part of speech and number of the sense in WordNet. The value is a HashMap of words and the number of times that word occurs in sentences with the word sense given in the key.
- readStopWords
  
  public void readStopWords()
- readSenseIndex
  
  public void readSenseIndex(String filename)
  
  Note that WordNet forces all these words to lowercase in the index.xxx files
- readSenseCount
  
  public void readSenseCount()
  
  Read word sense frequencies into a HashMap of PriorityQueues containing AVPairs where the value is a word and the attribute (on which PriorityQueue is sorted) is an 8 digit String representation of an integer count.
- addToWordFreq
  
  public void addToWordFreq(String word, com.articulate.sigma.utils.AVPair avp)
  
  Add an entry to the wordFrequencies list, checking whether it has a valid count and synset pair.
- sumoSentenceDisplay
  
  public String sumoSentenceDisplay(String input, String context, String params)
  
  A routine which looks up a given list of words in the hashtables to find the relevant word definitions and SUMO mappings.
  
  Parameters:
  
  input - is the target sentence to be parsed. See WordSenseBody.jsp for usage.
  
  context - is the larger context of the sentence. Can mean more accurate results.
  
  params - is the set of html parameters
- sumoSentimentDisplay
  
  public String sumoSentimentDisplay(String sentence)
  
  A routine that uses computeSentiment in DB.java to display a sentiment score for a single sentence as well as the individual scores of scored descriptors.
  
  Parameters:
  
  sentence - is the target sentence to be scored. See WordSenseBody.jsp for usage.
- sumoFileDisplay
  
  public String sumoFileDisplay(String pathname, String counter, String params)
  
  A routine which takes a full pathname as input and returns a sentence by sentence display of sense and sentiment analysis
  
  Parameters:
  
  pathname -
  
  counter - is used to keep track of which sentence is being displayed
  
  params - is the set of html parameters
- isFile
  
  public boolean isFile(String s)
  
  Returns:
  
  true if the input String is a file pathname. Determined by whether the string contains a forward or backward slash. This is only used in WordSense.jsp and will fail if a sentence that is not a file contains a forward or back slash.
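
The slash heuristic described above, including its stated failure mode, can be reproduced in a few lines. This is a sketch under the documented rule only; `IsFileSketch` and `looksLikePath` are hypothetical names.

```java
final class IsFileSketch {

    // Treats any string containing '/' or '\' as a file pathname.
    static boolean looksLikePath(String s) {
        return s.indexOf('/') >= 0 || s.indexOf('\\') >= 0;
    }

    public static void main(String[] args) {
        System.out.println(looksLikePath("/tmp/corpus.txt"));      // a real path
        System.out.println(looksLikePath("The dog ran."));         // an ordinary sentence
        System.out.println(looksLikePath("He ate and/or drank.")); // sentence misread as a path
    }
}
```

The third call shows the limitation noted in the description: a sentence containing a slash is classified as a file.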
- isHyponymRecurse
  
  public boolean isHyponymRecurse(String synset, String hypo, List<String> visited)
  
  Returns:
  
  true if the first POS-prefixed synset is a hyponym of the second POS-prefixed synset. This is a recursive method.
- isHyponym
  
  public boolean isHyponym(String synset, String hypo)
  
  Returns:
  
  true if the first POS-prefixed synset is a hyponym of the second POS-prefixed synset. This is a recursive method.
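
The recursive hyponym test with a visited list can be sketched as a walk up hypernym links. The example below is illustrative only: it replaces the relations map with a plain map from a synset to its hypernyms, the synset IDs are made up, and treating a synset as not being its own hyponym is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

final class HyponymSketch {

    // Entry point: starts the recursion with an empty visited list.
    static boolean isHyponym(Map<String, List<String>> hypernyms, String synset, String hypo) {
        return isHyponymRecurse(hypernyms, synset, hypo, new ArrayList<>());
    }

    // True if following hypernym links upward from 'synset' reaches 'hypo'.
    // The visited list stops the walk from looping on cyclic data.
    static boolean isHyponymRecurse(Map<String, List<String>> hypernyms, String synset,
                                    String hypo, List<String> visited) {
        if (visited.contains(synset)) return false;
        visited.add(synset);
        for (String parent : hypernyms.getOrDefault(synset, Collections.emptyList())) {
            if (parent.equals(hypo) || isHyponymRecurse(hypernyms, parent, hypo, visited))
                return true;
        }
        return false;
    }

    public static void main(String[] args) {
        Map<String, List<String>> hypernyms = new HashMap<>();
        hypernyms.put("100000003", Arrays.asList("100000002")); // made-up POS-prefixed IDs
        hypernyms.put("100000002", Arrays.asList("100000001"));
        System.out.println(isHyponym(hypernyms, "100000003", "100000001"));
    }
}
```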
- removeStopWords
  
  public String removeStopWords(String sentence)
  
  Remove stop words from a sentence.
- removeStopWords
  
  public List<String> removeStopWords(List<String> sentence)
  
  Remove stop words from a sentence.
- isStopWord
  
  public boolean isStopWord(String word)
  
  Check whether the word is a stop word
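
The three stop-word methods above fit together roughly as in the following sketch. It is not the class's implementation: the word list is a small hypothetical sample (only "a", "at", and "them" come from the stopwords field description), and lower-casing before lookup is an assumption.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Locale;
import java.util.Set;

final class StopWordSketch {

    // Small sample list; the real list is loaded by readStopWords().
    static final Set<String> STOPWORDS =
        new HashSet<>(Arrays.asList("a", "at", "them", "the"));

    static boolean isStopWord(String word) {
        return STOPWORDS.contains(word.toLowerCase(Locale.ROOT));
    }

    // Keeps the original word order and drops every stop word.
    static List<String> removeStopWords(List<String> words) {
        List<String> kept = new ArrayList<>();
        for (String word : words) {
            if (!isStopWord(word)) kept.add(word);
        }
        return kept;
    }

    // String overload: tokenizes on whitespace, filters, and rejoins with single spaces.
    static String removeStopWords(String sentence) {
        List<String> words = Arrays.asList(sentence.trim().split("\\s+"));
        return String.join(" ", removeStopWords(words));
    }

    public static void main(String[] args) {
        System.out.println(removeStopWords("The cat sat at a window"));
    }
}
```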
- collectCountedWordSenses
  
  public Map<String,Integer> collectCountedWordSenses(String sentence)
  
  Collect all the synsets that represent the best guess at meanings for all the words in a sentence. Keep track of how many times each sense appears.
- encoder
  
  public static void encoder(Object object)
- decoder
  
  public static <T> T decoder()
- serializedExists
  
  public static boolean serializedExists()
- serializedOld
  
  public static boolean serializedOld()
  
  Check whether sources are newer than serialized version.
- loadSerialized
  
  public static void loadSerialized()
  
  Loads the most recently saved serialized version.
- serialize
  
  public static void serialize()
  
  Save the serialized version.
- initOnce
  
  public static void initOnce()
  
  Read the WordNet files only on initialization of the class.
- nounRootForm
  
  public String nounRootForm(String mixedCase, String input)
  
  Return the root form of the noun, or null if it's not in the lexicon.
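
A root-form lookup of this kind typically consults the irregular-plural table (see exceptionNounHash) before falling back to the lexicon. The sketch below shows that order under loud assumptions: the method `nounRoot`, its parameters, the sample data, and the naive "strip a trailing s" fallback are all hypothetical and do not mirror the signature or rules of nounRootForm.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

final class NounRootSketch {

    // Returns a root form, or null when nothing in the lexicon matches.
    static String nounRoot(String input, Map<String, String> irregularPlurals,
                           Set<String> lexicon) {
        String lower = input.toLowerCase(Locale.ROOT);
        // 1. Irregular plurals: key is the plural, value is the singular.
        if (irregularPlurals.containsKey(lower)) return irregularPlurals.get(lower);
        // 2. The word is already a known root.
        if (lexicon.contains(lower)) return lower;
        // 3. Naive regular plural (assumption): drop a trailing "s".
        if (lower.endsWith("s")) {
            String stem = lower.substring(0, lower.length() - 1);
            if (lexicon.contains(stem)) return stem;
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> irregular = new HashMap<>();
        irregular.put("children", "child");
        Set<String> lexicon = new HashSet<>(Arrays.asList("child", "dog"));
        System.out.println(nounRoot("Children", irregular, lexicon));
        System.out.println(nounRoot("dogs", irregular, lexicon));
    }
}
```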
- verbRootForm
  
  public String verbRootForm(String mixedCase, String input)
  
  Return the present tense singular form of the verb, or null if it's not in the lexicon.
- prependPOS
  
  public Set<String> prependPOS(Set<String> synsets, String POS)
  
  Prepend a POS number to a set of 8 digit synsets
  
  Returns:
  
  a Set of 9-digit synset Strings
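
Prepending a POS digit turns each 8-digit synset into the 9-digit POS-prefixed form used as a key elsewhere in this class. A minimal sketch, with the hypothetical class name `PrependPosSketch` and an illustrative synset number:

```java
import java.util.Collections;
import java.util.LinkedHashSet;
import java.util.Set;

final class PrependPosSketch {

    // Returns a new set in which every synset has the POS digit prepended.
    static Set<String> prependPOS(Set<String> synsets, String pos) {
        Set<String> result = new LinkedHashSet<>(); // keeps iteration order stable
        for (String synset : synsets)
            result.add(pos + synset);
        return result;
    }

    public static void main(String[] args) {
        // "1" is the noun code listed in the page() parameter description.
        System.out.println(prependPOS(Collections.singleton("00001740"), "1"));
    }
}
```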
- getSynsetsFromWord
  
  public Set<String> getSynsetsFromWord(String word)
  
  Get all the synsets for a given word. Print an error if this routine gives a result and getSenseKeysFromWord() doesn't
  
  Returns:
  
  a Set of 9-digit synset Strings
- getSenseKeysFromWord
  
  public Map<String,List<String>> getSenseKeysFromWord(String word)
  
  Get all the synsets for a given word.
  
  Returns:
  
  a TreeMap of sense keys in the form of word_POS_num and values that are ArrayLists of synset Strings
- getWordsFromTerm
  
  public Map<String,String> getWordsFromTerm(String SUMOterm)
  
  Get the words and synsets corresponding to a SUMO term. The return is a Map of words with their corresponding synset number.
- getWordsFromSynset
  
  public List<String> getWordsFromSynset(String synset)
- containsWord
  
  public boolean containsWord(String word, int pos)
  
  Does WordNet contain the given word.
- containsWord
  
  public boolean containsWord(String word)
  
  Does WordNet contain the given word.
- containsWordIgnoreCase
  
  public boolean containsWordIgnoreCase(String word)
  
  Does WordNet contain the given word, ignoring case.
- page
  
  public String page(String inp, int pos, String kbname, String synset, String params)
  
  This is the regular point of entry for this class. It takes the word the user is searching for, and the part of speech index, does the search, and returns the string with HTML formatting codes to present to the user. The part of speech codes must be the same as in the menu options in WordNet.jsp and Browse.jsp
  
  Parameters:
  
  inp - The string the user is searching for.
  
  pos - The part of speech of the word: 1=noun, 2=verb, 3=adjective, 4=adverb
  
  Returns:
  
  A string containing the HTML-formatted search result.
- getDocumentation
  
  public String getDocumentation(String synset)
  
  Parameters:
  
  synset - is a synset with POS-prefix
- displaySynset
  
  public String displaySynset(String sumokbname, String synset, String params)
  
  Parameters:
  
  synset - is a synset with POS-prefix
- displayByKey
  
  public String displayByKey(String sumokbname, String key, String params)
  
  Parameters:
  
  key - is a WordNet sense key
  
  Returns:
  
  9-digit POS-prefix and synset number
- writeXML
  
  public void writeXML()
- getTransitivity
  
  public String getTransitivity(String synset, String word)
  
  Frame transitivity: intransitive - 1, 2, 3, 4, 7, 23, 35; ditransitive - 15, 16, 17, 18, 19; transitive - everything else.
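
The frame-number groups listed for getTransitivity can be expressed as two lookup sets with a default. This sketch classifies a single verb frame number; the class `TransitivitySketch`, the method `classify`, and the returned label strings are hypothetical, since the page does not state what the real method returns.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

final class TransitivitySketch {

    // Frame numbers copied from the method description.
    static final Set<Integer> INTRANSITIVE =
        new HashSet<>(Arrays.asList(1, 2, 3, 4, 7, 23, 35));
    static final Set<Integer> DITRANSITIVE =
        new HashSet<>(Arrays.asList(15, 16, 17, 18, 19));

    // Any frame number in neither set falls through to "transitive".
    static String classify(int frame) {
        if (INTRANSITIVE.contains(frame)) return "intransitive";
        if (DITRANSITIVE.contains(frame)) return "ditransitive";
        return "transitive";
    }

    public static void main(String[] args) {
        for (int frame : new int[] {2, 8, 16})
            System.out.println(frame + " -> " + classify(frame));
    }
}
```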
- writeProlog
  
  public void writeProlog(KB kb)
- senseKeyPOS
  
  public String senseKeyPOS(String senseKey)
- writeWordNetS
  
  public void writeWordNetS()
  
  Write WordNet data to a prolog file with a single kind of clause in the following format: s(Synset_ID, Word_No_in_the_Synset, Word, SS_Type, Synset_Rank_By_the_Word,Tag_Count)
- writeWordNetHyp
  
  public void writeWordNetHyp()
- processPrologString
  
  public String processPrologString(String doc)
  
  Double any single quotes that appear.
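
Doubling single quotes is the usual way to embed an apostrophe inside a single-quoted Prolog atom. The sketch below shows that escaping step; `PrologQuoteSketch`, `doubleSingleQuotes`, and `quoteAtom` are hypothetical names, and wrapping the result in quotes is an added illustration rather than documented behavior of processPrologString.

```java
final class PrologQuoteSketch {

    // Replaces every ' with '' so the text is safe inside a quoted atom.
    static String doubleSingleQuotes(String doc) {
        return doc.replace("'", "''");
    }

    // Hypothetical helper: wraps the escaped text in single quotes.
    static String quoteAtom(String text) {
        return "'" + doubleSingleQuotes(text) + "'";
    }

    public static void main(String[] args) {
        System.out.println(quoteAtom("dog's breakfast"));
    }
}
```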
- writeWordNetG
  
  public void writeWordNetG()
- writeWordNetProlog
  
  public void writeWordNetProlog() throws IOException
  
  Throws:
  
  IOException
- generateSynsetID
  
  public String generateSynsetID(String l)
  
  Generate a new 8 digit synset ID that doesn't have an existing hash
- generateNounSynsetID
  
  public String generateNounSynsetID()
  
  Generate a new eight digit noun synset ID that doesn't have an existing hash
- generateVerbSynsetID
  
  public String generateVerbSynsetID()
  
  Generate a new eight digit verb synset ID that doesn't have an existing hash
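
One simple way to obtain an unused 8-digit ID is to count upward from the current maximum and skip any value already in use. The sketch below assumes that strategy; the page does not say how the generate methods actually choose an ID, and `SynsetIdSketch`, `nextId`, and its parameters are hypothetical.

```java
import java.util.Collections;
import java.util.Set;

final class SynsetIdSketch {

    // Increments past maxId until a zero-padded 8-digit ID is found
    // that is not already present in 'existing'.
    static String nextId(String maxId, Set<String> existing) {
        int n = Integer.parseInt(maxId); // leading zeros are parsed as decimal
        String candidate;
        do {
            n++;
            candidate = String.format("%08d", n);
        } while (existing.contains(candidate));
        return candidate;
    }

    public static void main(String[] args) {
        System.out.println(nextId("00001740", Collections.singleton("00001741")));
    }
}
```

Keeping the ID as a zero-padded String matches the String type of maxNounSynsetID and maxVerbSynsetID.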
- nounSynsetFromTermFormat
  
  public String nounSynsetFromTermFormat(String tf, String SUMOterm, KB kb)
  
  Generate a new noun synset from a termFormat
- verbSynsetFromTermFormat
  
  public String verbSynsetFromTermFormat(String tf, String SUMOterm, KB kb)
  
  Generate a new verb synset from a termFormat
- synsetFromTermFormat
  
  public void synsetFromTermFormat(Formula form, String tf, String SUMOterm, KB kb)
  
  Generate a new synset from a termFormat statement
  
  Parameters:
  
  form - is the entire termFormat statement
  
  tf - is the lexical item (word). Note that a multi-word lexical item should already have had its spaces replaced by underscores.
  
  SUMOterm - is the SUMO term that the lexical item is mapped to
- termFormatsToSynsets
  
  public void termFormatsToSynsets(KB kb)
  
  Generate a new synset from a termFormat
- testWordFreq
  
  public static void testWordFreq()
  
  A method used only for testing. It should not be called during normal operation.
- testProcessPointers
  
  public static void testProcessPointers()
  
  A method used only for testing. It should not be called during normal operation.
- checkWordsToSenses
  
  public static void checkWordsToSenses()
- getEntailments
  
  public static void getEntailments()
- showHelp
  
  public static void showHelp()
- main
  
  public static void main(String[] args)
  
  A main method, used only for testing. It should not be called during normal operation.
