Sorry, I accidentally clicked send on the previous mail.
Thanks for the reply.
> Have you also modified the index.noun file to account for your changes?
> index.noun contains a list of byte offsets into data.noun, and any changes to
> the latter mean the former is invalid.
I have modified index.noun too.
> Alternatively, I wonder what platform you are working on? Records in the
> WordNet files must be terminated by just a single "\x0A". If you are working
> on a non-Unix platform that uses a multi-character record separator then the
> records will be a different length, so invalidating the index file.
I am working on Linux william-pc 2.6.24-16-generic #1 SMP Thu Apr 10 13:23:42
UTC 2008 i686 GNU/Linux
OK, I have to admit something: only today, after learning about the seek
function, did I realize how the synset id is actually determined. It is
equivalent to the byte offset that you mentioned. Before this I thought the
synset id was some kind of database auto-increment id / primary key. lol.
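
A minimal Python sketch of that idea, assuming a toy file in which every
record starts with its own zero-padded 8-digit byte offset and ends with a
single "\n". The records and the helper name build_records are made up for
illustration; they are not real WordNet data or WordNet code.

```python
import io

def build_records(bodies):
    """Prefix each body with its own byte offset, written as an 8-digit id."""
    out = b""
    for body in bodies:
        offset = len(out)  # position where this record will start
        out += b"%08d %s\n" % (offset, body.encode("ascii"))
    return out

# Hypothetical, shortened records (not copied from data.noun).
data = build_records(["entity | toy gloss", "physical_entity | toy gloss"])
fh = io.BytesIO(data)  # stands in for an open data.noun file handle

# Walk the file and compare each record's id with its real position.
ids_match = []
while True:
    pos = fh.tell()
    line = fh.readline()
    if not line:
        break
    ids_match.append(int(line[:8]) == pos)
```

In this toy file every id equals the position of its record, which is why
the id can be handed straight to seek.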
Now I realize that, of course, when I add, say, 3 characters to the first
line and then call seek(FH, 00001930, 0), I get:
g)\n00001930 03 n 01 physical_entity 0 007 @ 00001740 n 0000 ~ 00002452 n
0000 ~ 00002684 n 0000 ~ 00007347 n 0000 ~ 00020827 n 0000 ~ 00029677 n
0000 ~ 14580597 n 0000 | an entity that has physical existence
00001740 03 n 02 entity 0 003 ~ 00001930 n 0000 ~ 00002137 n 0000 ~ 04424418 n
0000 | that which is perceived or known or inferred to have its own distinct
existence (living or nonliving)
00001930 03 n 01 physical_entity 0 007 @ 00001740 n 0000 ~ 00002452 n 0000 ~
00002684 n 0000 ~ 00007347 n 0000 ~ 00020827 n 0000 ~ 00029677 n 0000 ~
14580597 n 0000 | an entity that has physical existence
No wonder it's invalid.
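
The same effect can be reproduced with a small Python sketch. It assumes two
shortened, made-up records rather than the real data.noun, and the helper
name read_from is hypothetical. Adding 3 bytes to the first record moves the
second record 3 bytes later, so a seek to the old offset lands on the tail of
the first record.

```python
import io

# Two toy records; the second one's id is its own byte offset.
first = b"00000000 entity | toy gloss (living or nonliving)\n"
second_offset = len(first)
second = b"%08d physical_entity | toy gloss\n" % second_offset
original = first + second

# Simulate the edit: add 3 characters ("xx ") to the first record only.
edited = original.replace(b"entity |", b"entity xx |", 1)

def read_from(data, offset, size):
    """Seek to an absolute offset and read `size` bytes."""
    fh = io.BytesIO(data)
    fh.seek(offset, 0)  # whence=0 means absolute, like seek(FH, offset, 0)
    return fh.read(size)

before = read_from(original, second_offset, 8)  # the id of the second record
after = read_from(edited, second_offset, 11)    # 3 stray bytes, then the id
```

Here before holds the 8-digit id, while after starts with the last 3 bytes of
the edited first record, "g)\n", followed by the id, just like the output
above.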
I wonder why the database is arranged this way. Does it make lookups faster?
And what is the index.noun file used for, when all the information in it is
also in data.noun?
So how can I add new synonym words to the WordNet database without affecting
the original byte offsets?
Thanks.