|
|
> On Tue, 29 Apr 2008 20:53:50 +0530, santosh wrote:
> What about words with other characters like hyphen? What about
> constructs like "get_name"? Will you discard them too. What about words
> that end with a ; or ...? What about words that contain symbols like #@
> etc? Or words that end with an exclamation mark? Or words within
> parenthesis or braces?
all of them will be discarded. Only words containing letters like
"santosh" will be considered, nothing else.
> That's one way yes, suitable when you don't know the lengths of words in
> advance, or you don't want to possibly waste storage with statically
> allocated arrays.
yes, exactly, input will be at run-time only.
> Depends on your requirements really, and the type and frequency of input
> you expect. Will you put an upper limit on the length of words? It
> hardly makes sense to accept words longer than about 64 characters if
> you are dealing with normal English text.
ok, make the upper limit to 64 :) , I usually take it 100 as my style.
> Static 2D arrays are
> undoubtedly easier to work with but are less flexible than dynamically
> allocated arrays. Since statically allocated arrays are of fixed size
> it's possible for some elements to remain unused and hence wasted. OTOH
> a large number of small allocations may lead to memory fragmentation and
> also some wastage due to malloc bookeeping and possibly also a slowdown
> in speed if you'll be reading a very large number of words from a file.
> For input from a human it will not matter.
you want to say that there will be 2 types of implementations if
efficiency is my concern:
1.) input from human
2.) input from a text-file
??
couldn't there be a single implementation for both types of inputs ?
> One efficient method is to use a single dynamically allocated array in
> which words are stored sequentially. The length of each word could be
> specified by either one or two bytes prefixing the word itself. This
> results in very efficient storage, but is grossly inefficient if you
> want to insert and delete words at random. For this a hash table based
> approach is probably the best. OTOH a tree is very convenient for quick
> searching and sorting.
> If you tell us more details about the type and volume of input you
> expect and the facilities (like searching, insertion, etc.) you plan to
> implement, perhaps a tailored approach can be suggested.
The basic problem is to sort, count and print the sorted words. We are
not going to save a word in an array if it has already appeared, we will
just increase the count for that word.
K&R2 seems to suggest that a doubly-linked list using binary search is
the most efficient method to use, described in section 6.5 and is already
solved. Actually I am not able to understand the <getword> function of the
authors which actually is different from what I want, hence I need to
create one of my own.
--
http://lispmachine.wordpress.com/
my email ID is at the above address
|
|