Re: comparing some but not all fields in lists

Subject: Re: comparing some but not all fields in lists
From: Gunnar Hjalmarsson
Date: Tue, 04 Mar 2008 01:26:21 +0100
Newsgroups: perl.beginners


David Newman wrote:
Greetings. I'm looking to compare two contact lists in csv format, and then print out "here are the records in Llist only, in Rlist only, and what's in common."

I should compare only 3 of the 82 fields in each list. There are differences in some of the other fields that I should ignore.

If I read in each csv file as an array, List::Compare does a nice job of comparing all 82 fields as a single array element. But I should only look at 3 fields, not all 82. (snippet A below)

I can also use List::Compare plus a split function to strip out just the 3 fields I'm comparing. However, the resulting arrays then only have three fields in each array element. (snippet B below)

How to compare only selected fields in each list, but then present all fields for any matches?

For one of the lists, you can store the records in a hash where the first 3 fields of each record are the key, and the complete record is the value. Something like:

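    my ($Lfile, $Rfile) = @ARGV;    # assumption: the two CSV paths are command-line arguments
    my (%Llist, @Rlist, @common);
    open my $L, '<', $Lfile or die "Cannot open $Lfile: $!";
    open my $R, '<', $Rfile or die "Cannot open $Rfile: $!";
    # Index the left list by its first 3 fields; the value is the complete record.
    while ( <$L> ) {
        my ($comp) = /^([^,]*,[^,]*,[^,]*)/ or next;    # skip lines with fewer than 3 fields
        $Llist{$comp} = $_;
    }
    # A right-list record whose key is also in %Llist is common; otherwise it is Rlist-only.
    while ( <$R> ) {
        my ($comp) = /^([^,]*,[^,]*,[^,]*)/ or next;
        if ( exists $Llist{$comp} ) {
            push @common, delete $Llist{$comp};
        } else {
            push @Rlist, $_;
        }
    }
    # Whatever is left in %Llist was never matched, so it is Llist-only.
    print "Llist only:\n", values %Llist, "\n";
    print "Rlist only:\n", @Rlist, "\n";
    print "In common:\n", @common;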

This is a simplified example, though. Consider, for instance, the possibility that there is a company named "Smith, Jones & Co". To deal with that, you'd better use a CSV module for parsing the records.
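
As a rough sketch of what the CSV-module variant might look like, here is the same hash-based approach using Text::CSV. This assumes Text::CSV is installed from CPAN (it is not a core module). The helper names read_records and compare_lists, the $key_cols parameter, and the "\x1F" key separator are all made up for illustration. Passing the column indices as a parameter also covers the case where the 3 fields to compare are not the first 3.

```perl
use strict;
use warnings;
use Text::CSV;    # assumption: installed from CPAN; not a core module

# Hypothetical helper: read every CSV record from a filehandle and return
# a list of [ $key, $row ] pairs. $key is built from the columns listed in
# $key_cols (0-based indices); $row is an array ref holding all fields.
sub read_records {
    my ($fh, $key_cols) = @_;
    my $csv = Text::CSV->new({ binary => 1, auto_diag => 1 });
    my @records;
    while ( my $row = $csv->getline($fh) ) {
        # Assumption: "\x1F" (ASCII unit separator) never occurs in the data,
        # so joining on it cannot make two different records look alike.
        my $key = join "\x1F",
            map { defined $_ ? $_ : '' } @{$row}[ @$key_cols ];
        push @records, [ $key, $row ];
    }
    return @records;
}

# Hypothetical helper: returns three array refs of rows:
# (left-only, right-only, common). Common rows are taken from the left list.
sub compare_lists {
    my ($l_fh, $r_fh, $key_cols) = @_;
    my %llist = map { $_->[0] => $_->[1] } read_records($l_fh, $key_cols);
    my (@rlist, @common);
    for my $rec ( read_records($r_fh, $key_cols) ) {
        my ($key, $row) = @$rec;
        if ( exists $llist{$key} ) {
            push @common, delete $llist{$key};
        } else {
            push @rlist, $row;
        }
    }
    return ( [ values %llist ], \@rlist, \@common );
}
```

You would open the two files as in the example above and call compare_lists($L, $R, [0, 1, 2]). Each returned row is an array ref of fields, so a field such as "Smith, Jones & Co" stays in one piece. To print a row as a CSV line again, $csv->combine(@$row) followed by $csv->string should do it.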

--
Gunnar Hjalmarsson
Email: http://www.gunnar.cc/cgi-bin/contact.pl
