Gunnar Hjalmarsson
Date: Tue, 04 Mar 2008 01:26:21 +0100
David Newman wrote:
Greetings. I'm looking to compare two contact lists in csv format, and then print out "here are the records in Llist only, in Rlist only, and what's in common."
I should compare only 3 of the 82 fields in each list. There are
differences in some of the other fields that I should ignore.
If I read in each csv file as an array, List::Compare does a nice job of
comparing all 82 fields as a single array element. But I should only
look at 3 fields, not all 82. (snippet A below)
I can also use List::Compare plus a split function to strip out just the
3 fields I'm comparing. However, the resulting arrays then only have
three fields in each array element. (snippet B below)
How to compare only selected fields in each list, but then present all
fields for any matches?
For one of the lists, you can store the records in a hash where the
first 3 fields of each record are the key, and the complete record is the
value. Something like:
    my (%Llist, @Rlist, @common);
    open my $L, '<', $Lfile or die $!;
    open my $R, '<', $Rfile or die $!;
    # Index Llist by its first 3 fields; the value is the complete record
    while ( <$L> ) {
        my ($comp) = /([^,]*,[^,]*,[^,]*)/;
        $Llist{$comp} = $_;
    }
    # Sort each Rlist record into "in common" or "Rlist only"
    while ( <$R> ) {
        my ($comp) = /([^,]*,[^,]*,[^,]*)/;
        if ( exists $Llist{$comp} ) {
            push @common, delete $Llist{$comp};
        } else {
            push @Rlist, $_;
        }
    }
    # Whatever is still in %Llist was never matched by an Rlist record
    print "Llist only:\n", values %Llist, "\n";
    print "Rlist only:\n", @Rlist, "\n";
    print "In common:\n", @common;

This is a simplified example, though. Consider, for instance, the possibility that there is a company named "Smith, Jones & Co". To deal with that, you'd better use a CSV module for parsing the records.
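
A rough sketch of the same idea with a CSV parser, assuming the Text::CSV module from CPAN is installed. The sub name compare_lists and the column positions passed to it are made up for illustration, and I have not run this against your data:

```perl
use strict;
use warnings;
use Text::CSV;    # CPAN module, not in core; assumed to be installed

# compare_lists is a hypothetical helper. @key_cols holds the 0-based
# positions of the fields to compare (e.g. 0, 1, 2 for the first three).
sub compare_lists {
    my ($Lfile, $Rfile, @key_cols) = @_;
    my $csv = Text::CSV->new({ binary => 1 })
        or die "Cannot use Text::CSV: " . Text::CSV->error_diag;

    # Build the comparison key from the selected fields of a parsed row.
    # Joining on \x1F assumes that character never occurs in the data.
    my $key_of = sub {
        my ($row) = @_;
        return join "\x1F", map { defined $_ ? $_ : '' } @{$row}[@key_cols];
    };

    my (%Llist, @Rlist, @common);

    # Index Llist by key; the value is the complete parsed record
    open my $L, '<', $Lfile or die "$Lfile: $!";
    while ( my $row = $csv->getline($L) ) {
        $Llist{ $key_of->($row) } = $row;
    }
    close $L;

    # Sort each Rlist record into "in common" or "Rlist only"
    open my $R, '<', $Rfile or die "$Rfile: $!";
    while ( my $row = $csv->getline($R) ) {
        my $key = $key_of->($row);
        if ( exists $Llist{$key} ) {
            push @common, delete $Llist{$key};
        } else {
            push @Rlist, $row;
        }
    }
    close $R;

    # Llist only, Rlist only, in common (each a list of array refs)
    return ( [ values %Llist ], \@Rlist, \@common );
}
```

Calling compare_lists($Lfile, $Rfile, 0, 1, 2) returns three array references. Each element is itself a reference to the complete list of fields of one record, so a quoted field such as "Smith, Jones & Co" stays in one piece. To print a record as a CSV line again, the combine and string methods of Text::CSV can be used.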
Gunnar Hjalmarsson

