Join multiple rows based on column 1

I have a file like the following:

abc, 12345
def, text and nos    
ghi, something else   
jkl, words and numbers

abc, 56345   
def, text and nos   
ghi, something else 
jkl, words and numbers

abc, 15475  
def, text and nos 
ghi, something else
jkl, words and numbers

abc, 123345
def, text and nos
ghi, something else  
jkl, words and numbers

I want to convert (merge) it into:

abc, 12345, 56345, 15475, 123345
def, text and nos, text and nos, text and nos, text and nos
ghi, something else, something else, something else, something else   
jkl, words and numbers, words and numbers, words and numbers, words and numbers

— pvkbhat

Do you actually have extra blank lines in your input file? If not, please edit the question and remove them; you should show the file exactly as it is.

— terdon

Answers:

If you don't mind the order of the output:

$ awk -F',' 'NF>1{a[$1] = a[$1]","$2};END{for(i in a)print i""a[i]}' file 
jkl, words and numbers, words and numbers, words and numbers, words and numbers
abc, 12345, 56345, 15475, 123345
ghi, something else, something else, something else, something else
def, text and nos, text and nos, text and nos, text and nos

Explanation

NF>1 means we only process lines that have more than one comma-separated field, which skips the empty lines.
We store the data in the associative array a: the key is the first field and the value is the second field (here, the rest of the line). If the key already has a value, we concatenate the two values.
In the END block, we loop through the associative array a and print each key with its corresponding value.
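To make the steps above easier to follow, here is the same idea spelled out as a commented sketch. The sample input and the function name merge_by_key are made up for illustration, and the final sort is only there to make the demo output deterministic, since awk does not specify the traversal order of for (i in a).

```shell
# Hypothetical sample input mirroring the question's layout:
# "key, value" lines in groups separated by blank lines.
input='abc, 12345
def, text and nos

abc, 56345
def, text and nos'

# merge_by_key is an illustrative name, not part of the original answer.
merge_by_key() {
    awk -F',' '
        # Skip lines with fewer than two fields (for example, blank lines).
        NF > 1 {
            # Append ",<second field>" to whatever is stored for this key.
            a[$1] = a[$1] "," $2
        }
        END {
            # The order of "for (i in a)" is unspecified in awk.
            for (i in a) print i a[i]
        }
    '
}

# Sorting afterwards only makes the demo output deterministic.
result=$(printf '%s\n' "$input" | merge_by_key | sort)
printf '%s\n' "$result"
```

Because the second field keeps its leading space, the stored value already reads ", 12345", so printing the key directly followed by the value gives "abc, 12345, 56345".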

Or using perl, which will keep the order (here by sorting the keys):

$ perl -F',' -anle 'next if /^$/;$h{$F[0]} = $h{$F[0]}.", ".$F[1];
    END{print $_,$h{$_},"\n" for sort keys %h}' file
abc, 12345, 56345, 15475, 123345

def, text and nos, text and nos, text and nos, text and nos

ghi, something else, something else, something else, something else

jkl, words and numbers, words and numbers, words and numbers, words and numbers

— cuonglm

Shouldn't your perl solution from my question unix.stackexchange.com/questions/124181/… work fine here?

— Ramesh

No. The OP wants to concatenate the strings based on column 1, whether or not they are duplicates. Your question was about not wanting duplicates.

— cuonglm

Oh, okay. At first glance, it looked almost the same as my question. :)

— Ramesh

Neat, +1! This doesn't preserve the order, though; it only recreates it in this particular example, where the fields happen to be in alphabetical order.

— terdon

Just for laughs, I had written almost exactly the same approach before reading your answer: perl -F, -lane 'next unless /./;push @{$k{$F[0]}}, ",@F[1..$#F]"; END{print "$_@{$k{$_}}" foreach keys(%k)}' file :) Great minds think alike!

— terdon

Oh, that's easy. Here's a simple version that keeps the keys in the order they appear in the file:

$ awk -F, '
    /.+/{
        if (!($1 in Val)) { Key[++i] = $1; }
        Val[$1] = Val[$1] "," $2; 
    }
    END{
        for (j = 1; j <= i; j++) {
            printf("%s%s\n%s", Key[j], Val[Key[j]], (j == i) ? "" : "\n");
        }
    }' file.txt

The output should look like this:

abc, 12345, 56345, 15475, 123345

def, text and nos, text and nos, text and nos, text and nos

ghi, something else, something else, something else, something else

jkl, words and numbers, words and numbers, words and numbers, words and numbers

If you don't mind having an extra blank line at the end, just replace the printf line with printf("%s%s\n\n", Key[j], Val[Key[j]]);
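To check that this approach follows the order of the file rather than alphabetical order, here is a minimal sketch that uses the same Key/Val bookkeeping on input whose keys are deliberately not sorted. The sample data and the function name merge_in_file_order are assumptions for illustration, and this variant prints one line per key without the blank separator lines.

```shell
# Hypothetical input whose keys are deliberately not in alphabetical order.
input='zeta, 1
alpha, 2

zeta, 3
alpha, 4'

# merge_in_file_order is an illustrative name, not part of the answer above.
merge_in_file_order() {
    awk -F, '
        # Only process non-empty lines.
        /./ {
            # Remember each key the first time it is seen.
            if (!($1 in Val)) { Key[++n] = $1 }
            Val[$1] = Val[$1] "," $2
        }
        END {
            # Walk the keys by first-seen index, not by array traversal order.
            for (j = 1; j <= n; j++) print Key[j] Val[Key[j]]
        }
    '
}

result=$(printf '%s\n' "$input" | merge_in_file_order)
printf '%s\n' "$result"
```

Here zeta is printed before alpha because it appears first in the input; sorting the keys would have reversed the two lines.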