Case-matching pattern replacement using sed

14

I have source code spread across several files.

It contains a pattern abcdef which I need to replace with pqrstuvxyz.
The pattern could be Abcdef (sentence case), in which case it needs to be replaced with Pqrstuvxyz.
The pattern could be AbCdEf (toggle case), in which case it needs to be replaced with PqRsTuVxYz.

In short, I need to match the case of the source pattern and apply the same case to the destination pattern.

How can I achieve this using sed or any other tool?

text-processing sed awk

— user1263746

And what if it is ABcDeF?

— Stéphane Chazelas

PQrStUvxyz - I get your point.

— user1263746

So if ABcDeF -> PQrStUvxyz, then surely AbCdEf -> PqRsTuvxyz would be logically consistent. If the case is to be copied from one string to the other, what should happen when the replacement string is longer?

— Graeme

Well, let's trim the replacement to "pqrstu" for brevity.

— user1263746

9

Portable solution using sed:

sed '
:1
/[aA][bB][cC][dD][eE][fF]/!b
s//\
&\
pqrstu\
PQRSTU\
/;:2
s/\n[[:lower:]]\(.*\n\)\(.\)\(.*\n\).\(.*\n\)/\2\
\1\3\4/;s/\n[^[:lower:]]\(.*\n\).\(.*\n\)\(.\)\(.*\n\)/\3\
\1\2\4/;t2
s/\n.*\n//;b1'

It is a bit simpler with GNU sed:

search=abcdef replace=pqrstuvwx
sed -r ":1;/$search/I!b;s//\n&&&\n$replace\n/;:2
    s/\n[[:lower:]](.*\n)(.)(.*\n)/\l\2\n\1\3/
    s/\n[^[:lower:]](.*\n)(.)(.*\n)/\u\2\n\1\3/;t2
    s/\n.*\n(.*)\n/\1/g;b1"

By using &&& above, we reuse the case pattern of the matched string for the rest of the replacement, so ABcdef would be changed to PQrstuVWx and AbCdEf to PqRsTuVwX. Change it to & to affect only the case of the first 6 characters.

(Note that it may not do what you want, or may run into an infinite loop, if the replacement can itself be subject to substitution, for instance when replacing foo with foo, or bcd with abcd.)

— Stéphane Chazelas

8

Portable solution using awk:

awk -v find=abcdef -v rep=pqrstu '{
  lwr=tolower($0)
  offset=index(lwr, tolower(find))

  if( offset > 0 ) {
    # text before the match
    printf "%s", substr($0, 1, offset-1)
    len=length(find)

    for( i=0; i<len; i++ ) {
      out=substr(rep, i+1, 1)

      # source character equals its lower-case form: emit lower case
      if( substr($0, offset+i, 1) == substr(lwr, offset+i, 1) )
        printf "%s", tolower(out)
      else
        printf "%s", toupper(out)
    }

    # text after the match
    printf "%s\n", substr($0, offset+len)
  } else
    print   # no match: pass the line through unchanged
}'

Example input:

other abcdef other
other Abcdef other
other AbCdEf other

Example output:

other pqrstu other
other Pqrstu other
other PqRsTu other

Update

As pointed out in the comments, the above will only replace the first instance of find on each line. To replace all instances:

awk -v find=abcdef -v rep=pqrstu '{
  input=$0
  lwr=tolower(input)
  len=length(find)
  offset=index(lwr, tolower(find))

  while( offset > 0 ) {
    # text before the match
    printf "%s", substr(input, 1, offset-1)

    for( i=0; i<len; i++ ) {
      out=substr(rep, i+1, 1)

      # source character equals its lower-case form: emit lower case
      if( substr(input, offset+i, 1) == substr(lwr, offset+i, 1) )
        printf "%s", tolower(out)
      else
        printf "%s", toupper(out)
    }

    # continue searching in the text after the match
    input=substr(input, offset+len)
    lwr=substr(lwr, offset+len)
    offset=index(lwr, tolower(find))
  }

  # remainder of the line (the whole line if there was no match)
  print input
}'

Example input:

other abcdef other ABCdef other
other Abcdef other abcDEF
other AbCdEf other aBCdEf other

Example output:

other pqrstu other PQRstu other
other Pqrstu other pqrSTU
other PqRsTu other pQRsTu other
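
Both scripts emit only as many replacement characters as find has, so a longer rep (such as the pqrstuvxyz from the question) is cut short. Below is a minimal sketch of one way around that. It assumes that replacement characters past the end of the match should take the case of the last matched character. That policy and the wrapper name preserve_case_replace are assumptions made for illustration, not something the scripts above define.

```shell
#!/bin/sh
# preserve_case_replace FIND REP: hypothetical wrapper; reads stdin, writes stdout.
preserve_case_replace() {
    awk -v find="$1" -v rep="$2" '
    {
        input = $0; out = ""
        lfind = tolower(find); flen = length(find); rlen = length(rep)

        while (flen > 0 && (pos = index(tolower(input), lfind)) > 0) {
            out = out substr(input, 1, pos - 1)      # text before the match
            for (i = 1; i <= rlen; i++) {
                # Assumed policy: past the end of the match, reuse the
                # case of its last character.
                j = (i <= flen) ? i : flen
                src = substr(input, pos + j - 1, 1)
                ch = substr(rep, i, 1)
                out = out ((src == tolower(src)) ? tolower(ch) : toupper(ch))
            }
            input = substr(input, pos + flen)        # keep searching after the match
        }
        print out input                              # unmatched lines pass through
    }'
}
```

For example, `printf 'other AbCdEf other\n' | preserve_case_replace abcdef pqrstuvxyz` should give `other PqRsTuvxyz other`. Because already-replaced text is moved to out and never searched again, a replacement that contains the search string does not loop.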

— Graeme

Note that it only processes one instance per line.

— Stéphane Chazelas

@StephaneChazelas, updated to handle multiple instances.

— Graeme

6

You could use perl. Straight from the FAQ, quoting from perldoc perlfaq6:

How do I substitute case-insensitively on the LHS while preserving case on the RHS?

Here's a lovely Perlish solution by Larry Rosler. It exploits properties of bitwise xor on ASCII strings.

   $_= "this is a TEsT case";

   $old = 'test';
   $new = 'success';

   s{(\Q$old\E)}
   { uc $new | (uc $1 ^ $1) .
           (uc(substr $1, -1) ^ substr $1, -1) x
           (length($new) - length $1)
   }egi;

   print;

And here it is as a subroutine, modeled after the above:

       sub preserve_case($$) {
               my ($old, $new) = @_;
               my $mask = uc $old ^ $old;

               uc $new | $mask .
                       substr($mask, -1) x (length($new) - length($old))
       }

       $string = "this is a TEsT case";
       $string =~ s/(test)/preserve_case($1, "success")/egi;
       print "$string\n";

This prints:

           this is a SUcCESS case
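
The xor trick relies on ASCII upper- and lower-case letters differing only in bit 0x20: `uc $old ^ $old` is 0x20 wherever $old is lower case and 0 elsewhere, and OR-ing that into `uc $new` lower-cases exactly those positions. A rough shell sketch of the same arithmetic on single characters follows. The helper names ord, chr, case_mask and apply_mask are made up for this illustration and are not part of perlfaq6.

```shell
#!/bin/sh
# ord CHAR: print the numeric code of one ASCII character (POSIX printf "'c" form).
ord() { printf '%d' "'$1"; }

# chr CODE: print the character with the given code, via an octal escape.
chr() { printf "\\$(printf '%03o' "$1")"; }

# case_mask CHAR: uc(CHAR) ^ CHAR, i.e. 32 (0x20) for a lower-case ASCII
# letter and 0 for an upper-case one.
case_mask() {
    upper=$(printf '%s' "$1" | tr 'a-z' 'A-Z')
    echo $(( $(ord "$upper") ^ $(ord "$1") ))
}

# apply_mask CHAR MASK: uc(CHAR) | MASK, which lower-cases CHAR when MASK is 32.
apply_mask() {
    upper=$(printf '%s' "$1" | tr 'a-z' 'A-Z')
    chr $(( $(ord "$upper") | $2 ))
}

# Mask of "TEsT" applied to the first four letters of "success".
for pair in T:s E:u s:c T:c; do
    apply_mask "${pair#*:}" "$(case_mask "${pair%%:*}")"
done
echo    # prints SUcC
```

The remaining letters of "success" reuse the mask of the final "T" (0), which is why the result ends in upper-case "ESS".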

Alternatively, to keep the case of the replacement word if it is longer than the original, you can use this code, by Jeff Pinyan:

   sub preserve_case {
           my ($from, $to) = @_;
           my ($lf, $lt) = map length, @_;

           if ($lt < $lf) { $from = substr $from, 0, $lt }
           else { $from .= substr $to, $lf }

           return uc $to | ($from ^ uc $from);
   }

This changes the sentence to "this is a SUcCess case".

Just to show that C programmers can write C in any programming language, if you prefer a more C-like solution, the following script makes the substitution have the same case, letter by letter, as the original. (It also happens to run about 240% slower than the Perlish solution.) If the substitution has more characters than the string being substituted, the case of the last character is used for the rest of the substitution.

   # Original by Nathan Torkington, massaged by Jeffrey Friedl
   #
   sub preserve_case($$)
   {
           my ($old, $new) = @_;
           my ($state) = 0; # 0 = no change; 1 = lc; 2 = uc
           my ($i, $oldlen, $newlen, $c) = (0, length($old), length($new));
           my ($len) = $oldlen < $newlen ? $oldlen : $newlen;

           for ($i = 0; $i < $len; $i++) {
                   if ($c = substr($old, $i, 1), $c =~ /[\W\d_]/) {
                           $state = 0;
                   } elsif (lc $c eq $c) {
                           substr($new, $i, 1) = lc(substr($new, $i, 1));
                           $state = 1;
                   } else {
                           substr($new, $i, 1) = uc(substr($new, $i, 1));
                           $state = 2;
                   }
           }
           # finish up with any remaining new (for when new is longer than old)
           if ($newlen > $oldlen) {
                   if ($state == 1) {
                           substr($new, $oldlen) = lc(substr($new, $oldlen));
                   } elsif ($state == 2) {
                           substr($new, $oldlen) = uc(substr($new, $oldlen));
                   }
           }
           return $new;
   }

— devnull

Note that it is limited to ASCII letters.

— Stéphane Chazelas

5

If you trim the replacement to pqrstu, try this:

Input:

abcdef
Abcdef
AbCdEf
ABcDeF

Output:

$ perl -lpe 's/$_/$_^lc($_)^"pqrstu"/ei' file
pqrstu
Pqrstu
PqRsTu
PQrStU

If you want to replace with pqrstuvxyz, it could be this:

$ perl -lne '@c=unpack("(A4)*",$_);
    $_ =~ s/$_/$_^lc($_)^"pqrstu"/ei;
    $c[0] =~ s/$c[0]/$c[0]^lc($c[0])^"vxyz"/ei;
    print $_,$c[0]' file
pqrstuvxyz
PqrstuVxyz
PqRsTuVxYz
PQrStUVXyZ

I can't find any rule to map ABcDeF -> PQrStUvxyz.

— cuonglm

Note that it is limited to ASCII letters.

— Stéphane Chazelas

3

Something like this would do what you described.

sed -i.bak -e "s/abcdef/pqrstuvxyz/g" \
 -e "s/AbCdEf/PqRsTuVxYz/g" \
 -e "s/Abcdef/Pqrstuvxyz/g" files/src
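
This only covers the three casings listed, so it may help to check first which casings actually occur in the sources. A small sketch, assuming a grep that supports -o and -h (GNU and BSD grep do, but they are not required by POSIX); the function name list_case_variants is hypothetical:

```shell
#!/bin/sh
# list_case_variants PATTERN [FILE...]: print each distinct casing of PATTERN
# found in the given files (or on stdin), one per line.
list_case_variants() {
    pattern=$1
    shift
    # -i: ignore case, -o: print only the matched text, -h: omit file names
    grep -i -o -h -e "$pattern" -- "$@" | sort -u
}
```

Running `list_case_variants abcdef files/src` would then show whether a casing such as ABcDeF is present and needs its own -e expression.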

— unx