Replace whitespace with tabs in Linux

98

How can I replace whitespace with tabs in Linux in a given text file?

linux tabs whitespace

— biznez

168

Use the unexpand(1) program

UNEXPAND(1)                      User Commands                     UNEXPAND(1)

NAME
       unexpand - convert spaces to tabs

SYNOPSIS
       unexpand [OPTION]... [FILE]...

DESCRIPTION
       Convert  blanks in each FILE to tabs, writing to standard output.  With
       no FILE, or when FILE is -, read standard input.

       Mandatory arguments to long options are  mandatory  for  short  options
       too.

       -a, --all
              convert all blanks, instead of just initial blanks

       --first-only
              convert only leading sequences of blanks (overrides -a)

       -t, --tabs=N
              have tabs N characters apart instead of 8 (enables -a)

       -t, --tabs=LIST
              use comma separated LIST of tab positions (enables -a)

       --help display this help and exit

       --version
              output version information and exit
. . .
STANDARDS
       The expand and unexpand utilities conform to IEEE Std 1003.1-2001
       (``POSIX.1'').

— DigitalRoss
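
A minimal sketch of how the options in the man page excerpt combine, assuming GNU coreutils and a 4-column indent. The function name leading_spaces_to_tabs is illustrative only, not part of any tool.

```shell
#!/bin/sh
# Hypothetical helper: convert leading runs of 4 spaces to tabs.
# Reads the named files, or standard input when none are given, and
# writes the result to standard output (the input is never modified).
leading_spaces_to_tabs() {
    unexpand -t 4 --first-only "$@"
}

# Eight leading spaces become two tabs; the two spaces before '#' are kept.
printf '        x = 1  # note\n' | leading_spaces_to_tabs
```

Because -t implies -a, --first-only is what keeps the spacing inside the line untouched here.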

4

Woah, I never knew expand/unexpand existed. I was trying to do the opposite, and expand was perfect rather than having to mess around with tr or sed.

— Ibrahim

4

For the record, expand/unexpand are standard utilities.

— kojiro

4

So cool that these are standard. I love the UNIX philosophy. It would be nice if it could work in place, though.

— Matthew Flaschen

3

I don't think unexpand will work here. It only converts leading spaces, and only runs of two or more spaces. See here: lists.gnu.org/archive/html/bug-textutils/2001-01/msg00025.html

— olala

13

Just a caveat: unexpand will not convert a single space into a tab. If you need to blindly convert every run of 0x20 characters into a single tab, you need a different tool.

— Steve S.

44

I think you can try with awk

awk -v OFS="\t" '$1=$1' file1

or sed if you prefer (GNU sed syntax)

sed 's/[[:blank:]]\+/\t/g' thefile.txt > the_modified_copy.txt

or even tr

tr -s ' ' < thefile.txt | tr ' ' '\t' > the_modified_copy.txt

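or a simplified version of the tr solution, suggested by Sam Bisbee (write to a different file: redirecting the output onto the input file would truncate it before tr reads it)

tr ' ' \\t < someFile > someOtherFile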

— Jonathan

4

In your sed example, best practice says to use tr rather than sed to replace single characters, for efficiency/speed. Also, the tr example is much simpler this way: tr ' ' \\t < someFile > someOtherFile

— Sam Bisbee

2

Of course tr performs better than sed, but the main reason I love Unix is that there are many ways to do something. If you plan to do this substitution many times, you will look for a solution that performs well, but if you will only do it once, you will look for a solution that uses a command you are comfortable with.

— Jonathan

2

Argh. I had to use trial and error to get the sed to work. I have no idea why I had to escape the plus sign like this: ls -l | sed "s/ \+/ /g"

— Jess

With awk -v OFS="\t" '$1=$1' file1 I noticed that if a line begins with the number 0 (for example 0 1 2), that line is omitted from the result.

— Nikola Novak
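
The omission described above happens because a bare '$1=$1' is used as an awk pattern, and a first field of 0 evaluates as false, so nothing is printed for that line. A minimal sketch that avoids it by printing unconditionally; the function name fields_to_tabs is illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: re-join whitespace-separated fields with tabs.
# '$1 = $1' forces awk to rebuild the record using OFS; the explicit
# 'print' makes the output independent of the value of the first field.
fields_to_tabs() {
    awk -v OFS='\t' '{ $1 = $1; print }' "$@"
}

# The line starting with 0 is kept, and runs of spaces collapse to one tab.
printf '0 1 2\na  b   c\n' | fields_to_tabs
```

Note that awk's default field splitting also drops leading and trailing whitespace on each line, so this suits column data better than indented text.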

@Jess You found the regex that is correct for the default syntax. By default, sed treats an unescaped plus sign as a plain character. The same goes for other characters such as "?", ... You can find more information here: gnu.org/software/sed/manual/html_node/… . Similar syntax details can be found here (note that this is the manual for grep, not sed): gnu.org/software/grep/manual/grep.html#Basic-vs-Extended .

— Victor Yarema
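
To make the basic versus extended distinction concrete, here is a small sketch assuming GNU sed (the escaped '\+' in basic syntax is a GNU extension). The helper names are illustrative only, and the tab is built with printf so the scripts do not rely on sed interpreting '\t'.

```shell
#!/bin/sh
# A literal tab character for use in the replacement text.
tab=$(printf '\t')

# Hypothetical helper: extended syntax (-E), where a bare '+' means
# "one or more of the preceding character".
runs_to_tabs_ere() {
    sed -E "s/ +/${tab}/g" "$@"
}

# Hypothetical helper: basic syntax, where GNU sed needs the '+' escaped.
runs_to_tabs_bre() {
    sed "s/ \+/${tab}/g" "$@"
}

# Both print a, b and c separated by single tabs.
printf 'a   b  c\n' | runs_to_tabs_ere
printf 'a   b  c\n' | runs_to_tabs_bre
```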

11

Using Perl:

perl -p -i -e 's/ /\t/g' file.txt

— John Millikin

3

I had a similar problem, replacing consecutive spaces with a single tab. Perl worked only after I added a "+" to the regexp.

— Todd

Although, of course, I wanted to do the opposite and convert tabs to two spaces: perl -p -i -e 's/\t/  /g' *.java

— TimP

Can I do this recursively?

— Aaron Franke

9

A better tr command:

tr [:blank:] \\t

This will clean up the output of, say, unzip -l for further processing with grep, cut, etc.

e.g.,

unzip -l some-jars-and-textfiles.zip | tr [:blank:] \\t | cut -f 5 | grep jar

— Tarkin

I don't have to use quotes to get it to work: tr [:blank:] \\t

— Ömer An
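
Whether the quotes matter depends on the current directory: unquoted, [:blank:] is a shell glob that matches any one-character file name made of those letters or a colon. A small sketch of the effect, with illustrative function names that are not part of any tool:

```shell
#!/bin/sh
# Hypothetical helper: convert every blank (space or tab) to a tab.
# The class is quoted so the shell passes it to tr untouched.
blanks_to_tabs() {
    tr '[:blank:]' '\t'
}

# Hypothetical helper: show what the shell makes of the unquoted word
# inside directory $1. It prints the glob expansion, or the word itself
# when no file name matches.
show_unquoted_class() {
    ( cd "$1" && echo [:blank:] )
}
```

In a directory containing a file named a, show_unquoted_class prints a, which is what tr would receive instead of the character class.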

3

Download and run the following script to recursively convert soft tabs to hard tabs in plain text files.

Place and execute the script from inside the folder which contains the plain text files.

#!/bin/bash

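find . -type f -and -not -path './.git/*' -exec grep -Iq . {} \; -and -print | while IFS= read -r file; do
    echo "Converting... $file"
    # Convert into a temporary file first, so the original survives if unexpand fails.
    tmp=$(mktemp) || exit 1
    if unexpand --first-only -t 4 "$file" > "$tmp"; then
        # Overwrite rather than replace the file, which keeps its permissions.
        cat "$tmp" > "$file"
    fi
    rm -f "$tmp"
done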

— daka

2

Example command for converting each .js file under the current dir to tabs (only leading spaces are converted):

find . -name "*.js" -exec bash -c 'unexpand -t 4 --first-only "$0" > /tmp/totabbuff && mv /tmp/totabbuff "$0"' {} \;

— arkod

Tested in cygwin on windows 7.

— arkod

1

You can also use astyle. I found it quite useful and it has several options too:

Tab and Bracket Options:
   If  no  indentation  option is set, the default option of 4 spaces will be used. Equivalent to -s4 --indent=spaces=4.  If no brackets option is set, the
   brackets will not be changed.

   --indent=spaces, --indent=spaces=#, -s, -s#
          Indent using # spaces per indent. Between 1 to 20.  Not specifying # will result in a default of 4 spaces per indent.

   --indent=tab, --indent=tab=#, -t, -t#
          Indent using tab characters, assuming that each tab is # spaces long.  Between 1 and 20. Not specifying # will result in a default assumption  of
          4 spaces per tab.

— Ankur Agarwal

0

If you are talking about replacing all consecutive spaces on a line with a tab then tr -s '[:blank:]' '\t'.

[root@sysresccd /run/archiso/img_dev]# sfdisk -l -q -o Device,Start /dev/sda
Device         Start
/dev/sda1       2048
/dev/sda2     411648
/dev/sda3    2508800
/dev/sda4   10639360
/dev/sda5   75307008
/dev/sda6   96278528
/dev/sda7  115809778
[root@sysresccd /run/archiso/img_dev]# sfdisk -l -q -o Device,Start /dev/sda | tr -s '[:blank:]' '\t'
Device  Start
/dev/sda1       2048
/dev/sda2       411648
/dev/sda3       2508800
/dev/sda4       10639360
/dev/sda5       75307008
/dev/sda6       96278528
/dev/sda7       115809778

If you are talking about replacing all whitespace (e.g. space, tab, newline, etc.) then tr -s '[:space:]' '\t'.

[root@sysresccd /run/archiso/img_dev]# sfdisk -l -q -o Device,Start /dev/sda | tr -s '[:space:]' '\t'
Device  Start   /dev/sda1       2048    /dev/sda2       411648  /dev/sda3       2508800 /dev/sda4       10639360        /dev/sda5       75307008        /dev/sda6     96278528        /dev/sda7       115809778

If you are talking about fixing a tab-damaged file then use expand and unexpand as mentioned in other answers.

— shrewmouse
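
For the tab-damaged case, one way to combine the two tools is sketched below. It assumes GNU coreutils and an indent width of 4 columns; the function name normalize_indent is illustrative only.

```shell
#!/bin/sh
# Hypothetical helper: normalise mixed indentation to tabs.
# Step 1: expand turns every tab into spaces (tab stops every 4 columns).
# Step 2: unexpand turns the leading spaces back into tabs.
normalize_indent() {
    expand -t 4 "$@" | unexpand -t 4 --first-only
}

# One line is indented "tab + 4 spaces", the other "4 spaces + tab";
# both come out indented with two tabs.
printf '\t    x\n    \ty\n' | normalize_indent
```

Expanding first means tabs inside a line also become spaces, so alignment after the indent is preserved as spaces rather than re-tabbed.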

0

Using sed:

T=$(printf "\t")
sed "s/[[:blank:]]\+/$T/g"

or

sed "s/[[:space:]]\+/$T/g"

— Tibor

-1

This will replace consecutive spaces with one space (but not tab).

tr -s '[:blank:]'

This will replace consecutive spaces with a tab.

tr -s '[:blank:]' '\t'

— mel

Actually, with the -c it replaces consecutive characters that are not spaces.

— wingedsubmariner

1

The question is about tabs, this isn't an answer.

— Matthew Read