Converting char[] to byte[]

Question 1

I would like to convert a character array to a byte array in Java. What methods are available for this conversion?

Question 2

char[] ch = ?
new String(ch).getBytes();

or

new String(ch).getBytes("UTF-8");

to use a charset other than the platform default.

Update: since Java 7:

new String(ch).getBytes(StandardCharsets.UTF_8);
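As a minimal sketch of the Java 7+ variant in a runnable form (the class name StringGetBytesDemo and the helper toUtf8 are made up for illustration, and the sample characters are arbitrary):

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class StringGetBytesDemo {
    // Hypothetical helper: goes through a temporary String, so it is
    // convenient but leaves a copy of the data on the heap.
    static byte[] toUtf8(char[] ch) {
        return new String(ch).getBytes(StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        char[] ch = {'a', '\u00e9', '\u20ac'}; // 'a', e-acute, euro sign
        byte[] bytes = toUtf8(ch);
        // 1 + 2 + 3 bytes in UTF-8
        System.out.println(Arrays.toString(bytes)); // [97, -61, -87, -30, -126, -84]
    }
}
```

Unlike getBytes("UTF-8"), the StandardCharsets overload does not declare the checked UnsupportedEncodingException.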

Question 3

Convert without creating String objects:

import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.Charset;
import java.util.Arrays;

byte[] toBytes(char[] chars) {
  CharBuffer charBuffer = CharBuffer.wrap(chars);
  ByteBuffer byteBuffer = Charset.forName("UTF-8").encode(charBuffer);
  byte[] bytes = Arrays.copyOfRange(byteBuffer.array(),
            byteBuffer.position(), byteBuffer.limit());
  Arrays.fill(byteBuffer.array(), (byte) 0); // clear sensitive data
  return bytes;
}

Usage:

char[] chars = {'0', '1', '2', '3', '4', '5', '6', '7', '8', '9'};
byte[] bytes = toBytes(chars);
/* do something with chars/bytes */
Arrays.fill(chars, '\u0000'); // clear sensitive data
Arrays.fill(bytes, (byte) 0); // clear sensitive data

The solution is inspired by the Swing recommendation to store passwords in char[]. (See Why is char[] preferred over String for passwords?)

Remember not to write sensitive data to logs, and make sure the JVM does not hold any references to it.

The code above is correct but not efficient. If you don't need performance but do want security, you can use it. If security is not a goal either, simply use String.getBytes. The inefficiency becomes clear if you look at the implementation of encode in the JDK; on top of that, the code has to copy arrays and create buffers. Another way to convert is to inline all the code behind encode (example for UTF-8):

val xs: Array[Char] = "A ß € 嗨 𝄞 🙂".toArray
val len = xs.length
val ys: Array[Byte] = new Array(3 * len) // worst case
var i = 0; var j = 0 // i for chars; j for bytes
while (i < len) { // fill ys with bytes
  val c = xs(i)
  if (c < 0x80) {
    ys(j) = c.toByte
    i = i + 1
    j = j + 1
  } else if (c < 0x800) {
    ys(j) = (0xc0 | (c >> 6)).toByte
    ys(j + 1) = (0x80 | (c & 0x3f)).toByte
    i = i + 1
    j = j + 2
  } else if (Character.isHighSurrogate(c)) {
    if (len - i < 2) throw new Exception("overflow")
    val d = xs(i + 1)
    val uc: Int = 
      if (Character.isLowSurrogate(d)) {
        Character.toCodePoint(c, d)
      } else {
        throw new Exception("malformed")
      }
    ys(j) = (0xf0 | ((uc >> 18))).toByte
    ys(j + 1) = (0x80 | ((uc >> 12) & 0x3f)).toByte
    ys(j + 2) = (0x80 | ((uc >>  6) & 0x3f)).toByte
    ys(j + 3) = (0x80 | (uc & 0x3f)).toByte
    i = i + 2 // 2 chars
    j = j + 4
  } else if (Character.isLowSurrogate(c)) {
    throw new Exception("malformed")
  } else {
    ys(j) = (0xe0 | (c >> 12)).toByte
    ys(j + 1) = (0x80 | ((c >> 6) & 0x3f)).toByte
    ys(j + 2) = (0x80 | (c & 0x3f)).toByte
    i = i + 1
    j = j + 3
  }
}
// check
println(new String(ys, 0, j, "UTF-8"))

Sorry for using Scala. If you have trouble converting this code to Java, I can rewrite it. As for performance, always measure on real data (with JMH, for example). This code is very similar to what you can see in the JDK [2] and Protobuf [3].
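For readers who want the Java version, here is one possible port of the Scala loop above. It is a sketch, not the answer author's own rewrite: the class name Utf8Encoder and the method encodeUtf8 are made up, the exception type is my choice, and wiping the scratch buffer is an addition in the spirit of the earlier password discussion.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class Utf8Encoder {
    // Hypothetical port of the Scala loop: encodes UTF-16 chars as UTF-8
    // without creating a String, CharBuffer, or ByteBuffer.
    static byte[] encodeUtf8(char[] xs) {
        byte[] ys = new byte[3 * xs.length]; // worst case: 3 bytes per char
        int i = 0, j = 0;                    // i indexes chars, j indexes bytes
        while (i < xs.length) {
            char c = xs[i];
            if (c < 0x80) {                              // 1 byte (ASCII)
                ys[j++] = (byte) c;
                i++;
            } else if (c < 0x800) {                      // 2 bytes
                ys[j++] = (byte) (0xc0 | (c >> 6));
                ys[j++] = (byte) (0x80 | (c & 0x3f));
                i++;
            } else if (Character.isHighSurrogate(c)) {   // 4 bytes, 2 chars
                if (xs.length - i < 2 || !Character.isLowSurrogate(xs[i + 1])) {
                    Arrays.fill(ys, (byte) 0);
                    throw new IllegalArgumentException("malformed surrogate pair at index " + i);
                }
                int uc = Character.toCodePoint(c, xs[i + 1]);
                ys[j++] = (byte) (0xf0 | (uc >> 18));
                ys[j++] = (byte) (0x80 | ((uc >> 12) & 0x3f));
                ys[j++] = (byte) (0x80 | ((uc >> 6) & 0x3f));
                ys[j++] = (byte) (0x80 | (uc & 0x3f));
                i += 2;
            } else if (Character.isLowSurrogate(c)) {    // low surrogate without a high one
                Arrays.fill(ys, (byte) 0);
                throw new IllegalArgumentException("unpaired low surrogate at index " + i);
            } else {                                     // 3 bytes
                ys[j++] = (byte) (0xe0 | (c >> 12));
                ys[j++] = (byte) (0x80 | ((c >> 6) & 0x3f));
                ys[j++] = (byte) (0x80 | (c & 0x3f));
                i++;
            }
        }
        byte[] result = Arrays.copyOf(ys, j); // trim to the bytes actually written
        Arrays.fill(ys, (byte) 0);            // wipe the scratch buffer
        return result;
    }

    public static void main(String[] args) {
        // Same sample text as the Scala snippet, written with Unicode escapes.
        char[] xs = "A \u00df \u20ac \u55e8 \ud834\udd1e \ud83d\ude42".toCharArray();
        byte[] ys = encodeUtf8(xs);
        // Cross-check against the JDK encoder.
        System.out.println(Arrays.equals(ys, new String(xs).getBytes(StandardCharsets.UTF_8)));
    }
}
```

Note one behavioral difference: this sketch throws on malformed surrogates, whereas String.getBytes silently substitutes a replacement byte.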

Question 4

Edit: Andrey's answer has since been updated, so the following no longer applies.

Andrey's answer (the highest voted at the time of writing) is slightly incorrect. I would have added this as a comment, but I don't have enough reputation.

In Andrey's answer:

char[] chars = {'c', 'h', 'a', 'r', 's'};
byte[] bytes = Charset.forName("UTF-8").encode(CharBuffer.wrap(chars)).array();

the call to array() may not return the desired value, for example:

char[] c = "aaaaaaaaaa".toCharArray();
System.out.println(Arrays.toString(Charset.forName("UTF-8").encode(CharBuffer.wrap(c)).array()));

Output:

[97, 97, 97, 97, 97, 97, 97, 97, 97, 97, 0]

As you can see, a trailing zero byte has been added: array() returns the whole backing array, which can be larger than the encoded content. To avoid this, use the following:

char[] c = "aaaaaaaaaa".toCharArray();
ByteBuffer bb = Charset.forName("UTF-8").encode(CharBuffer.wrap(c));
byte[] b = new byte[bb.remaining()];
bb.get(b);
System.out.println(Arrays.toString(b));

Output:

[97, 97, 97, 97, 97, 97, 97, 97, 97, 97]

Since the answer also alluded to password handling, it may be worth blanking out the array that backs the ByteBuffer (accessible via array()):

ByteBuffer bb = Charset.forName("UTF-8").encode(CharBuffer.wrap(c));
byte[] b = new byte[bb.remaining()];
bb.get(b);
blankOutByteArray(bb.array());
System.out.println(Arrays.toString(b));
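The snippet above calls blankOutByteArray without defining it. A minimal self-contained sketch follows, assuming the helper simply zero-fills the array (that body is my guess, not the answer author's code; the class name EncodeAndWipe and the hasArray() guard are also my additions):

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class EncodeAndWipe {
    // Assumed implementation of the helper named in the answer: overwrite with zeros.
    static void blankOutByteArray(byte[] array) {
        Arrays.fill(array, (byte) 0);
    }

    static byte[] toBytes(char[] c) {
        ByteBuffer bb = StandardCharsets.UTF_8.encode(CharBuffer.wrap(c));
        byte[] b = new byte[bb.remaining()]; // exactly the encoded length
        bb.get(b);                           // copy only the encoded bytes
        if (bb.hasArray()) {                 // array() is only legal for accessible heap buffers
            blankOutByteArray(bb.array());
        }
        return b;
    }

    public static void main(String[] args) {
        char[] c = "aaaaaaaaaa".toCharArray();
        byte[] b = toBytes(c);
        System.out.println(Arrays.toString(b)); // [97, 97, 97, 97, 97, 97, 97, 97, 97, 97]
        blankOutByteArray(b);                   // caller wipes its own copy when done
        Arrays.fill(c, '\u0000');
    }
}
```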

Question 5

private static byte[] charArrayToByteArray(char[] c_array) {
    byte[] b_array = new byte[c_array.length];
    for (int i = 0; i < c_array.length; i++) {
        // Keep only the low 8 bits of each char; lossless only for chars up to U+00FF.
        b_array[i] = (byte) (0xFF & (int) c_array[i]);
    }
    return b_array;
}
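A short check of what this per-char narrowing does in practice may be useful. The sketch below copies the method into a made-up class (NarrowingCastDemo) and compares it with the JDK's ISO-8859-1 encoder; the sample characters are arbitrary.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class NarrowingCastDemo {
    // Same logic as the method above: one byte per char, low 8 bits only.
    static byte[] charArrayToByteArray(char[] c_array) {
        byte[] b_array = new byte[c_array.length];
        for (int i = 0; i < c_array.length; i++) {
            b_array[i] = (byte) (0xFF & (int) c_array[i]);
        }
        return b_array;
    }

    public static void main(String[] args) {
        // For chars up to U+00FF the result matches ISO-8859-1 encoding.
        char[] latin = {'a', '\u00e9'};
        byte[] viaCharset = new String(latin).getBytes(StandardCharsets.ISO_8859_1);
        System.out.println(Arrays.equals(charArrayToByteArray(latin), viaCharset)); // true

        // Above U+00FF the high byte is dropped: U+20AC (euro sign) becomes 0xAC.
        char[] euro = {'\u20ac'};
        System.out.println(Arrays.toString(charArrayToByteArray(euro))); // [-84]
    }
}
```

So this approach is only safe when the input is known to be ASCII or Latin-1; for anything else an explicit charset encoder is the safer route.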

Question 6

You could create a method:

public byte[] toBytes(char[] data) {
    byte[] toRet = new byte[data.length];
    for (int i = 0; i < toRet.length; i++) {
        toRet[i] = (byte) data[i];
    }
    return toRet;
}

Hope this helps.