Find the highest-scoring matrix without property X

This challenge is partly an algorithms challenge, partly an optimization challenge, and partly simply a fastest-code challenge.

A cyclic matrix is fully specified by its first row r. Each remaining row is a cyclic permutation of r with offset equal to the row index. We allow cyclic matrices that are not square, so that they are simply missing some of their last rows. We do, however, always assume that the number of rows is no more than the number of columns. For example, consider the following 3 by 5 cyclic matrix.

10111
11011
11101
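
The construction can be sketched in a few lines of Python. This is only an illustration: the name cyclic_matrix is hypothetical, and the rotation direction (right by the row index) is inferred from the example above rather than stated in the text.

```python
def cyclic_matrix(first_row, rows):
    """Return the first `rows` rows of the cyclic matrix defined by `first_row`.

    Row i is `first_row` rotated right by i positions (assumed direction,
    chosen to match the 3 by 5 example). Rows are returned as strings.
    """
    m = len(first_row)
    if rows > m:
        raise ValueError("number of rows must not exceed number of columns")
    # first_row[m - i:] is the tail that wraps around to the front.
    return [first_row[m - i:] + first_row[:m - i] for i in range(rows)]
```

Calling cyclic_matrix("10111", 3) reproduces the three rows shown above.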

We say a matrix has property X if it contains two non-empty sets of columns with non-identical indices that have the same (vector) sum. The vector sum of two columns is simply their element-wise sum. That is, the sum of two columns containing x elements each is another column containing x elements.

The matrix above trivially has property X, since the first and last columns are the same. The identity matrix never has property X.

If we remove just the last column of the matrix above, we get an example that does not have property X and would give a score of 4/3.

1011
1101
1110
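
For small matrices the definition can be checked directly. The sketch below is a plain brute force over all non-empty column sets; the name has_property_x and the string-based matrix format are assumptions made for illustration, and it reads "non-identical indices" as "any two different column sets", which may overlap.

```python
from itertools import combinations


def has_property_x(matrix):
    """Brute-force check: do two different non-empty column sets share a vector sum?

    `matrix` is a list of equal-length strings of '0'/'1' characters.
    The running time is exponential in the number of columns.
    """
    rows = len(matrix)
    columns = [tuple(int(ch) for ch in col) for col in zip(*matrix)]
    seen = set()
    for size in range(1, len(columns) + 1):
        for subset in combinations(range(len(columns)), size):
            total = tuple(sum(columns[j][i] for j in subset) for i in range(rows))
            if total in seen:
                return True  # an earlier, different subset had the same sum
            seen.add(total)
    return False
```

With this reading, the 3 by 5 matrix above is reported as having property X, while the 3 by 4 matrix and the identity matrix are not.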

The task

The task is to write code to find the highest-scoring cyclic matrix whose entries are all 0 or 1 and which does not have property X.

Score

Your score will be the number of columns divided by the number of rows in your best-scoring matrix.

Tie Breaker

If two answers have the same score, the one submitted first wins.

In the (very) unlikely event that someone finds a method to get unlimited scores, the first valid proof of such a solution will be accepted. In the even more unlikely event that you can find a proof of optimality of a finite matrix, I will of course award the win too.

Hint

Getting a score of 12/8 is not too hard.

Languages and libraries

You can use any language that has a freely available compiler/interpreter/etc. for Linux, and any libraries that are also freely available for Linux.

Leading entries

36/19 by Peter Taylor (Java)
32/17 by Suboptimus Prime (C#)
21/12 by justhalf (Python 2)

code-challenge binary-matrix

— randomra

Ah, property X is on columns, not rows.

— Optimizer

As written, the 1 by 2 matrix 01 has property X because the set consisting of the first column has the same vector sum as the empty set. Perhaps you meant non-empty sets of columns? I think it's cleaner not to change it, though.

— xnor

The simplest reading of the rules is still that 01 has property X: (1) = (0) + (1). If you want to exclude that, you should say that the two sets of columns must be disjoint.

— Peter Taylor

This question will give a lot of insight into this problem (on how hard it is to check property X, which is NP-hard, unfortunately): mathoverflow.net/questions/157634/…

— justhalf

Currently we are just brute-forcing all 2^m column combinations to check property X. If we could somehow devise a "meet in the middle" scheme (see the "subset sum" problem), this could probably reduce that to m * 2^(m/2)...

— kennytm

Answers:

16/9 20/11 22/12 28/15 30/16 32/17 34/18 36/19 (Java)

This uses a number of ideas to reduce the search space and cost. See the revision history for more details on earlier versions of the code.

It's clear that wlog we can consider only circulant matrices in which the first row is a Lyndon word: if the word is non-prime (periodic), then two columns coincide and it must have property X, and otherwise we can rotate without affecting the score or property X.
Based on heuristics from the observed short winners, I'm now iterating through the Lyndon words starting with those at 50% density (i.e. the same number of 0 and 1) and working outwards; I use the algorithm described in "A Gray code for fixed-density necklaces and Lyndon words in constant amortized time", Sawada and Williams, Theoretical Computer Science 502 (2013): 46-54.
An empirical observation is that the values occur in pairs: each optimal Lyndon word that I've found scores the same as its reversal. So I get a factor-of-two speedup by considering only one half of each such pair.
My original code worked with BigInteger to give an exact test. I get a significant speed improvement, at the risk of false negatives, by operating modulo a large prime and keeping everything in primitives. The prime I've chosen is the largest one smaller than 2^57, as that allows multiplying by the base of my notional vector representation without overflowing.
I've stolen Suboptimus Prime's heuristic that it's possible to get quick rejections by considering subsets in increasing order of size. I've now merged that idea with the ternary subset meet-in-the-middle approach to test for colliding subsets. (Credit to KennyTM for suggesting trying to adapt the approach from the integer subset-sum problem; I think that xnor and I saw the way to do it pretty much simultaneously.) Rather than looking for two subsets which can include each column 0 or 1 times and have the same sum, we look for one subset which can include each column -1, 0, or 1 times and sum to zero. This significantly reduces the memory requirements.
There's an extra factor-of-two saving in memory requirements from observing that, since each element in {-1,0,1}^m has its negation also in {-1,0,1}^m, it's only necessary to store one of the two.
I also improve memory requirements and performance by using a custom hashmap implementation. Testing 36/19 requires storing 3^18 sums, and 3^18 longs is almost 3 GB without any overhead. I gave it 6 GB of heap because 4 GB wasn't enough; going any further (i.e. testing 38/20) within 8 GB of RAM would require further optimisation to store ints rather than longs. With 20 bits required to say which subset produces the sum, that would leave 12 bits plus the implicit bits from the bucket; I fear that there would be too many false collisions to get any hits.
Since the weight of the evidence suggests that we should look at 2n/(n+1), I'm speeding things up by testing only that.
There's some unnecessary but reassuring statistical output.
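
The ternary reformulation in the list above can be illustrated with a small Python sketch. It is not the Java code that follows: it leaves out the size-ordered insertion, the arithmetic modulo a prime, the negation trick and the custom hashmap, and the name has_property_x_ternary is hypothetical. It assumes the matrix has at least two columns, so that a non-trivial zero-sum combination corresponds to two different non-empty column sets.

```python
from itertools import product


def has_property_x_ternary(matrix):
    """Meet-in-the-middle search for a non-trivial {-1,0,1} column combination summing to zero.

    `matrix` is a list of equal-length strings of '0'/'1' characters.
    """
    rows = len(matrix)
    columns = [tuple(int(ch) for ch in col) for col in zip(*matrix)]
    half = len(columns) // 2
    left, right = columns[:half], columns[half:]
    zero = (0,) * rows

    def signed_sums(cols):
        # Yield (is_all_zero_choice, vector sum) for every coefficient choice.
        for coeffs in product((-1, 0, 1), repeat=len(cols)):
            total = tuple(sum(c * col[i] for c, col in zip(coeffs, cols))
                          for i in range(rows))
            yield not any(coeffs), total

    # Store the sums of all non-trivial combinations of the left half.
    left_sums = set()
    for trivial, total in signed_sums(left):
        if trivial:
            continue
        if total == zero:
            return True  # zero-sum combination inside the left half alone
        left_sums.add(total)

    # For each right-half combination, look for a left-half sum that cancels it.
    for trivial, total in signed_sums(right):
        if trivial:
            continue
        if total == zero or tuple(-x for x in total) in left_sums:
            return True
    return False
```

Only the 3^(m/2) sums of one half are stored, which is where the memory saving over enumerating all column subsets comes from.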

import java.util.*;

// Aiming to find a solution for (2n, n+1).
public class PPCG41021_QRTernary_FixedDensity {
    private static final int N = 36;
    private static int density;
    private static long start;
    private static long nextProgressReport;

    public static void main(String[] args) {
        start = System.nanoTime();
        nextProgressReport = start + 5 * 60 * 1000000000L;

        // 0, -1, 1, -2, 2, ...
        for (int i = 0; i < N - 1; i++) {
            int off = i >> 1;
            if ((i & 1) == 1) off = ~off;
            density = (N >> 1) + off;

            // Iterate over Lyndon words of length N and given density.
            for (int j = 0; j < N; j++) a[j] = j < N - density ? '0' : '1';
            c = 1;
            Bs[1] = N - density;
            Bt[1] = density;
            gen(N - density, density, 1);
            System.out.println("----");
        }

        System.out.println("Finished in " + (System.nanoTime() - start)/1000000 + " ms");
    }

    private static int c;
    private static int[] Bs = new int[N + 1], Bt = new int[N + 1];
    private static char[] a = new char[N];
    private static void gen(int s, int t, int r) {
        if (s > 0 && t > 0) {
            int j = oracle(s, t, r);
            for (int i = t - 1; i >= j; i--) {
                updateBlock(s, t, i);
                char tmp = a[s - 1]; a[s - 1] = a[s+t-i - 1]; a[s+t-i - 1] = tmp;
                gen(s-1, t-i, testSuffix(r) ? c-1 : r);
                tmp = a[s - 1]; a[s - 1] = a[s+t-i - 1]; a[s+t-i - 1] = tmp;
                restoreBlock(s, t, i);
            }
        }
        visit();
    }

    private static int oracle(int s, int t, int r) {
        int j = pseudoOracle(s, t, r);
        updateBlock(s, t, j);
        int p = testNecklace(testSuffix(r) ? c - 1 : r);
        restoreBlock(s, t, j);
        return p == N ? j : j + 1;
    }

    private static int pseudoOracle(int s, int t, int r) {
        if (s == 1) return t;
        if (c == 1) return s == 2 ? N / 2 : 1;
        if (s - 1 > Bs[r] + 1) return 0;
        if (s - 1 == Bs[r] + 1) return cmpPair(s-1, t, Bs[c-1]+1, Bt[c-1]) <= 0 ? 0 : 1;
        if (s - 1 == Bs[r]) {
            if (s == 2) return Math.max(t - Bt[r], (t+1) >> 1);
            return Math.max(t - Bt[r], (cmpPair(s-1, t, Bs[c-1] + 1, Bt[c-1]) <= 0) ? 0 : 1); 
        }
        if (s == Bs[r]) return t;
        throw new UnsupportedOperationException("Hit the case not covered by the paper or its accompanying code");
    }

    private static int testNecklace(int r) {
        if (density == 0 || density == N) return 1;
        int p = 0;
        for (int i = 0; i < c; i++) {
            if (r - i <= 0) r += c;
            if (cmpBlocks(c-i, r-i) < 0) return 0;
            if (cmpBlocks(c-i, r-1) > 0) return N;
            if (r < c) p += Bs[r-i] + Bt[r-i];
        }
        return p;
    }

    private static int cmpPair(int a1, int a2, int b1, int b2) {
        if (a1 < b1) return -1;
        if (a1 > b1) return 1;
        if (a2 < b2) return -1;
        if (a2 > b2) return 1;
        return 0;
    }

    private static int cmpBlocks(int i, int j) {
        return cmpPair(Bs[i], Bt[i], Bs[j], Bt[j]);
    }

    private static boolean testSuffix(int r) {
        for (int i = 0; i < r; i++) {
            if (c - 1 - i == r) return true;
            if (cmpBlocks(c-1-i, r-i) < 0) return false;
            if (cmpBlocks(c-1-i, r-1) > 0) return true;
        }
        return false;
    }

    private static void updateBlock(int s, int t, int i) {
        if (i == 0 && c > 1) {
            Bs[c-1]++;
            Bs[c] = s - 1;
        }
        else {
            Bs[c] = 1;
            Bt[c] = i;
            Bs[c+1] = s-1;
            Bt[c+1] = t-i;
            c++;
        }
    }

    private static void restoreBlock(int s, int t, int i) {
        if (i == 0 && (c > 0 || (Bs[1] != 1 || Bt[1] != 0))) {
            Bs[c-1]--;
            Bs[c] = s;
        }
        else {
            Bs[c-1] = s;
            Bt[c-1] = t;
            c--;
        }
    }

    private static long[] stats = new long[N/2+1];
    private static long visited = 0;
    private static void visit() {
        String word = new String(a);

        visited++;
        if (precedesReversal(word) && testTernary(word)) System.out.println(word + " after " + (System.nanoTime() - start)/1000000 + " ms");
        if (System.nanoTime() > nextProgressReport) {
            System.out.println("Progress: visited " + visited + "; stats " + Arrays.toString(stats) + " after " + (System.nanoTime() - start)/1000000 + " ms");
            nextProgressReport += 5 * 60 * 1000000000L;
        }
    }

    private static boolean precedesReversal(String w) {
        int n = w.length();
        StringBuilder rev = new StringBuilder(w);
        rev.reverse();
        rev.append(rev, 0, n);
        for (int i = 0; i < n; i++) {
            if (rev.substring(i, i + n).compareTo(w) < 0) return false;
        }
        return true;
    }

    private static boolean testTernary(String word) {
        int n = word.length();
        String rep = word + word;

        int base = 1;
        for (char ch : word.toCharArray()) base += ch & 1;

        // Operating base b for b up to 32 implies that we can multiply by b modulo p<2^57 without overflowing a long.
        // We're storing 3^(n/2) ~= 2^(0.8*n) sums, so while n < 35.6 we don't get *too* bad a probability of false reject.
        // (In fact the birthday paradox assumes independence, and our values aren't independent, so we're better off than that).
        long p = (1L << 57) - 13;
        long[] basis = new long[n];
        basis[0] = 1;
        for (int i = 1; i < basis.length; i++) basis[i] = (basis[i-1] * base) % p;

        int rows = n / 2 + 1;
        long[] colVals = new long[n];
        for (int col = 0; col < n; col++) {
            for (int row = 0; row < rows; row++) {
                colVals[col] = (colVals[col] + basis[row] * (rep.charAt(row + col) & 1)) % p;
            }
        }

        MapInt57Int27 map = new MapInt57Int27();
        // Special-case the initial insertion.
        int[] oldLens = new int[map.entries.length];
        int[] oldSupercounts = new int[1 << 10];
        {
            // count = 1
            for (int k = 0; k < n/2; k++) {
                int val = 1 << (25 - k);
                if (!map.put(colVals[k], val)) { stats[1]++; return false; }
                if (!map.put(colVals[k + n/2], val + (1 << 26))) { stats[1]++; return false; }
            }
        }
        final long keyMask = (1L << 37) - 1;
        for (int count = 2; count <= n/2; count++) {
            int[] lens = map.counts.clone();
            int[] supercounts = map.supercounts.clone();
            for (int sup = 0; sup < 1 << 10; sup++) {
                int unaccountedFor = supercounts[sup] - oldSupercounts[sup];
                for (int supi = 0; supi < 1 << 10 && unaccountedFor > 0; supi++) {
                    int i = (sup << 10) + supi;
                    int stop = lens[i];
                    unaccountedFor -= stop - oldLens[i];
                    for (int j = oldLens[i]; j < stop; j++) {
                        long existingKV = map.entries[i][j];
                        long existingKey = ((existingKV & keyMask) << 20) + i;
                        int existingVal = (int)(existingKV >>> 37);

                        // For each possible prepend...
                        int half = (existingVal >> 26) * n/2;
                        // We have 27 bits of key, of which the top marks the half, so 26 bits. That means there are 6 bits at the top which we need to not count.
                        int k = Integer.numberOfLeadingZeros(existingVal << 6) - 1;
                        while (k >= 0) {
                            int newVal = existingVal | (1 << (25 - k));
                            long pos = (existingKey + colVals[k + half]) % p;
                            if (pos << 1 > p) pos = p - pos;
                            if (pos == 0 || !map.put(pos, newVal)) { stats[count]++; return false; }
                            long neg = (p - existingKey + colVals[k + half]) % p;
                            if (neg << 1 > p) neg = p - neg;
                            if (neg == 0 || !map.put(neg, newVal)) { stats[count]++; return false; }
                            k--;
                        }
                    }
                }
            }
            oldLens = lens;
            oldSupercounts = supercounts;
        }

        stats[n/2]++;
        return true;
    }

    static class MapInt57Int27 {
        private long[][] entries;
        private int[] counts;
        private int[] supercounts;

        public MapInt57Int27() {
            entries = new long[1 << 20][];
            counts = new int[1 << 20];
            supercounts = new int[1 << 10];
        }

        public boolean put(long key, int val) {
            int bucket = (int)(key & (entries.length - 1));
            long insert = (key >>> 20) | (((long)val) << 37);
            final long mask = (1L << 37) - 1;

            long[] chain = entries[bucket];
            if (chain == null) {
                chain = new long[16];
                entries[bucket] = chain;
                chain[0] = insert;
                counts[bucket]++;
                supercounts[bucket >> 10]++;
                return true;
            }

            int stop = counts[bucket];
            for (int i = 0; i < stop; i++) {
                if ((chain[i] & mask) == (insert & mask)) {
                    return false;
                }
            }

            if (stop == chain.length) {
                long[] newChain = new long[chain.length < 512 ? chain.length << 1 : chain.length + 512];
                System.arraycopy(chain, 0, newChain, 0, chain.length);
                entries[bucket] = newChain;
                chain = newChain;
            }
            chain[stop] = insert;
            counts[bucket]++;
            supercounts[bucket >> 10]++;
            return true;
        }
    }
}

The first one found is

000001001010110001000101001111111111

and that's the only hit in 15 hours.

Smaller winners:

4/3:    0111                       (plus 8 different 8/6)
9/6:    001001011                  (and 5 others)
11/7:   00010100111                (and 3 others)
13/8:   0001001101011              (and 5 others)
15/9:   000010110110111            (and 21 others)
16/9:   0000101110111011           (and 1 other)
20/11:  00000101111011110111       (and others)
22/12:  0000001100110011101011     (and others)
24/13:  000000101011101011101011   (and others)
26/14:  00000001101110010011010111 (and others)
28/15:  0000000010000111100111010111 (and others)
30/16:  000000001011001110011010101111 (and probably others)
32/17:  00001100010010100100101011111111 (and others)
34/18:  0000101000100101000110010111111111 (and others)

— Peter Taylor

This is a good improvement. It seems that using Lyndon words means you only need to check about 2^n/n binary strings for the first row, instead of 2^n.
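
The 2^n/n estimate can be checked with a short generator. The sketch below uses the well-known successor-style algorithm for Lyndon words of bounded length (usually attributed to Duval), not the fixed-density Gray code from the answer; the name lyndon_words is an assumption.

```python
def lyndon_words(n, k=2):
    """Yield all Lyndon words of length exactly n over the alphabet 0..k-1, as strings."""
    w = [-1]
    while w:
        w[-1] += 1                      # w is now a Lyndon word of length <= n
        if len(w) == n:
            yield "".join(map(str, w))
        m = len(w)
        while len(w) < n:               # extend periodically to length n
            w.append(w[-m])
        while w and w[-1] == k - 1:     # strip trailing maximal symbols
            w.pop()
```

For n = 12 this yields 335 words, against 2^12/12 ≈ 341 and 2^12 = 4096 unrestricted first rows.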

Since you are using each digit of a BigInteger as a matrix cell, won't there be a wrong answer when n > 10?

— kennytm

@KennyTM, note that the second parameter is the radix. There's a small bug: I should use n rather than rows, although it's fail-safe in the sense that it would discard valid solutions rather than accept invalid ones. It also doesn't affect the results.

— Peter Taylor
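
The point about the radix can be made concrete: if each column is written as a number whose digits are the column entries, and the base exceeds the largest possible digit sum in any row, then adding columns never produces a carry, so integer sums collide exactly when vector sums do. The Python sketch below assumes exact integers (as with BigInteger) and a base of one more than the largest number of ones in a row; encode_columns and has_property_x_encoded are hypothetical names.

```python
def encode_columns(matrix):
    """Encode each column as an integer whose base-`base` digits are the column entries."""
    rows = len(matrix)
    cols = len(matrix[0])
    # No row can contribute more than its number of ones to any subset sum,
    # so this base guarantees that digit sums never carry.
    base = 1 + max(row.count("1") for row in matrix)
    values = [sum(int(matrix[r][c]) * base ** r for r in range(rows))
              for c in range(cols)]
    return base, values


def has_property_x_encoded(matrix):
    """Check for two different non-empty column sets with equal sums via integer sums."""
    _, values = encode_columns(matrix)
    sums = [0]                          # index 0 holds the sum of the empty set
    for v in values:
        sums += [s + v for s in sums]   # every old subset, with and without v
    non_empty = sums[1:]
    return len(set(non_empty)) < len(non_empty)
```

For the 3 by 4 example from the question the base is 4 and the columns encode to 21, 20, 17 and 5.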

I think we're practically limited to this score, since checking property X is very time-consuming, unless we find another equivalent condition that can be evaluated faster. That's why I was so eager to see that "non-prime" implies property X =D

— justhalf

@SuboptimusPrime, I found it at people.math.sfu.ca/~kya17/teaching/math343/16-343.pdf and fixed a bug. Interestingly, the algorithm I'm now using to iterate through Lyndon words belongs to a class of related algorithms which also handles k-of-n subsets, so I might be able to refactor and share some code.

— Peter Taylor

Python 2 - 21/12

In the process of proving that a 2-(3/n) always exists for any n

Inspired by this question, I used a De Bruijn sequence to brute-force the possible matrices. After brute-forcing n=6,7,8,9,10, I found a pattern: the highest-scoring solution always has the shape (n, 2n-3).
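
As background, the defining property of a binary De Bruijn sequence of order n is that every length-n binary string occurs exactly once as a cyclic window. The Python 3 sketch below uses the same standard recursive construction as the code further down and adds a hypothetical helper, cyclic_windows, to check that property; it is illustrative only.

```python
def de_bruijn(k, n):
    """De Bruijn sequence over the alphabet 0..k-1 for subsequences of length n."""
    a = [0] * (k * n)
    sequence = []

    def db(t, p):
        if t > n:
            if n % p == 0:
                sequence.extend(a[1:p + 1])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)

    db(1, 1)
    return sequence


def cyclic_windows(seq, n):
    """Return all length-n windows of `seq`, wrapping around at the end."""
    doubled = seq + seq[:n - 1]
    return [tuple(doubled[i:i + n]) for i in range(len(seq))]
```

For k = 2 and n = 4 the sequence has length 16 and its 16 cyclic windows are all distinct.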

So I created another method to brute-force just that matrix shape, using multiprocessing to speed things up, since this task is highly distributable. On a 16-core Ubuntu machine, it can find a solution for n=12 in around 4 minutes:

Trying (0, 254)
Trying (254, 509)
Trying (509, 764)
Trying (764, 1018)
Trying (1018, 1273)
Trying (1273, 1528)
Trying (1528, 1782)
Trying (1782, 2037)
Trying (2037, 2292)
Trying (2292, 2546)
Trying (2546, 2801)
Trying (2801, 3056)
Trying (3056, 3310)
Trying (3820, 4075)
Trying (3565, 3820)
Trying (3310, 3565)
(1625, 1646)
[[0 0 0 1 0 0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 0]
 [0 0 1 0 0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 0 0]
 [0 1 0 0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 0 0 0]
 [1 0 0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 0 0 0 0]
 [0 0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 0 0 0 0 1]
 [0 1 0 1 1 1 1 0 0 0 1 0 0 1 1 0 0 0 0 1 0]
 [1 0 1 1 1 1 0 0 0 1 0 0 1 1 0 0 0 0 1 0 0]
 [0 1 1 1 1 0 0 0 1 0 0 1 1 0 0 0 0 1 0 0 1]
 [1 1 1 1 0 0 0 1 0 0 1 1 0 0 0 0 1 0 0 1 0]
 [1 1 1 0 0 0 1 0 0 1 1 0 0 0 0 1 0 0 1 0 1]
 [1 1 0 0 0 1 0 0 1 1 0 0 0 0 1 0 0 1 0 1 1]
 [1 0 0 0 1 0 0 1 1 0 0 0 0 1 0 0 1 0 1 1 1]]
(12, 21)
Score: 1.7500

real 4m9.121s
user 42m47.472s
sys 0m5.780s

The bulk of the computation goes into checking property X, which requires checking all subsets (there are 2^(2n-3) subsets).

Note that I rotate the first row to the left, not to the right as in the question. But these are equivalent, since you can just reverse the whole matrix. =)
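
One way to read the remark about reversing: mirroring a left-rotation matrix column-wise gives the right-rotation matrix of the reversed first row, and reordering columns cannot change which column sets have equal sums. The Python sketch below illustrates this; rotations and mirror are hypothetical helper names, and rows are represented as strings.

```python
def rotations(first_row, rows, direction):
    """Return `rows` rows, where row i is `first_row` rotated by i in `direction`."""
    m = len(first_row)
    step = 1 if direction == "left" else -1
    # (step * i) % m is the index at which the rotated row starts.
    return [first_row[(step * i) % m:] + first_row[:(step * i) % m]
            for i in range(rows)]


def mirror(matrix):
    """Reverse the column order of a matrix given as a list of strings."""
    return [row[::-1] for row in matrix]
```

For any first row w, mirror(rotations(w, rows, "left")) equals rotations(w[::-1], rows, "right").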

The code:

import math
import numpy as np
from itertools import combinations
from multiprocessing import Process, Queue, cpu_count

def de_bruijn(k, n):
    """
    De Bruijn sequence for alphabet k
    and subsequences of length n.
    """
    alphabet = list(range(k))
    a = [0] * k * n
    sequence = []
    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    sequence.append(a[j])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return sequence

def generate_cyclic_matrix(seq, n):
    result = []
    for i in range(n):
        result.append(seq[i:]+seq[:i])
    return np.array(result)

def generate_cyclic_matrix_without_property_x(n=3, n_jobs=-1):
    seq = de_bruijn(2,n)
    seq = seq + seq[:n/2]
    max_idx = len(seq)
    max_score = 1
    max_matrix = np.array([[]])
    max_ij = (0,0)
    workers = []
    queue = Queue()
    if n_jobs < 0:
        n_jobs += cpu_count()+1
    for i in range(n_jobs):
        worker = Process(target=worker_function, args=(seq,i*(2**n-2*n+3)/n_jobs, (i+1)*(2**n-2*n+3)/n_jobs, n, queue))
        workers.append(worker)
        worker.start()
    (result, max_ij) = queue.get()
    for worker in workers:
        worker.terminate()
    return (result, max_ij)

def worker_function(seq,min_idx,max_idx,n,queue):
    print 'Trying (%d, %d)' % (min_idx, max_idx)
    for i in range(min_idx, max_idx):
        j = i+2*n-3
        result = generate_cyclic_matrix(seq[i:j], n)
        if has_property_x(result):
            continue
        else:
            queue.put( (result, (i,j)) )
            return

def has_property_x(mat):
    vecs = zip(*mat)
    vector_sums = set()
    for i in range(1, len(vecs)+1):
        for combination in combinations(vecs, i):
            vector_sum = tuple(sum(combination, np.array([0]*len(mat))))
            if vector_sum in vector_sums:
                return True
            else:
                vector_sums.add(vector_sum)
    return False

def main():
    import sys
    n = int(sys.argv[1])
    if len(sys.argv) > 2:
        n_jobs = int(sys.argv[2])
    else:
        n_jobs = -1
    (matrix, ij) = generate_cyclic_matrix_without_property_x(n, n_jobs)
    print ij
    print matrix
    print matrix.shape
    print 'Score: %.4f' % (float(matrix.shape[1])/matrix.shape[0])

if __name__ == '__main__':
    main()

Old answer, for reference

The optimal solution so far (n=10):

(855, 872)
[[1 1 0 1 0 1 0 0 1 1 1 1 0 1 1 1 0]
 [1 0 1 0 1 0 0 1 1 1 1 0 1 1 1 0 1]
 [0 1 0 1 0 0 1 1 1 1 0 1 1 1 0 1 1]
 [1 0 1 0 0 1 1 1 1 0 1 1 1 0 1 1 0]
 [0 1 0 0 1 1 1 1 0 1 1 1 0 1 1 0 1]
 [1 0 0 1 1 1 1 0 1 1 1 0 1 1 0 1 0]
 [0 0 1 1 1 1 0 1 1 1 0 1 1 0 1 0 1]
 [0 1 1 1 1 0 1 1 1 0 1 1 0 1 0 1 0]
 [1 1 1 1 0 1 1 1 0 1 1 0 1 0 1 0 0]
 [1 1 1 0 1 1 1 0 1 1 0 1 0 1 0 0 1]]
(10, 17)
Score: 1.7000

For n=7:

(86, 97)
[[0 1 1 1 0 1 0 0 1 1 1]
 [1 1 1 0 1 0 0 1 1 1 0]
 [1 1 0 1 0 0 1 1 1 0 1]
 [1 0 1 0 0 1 1 1 0 1 1]
 [0 1 0 0 1 1 1 0 1 1 1]
 [1 0 0 1 1 1 0 1 1 1 0]
 [0 0 1 1 1 0 1 1 1 0 1]]
(7, 11)
Score: 1.5714

A solution with the shape described by the OP (n=8):

(227, 239)
[[0 1 0 1 1 1 1 1 0 1 1 0]
 [1 0 1 1 1 1 1 0 1 1 0 0]
 [0 1 1 1 1 1 0 1 1 0 0 1]
 [1 1 1 1 1 0 1 1 0 0 1 0]
 [1 1 1 1 0 1 1 0 0 1 0 1]
 [1 1 1 0 1 1 0 0 1 0 1 1]
 [1 1 0 1 1 0 0 1 0 1 1 1]
 [1 0 1 1 0 0 1 0 1 1 1 1]]
(8, 12)
Score: 1.5000

But a better one (n=8):

(95, 108)
[[0 1 1 0 0 1 0 0 0 1 1 0 1]
 [1 1 0 0 1 0 0 0 1 1 0 1 0]
 [1 0 0 1 0 0 0 1 1 0 1 0 1]
 [0 0 1 0 0 0 1 1 0 1 0 1 1]
 [0 1 0 0 0 1 1 0 1 0 1 1 0]
 [1 0 0 0 1 1 0 1 0 1 1 0 0]
 [0 0 0 1 1 0 1 0 1 1 0 0 1]
 [0 0 1 1 0 1 0 1 1 0 0 1 0]]
(8, 13)
Score: 1.6250

It also found another optimal solution at n=9:

(103, 118)
[[0 1 0 1 1 1 0 0 0 0 1 1 0 0 1]
 [1 0 1 1 1 0 0 0 0 1 1 0 0 1 0]
 [0 1 1 1 0 0 0 0 1 1 0 0 1 0 1]
 [1 1 1 0 0 0 0 1 1 0 0 1 0 1 0]
 [1 1 0 0 0 0 1 1 0 0 1 0 1 0 1]
 [1 0 0 0 0 1 1 0 0 1 0 1 0 1 1]
 [0 0 0 0 1 1 0 0 1 0 1 0 1 1 1]
 [0 0 0 1 1 0 0 1 0 1 0 1 1 1 0]
 [0 0 1 1 0 0 1 0 1 0 1 1 1 0 0]]
(9, 15)
Score: 1.6667

The code is as follows. It's just straight brute force, but at least it can find something better than the OP's claim =)

import numpy as np
from itertools import combinations

def de_bruijn(k, n):
    """
    De Bruijn sequence for alphabet k
    and subsequences of length n.
    """
    alphabet = list(range(k))
    a = [0] * k * n
    sequence = []
    def db(t, p):
        if t > n:
            if n % p == 0:
                for j in range(1, p + 1):
                    sequence.append(a[j])
        else:
            a[t] = a[t - p]
            db(t + 1, p)
            for j in range(a[t - p] + 1, k):
                a[t] = j
                db(t + 1, t)
    db(1, 1)
    return sequence

def generate_cyclic_matrix(seq, n):
    result = []
    for i in range(n):
        result.append(seq[i:]+seq[:i])
    return np.array(result)

def generate_cyclic_matrix_without_property_x(n=3):
    seq = de_bruijn(2,n)
    max_score = 0
    max_matrix = []
    max_ij = (0,0)
    for i in range(2**n+1):
        for j in range(i+n, 2**n+1):
            score = float(j-i)/n
            if score <= max_score:
                continue
            result = generate_cyclic_matrix(seq[i:j], n)
            if has_property_x(result):
                continue
            else:
                if score > max_score:
                    max_score = score
                    max_matrix = result
                    max_ij = (i,j)
    return (max_matrix, max_ij)

def has_property_x(mat):
    vecs = zip(*mat)
    vector_sums = set()
    for i in range(1, len(vecs)):
        for combination in combinations(vecs, i):
            vector_sum = tuple(sum(combination, np.array([0]*len(mat))))
            if vector_sum in vector_sums:
                return True
            else:
                vector_sums.add(vector_sum)
    return False

def main():
    import sys
    n = int(sys.argv[1])
    (matrix, ij) = generate_cyclic_matrix_without_property_x(n)
    print ij
    print matrix
    print matrix.shape
    print 'Score: %.4f' % (float(matrix.shape[1])/matrix.shape[0])

if __name__ == '__main__':
    main()

— justhalf

A great start :)

@Lembik Now I can beat almost anyone (limited by computation time) claiming a score below 2. =)

— justhalf

In that case, can you beat 19/10?

@Lembik I don't think I can. It requires n >= 31, which implies I would need to check up to 2^(2n-3) = 2^59 combinations of 31-dimensional vectors. It won't finish.

— justhalf

Can you prove that you can always get a matrix of n*(2n-3)?

— xnor

24/13 26/14 28/15 30/16 32/17 (C#)

Edit: Deleted outdated information from my answer. I'm using mostly the same algorithm as Peter Taylor (Edit: it looks like he is using a better algorithm now), although I've added some optimizations of my own:

I've implemented the "meet in the middle" strategy of searching for column sets with the same vector sum (suggested by this comment from KennyTM). This strategy improved memory usage a lot, but it's rather slow, so I've added the HasPropertyXFast function, which quickly checks whether there are small sets with equal sums before using the "meet in the middle" approach.
While iterating through column sets in the HasPropertyXFast function, I start by checking column sets with 1 column, then with 2, 3 and so on. The function returns as soon as the first collision of column sums is found. In practice this means that I usually have to check just a few hundred or a few thousand column sets rather than millions.
I'm using long variables to store and compare entire columns and their vector sums. This approach is at least an order of magnitude faster than comparing columns as arrays.
I've added my own hashset implementation, optimized for the long data type and for my usage patterns.
I'm reusing the same 3 hashsets for the entire lifetime of the application to reduce the number of memory allocations and improve performance.
Multithreading support.
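
The size-ordered quick check and the packing of columns into single integers can be sketched as follows. This is not the HasPropertyXFast function from the C# code, only a Python illustration of the idea; pack_columns, quick_reject, the 6-bit field width and the max_size cutoff are all assumptions, and Python integers stand in for the long variables.

```python
from itertools import combinations


def pack_columns(matrix, bits=6):
    """Pack each column into one integer, using `bits` bits per row.

    With fewer than 2**bits columns, no per-row sum can overflow its field,
    so equal integers mean equal vector sums.
    """
    rows = len(matrix)
    cols = len(matrix[0])
    if cols >= 1 << bits:
        raise ValueError("too many columns for the chosen field width")
    return [sum(int(matrix[r][c]) << (bits * r) for r in range(rows))
            for c in range(cols)]


def quick_reject(matrix, max_size=3):
    """Return the smallest set size at which a sum collision appears, else None.

    Column sets are examined in increasing order of size, stopping at the
    first collision; None means nothing was found up to `max_size`.
    """
    packed = pack_columns(matrix)
    seen = set()
    for size in range(1, max_size + 1):
        for subset in combinations(packed, size):
            total = sum(subset)
            if total in seen:
                return size
            seen.add(total)
    return None
```

A None result only means that no collision involves sets of up to max_size columns, so a full check is still needed afterwards.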

Program output:

00000000000111011101010010011111
10000000000011101110101001001111
11000000000001110111010100100111
11100000000000111011101010010011
11110000000000011101110101001001
11111000000000001110111010100100
01111100000000000111011101010010
00111110000000000011101110101001
10011111000000000001110111010100
01001111100000000000111011101010
00100111110000000000011101110101
10010011111000000000001110111010
01001001111100000000000111011101
10100100111110000000000011101110
01010010011111000000000001110111
10101001001111100000000000111011
11010100100111110000000000011101
Score: 32/17 = 1,88235294117647
Time elapsed: 02:11:05.9791250

Code:

using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

class Program
{
    const int MaxWidth = 32;
    const int MaxHeight = 17;

    static object _lyndonWordLock = new object();

    static void Main(string[] args)
    {
        Stopwatch sw = Stopwatch.StartNew();
        double maxScore = 0;
        const int minHeight = 17; // 1
        for (int height = minHeight; height <= MaxHeight; height++)
        {
            Console.WriteLine("Row count = " + height);
            Console.WriteLine("Time elapsed: " + sw.Elapsed + "\r\n");

            int minWidth = Math.Max(height, (int)(height * maxScore) + 1);
            for (int width = minWidth; width <= MaxWidth; width++)
            {
#if MULTITHREADING
                int[,] matrix = FindMatrixParallel(width, height);
#else
                int[,] matrix = FindMatrix(width, height);
#endif
                if (matrix != null)
                {
                    PrintMatrix(matrix);
                    Console.WriteLine("Time elapsed: " + sw.Elapsed + "\r\n");
                    maxScore = (double)width / height;
                }
                else
                    break;
            }
        }
    }

#if MULTITHREADING
    static int[,] FindMatrixParallel(int width, int height)
    {
        _lyndonWord = 0;
        _stopSearch = false;

        int threadCount = Environment.ProcessorCount;
        Task<int[,]>[] tasks = new Task<int[,]>[threadCount];
        for (int i = 0; i < threadCount; i++)
            tasks[i] = Task<int[,]>.Run(() => FindMatrix(width, height));

        int index = Task.WaitAny(tasks);
        if (tasks[index].Result != null)
            _stopSearch = true;

        Task.WaitAll(tasks);
        foreach (Task<int[,]> task in tasks)
            if (task.Result != null)
                return task.Result;

        return null;
    }

    static volatile bool _stopSearch;
#endif

    static int[,] FindMatrix(int width, int height)
    {
#if MULTITHREADING
        _columnSums = new LongSet();
        _left = new LongSet();
        _right = new LongSet();
#endif

        foreach (long rowTemplate in GetLyndonWords(width))
        {
            int[,] matrix = new int[width, height];
            for (int x = 0; x < width; x++)
            {
                int cellValue = (int)(rowTemplate >> (width - 1 - x)) % 2;
                for (int y = 0; y < height; y++)
                    matrix[(x + y) % width, y] = cellValue;
            }

            if (!HasPropertyX(matrix))
                return matrix;

#if MULTITHREADING
            if (_stopSearch)
                return null;
#endif
        }

        return null;
    }

#if MULTITHREADING
    static long _lyndonWord;
#endif

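    // Enumerates candidate first rows as bit patterns produced by NextLyndonWord,
    // keeping only odd values that pass PrecedesReversal.
    static IEnumerable<long> GetLyndonWords(int length)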
    {
        long lyndonWord = 0;
        long max = (1L << (length - 1)) - 1;
        while (lyndonWord <= max)
        {
            if ((lyndonWord % 2 != 0) && PrecedesReversal(lyndonWord, length))
                yield return lyndonWord;

#if MULTITHREADING
            lock (_lyndonWordLock)
            {
                if (_lyndonWord <= max)
                    _lyndonWord = NextLyndonWord(_lyndonWord, length);
                else
                    yield break;

                lyndonWord = _lyndonWord;
            }
#else
            lyndonWord = NextLyndonWord(lyndonWord, length);
#endif
        }
    }

    static readonly int[] _lookup =
    {
        32, 0, 1, 26, 2, 23, 27, 0, 3, 16, 24, 30, 28, 11, 0, 13, 4, 7, 17,
        0, 25, 22, 31, 15, 29, 10, 12, 6, 0, 21, 14, 9, 5, 20, 8, 19, 18
    };

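    // i & -i isolates the lowest set bit. The powers of two up to 2^31 have
    // distinct residues modulo 37, so the residue indexes _lookup directly.
    // Returns 32 when i is 0.
    static int NumberOfTrailingZeros(uint i)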
    {
        return _lookup[(i & -i) % 37];
    }

    static long NextLyndonWord(long w, int length)
    {
        if (w == 0)
            return 1;

        int currentLength = length - NumberOfTrailingZeros((uint)w);
        while (currentLength < length)
        {
            w += w >> currentLength;
            currentLength *= 2;
        }

        w++;

        return w;
    }

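    // Returns true if the word is not greater than any rotation of its
    // bit-reversed form, so only one row of each mirror-image pair is tested.
    private static bool PrecedesReversal(long lyndonWord, int length)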
    {
        int shift = length - 1;

        long reverse = 0;
        for (int i = 0; i < length; i++)
        {
            long bit = (lyndonWord >> i) % 2;
            reverse |= bit << (shift - i);
        }

        for (int i = 0; i < length; i++)
        {
            if (reverse < lyndonWord)
                return false;

            long bit = reverse % 2;
            reverse /= 2;
            reverse += bit << shift;
        }

        return true;
    }

#if MULTITHREADING
    [ThreadStatic]
#endif
    static LongSet _left = new LongSet();
#if MULTITHREADING
    [ThreadStatic]
#endif
    static LongSet _right = new LongSet();

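    // Tries the cheap small-subset check first and falls back to the full
    // meet-in-the-middle test only when that finds no collision.
    static bool HasPropertyX(int[,] matrix)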
    {
        long[] matrixColumns = GetMatrixColumns(matrix);
        if (matrixColumns.Length == 1)
            return false;

        return HasPropertyXFast(matrixColumns) || MeetInTheMiddle(matrixColumns);
    }

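    // Splits the columns into two halves, builds the signed sums of each half,
    // and reports a collision if any sum occurs in both halves.
    static bool MeetInTheMiddle(long[] matrixColumns)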
    {
        long[] leftColumns = matrixColumns.Take(matrixColumns.Length / 2).ToArray();
        long[] rightColumns = matrixColumns.Skip(matrixColumns.Length / 2).ToArray();

        if (PrepareHashSet(leftColumns, _left) || PrepareHashSet(rightColumns, _right))
            return true;

        foreach (long columnSum in _left.GetValues())
            if (_right.Contains(columnSum))
                return true;

        return false;
    }

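    // Adds every combination of the given columns with coefficients -1, 0 or +1
    // (not all zero) to 'sums', at most 3^n values, and returns true as soon as
    // a value repeats.
    static bool PrepareHashSet(long[] columns, LongSet sums)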
    {
        int setSize = (int)System.Numerics.BigInteger.Pow(3, columns.Length);
        sums.Reset(setSize, setSize);
        foreach (long column in columns)
        {
            foreach (long sum in sums.GetValues())
                if (!sums.Add(sum + column) || !sums.Add(sum - column))
                    return true;

            if (!sums.Add(column) || !sums.Add(-column))
                return true;
        }

        return false;
    }

#if MULTITHREADING
    [ThreadStatic]
#endif
    static LongSet _columnSums = new LongSet();

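    // Looks for a repeated sum among all subsets of at most width / 3 columns.
    // Subsets of each size are visited so that consecutive subsets differ by one
    // swapped column, which lets the sum be updated with two additions.
    static bool HasPropertyXFast(long[] matrixColumns)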
    {
        int width = matrixColumns.Length;

        int maxColumnCount = width / 3;
        _columnSums.Reset(width, SumOfBinomialCoefficients(width, maxColumnCount));

        int resetBit, setBit;
        for (int k = 1; k <= maxColumnCount; k++)
        {
            uint columnMask = (1u << k) - 1;
            long sum = 0;
            for (int i = 0; i < k; i++)
                sum += matrixColumns[i];

            while (true)
            {
                if (!_columnSums.Add(sum))
                    return true;
                if (!NextColumnMask(columnMask, k, width, out resetBit, out setBit))
                    break;
                columnMask ^= (1u << resetBit) ^ (1u << setBit);
                sum = sum - matrixColumns[resetBit] + matrixColumns[setBit];
            }
        }

        return false;
    }

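    // stolen from Peter Taylor
    // Computes the single swap (clear resetBit, set setBit) that turns the
    // current k-subset mask into the next one; returns false when there is no
    // next subset.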
    static bool NextColumnMask(uint mask, int k, int n, out int resetBit, out int setBit)
    {
        int gap = NumberOfTrailingZeros(~mask);
        int next = 1 + NumberOfTrailingZeros(mask & (mask + 1));

        if (((k - gap) & 1) == 0)
        {
            if (gap == 0)
            {
                resetBit = next - 1;
                setBit = next - 2;
            }
            else if (gap == 1)
            {
                resetBit = 0;
                setBit = 1;
            }
            else
            {
                resetBit = gap - 2;
                setBit = gap;
            }
        }
        else
        {
            if (next == n)
            {
                resetBit = 0;
                setBit = 0;
                return false;
            }

            if ((mask & (1 << next)) == 0)
            {
                if (gap == 0)
                {
                    resetBit = next - 1;
                    setBit = next;
                }
                else
                {
                    resetBit = gap - 1;
                    setBit = next;
                }
            }
            else
            {
                resetBit = next;
                setBit = gap;
            }
        }

        return true;
    }

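    // Encodes each column as a base-13 number, so that adding two columns as
    // vectors becomes a single addition of longs.
    static long[] GetMatrixColumns(int[,] matrix)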
    {
        int width = matrix.GetLength(0);
        int height = matrix.GetLength(1);

        long[] result = new long[width];
        for (int x = 0; x < width; x++)
        {
            long column = 0;
            for (int y = 0; y < height; y++)
            {
                column *= 13;
                if (matrix[x, y] == 1)
                    column++;
            }

            result[x] = column;
        }

        return result;
    }

    static int SumOfBinomialCoefficients(int n, int k)
    {
        int result = 0;
        for (int i = 0; i <= k; i++)
            result += BinomialCoefficient(n, i);
        return result;
    }

    static int BinomialCoefficient(int n, int k)
    {
        long result = 1;
        for (int i = n - k + 1; i <= n; i++)
            result *= i;
        for (int i = 2; i <= k; i++)
            result /= i;
        return (int)result;
    }

    static void PrintMatrix(int[,] matrix)
    {
        int width = matrix.GetLength(0);
        int height = matrix.GetLength(1);

        for (int y = 0; y < height; y++)
        {
            for (int x = 0; x < width; x++)
                Console.Write(matrix[x, y]);
            Console.WriteLine();
        }

        Console.WriteLine("Score: {0}/{1} = {2}", width, height, (double)width / height);
    }
}


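// Hash set of long values with chained buckets kept in preallocated arrays,
// so that Reset can reuse the same memory for every matrix.
class LongSet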
{
    private static readonly int[] primes =
    {
        17, 37, 67, 89, 113, 149, 191, 239, 307, 389, 487, 613, 769, 967, 1213, 1523, 1907,
        2389, 2999, 3761, 4703, 5879, 7349, 9187, 11489, 14369, 17971, 22469, 28087, 35111,
        43889, 54869, 68597, 85751, 107197, 133999, 167521, 209431, 261791, 327247, 409063,
        511333, 639167, 798961, 998717, 1248407, 1560511, 1950643, 2438309, 3047909,
        809891, 4762367, 5952959, 7441219, 9301529, 11626913, 14533661, 18167089, 22708867,
        28386089, 35482627, 44353297, 55441637, 69302071, 86627603, 108284507, 135355669,
        169194593, 211493263, 264366593, 330458263, 413072843, 516341057, 645426329,
        806782913, 1008478649, 1260598321
    };

    private int[] _buckets;
    private int[] _nextItemIndexes;
    private long[] _items;
    private int _count;
    private int _minCapacity;
    private int _maxCapacity;
    private int _currentCapacity;

    public LongSet()
    {
        Initialize(0, 0);
    }

    private int GetPrime(int capacity)
    {
        foreach (int prime in primes)
            if (prime >= capacity)
                return prime;

        return int.MaxValue;
    }

    public void Reset(int minCapacity, int maxCapacity)
    {
        if (maxCapacity > _maxCapacity)
            Initialize(minCapacity, maxCapacity);
        else
            ClearBuckets();
    }

    private void Initialize(int minCapacity, int maxCapacity)
    {
        _minCapacity = GetPrime(minCapacity);
        _maxCapacity = GetPrime(maxCapacity);
        _currentCapacity = _minCapacity;

        _buckets = new int[_maxCapacity];
        _nextItemIndexes = new int[_maxCapacity];
        _items = new long[_maxCapacity];
        _count = 0;
    }

    private void ClearBuckets()
    {
        Array.Clear(_buckets, 0, _currentCapacity);
        _count = 0;
        _currentCapacity = _minCapacity;
    }

    public bool Add(long value)
    {
        int bucket = (int)((ulong)value % (ulong)_currentCapacity);
        for (int i = _buckets[bucket] - 1; i >= 0; i = _nextItemIndexes[i])
            if (_items[i] == value)
                return false;

        if (_count == _currentCapacity)
        {
            Grow();
            bucket = (int)((ulong)value % (ulong)_currentCapacity);
        }

        int index = _count;
        _items[index] = value;
        _nextItemIndexes[index] = _buckets[bucket] - 1;
        _buckets[bucket] = index + 1;
        _count++;

        return true;
    }

    private void Grow()
    {
        Array.Clear(_buckets, 0, _currentCapacity);

        const int growthFactor = 8;
        int newCapacity = GetPrime(_currentCapacity * growthFactor);
        if (newCapacity > _maxCapacity)
            newCapacity = _maxCapacity;
        _currentCapacity = newCapacity;

        for (int i = 0; i < _count; i++)
        {
            int bucket = (int)((ulong)_items[i] % (ulong)newCapacity);
            _nextItemIndexes[i] = _buckets[bucket] - 1;
            _buckets[bucket] = i + 1;
        }
    }

    public bool Contains(long value)
    {
        // Use the same modulus as Add and Grow, which hash by _currentCapacity.
        int bucket = (int)((ulong)value % (ulong)_currentCapacity);
        for (int i = _buckets[bucket] - 1; i >= 0; i = _nextItemIndexes[i])
            if (_items[i] == value)
                return true;

        return false;
    }

    public IReadOnlyList<long> GetValues()
    {
        return new ArraySegment<long>(_items, 0, _count);
    }
}

Configuration file:

<?xml version="1.0" encoding="utf-8" ?>
<configuration>
  <runtime>
    <gcAllowVeryLargeObjects enabled="true" />
  </runtime>
</configuration>

— Suboptimus Prime

In some respects you seem to have pessimised rather than optimised. The only thing that really looks like an optimisation is allowing bits to clash by using ulong and letting the shift wrap around instead of using BigInteger.

— Peter Taylor,

@PeterTaylor The most important optimization is in the HasPropertyX function. It returns as soon as the first collision between column sums is found (unlike the scoreLyndonWord function). I have also ordered the column masks so that the column sets most likely to collide are checked first. These two optimizations improved performance by an order of magnitude.

— Suboptimus Prime,
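
The early return described in the comment above can be illustrated with a small sketch. It is not the answer's code: it uses the standard HashSet<long> instead of LongSet, checks every column subset by brute force, and the names EarlyExitSketch, EncodeColumn and HasPropertyXBruteForce are hypothetical. The base-13 encoding follows GetMatrixColumns and is exact here only under the assumption that the matrix has at most 12 columns and 16 rows.

```csharp
using System;
using System.Collections.Generic;

static class EarlyExitSketch
{
    // Encodes column x of a 0/1 matrix (given as an array of rows) in base 13,
    // so that a vector sum of columns becomes an ordinary sum of longs.
    public static long EncodeColumn(int[][] rows, int x)
    {
        long column = 0;
        foreach (int[] row in rows)
            column = column * 13 + row[x];
        return column;
    }

    // Returns true as soon as two different non-empty column subsets have the
    // same sum. Assumes at most 12 columns so that no base-13 digit overflows.
    public static bool HasPropertyXBruteForce(int[][] rows)
    {
        int width = rows[0].Length;
        long[] columns = new long[width];
        for (int x = 0; x < width; x++)
            columns[x] = EncodeColumn(rows, x);

        var seen = new HashSet<long>();
        for (uint mask = 1; mask < (1u << width); mask++)
        {
            long sum = 0;
            for (int x = 0; x < width; x++)
                if ((mask & (1u << x)) != 0)
                    sum += columns[x];

            // HashSet<T>.Add returns false when the value is already present,
            // so the membership test and the insertion are a single call.
            if (!seen.Add(sum))
                return true; // first collision found: stop immediately
        }

        return false;
    }
}
```

Called on the 3 by 5 matrix from the question, HasPropertyXBruteForce should return true because the first and last columns are equal, and on the 3 by 4 matrix it should return false.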

Although performance changes are often surprising, in principle the early abort shouldn't give more than a factor of 2, and GetSumOfColumns adds an extra loop which I would expect to cost more than that factor of 2. The mask ordering sounds interesting: maybe you could edit the answer to say a bit about it? (And at some point I will experiment with an alternative way of doing the early abort: the reason I can't do it at the moment is that HashSet doesn't support simultaneous iteration and modification, but I have ideas for avoiding the need for an iterator.)

— Peter Taylor,
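
One possible way around the iterator restriction mentioned above, sketched under assumptions, is to keep the sums in an append-only list next to the set and to walk that list by index up to a count captured before the inner loop starts. LongSet.GetValues in the answer, which exposes the first _count items as an ArraySegment, appears to serve the same purpose. The class and method names below are hypothetical, and the sketch works on unsigned subset sums rather than the signed sums used by PrepareHashSet.

```csharp
using System;
using System.Collections.Generic;

static class SnapshotIterationSketch
{
    // Builds all non-empty subset sums one column at a time and returns true
    // as soon as a sum repeats, i.e. two different subsets have equal sums.
    public static bool HasRepeatedSubsetSum(long[] columns)
    {
        var seen = new HashSet<long>();
        var sums = new List<long>(); // append-only copy of the set's contents

        foreach (long column in columns)
        {
            // Only the sums that existed before this column are extended.
            // Indexing up to a fixed count avoids enumerating a collection
            // that is being modified.
            int snapshot = sums.Count;
            for (int i = 0; i < snapshot; i++)
            {
                long candidate = sums[i] + column;
                if (!seen.Add(candidate))
                    return true;
                sums.Add(candidate);
            }

            // The subset consisting of this column alone.
            if (!seen.Add(column))
                return true;
            sums.Add(column);
        }

        return false;
    }
}
```

For example, HasRepeatedSubsetSum(new long[] { 1, 2, 3 }) reports a collision because 1 + 2 equals 3, while the input { 1, 2, 4 } yields seven distinct sums.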

@justhalf, using a Gray-esque approach to iterate over the subsets of a fixed size is actually worthwhile. It allowed me to find a 26/14 in under 9 minutes and 34 of them in two hours, at which point I stopped. I'm currently testing to see whether I can get 28/15 in a reasonable time.

— Peter Taylor,
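
As a rough illustration of the Gray-esque iteration mentioned above, the sketch below lists the k-element subsets of n columns in revolving-door order, where each subset differs from the previous one by a single swapped element, and uses that to update a running sum. This is an assumption about the general technique, not the code from either answer: the names are hypothetical, and unlike NextColumnMask it materializes the whole list of masks, which is only practical for small n. It assumes 1 <= k <= n < 32.

```csharp
using System;
using System.Collections.Generic;

static class RevolvingDoorSketch
{
    // Returns all k-element subsets of {0..n-1} as bit masks. The order is
    // R(n,k) = R(n-1,k) followed by the reverse of R(n-1,k-1) with element
    // n-1 added, so consecutive masks differ by exactly one swap.
    public static List<uint> Subsets(int n, int k)
    {
        if (k == 0) return new List<uint> { 0u };
        if (k == n) return new List<uint> { (1u << n) - 1 };

        List<uint> result = Subsets(n - 1, k);
        List<uint> withLast = Subsets(n - 1, k - 1);
        for (int i = withLast.Count - 1; i >= 0; i--)
            result.Add(withLast[i] | (1u << (n - 1)));
        return result;
    }

    // Index of the only set bit in a mask that has exactly one bit set.
    static int BitIndex(uint singleBit)
    {
        int index = 0;
        while ((singleBit >>= 1) != 0)
            index++;
        return index;
    }

    // Returns true if two different k-element subsets of the columns have the
    // same sum. Each step costs one subtraction and one addition.
    public static bool HasRepeatedSum(long[] columns, int k)
    {
        List<uint> masks = Subsets(columns.Length, k);
        var seen = new HashSet<long>();

        long sum = 0;
        for (int i = 0; i < k; i++) // the first mask is always {0..k-1}
            sum += columns[i];
        seen.Add(sum);

        for (int m = 1; m < masks.Count; m++)
        {
            uint diff = masks[m - 1] ^ masks[m];   // the two swapped elements
            uint removed = diff & masks[m - 1];
            uint added = diff & masks[m];
            sum += columns[BitIndex(added)] - columns[BitIndex(removed)];
            if (!seen.Add(sum))
                return true;
        }

        return false;
    }
}
```

For n = 4 and k = 2 the masks come out as 0011, 0110, 0101, 1100, 1010, 1001, so every step clears one bit and sets another.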

@Lembik, I explored 29/15 exhaustively in 75.5 hours. 31/16 would take about 3 times as long, so more than a week. We have both made some optimisations since I started running that 29/15 test, so maybe it would now come down to a week. There is nothing stopping you from compiling my code or SuboptimusPrime's code and running it yourself if you have a computer that you can leave on for that long.

— Peter Taylor,