LINQ: distinct values

136

I have the following set of items, loaded from an XML file:

id           category

5            1
5            3
5            4
5            3
5            3

I need a distinct list of these items:

5            1
5            3
5            4

How can I make the result distinct by both id and category in LINQ?

linq distinct

— balint

221

Are you trying to be distinct by more than one field? If so, just use an anonymous type and the Distinct operator and it should be fine:

var query = doc.Elements("whatever")
               .Select(element => new {
                             id = (int) element.Attribute("id"),
                             category = (int) element.Attribute("cat") })
               .Distinct();
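
The query above works because anonymous types get compiler-generated Equals and GetHashCode implementations that compare property values. A minimal, self-contained sketch using the question's data follows; the XML layout and the names "items", "item", "id" and "cat" are assumptions, since the question does not show the actual document.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Xml.Linq;

public static class AnonymousDistinctDemo
{
    // Hypothetical XML mirroring the question's data; the element and
    // attribute names ("item", "id", "cat") are assumptions.
    private const string Xml = @"<items>
  <item id=""5"" cat=""1"" />
  <item id=""5"" cat=""3"" />
  <item id=""5"" cat=""4"" />
  <item id=""5"" cat=""3"" />
  <item id=""5"" cat=""3"" />
</items>";

    // Returns each distinct (id, category) pair formatted as "id:category".
    public static List<string> DistinctPairs()
    {
        XElement doc = XElement.Parse(Xml);
        return doc.Elements("item")
                  .Select(e => new { id = (int) e.Attribute("id"),
                                     category = (int) e.Attribute("cat") })
                  .Distinct()   // anonymous types compare by property values
                  .Select(p => p.id + ":" + p.category)
                  .ToList();
    }
}
```

With the five rows from the question, DistinctPairs() should yield the three pairs 5:1, 5:3 and 5:4.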

If you're trying to get a distinct set of values of a "larger" type, but only looking at some subset of properties to decide distinctness, you probably want DistinctBy, as implemented in MoreLINQ in DistinctBy.cs:

 public static IEnumerable<TSource> DistinctBy<TSource, TKey>(
     this IEnumerable<TSource> source,
     Func<TSource, TKey> keySelector,
     IEqualityComparer<TKey> comparer)
 {
     HashSet<TKey> knownKeys = new HashSet<TKey>(comparer);
     foreach (TSource element in source)
     {
         if (knownKeys.Add(keySelector(element)))
         {
             yield return element;
         }
     }
 }

(If you pass in null as the comparer, it will use the default comparer for the key type.)
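
As a rough usage sketch, the method can be called with an anonymous type as the key so that whole objects are kept while only some properties decide distinctness. The Item class, its Label property and the name DistinctByKey are hypothetical; the helper is renamed here only to avoid clashing with a DistinctBy that may already be in scope (MoreLINQ, or the built-in Enumerable.DistinctBy in .NET 6 and later).

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical "larger" type: Label is carried along but not compared.
public class Item
{
    public int Id { get; set; }
    public int Category { get; set; }
    public string Label { get; set; }
}

public static class DistinctByDemo
{
    // Same logic as the method above, under a hypothetical name
    // (DistinctByKey) so it cannot clash with an existing DistinctBy.
    public static IEnumerable<TSource> DistinctByKey<TSource, TKey>(
        this IEnumerable<TSource> source,
        Func<TSource, TKey> keySelector,
        IEqualityComparer<TKey> comparer = null)
    {
        // A null comparer makes HashSet use the default comparer for TKey.
        var knownKeys = new HashSet<TKey>(comparer);
        foreach (TSource element in source)
        {
            if (knownKeys.Add(keySelector(element)))
            {
                yield return element;
            }
        }
    }

    // Keeps the first whole Item seen for each (Id, Category) pair.
    public static List<Item> FirstPerPair()
    {
        var items = new List<Item>
        {
            new Item { Id = 5, Category = 1, Label = "a" },
            new Item { Id = 5, Category = 3, Label = "b" },
            new Item { Id = 5, Category = 4, Label = "c" },
            new Item { Id = 5, Category = 3, Label = "d" },
            new Item { Id = 5, Category = 3, Label = "e" },
        };
        return items.DistinctByKey(i => new { i.Id, i.Category }).ToList();
    }
}
```

Because the source is enumerated in order and only the first element for each key is yielded, the items labelled "d" and "e" are dropped here while the full Item objects for "a", "b" and "c" survive.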

— Jon Skeet

Oh, so by "larger type" you mean that I still want all the properties in the result, even though I only want to compare a few of them to determine distinctness?

— The Red Pea

@TheRedPea: Yes, exactly.

— Jon Skeet

34

Just use Distinct() with your own comparer.

http://msdn.microsoft.com/en-us/library/bb338049.aspx

— Stu

27

In addition to Jon Skeet's answer, you can also use a group by expression to get the unique groups along with a count of the elements in each group:

var query = from e in doc.Elements("whatever")
            group e by new { id = (int) e.Attribute("id"), category = (int) e.Attribute("cat") } into g
            select new { id = g.Key.id, category = g.Key.category, count = g.Count() };

— James Alexander

4

You wrote "in addition to Jon Skeet's answer"... I don't know if such a thing is possible. ;)

— Yehuda Makarov

13

For anyone still looking, here is another way to implement a custom lambda comparer.

public class LambdaComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _expression;

    public LambdaComparer(Func<T, T, bool> lambda)
    {
        _expression = lambda;
    }

    public bool Equals(T x, T y)
    {
        return _expression(x, y);
    }

    public int GetHashCode(T obj)
    {
        /*
         Returning a constant hash puts every element in the same bucket,
         so the hash check never rules anything out and the Equals lambda
         is always consulted. The result is correct, but each element is
         compared against every element already kept.
        */
        return 0;
    }
}

You can then create an extension method (declared in a static class) that overloads the LINQ Distinct so it accepts a lambda:

public static IEnumerable<T> Distinct<T>(this IEnumerable<T> list, Func<T, T, bool> lambda)
{
    return list.Distinct(new LambdaComparer<T>(lambda));
}

Usage:

var availableItems = list.Distinct((p, p1) => p.Id == p1.Id);

— Ricky G

Looking at the reference source, Distinct uses a hash set to store the elements it has already yielded. Always returning the same hash code means that every previously returned element is examined each time. A more robust hash code would speed things up, because each element would only be compared against elements in the same hash bucket. Zero is a reasonable default, but it might be worth supporting a second lambda for the hash code.

— Darryl

Good point! I'll try to edit when I get time. If you are working in this area at the moment, feel free to edit.

— Ricky G
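
Darryl's suggestion of a second lambda for the hash code could be sketched roughly as follows. The class name LambdaHashComparer and the demo method are hypothetical and not part of the answer above. The one rule the caller has to respect is that any two items the equality lambda treats as equal must also get the same hash code, otherwise duplicates can slip through.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical variant of LambdaComparer that also accepts a hash lambda.
public class LambdaHashComparer<T> : IEqualityComparer<T>
{
    private readonly Func<T, T, bool> _equals;
    private readonly Func<T, int> _hash;

    public LambdaHashComparer(Func<T, T, bool> equals, Func<T, int> hash = null)
    {
        _equals = equals;
        // Without a hash lambda, fall back to the constant hash (always 0).
        _hash = hash ?? (_ => 0);
    }

    public bool Equals(T x, T y)
    {
        return _equals(x, y);
    }

    public int GetHashCode(T obj)
    {
        return _hash(obj);
    }
}

public static class LambdaHashComparerDemo
{
    // Case-insensitive Distinct: the hash lambda is case-insensitive too,
    // so strings that compare equal always land in the same bucket.
    public static List<string> DistinctIgnoringCase()
    {
        var words = new[] { "Linq", "LINQ", "distinct", "Distinct", "linq" };
        var comparer = new LambdaHashComparer<string>(
            (a, b) => string.Equals(a, b, StringComparison.OrdinalIgnoreCase),
            s => StringComparer.OrdinalIgnoreCase.GetHashCode(s));
        return words.Distinct(comparer).ToList();
    }
}
```

With a real hash function, Equals is only called for elements whose hash codes collide, instead of for every element already kept.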

8

I'm a bit late with this answer, but you may want to do this if you want the whole element, not only the values you are grouping by:

var query = doc.Elements("whatever")
               .GroupBy(element => new {
                             id = (int) element.Attribute("id"),
                             category = (int) element.Attribute("cat") })
               .Select(e => e.First());

This gives you the first whole element of each group, much like Jon Skeet's second example using DistinctBy, but without having to deal with an IEqualityComparer. DistinctBy will most likely be faster, but the solution above involves less code if performance is not an issue.

— Olle Johansson

4

// Assumes dt is an already populated DataTable.
// DataRowComparer.Default compares rows by value: the number of columns and the data in each column.

IEnumerable<DataRow> distinctRows = dt.AsEnumerable().Distinct(DataRowComparer.Default);

foreach (DataRow row in distinctRows)
{
    Console.WriteLine("{0,-15} {1,-15}",
        row.Field<int>(0),
        row.Field<string>(1));
}

— Mohamed Elsayed

0

Since we are talking about having every element exactly once, a "set" makes more sense to me.

Example with a class and an IEqualityComparer implementation:

public class Product
{
    public int Id { get; set; }
    public string Name { get; set; }

    public Product(int x, string y)
    {
        Id = x;
        Name = y;
    }
}

public class ProductCompare : IEqualityComparer<Product>
{
    public bool Equals(Product x, Product y)
    {
        // Check whether the compared objects reference the same data.
        if (Object.ReferenceEquals(x, y)) return true;

        // Check whether any of the compared objects is null.
        if (Object.ReferenceEquals(x, null) || Object.ReferenceEquals(y, null))
            return false;

        // Check whether the products' properties are equal.
        return x.Id == y.Id && x.Name == y.Name;
    }

    public int GetHashCode(Product product)
    {
        // Check whether the object is null.
        if (Object.ReferenceEquals(product, null)) return 0;

        // Get hash code for the Name property if it is not null.
        int hashProductName = product.Name == null ? 0 : product.Name.GetHashCode();

        // Get hash code for the Id property.
        int hashProductCode = product.Id.GetHashCode();

        // Combine the two hash codes for the product.
        return hashProductName ^ hashProductCode;
    }
}

Now:

List<Product> originalList = new List<Product> {new Product(1, "ad"), new Product(1, "ad")};
var setList = new HashSet<Product>(originalList, new ProductCompare()).ToList();

setList will contain only unique elements.

I thought of this while dealing with .Except(), which returns a set difference.

— Aditya AVS