Forum Discussion

PhilG2165's avatar
PhilG2165
Copper Contributor
Sep 13, 2024

Removing duplicates from Access query

I have an Access database to catalog a collection.  In the table "Sets", there is a column that "Material Type" that links to another table (Materials.ID).  Material Type allows multi-selection to support cases where a set is made of two or more materials.  The problem is that when an item has multiple Material Types, the item shows up multiple times in reports.

 

For example:

  1. In the table Sets, item 292 has two Material Types (Metal and Recycled)
  2. When querying, item 292 shows up twice.  This would make some sense if each result row shows a separate Material Type field (Metal in one, an Recycled in the other), but each result shows both values.
  3. Then, when I have a form based on the query, I get two duplicate items of 292:


So, my question is: How do I get rid of the duplicate rows in the query?  If the query only had one row, the form would only have one row. I've never been able to find any help or support articles that address this issue.

 

Thanks for any assistance you can offer.

 

Phil Garding

  • Harun24HR's avatar
    Harun24HR
    Bronze Contributor
    Use the keyword DISTINCT in query SQL to remove duplicates.
    • PhilG2165's avatar
      PhilG2165
      Copper Contributor

      Harun24HR 

       

      Thank you for your response.  I finally have time to return to this. When I try adding DISTINCT to the query, I get an error message: "The DISTINCT keyword cannot be used with the multi-valued field 'Sets.[Material Type]'.

       

       

      My query is this (generated by the query designer):

       

      SELECT Material.[Material Type], Sets.[Material Type], Sets.[Material Description], Origin.Continent, Sets.Origin, Origin.Country, Sets.ID, Sets.Name, Sets.[Acquired Year], Sets.[Number of Pieces], Sets.[Height of Tallest Piece (Inches)], Sets.[Material Type].Value, Sets.[Manufacturer/Artist], Sets.[Photo 1]
      FROM Material INNER JOIN (Origin INNER JOIN Sets ON Origin.[ID] = Sets.[Origin]) ON Material.ID = Sets.[Material Type].Value
      ORDER BY Sets.ID;

       

      What I tried was to add DISTINCT after SELECT, like this:

       

      SELECT DISTINCT Material.[Material Type], Sets.[Material Type], Sets.[Material Description], Origin.Continent, Sets.Origin, Origin.Country, Sets.ID, Sets.Name, Sets.[Acquired Year], Sets.[Number of Pieces], Sets.[Height of Tallest Piece (Inches)], Sets.[Material Type].Value, Sets.[Manufacturer/Artist], Sets.[Photo 1]
      FROM Material INNER JOIN (Origin INNER JOIN Sets ON Origin.[ID] = Sets.[Origin]) ON Material.ID = Sets.[Material Type].Value
      ORDER BY Sets.ID;

       

      Is Even if I remove the reference to [Material Type], the error triggers on other fields.  My photo field originally allowed three photos, and this seems to be tripping it, too:

       

      Thanks, 

      Phil.

      • Ken_Sheridan's avatar
        Ken_Sheridan
        Brass Contributor

        The real solution is not to include a 'multi-valued field' in the table, but, as said before, instead model the many-to-many relationship type in the conventional way by means of a table which resolves the relationship type into two one-to-many relationship types:

         

        Sets----<SetMaterials>----Materials

         

        The SetMaterials table will have two foreign key columns referencing the primary key columns of Sets and Materials respectively.

         

        If you continue to use a 'multi-valued field', however, provided that you exclude the Material Table from a simple SELECT query it should not return duplicate rows.

         

        The same principles would apply to use of the photo column if this as an 'attachments' field, but from its Name, Photo 1,  I suspect that you have three separate photo columns.  This would mean that the table is not normalized to First Normal Form (1NF), however, which requires that each row in a table should have only one value of each attribute.  Again the correct way to model this would be as a many-to-many relationship type with a Photos table, with the relationship type modelled by a third table as described above.  For examples of this you might like to take a look at Images.zip in my public databases folder at:

         

        https://onedrive.live.com/?id=44CC60D7FEA42912%21169&cid=44CC60D7FEA42912

         

        The zip archive contains a number of files of differing degrees of complexity for associating images with a record.  The basic Images.accdb file would probably be appropriate to your requirements.  Note that a key feature of this, and other files in the archive, is that the images themselves are not stored in the database, which can cause excessive bloat.  Instead the paths to the image files are stored in a column of text data type, and each image is loaded into an Image control at runtime by setting the ControlSource property of the Image control to the name of the column containing the path.

    • Ken_Sheridan's avatar
      Ken_Sheridan
      Brass Contributor
      This results from the use of a 'multi-valued field', which is something most experienced Access developers counsel against. Using the DISTINCT predicate will make a query non-updatable, so this might not be suitable for a form's RecordSource. I would advise that, rather than using a multi-valued field, you model the many-to-many relationship type between Sets and MaterialTypes by the time-honoured method for modelling the relationship type by a table which resolves the relationship type into two one-to-many relationship types.
      You might like to take a look at MVFCorrector.zip in my public databases folder at:

      https://1drv.ms/f/c/44cc60d7fea42912/EhIppP7XYMwggESpAAAAAAABaDKZCllSuweYBPJ5zKa3cg

      This little demo file illustrates how the database structure can be corrected by means of code in a form's module

Resources