Forum Discussion

SpaceAge's avatar
SpaceAge
Copper Contributor
Mar 07, 2025

Matching columns in two spreadsheets

I have a larger spreadsheet with email addresses in one column and a lot of other demographic data in the other columns. I have a second, smaller, spreadsheet also with email addresses in one column. I would like to match the two spreadsheets and create a third spreadsheet with all the demographic information from the first spreadsheet associated with the email addresses of the second spreadsheet. Help!

  • resourcefulmynd's avatar
    resourcefulmynd
    Copper Contributor

    Hi SpaceAge you could try using PowerQuery to merge the files. Upload it using Get Data in Data tab and when in excel, select merge queries as New. It will create a new table and you can close and load the file into yoir excel workbook

  • Once I have data imported to the workbook, my preference is to work with worksheet formulas.  Unfortunately, Microsoft make it difficult to return arrays of arrays (in this case multiple records) from a formula.  Probably the best option is to fall back on INDEX/XMATCH to perform the join

    = LET(
        rowNum, XMATCH(Tabelle2[MailAddress],Tabelle1[MailAddress]),
        colNum, SEQUENCE(1, COLUMNS(Tabelle1)),
        INDEX(Tabelle1, rowNum, colNum)
      )

    The other approach I used is far more obscure.  

    = LET(
        recordϑ, BYROW(Tabelle1, THUNK),
        resultϑ, XLOOKUP(Tabelle2[MailAddress], Tabelle1[MailAddress], recordϑ),
        EVALTHUNKARRλ(resultϑ)
      )

    This defines an array of functions, 'recordϑ', each one of which will return a record if evaluated.  I then use XLOOKUP to return the array, 'resultϑ', of matched records.  So far, so good but the catch is that I still have to evaluate the result functions (called thunks) without Excel throwing a hissy fit because it doesn't like arrays of array (i.e. the answer to your problem and many others).  The problem arises so often that I wrote a function to evaluate thunks and stack the results.

    /*  FUNCTION NAME:  EVALTHUNKARRλ
        DESCRIPTION:    Called by modified helper functions to stack the contents of any array of thunks  */
    /*  REVISIONS:      Date            Developer           Description
                        14 Oct 2024     Peter Bartholomew   Extracted for MAPλ as a separate module  
    */
    EVALTHUNKARRλ = LAMBDA(thunkArrayϑ,
        LET(
            m, ROWS(thunkArrayϑ),
            n, COLUMNS(thunkArrayϑ),
            h, SEQUENCE(CEILING.MATH(LOG(n,2),1)),
            recombinedRowsϑ, IF(
                n > 1,
                BYROW(thunkArrayϑ, LAMBDA(thunkRowϑ, @REDUCE(thunkRowϑ, h, JOINPAIRSλ(1)))),
                thunkArrayϑ
            ),
            k, SEQUENCE(CEILING.MATH(LOG(m,2),1)),
            recombinedϑ, IF(
                m > 1, 
                REDUCE(recombinedRowsϑ, k, JOINPAIRSλ(0)), 
                recombinedRowsϑ
            ),
            result, IFNA((@recombinedϑ)(), ""),
            result
        )
    );
    
    /*  FUNCTION NAME:  JOINPAIRSλ
        DESCRIPTION:    Called by EVALTHUNKARRλ to stack the contents of thunks pairwise  */
    /*  REVISIONS:      Date            Developer           Description
                        09 May 2024     Peter Bartholomew   Original Development  
                        16 May 2024     Peter Bartholomew   Test for unpaired thunk in binary tree
                        30 Aug 2024     Peter Bartholomew   Modify to stack horizontally or vertically
                        15 Nov 2024     David Clements      Efficiency improvements / simplification
    */
    
    JOINPAIRSλ = LAMBDA([by_col], LAMBDA(thunkArray, [k],
        LET(
            STACKλ,    IF(by_col, HSTACK, VSTACK),
            alternate, WRAPROWS(thunkArray, 2),
            MAP(
                TAKE(alternate, , 1),
                DROP(alternate, , 1),
                LAMBDA(ϑ₁, ϑ₂,
                    LET(
                        x₁, ϑ₁(), 
                        x₂, ϑ₂(), 
                        v,  IF(TYPE(ϑ₂)=16, x₁, STACKλ(x₁, x₂)), 
                        LAMBDA(v)
                    )
                )
            )
        )
    ));
    
    THUNK
    = LAMBDA(x,LAMBDA(x));

    In my opinion, tasks like this should be left to the user.  They should be part of the code whether that be Excel or PowerQuery.  At present, the ability to stack arrays of records is missing from Excel.

    A version of Excel MAP helper function that will return an array of arrays

  • This is a job for Power Query in my opinion. In the attached file you can add data to the blue dynamic tables. Then you can click in any cell of the green table and right-click with the mouse and select refresh to update the green result table.

    The data layout in the screenshot and in the attached file is for illustration. You can place the green result table to the right of the blue tables or in another worksheet. Actually it doesn't matter if the tables are in the same sheet or in different sheets.

     

    • SpaceAge's avatar
      SpaceAge
      Copper Contributor

      Thanks for these, but they seem too complicated. I was able to use XMATCH to find the records I wanted (in the larger database, all the records have record numbers in the first column). Now I need to know how to return the results of an IF function so that if the logic test succeeds, the output is to copy the row of cells in the row in which the match occurred.  

      • OliverScheurich's avatar
        OliverScheurich
        Gold Contributor
        =XLOOKUP(I2,$A$2:$A$30,$C$2:$G$30)

        Because you have access to XMATCH you could apply XLOOKUP for the expected results. XLOOKUP works between 2 spreadsheets as well.

         

Resources