Forum Discussion
Lorenzo
Nov 27, 2023Silver Contributor
Fill BLANKs down, up or both in column(s) - similar to Power Query
Hi
Looked at +/- replicating Fill Down and Fill Up functionnalities in Power Query with a LAMBDA. My attempt below:
// FillDownUp:
=LAMBDA(array,
LET(
Data, DROP(array, 1),
DataCols, COLUMNS(Data),
FillCol, LAMBDA(array,
SCAN(
IF( ISBLANK( TAKE(array, 1) ),
XLOOKUP(FALSE, ISBLANK(array), array, ""),
""
),
array,
LAMBDA(seed,x, IF( ISBLANK(x), seed, x))
)
),
FillStack, LAMBDA(seed,col, HSTACK( seed, FillCol( CHOOSECOLS(Data, col) ) )),
DataFilled, IF(DataCols = 1, FillCol(Data),
DROP( REDUCE(0, SEQUENCE(DataCols), FillStack),, 1)
),
VSTACK(CHOOSEROWS(array, 1), DataFilled)
)
)
Open to alternatives...
My variant:
Filler =LAMBDA(table,LET( header, TAKE(table, 1), data, DROP(table, 1), counter, SEQUENCE(COLUMNS(data)), FillDown, LAMBDA(a, v, IF(v = "", a, v)), Complete, LAMBDA(vector, LET(ini, TAKE(TOCOL(vector, 1), 1), SCAN(ini, vector, FillDown))), Stack, DROP(REDUCE("", counter, LAMBDA(a, v, HSTACK(a, Complete(INDEX(data, , v))))), , 1), VSTACK(header, Stack) ))
- PeterBartholomew1Silver Contributor
I did some timing runs using Charles Williams's 'Calc Range' routine and it returned about 3500 ms for 500,000 rows.
- PeterBartholomew1Silver Contributor
Sadly, I didn't see this question first time around!
My first thought is that the problem as posed doesn't really require the equivalent of the PQ FillUp functionality. It is really just FillDown but with non-standard initialisation using the first non-blank value.
FILLDOWNλ = LET( init, TAKE(TOCOL(column, 3), 1), SCAN(init, column, LAMBDA(a, v, IF(v <> "", v, a))) )
Having defined a function that works with a single column one would naturally expect
= BYCOL(Table, FILLDOWNλ)
to provide the solution for the complete table. Unfortunately, because of the Microsoft error in specifying the functionality required to handle arrays of arrays, one is stuck with implementing one's own version of BYCOL.
BYCOLλ = LET( n₀, COLUMNS(array), n₁, QUOTIENT(n₀, 2), n₂, n₀ - n₁, A₁, TAKE(array, , n₁), A₂, TAKE(array, , -n₂), X₁, IF(n₁ > 1, BYCOLλ(A₁, FNλ), FNλ(A₁)), X₂, IF(n₂ > 1, BYCOLλ(A₂, FNλ), FNλ(A₂)), HSTACK(X₁, X₂) )
The double recursion is a bit painful but at least the function can be reused and it is straightforward to apply
= BYCOLλ(Table, FILLDOWNλ)
- LorenzoSilver Contributor
Still need to fully understand how this works but this seems very straightfoward 🙂
- PeterBartholomew1Silver Contributor
The core functionality of BYCOLλ is also intended to be widely deployable. The assumption is that some function FNλ needs to be applied to every column of an array (or range). Rather than rewriting the function FNλ to address multiple columns, BYCOLλ splits to problem into two parts and solves each half independently. Recursion means that eventually one reaches the point where the argument is a single column and FNλ may be applied directly to give a single result.
Whereas the built-in BYCOL function will only return the result if each application of the function returns a scalar value, BYCOLλ uses HSTACK to build an array of arrays.
A problem with the alternative approach of using REDUCE and HSTACK to build an array of array is that the early results may be copied 10,000s of times during the stacking steps. This gets to be expensive in terms of computing resource. The bisection approach limits the steps of copying any given result to form build bigger datasets to 14 at the most.
I hope that by describing the intent of the 'home-knitted' helper function, I will simplify life for anyone planning to reverse engineer it from the code.
- Patrick2788Silver Contributor
My variant:
Filler =LAMBDA(table,LET( header, TAKE(table, 1), data, DROP(table, 1), counter, SEQUENCE(COLUMNS(data)), FillDown, LAMBDA(a, v, IF(v = "", a, v)), Complete, LAMBDA(vector, LET(ini, TAKE(TOCOL(vector, 1), 1), SCAN(ini, vector, FillDown))), Stack, DROP(REDUCE("", counter, LAMBDA(a, v, HSTACK(a, Complete(INDEX(data, , v))))), , 1), VSTACK(header, Stack) ))
And back to this Can SCAN() and BYROW() be combined? - Microsoft Community Hub discussion one of variants which is similar to your one
FillUpDn = LAMBDA(array, LET( fnFill, LAMBDA(array, LET( reverse, LAMBDA(arr, SORTBY(arr, SEQUENCE(ROWS(arr)), -1)), down, LAMBDA(arr, SCAN(, arr, LAMBDA(a, v, IF(v = "", a, v)))), reverse(down(reverse(down(array)))) ) ), fillData, LAMBDA(arr, REDUCE( 1, SEQUENCE(COLUMNS(arr)), LAMBDA(A, i, IF(SEQUENCE(,i) = i, fnFill(INDEX(arr,, i )), A)) ) ), VSTACK(TAKE(array,1), fillData(DROP(array,1)) ) ))
Looks nice. My variant for FillCol
FillCol = LAMBDA(array, LET( reverse, LAMBDA(arr, SORTBY(arr, SEQUENCE(ROWS(arr)),-1) ), down, LAMBDA(arr, SCAN(, arr, LAMBDA(a,v, IF(v="", a, v))) ), reverse(down(reverse(down(array)))) ) )
More correct name will be Fill Down and Up (not fill up and down)
- LorenzoSilver Contributor
More correct name will be Fill Down and Up (not fill up and down) Named it UpDown as in reality this is what my approach does. Though, this is quite confusing ==> Initial post revised
- LorenzoSilver Contributor
Thank you SergeiBaklan & Patrick2788
What I had in mind (wasn't sure that was realistic) was to provide a single LAMBDA (to allow copy/paste in Name Manager) for cases like (but not only):
where, for whatever reason Power Query cannot be used. Added a couple of optional parameters:
I don't post the formula as it's a monster now. It's available in the attached workbook - seems to work decently... I stop there but will wait a couple of days before closing this thread in case you have feedback/recommendation (I'm gonna change the title of the discussion accordingly)