Forum Discussion
aawve
Nov 11, 2019Copper Contributor
Extracting substrings from brackets
Hi,
I'm working with a large dataset based on journal entry character strings. I have standardized names coded into the strings to allow easier reference for an analysis. An example below (not real entries):
Column A
I went to the store with Joe [Joe Green] today.
I played cards with Abby [Abby Miller] and the Joe-man [Joe Green].
etc...
I would like to extract each substring between brackets and concatenate those substrings (separated by commas) in the next column. Ideally, I could do this task in a single formula. Hence, the output would appear:
Column B (desired)
Joe Green
Abby Miller, Joe Green
etc...
I have tried several other approaches and none produced the intended output. I formulated my best attempt as follows (the roster has a list of standardized, bracketed names; the raw_notebook_data has the journal entries with standardized, bracketed names embedded in the strings):
=IF(SUMPRODUCT(--ISNUMBER((
SEARCH(roster!$B$2:$B$178,
INDEX(raw_notebook_data!$A$1:$M$100000,
MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),
MATCH(K$1,raw_notebook_data!$A$1:$M$1,0))))))>0,
MID(INDEX(raw_notebook_data!$A$1:$M$100000,
MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),
MATCH(K$1,raw_notebook_data!$A$1:$M$1,0)),
SEARCH([",INDEX(raw_notebook_data!$A$1:$M$100000,
MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),
MATCH(K$1,raw_notebook_data!$A$1:$M$1,0)))+1,
SEARCH("]",INDEX(raw_notebook_data!$A$1:$M$100000,
MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),
MATCH(K$1,raw_notebook_data!$A$1:$M$1,0)))-
SEARCH("[",INDEX(raw_notebook_data!$A$1:$M$100000,
MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),
MATCH(K$1,raw_notebook_data!$A$1:$M$1,0)))-1),
"NA")
The formula above produces output as follows (only the first bracketed substring appears):
Column B (current)
[Joe Green]
[Abby Miller]
etc...
This post is my first post on this forum, so I hope I provided the necessary information to receive some help. Please let me know if you need additional information.
aawve This User Defined Function does the trick:
Function ExtractNames(TextWithNames) As String Dim Pos As Long Dim Ct As Long Dim BracketStart As Long Dim Result As String Dim WithInBrackets As Boolean If TypeName(TextWithNames) = "Range" Then TextWithNames = TextWithNames.Value End If Pos = InStr(TextWithNames, "[") If Pos <> 0 Then WithInBrackets = True BracketStart = Pos For Ct = Pos To Len(TextWithNames) If WithInBrackets Then If Mid(TextWithNames, Ct, 1) = "]" Then WithInBrackets = False Result = Result & Mid(TextWithNames, BracketStart + 1, Ct - BracketStart - 1) Result = Result & "," End If ElseIf Mid(TextWithNames, Ct, 1) = "[" Then WithInBrackets = True BracketStart = Ct End If Next End If If Len(Result) > 0 Then Result = Left(Result, Len(Result) - 1) ExtractNames = Result End Function
- JKPieterseSilver Contributor
aawve This User Defined Function does the trick:
Function ExtractNames(TextWithNames) As String Dim Pos As Long Dim Ct As Long Dim BracketStart As Long Dim Result As String Dim WithInBrackets As Boolean If TypeName(TextWithNames) = "Range" Then TextWithNames = TextWithNames.Value End If Pos = InStr(TextWithNames, "[") If Pos <> 0 Then WithInBrackets = True BracketStart = Pos For Ct = Pos To Len(TextWithNames) If WithInBrackets Then If Mid(TextWithNames, Ct, 1) = "]" Then WithInBrackets = False Result = Result & Mid(TextWithNames, BracketStart + 1, Ct - BracketStart - 1) Result = Result & "," End If ElseIf Mid(TextWithNames, Ct, 1) = "[" Then WithInBrackets = True BracketStart = Ct End If Next End If If Len(Result) > 0 Then Result = Left(Result, Len(Result) - 1) ExtractNames = Result End Function
- aawveCopper ContributorThank you. It worked amazingly.
- JKPieterseSilver ContributorYou're welcome!