Forum Discussion

aawve's avatar
aawve
Copper Contributor
Nov 11, 2019
Solved

Extracting substrings from brackets

Hi,

I'm working with a large dataset based on journal entry character strings. I have standardized names coded into the strings to allow easier reference for an analysis. An example below (not real entries):

 

Column A

I went to the store with Joe [Joe Green] today.

I played cards with Abby [Abby Miller] and the Joe-man [Joe Green].

etc...

 

I would like to extract each substring between brackets and concatenate those substrings (separated by commas) in the next column. Ideally, I could do this task in a single formula. Hence, the output would appear:

 

Column B (desired)

Joe Green

Abby Miller, Joe Green

etc...

 

I have tried several other approaches and none produced the intended output. I formulated my best attempt as follows (the roster has a list of standardized, bracketed names; the raw_notebook_data has the journal entries with standardized, bracketed names embedded in the strings):

 

=IF(SUMPRODUCT(--ISNUMBER((

SEARCH(roster!$B$2:$B$178,

INDEX(raw_notebook_data!$A$1:$M$100000,

MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),

MATCH(K$1,raw_notebook_data!$A$1:$M$1,0))))))>0,
MID(INDEX(raw_notebook_data!$A$1:$M$100000,

MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),

MATCH(K$1,raw_notebook_data!$A$1:$M$1,0)),

SEARCH([",INDEX(raw_notebook_data!$A$1:$M$100000,

MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),

MATCH(K$1,raw_notebook_data!$A$1:$M$1,0)))+1,

SEARCH("]",INDEX(raw_notebook_data!$A$1:$M$100000,

MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),

MATCH(K$1,raw_notebook_data!$A$1:$M$1,0)))-

SEARCH("[",INDEX(raw_notebook_data!$A$1:$M$100000,

MATCH($A2,raw_notebook_data!$A$1:$A$100000,0),

MATCH(K$1,raw_notebook_data!$A$1:$M$1,0)))-1),
"NA")

 

The formula above produces output as follows (only the first bracketed substring appears):

 

Column B (current)

[Joe Green]

[Abby Miller]

etc...

 

This post is my first post on this forum, so I hope I provided the necessary information to receive some help. Please let me know if you need additional information.

  • aawve This User Defined Function does the trick:

    Function ExtractNames(TextWithNames) As String
        Dim Pos As Long
        Dim Ct As Long
        Dim BracketStart As Long
        Dim Result As String
        Dim WithInBrackets As Boolean
        If TypeName(TextWithNames) = "Range" Then
            TextWithNames = TextWithNames.Value
        End If
        Pos = InStr(TextWithNames, "[")
        If Pos <> 0 Then
            WithInBrackets = True
            BracketStart = Pos
            For Ct = Pos To Len(TextWithNames)
                If WithInBrackets Then
                    If Mid(TextWithNames, Ct, 1) = "]" Then
                        WithInBrackets = False
                        Result = Result & Mid(TextWithNames, BracketStart + 1, Ct - BracketStart - 1)
                        Result = Result & ","
                    End If
                ElseIf Mid(TextWithNames, Ct, 1) = "[" Then
                    WithInBrackets = True
                    BracketStart = Ct
                End If
            Next
        End If
        If Len(Result) > 0 Then Result = Left(Result, Len(Result) - 1)
        ExtractNames = Result
    End Function
    
  • JKPieterse's avatar
    JKPieterse
    Silver Contributor

    aawve This User Defined Function does the trick:

    Function ExtractNames(TextWithNames) As String
        Dim Pos As Long
        Dim Ct As Long
        Dim BracketStart As Long
        Dim Result As String
        Dim WithInBrackets As Boolean
        If TypeName(TextWithNames) = "Range" Then
            TextWithNames = TextWithNames.Value
        End If
        Pos = InStr(TextWithNames, "[")
        If Pos <> 0 Then
            WithInBrackets = True
            BracketStart = Pos
            For Ct = Pos To Len(TextWithNames)
                If WithInBrackets Then
                    If Mid(TextWithNames, Ct, 1) = "]" Then
                        WithInBrackets = False
                        Result = Result & Mid(TextWithNames, BracketStart + 1, Ct - BracketStart - 1)
                        Result = Result & ","
                    End If
                ElseIf Mid(TextWithNames, Ct, 1) = "[" Then
                    WithInBrackets = True
                    BracketStart = Ct
                End If
            Next
        End If
        If Len(Result) > 0 Then Result = Left(Result, Len(Result) - 1)
        ExtractNames = Result
    End Function
    

Resources