Forum Discussion
Baron164
Mar 21, 2022Brass Contributor
Need to convert a text file into a CSV
I have a text file from payroll that includes all active employees and their relevant information such as Phone Number, Office Location etc. I need to convert this file into a CSV so I can more easily utilize the data in another script.
Here is an example of what the file contents look like. The hyphens depict the length of each column.
Once I can convert the file into a proper CSV then I plan to being using the CSV for creating/updating AD accounts.
- LainRobertsonSilver Contributor
Here's a basic template you can expand upon. As the others have already noted, it shouldn't need to be this hard, but if this is your best starting point, maybe you can expand on this sample to solve your immediate issue.
Note: I've based this on your single-line OneDrive sample, which means not a lot of testing could be done for catching data presentation issues.
[cmdletbinding()] Param( [parameter(Mandatory=$true)][string]$Path ) function Get-ExportHeader([string]$Path) { if (Test-Path -Path $Path) { # Check the first 10 rows of the text file only to prevent pointlessly reading a large file. $HeaderExtract = Get-Content -Path $Path -TotalCount 10; for ($index = 0; $index -lt $HeaderExtract.Count; $index++) { $Line = $HeaderExtract[$index]; if (($Line.Length -gt 0) -and ([regex]::Replace($Line, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0)) { # This is our row of dashes where each sequence indicates the field length and is separated by exactly one space. # The line prior to this is the header, though we need to be wary that there may not be one (unlikely but possible.) # Also, be wary that headers appear to be justified, just like the column values themselves. See XREF_NUM as one example. if ($index -gt 0) { $CurrentStop = 0; $Header = foreach ($Marker in $Line.Split(' ')) { [PSCustomObject] @{ Name = if (($CurrentStop + $Marker.Length) -lt $Line.Length) { $HeaderExtract[$index-1].SubString($CurrentStop, $Marker.Length).Trim(); } else { $HeaderExtract[$index-1].SubString($CurrentStop).Trim(); }; Position = $CurrentStop; Length = $Marker.Length; } $CurrentStop += $Marker.Length + 1; } return($Header); } } } } # Indicate we couldn't construct a header. return($null); } function Get-ExportData([string]$Path, $Header) { $FoundHeader = $false; Get-Content -Path $Path | ForEach-Object { if (-not $FoundHeader) { $FoundHeader = ($_.Length -gt 0) -and ([regex]::Replace($_, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0); } else { if ($_.Length -gt 0) { [hashtable]$HashTable = [hashtable]::new(); for ($index = 0; $index -lt $Header.Count; $index++) { if ($Header[$index].Position -ge $_.Length) { $HashTable.Add($Header[$index].Name, $null); } elseif ($index -lt ($Header.Count - 1)) { $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position, $Header[$index].Length).Trim()); } else { $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position).Trim()); } } [PSCustomObject]$HashTable; } } } } #region Script body. $Header = Get-ExportHeader -Path $Path; if ($null -ne $Header) { Get-ExportData -Path $Path -Header $Header; } #endregion
Cheers,
Lain
- Baron164Brass Contributor
LainRobertson
Thank you, so far this is looking good. I tried turning this into a module by saving it as a .psm1 file but when I try to load the module I get this.Test-Path : Cannot bind argument to parameter 'Path' because it is an empty string. At C:\Program Files\WindowsPowerShell\Modules\Tools\Tools.psm1:10 char:25 + if (Test-Path -Path $Path) + ~~~~~ + CategoryInfo : InvalidData: (:) [Test-Path], ParameterBindingValidationException + FullyQualifiedErrorId : ParameterArgumentValidationErrorEmptyStringNotAllowed,Microsoft.PowerShell.Co mmands.TestPathCommand
Is there a way to make this work as a module?- LainRobertsonSilver Contributor
Yes, it's almost identical to the template from above.
The key difference in moving to a module is that the top-most $Path variable won't work like that in a module, meaning you need to wrap the handful of "main body" lines into a function definition, and then export that function definition from the module.
So, you're really just abstracting things by one level.
Again, it's rudimentary-only, but here's how things would slightly change to become a module. You can use whatever file and function names you like. I just chose these for the sake of providing an example.
#region Internal supporting functions. function Get-ExportHeader([string]$Path) { if (Test-Path -Path $Path) { # Check the first 10 rows of the text file only to prevent pointlessly reading a large file. $HeaderExtract = Get-Content -Path $Path -TotalCount 10; for ($index = 0; $index -lt $HeaderExtract.Count; $index++) { $Line = $HeaderExtract[$index]; if (($Line.Length -gt 0) -and ([regex]::Replace($Line, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0)) { # This is our row of dashes where each sequence indicates the field length and is separated by exactly one space. # The line prior to this is the header, though we need to be wary that there may not be one (unlikely but possible.) # Also, be wary that headers appear to be justified, just like the column values themselves. See XREF_NUM as one example. if ($index -gt 0) { $CurrentStop = 0; $Header = foreach ($Marker in $Line.Split(' ')) { [PSCustomObject] @{ Name = if (($CurrentStop + $Marker.Length) -lt $Line.Length) { $HeaderExtract[$index-1].SubString($CurrentStop, $Marker.Length).Trim(); } else { $HeaderExtract[$index-1].SubString($CurrentStop).Trim(); }; Position = $CurrentStop; Length = $Marker.Length; } $CurrentStop += $Marker.Length + 1; } return($Header); } } } } # Indicate we couldn't construct a header. return($null); } function Get-ExportData([string]$Path, $Header) { $FoundHeader = $false; Get-Content -Path $Path | ForEach-Object { if (-not $FoundHeader) { $FoundHeader = ($_.Length -gt 0) -and ([regex]::Replace($_, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0); } else { if ($_.Length -gt 0 -and -not [string]::IsNullOrWhiteSpace($_.Replace(",", ""))) { [hashtable]$HashTable = [hashtable]::new(); for ($index = 0; $index -lt $Header.Count; $index++) { if ($Header[$index].Position -ge $_.Length) { $HashTable.Add($Header[$index].Name, $null); } elseif ($index -lt ($Header.Count - 1)) { $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position, $Header[$index].Length).Trim()); } else { $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position).Trim()); } } [PSCustomObject]$HashTable; } } } } #endregion #region Exportable function. function Convert-EmployeeData([string] $Path) { $Header = Get-ExportHeader -Path $Path; if ($null -ne $Header) { Get-ExportData -Path $Path -Header $Header; } } #endregion #region Export definitions. Export-ModuleMember -Function Convert-EmployeeData; #endregion
What this looks like in practice is:
Cheers,
Lain
- Could you post it as text file? This is easy in Excel, fixed width, but this is not easy in PowerShell IMHO because of the lack of delimeters.. I Don't know what software your Payroll is, but no csv option in a somewhat modern piece of software?!
- Baron164Brass ContributorI do not see an option of attaching a text file to my posts so I put an example text file here. https://1drv.ms/t/s!AoktdqajV6ZomugfGxVBbjxHMnua7A?e=rHqB3P
The list is updated on a regular basis so this process needs to be automated. And unfortunately I am not able to make changes to the export process at this time so I'm stuck dealing with the file as is.Baron164 I think you can use something like this per line, the FIRST_NAME value is $line.SubString(0,5) for example. MI is $line.SubString(31,1). If everything always has the same starting point and a max amount of characters.. You can make variables out of that as input for your update script