Forum Discussion
Baron164
Mar 21, 2022 · Brass Contributor
Need to convert a text file into a CSV
I have a text file from payroll that includes all active employees and their relevant information, such as Phone Number, Office Location, etc. I need to convert this file into a CSV so I can more easily...
LainRobertson
Mar 31, 2022 · Silver Contributor
Here's a basic template. As the others have already noted, it shouldn't need to be this hard, but if this is your best starting point, you can expand on this sample to solve your immediate issue.
Note: I've based this on your single-line OneDrive sample, which means I couldn't do much testing to catch data presentation issues.
[cmdletbinding()]
Param(
    [parameter(Mandatory=$true)][string]$Path
)

function Get-ExportHeader([string]$Path)
{
    if (Test-Path -Path $Path)
    {
        # Check the first 10 rows of the text file only to prevent pointlessly reading a large file.
        $HeaderExtract = Get-Content -Path $Path -TotalCount 10;

        for ($index = 0; $index -lt $HeaderExtract.Count; $index++)
        {
            $Line = $HeaderExtract[$index];

            if (($Line.Length -gt 0) -and ([regex]::Replace($Line, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0))
            {
                # This is our row of dashes where each sequence indicates the field length and is separated by exactly one space.
                # The line prior to this is the header, though we need to be wary that there may not be one (unlikely but possible.)
                # Also, be wary that headers appear to be justified, just like the column values themselves. See XREF_NUM as one example.
                if ($index -gt 0)
                {
                    $CurrentStop = 0;

                    $Header = foreach ($Marker in $Line.Split(' '))
                    {
                        [PSCustomObject] @{
                            Name = if (($CurrentStop + $Marker.Length) -lt $Line.Length) { $HeaderExtract[$index-1].SubString($CurrentStop, $Marker.Length).Trim(); } else { $HeaderExtract[$index-1].SubString($CurrentStop).Trim(); };
                            Position = $CurrentStop;
                            Length = $Marker.Length;
                        }

                        $CurrentStop += $Marker.Length + 1;
                    }

                    return($Header);
                }
            }
        }
    }

    # Indicate we couldn't construct a header.
    return($null);
}

function Get-ExportData([string]$Path, $Header)
{
    $FoundHeader = $false;

    Get-Content -Path $Path | ForEach-Object {
        if (-not $FoundHeader)
        {
            $FoundHeader = ($_.Length -gt 0) -and ([regex]::Replace($_, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0);
        }
        else
        {
            if ($_.Length -gt 0)
            {
                [hashtable]$HashTable = [hashtable]::new();

                for ($index = 0; $index -lt $Header.Count; $index++)
                {
                    if ($Header[$index].Position -ge $_.Length)
                    {
                        $HashTable.Add($Header[$index].Name, $null);
                    }
                    elseif ($index -lt ($Header.Count - 1))
                    {
                        $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position, $Header[$index].Length).Trim());
                    }
                    else
                    {
                        $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position).Trim());
                    }
                }

                [PSCustomObject]$HashTable;
            }
        }
    }
}

#region Script body.
$Header = Get-ExportHeader -Path $Path;

if ($null -ne $Header)
{
    Get-ExportData -Path $Path -Header $Header;
}
#endregion
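To get from those objects to an actual CSV file, you'd typically pipe the script's output to Export-Csv. A minimal sketch, assuming the script above is saved as something like Convert-PayrollText.ps1 (the script name and file paths are illustrative only):

# Run the conversion against the payroll export and write the resulting objects out as a CSV.
.\Convert-PayrollText.ps1 -Path 'C:\Temp\PayrollReport.txt' |
    Export-Csv -Path 'C:\Temp\ActiveEmployees.csv' -NoTypeInformation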
Cheers,
Lain
Baron164
Apr 29, 2022 · Brass Contributor
LainRobertson
Thank you, so far this is looking good. I tried turning this into a module by saving it as a .psm1 file, but when I try to load the module I get this:
Test-Path : Cannot bind argument to parameter 'Path' because it is an empty string.
At C:\Program Files\WindowsPowerShell\Modules\Tools\Tools.psm1:10 char:25
+     if (Test-Path -Path $Path)
+                         ~~~~~
    + CategoryInfo          : InvalidData: (:) [Test-Path], ParameterBindingValidationException
    + FullyQualifiedErrorId : ParameterArgumentValidationErrorEmptyStringNotAllowed,Microsoft.PowerShell.Commands.TestPathCommand
Is there a way to make this work as a module?
- LainRobertson · Apr 30, 2022 · Silver Contributor
Yes, it's almost identical to the template from above.
The key difference in moving to a module is that the top-level $Path parameter won't work like that in a module: you need to wrap the handful of "main body" lines in a function definition, and then export that function from the module.
So, you're really just abstracting things by one level.
Again, it's still rudimentary, but here's how things change slightly to become a module. You can use whatever file and function names you like; I just chose these for the sake of providing an example.
#region Internal supporting functions.
function Get-ExportHeader([string]$Path)
{
    if (Test-Path -Path $Path)
    {
        # Check the first 10 rows of the text file only to prevent pointlessly reading a large file.
        $HeaderExtract = Get-Content -Path $Path -TotalCount 10;

        for ($index = 0; $index -lt $HeaderExtract.Count; $index++)
        {
            $Line = $HeaderExtract[$index];

            if (($Line.Length -gt 0) -and ([regex]::Replace($Line, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0))
            {
                # This is our row of dashes where each sequence indicates the field length and is separated by exactly one space.
                # The line prior to this is the header, though we need to be wary that there may not be one (unlikely but possible.)
                # Also, be wary that headers appear to be justified, just like the column values themselves. See XREF_NUM as one example.
                if ($index -gt 0)
                {
                    $CurrentStop = 0;

                    $Header = foreach ($Marker in $Line.Split(' '))
                    {
                        [PSCustomObject] @{
                            Name = if (($CurrentStop + $Marker.Length) -lt $Line.Length) { $HeaderExtract[$index-1].SubString($CurrentStop, $Marker.Length).Trim(); } else { $HeaderExtract[$index-1].SubString($CurrentStop).Trim(); };
                            Position = $CurrentStop;
                            Length = $Marker.Length;
                        }

                        $CurrentStop += $Marker.Length + 1;
                    }

                    return($Header);
                }
            }
        }
    }

    # Indicate we couldn't construct a header.
    return($null);
}

function Get-ExportData([string]$Path, $Header)
{
    $FoundHeader = $false;

    Get-Content -Path $Path | ForEach-Object {
        if (-not $FoundHeader)
        {
            $FoundHeader = ($_.Length -gt 0) -and ([regex]::Replace($_, "[\s-]", "", [System.Text.RegularExpressions.RegexOptions]::CultureInvariant).Length -eq 0);
        }
        else
        {
            if ($_.Length -gt 0 -and -not [string]::IsNullOrWhiteSpace($_.Replace(",", "")))
            {
                [hashtable]$HashTable = [hashtable]::new();

                for ($index = 0; $index -lt $Header.Count; $index++)
                {
                    if ($Header[$index].Position -ge $_.Length)
                    {
                        $HashTable.Add($Header[$index].Name, $null);
                    }
                    elseif ($index -lt ($Header.Count - 1))
                    {
                        $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position, $Header[$index].Length).Trim());
                    }
                    else
                    {
                        $HashTable.Add($Header[$index].Name, $_.SubString($Header[$index].Position).Trim());
                    }
                }

                [PSCustomObject]$HashTable;
            }
        }
    }
}
#endregion

#region Exportable function.
function Convert-EmployeeData([string] $Path)
{
    $Header = Get-ExportHeader -Path $Path;

    if ($null -ne $Header)
    {
        Get-ExportData -Path $Path -Header $Header;
    }
}
#endregion

#region Export definitions.
Export-ModuleMember -Function Convert-EmployeeData;
#endregion
What this looks like in practice is:
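For example, assuming the module file is saved as Tools.psm1 under a folder in your PSModulePath (as in the error message earlier), a session might look something like this (the data paths are illustrative only):

# Load the module so Convert-EmployeeData becomes available.
Import-Module -Name Tools

# Convert the fixed-width payroll report and export the result as a CSV.
Convert-EmployeeData -Path 'C:\Temp\PayrollReport.txt' |
    Export-Csv -Path 'C:\Temp\ActiveEmployees.csv' -NoTypeInformation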
Cheers,
Lain
- Baron164 · Jun 02, 2022 · Brass Contributor
LainRobertson
Thank you for all the help. I've been testing this with the other script, and the only issue I'm having is that after I convert the original file, the new CSV has an extra (empty) row consisting of nothing but commas.
This is causing an error further down the line. The process still works, just not cleanly.
[Screenshot: messages when running the conversion]
[Screenshot: error message when running the import]
If I manually remove the empty row of commas, then I don't get the error message during the import process. Not sure how best to deal with that.
- LainRobertson · Jun 02, 2022 · Silver Contributor
Yeah, no worries - I didn't see that kind of row coming.
Anyhow, rather than re-posting the whole thing, I updated line 55 in the post above to deal with the "all commas" scenario.
It now looks like this:
if ($_.Length -gt 0 -and -not [string]::IsNullOrWhiteSpace($_.Replace(",", "")))
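As a quick illustration of what that extra check does (the sample strings below are made up): once any commas are stripped, a line containing nothing but commas and/or whitespace reduces to a blank string and is skipped, while a genuine data line still passes.

# A line of only spaces and commas reduces to whitespace once the commas are removed, so the check is False and the line is skipped.
$BlankLine = '      ,,,,      '
$BlankLine.Length -gt 0 -and -not [string]::IsNullOrWhiteSpace($BlankLine.Replace(",", ""))    # False

# A genuine data line keeps its content, so the check is True and it's processed as before.
$DataLine = '10042  Jane Citizen     555-0100   Head Office'
$DataLine.Length -gt 0 -and -not [string]::IsNullOrWhiteSpace($DataLine.Replace(",", ""))      # True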
Cheers,
Lain