WinBatch Tech Support Home

If you can't find the information using the categories below, post a question over in our WinBatch Tech Support Forum.

Remove Duplicate Lines from Text File

 Keywords:  OLE Standard Format Text File Remove Delete duplicate lines

Sample Code:

Contents of the text file dupes.txt:

GASTONIA  NC02B11                     
GASTONIA  NC03B11                     
GASTONIA  NC02B11                     
SUMTER    SC01B11                     
VIDALIA   GA02B11                     
VIDALIA   GA03B11                     
HENDERSNVLNC02N11                     
HENDERSNVLNC02N11                     
SAVANNAH  GA02N11                     
SUMTER    SC01B11                     
VIDALIA   GA02B11                     
GASTONIA  NC02B11                     

Script code:

; ///////////////////////////////////////////////////////////////////////////
; Use OLE to Re-write Standard Format Text File and remove duplicate lines //
; Stan Littlefield - June 7, 2002                                          //
; ///////////////////////////////////////////////////////////////////////////


cFle     = "dupes.txt"
cFle1    = "nodupes.txt"
path     = dirget()
cINI     = StrCat( path, "schema.ini")
cIn      = StrCat( path, cFle )
cOut     = StrCat( path, cFle1 )
cMDB     = StrCat( path, "nodupes.mdb" )

BoxOpen( "Removing Duplicates From %cIn%", "Creating %cOut%" )

If FileExist( cMDB ) Then FileDelete( cMDB )
If FileExist( cOut ) Then FileDelete( cOut )

; Use ADOX to create a temporary database
cat      = ObjectOpen("ADOX.Catalog")
; Engine Type=4 creates the database in Jet 3.x (Access 97) format
cConn    = "Provider=Microsoft.Jet.OLEDB.4.0;Data Source=%cMDB%;Jet OLEDB:Engine Type=4"
cat.Create( cConn )
ObjectClose( cat )

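; Determine the line length from the first line of the input file
; (this assumes every line has the same fixed width).
; You can skip this and set n directly if you know the width in advance.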
handle   = FileOpen( cIn, "READ" )
line     = FileRead(handle)
n        = StrLen( line )
FileClose( handle )

; Create entries in schema.ini for the input and output files, to override the registry settings
IniWritePvt( cFle, "FORMAT", "FixedLength", cINI )
IniWritePvt( cFle, "Col1", StrCat("line Text Width ",n) , cINI )
IniWritePvt( cFle, "ColNameHeader", "False" , cINI )
IniWritePvt( cFle1, "FORMAT", "FixedLength", cINI )
IniWritePvt( cFle1, "Col1", StrCat("line Text Width ",n) , cINI )
IniWritePvt( cFle1, "ColNameHeader", "False" , cINI )
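; For reference, a sketch of what the six calls above should leave in schema.ini,
; where <n> stands for the line length measured earlier:
;
;   [dupes.txt]
;   FORMAT=FixedLength
;   Col1=line Text Width <n>
;   ColNameHeader=False
;
;   [nodupes.txt]
;   FORMAT=FixedLength
;   Col1=line Text Width <n>
;   ColNameHeader=False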


DB       = ObjectOpen("ADODB.Connection")
DB.Open(cConn)
cSQL     = 'SELECT * INTO [line] FROM [%cFle%] IN "" [TEXT;DATABASE=%path%]'
DB.Execute( cSQL )

; SELECT DISTINCT removes the duplicates, but the output comes back sorted
; rather than in the original line order
cSQL     = "SELECT DISTINCT line INTO [Text;DATABASE=%path%].[%cFle1%] FROM line"
DB.Execute( cSQL )

DB.Close()
ObjectClose( DB )

; Optional: delete the temporary database (only after the connection is closed)
;If FileExist( cMDB ) Then FileDelete( cMDB )
BoxShut()
Exit
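
As the script's comment notes, SELECT DISTINCT returns the lines in sorted order. If the original order matters, a minimal sketch along the following lines may work without the temporary database. It is not part of the original sample: the names hIn, hOut, seen and key are made up for illustration. It keeps every line already written in one LF-delimited string, so it is probably only suitable for modest file sizes.

```winbatch
; Sketch only: remove duplicate lines while keeping first-seen order.
; File names are taken from the sample above; hIn, hOut, seen and key are
; illustrative names, not part of the original script.
cIn     = StrCat( DirGet(), "dupes.txt" )
cOut    = StrCat( DirGet(), "nodupes.txt" )

hIn     = FileOpen( cIn, "READ" )
hOut    = FileOpen( cOut, "WRITE" )
seen    = @LF                         ; LF-delimited list of lines already written

While @TRUE
   line = FileRead( hIn )
   If line == "*EOF*" Then Break      ; FileRead returns *EOF* at end of file
   ; Wrap the line in delimiters so only whole-line matches count
   key = StrCat( @LF, line, @LF )
   ; StrIndex is case-sensitive and returns 0 when the line has not been seen
   If StrIndex( seen, key, 1, @FWDSCAN ) == 0
      FileWrite( hOut, line )
      seen = StrCat( seen, line, @LF )
   EndIf
EndWhile

FileClose( hIn )
FileClose( hOut )
Exit
```

Assuming every line in dupes.txt is padded to the same width, this should write the first occurrence of each line in the order it appears in the input. Lines that differ only in case or trailing spaces are treated as different.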

Article ID:   W15270
File Created: 2002:09:05:13:50:54
Last Updated: 2002:09:05:13:50:54