The SORT Command Drops Most Records
-
Thursday, January 24, 2013 3:49 PM
Hello,
Believe it or not, the command line SORT program that has been a part of Windows/DOS for 30 years seems not to be working.
I have a text file where each line of text is about 400 characters long. I want to sort starting with the first character. The input file is about 24MB. The output file is about 3.7MB. Obviously, much of the input file is missing from the output file.
I have tried various values for the /M parameter but it seems to make no difference.
Here's a command line I tried:
sort /M 5600 MyInput.txt /o MySortedOut.txt
Does anyone know why this happens? I'm not looking for alternatives so please do not respond with, "Try this..." or "Try PowerShell....". I am trying to determine why the SORT command is not working correctly.
Thanks
All Replies
-
Thursday, January 24, 2013 4:09 PM
Just out of curiosity, why are you limiting the memory usage to 5600 KB? Have you tried not specifying a memory limit?
sort MyInput.txt /o MySortedOut.txt
- Edited by Kamin of Ressik Thursday, January 24, 2013 4:11 PM
-
Thursday, January 24, 2013 4:47 PM
Does anyone know why this happens? I'm not looking for alternatives so please do not respond with, "Try this..." or "Try PowerShell....". I am trying to determine why the SORT command is not working correctly.
Thanks
I can't duplicate your observation. Is it based on a test performed on just one machine? If so then there is a serious risk of you falling into the infamous Fleischmann and Pons trap.
It would also help if you posted some sample records.
- Edited by Oberwald (Microsoft Community Contributor) Thursday, January 24, 2013 4:48 PM
- Edited by Oberwald (Microsoft Community Contributor) Friday, January 25, 2013 7:05 PM
- Edited by Oberwald (Microsoft Community Contributor) Friday, January 25, 2013 7:08 PM
-
Saturday, January 26, 2013 3:47 AM
Does anyone know why this happens? I'm not looking for alternatives so please do not respond with, "Try this..." or "Try PowerShell....". I am trying to determine why the SORT command is not working correctly.
Thanks
I can't duplicate your observation. Is it based on a test performed on just one machine? If so then there is a serious risk of you falling into the infamous Fleischmann and Pons trap.
It would also help if you posted some sample records.
I've tried it on 3 machines, each with at least 8 GB RAM. I have also tried the SORT command without the /M option.
Again, each "record" in the file is about 400 characters (ASCII). I certainly can't post a 24MB file but the data is all text -- names and addresses, etc., nothing exotic.
Try to duplicate it with a 24+MB file, please.
Thanks
-
Saturday, January 26, 2013 8:06 AM
Try to duplicate it with a 24+MB file, please.
Thanks
The VBScript further down creates a text file of a little over 24 MBytes. Each record consists of a random number plus a fixed string of 420 characters. I sorted it with the command
sort MyInput.txt /o MySortedOut.txt
The output is exactly the same size as the input, as expected. My guess is that your problem has nothing to do with file size but that your data contains embedded "end of file markers" ($1a) which would tell sort.exe that this is the end of the file. Did you run any tests with large text files generated in a different way, e.g. like so?
dir c:\ /s > Bigfile.txt

' Fixed text that makes up the bulk of each record
sRecord = "John Doe, 55 Main Street, Long Island, 0800 1234 5678, January 5 1970 John Doe, 55 Main Street, Long Island, 0800 1234 5678, January 5 1970 John Doe, 55 Main Street, Long Island, 0800 1234 5678, January 5 1970 John Doe, 55 Main Street, Long Island, 0800 1234 5678, January 5 1970 John Doe, 55 Main Street, Long Island, 0800 1234 5678, January 5 1970 John Doe, 55 Main Street, Long Island, 0800 1234 5678, January 5 1970"
Set oFSO = CreateObject("Scripting.FileSystemObject")
Set oBigFile = oFSO.CreateTextFile("d:\MyInput.txt", True)
' Write enough records to produce a file of a little over 24 MBytes
For i = 0 To Int(24000000 / (Len(sRecord) + 9))
  ' Each record is a random number, a space and the fixed text
  oBigFile.WriteLine Int(1000000000 * Rnd()) & " " & sRecord
Next
oBigFile.Close
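To test the "end of file marker" guess directly, one option is to scan the input for 0x1A bytes before sorting it. The following Python sketch is not part of the original test; the function name find_eof_markers and the chunk size are illustrative assumptions. It reads the file in binary mode and returns the byte offset of every 0x1A byte it finds.

```python
import sys


def find_eof_markers(path, chunk_size=1 << 20):
    """Return the byte offsets of every 0x1A (Ctrl+Z) byte in the file."""
    offsets = []
    position = 0  # offset of the current chunk within the file
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            start = 0
            while True:
                index = chunk.find(b"\x1a", start)
                if index == -1:
                    break
                offsets.append(position + index)
                start = index + 1
            position += len(chunk)
    return offsets


if __name__ == "__main__" and len(sys.argv) > 1:
    # Hypothetical usage: python find_markers.py MyInput.txt
    for offset in find_eof_markers(sys.argv[1]):
        print("0x1A found at byte offset", offset)
```

An empty result would rule this explanation out. If the first reported offset is close to the size of the truncated output file, that would be consistent with sort.exe stopping at the marker.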
- Edited by Oberwald (Microsoft Community Contributor) Saturday, January 26, 2013 8:26 AM
- Edited by Oberwald (Microsoft Community Contributor) Saturday, January 26, 2013 10:04 AM
- Marked As Answer by Tom Baxter Sunday, January 27, 2013 5:13 AM
-
Sunday, January 27, 2013 5:15 AM
My guess is that your problem has nothing to do with file size but that your data contains embedded "end of file markers" ($1a) which would tell sort.exe that this is the end of the file.
You are 100% correct!! Our client sent us a bad file with an embedded 0x1A. Thank you, and very well done!!!!
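For anyone who receives a similar file, a minimal Python sketch of producing a cleaned copy with the 0x1A bytes removed is shown here, so that the whole file can then be passed to SORT. The function name strip_eof_markers and the file names are hypothetical, and simply deleting the byte is an assumption; depending on the data, replacing it with a space might be more appropriate.

```python
def strip_eof_markers(source_path, cleaned_path, chunk_size=1 << 20):
    """Copy source_path to cleaned_path without 0x1A bytes.

    Returns the number of 0x1A bytes that were removed.
    """
    removed = 0
    with open(source_path, "rb") as source, open(cleaned_path, "wb") as cleaned:
        while True:
            chunk = source.read(chunk_size)
            if not chunk:
                break
            removed += chunk.count(b"\x1a")
            cleaned.write(chunk.replace(b"\x1a", b""))
    return removed


# Hypothetical usage:
# removed = strip_eof_markers("MyInput.txt", "MyInputClean.txt")
# print(removed, "marker byte(s) removed")
```

The original file is left untouched, so the cleaned copy can be compared against it before sorting.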

