Friday, November 11, 2016

Python Split File By Lines

A short snippet I use to split a large text file into smaller files by line count. As written, it splits the input into files of 3 million lines each:

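count = 0
fNum = 0
linesPerFile = 3000000  # Number of lines to split files on
pathIn = "c:\\temp\\inFile.txt"
pathOut = "c:\\temp\\outFile" + str(fNum) + ".txt"
# 'a' appends, so delete any old output files before running this again
fOut = open(pathOut, 'a')

with open(pathIn) as fIn:
    for line in fIn:
        # Roll over to the next output file once the current one is full.
        # Checking before the write gives exactly linesPerFile lines per file
        # and avoids an empty trailing file.
        if count >= linesPerFile:
            fNum = fNum + 1
            fOut.close()
            pathOut = "c:\\temp\\outFile" + str(fNum) + ".txt"
            fOut = open(pathOut, 'a')
            count = 0
        fOut.write(line)
        count = count + 1

fOut.close()  # Close the last output file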

Edit: I moved fNum=fNum+1 to the start of the if block. It used to come after pathOut was rebuilt, so the first rollover reopened outFile0.txt in append mode and made the first file double the size it should have been. All good!
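
To reuse this on other files, the same loop could be wrapped in a function. The version below is only a sketch: the name split_file_by_lines, its parameters, and the returned list of paths are made up for illustration and are not part of the original snippet. It also differs in two ways: it opens each output file in 'w' mode instead of 'a', and it opens files lazily so an empty input produces no output files.

```python
import os
import tempfile


def split_file_by_lines(path_in, out_prefix, lines_per_file):
    """Split path_in into files named out_prefix + N + '.txt'.

    Hypothetical helper: returns the list of output paths written.
    """
    if lines_per_file < 1:
        raise ValueError("lines_per_file must be at least 1")

    out_paths = []
    f_out = None
    count = 0
    try:
        with open(path_in) as f_in:
            for line in f_in:
                # Open the next output file only when there is a line to
                # put in it, so no empty file is left behind when the input
                # ends exactly on a boundary.
                if f_out is None or count >= lines_per_file:
                    if f_out is not None:
                        f_out.close()
                    path_out = out_prefix + str(len(out_paths)) + ".txt"
                    f_out = open(path_out, "w")  # overwrites old output
                    out_paths.append(path_out)
                    count = 0
                f_out.write(line)
                count += 1
    finally:
        # Close the last output file even if reading fails part way through
        if f_out is not None:
            f_out.close()
    return out_paths


if __name__ == "__main__":
    # Small demo in a temporary directory: 7 lines split into chunks of 3
    tmp_dir = tempfile.mkdtemp()
    demo_in = os.path.join(tmp_dir, "inFile.txt")
    with open(demo_in, "w") as f:
        f.writelines("line %d\n" % i for i in range(7))
    print(split_file_by_lines(demo_in, os.path.join(tmp_dir, "outFile"), 3))
```

Calling split_file_by_lines("c:\\temp\\inFile.txt", "c:\\temp\\outFile", 3000000) should produce the same file names as the snippet above (outFile0.txt, outFile1.txt, and so on).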
