
Wednesday, January 16, 2019

Remove Carriage Return from CSV Using Python



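To remove carriage returns from a CSV file using pandas, follow the steps below. The script reads the source file, strips line breaks from the field values chunk by chunk, and appends each cleaned chunk to a new CSV file.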


import csv
import io
import re

import numpy as np
import pandas as pd

write_header = True
with open('c:/abc.csv', 'r', encoding='iso-8859-1') as src:
    data = src.read()
    # Remove whitespace and the line break that directly follow a double quote,
    # then parse the result in chunks of 1000 rows.
    df = pd.read_csv(io.StringIO(re.sub(r'"\s*\n', '"', data)), chunksize=1000)
    for chunk in df:
        # Remove commas that are followed by an odd number of double quotes,
        # i.e. commas that sit inside a quoted span.
        chunk = chunk.replace(r'(?!(([^"]*"){2})*[^"]*$),', '', regex=True)
        # Remove Windows-style line endings left inside cell values.
        clean_chunk = chunk.replace({r'\r\n': ''}, regex=True)
        for col in clean_chunk.columns:
            # Strip any remaining bare newlines from text columns.
            if clean_chunk[col].dtype == np.object_:
                clean_chunk[col] = clean_chunk[col].str.replace('\n', '', regex=False)
        clean_chunk.to_csv('c:/clean_abc.csv', sep=',', index=False, mode='a', encoding='iso-8859-1',
                           quotechar='"', quoting=csv.QUOTE_NONNUMERIC, header=write_header)
        write_header = False  # Write the header with the first chunk only, so the output has a single header row
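
To check the cleaning step without touching files on disk, the same idea can be tried on a small in-memory CSV. The following is only a minimal sketch: the function name strip_line_breaks and the sample data are made up for illustration, and it assumes that deleting the line breaks outright (rather than replacing them with a space) is what you want.

```python
import io

import pandas as pd


def strip_line_breaks(df: pd.DataFrame) -> pd.DataFrame:
    """Return a copy of df with carriage returns and newlines removed from string cells."""
    # DataFrame.replace with regex=True only rewrites string values;
    # numeric columns are left as they are.
    return df.replace(r"[\r\n]+", "", regex=True)


# Hypothetical sample: the quoted "note" field of the first row spans two lines.
sample = 'id,note\r\n1,"first line\r\nsecond line"\r\n2,"plain"\r\n'
frame = pd.read_csv(io.StringIO(sample))
result = strip_line_breaks(frame)
print(result)
```

If the words on either side of a line break should stay separated, pass " " instead of "" as the replacement value.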
            
