
Readlines Without Newline in Python


In this tutorial, we'll learn how to read lines from a file without the trailing newline in Python. We cover two ways to use readlines without getting newlines, each with example code.

Working with raw data files is a common task in data analysis using Python. Raw data files are usually organized into lines of data, where each line is a single record or observation. Each line is separated from the next by a newline character, and these newlines can interfere with our data analysis if we do not remove them.

Why readlines without newline in Python?

When we read data from a file using Python's file.readlines method, it returns a list of strings, each ending with a newline character. The last string ends with a newline only if the file itself does.

The code example below shows this happening:

# Open the text file people.txt for reading
with open('people.txt', 'r') as fr:
    # Read the lines of data into a list
    lines = fr.readlines()

# Display all the lines of data
print(lines)

The output we get is:

output of readlines containing newline characters

The printed list shows that each line has a newline character, \n, at its end. The main question, therefore, is how to remove these newlines when using the readlines method.
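To try this without a data file at hand, here is a minimal sketch that uses io.StringIO as a stand-in for an open file. The names are invented sample data, not the actual contents of people.txt. It also hints at why the newlines matter: a simple membership test fails while they are still attached.

```python
import io

# io.StringIO behaves like a text file opened for reading.
# The names are made-up sample data, not the real people.txt.
fake_file = io.StringIO("Alice\nBob\nCarol")

lines = fake_file.readlines()
print(lines)            # ['Alice\n', 'Bob\n', 'Carol']

# The stored string is 'Bob\n', so looking for 'Bob' fails
found_bob = 'Bob' in lines
print(found_bob)        # False
```

Note that the last element has no \n here because the sample text does not end with one.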

Note

If you are not familiar with file object methods, you can read this article on file input output.

Two ways in Python to read lines without newlines

There are several ways to make sure that no newlines occur at the end of each line returned from the Python readlines method. This tutorial covers two of them.

In the examples that follow, we will make use of the file people.txt. The text file contains the following information about people:

contents of people.txt file

Since people.txt has several lines of data, we can use it to demonstrate how to remove the newline from each line.

Python Readlines Without Newline Using Python rstrip()

rstrip is a string method that removes trailing characters from a string. Called with no argument, it removes all trailing whitespace, which includes tabs (\t), newlines (\n), and spaces. Called with an argument such as '\n', it removes only those characters. For example, the code below displays the string Hello without the newline character at the end:

print("Hello\n".rstrip('\n'))    # Outputs: Hello
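The difference between calling rstrip with and without an argument is easy to miss, so here is a small sketch. The sample string is made up for illustration, and repr is used so that the remaining whitespace is visible in the output.

```python
text = "Hello \t\n"

# No argument: strips all trailing whitespace (space, tab, newline)
stripped_all = text.rstrip()

# With '\n': strips only trailing newline characters
stripped_newline = text.rstrip('\n')

print(repr(stripped_all))      # 'Hello'
print(repr(stripped_newline))  # 'Hello \t'
```

Passing '\n' is the safer choice when trailing spaces or tabs are part of the data and should be kept.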

To get a list of strings without newlines, we have to call rstrip on each element of the list returned by readlines. Study the following code to see how we did it:

# Open the text file people.txt for reading
with open('people.txt', 'r') as fr:
    # Read the lines of data into a list
    lines = fr.readlines()

    # Use method rstrip in a list comprehension to 
    # clean up the lines of data
    lines = [line.rstrip('\n') for line in lines]

# Display all the lines of data
print(lines)

The code segment produces a list of strings with the newline characters removed. It looks like this:

Using rstrip method to readlines without newlines
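A common variation, not part of the example above, is to loop over the file object directly instead of calling readlines first. The sketch below assumes invented sample data and uses io.StringIO in place of open('people.txt') so that it runs on its own.

```python
import io

# Stand-in for open('people.txt', 'r'); the names are invented.
fake_file = io.StringIO("Alice\nBob\nCarol\n")

# Iterating over a file object yields one line at a time,
# so the intermediate list from readlines is not needed.
lines = [line.rstrip('\n') for line in fake_file]

print(lines)  # ['Alice', 'Bob', 'Carol']
```

The result is the same list as before. For large files, this form avoids holding both the raw and the cleaned list in memory at once.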

Python Readlines Without Newline Using Python splitlines()

A file object also provides the read method for reading the data in a file.

Unlike readlines, the read method reads the whole file as a single string. Each newline in the returned string acts as the line delimiter within the file.

In order to separate each line, we make use of the splitlines string method:

# Open the text file people.txt for reading
with open('people.txt', 'r') as fr:
    # Read the lines of data as a single string
    lines = fr.read()

    # Use method splitlines to split the lines 
    # of data at the newline delimiters
    lines = lines.splitlines()

# Display all the lines of data
print(lines)

Our code snippet produces similar output to the previous example:

readlines without newlines using splitlines method

The method splitlines “splits” the string of data (that is returned from our call to the fr.read method) into a collection of strings.

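It makes a split wherever it finds a line boundary. That includes the \n character as well as other line endings such as \r\n and \r, and the line-ending characters themselves are left out of the resulting strings.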
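To see how splitlines differs from a plain split on '\n', consider the following sketch. The sample string is made up and deliberately mixes \n and \r\n line endings.

```python
text = "first\nsecond\r\nthird\n"

# splitlines recognizes several line endings and drops them
clean = text.splitlines()
print(clean)      # ['first', 'second', 'third']

# keepends=True keeps the line endings, similar to readlines
with_ends = text.splitlines(keepends=True)
print(with_ends)  # ['first\n', 'second\r\n', 'third\n']

# split('\n') leaves the '\r' behind and adds a trailing empty string
naive = text.split('\n')
print(naive)      # ['first', 'second\r', 'third', '']
```

This is why splitlines is usually a better fit than split('\n') for breaking text into lines.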

We now want to put our newfound skill to practical use by applying it to a data analysis task.

A practical example of Python Readlines Without Newline

Analyzing a Temperature Log File in Python

Let’s consider a real data analysis scenario where you’re reading and processing data from a daily temperature log file (temperature.txt).

In this scenario, our log file contains daily temperature records for the month of January for a particular location.

All temperature readings are in °C.

Our goal is to perform data analysis on this temperature data to gain insights or make weather-related decisions.

Our log file, temperature.txt, contains entries in the following format: YYYY-MM-DD: TEMPERATURE.

Each line represents the temperature recorded for a specific date.

Part of the file looks like the following:

contents of the temperature.txt log file

Here’s the complete Python code:

from collections import namedtuple

def wrangle(temperature_file):
    with open(temperature_file, 'r') as fr:
        # Read the lines of data into a list
        lines = fr.readlines()
        
        # clean up the newlines in the lines of data
        lines = [line.rstrip('\n') for line in lines]

        # Split each line at the ":" character
        lines = [line.split(':') for line in lines]

        # lines is now a list of [date, temperature] lists. Create a
        # named tuple template/class to represent each item
        LogInfo = namedtuple('LogInfo', ['date', 'temperature'])

        # Create instances of the LogInfo tuple
        data = [
            LogInfo(datestr, float(temperature_str))
            for datestr, temperature_str in lines
        ]

    return data

def temperature_stats(data):
    temp_values = [info.temperature for info in data]

    # Get the number of days in the log file
    num_days = len(data)

    # Calculate the average temperature
    sum_temp = sum(temp_values)
    temp_avg = sum_temp/num_days
    
    # Calculate the temperature range
    min_temp = min(temp_values)
    max_temp = max(temp_values)
    temp_range = max_temp - min_temp

    # Construct the string to return
    header = "Temperature log stats"
    data_stats_str = f"{header}\n"                      # The header
    data_stats_str += f"{'-' * len(header)}\n"          # Underline the header (same width)
    data_stats_str += f"Total Days: {num_days}\n"
    data_stats_str += f"Average Temperature: {round(temp_avg, 1)}\n"
    # Round the range too, to hide floating-point subtraction noise
    data_stats_str += f"Temperature Range: {round(temp_range, 1)}\n"
    
    return data_stats_str

def main():
    data = wrangle('temperature.txt')   # Parse the log file
    print(temperature_stats(data))      # Display the stats

# Run the main function
main()

The previous script defines a function that parses the data found in the temperature log file and another that computes some statistics about it.

Note

The article Context manager in Python goes into more details about using context managers.

Parsing temperature.txt

The wrangle function is the part of our script that removes the newlines when reading the data from temperature.txt.

Since a newline delimits each line in the file, our wrangle function uses the rstrip technique from the previous sections to remove the newline characters.

wrangle parses the contents of the temperature.txt file into a list of LogInfo named tuples, which makes it easy to access the data we need later.

Note

We used a named tuple as a simplified way to create object instances. If named tuples are new to you, check out our tutorial on named tuples.
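As a quick illustration of why named tuples make the parsed data easy to work with, here is a small sketch. It repeats the parsing steps from wrangle on two hypothetical log lines (the dates and readings are invented, not taken from temperature.txt) and then reads the fields by name.

```python
from collections import namedtuple

LogInfo = namedtuple('LogInfo', ['date', 'temperature'])

# Hypothetical lines in the "YYYY-MM-DD: TEMPERATURE" format
sample_lines = ["2023-01-01: 12.5\n", "2023-01-02: 9.0\n"]

data = []
for line in sample_lines:
    # Same steps as wrangle: strip the newline, then split at ":"
    date_str, temp_str = line.rstrip('\n').split(':')
    # float() ignores the space left in front of the number
    data.append(LogInfo(date_str, float(temp_str)))

first = data[0]
print(first.date)         # 2023-01-01
print(first.temperature)  # 12.5

# Fields can be used by name, for example to find the warmest day
warmest = max(data, key=lambda info: info.temperature)
print(warmest.date)       # 2023-01-01
```

Accessing info.temperature instead of info[1] keeps later code readable, and that is the pattern temperature_stats relies on.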

In conclusion, before we analyze our data, we can use the techniques from this Python tutorial to read lines without newlines: either rstrip on each line from readlines, or read followed by splitlines. This avoids bugs caused by stray newline characters, such as string comparisons that fail unexpectedly.
